Don’t Ignore shared_preload_libraries

Unlike traditional database systems, Postgres is an object-relational database system, meaning it relies heavily on plug-in objects to enable flexible behavior, and many of these objects are supplied as shared object libraries (or dynamic-link libraries (DLLs) on Windows). Examples of external objects include the PL/pgSQL server-side language, the pgcrypto cryptographic library, and the PostGIS geographic information system. These are all implemented as shared libraries that are dynamically loaded into the database server when accessed. The process of loading a library into a running executable is called dynamic loading and is the way most modern operating systems access libraries (the non-dynamic method is called “static linking”).
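As a sketch, the difference between preloading a library server-wide and loading it on demand might look like this (pg_stat_statements and auto_explain are real contrib modules, used here only as examples):

```
# postgresql.conf — libraries named here are loaded once, at server start,
# into every backend; changing this list requires a server restart:
shared_preload_libraries = 'pg_stat_statements'
```

A library can also be pulled into a single session with the SQL command `LOAD 'auto_explain';`, but anything that must hook into shared memory at startup belongs in shared_preload_libraries.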
Archive for April 2012
There are three ways to install Postgres, and they are all listed on the Download menu on the Postgres website. While the web authors did their best to explain the differences between the various installation methods, I thought it would be helpful to more fully explain their advantages and disadvantages. The three methods are:
- Graphical Installers (also called “one-click” or “click-through” installers)
- Advantages: Provide an easy-to-install Postgres environment with access to many add-on modules via Stack Builder. This is ideal for first-time Postgres users. It is also possible to use the installer in non-interactive mode.
- Disadvantages: As mentioned on the Postgres website, the one-click installers do not integrate with platform-specific packaging systems.
- Platform-Specific Packages
- Advantages: These are better integrated with other software installed on your operating system. This is ideal for production servers that rely on operating-system-supplied tools.
- Disadvantages: Requires work to identify which other packages are needed for a complete solution. Obtaining newer major versions of Postgres on older operating systems might also require work.
- Source Code
- Advantages: Allows selection of specific configure and compile options for Postgres binaries, and allows the addition of patches to enhance or fix Postgres. This is ideal for experienced users who can benefit from additional control of Postgres.
- Disadvantages: Requires compilation experience and managing Postgres tool integration, and requires user and server start/stop configuration.
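For instance, a minimal source install might look like the following command sketch (the install prefix and data directory are arbitrary choices for illustration, and the configure options will vary with your needs):

```
# Unpack the source tarball and cd into it first; paths below are illustrative.
./configure --prefix=/usr/local/pgsql
make
make install
# Create a database cluster and start the server:
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
```

This is where the “requires user and server start/stop configuration” disadvantage shows up: nothing above arranges for the server to start at boot or run under a dedicated user, so you must handle that yourself.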
Don’t Take Me Too Seriously

My recent blog post showing linear read scaling out to 64 cores has generated a lot of attention, which I frankly did not expect. To be sure, it’s a cool result, but keep in mind that it’s testing a certain very specific workload: very short transactions running very simple queries to do repeated primary key lookups on a single table. This, of course, is not an unimportant workload; in fact, you could argue that some of the NoSQL products on the market are basically purpose-built databases for exactly this kind of workload (sometimes coupled with a facility for durable or non-durable writes).

On the flip side, it’s also not the only workload. The reason why PostgreSQL 9.2devel so completely stomps the performance of earlier releases on this test is that the transactions are very, very short. If you ran the same test, but with BEGIN before the first query and END after the last one, or if you striped the queries across multiple tables, you would eliminate the lock manager contention that holds performance on PostgreSQL 9.1 in check on this test.

For most people, therefore, there’s probably no reason to panic just because you’re running one of the existing releases. You may get better performance on PostgreSQL 9.2, when it’s released, but chances are that if you had the sort of workload for which these changes make a truly dramatic difference, you wouldn’t have picked PostgreSQL in the first place. What I think is exciting about these changes is not so much that existing users will see huge performance benefits (although some will; we have a lot of good changes in this release) but that PostgreSQL will become usable in environments where it currently can’t compete.

Of course, that’s not to say that we’re going to put memcached out of business. There will probably always be cheaper alternatives to an RDBMS if the only work you need to do is primary key lookups and stores, and especially if you don’t need durability. But many people need good performance on large numbers of simple queries and additionally need the ability to do some more complex processing, and I’m hopeful that these scalability changes will make it much simpler to deploy PostgreSQL effectively in such environments.

Although for most people there’s no huge rush to upgrade, if you’re running PostgreSQL 8.3.x or older, it’s time to think hard about getting onto a newer version. Community support for PostgreSQL 8.3.x will end in Q1 of next year. If you’re running anything older than that, it’s already unsupported; moreover, every release from 7.4 through 8.3 featured major performance improvements.
The New Postgres Era

Having attended several conferences recently, I saw confirmation of my previous observation that Postgres is poised for a new wave of adoption. The last time I saw such an upturn in adoption was with the release of Postgres 8.0 in 2005, which included a native port of Postgres to Windows. You can see the increase in the volume of postings to the Postgres jobs email list. (The spike in January of 2008 was Sun buying MySQL.)

And that’s not all — Robert Haas’s recent blog post about Postgres scaling linearly to 64 cores in upcoming Postgres 9.2 means that, by the end of the year, Postgres will be a major contender on high-end hardware. We have always done well on small to medium-sized servers, but we are now poised to compete heavily on the high-end.
Did I Say 32 Cores? How about 64?

Remember when I blogged about linear read scalability out to 32 cores? Well, the awesome Nate Boley provided me with access to his brand new 64-core server. I ran my usual suite of read-only pgbench tests just to baseline its performance, and found that the performance scaled linearly all the way out to 64 clients. OK, it wasn’t quite linear: the 64-client performance was only 63.68 times the single-client performance. Still, I’ll take it. Graph is below.
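For reference, a read-only pgbench run of this general kind might be invoked roughly as follows (the scale factor, duration, and database name here are my assumptions for illustration, not the actual test parameters):

```
pgbench -i -s 100 pgbench               # initialize the test tables
pgbench -S -c 64 -j 64 -T 300 pgbench   # SELECT-only, 64 clients, 64 threads
```

The -S flag selects the SELECT-only script, which does exactly the repeated single-table primary key lookups described above.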
There exist several processes that can be used to communicate information between Java EE applications: JMS, database storage/retrieval, printing/OCR scanning, etc. (For a distributed version of the last one, see RFC 1149.)
But each of these requires the creation and management of resources, along with complicated APIs. Probably the best “hidden” feature of Java, beginning with JDK 1.4, is that there is already such a capability that can conduct complicated communications connecting Java EE applications: the Logger object.
Loggers can be given arbitrary names, and these logger objects are shared across the JVM. So if more than one application within an app server runs the following:
Logger logger = Logger.getLogger("logger0");
…both applications have a handle to the same object. That is the beginning of our inter-application communications framework. What makes it useful is that every Logger can contain a Level, and Levels hold a value that can be any integer. So we have, in essence, a map of strings to integers that can be accessed by any application within the server. The possibilities are wide open.
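As a self-contained sketch of the trick (the names IntLevel, LoggerChannel, send, and receive are all invented for this illustration; only the java.util.logging calls are real), stashing an integer on a shared Logger looks like this:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Level's (name, value) constructor is protected, so a tiny subclass
// is needed to create a Level carrying an arbitrary integer.
class IntLevel extends Level {
    IntLevel(int value) {
        super("INT", value);
    }
}

public class LoggerChannel {
    // Hold a strong reference: the LogManager keeps loggers weakly,
    // so without this our "channel" could be garbage-collected.
    private static final Logger SENDER = Logger.getLogger("logger0");

    public static void send(int value) {
        SENDER.setLevel(new IntLevel(value));
    }

    public static int receive() {
        // Any code in the same JVM looking up the same name
        // gets back the same Logger object.
        return Logger.getLogger("logger0").getLevel().intValue();
    }

    public static void main(String[] args) {
        send(42);
        System.out.println(receive()); // prints 42
    }
}
```

Of course, this is a gross abuse of the logging API, which is part of the fun: there is no synchronization, no notification that a value changed, and any legitimate logging configuration can clobber your “data.”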
To test this, I have a simple web application that asks the user for 3 numbers and stores them in loggers like this (simplified slightly):