Developer(s) | PostgreSQL Global Development Group |
---|---|
Stable release | 8.4.4 / 8.3.11 / 8.2.17 / 8.1.21 / 8.0.25 / 7.4.29 / May 17, 2010 |
Preview release | 9.0 beta4 / August 2, 2010 |
Written in | C |
Operating system | Cross-platform |
Type | ORDBMS |
License | PostgreSQL licence[1][2][3] |
Website | http://www.postgresql.org/ |
PostgreSQL, often simply Postgres, is an object-relational database management system (ORDBMS).[4] It is released under an MIT-style license and is thus free and open source software. As with many other open-source programs, PostgreSQL is not controlled by any single company - a global community of developers and companies develops the system.
Contents |
The mixed-capitalization of the PostgreSQL name can confuse some people on first viewing. The several pronunciations of "SQL" can lead to this confusion. PostgreSQL's developers pronounce it /ˈpoʊstɡrɛs ˌkjuː ˈɛl/; (Audio sample, 5.6k MP3). It is abbreviated as "Postgres", its original name. Because of ubiquitous support for the SQL Standard amongst most relational databases, the community considered changing the name back to Postgres. However, the PostgreSQL Core Team announced in 2007 that the product would continue to use the name PostgreSQL.[5] The name refers to the project's origins as a "post-Ingres" database, the original authors having also developed the Ingres database. (The name Ingres was an abbreviation for INteractive Graphics REtrieval System.)
PostgreSQL evolved from the Ingres project at University of California, Berkeley. In 1982, the project leader, Michael Stonebraker, left Berkeley to make a proprietary version of Ingres. He returned to Berkeley in 1985 and started a post-Ingres project to address the problems with contemporary database systems that had become increasingly clear during the early 1980s. The new project, Postgres, aimed to add the fewest features needed to completely support types. These features included the ability to define types and to fully describe relationships – something used widely before but maintained entirely by the user. In Postgres, the database "understood" relationships, and could retrieve information in related tables in a natural way using rules. Postgres used many of the ideas of Ingres, but not its code.
Starting in 1986, the team published a number of papers describing the basis of the system, and by 1988 had a prototype version. The team released version 1 to a small number of users in June 1989, then version 2 with a re-written rules system in June 1990. Version 3, released in 1991, again re-wrote the rules system, and added support for multiple storage managers and an improved query engine. By 1993 the great number of users began to overwhelm the project with requests for support and features. After releasing version 4 — primarily a cleanup — the project ended.
But open-source developers could obtain copies and develop the system further, because Berkeley had released Postgres under a MIT-style license. In 1994, Berkeley graduate students Andrew Yu and Jolly Chen replaced the Ingres-based QUEL query language interpreter with one for the SQL query language, creating Postgres95. The code was released on the web.
In July 1996, Marc Fournier at Hub.Org Networking Services provided the first non-university development server for the open-source development effort. Along with Bruce Momjian and Vadim B. Mikheev, work began to stabilize the code inherited from Berkeley. The first open-source version was released on August 1, 1996.
In 1996, the project was renamed to PostgreSQL to reflect its support for SQL. The first PostgreSQL release formed version 6.0 in January 1997. Since then, the software has been maintained by a group of database developers and volunteers around the world, coordinating via the Internet.
The PostgreSQL project continues to make major releases (approximately annually) and minor "bugfix" releases, all available under the same license. Code comes from contributions from proprietary vendors, support companies, and open-source programmers at large.
Functions allow blocks of code to be executed by the server. Although these blocks can be written in SQL, the lack of basic programming operations which existed prior to version 8.4, such as branching and looping, has driven the adoption of other languages inside of functions. Some of the languages can even execute inside of triggers. Functions in PostgreSQL can be written in the following languages:
PostgreSQL supports row-returning functions, where the output of the function is a set of values which can be treated much like a table within queries. Custom aggregates and window functions can also be defined.
Functions can be defined to execute with the privileges either of the caller or of the user who defined the function. Functions are sometimes referred to as stored procedures, although there is a slight technical distinction between the two.
PostgreSQL includes built-in support for B+-tree, hash, GiST and GiN indexes. In addition, user-defined index methods can be created, although this is quite an involved process. Indexes in PostgreSQL also support the following features:
WHERE
clause to the end of the CREATE INDEX
statement. This allows a smaller index to be created.Triggers are events triggered by the action of SQL DML statements. For example, an INSERT statement might activate a trigger that checked if the values of the statement were valid. Most triggers are only activated by either INSERT or UPDATE statements.
Triggers are fully supported and can be attached to tables but not to views. Views can have rules, though. Multiple triggers are fired in alphabetical order. In addition to calling functions written in the native PL/PgSQL, triggers can also invoke functions written in other languages like PL/Perl.
PostgreSQL manages concurrency through a system known as Multi-Version Concurrency Control (MVCC), which gives each user a "snapshot" of the database, allowing changes to be made without being visible to other users until a transaction is committed. This largely eliminates the need for read locks, and ensures the database maintains the ACID principles in an efficient manner.
Rules allow the "query tree" of an incoming query to be rewritten. One common usage is to implement views, including updatable views. Rules, or more properly, "Query Re-Write Rules", are attached to a table/class and "Re-Write" the incoming DML (select, insert, update, and/or delete) into one or more queries that either replace the original DML statement or execute in addition to it. Query Re-Write occurs after DML statement parsing, but, before query planning.
A wide variety of native data types are supported, including:
In addition, users can create their own data types which can usually be made fully indexable via PostgreSQL's GiST infrastructure. Examples of these include the geographic information system (GIS) data types from the PostGIS project for PostgreSQL.
New types of almost all objects inside the database can be created, including:
Tables can be set to inherit their characteristics from a "parent" table. Data in child tables will appear to exist in the parent tables, unless data is selected from the parent table using the ONLY keyword, i.e. select * from ONLY PARENT_TABLE. Adding a column in the parent table will cause that column to appear in the child table.
Inheritance can be used to implement table partitioning, using either triggers or rules to direct inserts to the parent table into the proper child tables.
As of 2010[update] this feature is not fully supported yet—in particular, table constraints are not currently inheritable. As of the 8.4 release, all check constraints and not-null constraints on a parent table are automatically inherited by its children. Other types of constraints (unique, primary key, and foreign key constraints) are not inherited.
Inheritance provides a way to map the features of generalization hierarchies depicted in Entity Relationship Diagrams (ERD) directly into the PostgreSQL database.
The currently in beta testing PostgreSQL 9.0 includes built-in binary replication, based on shipping the changes made to all database blocks to other systems asynchronously after commmit. Adding synchronous replication is on the roadmap for the 9.1 release.
9.0 also introduces the ability to run read-only queries against these replicated slaves, where earlier versions would only allow that after promoting them to be a new master. This allows splitting read traffic among multiple nodes efficiently. Earlier replication software that allowed similar read scaling normally relied on adding replication triggers to the master, introducing additional load onto it.
There are several asynchronous master/slave replication packages also available. These remain useful even after introduction of the expanded core capabilities, for situations where binary replication of an entire database cluster isn't the appropriate approach:
There are also proxy (middleware) tools that enable replication, failover or load management and balancing for PostgreSQL:
Some synchronous multi-master derivatives or extensions exist, including
The primary front-end for PostgreSQL is the psql
command-line program, which can be used to enter SQL queries directly, or execute them from a file. In addition, psql provides a number of meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks; for example tab completion of object names and SQL syntax.
pgAdmin is a free and open source graphical front-end administration tool for PostgreSQL, which is supported on most popular computer platforms. The program is available in more than a dozen languages. The first prototype, named pgManager, was written for PostgreSQL 6.3.2 from 1998, and rewritten and released as pgAdmin under the GPL License in later months. The second incarnation (named pgAdmin II) was a complete rewrite, first released on 16 January 2002. The current version is pgAdmin III, which was originally released under the Artistic License and is now released under the same license as PostgreSQL. Unlike prior versions that were written in Visual Basic, pgAdmin III is written in C++, using the wxWidgets framework allowing it to run on most common operating systems.
phpPgAdmin is a web-based administration tool for PostgreSQL written in PHP and based on the popular phpMyAdmin interface originally written for MySQL administration.[10]
A number of companies offer proprietary tools for PostgreSQL. They often consist of a universal core that is adapted for various specific database products. These tools mostly share the administration features with the open source tools but offer improvements in data modeling, importing, exporting or reporting.
Many informal performance studies of PostgreSQL have been done[11]. Performance improvements aimed at improving scalability started heavily with version 8.1, and running simple benchmarks the current version 8.4 has been shown to be at least 7.5 times faster on both read and write workloads compared with version 8.0.[12]
The first industry-standard and peer-validated benchmark was completed in June 2007 using the Sun Java System Application Server (proprietary version of GlassFish) 9.0 Platform Edition, UltraSPARC T1 based Sun Fire server and Postgres 8.2.[13] This result of 778.14 SPECjAppServer2004 JOPS@Standard compares favourably with the 874 JOPS@Standard with Oracle 10 on an Itanium based HP-UX system.[11]
In August 2007, Sun submitted an improved benchmark score of 813.73 SPECjAppServer2004 JOPS@Standard. With the system under test at a reduced price, the price/performance improved from $US 84.98/JOPS to $US 70.57/JOPS.[14]
The default configuration of PostgreSQL only uses a small amount of dedicated memory for performance critical purposes such as caching database blocks and sorting. This limitation is primarily because many operating systems will not allow allocating large blocks of shared memory by default, which means the database requires kernel changes before it can be tuned properly. Advice on basic recommended performance practice covering all recent versions of PostgreSQL is available at Tuning Your PostgreSQL Server.
As of 2008[update] PostgreSQL has received the following awards:[29]
Although the license allowed proprietary products based on Postgres, the code did not develop in the proprietary space at first — somewhat surprisingly considering the advantages Postgres offered. The main offshoot originated when Paula Hawthorn (an original Ingres team member who moved from Ingres) and Michael Stonebraker formed Illustra Information Technologies to make a proprietary product based on Postgres.
In 2000, former Red Hat investors created the company Great Bridge to make a proprietary product based on PostgreSQL and compete against proprietary database vendors. Great Bridge sponsored several PostgreSQL developers and donated many resources back to the community,[30] but by late 2001 closed due to tough competition from companies like Red Hat and to poor market conditions.[31]
In 2001, Command Prompt, Inc. released Mammoth PostgreSQL, a proprietary product based on PostgreSQL. In 2008, Command Prompt, Inc. released the source under the original license. Command Prompt, Inc. continues to support the PostgreSQL community actively through developer sponsorships and projects including PL/Perl, PL/php, and hosting of community projects such as the PostgreSQL Build Farm.
In January 2005, PostgreSQL received backing by database vendor Pervasive Software, known for its Btrieve product which was ubiquitous on the Novell NetWare platform. Pervasive announced commercial support and community participation and achieved some success. In July 2006, Pervasive left the PostgreSQL support market.[32]
In mid-2005 two other companies announced plans to make proprietary products based on PostgreSQL with focus on separate niche markets. EnterpriseDB added functionality to allow applications written to work with Oracle to be more readily run with PostgreSQL. Greenplum contributed enhancements directed at data warehouse and business intelligence applications, including the BizGres project.
In October 2005, John Loiacono, executive vice president of software at Sun Microsystems, commented: "We're not going to OEM Microsoft but we are looking at PostgreSQL right now,"[33] although no specifics were released at that time. By November 2005, Sun had announced support for PostgreSQL.[34] By June 2006, Sun Solaris 10 (6/06 release) shipped with PostgreSQL.
In August 2007, EnterpriseDB announced[35] the Postgres Resource Center [4] and EnterpriseDB Postgres, designed as a fully configured distribution of PostgreSQL including many contrib modules and add-on components. EnterpriseDB Postgres was renamed to Postgres Plus in March 2008. Postgres Plus is available in two versions: Postgres Plus Standard Server which has all the features of PostgreSQL plus additional QA testing, integrated components, tuning and one-click install, and Postgres Plus Advanced Server which has all the features of Postgres Standard Server plus Oracle compatibility, scalability features, and DBA and developer tools. Both versions are available for free and are fully supported, though the free version of Postgres Plus Advanced Server is restricted by a "limited use" license, which is defined as "confined to a single CPU, utilizing 1 GB RAM, storing no more than 6GB of data in a NON-PRODUCTION environment."