Partial index

From Wikipedia, the free encyclopedia

A partial index is a database index defined with a condition (a predicate), so that it includes only the rows of the table that satisfy that condition.

Because rows that fail the condition are never indexed, the index can remain small even when the table is very large, and it can be highly selective.

Suppose a transaction table holds entries that start out with STATUS = 'A' (active) and may then pass through other statuses ('P' for pending, 'W' for "being worked on") before reaching a final status, 'F', after which an entry is unlikely to be processed again.

A useful partial index might be defined as:

  create index partial_status on txn_table (status) where status in ('A', 'P', 'W');

This index would not store any of the millions of rows that have reached the final status, 'F', so queries looking for transactions that still need work can search it efficiently.
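The example can be tried out with a minimal sketch like the following, which assumes SQLite (through Python's standard sqlite3 module) as a stand-in database; SQLite supports partial indexes from version 3.8.0. The table layout and the sample rows are invented for illustration and are not part of the original example.

```python
import sqlite3

# In-memory SQLite database; the txn_table schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("create table txn_table (id integer primary key, status text not null)")

# Made-up data: 1000 finished rows, then four rows that still need work.
rows = [("F",)] * 1000 + [("A",), ("P",), ("W",), ("A",)]
conn.executemany("insert into txn_table (status) values (?)", rows)

# The partial index from the article: only 'A', 'P' and 'W' rows are indexed.
conn.execute(
    "create index partial_status on txn_table (status) "
    "where status in ('A', 'P', 'W')"
)

# Repeating the index condition in the query lets the planner consider the index.
query = "select id, status from txn_table where status in ('A', 'P', 'W') order by id"
open_txns = conn.execute(query).fetchall()

# The plan text differs between SQLite versions, so it is printed, not assumed.
for step in conn.execute("explain query plan " + query):
    print(step)
print(open_txns)
```

Whether the planner actually chooses the partial index depends on the database system and on table statistics, so the plan should be inspected rather than taken for granted.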

Similarly, a partial index can be used to index only those rows where a column is not null, which is beneficial when the column is usually null.

  create index partial_object_update on object_table (updated_on) where updated_on is not null;

This index would allow the following query to read only the updated tuples:

  select * from object_table where updated_on is not null order by updated_on;
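A minimal sketch of this case, again assuming SQLite through Python's sqlite3 module rather than any particular production database, might look as follows. The column types and the sample dates are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the object_table schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("create table object_table (id integer primary key, updated_on text)")

# Made-up data: 500 rows that were never updated, followed by two that were.
rows = [(None,)] * 500 + [("2024-03-01",), ("2024-01-15",)]
conn.executemany("insert into object_table (updated_on) values (?)", rows)

# The partial index from the article: rows with a null updated_on are left out.
conn.execute(
    "create index partial_object_update on object_table (updated_on) "
    "where updated_on is not null"
)

# The query repeats the index condition and orders by the indexed column.
updated = conn.execute(
    "select id, updated_on from object_table "
    "where updated_on is not null order by updated_on"
).fetchall()
print(updated)
```

Only the two updated rows come back, oldest first; the 500 null rows are neither indexed nor returned.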

The index condition does not have to involve the indexed column; Stonebraker's paper below presents a number of examples with indexes similar to the following:

  create index partial_salary on employee(age) where salary > 2100;
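The sketch below illustrates one consequence of such an index, assuming SQLite through Python's sqlite3 module and an invented employee table: a query can only be answered from the index if its own condition restricts it to rows with salary > 2100, because other rows are not in the index at all. A query on age alone still returns correct results, but the database has to find them some other way.

```python
import sqlite3

# In-memory SQLite database; the employee schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table employee (name text, age integer, salary integer)")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [("Ann", 30, 1800), ("Bob", 45, 2500), ("Cho", 45, 2000), ("Dee", 52, 3100)],
)

# The partial index from the article: ages of employees earning more than 2100.
conn.execute("create index partial_salary on employee (age) where salary > 2100")

# This query repeats the index condition, so the index covers every matching row.
well_paid_45 = conn.execute(
    "select name from employee where salary > 2100 and age = 45"
).fetchall()

# This query has no salary condition; Cho is missing from the index,
# so the index alone cannot answer it.
all_45 = conn.execute(
    "select name from employee where age = 45 order by name"
).fetchall()

print(well_paid_45)
print(all_45)
```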

Partial indexes have been available in PostgreSQL since version 7.2.

External links