Dimension (data warehouse)

From Wikipedia, the free encyclopedia

For other senses of dimension, see dimension (disambiguation).

In a data warehouse, a dimension is a data element that categorizes each item in a data set into non-overlapping regions.

A dimensional data element is similar to a categorical variable in statistics: its values are labels for categories, and greater-than or less-than comparisons are not used between categories.

If dimensions are shared between multiple cubes, a great deal of duplicated effort can be avoided. Such shared dimensions are called "conformed dimensions" and are a cornerstone of the Kimball data warehouse methodology.

Example

For example, a data warehouse may have a database of people, where each person is categorized as having a gender of male, female or unknown. A user of the data warehouse could then either filter a report on the gender dimension or display results broken out by gender. Gender could then be an attribute of the person dimension table. Similarly, the attribute fields "day", "week", "month" and "year" could all be members of the Time dimension within a data warehouse.
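The two uses described above, filtering on a dimension and breaking results out by it, can be sketched in a few lines of Python. This is only an illustration: the records, the field names and the helper functions `filter_by_gender` and `count_by_gender` are hypothetical and not part of any particular warehouse product.

```python
from collections import Counter

# Hypothetical person records; names and values are illustrative only.
people = [
    {"name": "Ana", "gender": "female"},
    {"name": "Ben", "gender": "male"},
    {"name": "Cy", "gender": "unknown"},
    {"name": "Dee", "gender": "female"},
]

def filter_by_gender(rows, gender):
    """Keep only the rows whose gender attribute equals the given member."""
    return [row for row in rows if row["gender"] == gender]

def count_by_gender(rows):
    """Break the row count out by each member of the gender dimension."""
    return Counter(row["gender"] for row in rows)

females = filter_by_gender(people, "female")
breakdown = count_by_gender(people)
print([p["name"] for p in females])
print(dict(breakdown))
```

Because every person falls into exactly one of the three categories, the counts in the breakdown always add up to the total number of rows.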

Use of ISO representation terms

When referencing data from a metadata registry such as ISO/IEC 11179, representation terms such as Indicator (a boolean true/false value) and Code (a set of non-overlapping enumerated values) are typically used as dimensions. For example, using the National Information Exchange Model, the data element name would be PersonGenderCode and the enumerated values would be male, female and unknown.
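A Code element of this kind maps naturally onto an enumerated type. The sketch below models the PersonGenderCode values named above as a Python `Enum`. It is not the National Information Exchange Model's own schema, and the `parse_gender_code` helper, including its fallback to the unknown value, is an assumption made for illustration.

```python
from enum import Enum

class PersonGenderCode(Enum):
    """Code representation term: a closed set of non-overlapping values."""
    MALE = "male"
    FEMALE = "female"
    UNKNOWN = "unknown"

def parse_gender_code(raw):
    """Map raw text to a code member; unrecognized input becomes UNKNOWN.

    The fallback is an illustrative assumption, not a rule from any standard.
    """
    try:
        return PersonGenderCode(raw.strip().lower())
    except ValueError:
        return PersonGenderCode.UNKNOWN

print(parse_gender_code(" Female "))
print(parse_gender_code("n/a"))
```

Restricting the attribute to a fixed enumeration keeps the categories non-overlapping, which is what lets the element serve as a dimension.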

Relationship to other components of a data warehouse

A data warehouse cube is frequently composed of both dimensions and measures. In a relational database, these can be placed into dimension tables and fact tables respectively.
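As a rough illustration of that layout, the following sketch uses Python's built-in `sqlite3` module to build one dimension table and one fact table in memory, then totals a measure by a dimension attribute. The table names `person_dim` and `purchase_fact`, the columns and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per person, with descriptive attributes.
    CREATE TABLE person_dim (
        person_key INTEGER PRIMARY KEY,
        gender     TEXT NOT NULL
    );
    -- Fact table: one row per event, holding a measure and a dimension key.
    CREATE TABLE purchase_fact (
        person_key INTEGER NOT NULL REFERENCES person_dim(person_key),
        amount     REAL NOT NULL
    );
    INSERT INTO person_dim VALUES (1, 'female'), (2, 'male'), (3, 'unknown');
    INSERT INTO purchase_fact VALUES (1, 10.0), (1, 5.0), (2, 7.5), (3, 2.5);
""")

# Aggregate the measure (amount) by a dimension attribute (gender).
totals = conn.execute("""
    SELECT d.gender, SUM(f.amount)
    FROM purchase_fact AS f
    JOIN person_dim AS d ON d.person_key = f.person_key
    GROUP BY d.gender
    ORDER BY d.gender
""").fetchall()
print(totals)
conn.close()
```

The fact table stores only keys and numeric measures, while the descriptive attribute used for grouping lives in the dimension table and is reached through the join.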
