Dimension (data warehouse)
From Wikipedia, the free encyclopedia
It has been suggested that this article or section be merged with dimension table. (Discuss) |
This article or section is in need of attention from an expert on the subject. Please help recruit one or improve this article yourself. See the talk page for details. Please consider using {{Expert-subject}} to associate this request with a WikiProject |
- For other senses of dimension, see dimension (disambiguation).
In a data warehouse, a dimension is a data element that categorizes each item in a data set into non-overlapping regions.
Contents |
[edit] Introduction
A data warehouse dimension provides the means to "slice and dice" data in a data warehouse. Dimensions provide structured labeling information to otherwise unordered numeric measures. For example, "Customer", "Date", and "Product" are all dimensions that could be applied meaningfully to a sales receipt. A dimensional data element is similar to a categorical variable in statistics.
The primary function of dimensions is threefold: to provide filtering, grouping and labeling. For example, in a data warehouse where each person is categorized as having a gender of male, female or unknown, a user of the data warehouse would then be able to filter or categorize each presentation or report by either filtering based on the gender dimension or displaying results broken out by the gender.
[edit] Hierarchies
Each dimension in a data warehouse may have one or more hierarchies applied to it. For the "Date" dimension, there are several possible hierarchies: "Day > Month > Year", "Day > Week > Year", "Day > Month > Quarter > Year", etc.
[edit] Role-Playing Dimensions
Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
[edit] Use of ISO representation terms
When referencing data from a metadata registry such as ISO/IEC 11179, representation terms such as Indicator (a boolean true/false value), Code (a set of non-overlapping enumerated values) are typically used as dimensions. For example using the National Information Exchange Model the data element name would be PersonGenderCode and the enumerated values would be male, female and unknown.
[edit] Relationship to other components of a data warehouse
A data warehouse cube is frequently composed of both dimensions or measures. These can then be placed into dimension and fact tables in a relational database.
[edit] See also
- categorical variable
- data warehouse
- dimension table
- degenerate dimension
- fact table
- ISO/IEC 11179
- measure (data warehouse)
- metadata
[edit] References
- Kimball, Ralph et al (1998); The Data Warehouse Lifecycle Toolkit, p17. Pub. Wiley. ISBN 0-471-25547-5.
- Kimball, Ralph (1996); The Data Warehouse Toolkit, p. 100. Pub. Wiley. ISBN 0-471-15337-0.