Multidimensional database



Depending on the context, the term multidimensional database refers to a data aggregator that combines data from many data sources; a database that offers networks, hierarchies, arrays, and other data formats that are difficult to model in SQL; or a database that gives a high degree of flexibility in the definition of dimensions, units, and unit relationships, regardless of data format.

Multidimensional databases are especially useful in sales and marketing applications that involve time series. Large volumes of sales and inventory data can be stored and later used for logistics and executive planning. For example, data can be more readily segregated by sales region, product, or time period.
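As a minimal sketch of segregating data by sales region, product, or time period, the following Python fragment sums a handful of sales facts along whichever dimensions are kept. The sample figures, the DIMENSIONS tuple, and the roll_up helper are hypothetical and chosen only for illustration; they are not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, period, units_sold).
SALES = [
    ("North", "Widget", "2007-Q1", 120),
    ("North", "Gadget", "2007-Q1", 80),
    ("South", "Widget", "2007-Q1", 200),
    ("South", "Widget", "2007-Q2", 150),
    ("North", "Widget", "2007-Q2", 90),
]

# Names of the coordinate positions in each fact tuple (an assumption).
DIMENSIONS = ("region", "product", "period")


def roll_up(facts, keep):
    """Sum units over every dimension that is not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for *coords, units in facts:
        key = tuple(coords[i] for i in positions)
        totals[key] += units
    return dict(totals)


# Segregate the same facts two different ways.
by_region = roll_up(SALES, ["region"])
by_product_period = roll_up(SALES, ["product", "period"])
```

The same facts answer both questions because each one carries its full set of coordinates; only the choice of dimensions to keep changes.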

Many of the major database vendors have recognized the need for multidimensional data and implemented at least a partial solution, most frequently one that relies on a star schema database design. However, the star design for relational databases can result in "sparse data," or sets of ordered data with large gaps between data entries. Modern database engines use strategies to limit the impact of sparse data sets on query performance, such as compressing large blocks of empty data elements for quicker access, but star databases can still perform worse than the alternatives.
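The effect of sparse data can be illustrated with a small sketch. It lays a few hypothetical facts out over every combination of three dimensions, measures how many cells are actually filled, and then collapses runs of empty cells with simple run-length encoding. Run-length encoding is used here only as a stand-in for the idea of compressing blocks of empty elements; it is an assumption, not a description of how any specific engine works.

```python
from itertools import product

REGIONS = ["North", "South", "East"]
PRODUCTS = ["Widget", "Gadget", "Gizmo"]
PERIODS = ["Q1", "Q2", "Q3", "Q4"]

# Hypothetical facts keyed by (region, product, period).
FACTS = {
    ("North", "Widget", "Q1"): 120,
    ("South", "Widget", "Q2"): 150,
    ("East", "Gizmo", "Q4"): 30,
}


def flatten(facts):
    """List every cell of the cube in a fixed order; empty cells are None."""
    return [facts.get(coords) for coords in product(REGIONS, PRODUCTS, PERIODS)]


def run_length_encode(cells):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for value in cells:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


cells = flatten(FACTS)
density = len(FACTS) / len(cells)  # fraction of cells that hold data
runs = run_length_encode(cells)    # far fewer entries than cells
```

With these sample values only 3 of the 36 cells hold data, and the encoded form needs 5 entries instead of 36.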

The data cube is a conceptual representation of a database, which can be implemented in a variety of ways, including top-down, bottom-up, and array-based approaches. Multidimensional databases are often preferable to relational databases for analysis of time series or other data vectors. However, dimensionality in OLAP databases becomes problematic, because working with more than four dimensions under this model often results in sparse or empty data sets. Attempting to eliminate sparse or empty data is risky because it can destroy the context of the data, and more specifically its vector coordinates.
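The risk of eliminating empty data can be seen in a short sketch. When values are located only by their position in an array, dropping the empty cells shifts the remaining values onto the wrong coordinates. The sample row and variable names below are hypothetical.

```python
PERIODS = ["Q1", "Q2", "Q3", "Q4"]

# Hypothetical units sold for one product; None marks an empty cell.
row = [120, None, None, 90]

# Naive compaction: drop the empty cells and keep only the values.
compacted = [value for value in row if value is not None]

# Reading the compacted list by position now assigns the Q4 value to Q2.
misread = dict(zip(PERIODS, compacted))

# Keeping the coordinate next to each value preserves the context.
coordinate_form = [
    (PERIODS[i], value) for i, value in enumerate(row) if value is not None
]
recovered = dict(coordinate_form)
```

Here misread maps Q2 to 90, while recovered correctly maps Q4 to 90. Storing explicit coordinates is one way to remove empty cells safely, at the cost of storing each coordinate alongside its value.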

This is an active area of database development, in which the set of desired features is somewhat vague, but better-defined than the set of known or proposed solutions. Defining and implementing a database which allows people at each level of an organization to define tables and data formats in the way that is most useful to them, yet which supports a single clear query language and consistent infrastructure, remains an open problem.

Examples

Pick operating system
Vectornova/Vectorstar
Panda Project Parallel Processing Datacubes
OpenQM
RealityX
IBM U2
OLAP versions of many major databases, often queried with MDX (Multidimensional Expressions)
Microsoft Analysis Services
Pilot Software Time Server (time series database server from the 1990s)
Essbase
