Multidimensional Expressions

From Wikipedia, the free encyclopedia

Multidimensional Expressions (MDX) is a query language for OLAP databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas.

Contents

[edit] Background

The MultiDimensional eXpressions (MDX) language provides a rich and powerful syntax for querying and manipulating the multidimensional data stored in OLAP cubes.[1] While it is possible to translate some of these into traditional SQL, it would frequently require the synthesis of clumsy SQL expressions even for very simple MDX expressions. MDX has been embraced by wide majority of OLAP vendors and became de-facto standard for OLAP systems.

[edit] History

MDX was first introduced as part of the OLE DB for OLAP (ODBO) specification in 1997 from Microsoft, with Mosha Pasumansky being one of the architects of the language. The specification was quickly followed by commercial release of Microsoft OLAP Services 7.0 in 1998 and later by Microsoft Analysis Services. The latest version of the OLE DB for OLAP (ODBO) specification was issued by Microsoft in 1999.

While it was not an open standard, but rather a Microsoft owned specification, it was adopted by the wide range of OLAP vendors. This included both vendors on the server side such as Applix, Microstrategy, SAS, SAP, Whitelight, NCR and vendors on the client side such as Panorama Software, Proclarity, AppSource, Cognos, Business Objects, Brio Technology, Crystal Reports, Microsoft Excel, Microsoft Reporting Services, etc.

With the invention of XML for Analysis, which standardized MDX as a query language, even more companies- such as Hyperion Solutions- began supporting MDX.

The XML for Analysis specification referred back to the OLE DB for OLAP (ODBO) specification for details on the MDX Query Language. In Analysis Services 2005, Microsoft has added some MDX Query Language extensions like subselects. Products like Microsoft Excel 2007 have started to use these new MDX Query Language extensions.

[edit] mdXML

In 2001 the XMLA Council released the XML for Analysis standard, which included mdXML as a query language. In the current XMLA 1.1 specification, mdXML is essentially MDX wrapped in the XML <Statement> tag.

[edit] MDX data types

There are six primary data types in MDX

[edit] Scalar

Scalar is either a number or a string. It can be specified as a literal, e.g. number 5 or string "OLAP" or it can be returned by an MDX function, e.g. Aggregate (number), UniqueName (string), .Value (number or string) etc.

[edit] Dimension

Dimension is a dimension of a cube. A dimension is a primary organizer of measure and attribute information in a cube. MDX does not know of, nor does it assume any, dependencies between dimensions- they are assumed to be mutually independent. A dimension will contain some members (see below) organized in some hierarchy or hierarchies containing levels. It can be specified by its unique name, e.g. [Time] or it can be returned by an MDX function, e.g. .Dimension.

Hierarchy is a dimension hierarchy of a cube. It can be specified by its unique name, e.g. [Time].[Fiscal] or it can be returned by an MDX function, e.g. .Hierarchy. Hierarchies are contained within dimensions.

OLEDB for OLAP MDX specification does not distinguish between dimension and hierarchy data types. Some implementations, such as Microsoft Analysis Services treat them differently.

[edit] Level

Level is a level in a dimension hierarchy. It can be specified by its unique name, e.g. [Time].[Fiscal].[Month] or it can be returned by an MDX function, e.g. .Level.

[edit] Member

Member is a member in a dimension hierarchy. It can be specified by its unique name, e.g. [Time].[Fiscal].[Month].[August], by qualified name, e.g. [Time].[Fiscal].[2006].[Q2].[August] or returned by an MDX function, e.g. .PrevMember, .Parent, .FirstChild etc.

Note that all members are specific to a hierarchy. If the self-same product is a member of two different hierarchy ([Product].[ByManufacturer] and [Product].[ByCategory]), there will be two different members visible that may need to be coordinated in sets and tuples (see below).

[edit] Tuple

Tuple is an ordered collection of one or more members from different dimensions. Tuples can be specified enumerating the members, e.g. ([Time].[Fiscal].[Month].[August], [Customer].[By Geography].[All Customers].[USA], [Measures].[Sales]) or returned by an MDX function, e.g. .Item.

[edit] Set

Set is an ordered collection of tuples with the same dimensionality, or hierarchality in the case of Microsoft's implementation. It can be specified enumerating the tuples, e.g. {([Measures].[Sales], [Time].[Fiscal].[2006]), ([Measures].[Sales], [Time].[Fiscal].[2007])} or returned by MDX function or operator, e.g. Crossjoin, Filter, Order, Descendants etc.

[edit] Other data types

Member properties are equivalent to attributes in the data warehouse sense. They can be retrieved by name in a query through an axis PROPERTIES clause of a query. The scalar data value of a member property for some member can be accessed in an expression through MDX, either by naming the property (for example, [Product].CurrentMember.[Sales Price]) or by using a special access function (for example, [Product].CurrentMember.Properties("Sales Price")).

In limited contexts, MDX allows other data types as well - for example Array can be used inside the SetToArray function to specify an array that is not processed by MDX but passed to a user-defined function in an ActiveX library. Objects of other data types are represented as scalar strings indicating the object names, such as measure group name in Microsoft's MeasureGroupMeasures function or KPI name in for example Microsoft's KPIValue or KPIGoal functions.

[edit] Example Query

The following example, adapted from the SQL Server 2000 Books Online, shows a basic MDX query that uses the SELECT statement. This query returns a result set that contains the 2002 and 2003 store sales amounts for stores in the state of California.

SELECT
   { [Measures].[Store Sales] } ON COLUMNS,
   { [Date].[2002], [Date].[2003] } ON ROWS
FROM Sales
WHERE ( [Store].[USA].[CA] )

In this example, the query defines the following result set information:

  • The SELECT clause sets the query axes as the Store Sales Amount member of the Measures dimension, and the 2002 and 2003 members of the Date dimension.
  • The FROM clause indicates that the data source is the Sales cube.
  • The WHERE clause defines the "slicer axis" as the California member of the Store dimension.

Note: You can specify up to 128 query axes in an MDX query.

[edit] References

[edit] External references

  • George Spofford, Sivakumar Harinath, Chris Webb, Dylan Hai Huang, Francesco Civardi: MDX-Solutions: With Microsoft SQL Server Analysis Services 2005 and Hyperion Essbase. Wiley, 2006, ISBN 0-471-74808-0
  • Mosha Pasumansky, Mark Whitehorn, Rob Zare: Fast Track to MDX. ISBN 1-84628-174-1