Cubes (OLAP server)
Original author(s) | Stefan Urbanek[1] |
---|---|
Initial release | March 27, 2011 |
Stable release | 1.0.1 / 26 March 2015 |
Development status | Active |
Written in | Python |
Operating system | Cross-platform |
Type | OLAP |
License | MIT License[2] |
Website | cubes |
Cubes is a light-weight, open source multidimensional modelling and OLAP toolkit, written in the Python programming language, for developing reporting applications and browsing aggregated data. It is released under the MIT License.
Cubes aims to give an analyst or any application end-user an "understandable and natural way of reporting using concept of data Cubes – multidimensional data objects".
Cubes was first publicly released in March 2011. The project was originally developed for Public Procurements of Slovakia.[3] Cubes 1.0 was released in September 2014 and presented at the PyData Conference in New York.[4]
Features
- OLAP and aggregated browsing (default is ROLAP)
- logical model of OLAP cubes in JSON or provided from external sources
- hierarchical dimensions (attributes that have hierarchical dependencies, such as category-subcategory or country-region)
- multiple hierarchies in a dimension
- arithmetic expressions for computing derived measures and aggregates
- localizable metadata and data
Model
The logical (conceptual) model in Cubes is described using JSON and can be provided as a file, as a directory bundle, or by an external model provider (for example, a database). The basic model objects are cubes with their measures and aggregates, and dimensions with their attributes and hierarchies. The logical model also contains a mapping from logical attributes to their physical location in a database (or other data source).
Example model:
    {
        "cubes": [
            {
                "name": "sales",
                "label": "Our Sales",
                "dimensions": [ "date", "customer", "location", "product" ],
                "measures": [ "amount" ]
            }
        ],
        "dimensions": [
            {
                "name": "product",
                "label": "Product",
                "levels": [
                    {
                        "name": "category",
                        "label": "Category",
                        "attributes": [ "category_id", "category_label" ]
                    },
                    {
                        "name": "product",
                        "label": "Product",
                        "attributes": [ "product_id", "product_label" ]
                    }
                ]
            },
            ...
        ]
    }
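To illustrate how a client might read such a model, the following minimal sketch parses a model of this shape with Python's standard `json` module and lists the levels of each dimension, from coarsest to finest. It does not use the Cubes library. The helper name `dimension_levels` and the trimmed model string (with the elided dimensions left out) are assumptions made for this example.

```python
import json

# Trimmed, valid-JSON version of the example model; the elided
# dimensions are simply omitted (assumption for this sketch).
MODEL_JSON = """
{
    "cubes": [
        {
            "name": "sales",
            "label": "Our Sales",
            "dimensions": ["date", "customer", "location", "product"],
            "measures": ["amount"]
        }
    ],
    "dimensions": [
        {
            "name": "product",
            "label": "Product",
            "levels": [
                {"name": "category", "label": "Category",
                 "attributes": ["category_id", "category_label"]},
                {"name": "product", "label": "Product",
                 "attributes": ["product_id", "product_label"]}
            ]
        }
    ]
}
"""


def dimension_levels(model):
    """Map each dimension name to its ordered list of level names.

    Hypothetical helper, not part of the Cubes API.
    """
    return {
        dim["name"]: [level["name"] for level in dim.get("levels", [])]
        for dim in model.get("dimensions", [])
    }


model = json.loads(MODEL_JSON)
levels = dimension_levels(model)
print(levels)
```

The order of the levels matters: it describes the path along which a report can drill down, here from a product category to an individual product.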
Operations
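Cubes provides a basic set of operations such as drilling down into data and filtering (slicing and dicing). The operations can be accessed either through the Python interface or through a light-weight web server called Slicer.
Example of the Python interface:
    from cubes import Workspace

    # Create a workspace from the slicer.ini configuration file
    workspace = Workspace("slicer.ini")

    # Get an aggregation browser for the "sales" cube
    browser = workspace.browser("sales")

    # Aggregate the whole cube and print the summary
    result = browser.aggregate()
    print(result.summary)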
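The operations named above can be illustrated without a database. The following minimal sketch performs a slice (filtering on one dimension value) and a drill-down (grouping an aggregate by a dimension) over a few in-memory fact rows. It is a conceptual illustration only and does not use the Cubes API. The data and the function names `slice_facts` and `aggregate` are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows; in a ROLAP setup these would be rows of a fact table.
FACTS = [
    {"year": 2012, "category": "tools", "amount": 100.0},
    {"year": 2012, "category": "toys", "amount": 50.0},
    {"year": 2013, "category": "tools", "amount": 70.0},
    {"year": 2013, "category": "toys", "amount": 30.0},
]


def slice_facts(facts, **cuts):
    """Slice: keep only the facts whose dimension values match every cut."""
    return [
        fact for fact in facts
        if all(fact[dim] == value for dim, value in cuts.items())
    ]


def aggregate(facts, drilldown=None):
    """Sum `amount` over the facts, optionally grouped by one dimension."""
    summary = sum(fact["amount"] for fact in facts)
    cells = defaultdict(float)
    if drilldown is not None:
        for fact in facts:
            cells[fact[drilldown]] += fact["amount"]
    return summary, dict(cells)


# Whole cube, drilled down by year.
total, by_year = aggregate(FACTS, drilldown="year")

# Slice on category "tools", then drill down by year.
tools_total, tools_by_year = aggregate(
    slice_facts(FACTS, category="tools"), drilldown="year"
)

print(total, by_year)
print(tools_total, tools_by_year)
```

Dicing corresponds to passing more than one cut to `slice_facts`, for example both a year and a category.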
Server
Cubes provides a non-traditional OLAP server that accepts queries over HTTP and returns responses as JSON. Example query to get "total amount of all contracts between January 2012 and June 2016 by month":
The response looks like:
    {
        "summary": {
            "contract_amount_sum": 10000000.0
        },
        "remainder": {},
        "cells": [
            {
                "date.year": 2012,
                "criteria.code": "ekonaj",
                "contract_amount_sum": 12345.0,
                "criteria.description": "economically best offer",
                "criteria.sdesc": "best offer",
                "criteria.id": 3
            },
            {
                "date.year": 2012,
                "criteria.code": "cena",
                "contract_amount_sum": 23456.0,
                "criteria.description": "lowest price",
                "criteria.sdesc": "lowest price",
                "criteria.id": 4
            },
            ...
        ],
        "total_cell_count": 6,
        "aggregates": [ "contract_amount_sum" ],
        "cell": [
            {
                "type": "range",
                "dimension": "date",
                "hierarchy": "default",
                "level_depth": 2,
                "invert": false,
                "hidden": false,
                "from": [ "2012", "1" ],
                "to": [ "2015", "6" ]
            }
        ],
        "levels": {
            "criteria": [ "criteria" ],
            "date": [ "year" ]
        }
    }
The simple HTTP/JSON interface makes it very easy to integrate OLAP reports in web applications written in pure HTML and JavaScript.
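As a rough illustration of that integration, the sketch below builds an aggregation request URL and reads a response of the shape shown above, using only the Python standard library. Several details are assumptions rather than facts from this article: the server address `http://localhost:5000`, the cube name `contracts`, the `/cube/<name>/aggregate` path, the `cut` and `drilldown` parameter syntax, and the helper name `aggregate_url`. The sample response is a shortened copy of the example response, so no running server is needed.

```python
import json
from urllib.parse import urlencode


def aggregate_url(base_url, cube, cut=None, drilldown=None):
    """Build an aggregation URL (hypothetical helper; path and parameter
    names are assumptions about the Slicer HTTP interface)."""
    params = []
    if cut:
        params.append(("cut", cut))
    if drilldown:
        params.append(("drilldown", drilldown))
    url = f"{base_url}/cube/{cube}/aggregate"
    # Keep ":" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe=":,")
    return f"{url}?{query}" if query else url


# Shortened copy of the example response, with only two cells.
SAMPLE_RESPONSE = """
{
    "summary": {"contract_amount_sum": 10000000.0},
    "cells": [
        {"date.year": 2012, "criteria.code": "ekonaj", "contract_amount_sum": 12345.0},
        {"date.year": 2012, "criteria.code": "cena", "contract_amount_sum": 23456.0}
    ],
    "aggregates": ["contract_amount_sum"]
}
"""

url = aggregate_url(
    "http://localhost:5000", "contracts",
    cut="date:2012,1-2015,6", drilldown="date:month",
)
result = json.loads(SAMPLE_RESPONSE)

# One (criteria code, aggregated amount) pair per returned cell.
rows = [(c["criteria.code"], c["contract_amount_sum"]) for c in result["cells"]]

print(url)
print(result["summary"]["contract_amount_sum"], rows)
```

A browser-based application would issue the same request from JavaScript and render the `cells` list as a table or chart.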
The Slicer server contains endpoints that describe the cube metadata, which helps in creating generic reporting applications[5] that do not need to know the database model structure and conceptual hierarchies up front.
The Slicer server is written using the Flask web framework.
ROLAP and SQL
The built-in SQL backend of the framework provides ROLAP functionality on top of a relational database. Cubes contains an SQL query generator that translates reporting queries into SQL statements. The query generator takes into account the topology of the star or snowflake schema and executes only the joins that are necessary to retrieve the attributes required by the data analyst.
The SQL backend uses the SQLAlchemy Python toolkit to construct the queries.
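The idea of joining only the necessary tables can be sketched in a few lines. The example below is not the Cubes query generator: it is a simplified illustration that builds SQL strings by hand and runs them against an in-memory SQLite database. The schema, the data, and the names `FACT_TABLE`, `JOINS` and `build_query` are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: one fact table plus the join for each dimension table.
FACT_TABLE = "sales"
JOINS = {
    "product": "JOIN product ON product.id = sales.product_id",
    "customer": "JOIN customer ON customer.id = sales.customer_id",
}


def build_query(group_by):
    """Build an aggregate SELECT that joins only the tables `group_by` needs.

    `group_by` is a list of "table.column" attribute references.
    """
    needed = []
    for ref in group_by:
        table = ref.split(".")[0]
        if table != FACT_TABLE and table not in needed:
            needed.append(table)
    columns = ", ".join(group_by + ["SUM(sales.amount) AS amount_sum"])
    sql = f"SELECT {columns} FROM {FACT_TABLE}"
    for table in needed:
        sql += " " + JOINS[table]
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
        sql += " ORDER BY " + ", ".join(group_by)
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE sales (product_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'tools'), (2, 'toys');
    INSERT INTO customer VALUES (1, 'SK'), (2, 'US');
    INSERT INTO sales VALUES (1, 1, 100.0), (2, 1, 50.0), (1, 2, 70.0);
""")

# Grouping by product category needs the product table but not customer.
query = build_query(["product.category"])
rows = conn.execute(query).fetchall()

# With no drill-down, no dimension table is joined at all.
total_query = build_query([])
total_rows = conn.execute(total_query).fetchall()

print(query)
print(rows)
print(total_query)
print(total_rows)
```

In this sketch the `customer` table never appears in a query unless one of its attributes is requested, which is the behaviour the paragraph above describes for star schemas. A snowflake schema would additionally require following chains of joins between dimension tables.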
References
External links
- Home page
- Documentation
- Source code at GitHub