Concept-oriented model

From Wikipedia, the free encyclopedia


The concept-oriented data model is a data model based on lattice theory and ordered sets. Another source of inspiration for the model is formal concept analysis (FCA). One of the main ideas underlying the concept-oriented approach is that the model has to be hierarchical and multidimensional at the same time. It also overlaps with the Functional Data Model (FDM) and the Universal Relation Model (URM).

The fundamental principle of the concept-oriented paradigm is that objects live in a space, where the structure of the space describes the model syntax, or schema, while the structure of the objects represents its semantics. Elementary parts of the space are referred to as concepts, while objects, which are concept instances, are called data items. Concepts are analogous to relations or tables, while items are analogous to rows or records in the relational model.

The concept-oriented model allows the database designer to describe a natural representation of data syntax and semantics that reflects both hierarchical and multidimensional properties. Proceeding from only a small number of basic notions and principles, this approach allows modeling a variety of existing methods and practical use cases, such as multi-valued variables, higher-level relationships, grouping and aggregation, online analytical processing (OLAP), inference, lifecycle management, complex categorization, ontologies, knowledge sharing and many other mechanisms.

1 Model syntax
2 Model semantics
3 Example
4 See also
5 External links

Model syntax

At the syntactic level, each concept is defined as a combination of its superconcepts. As a consequence, a subconcept is included in each of its superconcepts simultaneously. Formally, the model syntax, or schema, is complemented by one top concept and one bottom concept, and the resulting structure constitutes a lattice. The top concept is a direct or indirect parent of every other concept in the model, while the bottom concept directly or indirectly includes every other concept in the model.

Alternatively, the concept-oriented syntax (schema) can be described in the conventional terms of dimensions and domains. Each superconcept in the definition of a concept is a domain for the dimension associated with that subconcept-superconcept pair. A dimension normally has a unique name within the scope of its concept. Thus each concept is defined as a set of dimension names with their domains in other concepts. The database schema can then be represented as an acyclic graph where nodes are concepts and edges are dimensions leading from a concept to its domains in superconcepts. Dual to a dimension is the notion of an inverse dimension, which is thought of as a characteristic or attribute taking values from some subconcept (rather than a superconcept). Importantly, dimensions are single-valued while inverse dimensions are multi-valued.
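The graph view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation of the model: the dictionary layout and the helper names inverse_dimensions and is_acyclic are hypothetical, the concept and dimension names are borrowed from the example later in this article, and the dimensions given to Customers and Products are guesses.

```python
from collections import defaultdict

# Hypothetical schema: concept -> {dimension name: domain (a superconcept)}.
# Each dimension is single-valued, so one name maps to exactly one domain.
# The entries for Customers and Products are assumptions for illustration.
SCHEMA = {
    "Addresses": {},
    "Customers": {"a": "Addresses"},
    "Products": {},
    "Orders": {"a": "Addresses", "c": "Customers"},
    "OrderParts": {"o": "Orders", "p": "Products"},
    "OrderOperations": {"o": "Orders"},
}


def inverse_dimensions(schema):
    """Map each concept to the (subconcept, dimension) pairs that point at it.

    The result is multi-valued: one concept can be the domain of many
    dimensions defined in different subconcepts.
    """
    inverse = defaultdict(list)
    for concept, dims in schema.items():
        for name, domain in dims.items():
            inverse[domain].append((concept, name))
    return dict(inverse)


def is_acyclic(schema):
    """Depth-first check that following dimensions never returns to a concept."""
    state = {}  # concept -> "visiting" or "done"

    def visit(concept):
        if state.get(concept) == "done":
            return True
        if state.get(concept) == "visiting":
            return False  # reached a concept that is still on the current path
        state[concept] = "visiting"
        ok = all(visit(domain) for domain in schema[concept].values())
        state[concept] = "done"
        return ok

    return all(visit(concept) for concept in schema)


print(inverse_dimensions(SCHEMA)["Orders"])
print(is_acyclic(SCHEMA))
```

Storing only the single-valued dimensions and deriving the inverse dimensions on demand mirrors the duality described in the text: the inverse direction is computed from the schema and never declared separately.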

Model semantics

At the semantic level the data model is represented by its items. An item is defined as a combination of superitems taken from the superconcepts. The richness of the concept-oriented model is based on the existence of very different interpretations of its formal semantics:

A superitem can be interpreted as a characteristic of an item, taken by the corresponding attribute, or as a coordinate of the item in the space of the superconcept. In this case the whole model can be viewed as a hierarchical multidimensional coordinate system where objects are coordinates for other objects.

Superitems can also be interpreted as sets, groups or categories for their subitems. Thus each item in the model is included in several group superitems and itself includes its subitems.

Each item is an instance of some relation with respect to its superitems, and, on the other hand, its subitems relate it to other items in the model.

A superitem can be interpreted as a base object for its subitems, and, on the other hand, subitems are extensions of their superitems.

Example

The diagram describes the syntactic structure of a company that receives orders (concept Orders), each consisting of a set of parts from concept OrderParts, and then executes them in several operations (concept OrderOperations).

Syntactically, the Orders concept is characterized by two dimensions, a and c, with domains in Addresses and Customers. It also has two inverse dimensions, {OrderParts.o} and {OrderOperations.o}, with domains in the concepts OrderParts and OrderOperations. Notice that dimensions are always single-valued and correspond to many-to-one relationships. Inverse dimensions are multi-valued and correspond to one-to-many relationships.

Many-to-many relationships are implemented via common subconcepts. For example, we might define a many-to-many relationship isOrderedBy between Products and Customers, which returns the set of customers that ordered some product, using the subconcepts OrderParts and Orders to implement it.
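A minimal sketch of how isOrderedBy could be evaluated, assuming items are stored as plain Python dictionaries keyed by dimension name. The sample data, the identifiers (o1, alice, bolt and so on) and the function name is_ordered_by are hypothetical; only the dimension names o, p and c come from the example.

```python
# Hypothetical Orders items: each stores one customer under dimension c.
orders = {
    "o1": {"c": "alice"},
    "o2": {"c": "bob"},
    "o3": {"c": "alice"},
}

# Hypothetical OrderParts items: each stores one order (o) and one product (p).
order_parts = [
    {"o": "o1", "p": "bolt"},
    {"o": "o2", "p": "bolt"},
    {"o": "o3", "p": "nut"},
]


def is_ordered_by(product):
    """Return the set of customers reached from a product.

    The path goes down from the product to the OrderParts items that
    reference it (an inverse dimension), then up along o to Orders and
    along c to Customers.
    """
    return {orders[part["o"]]["c"] for part in order_parts if part["p"] == product}


print(is_ordered_by("bolt"))
```

The common subconcept OrderParts plays the role that a junction table plays in a relational schema: it is the only place where a product and an order (and hence a customer) meet.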

The primitive, or canonical, dimensionality of this model is 7 because that is the number of paths from the bottom to the top: {op.p.pg, op.p.a, op.o.a, op.o.c.a, oo.o.a, oo.o.c.a, oo.o}.

Semantically each order item consists of either a set of its parts or a set of operations to be executed for this order. Dually, it is a combination of one (delivery) address and one customer. In the canonical form each item could be represented as a combination of 7 primitive items.

Order items are interpreted as relation instances with respect to their superitems from the concepts Addresses and Customers. On the other hand, these same Order items are connected with other items in the model by means of subitems from OrderParts and OrderOperations, which are themselves interpreted as relation instances. Thus the role of order items is relative and depends on the level currently considered.

The top concept represents the problem domain at the most abstract level, with no details at all. By propagating information in the upward direction we get aggregated values of all properties for the whole company. For example, we might compute the total number of orders or order operations executed in the company by using the corresponding inverse dimensions of the top concept. An inverse dimension is a path followed in the opposite direction. The total number of order parts is computed as follows: orderCount = sum(top.{OrderParts.o.a.t}). Here top is the only top item, representing the whole company, o.a.t is the path from OrderParts to this item, and sum is the aggregation function. In fact, the inverse dimension here returns all order parts. If we need to count all order parts delivered to some address, this can be done as follows: orderCount = sum(address.{OrderParts.o.a}). Here address is some concrete item.
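The two aggregations above can be imitated in Python as a rough sketch. Everything here is an assumption made for illustration: the dictionary storage, the sample items, and the helper names inverse_path and count_order_parts. The sketch reads sum as adding 1 for every order part returned by the inverse dimension, which is one plausible reading of the expressions in the text.

```python
# Hypothetical Orders items: each has one delivery address under dimension a.
orders = {
    "o1": {"a": "addr1"},
    "o2": {"a": "addr2"},
}

# Hypothetical OrderParts items: each belongs to one order under dimension o.
order_parts = {
    "op1": {"o": "o1"},
    "op2": {"o": "o1"},
    "op3": {"o": "o2"},
}


def inverse_path(address):
    """Imitate address.{OrderParts.o.a}: order parts whose path o.a ends at address."""
    return [
        part_id
        for part_id, part in order_parts.items()
        if orders[part["o"]]["a"] == address
    ]


def count_order_parts(address=None):
    """Count order parts for one address, or for the whole company if none is given.

    With no address the function imitates top.{OrderParts.o.a.t}: every
    path ends at the single top item, so every order part is counted.
    """
    if address is None:
        return sum(1 for _ in order_parts)
    return sum(1 for _ in inverse_path(address))


print(count_order_parts())         # whole company
print(count_order_parts("addr1"))  # one delivery address
```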

The bottom concept represents the problem domain at the most detailed level and is equal to the sum of all concepts with no subconcepts. Normally this level includes many items, for which we can obtain many properties by using their dimensions. For example, an order part is characterized by one order and one product, which in turn are characterized by their higher-level properties, and so on up to the top concept.