ADO.NET Entity Framework

From Wikipedia, the free encyclopedia

The .NET Framework stack.

The ADO.NET Entity Framework, part of the ADO.NET components of the .NET Framework, is a framework from Microsoft for providing services on data models. Although object-relational mapping is an important part of it, it aims to provide further services such as query, view and reporting services. It is geared towards resolving the mismatch between the format in which data is stored in a database and the format in which it is consumed in an object-oriented programming language or other front ends.

ADO.NET Entity Framework is planned to be included with .NET Framework 3.5 Service Pack 1 and Visual Studio 2008 Service Pack 1, but no release date has been announced.[1] The release will also include the capability of executing LINQ queries against ADO.NET Entity Framework entities.

Overview

Background

ADO.NET Entity Framework abstracts the relational (logical) schema of the data that is stored in a database and presents its conceptual schema to the application. For example, in the database, entries about a customer and their information can be stored in the Customers table, their orders in the Orders table and their contact information in yet another Contacts table. For an application to deal with this database, it has to know which information is in which table, i.e., the relational schema of the data is hardcoded into the application.

The disadvantage of this approach is that the application is not shielded from changes to the schema. The application also has to perform SQL joins to traverse the relationships between data elements in order to find related data. For example, to find the orders of a certain customer, the customer needs to be selected from the Customers table, joined with the Orders table, and then projected to remove unwanted columns.

This model of traversing relationships between items is very different from the model used in object-oriented programming languages, where the relationships an object participates in are exposed as properties of the object, and accessing a property traverses the relationship. In addition, SQL queries are expressed as strings that only the database processes, which keeps the programming language from making any guarantees about the operation and from providing compile-time type information.
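
The contrast between the two models can be sketched in Python with the standard `sqlite3` module. This is only an illustration, not Entity Framework code: the table layout, the column names and the `Customer` and `Order` classes are assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass, field

# Hypothetical relational schema, loosely following the Customers/Orders example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Item TEXT);
    INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Orders VALUES (10, 1, 'Book'), (11, 1, 'Pen'), (12, 2, 'Lamp');
""")

# Relational style: the application hardcodes table names and the join,
# and the query is an opaque string as far as the language is concerned.
rows = conn.execute(
    "SELECT o.Item FROM Customers AS c "
    "JOIN Orders AS o ON o.CustomerId = c.Id "
    "WHERE c.Name = ? ORDER BY o.Id",
    ("Alice",),
).fetchall()
items_via_join = [item for (item,) in rows]


# Object-oriented style: the relationship is a property of the object.
@dataclass
class Order:
    item: str


@dataclass
class Customer:
    name: str
    orders: list = field(default_factory=list)


alice = Customer("Alice", [Order("Book"), Order("Pen")])
items_via_property = [order.item for order in alice.orders]

print(items_via_join)
print(items_via_property)
```

Both approaches yield the same items; the difference is that the second one never mentions tables, keys or joins.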

Mapping the logical schema onto the physical schema, which defines how the data is structured and stored on disk, is the job of the database system. Client-side data access mechanisms are shielded from the physical schema because the database exposes the data in the way specified by its logical schema.

Mapping

ADO.NET Entity Framework introduces a mapping between the database schema and the conceptual schema. The relational schema is composed of tables, with primary and foreign keys gluing related tables together. In contrast, Entity Types define the conceptual schema of the data.

An entity type is an aggregation of multiple typed fields. Each field maps to a certain column in the database, and a single entity type can contain information from multiple physical tables. Entity types can be related to each other, independent of the relationships in the physical schema. Related entities are exposed in a similar way: via a field whose name denotes the relationship they participate in. Accessing such a field, instead of retrieving the value of some column in the database, traverses the relationship and returns the related entity (or collection of entities).

Entity Types form the class of objects that entities conform to; Entities are instances of entity types. Entities represent individual objects that form part of the problem being solved by the application, and are indexed by a key. For example, converting the relational schema described above yields two entity types:

  • CustomerEntity, which contains the customer's name from the Customers table and address from the Contacts table.
  • OrderEntity, which encapsulates the orders of a certain customer, retrieved from the Orders table.

The conceptual schema and its mapping to the relational schema are represented as an Entity Data Model (EDM), specified as an XML file. ADO.NET Entity Framework uses the EDM to perform the mapping, letting the application work with entities while internally abstracting the use of ADO.NET constructs such as DataSet and DataReader. ADO.NET Entity Framework performs the joins necessary to assemble an entity from information in multiple tables, or to traverse a relationship. When an entity is updated, it traces which table each piece of information came from and issues SQL update statements for the tables whose data has changed. ADO.NET Entity Framework uses eSQL (Entity SQL), a derivative of SQL, to perform queries, set-theoretic operations, and updates on entities and their relationships. Queries in eSQL are translated, where required, to the native SQL flavor of the underlying database.
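
The idea of splitting an entity update across the tables it was assembled from can be sketched as follows. This is a minimal Python illustration using `sqlite3`, not the framework's actual mechanism; `CUSTOMER_MAPPING`, `KEY_COLUMN`, `load_customer` and `update_customer` are hypothetical names, and the mapping is a plain dictionary rather than an EDM file.

```python
import sqlite3

# Hypothetical mapping: entity property -> (table, column).
CUSTOMER_MAPPING = {
    "name": ("Customers", "Name"),
    "address": ("Contacts", "Address"),
}
# Hypothetical key column of each table that contributes to the entity.
KEY_COLUMN = {"Customers": "Id", "Contacts": "CustomerId"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Contacts (CustomerId INTEGER PRIMARY KEY, Address TEXT);
    INSERT INTO Customers VALUES (1, 'Alice');
    INSERT INTO Contacts VALUES (1, '1 Main St');
""")


def load_customer(customer_id):
    """Join the two tables so the caller sees a single entity."""
    row = conn.execute(
        "SELECT c.Name, t.Address FROM Customers AS c "
        "JOIN Contacts AS t ON t.CustomerId = c.Id WHERE c.Id = ?",
        (customer_id,),
    ).fetchone()
    return {"id": customer_id, "name": row[0], "address": row[1]}


def update_customer(customer_id, changes):
    """Group changed properties by source table and update only those tables."""
    by_table = {}
    for prop, value in changes.items():
        table, column = CUSTOMER_MAPPING[prop]
        by_table.setdefault(table, {})[column] = value
    statements = []
    for table, columns in by_table.items():
        assignments = ", ".join(f"{col} = ?" for col in columns)
        sql = f"UPDATE {table} SET {assignments} WHERE {KEY_COLUMN[table]} = ?"
        conn.execute(sql, (*columns.values(), customer_id))
        statements.append(sql)
    return statements


issued = update_customer(1, {"address": "2 Oak Ave"})
customer = load_customer(1)
print(issued)
print(customer)
```

Because only the address changed, a single UPDATE against the Contacts table is issued and the Customers table is left untouched.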

Entity types and entity sets only form the logical EDM schema and can be exposed in any form. ADO.NET Entity Framework includes Object Services, which presents these entities as objects, with their elements and relationships exposed as properties. Entity objects are thus just a front end to instances of the EDM entity types, letting object-oriented languages access and use them. Similarly, other front ends can be created that expose the entities via web services (e.g., Astoria) or as XML, which is used when entities are serialized for persistent storage or over-the-wire transfer.

Entity Data Model

The Entity Data Model (EDM) specifies the conceptual model of the data via the Entity-Relationship data model, which deals primarily with Entities and the Relationships they participate in. The mapping of the elements of the conceptual schema to the logical schema must also be specified. The EDM schema is expressed in the Schema Definition Language (SDL), which is an application of XML. The mapping specification is also expressed in XML. ADO.NET also provides the Entity Designer for visual creation of the EDM and the mapping specification. The output of the tool is the XML specifying the schema and the mapping, which can also be created or edited by hand.

Entities

Entities are instances of Entity Types; they represent the individual instances of the objects (such as customers and orders) to which the information pertains. The identity of an entity is defined by the entity type it is an instance of; in that sense, an entity type defines the class an entity belongs to and the properties an entity will have. Properties describe some aspect of the entity by giving it a name and a type. The properties of an entity type in ADO.NET Entity Framework are fully typed, and are fully compatible with the type system used in a DBMS as well as the Common Type System of the .NET Framework. A property can be a SimpleType, ComplexType, or RowType, and can also be multi-valued. All entity types belong to some namespace and have an EntityKey property that uniquely identifies each instance of the entity type. The different property types are distinguished as follows:

  • SimpleType corresponds to primitive data types such as integers, characters and floating-point numbers.
  • ComplexType is an aggregate of multiple properties of type SimpleType, ComplexType, or RowType. Unlike entity types, however, ComplexTypes cannot have an EntityKey.
  • RowType is similar to ComplexType, except that it cannot be inherited.

All entity instances are housed in EntityContainers, which are per-project containers for entities. Each project has one or more named EntityContainers, which can reference entities across multiple namespaces and entity types. Multiple instances of one entity type can be stored in collections called EntitySets. One entity type can have multiple EntitySets.
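
The relationship between entity types, complex types, entity keys, entity sets and containers can be modelled roughly in Python. The sketch below is an analogy only; the `Address`, `Customer` and `EntitySet` classes and the set names are assumptions for illustration and do not correspond to Entity Framework classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """ComplexType stand-in: aggregates simple properties and has no key."""
    street: str
    city: str


@dataclass
class Customer:
    """EntityType stand-in: 'email' plays the role of the EntityKey."""
    email: str
    name: str
    address: Address


class EntitySet:
    """A collection of instances of one entity type, indexed by key."""

    def __init__(self, entity_type, key):
        self.entity_type = entity_type
        self.key = key
        self._items = {}

    def add(self, entity):
        if not isinstance(entity, self.entity_type):
            raise TypeError("wrong entity type for this set")
        key_value = getattr(entity, self.key)
        if key_value in self._items:
            raise ValueError(f"duplicate key: {key_value!r}")
        self._items[key_value] = entity

    def get(self, key_value):
        return self._items[key_value]

    def __len__(self):
        return len(self._items)


# EntityContainer stand-in: named entity sets; one entity type backs two sets.
container = {
    "ActiveCustomers": EntitySet(Customer, key="email"),
    "ArchivedCustomers": EntitySet(Customer, key="email"),
}
container["ActiveCustomers"].add(
    Customer("alice@example.com", "Alice", Address("1 Main St", "Springfield"))
)
```

Here the key uniquely identifies an instance within a set, and the same entity type is stored in more than one entity set.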

Relationships

Any two entity types can be related by either an Association relation or a Containment relation. For example, "a shipment is billed to a customer" is an association, whereas "an order contains order details" is a containment relation. A containment relation can also be used to model inheritance between entities. The relation between two entity types is specified by a Relationship Type, instances of which, called Relationships, relate entity instances. In future releases, other kinds of relationship types, such as Composition or Identification, may be introduced.

Relationship types are characterized by their degree (arity), which is the number of entity types they relate, and their multiplicity. In the initial release of ADO.NET Entity Framework, however, relationships are limited to binary (degree two), bi-directional relationships. Multiplicity defines how many entity instances can be related together. Based on multiplicity, relationships can be one-to-one, one-to-many, or many-to-many. Relationships between entities are named; the name is called a Role and defines the purpose of the relationship.

A relationship type can also have an Operation and an Action associated with it, which allows an action to be taken on an entity when an operation is performed on a related entity. For example, when an entity that forms part of a relationship is deleted (the OnDelete operation), the actions that can be taken are:

  • Cascade, which deletes the relationship instance and all associated entity instances.
  • Restrict, which prevents the operation as long as the relationship holds true.
  • RemoveAssociation, which removes the relationship between the entity instance and all entities associated via the relationship.

For association relationships, which can have different semantics at each end, a different action can be specified for each end.
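
The three OnDelete actions can be illustrated with a toy in-memory store. This is a sketch of the semantics described above under simplifying assumptions (relationships are plain parent/child pairs); the `Store` class, `make_store` helper and entity identifiers are hypothetical.

```python
from enum import Enum


class OnDelete(Enum):
    CASCADE = "Cascade"
    RESTRICT = "Restrict"
    REMOVE_ASSOCIATION = "RemoveAssociation"


class Store:
    """Toy store: a set of entity ids plus (parent, child) relationship pairs."""

    def __init__(self):
        self.entities = set()
        self.links = set()

    def add_link(self, parent, child):
        self.entities.update({parent, child})
        self.links.add((parent, child))

    def delete(self, entity, action):
        related = {(p, c) for (p, c) in self.links if p == entity}
        if related and action is OnDelete.RESTRICT:
            # Restrict: refuse while the relationship still holds.
            raise RuntimeError(f"{entity} still participates in a relationship")
        # Cascade and RemoveAssociation both remove the relationship instances.
        self.links -= related
        # Also drop links in which the deleted entity is the child end.
        self.links = {(p, c) for (p, c) in self.links if c != entity}
        if action is OnDelete.CASCADE:
            for _, child in related:
                self.delete(child, action)  # delete associated entities too
        self.entities.discard(entity)


def make_store():
    """Build a store with one customer related to two orders."""
    store = Store()
    store.add_link("customer:1", "order:10")
    store.add_link("customer:1", "order:11")
    return store


store = make_store()
store.delete("customer:1", OnDelete.REMOVE_ASSOCIATION)
print(sorted(store.entities))  # the orders survive, now unrelated
```

With Cascade the two orders would be deleted along with the customer, and with Restrict the deletion would be refused.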

Schema definition language

ADO.NET Entity Framework uses an XML-based data definition language called Schema Definition Language (SDL) to define the EDM schema. The SDL defines SimpleTypes similar to the CTS primitive types, including String, Int32, Float, Decimal, Xml, Guid, and DateTime, among others. An Enumeration, which defines a map of primitive values and names, is also considered a simple type, although enumerations are currently unsupported in the Beta 3 version of the framework. ComplexTypes are created from an aggregation of other types. A collection of properties of these types defines an Entity Type. This definition can be written in EBNF grammar as:

EntityType ::= ENTITYTYPE entityTypeName [BASE entityTypeName]
  [ABSTRACT true|false] KEY propertyName [, propertyName]*
  {(propertyName PropertyType [PropertyFacet]*)+}

PropertyType ::= (PrimitiveType [PrimitiveTypeFacet]*)
  | complexOrEnumTypeName | RowType
PropertyFacet ::= [NULLABLE true|false]
  | [DEFAULT defaultVal] | [MULTIPLICITY 1|*]
PrimitiveTypeFacet ::= LENGTH | MAXLENGTH | PRECISION | SCALE | UNICODE | COLLATION
  | DATETIMEKIND | PRESERVESECONDS
PrimitiveType ::= BINARY | BOOLEAN
  | DATETIME | DOUBLE | DECIMAL | FLOAT | GUID | SBYTE | INT16 | INT32
  | INT64 | BYTE | UINT16 | UINT32 | UINT64 | STRING | XML | MONEY

Facets describe the metadata of a property, such as whether it is nullable or has a default value, as well as its cardinality, i.e., whether the property is single-valued or multi-valued. A multiplicity of “1” denotes a single-valued property; “*” denotes a multi-valued property. As an example, an entity can be denoted in SDL as:

<EnumerationType Name = "CtyEnum">
    <EnumerationMember Name = "Country1"/>
    <EnumerationMember Name = "Country2"/>
    <EnumerationMember Name = "Country3"/>
</EnumerationType>
<ComplexType Name = "Addr">
    <Property Name = "Street" Type = "String" Nullable = "false" />
    <Property Name = "City" Type = "String" Nullable = "false" />
    <Property Name = "Country" Type = "CtyEnum" Nullable = "false" />
    <Property Name = "PostalCode" Type = "Int32" />
</ComplexType>
<EntityType Name = "Customer" Key = "Email">
    <Property Name = "Name" Type = "String" />
    <Property Name = "Email" Type = "String" Nullable = "false" />
    <Property Name = "Address" Type = "Addr" Multiplicity = "*" />
</EntityType>
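
As a rough illustration of how such a definition could be read by a tool, the following Python sketch parses a trimmed copy of the example with the standard `xml.etree.ElementTree` module. The wrapping `<Schema>` root element, the `describe_entity` helper and the default values chosen for missing facets are assumptions made for this example, not part of the SDL as described here.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the example, wrapped in a hypothetical <Schema> root
# element so that it forms a single well-formed XML document.
SDL = """
<Schema>
  <ComplexType Name="Addr">
    <Property Name="Street" Type="String" Nullable="false" />
    <Property Name="City" Type="String" Nullable="false" />
  </ComplexType>
  <EntityType Name="Customer" Key="Email">
    <Property Name="Name" Type="String" />
    <Property Name="Email" Type="String" Nullable="false" />
    <Property Name="Address" Type="Addr" Multiplicity="*" />
  </EntityType>
</Schema>
"""


def describe_entity(root, name):
    """Return the key and the per-property facets of the named entity type."""
    entity = root.find(f"EntityType[@Name='{name}']")
    properties = {}
    for prop in entity.findall("Property"):
        properties[prop.get("Name")] = {
            "type": prop.get("Type"),
            # Assumption: a property is nullable unless it says otherwise.
            "nullable": prop.get("Nullable", "true") == "true",
            # Assumption: multiplicity defaults to single-valued ("1").
            "multi_valued": prop.get("Multiplicity", "1") == "*",
        }
    return {"key": entity.get("Key"), "properties": properties}


customer = describe_entity(ET.fromstring(SDL), "Customer")
print(customer["key"])
for prop_name, facets in customer["properties"].items():
    print(prop_name, facets)
```

The result shows Email as the key, and Address as a multi-valued property of the complex type Addr.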

A relationship type is defined by specifying its end points and their multiplicities. For example, a one-to-many relationship between Customer and Orders can be defined as:

<Association Name = "CustomerAndOrders">
    <End Type = "Customer" Multiplicity = "1">
        <OnDelete Action = "RemoveAssociation"/>
    </End>
    <End Type = "Orders" Multiplicity = "*">
        <OnDelete Action = "Cascade"/>
    </End>
</Association>
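
A short Python sketch shows how the multiplicities of the two ends determine the kind of relationship. It parses the association above with `xml.etree.ElementTree`; the `classify` function is a hypothetical helper, and it only handles binary relationships whose multiplicities are "1" or "*".

```python
import xml.etree.ElementTree as ET

ASSOCIATION = """
<Association Name="CustomerAndOrders">
  <End Type="Customer" Multiplicity="1">
    <OnDelete Action="RemoveAssociation"/>
  </End>
  <End Type="Orders" Multiplicity="*">
    <OnDelete Action="Cascade"/>
  </End>
</Association>
"""


def classify(association):
    """Return the relationship kind and the OnDelete action of each end."""
    ends = association.findall("End")
    if len(ends) != 2:
        raise ValueError("only binary relationships are handled")
    # Count how many ends are multi-valued: 0, 1 or 2.
    many = sum(1 for end in ends if end.get("Multiplicity") == "*")
    kind = ["one-to-one", "one-to-many", "many-to-many"][many]
    actions = {end.get("Type"): end.find("OnDelete").get("Action") for end in ends}
    return kind, actions


kind, actions = classify(ET.fromstring(ASSOCIATION))
print(kind)
print(actions)
```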

Entity SQL

ADO.NET Entity Framework uses a variant of the SQL query language, named Entity SQL, which is aimed at writing declarative queries and updates over entities and entity relationships at the conceptual level. It differs from SQL in that relationships can be traversed without explicit join constructs, because the EDM is designed to abstract the partitioning of data across tables. Querying against the conceptual model is facilitated by the EntityClient classes, which accept an Entity SQL query. The query pipeline parses the Entity SQL query into a command tree, segregating the query across multiple tables, and hands the tree over to the EntityClient provider. Like ADO.NET data providers, an EntityClient provider is initialized using a Connection object, which, in addition to the usual parameters of data store and authentication information, requires the SDL schema and the mapping information. The EntityClient provider then turns the Entity SQL command tree into an SQL query in the native flavor of the database. Executing the query returns an Entity SQL ResultSet which, unlike ADO.NET ResultSets, is not limited to a tabular structure.

Entity SQL enhances SQL by adding intrinsic support for:

  • Types, as ADO.NET entities are fully typed.
  • EntitySets, which are treated as collections of entities.
  • Composability, which removes restrictions on where subqueries can be used.
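
The translation step, in which a query phrased against entities becomes native SQL over the underlying tables, can be sketched in a very reduced form. The Python below is a toy under stated assumptions, not the EntityClient pipeline: it takes a structured query instead of parsing Entity SQL text, and the `MAPPING` dictionary and `to_native_sql` function are hypothetical.

```python
# Hypothetical mapping from an entity type to its source tables and columns.
MAPPING = {
    "Customer": {
        "tables": "Customers AS c JOIN Contacts AS t ON t.CustomerId = c.Id",
        "properties": {"Name": "c.Name", "Address": "t.Address"},
    }
}


def to_native_sql(entity, select, where=None):
    """Translate a tiny conceptual query into SQL text and parameters."""
    mapping = MAPPING[entity]
    columns = ", ".join(
        f"{mapping['properties'][prop]} AS {prop}" for prop in select
    )
    # The join lives in the mapping, so the caller never writes it.
    sql = f"SELECT {columns} FROM {mapping['tables']}"
    params = []
    if where:
        conditions = []
        for prop, value in where.items():
            conditions.append(f"{mapping['properties'][prop]} = ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = to_native_sql("Customer", ["Name", "Address"], {"Name": "Alice"})
print(sql)
print(params)
```

The caller names only the entity and its properties; the tables and the join come entirely from the mapping.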

Architecture

ADO.NET Entity Framework stack

The architecture of the ADO.NET Entity Framework, from the bottom up, consists of the following:

  • Data source specific providers, which abstract the ADO.NET interfaces used to connect to the database when programming against the conceptual schema.
  • Map provider, a database-specific provider that translates the Entity SQL command tree into a query in the native SQL flavor of the database. It includes the store-specific bridge, the component responsible for translating the generic command tree into a store-specific command tree.
  • EDM parser and view mapping, which takes the SDL specification of the data model and its mapping onto the underlying relational model, and enables programming against the conceptual model. From the relational schema, it creates views of the data corresponding to the conceptual model. It combines information from multiple tables into an entity, and splits an update to an entity into multiple updates to whichever tables contributed to that entity.
  • Query and update pipeline, which processes queries, filters and update requests, converting them into canonical command trees that the map provider then converts into store-specific queries.
  • Metadata services, which handle all metadata related to entities, relationships and mappings.
  • Transactions, to integrate with the transactional capabilities of the underlying store. If the underlying store does not support transactions, support for them needs to be implemented at this layer.
  • Conceptual layer API, the runtime that exposes the programming model for coding against the conceptual schema. It follows the ADO.NET pattern of using Connection objects to refer to the map provider, using Command objects to send the query, and returning EntityResultSets or EntitySets containing the result.
  • Disconnected components, which locally cache datasets and entity sets for using the ADO.NET Entity Framework in an occasionally connected environment.
    • Embedded database: ADO.NET Entity Framework includes a lightweight embedded database for client-side caching and querying of relational data.
  • Design tools, such as the Mapping Designer, which simplify the job of mapping a conceptual schema to the relational schema and of specifying which properties of an entity type correspond to which table in the database.
  • Programming layers, which expose the EDM as programming constructs that can be consumed by programming languages.
    • Object services, which automatically generate code for CLR classes that expose the same properties as an entity, thus enabling instantiation of entities as .NET objects.
    • Web services, which expose entities as web services.
  • High-level services, such as reporting services, which work on entities rather than relational data.

References

  1. Elisa Flasko (April 9, 2008). "Entity Framework & ADO.NET Data Services to Ship with VS 2008 SP1 & .NET 3.5 SP1". ADO.NET team blog, MSDN Blogs. Retrieved 2008-04-09.
