Object-relational mapping
From Wikipedia, the free encyclopedia
Object-relational mapping (also known as O/RM, ORM, and O/R mapping) is a programming technique that links relational databases to object-oriented language concepts, creating in effect a "virtual object database". Both free and commercial packages perform object-relational mapping, although some programmers opt to write their own mapping code for their systems.
The problem
In object-oriented (OO) programming, programming objects represent real-world objects. To illustrate, consider the example of an address book, which contains listings of people along with zero or more phone numbers and zero or more addresses. In object-oriented terms this would be represented by a "person object" with "slots" (fields, members, instance variables etc.) to hold the data that make up this listing: the person's name, a list (or array) of phone numbers, and a list of addresses.
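As a minimal sketch, the address-book listing described above could be modelled as follows. The class and field names (`Person`, `Address`, `phone_numbers`, and so on) are illustrative assumptions, not part of any particular library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Person:
    name: str
    # "Zero or more" of each, so both slots default to empty lists.
    phone_numbers: List[str] = field(default_factory=list)
    addresses: List[Address] = field(default_factory=list)


# The person object "owns" its phone numbers and addresses directly.
alice = Person("Alice")
alice.phone_numbers.append("555-0100")
alice.addresses.append(Address("1 Main St", "Springfield"))
```

Nothing here is persistent yet: the object graph lives only in memory, which is exactly the gap the rest of the article is concerned with.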
The crux of the problem is in translating those objects to forms which can be stored in files or databases, and which can later be retrieved easily while preserving the properties of the objects and their relationships; these objects can then be said to be persistent. Historically, there have been several approaches to solving this problem.
Relational database management systems (RDBMS)
The solution to these sorts of data storage problems already exists: relational database management systems. The relational database is the primary kind of database in use, and it predates the popularisation of object-oriented programming in the 1990s. Using relational databases to store object-oriented data leads to a semantic gap in which programmers must make their software function in two different worlds: data is processed in object-oriented form, but the same data has to be stored in relational form. This constant conversion between two forms of the same data not only hurts performance but also burdens the programmer, because the relational and object-oriented forms impose limitations on each other. For example, relational databases make complicated associations difficult, and they tend to "map" poorly into the OO world because they fail to implement the relational model's user-defined types. This problem is known as the object-relational impedance mismatch.
Relational databases use a series of tables representing simple data. Optional or related information is stored in other tables. A single (persistent object) record in the database often spans several of these tables, and requires a join to collect all of the related information back into a single piece of data for processing. This would be the case for the address book, which would likely include at least a user and address table, but perhaps even a phone number table as well.
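The following sketch shows how the address book might be spread across three tables and reassembled with a join. It uses Python's standard `sqlite3` module with an in-memory database; the table and column names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone   (id INTEGER PRIMARY KEY,
                          person_id INTEGER NOT NULL REFERENCES person(id),
                          number TEXT NOT NULL);
    CREATE TABLE address (id INTEGER PRIMARY KEY,
                          person_id INTEGER NOT NULL REFERENCES person(id),
                          city TEXT NOT NULL);
    INSERT INTO person  VALUES (1, 'Alice');
    INSERT INTO phone   VALUES (1, 1, '555-0100'), (2, 1, '555-0101');
    INSERT INTO address VALUES (1, 1, 'Springfield');
""")

# One logical "person" record has to be collected from three tables.
rows = conn.execute("""
    SELECT p.name, ph.number, a.city
    FROM person AS p
    LEFT JOIN phone   AS ph ON ph.person_id = p.id
    LEFT JOIN address AS a  ON a.person_id  = p.id
    WHERE p.id = ?
    ORDER BY ph.number
""", (1,)).fetchall()
```

Note that the join returns one row per phone/address combination, so the single person comes back as two flat rows that the program must still fold into one object.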
In the object world there is a clear sense of "ownership", where a particular person object owns a particular phone number. This is not the case in relational databases, where the tables have no idea how they relate to other tables at a fundamental level. Instead, the user must construct a "query" to gather the information back together. A query must state not only what information to return but also how the tables involved are related to each other. In other words, tables do not know their relationships while they sit in the database; the relationships become known only when a query specifies them. Relational databases (which try to implement the relational model) do maintain relationships via constraints, but the SQL query language is generally unaware of these.
Because an RDBMS usually has no awareness of these relationships at the physical level, it can be very expensive to submit several queries in a row. One cannot, for instance, expect good performance from a series of operations like "find this user, OK, now find this user's addresses, OK...". Instead, one must construct a single large SQL query that says "find this user and all their addresses and phone numbers and return them in this format". This enables a truly relational system's optimiser to reach much higher global performance than is possible with hand-tuned OO access, even if some particular OO access paths will probably be faster.
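The difference between the two access styles can be sketched by counting queries. The schema and the helper names `phones_one_by_one` and `phones_with_join` are hypothetical; on an in-memory SQLite database the cost difference is negligible, but against a networked database each extra query is typically a separate round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone  (id INTEGER PRIMARY KEY,
                         person_id INTEGER NOT NULL REFERENCES person(id),
                         number TEXT NOT NULL);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO phone  VALUES (1, 1, '555-0100'), (2, 1, '555-0101'),
                              (3, 2, '555-0200');
""")


def phones_one_by_one(conn):
    """One query for the people, then one more query per person."""
    people = conn.execute("SELECT id, name FROM person ORDER BY id").fetchall()
    queries = 1
    result = {}
    for person_id, name in people:
        numbers = conn.execute(
            "SELECT number FROM phone WHERE person_id = ? ORDER BY number",
            (person_id,),
        ).fetchall()
        queries += 1
        result[name] = [number for (number,) in numbers]
    return result, queries


def phones_with_join(conn):
    """A single joined query that fetches everything at once."""
    rows = conn.execute("""
        SELECT p.name, ph.number
        FROM person AS p
        LEFT JOIN phone AS ph ON ph.person_id = p.id
        ORDER BY p.id, ph.number
    """).fetchall()
    result = {}
    for name, number in rows:
        result.setdefault(name, [])
        if number is not None:  # people without phones yield a NULL number
            result[name].append(number)
    return result, 1


slow_result, slow_queries = phones_one_by_one(conn)
fast_result, fast_queries = phones_with_join(conn)
```

Both functions build the same dictionary, but the first issues one query per person plus one, while the second leaves the whole retrieval to the database in a single statement.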
Some object-relational mappers automatically keep the loaded objects in memory in constant synchronisation with the database. For this to be possible, once the object-to-SQL mapping query has been constructed and run, the returned data is first copied into the fields of the objects in question, as in any object-SQL mapping package. From then on, the object has to watch for changes to these values and then carefully reverse the process to write the data back out to the database.
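One way to picture this "watch and write back" behaviour is a mapped object that records which of its fields have changed. The sketch below is a simplified assumption of how such tracking might work, not the mechanism of any specific mapper; the `TrackedPerson` class, its `flush` method, and the `email` column are hypothetical.

```python
import sqlite3


class TrackedPerson:
    """Hypothetical mapped object that remembers which fields changed."""

    _columns = ("name", "email")  # fixed whitelist of mapped columns

    def __init__(self, conn, person_id):
        row = conn.execute(
            "SELECT name, email FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        # Bypass __setattr__ so that loading does not mark fields as dirty.
        object.__setattr__(self, "_conn", conn)
        object.__setattr__(self, "_id", person_id)
        object.__setattr__(self, "_dirty", set())
        object.__setattr__(self, "name", row[0])
        object.__setattr__(self, "email", row[1])

    def __setattr__(self, key, value):
        # Record a mapped field as dirty only when its value really changes.
        if key in self._columns and getattr(self, key) != value:
            self._dirty.add(key)
        object.__setattr__(self, key, value)

    def flush(self):
        """Write changed fields back; return how many columns were updated."""
        if not self._dirty:
            return 0
        cols = sorted(self._dirty)
        assignments = ", ".join(f"{col} = ?" for col in cols)
        values = [getattr(self, col) for col in cols]
        self._conn.execute(
            f"UPDATE person SET {assignments} WHERE id = ?", values + [self._id]
        )
        self._conn.commit()
        self._dirty.clear()
        return len(cols)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Alice', 'alice@example.com')")

person = TrackedPerson(conn, 1)
person.email = "alice@example.org"  # marks 'email' as dirty
written = person.flush()            # issues one UPDATE for the dirty column
```

Real mappers usually batch such updates per transaction and must also handle concurrent changes made by other clients, which this sketch ignores.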
RDBMS are generally capable of much faster performance on global queries that involve a large proportion of the database; however, object-oriented access is deemed to be more efficient when manipulating a smaller amount of data since the semantic gap between the object form and relational form is eliminated.
Given these two very different worlds, object code for working with relational databases tends to be complex and susceptible to bugs. Database-driven software developers looked for a better way to achieve persistence for their objects.
The solution
Many packages have been developed to relieve programmers of the burden of developing object-relational mapping systems themselves.
Whether to use O/R mapping at all is itself a design decision. Different O/R mapping tools take mutually exclusive approaches to object-oriented programming, and the choice of tool strongly affects the design of an application. In particular, it matters whether the design follows the entity approach (Chen/Yourdon) or the domain model approach (Fowler/Evans).
Object-Relational systems attempt to solve this problem by providing libraries of classes which are able to do this mapping automatically. Given a list of tables in the database, and objects in the program, they will automatically map requests from one to the other. Asking a person object for its phone numbers will result in the proper query being created and sent, and the results being "magically" translated directly into phone number objects inside the program.[1]
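The "magic" described above can be sketched as a property that builds and sends the query the first time it is accessed. This is a hand-written illustration under assumed table names, not the API of any real mapping library; `Person.get` and `PhoneNumber` are hypothetical names.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PhoneNumber:
    id: int
    number: str


class Person:
    def __init__(self, conn, person_id, name):
        self._conn = conn
        self.id = person_id
        self.name = name
        self._phones = None  # related rows are not loaded yet

    @classmethod
    def get(cls, conn, person_id):
        """Load one person row and wrap it in an object (None if absent)."""
        row = conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        return cls(conn, *row) if row else None

    @property
    def phone_numbers(self):
        # The query is created and sent only on first access, then cached.
        if self._phones is None:
            rows = self._conn.execute(
                "SELECT id, number FROM phone WHERE person_id = ? ORDER BY id",
                (self.id,),
            ).fetchall()
            self._phones = [PhoneNumber(*row) for row in rows]
        return self._phones


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone  (id INTEGER PRIMARY KEY,
                         person_id INTEGER NOT NULL REFERENCES person(id),
                         number TEXT NOT NULL);
    INSERT INTO person VALUES (1, 'Alice');
    INSERT INTO phone  VALUES (1, 1, '555-0100'), (2, 1, '555-0101');
""")

alice = Person.get(conn, 1)
numbers = [phone.number for phone in alice.phone_numbers]
```

From the caller's point of view, `alice.phone_numbers` looks like an ordinary attribute; the SQL stays hidden behind the property.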
From a programmer's perspective, the system should look like a persistent object store. One can create objects and work with them as one would normally, and they automatically end up in the relational database.
In practice, however, things are never quite that simple. All O/RM systems tend to make themselves visible in various ways, reducing to some degree one's ability to ignore the database. Worse, the translation layer can be slow and inefficient (notably in terms of the SQL it writes), resulting in programs that are slower and use more memory than code written "by hand."
A number of O/RM systems have been created over the years, but their effect on the market seems mixed. NeXT's Enterprise Objects Framework (EOF) was considered one of the best, but it failed to have a lasting impact on the market, chiefly because it was tightly tied to NeXT's entire toolkit, OpenStep. It was later integrated into NeXT's WebObjects, the first object-oriented Web application server. Since Apple Computer bought NeXT in 1997, EOF has provided the technology behind the company's e-commerce Web site, the .Mac services and the iTunes Music Store. Apple provides EOF in two implementations: the Objective-C implementation that comes with the Apple Developer Tools and the pure Java implementation that comes in WebObjects 5.2.
Developer journals and magazines frequently carry advertisements for so-called post-SQL databases such as Caché. These work like the usual object-relational mapping utilities, except that they are built on their own proprietary database software to maximise performance for both the object-oriented and the relational view of the system.
An alternative approach is being taken with technologies such as RDF and SPARQL, and the concept of the "triplestore". RDF is a data model built on subject-predicate-object statements (triples), RDF/XML is an XML serialization of it, SPARQL is an SQL-like query language for RDF data, and a triplestore is a general term for any database that stores and retrieves triples.
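The subject-predicate-object idea can be illustrated with a toy in-memory store. This is only a sketch of the data shape: the `match` helper is a hypothetical stand-in for pattern matching and is not SPARQL, and the identifiers are plain strings rather than the IRIs that RDF uses.

```python
# Each fact is one (subject, predicate, object) triple.
triples = {
    ("alice", "hasPhone", "555-0100"),
    ("alice", "hasPhone", "555-0101"),
    ("alice", "livesIn", "Springfield"),
    ("bob", "hasPhone", "555-0200"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return sorted(
        triple
        for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )


alice_phones = [obj for _, _, obj in match(triples, "alice", "hasPhone")]
everyone_with_phone = sorted({s for s, _, _ in match(triples, predicate="hasPhone")})
```

Because every fact has the same three-part shape, adding a new kind of relationship needs no schema change, only new triples.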
More recently, a similar system has started to evolve in the Java world, known as Java Data Objects (JDO). Unlike EOF, JDO is a standard, and several implementations are available from different vendors. The Enterprise Java Beans 3.0 (EJB3) specification also covers this same area. There has been a standards conflict between the two standards bodies over pre-eminence. JDO has several commercial implementations, while EJB 3.0 is still under development. Most recently, however, the JCP has announced another new standard intended to bring these two together and to work with various Java architectures. Another example worth mentioning is Hibernate, the most widely used O/R mapping framework in the Java world, which inspired the EJB3 specification.
In the Web framework Ruby on Rails, object-relational mapping plays a central role and is handled by the ActiveRecord wrapping tool.
Object-oriented databases
The ideal solution would be to use an object-oriented database management system, which, as the name implies, is a database designed specifically to look at and work with object-oriented programs. Using an OODBMS would eliminate the need for converting data to and from its SQL form, as the data would be stored in its object representation.
Object-oriented databases have yet to come into widespread use. One of their main limitations is that switching from an SQL DBMS to a purely object-oriented DBMS means losing the capability to create SQL queries, a tried and tested method for retrieving ad hoc combinations of data. For this reason, many programmers find themselves more at home with an object-SQL mapping system, even though most commercial object-oriented databases are able to process SQL queries to a limited extent.
However, object-oriented databases are beginning to gain popularity as programmers seek to improve on the performance of object-SQL mapping.
See also
- Object-relationship modelling
- List of object-relational mapping software
- CORBA
- Database
- Object database
- Object-relational database
- Relational model
- SQL
- Object-SQL impedance mismatch
External links
- ORMappers.Com: A community portal devoted to O/R Mappers managed by Roy Osherove
- Scott W. Ambler: Mapping Objects to Relational Databases: O/R Mapping In Detail
- Object Relational Tool Comparison in .NET
- Core J2EE Design Pattern: Data Access Objects
- ORM-Net, for .NET
- Citations from CiteSeer
- Data Access Objects versus Object Relational Mapping
- Patterns for Object/Relational Mapping and Access Layers (not up to date)
- ObJectRelationalBridge
- PolePosition Benchmark -- shows the performance trade-offs for solutions in the object-relational impedance mismatch context.
- Choosing an Object-Relational mapping tool
- Relationship Service
- Persistent State Service
- JDBCPersistence Fast ORM for Java
- Persistor.NET (2Top)
- O/R Mapping for .NET (Vanatec)
- O/R Mapping for Java(JDO) (Vanatec)
- Genome for .NET
- EntityBroker
- Opf3
- [1]
- "The Vietnam of Computer Science" by Ted Neward
- Core Data (Apple)