Entity-relationship model
From Wikipedia, the free encyclopedia
Databases are used to store structured data. The structure of this data, together with other constraints, can be designed using a variety of techniques, one of which is called entity-relationship modeling or ERM. The end-product of the ERM process is an entity-relationship diagram or ERD. Data modeling requires a graphical notation for representing such data models. An ERD is a type of conceptual data model or semantic data model.
The first stage of information system design uses these models to describe information needs or the type of information that is to be stored in a database during the requirements analysis. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain universe of discourse (i.e. area of interest). In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design".
There are a number of conventions for entity-relationship diagrams (ERDs). The classical notation is described in the remainder of this article, and mainly relates to conceptual modelling. There are a range of notations more typically employed in logical and physical database design, including information engineering, IDEF1x (ICAM DEFinition Language) and dimensional modelling.
Contents |
[edit] Common symbols
An entity represents a discrete object. Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem. A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Entities are drawn as rectangles, relationships as diamonds.
Entities and relationships can both have attributes. Examples: an employee entity might have a social security number attribute (in the US); the proved relationship may have a date attribute. Attributes are drawn as ovals connected to their owning entity sets by a line.
Every entity (unless it is a weak entity) must have a minimal set of uniquely identifying attributes, which is called the entity's primary key.
Entity-relationship diagrams don't show single entities or single instances of relations. Rather, they show entity sets and relationship sets (displayed as rectangles and diamonds respectively). Example: a particular song is an entity. The collection of all songs in a database is an entity set. The proved relationship between Andrew Wiles and Fermat's last theorem is a single relationship. The set of all such mathematician-theorem relationships in a database is a relationship set.
Lines are drawn between entity sets and the relationship sets they are involved in. If all entities in an entity set must participate in the relationship set, a thick or double line is drawn. This is called a participation constraint. If each entity of the entity set can participate in at most one relationship in the relationship set, an arrow is drawn from the entity set to the relationship set. This is called a key constraint. To indicate that each entity in the entity set is involved in exactly one relationship, a thick arrow is drawn.
Associative entity is used to solve the problem of two entities with a many-to-many relationship [1].
Unary Relationships - a unary relationship is a relationship between the rows of a single table.
[edit] Less common symbols
A weak entity is an entity that can't be uniquely identified by its own attributes alone, and therefore must use as its primary key both its own attributes and the primary key of an entity it is related to. A weak entity set is indicated by a bold rectangle (the entity) connected by a bold arrow to a bold diamond (the relationship). Double lines can be used instead of bold ones.
Attributes in an ER model may be further described as multi-valued, composite, or derived. A multi-valued attribute, illustrated with a double-line ellipse, may have more than one value for at least one instance of its entity. For example, a piece of software (entity=application) may have the multivalued attribute "platform" because at least one instance of that entity runs on more than one operating system. A composite attribute may itself contain two or more attributes and is indicated as having at least contributing attributes of its own. For example, addresses usually are composite attributes, composed of attributes such as street address, city, and so forth. Derived attributes are attributes whose value is entirely dependent on another attribute and are indicated by dashed ellipses. For example, if we have an employee database with an employee entity along with an age attribute, the age attribute would be derived from a birth date attribute.
Sometimes two entities are more specific subtypes of a more general type of entity. For example, programmers and marketers might both be types of employees at a software company. To indicate this, a triangle with "ISA" on the inside is drawn. The superclass is connected to the point on top and the two (or more) subclasses are connected to the base.
A relation and all its participating entity sets can be treated as a single entity set for the purpose of taking part in another relation through aggregation, indicated by drawing a dotted rectangle around all aggregated entities and relationships.
[edit] Alternative diagramming conventions
[edit] Crow's Feet
The "Crow's Feet" notation is named for the symbol used to denote the many sides of a relationship, which resembles the forward digits of a bird's claw. You can see this claw shape in the diagram to the right, representing the same relationship depicted in the Common symbols section above.
In the diagram, the following facts are detailed:
- An Artist can perform many Songs, identified by the crow's foot.
- An Artist must perform at least one Song, shown by the perpendicular line.
- A Song may or may not be performed by any Artist, as indicated by the open circle.
This notation is gaining acceptance through common usage in Oracle texts, and in tools such as Visio and PowerDesigner, with the following benefits:
- Clarity in identifying the many, or child, side of the relationship, using the crow's foot.
- Concise notation for identifying mandatory relationship, using a perpendicular bar, or an optional relationship, using an open circle.
[edit] Classification
Entity relationship models can be classified in BERMs (Binary Entity Relation Model) and GERMs (General Entity Relationship Model) according to whether only binary relationships are allowed. A binary relationship is a relationship between two entities. Thus, in a GERM, relationships between three or more entities are also allowed.
[edit] See also
[edit] ER diagramming tools
- AllFusion ERwin Data Modeler - ERD tool, ablity to generate HTML reports.
- ConceptDraw - cross platform software for creating ER Diagrams
- DBDesigner - OpenSource ER Data Modeler.
- DB Visual ARCHITECT - Support UML Class Diagram and ERD
- Dia - a free software program to draw ER diagrams
- Ferret (software) - a free software ER drawing tool
- ER/Studio - robust, easy-to use ER modeling tool from Embarcadero.
- Kivio - a free software flowcharting program that supports ER Diagrams
- Microsoft Visio - diagramming software, some versions can auto-generate an ERD from a database
- OmniGraffle - diagramming software for MacOS
- PowerDesigner - modeling suite from Sybase which includes Data Architect for constructing or reverse engineering conceptual, logical and physical models with many of the leading RDBMS brands.
- SILVERRUN ModelSphere - supporting conceptual, logical and physical data modeling including interfaces for multiple target systems.
- SmartDraw - point and click drawing method combined with many templates creates professional diagrams.
[edit] References
- Chen, Peter P. (1976). "The Entity-Relationship Model - Toward a Unified View of Data". ACM Transactions on Database Systems 1 (1): 9-36.
This paper is one of the most cited papers in the computer field. It was selected as one of the most influential papers in computer science in a recent survey of over 1,000 computer science professors. The citation is listed, for example, in DBLP: http://dblp.uni-trier.de/ [2]
[edit] External links
- Peter Chen home page at Louisiana State University http://bit.csc.lsu.edu/~chen/chen.html
- Case study: E-R diagram for Acme Fashion Supplies by Mark H. Ridley
- IDEF1X
- Notes: Logical Data Structures (LDSs) - Getting started - by - Tony Drewry
- Introduction to Data Modeling