Boyce-Codd normal form
From Wikipedia, the free encyclopedia
Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey—that is, X is either a candidate key or a superset thereof.
BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd[1] to address certain types of anomaly not dealt with by 3NF as originally defined. Chris Date has pointed out that a definition of what we now know as BCNF appeared in a paper[2] by Ian Heath in 1971. "Since that definition predated Boyce and Codd's own definition by some three years," writes Date, "it seems to me that BCNF ought by rights to be called Heath normal form. But it isn't."[3]
Contents |
[edit] 3NF tables not meeting BCNF
The factual accuracy of this section is disputed. Please see the relevant discussion on the talk page.(June 2008) |
Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table which does not contain multiple overlapping candidate keys is guaranteed to be in BCNF.[4] Depending on what its functional dependencies are, a 3NF table containing two or more overlapping candidate keys may or may not be in BCNF.
An example of a 3NF table that does not meet BCNF is:
Tutor ID | Tutor Soc. Security Num. | Student ID |
---|---|---|
1078 | 088-51-0074 | 31850 |
1078 | 088-51-0074 | 37921 |
1293 | 096-77-4146 | 46224 |
1480 | 072-21-2223 | 31850 |
The purpose of the table is to show which tutors are assigned to which students. More than one student can be associated with a single tutor, and more than one tutor can be associated with a single student. The table's candidate keys, which happen to overlap, are:
- {Tutor ID, Student ID}
- {Tutor Social Security Number, Student ID}
Therefore all three attributes of the table are prime attributes: that is, all three attributes belong to candidate keys.
Recall that 2NF prohibits partial functional dependencies of non-prime attributes on candidate keys, and that 3NF prohibits transitive functional dependencies of non-prime attributes on candidate keys. Since the table above lacks any non-prime attributes, it adheres to both 2NF and 3NF.
BCNF is more stringent than 3NF in that it does not permit any functional dependency in which the determining set of attributes is not a candidate key (or superset thereof). The dependency of Tutor ID on Tutor Social Security Number is such a dependency. Accordingly, the table above is not in BCNF.
Any table that falls short of BCNF will be vulnerable to logical inconsistencies. The candidate keys of the Tutor/Student Cross-Reference table cannot prevent a situation in which two different Tutor IDs are shown, illegitimately, as corresponding to the same Tutor Social Security Number.
Correcting the problem in this case would involve 1) modifying the Tutor/Student Cross-Reference table so that it uses only one scheme of identifiers for Tutors: either IDs or Social Security Numbers, but not both; and 2) showing Tutor ID / Tutor Social Security Number associations in a separate table.
Tutor ID | Student ID |
---|---|
1078 | 31850 |
1078 | 37921 |
1293 | 46224 |
1480 | 31850 |
Tutor ID | Tutor Surname | Tutor Soc. Security Num. |
---|---|---|
1078 | Quinn | 088-51-0074 |
1293 | Stevens | 096-77-4146 |
1480 | Jones | 072-21-2223 |
The sole candidate key for the revised Tutor/Student Cross-Reference table is {Tutor ID, Student ID}; and the candidate keys of the Tutor table are {Tutor ID} and {Tutor Social Security Number}. Both tables are in BCNF; and so long as the uniqueness of all candidate keys is enforced, there is no possibility of showing multiple distinct Tutor IDs against the same Tutor Social Security Number or vice-versa.
[edit] Achievability of BCNF
In some cases, a non-BCNF table cannot be decomposed into tables that satisfy BCNF and preserve the dependencies that held in the original table. Beeri and Bernstein showed in 1979 that, for example, a set of functional dependencies {AB → C, C → B} cannot be represented by a BCNF schema.[5] Thus, unlike the first three normal forms, BCNF is not always achievable.
Consider the following non-BCNF table whose functional dependencies follow the {AB → C, C → B} pattern:
Person | Shop Type | Nearest Shop |
---|---|---|
Davidson | Optician | Eagle Eye |
Davidson | Hairdresser | Snippets |
Wright | Bookshop | Merlin Books |
Fuller | Bakery | Doughy's |
Fuller | Hairdresser | Sweeney Todd's |
Fuller | Optician | Eagle Eye |
For each Person / Shop Type combination, the table tells us which shop of this type is geographically nearest to the person's home.
The candidate keys of the table are:
- {Person, Shop Type}
- {Person, Nearest Shop}
Because all three attributes are prime attributes (i.e. belong to candidate keys), the table is in 3NF. The table is not in BCNF, however, as the Shop Type attribute is functionally dependent on a non-superkey: Nearest Shop.
The violation of BCNF means that the table is subject to anomalies. For example, Eagle Eye might be shown as an Optician on its "Davidson" record and Optometrist on its "Fuller" record. Holding each shop's Shop Type only once would seem preferable, as doing so would prevent such anomalies from occurring:
Person | Shop |
---|---|
Davidson | Eagle Eye |
Davidson | Snippets |
Wright | Merlin Books |
Fuller | Doughy's |
Fuller | Sweeney Todd's |
Fuller | Eagle Eye |
Shop | Shop Type |
---|---|
Eagle Eye | Optician |
Snippets | Hairdresser |
Merlin Books | Bookshop |
Doughy's | Bakery |
Sweeney Todd's | Hairdresser |
In this revised design, the "Shop Near Person" table has a candidate key of {Person, Shop}, and the "Shop" table has a candidate key of {Shop}. Unfortunately, although this design adheres to BCNF, it is unacceptable on different grounds: it allows us to record multiple shops of the same type against the same person. In other words, its candidate keys do not guarantee that the functional dependency {Person, Shop Type} → {Shop} will be respected.
A design that eliminates all of these anomalies (but does not conform to BCNF) is possible.[6] This design consists of the original "Nearest Shops" table supplemented by the "Shop" table described above.
Person | Shop Type | Nearest Shop |
---|---|---|
Davidson | Optician | Eagle Eye |
Davidson | Hairdresser | Snippets |
Wright | Bookshop | Merlin Books |
Fuller | Bakery | Doughy's |
Fuller | Hairdresser | Sweeney Todd's |
Fuller | Optician | Eagle Eye |
Shop | Shop Type |
---|---|
Eagle Eye | Optician |
Snippets | Hairdresser |
Merlin Books | Bookshop |
Doughy's | Bakery |
Sweeney Todd's | Hairdresser |
If a referential integrity constraint is defined to the effect that {Shop Type, Nearest Shop} from the first table must refer to a {Shop Type, Shop} from the second table, then the data anomalies described previously are prevented.
[edit] References
- ^ Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23rd, 1974). Republished in Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
- ^ Heath, I. "Unacceptable File Operations in a Relational Database." Proc. 1971 ACM SIGFIDET Workshop on Data Description, Access, and Control, San Diego, Calif. (November 11th-12th, 1971).
- ^ Date, C.J. Database in Depth: Relational Theory for Practitioners. O'Reilly (2005), p. 142.
- ^ Vincent, M.W. and B. Srinivasan. "A Note on Relation Schemes Which Are in 3NF But Not in BCNF." Information Processing Letters 48(6), 1993, pp. 281-83.
- ^ Beeri, Catriel and Bernstein, Philip A. "Computational problems related to the design of normal form relational schemas." ACM Transactions on Database Systems 4(1), March 1979, p. 50.
- ^ Zaniolo, Carlo. "A New Normal Form for the Design of Relational Database Schemata." ACM Transactions on Database Systems 7(3), September 1982, pp. 493.
[edit] Further reading
- Date, C. J. (1999), An Introduction to Database Systems (8th ed.). Addison-Wesley Longman. ISBN 0-321-19784-4.
[edit] External links
- Rules Of Data Normalization
- Advanced Normalization by ITS, University of Texas.
Topics in Database normalization |
First normal form | Second normal form | Third normal form |