Talk:First normal form
From Wikipedia, the free encyclopedia
Contents |
[edit] Primary keys with informational content
A commonly overlooked violation of 1NF is building intelligence into an identifier. The VIN for an automobile is not atomic, besides a sequence number it contains:
1) A code identifying the region where the vehicle was manufactured;
2) A code identifying the manufacturer;
3) Several codes identifying attributes of the vehicle;
4) A code indicating the vehicle model year;
5) A code indicating the plant where the vehicle was made;
Sadly, the VIN is probably the primary key to many tables in existence today. Business persons have a natural tendency to build inteligent keys. There is no mention of this in the main article, it only addresses the classic textbook example of eliminating redundant columns. Mooredc 17:56, 16 August 2006 (UTC)
- GUIDs contain (or, did in the past) a mac address and a timestamp, does that make a GUID a bad key? Isn’t the question more to do with whether that “intelligence” is actually used for anything? —Random832 20:33, 19 September 2007 (UTC)
-
- In August 2006 when Mooredc wrote this, the article did not contain a section on atomicity; now it does, and hopefully it gives a sense of why Date and others consider it a mistake to hang one's definition of 1NF on the notion of atomicity. 1NF aside, there are many situations in which building "intelligence" into a domain - particularly when it's a primary key column's domain - is a bad idea. If CAR_ID = "RED16", and the "RED" substring refers to the colour of the car in question, then we have a dilemma when someone paints the car blue. The identifier isn't much of an identifier if we are compelled to alter its value from time to time; on the other hand, if we keep its value the same in these types of situations, its meaning becomes muddled. Further, if knowing the colour of the car is important, we may as well expose the colour to the RDBMS by making the colour a column in its own right. You are exactly right when you say that what matters is whether the intelligence is used for anything - if it is, we get problems of the type described; if it isn't, we don't. --Nabav 22:03, 2 October 2007 (UTC)
[edit] Links to other forms articles
I don't know if this is the right place to add my two pence, but I was wondering if it wouldn't be a good idea to have some kind of links in the bottom of the article to move from 1NF to 2NF to 3NF and so on, instead of having to go back to the normalization in order to move from the article on one normal form to the other. ray 17:04, 21 August 2006 (UTC)
- Glad I'm not the only person thinking that. I'll get on with that template.--VinceBowdren 18:04, 21 August 2006 (UTC)
- Obviously, your comment here did get noticed. I've been treating Talk:Database normalization as the place for discussion of this set of articles in general, though I and the others working in this area have all the articles on our watchlists anyway. I don't think its worth setting up a WikiProject for this few linked articles.--VinceBowdren 19:12, 21 August 2006 (UTC)
[edit] What if NULL is part of the applicable domain?
I mean, take this:
First Name | Last Name | Middle Name |
---|---|---|
John | Public | Q |
J | Hacker | Random |
John | Doe |
“Has no middle name” is a valid answer, yet having a nullable column seems to violate that requirement —Random832 20:28, 19 September 2007 (UTC)
- The first point to make is that Null is never part of the applicable domain. The Null (SQL) article puts this succinctly: 'Since Null is not a member of any data domain, it is not considered a "value", but rather a marker (or placeholder) indicating the absence of value.'
- Second, although you are completely right in saying that "has no middle name" is a valid answer, it does not follow that a Null in a Middle Name column is the only way of expressing the proposition. There are other ways of doing it. A logically-sound way that does not involve Nulls is:
Person ID | First Name | Last Name |
---|---|---|
1 | John | Public |
2 | J | Hacker |
3 | John | Doe |
4 | George | Bush |
Person ID | Middle Name Sequence Num | Middle Name |
---|---|---|
1 | 1 | Q |
2 | 1 | Random |
4 | 1 | Herbert |
4 | 2 | Walker |
- Notice that with this method we also allow a person to have any number of middle names.
- The two tables above are in 1NF even by Date's stringent definition. --Nabav 06:51, 2 October 2007 (UTC)
[edit] Seems like the use of the word tuple is inconsistent
The article contains the following:
A view whose definition mandates that results be returned in a particular order, so that the row-ordering is an intrinsic and meaningful aspect of the view.[5] This violates condition 1. The tuples in true relations are not ordered.
However, when I followed the link for tuple, the formal definition said that tuples are ordered. —Preceding unsigned comment added by 84.75.117.176 (talk) 10:44, 30 September 2007 (UTC)
-
- The formal definition is talking about how the values within a tuple are ordered, whereas here we're talking about something different, namely the possibility of tuples being ordered with respect to each other. I've changed the wording to make this more clear. --Nabav 22:49, 1 October 2007 (UTC)
[edit] 1NF Example is not appropriate
Please revise the 1NF Example as it can be an example of 2NF also. —Preceding unsigned comment added by 122.164.254.49 (talk) 11:14, 1 October 2007 (UTC)
- done.Heathcliff (talk) 03:39, 8 February 2008 (UTC)
- There was nothing wrong with the 1NF example being in 2NF as well. As has been explained before on the discussion pages of some of the other NF articles, and in the main database normalisation article, normalisation is not an iterative process: we do not normalise to 1NF, then to 2NF, then to 3NF, etc. On the contrary, if a design problem prevents a table from meeting Nth Normal Form, correction of the problem typically causes the revised table(s) to meet not only Nth Normal Form, but the higher normal forms as well (thus for example the revised table(s) will be overwhelmingly likely to be in 5th Normal Form). Thus I've reverted the example back. Having said all this, I believe the article DOES need some additional material that briefly notes that some 1NF tables suffer from problems, and that the job of 2NF is to address those problems. An example of such a 1NF table could be given. All of this would be in a separate section called something like "Normalization Beyond 1NF", and, of course, a link to the 2NF article would be given within that section. --Nabav (talk) 18:49, 29 February 2008 (UTC)
- I've added the "Normalization Beyond 1NF" section now. --Nabav (talk) 19:41, 28 April 2008 (UTC)
[edit] Atomicity
Seems like there should be some mention of the importance of atomicity instead of just explaining that Date disagrees with Codd and listing the ways atomicity can be taken to extremes. Ostensibly, Codd was trying to say that putting a field in a field isn't generally a good idea. For example, you wouldn't want a flight number field to contain the codes for the airports of departure and arrival. That's the common sense way to interpret atomicity. Seems like that should be mentioned first instead of not at all. Then talk about all the theoretical mumbo jumbo that has limited practical value to someone just learning about 1NF. --Trweiss (talk) 22:09, 12 May 2008 (UTC)
- Cases in which somebody wants to place more than one of the same type of thing in a single field have been covered in the "Repeating Groups" section of the article. The case you mention, involving placing multiple different types of things (i.e. departure airport and arrival airport) in the same field, arguably hasn't been covered. And covering it might be a good way to introduce atomicity. As an intro to the Atomicity section, we could mention something like the departure airport / arrival airport case, and make the point that this is the sort of thing people are trying to discourage when they encourage "atomicity". So in summary: I agree! We just need to be careful not to give the reader the impression that atomicity is a clearer and more well-defined concept than it actaully is. Atomicity is in the eye of the beholder (a string field can contain multiple letters of the alphabet combined together into words, for example, and no one complains about that) ... If the smaller components of the data within the field are not worth separating out as fields in their own right, then don't; if they are, then do. --Nabav (talk) 08:10, 13 May 2008 (UTC)