Talk:Relational model


This article is within the scope of WikiProject Databases. It is rated as Start-Class and as Top-importance on the assessment scale.

1 Key ambiguity
2 relational / normalized
3 More than Codd, Date & Darwen . .
4 References to Chen?
5 Set Theory Stuff From the Database Normalization Article
6 Naming Conventions: Singular Vs. Plural
7 Misimplementation disputed statement
8 Proposed Merger with relational database
9 table
10 never
11 Musings on Tables vs Tuples
12 i can't understand this article
13 why build databases on the relational model?
14 "truly" relational
15 Organization of Articles Related to Relational Databases
16 kind of model
17 Can we get a good definition "Relational model"?
18 the columns and the rows of a table are ordered
19 What "relational" means (opening paragraph)
20 Use of "arbitrary" for selecting primary keys
21 Formal definition of Relation needed
22 "Criticism"
23 "Competition"
24 "Cartesian product becomes commutative"

Key ambiguity

I see the term "key" used by itself, with no prefix, many times in articles. I am confused about what this means: is it a superkey, a candidate key, or something else?

Typically, when "key" is used by itself, it refers to a superkey (i.e., any kind of key except a foreign key: any set of attributes that uniquely identifies each tuple of the relation). McKay 13:58, 14 August 2006 (UTC)

Okay, so using "key" in this sense isn't really preferred, as it is ambiguous (especially to beginners). I looked at its usage in this article and found only one instance of "key" that isn't specified, so I fixed it. In this case, the usage means either kind. It's a bit verbose, partly because superkeys are not referenced in that paragraph, but saying "candidate keys" isn't technically correct, so if anyone wants to improve readability, that's fine. McKay 14:02, 14 August 2006 (UTC)
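For anyone trying to keep the two terms apart, here is a minimal Python sketch of the distinction discussed above. The relation, its attribute names, and the helper functions are all hypothetical, made up for illustration. Note also that a key is really a constraint declared on a relvar, so checking one sample of data can only show that the data is consistent with a key, not that the key holds in general.

```python
from itertools import combinations

# Hypothetical sample data: each tuple is a dict from attribute name to value.
EMPLOYEES = [
    {"emp_id": 1, "email": "ann@example.com", "dept": "R&D"},
    {"emp_id": 2, "email": "bob@example.com", "dept": "R&D"},
    {"emp_id": 3, "email": "cat@example.com", "dept": "Sales"},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in sorted(attrs))
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_candidate_key(rows, attrs):
    """A candidate key is a superkey none of whose proper subsets is a superkey."""
    attrs = frozenset(attrs)
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_superkey(EMPLOYEES, {"emp_id", "dept"}))       # a superkey, but not minimal
print(is_candidate_key(EMPLOYEES, {"emp_id", "dept"}))  # so not a candidate key
print(is_candidate_key(EMPLOYEES, {"emp_id"}))
```

In this sample, {emp_id, dept} is a superkey but not a candidate key, because emp_id alone already identifies each tuple.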

relational / normalized

A database built on the pure relational model would be entirely normalized.

Under any reasonable interpretation of "entirely normalized" that would seem false to me. The relational model only requires that the relation is in 1NF and according to Chris Date even that is no longer required. -- Jan Hidders 15:05 Mar 4, 2003 (UTC)

Mr. Hidders, I don't remember Chris Date saying "1NF is no longer required" in any of his books. Can you kindly provide us a reference for it? What I know is that a relation is already in 1NF by definition (Warning: Don't confuse "relation" with "table"), and Date extended the definition of 1NF to include relations having relation-valued attributes (i.e., they have attributes whose values themselves are relations). See also Date's What First Normal Form Really Means and Fabian Pascal's What First Normal Form Means NOT for a more precise definition of 1NF. Oops, forgive me for referencing DBDebunk's papers. :-) -- Perry V. 02:53, 18 September 2005 (UTC)

I meant the term "1NF" as it is usually understood by database researchers and practitioners, and not how Chris Date and Fabian Pascal have redefined it. -- Jan Hidders 12:16, 24 January 2006 (UTC)

I wish I knew how "1NF" is "usually understood by database researchers ...". I wouldn't say that Chris Date has redefined it. Rather, he observes that if you discard that part of Codd's definition that we think has to be discarded, what is left is a definition that is redundant because under that revised definition every relation is by definition in 1NF. The part of Codd's definition that has to be discarded is the requirement for attribute values to be "atomic", a term that has so far defied attempts to define it precisely. (The definition offered by Codd himself in his 1990 book, The Relational Model for Database Management Version 2, is quite unacceptable.)

Some people think it might be useful to define 1NF to refer to a relation none of whose attributes is relation-typed or tuple-typed. Then it might be good advice to aim for database designs in which all the base relvars are in 1NF under that revised definition. Note that 1NF then becomes orthogonal to BCNF, 5NF and 6NF. Note also that normalisation is not a requirement of the model. A database consisting entirely of relvars is a relational database regardless of its degree of normalisation. AndrewWarden 13:12, 30 January 2006 (UTC)

A tuple is a set of attributes, which are ordered pairs of domain and value.

That's a very sloppy definition if not simply wrong. An attribute should be defined as a pair of attribute name and value, and it should be stated that a tuple cannot contain two attributes with the same name. It's probably better to simply define it as a partial function from attribute names to values. -- Jan Hidders 15:05 Mar 4, 2003 (UTC)

I completely agree that the definition is sloppy but I don't think the correction offered is quite right. Codd defined the term attribute to be a constituent of a relation heading, not a tuple. A tuple is a set of attribute values. An attribute is a pair of attribute name and type name. An attribute value is a pair of attribute and value. A heading is a set of attributes. Tuples have headings and so do relations. The heading of every tuple in the body of relation r is the heading of r itself (unless subtyping is supported, in which case the heading of the tuple must be such that each type of each attribute value is a subtype of the corresponding attribute in the heading of r). AndrewWarden 13:12, 30 January 2006 (UTC)

After I had written the above comment I decided to take the bull by the horns and make corresponding corrections to the article. I took the opportunity to address quite a number of other problems I found. AndrewWarden 14:14, 31 January 2006 (UTC)
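To make the definitions in the comment above concrete, here is a rough Python sketch: an attribute is a (name, type name) pair, a heading is a set of attributes, an attribute value is an (attribute, value) pair, and a tuple is a set of attribute values. The function names, the type-name lookup table, and the sample data are assumptions made for illustration only, and subtyping is ignored.

```python
# Attribute: (attribute name, type name). Heading: a set of attributes.
HEADING = frozenset({("name", "str"), ("age", "int")})

# Hypothetical lookup from type names to Python types, for checking values.
TYPES = {"str": str, "int": int}


def make_tuple(heading, **values):
    """Build a tuple as a set of (attribute, value) pairs matching the heading."""
    if set(values) != {name for name, _ in heading}:
        raise ValueError("values must cover exactly the attributes of the heading")
    pairs = set()
    for name, type_name in heading:
        value = values[name]
        if not isinstance(value, TYPES[type_name]):
            raise TypeError(f"{name} must be of type {type_name}")
        pairs.add(((name, type_name), value))
    return frozenset(pairs)


def heading_of(tup):
    """The heading of a tuple is the set of its attributes."""
    return frozenset(attribute for attribute, _ in tup)


def make_relation(heading, tuples):
    """A relation is a heading plus a body: a set of tuples with that heading."""
    body = frozenset(tuples)
    if any(heading_of(t) != heading for t in body):
        raise ValueError("every tuple must have the heading of the relation")
    return (heading, body)


ann = make_tuple(HEADING, name="Ann", age=30)
same_ann = make_tuple(HEADING, age=30, name="Ann")  # attribute order is irrelevant
people = make_relation(HEADING, [ann, same_ann])    # the duplicate collapses
```

Because both the tuple and the body are sets, neither attribute order nor duplicate tuples can arise in this encoding.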

The Information Principle is stated in a misleading way. Not all information in a database is in the values used; rather, it is in the way these values are structured in the database. For example, the value 5 doesn't give us any information; but when it is an attribute named Age of a relation named Person, it does, owing to our interpretation of these names. The given examples even use values that have no meaning at all, except to indicate structure!

Date phrases the principle as follows:

 All information in the database must be cast explicitly in terms of values in tables and in no other way.

The addition "in tables" is what is missing from the present text.

More than Codd, Date & Darwen . .

The article is kind of selective when it implies that two individuals are responsible for the subsequent maintenance and development of Codd's original work. Reading the research literature for the period 1975-2000 (TODS, SIGMOD, VLDB), you find that many others made substantial contributions: Fagin, Ullman, Goodman, Beeri, etc. In tone and substance the write-up reflects 'the gospel according to Date & Pascal' and not the broader perspective found in most textbooks.

FWIW, I fully agree. Any concrete suggestions on how to rewrite it? I'm afraid I'm lacking the time to do that properly. -- Jan Hidders 17:47, 1 Jun 2004 (UTC)

As one of the two individuals in question I have to feel embarrassed by the lack of mention of other contributors to this field. However, I'm not aware of any others who sought to get to the heart of the model in quite the way we did with The Third Manifesto. Nor of any others who have been quite so determined to draw attention to the failings of SQL as a purported implementation of that model. But why "Date and Pascal"? Shouldn't that be "Date and Darwen"?

I suppose the term "maintenance" refers to our attempt to clarify what Codd (1970) was not clear about, especially the term domain and its connection with the then existing term data type. Has anybody else done anything similar? As for "development", I'm not quite sure what kind of activities that might refer to, but I think most people would take it to include all sorts of activities that Date and I could not lay claim to as our own. (I'm sorry for any confusion caused by my whimsical use, here in Wikipedia, of the pseudonym I adopted in 1988 to avoid any conflict with my then employer, whose research organisation had been responsible for the development of the language I sought to criticise.) AndrewWarden 13:12, 30 January 2006 (UTC)

Just to chime in here, I was quite surprised that the page wasn't written by Fabian Pascal (honestly). From a neutral POV, there are (to my knowledge) at least three distinctly different constructions of the relational model that differ even in how they define a "tuple":

An n-tuple of values (as in Codd's RM/V1,p.379; [1], p.399; Elmasri & Navathe, p.128)
A set of ordered pairs <attribute-name, value> (e.g. in Codd's RM/T, p.399 and E&N, p.130)
A set of ordered triples: <attribute-name, type-name, value> (Date, p.141)

Of these, I would guess that the first is probably the most widely taught and easiest to grasp (but I don't know). Also, the very extreme anti-null stance of Date and Darwen, while deserving of presentation, shouldn't be presented as being essential to the relational model, IMHO, since some constructions of the relational model (Codd's RM/T, p.400), Codd himself until the day he died (IIRC according to Date, anecdotally), and common practice regard nulls as important and useful.
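For comparison, the three constructions of a tuple listed above could be written down in Python roughly as follows. The column order, the type names, and the sample values are assumptions chosen for illustration; none of them come from the cited books.

```python
# Assumed column order and type names, needed to convert between the forms.
COLUMNS = [("name", "CHAR"), ("age", "INTEGER")]

# 1. An n-tuple of values: position carries the meaning.
positional = ("Ann", 30)

# 2. A set of ordered pairs <attribute-name, value>: no positional significance.
named = frozenset(
    (name, value) for (name, _), value in zip(COLUMNS, positional)
)

# 3. A set of ordered triples <attribute-name, type-name, value>.
typed = frozenset(
    (name, type_name, value)
    for (name, type_name), value in zip(COLUMNS, positional)
)

# Swapping positions changes the first form, but the set-based forms have
# no order to swap.
print(positional == ("Ann", 30), positional == (30, "Ann"))
print(named == frozenset({("age", 30), ("name", "Ann")}))
```

The practical difference is that the first form needs an agreed column order to be interpreted, while the other two carry the attribute names with them.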

I can see this getting controversial :o)

As a final comment, while others may disagree, the link to DBDebunk seems pretty inappropriate to me - AFAICT (and I've been around that site quite a lot over the last couple of years), it just seems to funnel people into buying stuff and doesn't explain much unless you're prepared to pay $10 for each "paper"; I wouldn't regard it as the least bit useful for anyone who needs to look up "relational model" in an encyclopaedia, and (although not that important) it is pretty dreadful (HTML and design-wise).

My 2¢ worth EmmetCaulfield 05:40, 17 Nov 2004 (UTC)

I've removed the link to dbdebunk. I agree with your description of it, and it is clearly mainly devoted to selling their papers. Not appropriate for an encyclopedia. It was added on 04:14, 9 Sep 2002. If someone wants to defend it, feel free to add it back in. JesseW 09:12, 22 Nov 2004 (UTC)

References to Chen?

Why are there references to Chen on this page? I think they should be on the page on ERDs. Note that ERDs are not part of the relational model and also not a way of describing a relational schema. They are a way of describing a conceptual data model that later may or may not be mapped to the data model of a relational database, or to the data model of another type of database. -- Jan Hidders 14:15, 26 Jun 2004 (UTC)

Entity-relationship modelling has nothing to do with the relational model. In my opinion there should be no mention of it here at all. AndrewWarden 13:12, 30 January 2006 (UTC)

I agree completely -- P. Chen's Entity-Relationship model is distinct from E. F. Codd's Relational Model. The Entity-Relationship model is an object-based logical model used to describe data, while the Relational model is a record-based logical model used to describe data. This is pretty much evident in every introductory textbook on databases, for example, in Silberschatz, Korth, and Sudarshan, Database System Concepts. I suggest removing all references to Chen and the Entity-Relationship model from this page. For those who are having trouble understanding how the relational model and the E-R model are related: the relational model came first. But modeling data and relationships using pure relations (tables) was unwieldy so Chen later developed the E-R model to let users think about data in terms of entities and relationships (and draw diagrams). From the E-R diagrams, one could then translate the E-R model into tables in a relational database. They are two distinct models, often used together and often interchanged in casual discussion and tutorials, but they are in fact two distinct models. Kendrick Hang (talk) 05:35, 23 May 2008 (UTC)

If you look at the references link, you should see how fundamental the Peter Chen references are. I can't find anything other than PDF files illustrating how the concept taken from the evolution of the Chinese written language was adapted to computer science for the creation of the relational model; I can't find any PNG or GIF on the net, so the images and descriptions are in those PDFs... look at them, very cool 16:45, 26 Jun 2004 (CET)

Sure, they are interesting, but that is besides the point. They are perhaps fundamental for ERDs but not for the relational model, so I still don't think this article is the most appropriate place for them. -- Jan Hidders

But according to those "papers" there is nothing concerning the diagrams; it is all about the Relational Model! I can't find the word "diagram" or the ERD acronym anywhere on those pages/links... it all seems to mean "Peter Chen is the author of Relational Model. and that's all the story about that..." 23:54 26 Jun 2004 (CET)

The phrase "Entity-Relationship Diagram" appears in all those articles and none of them contain the claim that Peter Chen is the inventor of the relational model. Like I said, these articles are interesting and should certainly be linked to, but not in the article on the Relational Model. -- Jan Hidders 23:37, 26 Jun 2004 (UTC)

Dr. Peter Chen's original paper on the Entity-Relationship model (ER model) is one of the most cited papers in the computer software field. Recently, Dr. Chen was honored by the selection of his original ER model paper as one of the 38 most influential papers in Computer Science according to a survey of 1,000 computer science college professors (Table of Contents, Great Papers in Computer Science, edited by P. Laplante, West Publishing, 1996). Based on one particular citation database, Chen's paper is the 35th most cited article in Computer Science. This is taken from the home page of Peter Chen, and a CTRL+F for "diagram" shows no results; please tell me where you find that word. User:Aytharn 13:10 27 JUN 2004 (CET)

For example, see the last paragraph of section 1 in http://bit.csc.lsu.edu/~chen/pdf/erd.pdf . But with all respect, you seem to be missing the point. The point was that the relational model and the entity-relationship model are two different things. Since Chen invented the latter and not the former, the statements concerning him were moved to entity-relationship model. By the way, you might want to add the information that you mention here in your citation above on the article entity-relationship model. A description like "a recent survey shows" should be detailed a bit more and written such that it is also correct if the reader reads it 100 years later. -- Jan Hidders 11:29, 27 Jun 2004 (UTC)

Set Theory Stuff From the Database Normalization Article

There were complaints (which I agreed with) that the set theory definitions in the database normalization article were very off-putting and technical. Unfortunately, this is just a subset of the definitions--the ones most applicable to normalization. However, as these definitions apply to more than just normalization, it seems that they would be more appropriately defined in the relational model--or perhaps an article of their own? They certainly were making the other article unreadable. Metaeducation 11:41, 30 May 2005 (UTC)

My (= rgb.pile) comments relate to 'Set Theory Stuff' and what is said in the article under 'History':

   "The foundation for the relational model included important works published by Georg Cantor (1874) 
   and D.L. Childs (1968)."

More precisely, I'd say that above statement about the foundation should include the Cauchy/Cantor Diagonal Method. [2] just rediscovered a paper of W. T. Hardgrave:

   Hardgrave, W. T. (1976): A technique for implementing a set processor. 
   In: ACM SIGPLAN Notices, Vol. 11, Nr. SI, p. 86–94. Available online at
   http://doi.acm.org/10.1145/942574.807126

Hardgrave gives three references to Childs.

Naming Conventions: Singular Vs. Plural

There is no real establishment or discussion thus far of naming conventions in the Relational model. Some preach about singular table names and some preach about plural names. I don't think we could ever come to a consensus on that, but I would like to propose a new note or preferably a section concerning singular and plural table naming that highlights the pros and cons of each. -- 24.173.83.230 18:06, 6 December 2005 (UTC)

Misimplementation disputed statement

Relational database management system describes two schools of thought. One considers SQL to follow the relational model, and the other does not. This article says that it does not. The two articles need to be reconciled. It would be nice if both schools of thought could be documented in external sources. -- Beland 06:56, 18 December 2005 (UTC)

Also, this section doesn't really say how SQL is considered to violate the relational model, other than using different vocabulary to describe its data structures. -- Beland 07:20, 18 December 2005 (UTC)

Probably the most obvious violation of the relational model is that SQL permits duplicate rows; in contrast, a mathematical relation is a set (not a multiset) of tuples in which a duplicate cannot arise by definition. Inclusion in recent versions of the standard of other things like ARRAYs (SQL99) and XML (SQL03) have further distanced SQL from the conventional relational model, which requires values in tuples to be single values from the corresponding domain: arrays are, by definition, multiple values from some domain and hence violate this requirement. -- EmmetCaulfield 03:43, 22 December 2005 (UTC)

The violations are: duplicate rows, three-valued logic for evaluation of Boolean expressions, anonymous columns, having more than one column of the same name in a single table, that the order in which the columns of a table appear has significance, and that an update to a view can be accepted as an update to the database even though that update does not necessarily have the expressed effect on the view. I might have missed a couple. AndrewWarden 13:12, 30 January 2006 (UTC)
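A few of the violations listed above are easy to observe directly. The following is a small sketch using Python's standard sqlite3 module with an in-memory database. The table and its contents are made up, and the behaviour shown is that of SQLite specifically, so other SQL products may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 30), ("Ann", 30), ("Bob", None)],  # a duplicate row and a NULL
)

# Duplicate rows: the SQL table keeps both copies of ("Ann", 30), whereas a
# relation, being a set of tuples, cannot contain the same tuple twice.
rows = conn.execute("SELECT name, age FROM person").fetchall()
as_relation = set(rows)

# Three-valued logic: for Bob, "age = age" evaluates to unknown rather than
# true, so his row is not returned.
matching = conn.execute("SELECT name FROM person WHERE age = age").fetchall()

# More than one column with the same name in a single result table.
cursor = conn.execute("SELECT name AS n, age AS n FROM person")
column_names = [description[0] for description in cursor.description]

print(len(rows), len(as_relation))
print(matching)
print(column_names)
conn.close()
```

The set built from the rows has one element fewer than the table has rows, which is the duplicate-row point in miniature.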

I think it's pretty well accepted that SQL is not now relational, but was inspired by the relational model, and can be used to implement a relational database, albeit with some care. It is probably safe to say that available SQL DBMSs implement an ill-defined superset of the features required by the relational model. What is a religious issue is whether the bits "added on" are a good idea or not: in making ad-hoc additions, the guarantees of the relational model (e.g. that all expressible queries are answerable, and that all answerable queries are expressible), which arise solely from the mathematical foundations, are lost.

The SQL99 Standard (I don't have the 2003 version to check) does not use the word "relation" or "relational" at all (just searched the PDFs), and I doubt the terms have been reintroduced into SQL'03, so I think it's probably fair to say that even the SQL standards committee have distanced SQL somewhat from the term "relational". -- EmmetCaulfield 03:43, 22 December 2005 (UTC)

As a former member of that committee (1988-2004), I can confirm that the decision to avoid using the term "relation" was a deliberate one, taken in the 1980s some time before I joined. AndrewWarden 13:51, 30 January 2006 (UTC)

Proposed Merger with relational database

relational model MERGE relational database

It is wrong to merge the entry for 'Relational Model' with 'Relational Database' . Relational Data Model is the blueprint for a Relational database. Merging the entry for 'Relational Model' with 'Relational Database' is like merging the entry for 'Blueprint' with the physical thing for which the blueprint was made. First design, then build.

JA: No, I think the term "relational database" still has its uses, while the term "relational model" suffers from the general overloading of the word "model". Whereas "model" in database theory means something like a style of formalism or a canonical form, the term "relational model" raises the question: Do they mean models in the sense of model theory? This would lead to a different idea than a concrete database. Jon Awbrey 05:40, 6 January 2006 (UTC)

JA: Uno, the more I think about it, and the more that I find myself constantly having to refer other article readers to three (3) different articles -- Relational algebra, Relational database, and Relational model -- on the contemporary practical applications of that which DeMorgan and Peirce hath wrought so long ago, the more I'm beginning to think that it might be good to subsume the little bit of stuff that's in Relational database under a heading of the Relational model article, saying something to the effect that a relational database is a single instance under the relational model (in the sense of formalism, paradigm, or style). The logical theory/model distinction, then, would still be preserved in the difference between rel algebra and rel model. What do folks think about that? Jon Awbrey 13:16, 7 February 2006 (UTC)

I think this is it, this is the reference you've been referring to, the one where it was decided how to separate the three articles. I disagree with this proposal as it is written because it isn't factually correct. Relational algebra is an algebra for performing computation on relational databases. Relational databases are databases which conform to the relational model. Sometimes Relational databases don't perfectly conform to the relational model, but I think the current relational database article reflects this. McKay 17:44, 28 September 2006 (UTC)

I think that it's not correct to merge with relational database. The model was created as a logical model in 1970, and its first application to databases was in 1981. The model is strictly connected with mathematics and sets. --Ilario 21:46, 22 February 2006 (UTC)

From a relative newbie's standpoint, I find the distinction between "Relational Model" and "Relational Database" quite useful, and think they should remain separate articles.

What I would hope to get from the "Model" article is background on the theoretical model (with various interpretations as appropriate). From the "Database" article I would expect to learn about implementations: History, how they vary in compliance with the model, practical implementation issues, what to expect from real software.

BTW: Some treatment of how the relational model (pardon the expression) relates to object modeling and some mention of object/relational mapping ought to be in here somewhere, but I don't know where, or to what extent. -ef

I oppose the proposed merger. 132.205.45.148 03:38, 3 June 2006 (UTC)

I also oppose the proposed merger. "relational model" is a mathematical term. While, in general, it is possible to blur the two, the two have different degrees of definition strictness. The relational model has very strict and stringent definitions. It involves the mathematical principles underneath. Relational database isn't nearly as strict. Any database made in any RDBMS should qualify. The distinction I'm trying to make here, is that almost all experts would agree on relational model, but there is a great deal of disagreement on what constitutes a Relational database, and more particularly, a Relational database management system. McKay 05:50, 4 June 2006 (UTC)

table

Small definition question: is a table not the visual representation of a relation, rather than of a relvar? See the third line of the first 'chapter'.

Quite right! By the way, it has been observed that there are two relations that cannot reasonably be represented this way, namely, the ones I chose to call TABLE_DEE and TABLE_DUM back in 1988! AndrewWarden 13:40, 30 January 2006 (UTC)

Having written my agreement, I then went and made the obvious correction to the article. I suppose it might help to remove the comment altogether now, but I'm not sure what the policy should be or is on such matters. AndrewWarden 14:14, 31 January 2006 (UTC)

never

DONT DO IT SCOTTY!! NOOOOOOOOOOOOOO!!!!!

and also - can u ever explain stuff for dumb a*rses that use the site - trying to write a report on stuff I dont understand is hard enough.... stop making it hard!!

No, we refuse... 210.50.86.217 14:11, 9 May 2006 (UTC)

Musings on Tables vs Tuples

The distinction between tables and tuples is explained and it is pointed out that tables can be a convenient way to visualize tuples.

It might be instructive to present the same set of tuples in tabular and non-tabular form.

There are several systems for managing data in text tables in a relational (or quasi-relational) way. /rdb, nosql, bell labs unity come to mind.

Could, say, a row from a /rdb table be presented in a tagged-data format? For example, assuming a header that associated field names with data types, could an entry in a vCard or LDAP style format be considered a representation of a tuple?

This sort of thing might help abstract the concepts and relationships from the representations by allowing the reader to identify what the representations have in common.
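As a rough illustration of the idea above, the following Python sketch writes the same set of tuples once as a tab-separated text table and once as tagged records, then reads both back. The two text formats are invented for this example and are only loosely in the spirit of text-table tools and vCard/LDAP-style entries; they are not the actual formats of any of the systems mentioned. All values are kept as strings to sidestep the question of data types.

```python
# Each tuple is a frozenset of (attribute name, value) pairs; the data is made up.
ROWS = {
    frozenset({("name", "Ann"), ("city", "Oslo")}),
    frozenset({("name", "Bob"), ("city", "Lima")}),
}


def to_table(rows, columns):
    """Tabular form: a header line, then one tab-separated line per tuple."""
    lines = ["\t".join(columns)]
    for row in sorted(rows, key=sorted):
        values = dict(row)
        lines.append("\t".join(values[c] for c in columns))
    return "\n".join(lines)


def from_table(text):
    header, *body = text.splitlines()
    columns = header.split("\t")
    return {frozenset(zip(columns, line.split("\t"))) for line in body}


def to_tagged(rows):
    """Tagged form: 'name: value' lines, with a blank line between tuples."""
    records = []
    for row in sorted(rows, key=sorted):
        records.append("\n".join(f"{name}: {value}" for name, value in sorted(row)))
    return "\n\n".join(records)


def from_tagged(text):
    rows = set()
    for record in text.split("\n\n"):
        if record.strip():
            pairs = (line.split(": ", 1) for line in record.splitlines())
            rows.add(frozenset((name, value) for name, value in pairs))
    return rows


print(to_table(ROWS, ["name", "city"]))
print(to_tagged(ROWS))
```

Both forms read back to the same set of tuples, and so does the table with its columns in the other order, which is the sense in which they are just different representations of one thing.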

i can't understand this article

I like to think I understand relational databases (I worked at Oracle for two years) and I like to think I have some mathematical ability, but I really could not understand this article. It seems to be written by mathematicians for mathematicians. There's a lot of formal definition of terms with no motivation or intuitive guides. If I wanted this kind of dry treatment I would read the original papers not a general purpose encyclopedia entry.

I'll be the first to admit my understanding of the relational model is not strong; if it were, I wouldn't have looked it up. But the relational model is famous for its application in computer science, and as a professional programmer I should be able to understand this article.

Agreed :-)

To show you the depth of my confusion I'm going to go through the article line by line and just spew the questions that pop into my head.

I know what a set is. I know what a cartesian product is. I know what a subset is. It is difficult for me to combine the concepts into an understanding of what a mathematical n-ary relation is.

Loosely, an n-ary relation is a subset of the Cartesian product of n domains (D1, D2, ..., Dn). So, it is a set of tuples, where each tuple is an ordered sequence of n values (e1, e2, ..., en) that draws its ith element from the corresponding domain: e1 from D1, e2 from D2, etc.

I say "loosely" because some of the domains might be the same, which complicates things a little. For example, when a mathematician says "a binary relation on R" he means a set of ordered pairs (2-tuples) both of whose elements are drawn from R.

I may get flamed for saying it, but for the current purpose, domain is more-or-less synonymous with datatype. This being the case, "position in a tuple", "column in a table", are pretty well the same thing, and have values "of a datatype" or "from a domain". A good deal of what Date has written is aimed at eradicating the positional significance that I've assumed for simplicity, and he re-defines tuple to that end. It is his alternative definition of tuple that is given in this article. EmmetCaulfield

Tell me, in a relational database is a row an n-ary relation?

No, a row is just one of the n-tuples in the n-ary relation. EmmetCaulfield

or a whole table?

A table is just a way of displaying an n-ary relation. It has n columns. The ith column contains the ith value from each tuple in the relation. EmmetCaulfield

if I think hard and painfully I can guess that the cartesian product of n sets is the space of possible rows that might exist (evaluate to true) while the subset is the rows that actually do exist.

Bingo! Buy yourself a beer! EmmetCaulfield
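The exchange above can be spelled out in a few lines of Python. This is only a sketch with two tiny made-up domains: the Cartesian product is the space of rows that could exist, and the relation is the subset of rows that actually do.

```python
from itertools import product

# Two made-up domains (roughly, datatypes with very few values).
NAMES = {"Ann", "Bob"}
AGES = {30, 40}

# The Cartesian product NAMES x AGES: every row that could possibly exist.
possible_rows = set(product(NAMES, AGES))

# A binary relation over these domains: the rows that actually do exist.
person = {("Ann", 30), ("Bob", 40)}

# A row is one of the tuples in the relation, not a relation itself.
row = ("Ann", 30)

# The ith column of the table holds the ith value of each tuple.
name_column = {r[0] for r in person}

print(len(possible_rows))       # number of possible rows
print(person <= possible_rows)  # the relation is a subset of the product
print(row in person)
```

Here ("Ann", 40) is a possible row that is not in the relation, that is, a statement about Ann that could be made but is not asserted.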

Why can't an explanation like that go in the article? Ideogram 18:37, 28 May 2006 (UTC)

I dunno, I didn't write it. -- EmmetCaulfield 21:17, 28 May 2006 (UTC)

Is the person who wrote it the only one allowed to insert an explanation like that? What if I tried to just insert it myself? Ideogram 01:14, 29 May 2006 (UTC)

For christ's sake don't make me click on the predicate logic link to understand the article.

IMHO, logic is a prerequisite to understanding databases, just as calculus is a prerequisite for understanding physics. Just because you can drive/fix a car, and work for Ford, doesn't mean that the underlying theory of the internal combustion engine is, or should be, readily accessible to you. Similarly, just because you can operate/program a computer, and work for Oracle, doesn't mean that the underlying theory of the database is, or should be, readily accessible to you. What you are saying seems to me to be equivalent to "I can drive, and I know what an engine is, so why is there all that integral calculus crap in the Carnot cycle article?".

The alternative to understanding the mathematics of the relational model is to battle through the turgid pseudo-formalism of most well-known textbooks without ever really getting it, and survive your professional involvement with databases, as most do, with ill-founded half-knowledge and a cook-book approach. Trust me, I have been that soldier: I thought I knew about relational databases until I found out that I didn't and my understanding is still not complete.

Understanding is never complete. It proceeds in a series of steps. The size of a step that can be made from reading an encyclopedia article is necessarily small. You can't possibly make a bigger step than a textbook here. Your admission that this article has higher goals than an entire textbook is a weakness not a strength.

Huh? I have no "goals" for this article at all, I have made little or no contribution to the article and explicitly stated (below) that I would have written a different one. However, I don't think that "dumbing down" the article is the answer. Once you get to the level of asking about the relational model, you have reached a level of detail where it is largely impossible to have a meaningful non-mathematical explanation. -- EmmetCaulfield 21:17, 28 May 2006 (UTC)

I didn't say you have goals for this article. Ideogram 01:13, 29 May 2006 (UTC)

As for ill-founded half-knowledge, there is a huge demand for explanations of how quantum physics changes the way you look at the world, as a paradigm you might say, without having to learn calculus first. Granted this ill-founded half-knowledge leads to claptrap such as comparisons of quantum physics with eastern philosophy, but it is still better than nothing. I am asking for an explanation on this level of the paradigm of the relational model. Ideogram 18:37, 28 May 2006 (UTC)

In terms of levels of detail within their respective fields, quantum physics (a major branch of physics) is approximately equivalent to databases (a major branch of computing). In Database, you will indeed find generalities for consumption by the layman as you do in Quantum Physics. However, the explanation of the relational model (theoretical foundations of databases) is, in terms of detail within the field, properly comparable to something like Dirac's Bra-ket notation, which legitimately supposes that the reader knows what a Hilbert space and a complex conjugate are, just as this article legitimately supposes that the reader knows what predicate logic and sets are.

That doesn't change the fact that there are many people who want an explanation for the layman (or at least someone familiar with relational databases but not mathematics), and, I submit, that audience is larger than the audience of mathematicians who can understand this article. Ideogram 01:16, 29 May 2006 (UTC)

If you really want to understand the elegance, beauty, and simplicity of the relational model, and the wonderful insight of Codd, you really have to understand first-order logic. You will probably have to learn a little about proof theory and model theory along the way, but honestly and truly, it is well worth your effort in the understanding. EmmetCaulfield

what is an evaluation?

In this context, he's just saying that any logical expression evaluates to true or false, which AFAICT is really just a statement of the law of excluded middle for the relational model. EmmetCaulfield

what is a relational calculus or algebra?

There are generally held to be two relational calculi (the domain relational calculus and the tuple relational calculus) and one relational algebra. All of them are (pseudo-)formal mathematical languages which function as query languages in the relational model. I say (pseudo-)formal because they are often presented without mathematical rigour. EmmetCaulfield

I haven't even finished the first paragraph and I need to read four other articles. Why is the fact that a relational calculus and algebra are equivalent in expressive power even relevant? At least I have some idea of what that sentence means; most people wouldn't.

The equivalence of the relational languages means that any query that can be expressed in one can be expressed in the other two. A practical query language (I would say "like SQL" here, except I'd be flamed) is (or, perhaps, "should be") a superset of the relational algebra. There is no way of establishing that all queries are answerable (i.e. that we cannot write a query which the DBMS cannot answer) or that all answerable queries are expressible (that we have some way of stating every answerable query) other than by proving the completeness and soundness of these formal systems. It turns out that domain relational calculus is a model of first-order predicate calculus, so we know that it is complete and sound (we can copy & paste the proof); by showing the other languages to be equivalent, we have the same guarantees for them. Tuple relational calculus is a largely trivial relabelling of domain relational calculus. Relational algebra is a procedural, rather than declarative, query language, and it is more like SQL in this respect; it tends to be quite ill-defined, treated very informally, and have a lot of meta-language mixed into it when treated in textbooks. EmmetCaulfield
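The procedural/declarative distinction described above can be illustrated with a minimal Python sketch. The relation `EMPLOYEES` and the helpers `select` and `project` are hypothetical names invented for this illustration; the sketch shows one query written both ways and is in no sense a proof of the general equivalence.

```python
# Hypothetical toy relation: a set of (name, dept) pairs.
EMPLOYEES = {("Ann", "Sales"), ("Bob", "IT"), ("Cy", "Sales")}


def select(relation, predicate):
    """Algebra-style restriction: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, indexes):
    """Algebra-style projection: keep only the listed positions of each tuple."""
    return {tuple(t[i] for i in indexes) for t in relation}


# Algebra style: a sequence of operations applied one after another.
algebra_result = project(select(EMPLOYEES, lambda t: t[1] == "Sales"), [0])

# Calculus style: state the condition the answer must satisfy.
calculus_result = {(name,) for (name, dept) in EMPLOYEES if dept == "Sales"}
```

Both expressions denote the same set of one-element tuples, which is the sense in which a query in one language can be restated in the other.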

Ok I lied. I'm exhausted by the first paragraph. I need some feedback to show that there are actually people willing to work with me on this before I continue. Please, please work with me, I'll be happy to rewrite the article if you will help me understand it.

Ideogram 21:30, 26 May 2006 (UTC)

First, I understand that the current article has been reworked on-and-off by at least one internationally recognised expert in the field (If I'm misattributing, I apologise, but the article strongly reflects the gospel according to the Date/Darwen cabal, and I've seen Andrew Warden, which I understand to be a pseudonym of Hugh Darwen, on more than one edit), so if you were planning a major rewrite, you may have something of a credibility gap to span; I haven't rewritten it myself for exactly this reason. I would certainly write a different article, but I recognise that, as it stands, the article is largely accurate, even if it sometimes reflects views that I don't share.

One of the problems that you might have is that there are different versions of the relational model; Date's assertion to the contrary is really just saying "There is only one version of the relational model: mine!". A number of disciples have bought into his view as the one true Relational Model. IMHO, however, the different versions are virtually identical, and it makes very little difference which one you study (actually, I suspect that they are formally isomorphic in the categorial sense), but the adherents of these minor differences are sometimes like zealots who believe the same thing about the same scripture, but would burn each other at the stake over the colour of the carpet in the church. -- EmmetCaulfield 17:39, 28 May 2006 (UTC)

I am not challenging the accuracy of the article. I am questioning the intended audience.

I understand that, but you seem to be blissfully unaware that there are a few schools of thought in this area, the balance is pretty delicate, and achieving consensus probably needs everyone to bite their tongue a little. You seemed very gung-ho, and I was subtly trying to suggest to you that if you were to traipse all over the article with your size 9 boots on, you might not tread lightly enough to avoid igniting a religious war. -- EmmetCaulfield 21:17, 28 May 2006 (UTC)

Please note below I give up on the idea of editing this article myself. Ideogram 01:30, 29 May 2006 (UTC)

There is a place for explanations of quantum physics that skip the rigor required for true understanding and are accessible to laymen. I find that article comprehensible on its own without even needing to click on the "non-technical" introduction.

The old joke about quantum mechanics, variously attributed to Feynman, Pauli and many others, is that if you think you understand it, you don't. :-)

You must ask yourself who the intended audience of this article is. This article is clearly written for mathematicians.

As a moderately mathematically literate non-mathematician, I would've thought that this article should be understandable to anyone with a year of college mathematics.

You must be joking. One year of college mathematics at MIT is calculus and vector calculus. Advanced students may start with differential equations and linear algebra. Ideogram 01:30, 29 May 2006 (UTC)

No self-respecting mathematician would have the slightest interest in this "old hat": the relational model reached mathematical maturity 20 years ago and is no longer interesting. Even then, it was applied maths: I doubt if it was ever interesting to "real" mathematicians. -- EmmetCaulfield 21:17, 28 May 2006 (UTC)

That said, I would be amenable to writing a new article, with links between the two. Ideogram 18:12, 28 May 2006 (UTC)

I doubt that anyone would object to a less/non technical introduction, but Relational database, as the term commonly adopted and abused by the marketroids and thus the place where casual passers-by are most likely to land, is probably the place to do that rather than here. My own view is that relational model as a term, is commonly stretched, overloaded, and misused, and it would be a shame to contribute to the problem here. -- EmmetCaulfield 21:17, 28 May 2006 (UTC)

This article is already listed under category:databases and there is a proposal to merge it with Relational database where it will confuse all the casual passers-by you mention. This article properly belongs in category:mathematics, with a link to it from a non technical introduction in Relational database. Ideogram 01:30, 29 May 2006 (UTC)

Okay I calmed down and made another go at the article and found it more comprehensible if I just ignore the first paragraph.

Let me describe my understanding and I hope you can check it for correctness.

A type in a typical relational database might be an int, a char, a boolean, and so on. A type name would be the string "int", "char", "boolean", etc. An attribute is part of declaring a table (note that I think of a table as a data structure inside the program, not the visual representation) and specifies a column name and the type of the value that goes under it. A tuple is a row, except rows are ordered and tuples are not. An attribute name might be "name" or "age". An attribute value is a specific entry in a specific column and row, such as "John Doe" or "35".

A type can be so much more than just an int or a char. It can also be more complicated things like "person" or "mp3".

A relation is a specification for a table along with the data in it. A heading is defined by the declaration of the table. A body is the data that goes into the table. I'm not sure how a tuple can have a heading, unless every row can also be considered a relation by itself.

A triple is the set of (type name, attribute name, attribute value). A tuple is a set of triples, a relation is a set of tuples with the same header. A tuple can be considered a relation by itself, since it is a set of exactly 1 tuple. A Header is just the set of (type name, attribute name) doubles on their own without the values. Tuple header and Relation header are the same thing. The headers of tuples in a relation must match the relation header.
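The definitions in the reply above can be sketched in Python. This is only an illustration under stated assumptions: `Triple`, `make_tuple`, `heading` and `make_relation` are hypothetical names, type names are plain strings, and attribute values are assumed to be hashable.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (type name, attribute name, attribute value) triple."""
    type_name: str
    attr_name: str
    value: object


def make_tuple(*triples):
    """A tuple is an unordered set of triples."""
    return frozenset(triples)


def heading(tup):
    """The heading is the set of (type name, attribute name) pairs."""
    return frozenset((t.type_name, t.attr_name) for t in tup)


def make_relation(head, tuples):
    """A relation is a heading plus a set of tuples that all match it."""
    body = frozenset(tuples)
    for tup in body:
        if heading(tup) != head:
            raise ValueError("tuple heading does not match relation heading")
    return (head, body)


john = make_tuple(Triple("char", "name", "John Doe"), Triple("int", "age", 35))
john_again = make_tuple(Triple("int", "age", 35), Triple("char", "name", "John Doe"))
jane = make_tuple(Triple("char", "name", "Jane Roe"), Triple("int", "age", 41))

people = make_relation(heading(john), [john, john_again, jane])
```

Because both the tuples and the body are sets, the order of attributes inside a tuple does not matter, and the duplicate `john_again` collapses into `john`, leaving two tuples in the body.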

I'm not sure what a relvar is, since there doesn't seem to be a definition of what the type of a relation is. I guess the type of a relation contains the same information as the heading of a relation, in which case a relvar would be the combination of the name of a table and the data structures that define the table and contain its data.

This analogy may help. constant:variable::relation:relvar. A relation is just a single set of tuples, frozen in time. A relvar is a named variable, whose associated relation is "variable" or changes over time.
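The constant/variable analogy can be made concrete with a short Python sketch. The class `Relvar` and its `assign` method are hypothetical names chosen for this illustration, and a relation is simplified here to a bare frozen set of rows.

```python
class Relvar:
    """A named variable whose current value is a relation (an immutable set)."""

    def __init__(self, name, relation=frozenset()):
        self.name = name
        self.value = frozenset(relation)

    def assign(self, relation):
        """Replace the relvar's value with a different relation."""
        self.value = frozenset(relation)


married = Relvar("Married", {("Joe", "Mary")})

# A relation value is frozen in time, like a constant.
snapshot = married.value

# "Updating" the relvar means assigning it a new relation value.
married.assign(married.value | {("Bob", "Sue")})
```

The earlier value held in `snapshot` is untouched by the assignment; only the binding of the name `Married` to a relation has changed.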

Ok how is that? Ideogram 06:25, 29 May 2006 (UTC)

In defense of the first paragraph,

a 'relation' R can be thought of as a simple English sentence, such as

'b R c' (read 'noun verb word').

When you take a relation in this sense, a 'table' is an embodiment of a sentence. For example

'Joe married Mary'.

In the real world, such a simple sentence would need a lot more detail, such as the date of the marriage, where the marriage occurred, whether the marriage ended. And, in the real world, whether the parties to the marriage are valid; for example whether Joe is alive, or whether Mary is a citizen or not. So a simple two-column table of b's and c's would have to be expanded to b c d e f g, etc.

That is the reason for NULL values. The assumption is that every row in a table represents a true relation.

Thus if Joe were not a person, a simple two column table, say '[R]' would have a row

[NULL, 'Mary'].

That table, R has the ability to represent a lot of instances (rows or 'tuples') of marriages.

That same viewpoint about relations means that you can have sentences that need not be static, but rather can be generated by questions (called 'queries'). And you can build tables which do not have to be stored in the database, but can be built up using SQL scripts, such as the encyclopedia you are reading right now. And since this is a wiki, a SQL script might also be writing what you have just typed in, to a database.

This also means that a database query might return a lot of rows, and not just a single row

['Joe','Mary'], or no rows at all, as answer to a query.

But it shows that database scripts can be very subtle, which you can see by the workings of this encyclopedia, a testimony to the subtlety of the languages it was written in (including English, not just in SQL).

--Ancheta Wis 08:56, 5 August 2007 (UTC)

why build databases on the relational model?

Why is the relational model a good foundation for building databases? Ideogram 13:17, 31 May 2006 (UTC)

I am by no means an expert on any of this, but it is my understanding that the relational model is rigorously defined on the basis of logic. It's a good model to build a database on, because adherence to the relational model ensures the integrity of data. Put another way, as long as your database adheres to the principles of logic, it will always contain, and answer with, true statements. Stray from the relational model and you risk situations (easily unforeseen) in which your database structure becomes illogical, and makes false statements. In other words, a form of data corruption which would easily lead to a Pandora's box of bugs that you must endlessly patch in order to repair.

A good example of this is the problems involved with the null. Does the null mean "Unknown" or "Not Applicable" or "Doesn't Exist". Would you feel confident making an assumption on any one of them?

Another example is that SQL systems typically don't allow user definable types, and further, require the explicit naming of columns in joins. This forces a lot of data into fitting primitives like INT and CHAR. Thus, the joining of two unrelated logical types is possible, such as joining a SALARY column with an ID column. With tuples defined as a set of triples, along with user definable types, a lot of the nonsense involved with this kind of thing just goes away. Joins happen on matched types, not according to the whims of the query. —The preceding unsigned comment was added by 71.211.148.108 (talk • contribs) .

Using the relational model for building a database is a good foundation because... the relational model follows a fully-closed set of mathematical operators which guarantee that all data in the database is accessible. The relational model follows mathematically proven rules guaranteeing data integrity while avoiding the problems of representing complex relationships experienced by the hierarchical model. // Brick Thrower 19:46, 28 September 2006 (UTC)

"truly" relational

First things first. I'm definitely in the "truly relational" camp. If you ask me, DBMSs that violate some of Codd's 12 rules (like EVERY SQL DBMS) shouldn't be considered "relational". Yes, I understand that my feelings on the matter shouldn't be the only definition in Wikipedia, because there is a large group of people who consider SQL DBMSs relational. So, because I'm a bigot, I'm going to show everyone the error of SQL's ways in Wikipedia, and enforce this when I have time. SQL violates the row ordering rule; it basically comes down to the fact that DISTINCT (in SELECT DISTINCT) is not the default option. This sample doesn't show how this relates to DISTINCT very clearly, but it does clearly show that row order matters in this DBMS, because of SQL.

//"random" database code, translate to your favorite (R)DBMS
create table Foo
{
   ID : Integer,
   Name : String,
   Data : String,
   key { ID } // think unique index, or primary key if you have a hard time with non-primary keys
};

insert table { row { 1 ID, "Codd" Name, "Database Guy" Data }, row { 2 ID, "Codd" Name, "Relational Guy" Data }, row { 3 ID, "Ellison" Name, "Pseudo-relational guy" Data } } into Foo;

-- Now (MS)SQL viewing on that table
select top 1
   Data
   from Foo
   where Name = 'Codd';

-- What gets returned? "Database Guy" or "Relational Guy"?
-- Without an ORDER BY clause, either row may come back.

Some might complain that their DBMS is better than this, because it doesn't support the "top" clause. Sadly, this probably isn't the case: SQL standard: ROW_NUMBER(); PostgreSQL, MySQL: LIMIT clause; MSSQL: TOP clause [3] (note his "top-n" query is how it "should" work) McKay 08:18, 1 August 2006 (UTC)
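For anyone who wants to try the example above without an MS SQL installation, here is a minimal Python sketch using the standard `sqlite3` module. It is a stand-in under stated assumptions: SQLite replaces the original DBMS, `LIMIT 1` replaces `TOP 1`, and the column types are guesses. Which row the unordered query returns is deliberately not asserted, since SQL leaves it unspecified.

```python
import sqlite3

# In-memory SQLite database standing in for the DBMS in the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (ID INTEGER PRIMARY KEY, Name TEXT, Data TEXT)")
conn.executemany(
    "INSERT INTO Foo (ID, Name, Data) VALUES (?, ?, ?)",
    [
        (1, "Codd", "Database Guy"),
        (2, "Codd", "Relational Guy"),
        (3, "Ellison", "Pseudo-relational guy"),
    ],
)

# No ORDER BY: the engine may return either matching row.
unordered = conn.execute(
    "SELECT Data FROM Foo WHERE Name = 'Codd' LIMIT 1"
).fetchone()[0]

# With ORDER BY on the key, the choice of row is pinned down by the query.
ordered = conn.execute(
    "SELECT Data FROM Foo WHERE Name = 'Codd' ORDER BY ID LIMIT 1"
).fetchone()[0]

# Without a row limit, the result is a set-like collection of both rows.
all_matches = {
    row[0] for row in conn.execute("SELECT Data FROM Foo WHERE Name = 'Codd'")
}
```

The only thing that can be said about `unordered` in general is that it is one of the two matching values; `ordered` is determined because the query itself states the ordering.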

Organization of Articles Related to Relational Databases

JA: There was a lot of thought and discussion early this year on the coordination of the articles related to relational databases, especially these three, in order of decreasing abstraction: relational algebra, relational model, relational database.

JA: The relational model article was intended to form a middle ground between the abstractions of relational algebra and the concrete specifics of relational databases, but still keep to the general principles of the underlying logical model. The relational database article was intended to deal with specializations of and the departures from the relational model that arise in practical implementations. Somewhere along the line that organization has gotten totally messed over. There is now a lot of stuff in the relational model article that just plain does not belong there, and should be taken up in the relational database article. Jon Awbrey 13:46, 9 August 2006 (UTC)

"Intended"? Intended by who? The way I see it is that the relational model is the framework for creating relational databases, and relational algebra is the query language that can be used to grab data from it. I'm not saying that this article is correct, but I think we should define what should be where, kinda before we continue. I've rewritten "Relational database", Is there anyone that thinks that it should be different? How should we proceed? McKay 17:27, 28 September 2006 (UTC)

I agree with some of the above comments: I think this article is seriously confused about the level of abstraction at which it is written. In this article on the relational model, the writing jumps in and out of a higher level of abstraction (relational databases, DBMSes) and it also jumps in and out of lower level mathematical details. I'll come back with some ideas on how to resolve this problem. Kendrick Hang (talk) 05:55, 23 May 2008 (UTC)

kind of model

What kind of model is the relational model? Relational model is an alternative to Network model, and network model calls itself a database model. That seemed a little wrong to me (because I've got a database model, aka a database schema, for the database I'm working on, for example). But Database model lists relational and network models as its kinds. So, I guess it's right with that article in mind. Thoughts? Some brick thrower has changed the reference to a link on the model page, which shouldn't be used because it's a dab page, and Model (abstract), specifically referencing "structure of models", which doesn't seem right to me either. It isn't a conceptual model, it's a model for how to make conceptual models. I think that Database model fits neatly, partially because network model also refers to database model, but it's a different meaning than database schema, but I'm open to new suggestions. McKay 17:11, 28 September 2006 (UTC)

The common meaning of the term "database model" usually indicates a specific model of a specific database, but I do see your argument. I will first try to disambiguate the meaning on that page, similar to the distinction made in the article object model (numbers 1. and 2.), except give them headers and thus an anchor to link back from relational model here. // Brick Thrower 19:15, 28 September 2006 (UTC)

This is how I understand the "kinds" of models: object-based logical models (entity-relationship model, object-oriented model, semantic data model, functional data model), record-based logical models (relational model, network model, hierarchical model), and physical data models (unifying model, frame-memory model). If we're talking about the relational model, I've seen it referred to as a data model -- a way to model data. Note that this (data model, relational model) does not necessarily have to involve a database, but it's more than likely that the relational data model will then be used as a basis for creating a relational database system. I'm being picky here because "data model" fits the level of abstraction for this article, but "database model" refers to a higher level of abstraction (the one talked about in the relational database article). Kendrick Hang (talk) 06:04, 23 May 2008 (UTC)

The relational model isn't really "record"-based: its tuples/rows do not have addresses, while the term "record" suggests that they do. Of course in practice we often see relational schemas in which tables have "ID columns" that act as symbolic row addresses, but the model doesn't assume or require this. Rp (talk) 17:47, 23 May 2008 (UTC)

Can we get a good definition "Relational model"?

I'm sorry, but I couldn't extract the GENERAL definition of "Relational model" from the first paragraph or two of this Wikipedia entry. Can someone give a GENERAL meaning of "Relational model" in the FIRST sentence or paragraph of the Wikipedia entry?

I've heard that "Relational model" means, in general terms, "keeping data separate from function." But yet, when I came to this entry, I have no clue whether that is an accurate statement or not.

A general description of what "Relational model" means in the first sentence or paragraph would be appreciated. —The preceding unsigned comment was added by 24.8.163.210 (talk) 16:39, 19 December 2006 (UTC).

Hmm. In a RDBMS, any scalar is accessible by the tuple (table name, column name, primary key). Relational means that a scalar value in a table may be used as (all or part of) a primary key to refer to some other value. Paul Murray 02:39, 24 January 2007 (UTC)

the columns and the rows of a table are ordered

but note that in the database language SQL the columns and the rows of a table are ordered

Eh? AFAIK, this is simply not the case. —The preceding unsigned comment was added by Paul Murray (talk • contribs) 02:32, 24 January 2007 (UTC).

And yet you could easily relieve your ignorance of this fact simply by looking up a few sections in the discussion page here at the section titled "Truly Relational", which demonstrates that row ordering *does* matter in SQL. Strange that you didn't.

I removed a paragraph about Codd's great insight that names rather than ordering could be used to distinguish n-tuple "coordinates" because it was completely incorrect. Whoever wrote it does not understand the concept of a mathematical model and, it appears, has not read "A relational model of data for large shared data banks" -- E.F.Codd. InformationSpace 07:15, 7 August 2007 (UTC)

What "relational" means (opening paragraph)

The opening paragraph of this article says:

""Relation" is a mathematical term for "table", and thus "relational" roughly means "based on tables". It does not refer to the links or "keys" between tables, contrary to popular belief."

It links to Relation (mathematics), which in *its* first paragraph explains that relation is a:

"generalization of 2-place relations, such as the relation of equality, denoted by the sign "=" in a statement like "5 + 7 = 12".

If the first one (relation = table, not relationships) is correct, the link to the second one is highly confusing. The mathematical relation is not a table, but actually a relationship between things. They use a table as an example of how to show this for small finite sets, but that's not what a relation *is*. For example, the introductory paragraph uses "=" as an example of a relation, and "=" is not itself a table, nor can it be represented in a finite table.

Perhaps what this article meant to say was that: in CS, the "relation" of "relational model" refers to relationships explicitly represented by a single table, not by the implicit relationships between records of different tables.

Or maybe I'm completely wrong. It wouldn't surprise me, since I'm completely confused now! —Preceding unsigned comment added by 64.81.170.62 (talk) 17:56, 27 September 2007 (UTC)

Although a mathematical relation is conceptually a "relationship between things", it is formally defined as a subset of the cartesian product of a number of sets, the elements of this subset being n-tuples. So for example if the relation is equality of natural numbers, the relation is the set of ordered pairs (0,0), (1,1), (2,2) etc. You can imagine these listed in a tabular format, with column headings of, say, "Number 1", "Number 2". In the "relation" = "database table" equivalence, the sets in the relation are "All possible values of column 1", "All possible values of column 2", etc, and the "relationship between things" is that "v1,v2,v3,... are related" if and only if (v1,v2,v3,...) occurs in the table; or in other words if and only if (v1,v2,v3,...) is a member of the relation.

Another point to note is that a relation doesn't have to be expressed by a "rule" (such as a = b. or x + y < z) - any subset of the cartesian product is a relation: including the empty set (nothing is related) and the full product (all combinations are related). Incidentally, a function is a particular sort of relation where again the underlying concept involves a "rule" of how x maps to f(x), but the formal definition doesn't.

Hope this makes some sense... AndrewWTaylor 12:35, 28 September 2007 (UTC)
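The formal definition given above can be sketched in Python over a small finite domain. The names `NUMBERS`, `related` and `is_function` are hypothetical and chosen for this illustration; equality on all natural numbers is of course infinite, so the sketch truncates the domain to four values.

```python
from itertools import product

# A small finite stand-in for "all possible values" of each column.
NUMBERS = range(4)

# The full Cartesian product: every combination is related.
full_product = set(product(NUMBERS, NUMBERS))

# A relation given by a rule: equality, i.e. (0,0), (1,1), (2,2), (3,3).
equality = {(a, b) for (a, b) in full_product if a == b}

# A relation given by no rule at all: any subset of the product qualifies.
arbitrary = {(0, 3), (2, 1)}

# The empty relation: nothing is related.
empty = set()


def related(relation, *values):
    """Values are related if and only if their tuple occurs in the relation."""
    return tuple(values) in relation


def is_function(relation, domain):
    """A binary relation is a function on the domain if each element of the
    domain appears exactly once as a first component."""
    firsts = [a for (a, _) in relation]
    return sorted(firsts) == sorted(domain)
```

Listing the members of `equality` one per line gives exactly the two-column table described in the explanation, with "Number 1" and "Number 2" as the column headings.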

It does, mostly, after I read all of this explanation. But it still sounds weird the way the opening paragraph says it. It would have made more sense to me if the warning read simply "Contrary to popular belief, '_relation_' refers to the relationship defined by the records of a table, and not the links or "keys" between tables." Trying to define "Relation (mathematics)" in the opening paragraph this way confused me ... or maybe the problem is that the opening section of the Relation article dances around actually saying what a relation is.

Anyway, thanks! —Preceding unsigned comment added by 75.172.90.202 (talk) 03:38, 5 November 2007 (UTC)

Use of "arbitrary" for selecting primary keys

It may be somewhat arbitrary but in fact there are very clear rules for selecting primary keys. The term "arbitrary" seems wrong to me in this context.

DruidZ 15:27, 24 October 2007 (UTC)

What is meant is that these rules aren't cast in stone and cannot be expressed in terms of the mathematical model. Rp (talk) 17:49, 23 May 2008 (UTC)

Formal definition of Relation needed

While a general characterization of relations is presented (an 'extension of first-order predicate logic'), this should be given in formal detail. I had to bolt to another site to get this information. (http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter3/node1.html)

BadZen 21:57, 2 February 2008 (UTC)

"Criticism"

I've removed (again) the "Criticism" section added by User:194.66.238.27:

The relational database model is not turing complete.
There is a direct impediance mismatch between physical and logical views of data.
The relational database model suffer poor abstractional capabilities due its use of untyped strings and numerical values.

on the grounds that:

it is not cited
the assertions are not justified in any way, or even explained, and may be original research

(The poor spelling, grammar and formatting could be fixed but not until the text is more suitable for inclusion.) AndrewWTaylor (talk) 12:46, 2 January 2008 (UTC)

The whole article is filled with uncited but obvious claims. These criticisms are obvious to anybody with even limited experience of Relational systems. Further criticism can be found here at http://c2.com/cgi-bin/wiki?SqlFlaws ~~ —Preceding unsigned comment added by 194.66.238.27 (talk) 13:47, 7 January 2008 (UTC) http://c2.com/cgi/wiki?RelationalHasLimitedModelingCapability —Preceding unsigned comment added by 194.66.238.27 (talk) 16:48, 7 January 2008 (UTC)

"Competition"

This section is extremely weak. It falls into that category that is all-too-common in technical articles, of "made-up history". Just a few criticisms:

the higher the level of abstraction in an existing system, the easier it is to migrate. The difficulty in migrating many applications is because they have too low a level of abstraction.

the notion that object databases are not true DBMSs but "construction kits" seems to have no basis in reality

the description of the binary relational model as "recent" is ridiculous; it can be traced back at least to Abrial and Bracchi (independently) around 1971, and is closely related to work on the functional data model by Shipman and others in the early 1980s

The notion that the network model was only formulated after the relational model is bizarre, unless you have some definition of the term "model" that for some reason excludes the DBTG report of 1969.

Mhkay (talk) 21:37, 5 April 2008 (UTC)

"Cartesian product becomes commutative"

This is wrong: cartesian products are mathematical tuples, that is, ordered sets. What becomes commutative is a "db version" of the cartesian product, which is a "db version" of tuples. "Normal" cartesian product AxB = { (a,b) | a in A and b in B }. "DB" cartesian product AxB = { {a,b} | a in A and b in B }. Arthur Gabriel de Santana (talk) 05:52, 21 April 2008 (UTC)

The point is that the "db version" is also called "cartesian product". Rp (talk) 17:50, 23 May 2008 (UTC)
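The difference between the two products discussed above can be sketched in Python. One assumption goes beyond the notation in the comment: each value is tagged with an attribute name (`"num"`, `"letter"`), so that unordered elements remain distinguishable. The function `db_product` and the attribute names are hypothetical, and the operands are assumed to have disjoint attribute names.

```python
from itertools import product

A = {1, 2}
B = {"x", "y"}

# "Normal" Cartesian product: ordered pairs, so the operand order matters.
ordered_ab = set(product(A, B))
ordered_ba = set(product(B, A))


def db_product(rel_a, rel_b):
    """"DB" product: each operand is a set of tuples, and each tuple is a
    frozenset of (attribute name, value) pairs with no positional order."""
    return {ta | tb for ta in rel_a for tb in rel_b}


rel_a = {frozenset({("num", n)}) for n in A}
rel_b = {frozenset({("letter", s)}) for s in B}

db_ab = db_product(rel_a, rel_b)
db_ba = db_product(rel_b, rel_a)
```

Here `ordered_ab` and `ordered_ba` differ because (1, "x") is not ("x", 1), while `db_ab` and `db_ba` are the same set because set union does not depend on operand order.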