Talk:Data

This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.

Old discussion(s)

It looks difficult for me to understand "A datum is a statement accepted at face value".

What do you think about a definition and explanation like this:

Data is ~evidence (or some other term) at the input of an information system. Data is the subject of data processing by an information system. Data may or may not contain useful information.

I think it is good when a definition uses other Wikipedia terms, not just plain English. Kenny sh 08:30, 10 May 2004 (UTC)

Also I think the current definition is wrong. A datum is a datum regardless of whether or not it is accepted. --(talk to)BozMo 10:36, 23 May 2004 (UTC)
Wouldn't it be better to say "Data is an indefinite number of ratios.", and "A datum is a single ratio."? -Inyuki 19:32, 11 Nov 2004 (UTC)
Hello. There are a couple of serious problems with the above definition. The main problem is that it says data has to do with "information systems", "data processing", and "information". Either it's assumed these terms have to do with computers, in which case this definition is much too narrow, or not, in which case it's needlessly vague. A secondary problem is that the definition can't be understood without looking up some other terms. The existing definition, which uses only ordinary English words, is terse, comprehensible, and yet quite general. The proposed new definition does not have these merits. Regards, Wile E. Heresiarch 14:06, 10 May 2004 (UTC)

A separate page for datum is needed. In geology/cartography/geography and surveying a datum is a reference surface. For instance, sea-level is often used as a datum below which depths (or above which heights) are measured.


Hello COMPATT, to address your comments about the distinction between data and information -- I agree that programs are a form of data, but I think it's important to keep in mind that the word "data" has a history of usage that goes back much farther than computer science. The distinction between data and information, which is made in the article, is that information is derived from an interpretation of data. Some data don't have any obvious interpretation, and so we might noodle over ancient inscriptions for a long time, but some other data have such an immediate interpretation, especially in a given cultural context, that the interpretation is held to be the same as the data -- for example if I look at a photograph, I might immediately see "a dog" instead of "a pattern of silver particles which suggests a dog". I think the interpretation aspect, and its dependence on context, might be emphasized in the article. Well, I've rambled on long enough! Have a great day, Wile E. Heresiarch 14:33, 18 Mar 2004 (UTC)


Hello, as a comment on the edit that I just made. I put a new, short intro paragraph at the beginning, to hopefully get straight to the point. (The article was noodling around in etymology a little too much before getting to the punch line. Hopefully that's corrected now.) As the term "data" is rather general, I've attempted to give a general definition, and then immediately describe one of the most-used types of data (measurements & observations). I'm hoping that there is a right level of generality now. Happy editing, Wile E. Heresiarch 15:44, 19 Mar 2004 (UTC)


Usage in English

There's another meaning of the singular datum. In the US Navy, the term is applied to the last known position of a submarine whose precise location is no longer known. I don't think I ever heard it used in the plural; there just aren't that many submarines and there's a great deal of seawater under which to spread them. Dick Kimball (talk) 18:20, 2 April 2008 (UTC)


Hi,

- I inserted the most general and shortest functional definition of data (see function definition)

- about

Meaning of data and information

I changed it. In my opinion, there is too much information noise (uncertainty of the author?) in this paragraph:

- As it is, the phone number is not actionable - you know it is a phone number, but it is of no use. This information becomes knowledge when you can act on this information, either to solve a problem (for example, to call Helen, whose phone number it is), or to gain insight into an issue (e.g. by noting that other phone numbers have the same exchange). People or computers can find patterns in and between data to perceive relationships between information, creating or enhancing knowledge. Since knowledge is prerequisite to wisdom, we always want more data and information. But, as modern societies verge on information overload, we especially need better ways to find patterns.

This is not about data; it is an unnecessary digression, so I removed it.

See also: http://en.wikipedia.org/wiki/Talk:Knowledge about DIKW.

I cannot find (on the Web) any articles that confirm the suggested interpretation of the DIKW model.

--Adam M. Gadomski 18:01, 4 November 2005 (UTC)

Adam, please read again Wikipedia:No original research. You are linking extensively to your own research. Wikipedia is not the place to publish your original research. Also see Wikipedia:Guide_to_writing_better_articles. You seem to write in a heavy duty academic prose style, which isn't really used here. Some of what you write might have been OK but I can't tell it apart. Sbwoodside 22:30, 4 November 2005 (UTC)


Simon, your reply is a meta-response. Is it a style of "Space-invaders"? You copy original research without proper references - is that correct?

You (and only you) inserted DIKW in Wikipedia in a few articles.

Why do you do it?

- I see that your self-promotion on the Web is perfect, my congratulations, but I would like to see your scientific publications too - maybe this information could clear up my doubts about why "you are linking extensively" to and "update" this subject.

--Adam M. Gadomski 16:41, 24 November 2005 (UTC)

Information#Information is not data does not seem to be available anymore. --Inkiwna 15:42, 2 March 2007 (UTC)

Data WAS the plural of datum

The first line of this article needs to change. Data WAS the plural of datum, but no one uses it this way. In fact, in surveying, datum and data are two completely different words. A datum is a coordinate system for locating a point on the earth, while surveyors use data to mean what everyone else does. The plural of survey datum is datums, since data has a completely different meaning.

English does not follow the rules of a dead language that it happened to borrow a word from. See the back-formation article for numerous examples. You'll note that no one ever complains that "asset" is incorrect usage.

Well, Datum has its own article, but I guess you're right that for this article probably the first line could be rewritten because in this context I think most people just talk about data and rarely use "datum" (not enough to justify the first sentence position). The first sentence / intro should summarize the article :-) Sbwoodside 19:22, 22 September 2006 (UTC)
By the way, lots of people have been talking about changing the intro, why not be bold? WP:BOLD Sbwoodside 19:23, 22 September 2006 (UTC)
The weakest statement that I am willing to make is this: at least one person still studiously treats datum as the singular, and data as plural. I execrate the mass noun treatment of data. But then, perhaps I'm one man crying in the wilderness ... Hair Commodore 20:02, 13 May 2007 (UTC)
I can't think of the last time I saw "datum" used, even by people who routinely treat "data" as plural ("data are"). I'd say that in popular usage, people tend to treat data as a mass noun as they would information (and use them interchangeably). And, for good or ill, this popular usage seems to be crowding out the traditional academic/professional treatment. At work yesterday I reviewed a draft policy document regarding the handling of sensitive data; one paragraph used "data are" and the next used "data is." I pointed this out and the second paragraph was changed to "data are" as the correct construction. Go ask a sample of demographers, social scientists, physicists, doctors, market researchers, and other people who work with data professionally and a significant majority will say that the "data are" construction is correct (and the others are wrong ;-) XKL 16:08, 26 May 2007 (UTC)
Educated folk have no problem using "datum" in the singular and "data" in the plural in English sentences. This whole discussion is an attempt to justify Newspeak, and is little more than a sorry excuse for mental laziness. The English Wikipedia wasn't to be written in Ebonics; that Wiki is yet to be created. —QuicksilverT @ 22:58, 5 December 2007 (UTC)
I have rewritten the intro for the article in an attempt to capture the meaning and usage of the word without introducing the controversy in the first sentence. Quicksilver, you are arguing ad hominem with your "Educated folk" remark. This Wikipedia article is (or should be) attempting to reflect reality, and the diversity of opinion within it, not your own view. Personally, as an "educated folk" myself, I am strongly of the opinion that English is defined as far as possible by the people who speak it, and that examination of usage indicates a strong preference for regarding data as a mass noun (e.g. the formation "database"). However I am content to have that debate elsewhere. Joffan (talk) 00:40, 4 January 2008 (UTC)

The statement "but these are English sentences, so Latin grammar rules do not apply" seems to be an unencyclopaedic opinion tagged on to an otherwise neutral sentence stating the status of the word as plural in Latin. The rules applied in English sentences are clearly rules of English grammar, not Latin, but English happens to have the same rule as Latin in this instance, i.e., that a plural noun requires a verb in the plural. The debate is not whether Latin rules should apply to English, but whether the word data is plural or singular in English, based on etymology and usage. I propose to delete the clause "but these are English sentences..." if there is no further discussion. GKantaris (talk) 15:45, 2 January 2008 (UTC) - OK, as there is no discussion, I've deleted the clause. GKantaris (talk) 16:21, 14 January 2008 (UTC)

Inaccurate pronunciations

Pronounced "Day-Ta" (US) and "Dar-Tar" (AU & UK*)

Living in the UK, I've only ever heard it pronounced as the former, "Day-ta"; only from Americans have I heard the latter, "Dar-Tar".

Living in the Southern and Mid-Atlantic U.S., I've only heard it pronounced "Day-ta". JD Lambert(T|C) 01:54, 15 July 2007 (UTC)

I've lived in many states in the US, from the west coast to the east coast to the midwest. I've never heard anyone say dar-tar. I've heard day-ta and daa-ta (like Dagwood). Never dar-tar. Entbark 03:48, 23 July 2007 (UTC)

Entbark, you may not have been to Massachusetts, or may not have heard someone from the Boston area, as they seem to be fond of injecting gratuitous "r"s into their speech. For example, listen to Norm Abram on The New Yankee Workshop. —QuicksilverT @ 23:41, 5 December 2007 (UTC)

Data synonym for information

Someone changed the page to say data is not a synonym for information. They should look it up in the dictionary: http://www.dict.org/bin/Dict?Form=Dict1&Query=data&Strategy=*&Database=* Daniel.Cardenas 15:34, 25 April 2007 (UTC)

How can you post a reference which denies your own statement??? From your link:
Data on its own has no meaning, only when interpreted by some kind of data processing system does it take on meaning and become information.
...
1234567.89 is data.
"Your bank balance has jumped 8087% to $1234567.89" is information.

Bob Novak 06:42, 26 April 2007 (UTC)
I would also like to point you to some introductory material on information theory, like the MIT OpenCourseWare course Information and Entropy, where concepts like information, data and code are explained. Bob Novak 07:57, 26 April 2007 (UTC)

That is classroom material applicable to computer science people and the like, but not 100% applicable to the rest of the world. Thanks for the link. Daniel.Cardenas 15:04, 27 April 2007 (UTC)
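
For what it's worth, the distinction in the dictionary entry quoted above can be sketched in a few lines of Python. This is only an illustration, not anything taken from the linked source: the function name `interpret` and the previous balance of 15079.61 are assumptions made up here so that the numbers reproduce the quoted sentence.

```python
def interpret(balance: float, previous: float) -> str:
    """Combine a bare number with context to produce a meaningful statement."""
    # Percentage change relative to the (hypothetical) previous balance.
    change_pct = (balance - previous) / previous * 100
    return f"Your bank balance has jumped {change_pct:.0f}% to ${balance:.2f}"


# Data: a bare number with no meaning attached.
datum = 1234567.89

# Context that the number alone does not carry (assumed value, for illustration).
previous_balance = 15079.61

# Information: the same number after interpretation in context.
message = interpret(datum, previous_balance)
print(message)
```

The number is the same in both places; what changes is that the processing step attaches context (what the number measures and what it is compared against), which is the point the quoted entry seems to be making.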

Data: verb or noun?

This statement, 'The word data is the plural of Latin datum, neuter past participle of dare, "to give", hence "something given",' is a little confusing. If datum and data are both nouns, they cannot also be past participles since participles are verb forms. That statement makes it sound like the noun datum is a participle of dare. Nouns cannot be participles. The same word can be used as both a noun and a verb (e.g., "I scream" and "I heard a scream"), but a noun is NOT a participle EVER.

Oh, and I found where that phrase was taken from: http://www.johntcullen.com/sharpwriter/content/data_is.htm. Hardly a trustworthy source. He doesn't list any references, much less know the difference between a verb and noun.

Entbark 19:49, 12 July 2007 (UTC)

So, if no one is opposed to me changing it, I will modify the etymology section in a few days. Entbark 03:53, 23 July 2007 (UTC)

The English usage section is still confused. Rather than try and win a debate, this needs to take a NPOV stance and observe there are two viewpoints:
1. That this is a Latin neuter noun and therefore the rules for a Latin plural apply.
2. That this is an uncounted noun and legitimately used in the singular.
Clearly, we need a convention for this article. Common usage is the uncounted or mass noun. This seems to be backed up by the OED [1], which has this note on usage: "Traditionally and in technical use data is treated as a plural, as in Latin it is the plural of datum. In modern non-scientific use, however, it is often treated as a singular, and sentences such as data was collected over a number of years are now acceptable." The etymology seems a little suspect though, as we are told it is actually derived from a verb, yet the arguments used are that it takes the form of a Latin singular neuter noun. Also, we know that datums is a legitimate plural usage of geological datum and people accept this; it is odd that the use of datums is not derided there through etymological argument. Spenny 13:59, 11 September 2007 (UTC)
It is a declined form of the past participle of the Latin verb dare, "to give". The Latin "data" would translate as an adjective, "given", or as a noun, "given things"; it is equivalent. Because it is a participle, it grammatically functions as a noun or an adjective, and so follows the same pluralization rules as nouns and adjectives: singular -um, plural -a. --Nucleusboy (talk) 03:00, 28 November 2007 (UTC)

Data as plural

I prefer "these data" because it makes everyone pause, and reflect on how wrong their notions of grammar are. —Preceding unsigned comment added by 71.193.226.225 (talk) 07:58, 2 April 2008 (UTC)