Talk:Data set
From Wikipedia, the free encyclopedia
What is dataset architecture? -unknown poster above
In the social science world, a dataset is a set of files which includes data, usually in encoded form, and documentation, such as a codebook. Rcrice 18:19, 30 November 2006 (UTC) Robin Rice
Your mom is a dataset
This article is a plagiarism of reference.com. Here is the url to the original article: [1] 216.166.206.16 01:20, 26 April 2007 (UTC)
- The url makes it clear that it just pulls up this article for display. Melcombe (talk) 15:57, 31 March 2008 (UTC)
Datasets are data sets!
I do not agree that the term 'data set' is unrelated to, or inconsistent with, set theory. In statistics, and in every other field I can imagine, each record of a dataset belongs to a distinct statistical unit (or observation), so two rows of a dataset are always distinct. A dataset is a set of tuples, but rows can exchange positions, so it is not an ordered list. Any of the usual set operations (unions, intersections, subset extraction and complements) can be carried out by treating a row of a dataset as a (one- or n-dimensional) element of a set of data. This is also consistent with the definition of a sample as a subset of a population; in fact, in multivariate analysis a single row of a data matrix is a k-variate observation drawn from a joint k-variate density function (k being the number of columns). So in my opinion a dataset is just a particular form of (data) set. I'm waiting for the community to contradict me :-) !
Jabbba (talk) 12:39, 1 May 2008 (UTC)
- It is not obvious what you are arguing for or against. I don't think a connection with mathematical sets is any more worth mentioning in this instance than it would be for a "set of bowls". Melcombe (talk) 13:32, 1 May 2008 (UTC)
- Ok, thank you for the answer. The reason it is not obvious is that I had commented out the lines of text I was referring to (as you can see in the history). Now I understand that this was a mistake, since people could not follow my post. I'm going to restore the content for now, even though I think it should be removed (in line with your opinion, too). Jabbba (talk) 22:15, 3 May 2008 (UTC)
- Actually, I had seen the hidden text. What I meant was that it was not clear whether you were arguing for or against the inclusion of the portion of text. Your first sentence seemed to be arguing for the inclusion, while the last sentence seemed to say "exclude". I do think it should not be in the article. Melcombe (talk) 09:19, 6 May 2008 (UTC)
- Ok text removed :-) Jabbba (talk) 22:01, 9 May 2008 (UTC)
- Is the data set of SAT scores [739, 1336, 1336, 2173] the same as the data set [739, 1336, 2173]? What is its arithmetic mean, 1416 or 1396? In a set you may discard duplicates. Contrary to what the name suggests, the rows of a data set need not be distinct, so a data set is in general not a set in the usual mathematical sense. The point of explaining that a data set is actually a multiset is to make the reader understand that duplicate members must be retained. The fact that identical members of a data set have distinct origins does not imply that the members themselves are distinct. Alice is distinct from Bob, but they may have identical SAT scores. --Lambiam 15:14, 21 May 2008 (UTC)
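For anyone who wants to check the arithmetic in the SAT example, here is a minimal Python sketch. It is only an illustration under stated assumptions: the variable names are made up for this thread, and `collections.Counter` is used as a stand-in for a multiset, which is one possible modelling choice and not something the article prescribes.

```python
from collections import Counter

# The four SAT scores from the example above; 1336 occurs twice.
scores = [739, 1336, 1336, 2173]

# Treated as a mathematical set, the duplicate 1336 is discarded.
as_set = set(scores)

# Treated as a multiset, each value keeps its multiplicity.
# (Counter is an assumed stand-in for a multiset here.)
as_multiset = Counter(scores)

# Mean over the set: three distinct values.
mean_set = sum(as_set) / len(as_set)

# Mean over the multiset: weight each value by how often it occurs.
total = sum(value * count for value, count in as_multiset.items())
mean_multiset = total / sum(as_multiset.values())

print(mean_set)       # 1416.0
print(mean_multiset)  # 1396.0
```

The two means differ only because the set view drops the second 1336, which seems to be exactly the point about retaining duplicate members.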
crawler
If you need information about a VB.NET crawler, contact me at arshad.qureshi72@gmail.com —Preceding unsigned comment added by 203.101.126.26 (talk) 20:40, 5 June 2008 (UTC)