Talk:Rough set
From Wikipedia, the free encyclopedia
David. I've just looked at this stuff and I'm afraid it is really terrible! I am sorry to have to say this. I suggest the following structure be followed closely: 1 - Brief history, background and motivation for rough sets 2 - Very brief and precise Math definition of rough sets. (definitions should not include examples) 3 - Very simple examples. 4 - optionally, some more complex real world examples or discussion. 5 - references, related links etc. This article could and should be much shorter. If written in the above format, I could work with you to improve the page. As is, there is simply too much wrong! InformationSpace 02:25, 10 May 2007 (UTC)
- I would be very happy to work with you. I think the changes I made were a considerable improvement over what was there before (really, just a terse collection of symbols), but I'm sure it can be improved further. Formal mathematical definitions are good, but not too helpful to the typical Wikipedia reader. (People likely to be reading this Wikipedia article are not the same people who live and breathe Machine Learning.) I think examples convey the concepts much better. Where do you want to start? —Dfass 19:03, 14 May 2007 (UTC)
- This is really quite a mathematical article, so I think a formal definition would not be at all out of place. The definition should be accompanied by an informal description. It is definitely not appropriate to start with an attempt at a definition of an information system! Perhaps such a definition does not even have a place in this article! If you can do:
1 - Brief history, background and motivation for rough sets, and 2 - Very brief and precise Math definition of rough sets (perhaps with informal description). Then we can go from there. This should be easy for you! If not, you really have no place meddling with this article. InformationSpace 00:16, 17 May 2007 (UTC)
- ...no place meddling... There's a phrase I have not yet heard in these hallowed halls!! You make me laugh, which is good, because laughter is the best medicine. (And I love medicine.) Why is the definition of information system not appropriate? Almost every single introductory paper I've read on rough sets begins with information system. Is that not your experience? If you're such a professional in this area (which you may very well be), why don't you just go ahead and write the definition, and I'll meddle with it after you're done. Sound like a plan? —Dfass 23:27, 18 May 2007 (UTC)
- I do hope you are not unduly offended and that you get my point--I would expect anyone writing a scholarly article about X to be able to give 1 and 2 for X. I am not a big expert on rough sets at all. However I am researching generalizations of sets to model information systems. I am curious about rough sets, but I have been unable to find a good definition. I cannot afford to get too side-tracked from the main focus of my research and spend lots of time researching rough sets---especially as my supervisors (having seen the Wikipedia article on rough sets) have told me not to bother with such rubbish! If the article started with 1 and 2 then I could quickly see how and where rough sets fitted into the rest of my research. As is, I guess I'll take the advice of my supervisors and not bother with them. I am concerned that the article is not a good ambassador of rough sets and will not win many converts to rough set theory. BTW- the article says that rough sets were introduced in '91, but cites an '88 paper. "Rough sets Zdzislaw Pawlak, Jerzy Grzymala-Busse, Roman Slowinski, Wojciech Ziarko November 1995 Communications of the ACM, Volume 38 Issue 11" says they were introduced in the early 80s. Also, in the article, I don't know what [x]_p is. InformationSpace 05:39, 21 May 2007 (UTC)
- I will try to revise it. You should get Pawlak's book (from the library), though. It gives very nice definitions, and is fun to read. Simply, a "rough set" is just a pair of sets which approximate some "unknown" set; the first set of the pair specifies the elements which are *definitely* members of the hidden set, and the second set of the pair specifies elements which are *possibly* members of the hidden set. That's it. Pawlak provides a bunch of theorems and stuff, if that's what you like. The real interest to me comes more from the interpretation of rough sets as "concepts" in a machine learning context, and their use in describing what concepts are learnable or unlearnable based on different feature sets. I'm not too interested in their history, and I probably didn't write that section.
- I was not offended by any of your remarks; I just found them humorous, because "meddling" is what Wikipedians do. And, usually, it is when someone does a crappy job on an article (which I'm not claiming I did) that more experienced people feel compelled to come in and make corrections. If no one meddles, nothing gets written. Regarding your research, if you mean "information system" in the sense that the article uses the term, I think you should definitely become familiar with rough sets, even if only so you can explain why it wasn't the appropriate model for your work. Pawlak's book will help. Good luck. —Dfass 19:53, 21 May 2007 (UTC)
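Note: the informal definition given in the comment above (a pair of sets, one holding the definite members and one holding the possible members) can be sketched in Python. This is only an illustration, assuming the standard formulation in which objects are grouped into equivalence classes by having identical values on a chosen attribute set P (presumably what the article's [x]_p notation denotes). The function names and the toy table are hypothetical and come neither from the article nor from Pawlak's book.

```python
from collections import defaultdict


def equivalence_classes(table, attrs):
    """Group object ids whose values agree on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes[key].add(obj)
    return list(classes.values())


def rough_set(table, attrs, target):
    """Return the (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:   # class lies entirely inside target: definite members
            lower |= cls
        if cls & target:    # class overlaps target: possible members
            upper |= cls
    return lower, upper


# Hypothetical toy data: o1 and o2 are indistinguishable on these attributes.
table = {
    "o1": {"colour": "red", "size": "big"},
    "o2": {"colour": "red", "size": "big"},
    "o3": {"colour": "blue", "size": "big"},
    "o4": {"colour": "blue", "size": "small"},
}
target = {"o1", "o3"}  # the "unknown" set being approximated

lower, upper = rough_set(table, ["colour", "size"], target)
print(sorted(lower))  # definite members
print(sorted(upper))  # possible members
```

Here o3 is a definite member because nothing else looks like it, whereas o1 and o2 are only possible members because the attributes cannot tell them apart and only one of them is in the target set.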
- Ok. I will see if any libraries round here have Pawlak's book. I do like your informal definition of rough sets above. I think something similar should replace the description of rough sets at the beginning of the article! At present, the article does not do a good job in distinguishing between rough sets and their application w.r.t. information systems. Also, I really don't like the definition of "information system". From the example, it looks like an "information system" is just an n-ary relation (see Codd's work on relational databases...). Is there a difference? InformationSpace 06:19, 22 May 2007 (UTC)
- If your university library does not have the book, try to get it through interlibrary loan. (Now watch, you'll probably end up recalling my copy! Are you in the NY area?) Regarding the relationship to database theory, information system is not identically an n-ary relation, because (as I understand it) it may contain repeated rows, these indicating multiple observations of the same object (feature vector). Because a relation is a set, one could not represent these repeated observations in a relation. An information system is rather best thought of as simply a fancy name for a "spreadsheet". —Dfass 15:35, 27 May 2007 (UTC)
- I strongly suspect a number of things: 1 - the "definition" of "information system" in the article makes no mathematical sense. 2 - If an "information system" is not identically an n-ary relation, it could be defined simply using n-ary relations, or at least Cartesian products. I'm not sure what you mean by "repeated rows". If you have n properties per object and 1 row for each object, then you have an (n+1)-ary relation. If you have more than 1 row for each object, you have an (n+2)-ary relation. InformationSpace 06:00, 29 May 2007 (UTC)
- I'm using the term "relation" in the set-theoretic or mathematical sense, in which case the "arity" would refer to the number of attributes, and there could not meaningfully be repetition of objects. But I'm not sure that (1) database relation === mathematical relation, and I'm also (2) not sure that I have it right about repeated rows being permitted in an information system. I don't see where it is definitionally ruled out (or why), but Pawlak does not mention this as one of the differences between information systems and relational database tables on page 59 (bottom) of the book. So, I am either misunderstanding the scope of the term "relation" as used in database theory, or I am misunderstanding the scope of information system. I think that probably I have got the database idea wrong, because if a database relation already ruled out repeated rows, then what would be the special value of 1st normal form? I see the Wikipedia article for Table (database) says "For instance, an SQL table can potentially contain duplicate rows, whereas a true relation cannot contain duplicate tuples," but I still don't know whether database "relation" permits this. Clearly, mathematical "relation" does not. OK, later. Gotta go to work! —Dfass 14:20, 29 May 2007 (UTC)
- I still haven't read Pawlak's book. I don't know what he has to say about information systems. Some advice though: revise sets, relations, Cartesian products, and functions (and related material). Maths, unfortunately, can get quite complex, but most of it is based directly on these 4 concepts. Learn them thoroughly, and apply them carefully. There are a number of good texts; get one around 100 pages long. This article, and the stuff on attribute-value systems, is currently, quite frankly, a bit of an embarrassment. BTW if you have an n-ary relation, but you want to repeat rows, one way is to use an (n+1)-ary relation where the extra set consists of natural numbers. InformationSpace 23:49, 29 May 2007 (UTC)
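Note: the point about repeated rows can be illustrated with a short Python sketch. It assumes a relation is modelled as a set of tuples; the sample rows and variable names are made up for illustration. A plain set drops the duplicate observation, while adding an extra column of natural numbers, as suggested in the comment above, keeps every row distinct.

```python
# Three observations; the first two are identical feature vectors.
rows = [("red", "big"), ("red", "big"), ("blue", "small")]

# A mathematical relation is a set of tuples, so the repeat collapses.
relation = set(rows)

# Prefix each row with a natural number to get a 3-ary relation in which
# every observation is a distinct tuple.
indexed = {(i, *row) for i, row in enumerate(rows)}

# Dropping the index column (in index order) recovers the original rows.
recovered = [t[1:] for t in sorted(indexed)]

print(len(relation), len(indexed))
print(recovered == rows)
```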
- Ah, so we're back to that again. —Dfass 15:09, 30 May 2007 (UTC)
- 'fraid so. You'd be hard pressed to get a different opinion from any mathematician. InformationSpace 23:22, 30 May 2007 (UTC)
Hey, if you're still there, I've looked at the ROUGH SETS book now and I can see that the problem is not yours, but rather Zdzislaw Pawlak's. Whatever he is, he is definitely NOT a competent mathematician. In particular, that stuff on p59 about the difference between his `information systems' and the relational model of data is complete rubbish!! What is really amazing is that Pawlak's mathematical ineptitude doesn't seem to matter!! I think over 1000 research articles relating to rough sets were published last year... InformationSpace 01:59, 5 July 2007 (UTC)
- I'm still here, although I regret that I have not had a chance to work on the article any further. I can't assess Pawlak's competencies, since I am not a mathematician myself, but it's not surprising that the field has outgrown its originator. That often happens. I have to believe that some of the people working in this area now have strong mathematical credentials. —Dfass 01:37, 10 July 2007 (UTC)
Mmmm... I'm more dubious about that than you---I've been asking around for a definition of rough sets for a while now and nobody seems to know... It is a real pity. It seems to me that behind the maze of inept mathematics there are some beautifully simple and practical ideas. InformationSpace 04:12, 12 July 2007 (UTC)