User talk:AshLin/Categorisation

Hi AshLin! You wanted some comments, so I'll tell you my thoughts.

Firstly, about the categories. Categories down to family level are useful. However, I don't think that creating categories beyond this taxonomic level is a good idea (species categories are particularly unhelpful in my opinion, as in general only one species will belong to each). I don't think that creating genus categories will be a good thing either, as it doesn't really make exploration easier. You will end up creating endless numbers of categories, which won't make things very user-friendly (and user-friendliness is one of the purposes of categories). Lists are probably much better adapted when you go beyond the family level (see Wikipedia:Categories, lists, and series boxes for more information on how and when to use each grouping system).

The other problem with creating extensively detailed categories is that articles will end up having a huge number of categories, which is not a good idea (to quote the above page, "categories become less effective the more there are on a given article").

In the case of a genus grouping, lists would probably turn out much more advantageous for a number of reasons: red links (i.e. non-existent articles) can be added to lists, but not to categories; lists can be annotated (categories cannot); lists are more accessible and easier to edit for newbies (newbies have a hard time understanding how categories work); and categories cannot include alternative names (common names, etc.).

If you decide to use a list for genera, you can put that list in the category of the family it belongs to, so that people browsing that category can easily find the list of genera. The articles for species should then be in the category for the family with a link to the genus list. The taxoboxes allow easy navigation from family to genus to species; I doubt that using categories will make it easier to navigate the data.

To take your List of Butterflies of India (Papilionidae) as an example, I don't think that making categories will make exploration easier. Simply visiting the page and checking out the genera is sufficiently easy. As for other butterflies that don't yet have such a page, creating a list of genera for the different families is one possibility, or even simply adding them into the taxoboxes.

Besides, there are to date no existing categories for genera, which seems to indicate that people in the past have preferred different grouping methods.

One last thing about categories: if you look at the WikiProject Tree of Life, you can read this: "Major groups should be given their own categories. When possible, these should use the common name in the plural. In general, only articles about major subgroups should be added, and more specific articles should be included in subcategories. However, when there are only a few articles about members of the group, they can all go directly into the main category. Use your judgement on when to split, aiming for an approximate category size of 10-50 articles."

Next, why do you want the genus articles to be in the format "[scientific genus name] (genus)"? This is not usually how things are done (see List of Coccinellidae genera for example), but maybe you have a reason I don't know about! Also, I'm thinking that if you do that you will need to create redirects from the name of the genus. For example, if you create an article "Meandrusa (genus)", you will need to create a redirect from "Meandrusa". This seems to me like extra work and time that you could spare yourself!

Anyway, these are only suggestions; I wouldn't want you to spend lots of time on something that could easily be avoided! And once again, I would like to offer my assistance if you decide to create a WikiProject Lepidoptera, which would help you put all these guidelines in one place and make reference easier for Wikipedians interested in editing Lepidoptera-related articles. Take care. IronChris | (talk) 22:53, 23 April 2006 (UTC)

Pause

Hmmm, lots to think about. I was thinking about the end state and not about practicality for users at the working level en route to it. It's obvious that we'll have to choose between 'have genus groupings for categories' and 'restrict categories to max 50 pages'.

Let me play devil's advocate. Why restrict? Suppose we do have say 400/500 species in a category, so what? 'Butterfly stubs' works too.

Let me stop categorisation till we get this right. You're right. It is hard work and if it's not useful, I don't want to do it. Thanks for the useful feedback. 00:22, 24 April 2006 (UTC)

Another thought, it's obvious that our policy must look at 'Navigation and User Accessibility' as one issue rather than separate visions of categories, lists, boxes and redirects. You have been more on the right lines, I find. AshLin 00:40, 24 April 2006 (UTC)

Viren's thoughts 1

We will need both categories and lists.

That is not the issue though. The issue is whether these categories should be at genus level. Each genus has some features of its own. Those will go into the genus articles.

"Categories should be on major topics that are likely to be useful to someone reading the article." Suggestive of a bottom up approach (depending on perception)

Arguments in favour of genus categorisation
1. Right now there are few articles (sorry AshLin, but your paps list of India, while long, is paltry compared to the list of paps of the world). According to http://www.ucl.ac.uk/taxome/rhopnos.html there are 600+ species in paps of the world, and this is the smallest family. A category only at family level will quickly get unmanageable, i.e. difficult to read.
2. The articles will have scientific name titles and hence the reader will see a list of 600+ binomials (ouch). This kind of categorisation will be helpful only to a reader who is MORE than just the "What is that green and black butterfly?" kind.
3. A genus level categorisation will help this type of person much more than a family level categorisation.
4. Coming back to the "What is that green and black butterfly?" kind of reader. This guy is not going to understand family level or genus level categorisations. He NEEDS a list of species, as IronChris suggests. That list will be broken up as it needs to be. All I can say is that it will be massive. (grin) We will probably have to drill down to common names of genera anyway to keep the list size manageable.

Arguments against genus categorisation
1. The number of levels a reader has to drill down through to get to an article.
Hmm, that's about all I can think up.

Now in terms of having a genus page listing species or a list of genera, I'm a bit confused. The Coleoptera genera list leads nowhere. I can't tell what set of species belongs to a particular genus. For that I expect I would have to go to each beetle species page and see what genus it belongs to. I think that we will have to create both lists and categories at the genus level. Bottom line: there are so many issues/parameters involved here. I am not able to get a complete picture of all the different ways a reader will approach the butterfly pages. Nor am I able to put down my thoughts in a clean manner. I need more discussion on this. I agree with AshLin: categorisation, and by extension stub making, will have to be suspended until we get clarity on categorisation.

Ok, interesting thoughts here. There needs to be a discussion, for sure. I'm going to put this to the WikiProject Tree of Life, on which we depend (though we don't have to follow their guidelines, obviously!). I expect they'll have something to say on the subject. IronChris | (talk) 15:40, 24 April 2006 (UTC)

Here's the answer I got from the WikiProject Tree of Life:

We have discussed this before, but I can't find the right archive!
My rules of thumb:
Make categories for the articles we have now without worrying about it "getting too big later". If we do start getting lots of species level articles in large families, then we can refactor at that time.
Ball park category size: 10-50 articles, but
Use whatever other scientists use day-to-day to talk about a group. Sometimes a family is spoken about without too much consideration about the component genera. Seems wrong to make category genera in these cases.
Not worth trying to create a completely uniform standard for the whole tree, but should be consistent within an order.
Pcb21 Pete 16:39, 24 April 2006 (UTC)

You may find the whole discussion at Wikipedia talk:WikiProject Tree of Life#Categorisation. I wish we could find that archive! Hopefully some more guidelines will follow. IronChris | (talk) 16:52, 24 April 2006 (UTC)

"Make categories for the articles we have now without worrying about it "getting too big later". If we do start getting lots of species level articles in large families, then we can refactor at that time."
I don't think Pete has understood the rate at which we are going to fill in those species articles for Indian butterflies. Refactoring later will be an enormous waste of time. I know, cos it is I who does all the donkey work like changing categories.
Ok, hypothetical situation: 1000 butterfly articles exist across all 5 families, in various genera. What would the refactoring look like? --Viren 03:19, 25 April 2006 (UTC)

More on Categorisation

Let's look at some practical issues I've faced or envisage:

  • OK, I'm a Paps-oholic guy. I have a ref which lists ALL the Paps (it's back home 300 km away, so I can't tell you the exact number of spp. now). So, yes, I WOULD like to add them at least as stubs, after the Indian List is over. Now please tell me how I manage 600 articles in a single family category? Horror of horrors, at least two hundred (I think) have common names; some have three or four, see Tailed Jay, Graphium agamemnon as an example.
  • Each category page should ideally have a max of 50 entries, but what are you going to do if it crosses into the hundreds? The problem is not one of handling ('Butterfly stubs' handles hundreds) but the fact that we have to structure info for easy finding, not obscure it. Vast numbers of species accounts will drown out the small number of very important general articles, which will be missed. We won't be able to see the wood for the trees. This has already happened! In fact, it is what led me to do this kind of work. The hundreds of species prevented me from seeing the info, or more importantly the lack of it. Now that the dross is away, I've realised some deficiencies in linking, for example,
    • 'Butterfly Regional Lists' was not part of insect lists! Did that.
    • 'Fictional butterflies and moths' do occupy a weird but legitimate corner of our wikispace. But it was not earlier linked to Butterflies and Moths. Linked them too. These 'pokemons' are ours!!! (indignantly) ;).
    • Now I realise, by looking at the other groups, what we lack. For example, coleopterology has wikipages on important classical books or references on beetles. We need to do the same for, say in the Indian context, Evans, Wynter-Blyth, Kunte and Haribal (the books, not the people). I noticed this because I now know what's in butterflies, after shoving all the species out of 'Butterflies' and into the families.
  • A category groups articles together; a list only shows whether a link of that name exists or not, and gives no info on whether there are any other articles on the subject. So, it's not .OR., it's .AND. We need both lists and categories for the same issues.

  • The problem will get worse when general articles on families begin, e.g. when we write articles on inedibility and mimicry in Danaids and Paps, ant association in Lycaenids, pest status with regard to Pierids, skippers etc. And they will come in their time! So, this change may well be forced on us at a point in the future when we will have to collectively tackle thousands of butterflies. Wikipedia will definitely exist ten, twenty, fifty, a hundred years from now, like the internet. So Project Arthropods, though brand new now, will probably have many WikiProjects under it, like 'Order Coleoptera', 'Order Odonata', 'Order Diptera' etc. Some of these are much larger than Lepidoptera. And very active too.
  • Per se, usage should not hinder evolution. IronChris has a genuine concern that we may be doing extra work, which may be wasted. We must realise that adding info to any infobase, let alone Wikipedia, involves sorting or metadata overhead. The larger the collection, the more the overhead. You can't get away from it, or you'll have 'flat file' syndrome. You HAVE to learn the system. Wikipedia will never have the ideal, easy interface to give info neatly packaged, just as you like it. Take a search engine: it gives you tons of info, but is it ready to use? Wikisearch tries to make it useful by ordering it as per 'relevance', but is it exactly what you want? Finding info will always remain a problem for people. Categories are an aid, another aspect, way or paradigm. We cannot sacrifice their utility for user convenience. AshLin 04:19, 25 April 2006 (UTC)