Wikipedia:Overcategorization
From Wikipedia, the free encyclopedia
Categorization is a useful tool for grouping articles for ease of navigation and for correlating similar information. However, not every verifiable fact (or the intersection of two or more such facts) in an article requires an associated category. If every such fact had its own category, a lengthy article could end up in hundreds of categories, most of which would not be particularly relevant. This would also make it more difficult to find any particular category for a specific article. Such overcategorization is also known as "category clutter".
To address these concerns, this page lists types of categories that should generally be avoided. Based on existing guidelines and precedent at Wikipedia:Categories for discussion, such categories, if created, are likely to be deleted.
Non-defining or trivial characteristic
- Examples: Bald People, Famous redheads
- In general, categorize by what may be considered notable in a person's life, such as his or her career, origin and major accomplishments. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have may be considered trivial. Such things may be interesting information for an article, but not useful for categorization. If something could be easily left out of a biography, it is likely not a defining characteristic.
- Note that this also includes grouping people by trivial circumstances of their deaths, such as categorizing people by the age at which they died or by whether they still had unreleased or unpublished work at the time of their death. Even though such categories may be interesting to some people, they aren't particularly encyclopedic.
Opinion about a question or issue
- Examples: Cat lovers, Iraq liberation opposition, Star Trek fans
- Avoid categorizing people by their personal opinions, even if a reliable source can be found for the opinions. This includes supporters or critics of an issue, personal preferences (such as liking or disliking green beans), and opinions or allegations about the person by other people (e.g. "alleged criminals"). Please note, however, the distinction between holding an opinion and being an activist, the latter of which may be a defining characteristic (see Category:Activists).
Subjective inclusion criterion
- Examples: Obese people, Cult actors, Mysterious musicians, Outstanding Canadians
- Adjectives that imply a subjective inclusion criterion should not be used in naming or defining a category. Examples include such subjective words as: famous, notable, great, etc.; any reference to size: large, small, tall, short, etc.; or distance: near, far, etc.; or character trait: beautiful, evil, friendly, greedy, honest, intelligent, old, popular, ugly, young, etc.
Arbitrary inclusion criterion
- Examples: School districts at the top 7% on Pennsylvania standardized tests, Locations with incomes over $30,000
- There is no particular reason for choosing "7%" or "$30,000" as cutoff points in these two cases. Likewise, a district with 3,800 students is not meaningfully different from one with 4,100 students. A better way of representing this kind of information is to put it in an article such as "List of school districts in (region) by size". Note that Wikipedia allows a table to be made sortable by any column.
- An exception to this is categorizing by year, since making a category for each year is not arbitrary.
Trivial intersection
- Examples: Celebrity Gamers, Red haired kings
- Avoid intersections of two traits that are unrelated, even if some person can be found who has both traits. For example, celebrities are usually notable for reasons other than being gamers.
Intersection by location
- Examples: Roman Catholic Bishops from Ohio, Quarterbacks from Louisiana, Male models from Dallas
- Geographical boundaries may be useful for dividing subjects into regions that are directly related to the subjects' characteristics (for example, Roman Catholic Bishops of the Diocese of Columbus, Ohio or New Orleans Saints quarterbacks).
- In general, avoid subcategorizing subjects by geographical boundary if that boundary does not have any relevant bearing on the subjects' other characteristics. For example, quarterbacks' careers are not defined by the specific state that they once lived in (unless they played for a team within that state).
- However, location may be used as a way to split a large category into subcategories, as in Category:American writers by state.
Non-notable intersections by ethnicity, religion, or sexual preference
- Examples: Secular Jewish philosophers, LGBT murderers, German-American sportspeople
- Wikipedia:Categorization/Gender, race and sexuality states, in part:
- Dedicated group-subject subcategories, such as Category:LGBT writers or Category:African American musicians, should only be created where that combination is itself recognized as a distinct and unique cultural topic in its own right. If a substantial and encyclopedic head article (not just a list) cannot be written for such a category, then the category should not be created. Please note that this does not mean that the head article must already exist before a category may be created, but that it must at least be reasonable to create one.
- Likewise, people should only be categorized by ethnicity or religion if this has significant bearing on their career. For instance, in sports, German-Americans are not treated differently from Italian-Americans or French-Americans. Similarly, in criminology, a person's actions are more important than their sexual orientation. While "LGBT literature" is a specific genre and useful categorisation, "LGBT quantum physics" is not.
Narrow intersection
- Example: Pre-1933 two-digit Virginia state highways
- If an article is in "category A" and "category B", it does not follow that a "category A and B" has to be created for this article. Such intersections tend to be very narrow, and clutter up the page's category list. Even worse, an article in categories A, B and C might be put in four such categories "A and B", "B and C", "A and C" as well as "A, B and C", which clearly isn't helpful.
- In general, intersection categories should only be created when both parent categories are very large and similar intersections can be made for related categories.
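The growth described above can be illustrated with a short sketch. The Python function below is a hypothetical helper written for this page, not part of any Wikipedia or MediaWiki tooling. It enumerates every intersection of two or more parent categories that an article could in principle be placed in.

```python
from itertools import combinations


def intersection_categories(categories):
    """Return the name of every intersection of two or more categories.

    `categories` is a list of parent category names. The naming scheme
    ("A and B") is an assumption made purely for illustration.
    """
    result = []
    # Intersections need at least two parents, so start at size 2.
    for size in range(2, len(categories) + 1):
        for combo in combinations(categories, size):
            result.append(" and ".join(combo))
    return result


# The three-category case from the text: four intersection categories.
print(intersection_categories(["A", "B", "C"]))

# The count grows as 2**n - n - 1 for n parent categories.
for n in (3, 5, 10):
    names = [f"C{i}" for i in range(n)]
    print(n, len(intersection_categories(names)))
```

For three parent categories the function returns the four intersections named above. For ten parent categories it returns 1,013, which suggests why narrow intersections quickly clutter an article's category list.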
Small with no potential for growth
- Examples: The Beatles' wives, Husbands of Elizabeth Taylor, Catalan-speaking countries
- Avoid categories that, by their very definition, will never have more than a few members, unless such categories are part of a large overall accepted sub-categorization scheme, such as subdividing songs in Category:Songs by artist or flags in Category:Flags by country.
Mostly-overlapping categories
- Examples: 1971 National League All-Stars, 1852 religious leaders
- If two or more categories have a large overlap (e.g. because many athletes participate in multiple all-star games, and religious leadership does not radically change from year to year), it is generally better to merge the subjects to a single category, and create lists to detail the multiple instances.
Unrelated subjects with shared names
- Examples: Ice-named rappers, Churches named for St. Dunstan
- Avoid categorising by a subject's name when it is a non-defining characteristic of the subject, or by characteristics of the name rather than the subject itself. For example, a category for unrelated people who happen to be named "Jones" is not useful. However, a category may be useful if the people, objects, or places are directly related — for example, a category grouping subarticles directly related to a specific Jones family.
Eponymous categories for people
- Examples: John Wayne, Barbra Streisand, ZZ Top, Eponymous fashion model categories, Sports broadcasting families.
- In general, avoid creating categories named after individual people or groupings of people (such as families or musical groups). Articles directly related to the subject (which would thus be potential members of such categories) are typically already linked from the eponymous article in question. If these links are not present, they should be added before proposing such a category for deletion. Sometimes, renaming the category to reflect the topic, rather than the person, is a good alternative to deletion. Category:Shakespeare academia and Category:Tolkien studies are two such examples.
- However, there are sometimes good reasons to have an eponymous category. Most examples are either collections of subarticles (see Wikipedia:Summary style) or collections of articles on a topic about the named person. Category:William Shakespeare and Category:J. R. R. Tolkien (sub-categories of which were noted as examples above) are two such examples. Another example is Category:Alexander the Great, which includes subarticles as well as topic articles such as Alexander (film), Alexander Mosaic, Alexander Romance, Alexander in the Qur'an, Alexander the Great (1956 film), and Alexander the Great (song).
Candidates and nominees
- Example: Potential 2008 Republican U.S. Presidential Candidates
- Wikipedia is not a crystal ball. A candidate for public office, the possible next CEO of a certain corporation, a potential member of a sports team, an actor on the "short list" to play a role, or an award nominee (just to name a few examples) should not be grouped by category. Lists may be appropriate for such groupings.
Award recipients
- Example:
- People can and do receive awards and honors throughout their lives. In general, with a few exceptions, recipients of an award should be grouped in a list rather than a category.
- Exceptions include Category:Nobel laureates and Category:Academy Award winners. See also Category:Award winners.
Published list
- Example: Rolling Stone's 500 Greatest Albums
- Magazines and books regularly publish lists of the "top 10" (or some other number) in any particular field. Such lists tend to be subjective and somewhat arbitrary. Some particularly well-known and unique lists such as the Billboard charts may constitute exceptions, although creating categories for them may risk violating the publisher's copyright or trademark.
Venues by event
- Examples: WrestleMania venues, Republican National Convention venues, Democratic National Convention venues
- There is no encyclopedic value in categorizing locations by the events or event types that have been held there, such as arenas that have hosted specific sports events or concerts, convention centers that have hosted specific conventions or meetings, or cities featured in specific television shows that film at multiple locations.
- Likewise, avoid categorizing events by their hosting locations. Many notable locations (e.g. Madison Square Garden) have hosted so many sports events and conventions over time that categories listing all such events would not be readable.
- However, categories that indicate how a specific facility is regularly used in a specific and notable way for some or all of the year (such as Category:National Basketball Association venues) may sometimes be appropriate.
- See also #Performers by performance venue.
Performers by performance
- Avoid categorizing performers by their performances. Examples of "performers" include (but are not limited to) actors/actresses (including pornographic actors), comedians, dancers, models, orators, singers, etc.
Performers by action or appearance
- Examples: Actresses who have appeared veiled, Anal porn actress, Musicians who play left-handed, Saxophonists who are capable of circular breathing
- Avoid categorising performers by some action they may have performed (such as a "pirouette", a "runway walk", a "spit take", a "pratfall", a "sword fight", etc.); some method of performance (such as while standing on their head, left-handed, etc.); or how they may have chosen to appear (such as bald, veiled, etc.).
Performers by role or composition
- Performers who have portrayed <character name>
- Performers who have portrayed <a type of character>
- Performers who have performed <a specific work>
- Examples: Fictional characters by actor and subcategories, American dramatic actors, Actors that portrayed heroes or villains, Jim Steinman artists, Actors & Actresses who portrayed, Actors who have played serial killers, Actors who have played gay characters, Actors who played HIV-positive characters, and Actors who have played the President of the United States.
- Avoid categories which categorise performers by their portrayal of a role. This includes portraying a specific character (such as Darth Vader, or Hamlet). This also includes voicing animated characters (such as Donald Duck), or doing "impressions"; portraying a "type" of character (such as wealthy, poor, religious, homeless, gay, female, politician, Scottish, dead, etc.); or performing a specific work (such as Amazing Grace, "Waltz of the swans" from Swan Lake, "To be or not to be" from Hamlet (the play), "Why did the chicken cross the road?" (a joke), etc.).
Performers by performance venue
- Examples: Artists who played Coachella, Saturday Night Live musical guests, Ozzfest performers, Celebrity Poker Showdown players, Entertainers who performed for troops during the Vietnam War, and Actors by series
- Avoid categorising performers by an appearance at an event or other performance venue. This also includes categorization by performance in any specific film, radio, television, or theatrical production (such as M*A*S*H, Star Wars, or Phantom of the Opera).
- Note also that performers should not be categorized into a general category which groups topics about a particular performance venue or production (e.g. Category:Star Trek), when the specific performance category would be deleted (e.g. Category:Star Trek script writers).
- See also #Venues by event.
See also
- Wikipedia:Categorization
- Wikipedia:Categories, lists, and series boxes
- Wikipedia:Categorization FAQ
- Wikipedia:Categorization of people
- Wikipedia:Categorization and subcategories
- Wikipedia:Naming conventions (categories)
- Sortable tables
- Wikipedia:Category intersection - One of several open feature requests that seek to offer an alternative way to address overcategorization.