Lumpers and splitters

From Wikipedia, the free encyclopedia

Lumping and splitting refers to a well-known problem in any discipline that must place individual examples into rigorously defined categories. The lumper/splitter problem arises whenever classifications have to be created and examples assigned to them, for example schools of literature, biological paleo-species and so on. A "lumper" is an individual who takes a gestalt view of a definition and assigns examples broadly, assuming that differences are not as important as signature similarities. A "splitter" is an individual who takes precise definitions and creates new categories to classify samples that differ in key ways.

1 Lumping and splitting in biology
2 Lumping and splitting in history
3 Lumping and splitting in software modelling
4 Lumping and splitting in language classification
5 See also
6 External links

Lumping and splitting in biology

The naming of a particular species should be regarded as a hypothesis about the evolutionary relationships and distinguishability of that group of organisms. As further information comes to hand, the hypothesis may be confirmed or refuted. Sometimes, especially in the past when communication was more difficult, taxonomists working in isolation have given two distinct names to individual organisms later identified as the same species. When two named species are discovered to be the same species, the older species name is usually retained and the newer species name dropped, a process called synonymization or, more colloquially, lumping. Dividing a taxon into multiple, often new, taxa is called splitting. Taxonomists are often referred to as "lumpers" or "splitters" by their colleagues, depending on their personal approach to recognizing differences or commonalities between organisms.

Lumping and splitting in history

Main article: periodization

In history, lumpers are those who tend to create broad definitions that cover large periods of time and many disciplines, whereas splitters want to assign names to tight groups of inter-relationships.

Each approach has its well-known problems. Lumping tends to create an increasingly unwieldy definition whose members have less and less in common. This can lead to definitions which are little more than conventionalities, or to groups which join fundamentally different examples. Splitting often leads to "distinctions without a difference", ornate and fussy categories, and a failure to see underlying similarities.

For example, in the arts, "Romantic" can refer specifically to a period of German poetry roughly from 1780 to 1810, a usage that would exclude the later work of Goethe, among other writers. In music it can mean every composer from Hummel through Rachmaninoff, plus many who came after.

Lumping and splitting in software modelling

Software engineering often proceeds by building models, an approach sometimes known as Model-Driven Architecture. A lumper is always keen to generalize, and produces models with a small number of broadly defined objects. A splitter is reluctant to generalize, and produces models with a large number of narrowly defined objects. For example, according to the lumpers, a subcontractor is basically the same as any other supplier and therefore belongs to the same class; the splitters would probably argue that there are significant differences between different groups of suppliers, justifying separate classes in the model.
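The supplier example can be made concrete with a minimal sketch in Python. It is only an illustration under stated assumptions: the class names, the attributes (kind, catalogue, trade, hourly_rate) and the sample records are hypothetical and do not come from any particular modelling method.

```python
from dataclasses import dataclass


# Lumper's model: one broadly defined class covers every kind of supplier.
# The "kind" field is a hypothetical tag that records the difference as data.
@dataclass
class Supplier:
    name: str
    kind: str = "general"


# Splitter's model: each group of suppliers gets its own narrowly defined
# class, with attributes (all hypothetical) that only make sense for that group.
@dataclass
class MaterialsSupplier:
    name: str
    catalogue: str


@dataclass
class Subcontractor:
    name: str
    trade: str
    hourly_rate: float


# The same two real-world parties, modelled both ways.
lumped = [
    Supplier("Acme Bricks"),
    Supplier("Jones Plumbing", kind="subcontractor"),
]
split = [
    MaterialsSupplier("Acme Bricks", catalogue="bricks and mortar"),
    Subcontractor("Jones Plumbing", trade="plumbing", hourly_rate=60.0),
]

# Count how many distinct classes each model needed.
lumped_classes = {type(party).__name__ for party in lumped}
split_classes = {type(party).__name__ for party in split}
```

In the lumped model the difference between the two parties survives only as a value in a field, while in the split model it is part of the structure of the model itself. Which is preferable depends on whether the rest of the system treats the groups differently.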

Lumping and splitting in language classification

Proposed language families that are the subject of lumper-splitter controversies include Ural-Altaic, Altaic itself, Austric, Nostratic, and Joseph Greenberg's similar Eurasiatic, his Amerind languages, Indo-Pacific, and Nilo-Saharan, and above all Merritt Ruhlen's Proto-World.

Splitters regard reconstruction of a common ancestor (protolanguage) via the comparative method as the only valid proof of relationship, and consider genetic relatedness to be the question of interest. American linguists of recent decades tend to be splitters.

Lumpers are more willing to admit techniques such as mass lexical comparison, lexicostatistics and mass typological comparison, and to tolerate uncertainty about whether relationships found by these methods are the result of linguistic divergence (descent from a common ancestor) or language convergence (borrowing). Much long-range comparison work has come from Russian linguists such as Vladislav Illich-Svitych and Sergei Starostin. In the US, Greenberg's and Ruhlen's work has been well publicized. Some well-known earlier American linguists, such as Morris Swadesh and Edward Sapir, also pursued large-scale classifications.