Talk:Swadesh list

This article is within the scope of the WikiProject Linguistics. If you would like to participate please visit the project page, where you can join the project and see a list of open tasks.

General

It should be noted that all the linguistic sources cited are extremely old and outdated. Furthermore, not a single textbook is even mentioned (which I have now made up for). This is a bad sign. HJJHolm 11:26, 16 January 2007 (UTC)

source, 207? 200?

what is the source of this list, and why does it have 207 rather than 200 items? dab () 13:28, 28 Feb 2005 (UTC)

Swadesh himself started with ca. 500 words and gradually shortened the list via 207 and 200 to, finally, 100 words. Even these have often been shown to contain many lexemes which are not at all resistant to borrowing. HJJHolm 11:21, 16 January 2007 (UTC)

The source is here at wiktionary: [1] --Arcadian 04:31, 1 Mar 2005 (UTC)

I see -- the Rosetta project gives you a choice of 100, 200 or 207 word lists. It is our job to find out who introduced which list. A Google search gave me no conclusive answers. It is very important to stick to a fixed list. It doesn't do to pick any old list of basic vocabulary: it can be rigged to show a greater or lesser relationship, depending on what you want to show. I think Swadesh originally proposed 100 items, and it was later expanded to 200 (by whom?), and the additional 7 items may be an idiosyncrasy of the Rosetta project (also, which are the 7 items added?). I'll try to find out, but I'm not sure where I'll find this information. dab () 06:45, 1 Mar 2005 (UTC)
the 207-word list is simply the combination of the 100 & 200 word lists. The 200-word list is not the 100-word + 100 more words. The following 7 words appear in the 100-word list but not the 200-word list: breast, fingernail, full, horn, knee, moon, round. peace – ishwar  (speak) 18:45, 15 September 2005 (UTC)
the 200-word list comes from Swadesh (1952). This list was later divided into two 100-word groups of which the 1st group is preferred and the 2nd group is secondary & to be used if there are items in the 1st group that are difficult or impossible to translate. The revision is proposed in Swadesh (1955). The original list had 215 words. – ishwar  (speak) 19:06, 15 September 2005 (UTC)
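A quick arithmetic check of the counts given in this thread, as a minimal Python sketch. It assumes only what the comments above state: list sizes of 100 and 200, and seven items that appear in the 100-word list but not in the 200-word list. The variable names are made up for illustration.

```python
# The seven items reported above as being in the 100-word list
# but not in the 200-word list.
ONLY_IN_100 = {"breast", "fingernail", "full", "horn", "knee", "moon", "round"}

SIZE_100 = 100  # size of the short list, as stated in the thread
SIZE_200 = 200  # size of the long list, as stated in the thread

# Items common to both lists: everything in the 100-word list
# except the seven listed above.
shared = SIZE_100 - len(ONLY_IN_100)

# Size of the merged list: the 200-word list plus the seven extras.
combined = SIZE_200 + len(ONLY_IN_100)

# Inclusion-exclusion gives the same number: |A| + |B| - |A and B|.
assert combined == SIZE_100 + SIZE_200 - shared

print(shared, combined)
```

If the figures in the thread are right, 93 items are shared and the merged list has 207 entries, which matches the size of the list discussed here.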

concepts across languages

I find it interesting what the page on glottochronology has to say about Swadesh lists:

The process makes use of the Swadesh list, a list of basic lexical terms compiled by Morris Swadesh. This core vocabulary was designed to encompass concepts common to every human language, eliminating concepts that vary by culture and time.

While the concepts might be common to most languages, they are certainly not common to all, and at that, the words each individual language uses could be very different in form and function. I forget the name of it, but there is a language spoken in South America that has almost no number words -- just "one" and "more than one".

Combined with the controversy about the rate of language change, I see some pretty big holes. For instance, aside from my native tongue of English, I am most familiar with Japanese, which is far enough from Indo-European that the Swadesh list does not seem to fit very well. For instance, Japanese has several words for first-person and second-person pronouns, and these have changed over time as the most polite form becomes gradually more commonplace until it is replaced by something new. Kisama used to mean something along the lines of "o honorable noble" and was used as a term of polite address, but using it today could get you into a fight as it now means something more like "you little SOB". One of my dictionaries here suggests this shift took place over the course of about 400 years. Meanwhile, the word yama (mountain) was apparently present in the language about a thousand years ago, and is still in current use.

On top of that, I can think of at least four, possibly five different ways of expressing "if" in Japanese, only one of which includes a separate word that correlates to English "if" (the rest all use specific verb forms).

Has there been any effort to expand on Swadesh's word-list ideas to correct for these issues? --- Eirikr 07:53, 7 Apr 2005 (UTC)

yes, Swadesh reduced his list from 215 to 100 due to European bias. read: Swadesh (1955) & Hoijer (1956). peace – ishwar  (speak) 19:18, 15 September 2005 (UTC)

"Person"

I changed "person" to "man (human being)" because it's all too common to get translations like Latin persona or German Person for this, when what is intended is translations like Latin homo or German Mensch. --Angr/comhrá 07:59, 17 May 2005 (UTC)

WP:NOT

While discussion of the Swadesh list is valuable for WP, including the translations is not encyclopedic, per What Wikipedia is not. That is something clearly more suitable to Wiktionary, and they are already duplicated there for the most part. To reduce wasted duplicated effort, I believe all the lists should be on Wiktionary only, except that perhaps the English list can stay in this article. I started the discussion on Talk:Hindi_Swadesh_list, where I first saw one of the lists before I realized there were many more. In short, I believe we should transwiki all of them to Wiktionary. That is specifically what Wiktionary is for and we should use it for that. - Taxman Talk 16:35, 30 March 2006 (UTC)

This makes sense to me, and is something I've already effectively agreed to in a discussion with Peter Isotalo over on the Japanese Swadesh list Talk page, opting instead to develop one over at wikt:Swadesh list for Japanese.
For the English-language article Swadesh list, I think it might be useful to have the list showing more than just English, as an example of how the list is used in comparison. Perhaps instead of the flat English-only list currently on the page, we could use the example table at wikt:Appendix:Swadesh list? Having a populated table like that, in which we can see many different languages and how the words begin to correlate and how sound differences sometimes follow a pattern, would make it more clear what the list was developed for in the first place.  :) Cheers, Eiríkr Útlendi | Tala við mig 22:01, 30 March 2006 (UTC)
The problem with using that example is that it still basically incorporates a source document that is more effectively placed elsewhere. It's just as beneficial to the reader to point them to Wiktionary. There's also the issue of that table including only European languages and the bias that that represents. But I'm OK with having some kind of example (ideally more balanced) if people want it, as long as we don't have separate lists here at Wikipedia. - Taxman Talk 00:03, 31 March 2006 (UTC)
I understand the example is biased towards IE languages, but that's partly the point -- the list was developed to show relatedness. Comparing English with Swahili, Nahuatl, Wiradhuri wouldn't do any good, as you have to compare with languages that might have at least some interrelation.  :) But then perhaps I misunderstand what you mean by more balanced? If you simply mean that you'd rather the sample list be more along the lines of, say, the wikt:Swadesh lists for Finno-Ugric languages, where the English column is just a reference, that's fine by me. And I definitely agree about not having separate lists here at Wikipedia. Cheers, Eiríkr Útlendi | Tala við mig 15:59, 31 March 2006 (UTC)
I've redirected Afrikaans Swadesh list to wikt:Wiktionary:Swadesh lists for Afrikaans and Dutch, and fixed most of the mistakes and omissions at the Wiktionary list. Anyone brave enough to do the other languages? (Dutch Swadesh list should redirect to wikt:Appendix:Swadesh list, not the Afrikaans/Dutch list.)-- Jeandré,t12:04z
Great, thanks. I certainly don't have time to transwiki them all, but I've started tagging them with Template:Move to Wiktionary to get the process rolling. I won't be able to finish that either, but it's a start. I've done Hindi also. Are cross project redirects considered a good idea though? - Taxman Talk 12:57, 16 April 2006 (UTC)
Transwiki redirects - Well, we have a nice Template:wi which looks better than a simple #redirect [[Wiktionary:whatever]] and is easy to use. As a Wiktionarian and a Wikipedian, I believe Wiktionary could benefit from the move, but we have lots of Swadesh lists already. If nobody else moves or redirects and merges them, then I'll do it. --Dangherous 21:29, 16 April 2006 (UTC)

German equivalent

The German equivalent to Swadesh list is not de:Grundwortschatz. The backlinks from there to many other languages are all wrong.

A Grundwortschatz is a vocabulary containing the 1200 or 2000 most frequently used words of a language. Booklets containing such a Grundwortschatz are used by language learners.

Thus I remove the link and the backlinks. Hirzel 01:22, 9 April 2006 (UTC)

As this wrong link is present in the corresponding pages of the other languages, it's likely that your fix will be undone by a bot in the near future ... So, the links should be fixed in all corresponding pages. Croquant 09:05, 9 April 2006 (UTC)
Correct. For this reason I deleted the last sentence of the first section, which erroneously referred to the usage as core vocabulary. HJJHolm 11:30, 16 January 2007 (UTC)

Cognates?

What if one language has a word that is a cognate with a similar meaning, but not precisely what the Swadesh list has? For example, in Russian отец means "father." In Ukrainian it's батько, but one can say "святый отец" when they mean what in Russian would be described as "батюшка"! So the words have switched meanings! Another example: in Russian one says человек for a male person in the nominative singular, but люди in the nominative plural, and back to человек in the genitive plural, while Ukrainian has the same pattern, but the other way around!! Obviously, these two languages are closer than two languages where these words simply don't match. -Iopq 20:57, 28 June 2006 (UTC)

This is hardly relevant since the method doesn't actually work. It is here as a curiosity at best, so there's not much point in discussing how it should be modified. It cannot be modified into a functioning form since its presumption of steady lexical replacement is false.--AkselGerner (talk) 19:53, 28 February 2008 (UTC)
It means that father and male person are not good examples for the beginning linguist who wishes to use Swadesh lists for comparative purposes.
Swadesh lists are used in comparative linguistics in reconstructing the phonological changes that have occurred since two languages split or since one language borrowed the other's word. So the Spanish word for "word" is palabra but it's mot in French. A little investigation finds that palabre in French means "endless discussion." French mot comes from Latin muttum (a mutter, a grunt) and I'm not sure where mot's cognates appear in Spanish (mutis?). The idea behind Swadesh lists is that they contain the words that speakers are least likely to replace through borrowing.
In the scheme of the comparative method, such semantic shifting doesn't deter a linguist. AEuSoes1 02:44, 5 September 2006 (UTC)
Incorrect terminology! Swadesh's method does not belong to comparative linguistics. Comparative linguistics is a fully viable method of studying genetic relations between languages and, by extension, of reconstructing predecessor language forms, be they sounds, words or grammatical forms. Swadeshian lexicostatistics, however, does not use the comparative method at all, relying instead entirely upon the phonetic similarity of lexical equivalents in the languages studied. Besides using such a quick-and-dirty technique, it also rests on completely false theoretical assumptions that render its results completely useless. Comparative linguistics does not generally involve any kind of dating of linguistic divergence, but lexicostatistics does nothing else. See Lyle Campbell, Historical Linguistics: An Introduction for confirmation.--AkselGerner (talk) 22:33, 28 February 2008 (UTC)
All right, so then what sorts of words do comparative linguists use? — Ƶ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 00:26, 29 February 2008 (UTC)
Any word will do. The point is that comparative linguistics looks at both meaning and sound shape and meticulously reconstructs the sound changes that a language has gone through, looking both at the body of evidence found within the language itself ([internal reconstruction], in fact similar to generative phonology, except that generative phonology mistakenly believes itself to be a synchronic science), and at the body of evidence of related and/or neighbouring languages. For example, while not genetically related to the Indo-European languages (in the sense that no evidence of such a relationship can be shown with any scientific merit), the Finnish language has loanwords from Baltic and Germanic languages that can be used to show the sound changes of their donor languages. In other words, to show that one language is related to another, the vocabulary must first be analyzed in depth, the noise of later borrowings and typologically commonplace changes must be filtered out, and the complete set of sound changes reconstructed until the point where a common-denominator lexicon can be shown to exist. Comparative linguistics is a complete science, not a mere hack like Swadesh's method, and the possibility of very peripheral vocabulary being preserved intact over millennia is not excluded; in fact it can be shown that high-frequency words are more likely to suffer from atrophy than low-frequency words. Low-frequency words are of course more likely to disappear without trace, so they are rarely simultaneously available in distantly related languages. In comparative linguistics the sound shape is more important than the meaning: semantic shift is always possible, but phonetic change is subject to rules, at least more so than semantics (but see also grammaticalization theory and grammaticalization pathways).
There is no widely accepted method of doing what Swadesh tried to do with his infamous lexicostatistical method. No one can date the protolanguages proposed by the comparative method unless there happens to be written evidence, and then only if the written evidence can be dated. This arises from the nature of diachronic investigation: it is like an x-ray vision that completely ignores the synchronic layers, and all dates are by definition synchronic. See [comparative method]. --AkselGerner (talk) 20:28, 29 February 2008 (UTC)
While I disagree with you on a few points, I don't feel like arguing or doing the research to back up my disagreements, especially since it won't have any bearing on the article itself. — Ƶ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 22:16, 29 February 2008 (UTC)
You don't think? If you don't give good arguments I might edit the article. If you give a non-answer like that there's no telling what I might do. However, I'm cutting some of my arguments above because they are not necessary and are available elsewhere. The generative phonology stab, by the way, is completely solid: both methods use the same input (synchronic morphophonological variation), make the same practical assumptions and perform parallel operations. They are identical in all but name. It just so happens that any comparative method (internal reconstruction is the comparative method when applied to morphophonological variations within a single language, although the term is also sometimes used for applying the comparative method to the dialects of a single language) is always diachronic: the abstracting of the results of past sound changes always gives a pre-form, a reconstructed form.
Go ahead and edit the article. If I or anyone else find your edits disagreeable we can talk about them in the talk page. — Ƶ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 02:41, 1 March 2008 (UTC)
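For anyone following this thread who has not seen the calculation that is being criticized, here is a minimal Python sketch of it. It is not taken from the article or from any comment above. It uses the textbook glottochronological formula t = ln(c) / (2 ln(r)), where c is the fraction of list items judged to be shared and r is an assumed constant retention rate per millennium. The default r = 0.86 is an assumption (a figure often quoted for the 100-word list), the function names are hypothetical, and the sample judgments are invented toy data, not real language data. The constant-rate assumption built into r is exactly the premise disputed above.

```python
import math


def shared_fraction(judgments):
    """Fraction of compared meanings judged to be shared (cognate).

    `judgments` maps a meaning to True or False. Meanings that cannot
    be compared are simply left out of the mapping.
    """
    if not judgments:
        raise ValueError("no comparable items")
    return sum(1 for is_shared in judgments.values() if is_shared) / len(judgments)


def divergence_millennia(c, r=0.86):
    """Estimated time since split, in millennia: t = ln(c) / (2 ln(r)).

    c: shared fraction, 0 < c <= 1
    r: ASSUMED constant retention rate per millennium, 0 < r < 1
    """
    if not 0 < c <= 1:
        raise ValueError("c must be in (0, 1]")
    if not 0 < r < 1:
        raise ValueError("r must be in (0, 1)")
    return math.log(c) / (2 * math.log(r))


# Invented toy judgments for illustration only (not real data).
toy = {"water": True, "fire": True, "stone": False, "moon": True}
c = shared_fraction(toy)
print(c, divergence_millennia(c))
```

Note that the whole estimate hangs on r: a different assumed retention rate gives a different date from the same word counts, which is one way of seeing why the objection about steady lexical replacement matters.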

My printable Swadesh

I would like to propose a list here for easy printing; the question came up at WP:RD yesterday.

I you * he we you * they this that here there
who what where when how not all many some few
other one two three four five big long wide thick
heavy small short narrow thin woman man * man * child wife
husband mother father animal fish bird dog louse snake worm
tree forest stick fruit seed leaf root bark flower grass
rope skin meat blood bone fat (n.) egg horn tail feather
hair head ear eye nose mouth tooth tongue fingernail foot
leg knee hand wing belly guts neck back breast heart
liver drink eat bite suck spit vomit blow breathe laugh
see hear know think smell fear sleep live die kill
fight hunt hit cut split stab scratch dig swim fly (v.)
walk come lie sit stand turn fall give hold squeeze
rub wash wipe pull push throw tie sew count say
sing play float flow freeze swell sun moon star water
rain river lake sea salt stone sand dust earth cloud
fog sky wind snow ice smoke fire ashes burn road
mountain red green yellow white black night day year warm
cold full new old good bad rotten dirty straight round
sharp dull smooth wet dry correct near far right left
at in with and if because name

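Notes (the asterisks above, in order of appearance):

  • first "you *": you (singular)
  • second "you *": you (plural)
  • first "man *": man (adult male)
  • second "man *": man (human being)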

What do you think? -- DLL .. T 23:04, 12 October 2006 (UTC)
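A small sanity check on the printable list, as a Python sketch. It copies the list exactly as posted above and counts the entries, treating a trailing "*" or a parenthesized tag such as "(n.)" as part of the preceding word. The helper name parse_entries is made up for illustration.

```python
PRINTABLE = """
I you * he we you * they this that here there
who what where when how not all many some few
other one two three four five big long wide thick
heavy small short narrow thin woman man * man * child wife
husband mother father animal fish bird dog louse snake worm
tree forest stick fruit seed leaf root bark flower grass
rope skin meat blood bone fat (n.) egg horn tail feather
hair head ear eye nose mouth tooth tongue fingernail foot
leg knee hand wing belly guts neck back breast heart
liver drink eat bite suck spit vomit blow breathe laugh
see hear know think smell fear sleep live die kill
fight hunt hit cut split stab scratch dig swim fly (v.)
walk come lie sit stand turn fall give hold squeeze
rub wash wipe pull push throw tie sew count say
sing play float flow freeze swell sun moon star water
rain river lake sea salt stone sand dust earth cloud
fog sky wind snow ice smoke fire ashes burn road
mountain red green yellow white black night day year warm
cold full new old good bad rotten dirty straight round
sharp dull smooth wet dry correct near far right left
at in with and if because name
"""


def parse_entries(text):
    """Split the list into entries, gluing markers onto the previous word."""
    entries = []
    for token in text.split():
        if entries and (token == "*" or token.startswith("(")):
            entries[-1] += " " + token  # e.g. "you *", "fat (n.)"
        else:
            entries.append(token)
    return entries


entries = parse_entries(PRINTABLE)

# The seven items mentioned earlier on this page as being in the
# 100-word list but not in the 200-word list.
extras = {"breast", "fingernail", "full", "horn", "knee", "moon", "round"}

print(len(entries), len(set(entries)), extras <= set(entries))
```

The number of distinct strings is smaller than the number of entries because "you *" and "man *" each occur twice and are only told apart by the notes.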

Swadesh tables

I've created Swadesh list of Slavic languages, which has multiple languages for direct comparison. I invite other editors to contribute to it and to create similar pages for Celtic, Germanic, Indo-Iranian, Afro-Asiatic, Sino-Tibetan, Romance, Finno-Ugric, Turkic, Austronesian, and whatever other language family there's enough information for. For a base table without any words in it, see this edit. Ƶ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 23:05, 19 October 2006 (UTC)

Articles in Nature

I added something about the latest study as that is probably relevant in light of the widespread scientific scepticism. I just realized that it probably belongs in glottochronology rather than here, dagnabbit... and of course it should probably be tuned for a more unbiased tone.--AkselGerner (talk) 20:59, 1 March 2008 (UTC)