Internal reconstruction
From Wikipedia, the free encyclopedia
Internal reconstruction is a system of reasoning for recovering the history of a language. It relies on exactly the same thinking as the comparative method, namely the analysis of patterns of similarity and difference among forms of similar meaning, with the aim of discovering an underlying historical unity. The comparative method's goal is to see if such patterns of similarity and difference can be explained most economically by the theory that two (or more) languages were formerly a single language, as indicated by (a) plausible surmises as to the structural details of the basic elements of that proto-language, as it is called, and (b) a collection of regular or otherwise defensible developments through time that account for the patterns of differences and similarities as seen in the attested languages.
Internal reconstruction's reasoning may be the same as that of the comparative method, but its goals are somewhat different. This is in part from necessity: the data themselves are basically different in the two types of historical analysis. The whole of a language is the comparative method's arena; internal reconstruction is practically limited to those components of a single language which obligingly exhibit alternation, that is, patterns of similarity and difference in meaning-bearing forms, from one environment to another in inflection and derivation. The basic premise is that a meaning-bearing element that alternates between two or more similar forms in different environments was probably a single form in the past, into which alternation was introduced by the usual mechanisms of sound change and analogy. As Lyle Campbell comments at the outset of a whole chapter on internal reconstruction, it is easier to grasp what is at stake from examples than from precept.
To take a simple example from Sanskrit, note the distribution of /n/ (dental) and /ṇ/ (retroflex, i.e. articulated by the tip of the tongue behind the alveolar ridge) in the instrumental singular case of a-stems:
-ena                    -eṇa
kṛśena    sickly        sūryeṇa    sun
rasena    moisture      indreṇa    Indra
nādena    bellow        vīryeṇa    heroism
somena    soma          rāmeṇa     Rama
yajñena   rite          vṛkeṇa     wolf
stomena   praise        agreṇa     front
devena    god           śṛŋgeṇa    horn
ghṛtena   ghee          varuṇeṇa   Varuna
ratnena   jewel         karṇeṇa    ear
hastena   hand          rūpeṇa     beauty
vrajena   stall         gṛheṇa     servant
ardhena   east          raveṇa     roar
                        arbheṇa    small
There are three possibilities here: either /ena/ and /eṇa/ were always different endings; or /n/ is derived from a historically earlier /ṇ/; or /ṇ/ is a replacement of an earlier /n/.
It is always a good idea to look around to see what else is going on in the language. This will quickly reveal that /n/ and /ṇ/ do not play parallel roles in Sanskrit phonology. /ṇ/ is not found in word-initial position, is not found in any root apart from a few which also have /r/ (e.g., raṇ-a-ti "enjoys") or begin with the consonant cluster /kṣ/, and in contrast to the abundance of /n/, it is found in a relatively small number of words otherwise (small indeed, in the earliest texts) such as guṇa- "(inherent) nature".
Returning to our lists, on the basis of these forms there is not much likelihood that /n/ is derived from */ṇ/. For that to be the case, there would have to be some plausible or consistent correlation with something always present in these forms and never present in the forms with /eṇa/, and that is not the case. By contrast, there is always an /r/ in the stems taking /eṇa/, though not necessarily very close to the nasal. That is an encouraging observation, but it is not enough: a number of the forms taking /ena/ also have an /r/, e.g. rasena "moisture", and to make a case for the historical split (see phonological change) of */n/ into /n/ and /ṇ/ we have to explain the forms in which an /r/ occurs but the ending is nevertheless /ena/. The solution is that in all such cases a coronal obstruent intervenes between the /r/ and the nasal. (Coronal consonants are those articulated with the apex, blade, or front of the tongue, viz., dentals, alveolars, retroflexes, palatals; obstruents are the natural class of stops, affricates, and fricatives.) That is, one might think of sounds of these natural classes as in some sense "blocking" the influence of the /r/ on the point of articulation of the nasal, whereas the retroflex articulation is "passed through" all other consonants.
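The conditioning just described can be stated as a small procedure. The Python sketch below is illustrative only: the function name `instrumental_singular` and the two character sets are assumptions made for this example, it operates on the romanized stems listed above, and it models only the /r/ trigger and the coronal-obstruent blocking discussed here, not the full Sanskrit rule.

```python
import unicodedata

# Triggers discussed in the text: consonantal r and syllabic ṛ.
TRIGGERS = set(unicodedata.normalize("NFC", "rṛ"))
# Coronal obstruents assumed to block the effect: dental and palatal stops
# and fricatives, plus retroflex stops. ṣ is deliberately left out because
# the passage above does not discuss it.
BLOCKERS = set(unicodedata.normalize("NFC", "tdscjśṭḍ"))


def instrumental_singular(stem: str) -> str:
    """Return the instrumental singular of a romanized a-stem such as 'deva'."""
    stem = unicodedata.normalize("NFC", stem)
    if not stem.endswith("a"):
        raise ValueError("expected an a-stem ending in 'a'")
    base = stem[:-1]
    # Position of the last r/ṛ in the stem, or -1 if there is none.
    last_r = max((i for i, ch in enumerate(base) if ch in TRIGGERS), default=-1)
    if last_r == -1:
        return base + "ena"
    # Retroflexion passes through everything except a coronal obstruent.
    blocked = any(ch in BLOCKERS for ch in base[last_r + 1:])
    return base + ("ena" if blocked else unicodedata.normalize("NFC", "eṇa"))
```

Under these assumptions, `instrumental_singular("rasa")` yields the dental ending because /s/ stands between the /r/ and the nasal, while `instrumental_singular("rava")` yields the retroflex one. Aspirated stops are handled character by character, so the `d` of `dh` in `ardha` is what counts as the blocker.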
A second example, from Samoan, involves reconstructing lost segments rather than altered ones. Verbs have two forms, called here A and B:
Type A    Type B      Gloss
sio       siomia      surround
lilo      liloia      hide
alofa     alofagia    love
puni      punitia     close
gau       gausia      break
sila      silafia     see
faitau    faitaulia   read
To a historian's eye, it is obvious what is going on, here. The Type B forms all have a suffix /ia/ added directly to the root, with no changes. A sound law applied historically to the endingless Type A forms, however: any original word-final consonant in the endingless Type A forms was lost ("truncated" as the current terminological fashion has it). That is, in pre-Samoan we reason that the Type A forms were in fact *siom, *lilo, *alofag, and so on. The Type B forms were (so far as one can tell from these data) the same as the present-day forms. (The comparative method reconstructs proto-languages; the features recovered by internal reconstruction are often known as pre-languages, as in pre-Samoan, here.)
- Note: this analysis, like much reasoning in historical linguistics, seems to be strongly counter-intuitive. There is a tendency for those inexperienced in linguistic analysis to assume that simplex forms, as in Type A verbs here, are basic; the idea that they might actually be more changed than a derived form goes against common sense. But exactly that history is manifestly the case here, and often in language history.
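The reasoning for Samoan can be checked mechanically against the verb pairs listed above. The following Python sketch is only an illustration: the function names `pre_samoan_root` and `lose_final_consonant` are hypothetical, and the code assumes the orthographic forms as given, with every non-vowel letter counting as one consonant.

```python
VOWELS = set("aeiou")

# (Type A, Type B) pairs from the table above.
PAIRS = [
    ("sio", "siomia"), ("lilo", "liloia"), ("alofa", "alofagia"),
    ("puni", "punitia"), ("gau", "gausia"), ("sila", "silafia"),
    ("faitau", "faitaulia"),
]


def pre_samoan_root(type_b: str) -> str:
    """Strip the suffix -ia from a Type B form to recover the older root."""
    if not type_b.endswith("ia"):
        raise ValueError("expected a Type B form ending in -ia")
    return type_b[:-2]


def lose_final_consonant(form: str) -> str:
    """Apply the sound law: drop a word-final consonant, if there is one."""
    return form if form[-1] in VOWELS else form[:-1]


for type_a, type_b in PAIRS:
    root = pre_samoan_root(type_b)
    # The reconstructed root, after final-consonant loss, must be Type A.
    assert lose_final_consonant(root) == type_a
    print(f"*{root} > {type_a}")
```

Every pair passes the check, which is the sense in which a single sound law accounts for all the Type A forms at once.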
As an example of the shortcomings of internal reconstruction, consider the following forms from Spanish (in phonemic transcription):
Infinitive   3rd person sg   Gloss
bolbér       buélbe          (re)turn
probár       pruéba          test
dormír       duérme          sleep
morír        muére           die
ponér        póne            place
doblár       dóbla           fold
goθár        góθa            enjoy
korrér       kórre           run
One pattern of inflection here shows alternation between /o/ and /ue/, the other has /o/ throughout. The lexical items are all basic, i.e. not technical or high register or obvious borrowings, so their behavior is likely to be a matter of inheritance from an earlier system rather than the result of some native pattern overlaid by a borrowed one.
One might guess that the difference between the two sets can be explained by two aboriginally different markers of the 3rd person singular, but a basic principle of linguistic analysis is that you can't analyze data you don't have (and shouldn't try to). Besides, positing such a history violates the principle of parsimony: it adds a complication to the analysis unnecessarily, a complication moreover whose chief result is to restate the observed data as a sort of historical fact. That is, the result of the analysis is the same as the input. And as it happens, the forms as given yield readily to real analysis, so there is no reason to look elsewhere.
The first assumption is that in pairs like bolbér/buélbe the root vowels were originally the same. We have two choices if we stick to the data: either something happened to make an original */o/ turn into two different sounds in the 3rd person singular, or else the distinction in the 3rd sg. is original, and the vowels of the infinitives are in what is called a neutralizing environment (i.e., where an original contrast is lost because two or more elements "fall together", i.e., coalesce into one). There is no way of telling ("predicting" as the jargon has it) when /o/ will break to /ué/ and when it will remain /ó/ in the 3rd sg. On the other hand, starting with /ó/ and /ué/ as givens, we can write an unambiguous rule for the infinitive forms: /ué/ becomes /o/. And one might notice further, upon looking around in Spanish, that the nucleus /ue/ is found only in tonic syllables anywhere, not just in verb forms.
This analysis gains plausibility from the observation that the neutralizing environment is atonic, whereas the nuclei are different in tonic syllables. This fits with the commonplace that vowel contrasts are often preserved differently in tonic and atonic environments, and further that the usual relationship is that there are more contrasts in tonic syllables than in atonic ones (owing to previously-distinctive vowels having fallen together in the atonic environment).
The idea that original */ue/ might fall together with original */o/ is unproblematic; so we internally reconstruct a complex nucleus *ue which remains distinct when tonic and coalesces with *o when atonic.
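The argument about direction of change rests on predictability, and that can be tested directly on the eight verbs above. In the Python sketch below, the helper names `root_nucleus` and `is_predictable` are assumptions made for this illustration, and the first vowel sequence of each form is taken to be the root nucleus, which holds for these data but is not a general rule.

```python
import re
import unicodedata
from collections import defaultdict

# (infinitive, 3rd person singular) in the phonemic transcription used above.
PAIRS = [
    ("bolbér", "buélbe"), ("probár", "pruéba"), ("dormír", "duérme"),
    ("morír", "muére"), ("ponér", "póne"), ("doblár", "dóbla"),
    ("goθár", "góθa"), ("korrér", "kórre"),
]

NUCLEUS = re.compile(unicodedata.normalize("NFC", "[aeiouáéíóú]+"))


def root_nucleus(form: str) -> str:
    """Return the first vowel sequence of a form (the root nucleus here)."""
    match = NUCLEUS.search(unicodedata.normalize("NFC", form))
    if match is None:
        raise ValueError(f"no vowel found in {form!r}")
    return match.group()


def is_predictable(mapping) -> bool:
    """True if every input nucleus corresponds to exactly one output."""
    return all(len(outputs) == 1 for outputs in mapping.values())


tonic_to_atonic = defaultdict(set)
atonic_to_tonic = defaultdict(set)
for infinitive, third_sg in PAIRS:
    atonic, tonic = root_nucleus(infinitive), root_nucleus(third_sg)
    tonic_to_atonic[tonic].add(atonic)
    atonic_to_tonic[atonic].add(tonic)

print(is_predictable(tonic_to_atonic))  # tonic nucleus determines the atonic one
print(is_predictable(atonic_to_tonic))  # atonic /o/ has two tonic counterparts
```

The mapping from tonic to atonic nuclei is a function, whereas the reverse is not, which is why the tonic forms are taken as the starting point of the reconstruction.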
- This reconstruction, however, while accurate in principle, is erroneous in detail. The comparative method, with its access to vastly richer data, comes to a different conclusion: when it is applied to the languages that make up Western Romance, it becomes apparent that there were in fact two kinds of mid back rounded vowel, a higher one written *ọ in traditional Romance notation, and a lower (more open) one traditionally written *ǫ (i.e., *ɔ). In Spanish, the latter breaks in tonic syllables to /ue/, but the two degrees of tongue height are lost in atonic syllables. That is as far as the comparative method can take us; an inspection of Latin itself reveals that PWRom. *ọ is itself the product of a merger, between Latin ŭ and ō, while Latin ŏ becomes PWRom. *ǫ. (Neither the comparative method nor internal reconstruction points to a historical vocalic system with a contrast of length, an instructive example of the limitations of even the remarkably powerful comparative method to recover prehistory.)
Finally, consider a morphological problem (though inevitably with a phonological angle).
For roots ending in apical stops, i.e. /t d/, English has two patterns for forming the preterite tense (omitting some details, and excluding ablauting verbs like sit/sat, bind/bound). Note that in the following data, meet/met is not an ablauting verb, as shown by the fact that its participle is met, not ˣmetten or some other form ending in -en.
Type I                  Type II
wait      waited        put      put
reflect   reflected     set      set
greet     greeted       cut      cut
fret      fretted       cast     cast 'throw'
rent      rented        meet     met
note      noted         bleed    bled
waste     wasted        rid      rid
adapt     adapted       shed     shed
regret    regretted     send     sent
fund      funded        bend     bent
found     founded       lend     lent
grade     graded
abide     abided
plod      plodded
blend     blended
end       ended
Now, of course, English has very little affixal morphology, but what there is includes a marker of the preterite (apart from verbs with vowel changes of the find/found sort), and the great majority of verbs that end in /t d/ take /əd/ as the marker of the preterite, as seen in Type I.
Can we make any generalizations about the membership of verbs in Types I and II? Most obviously, the Type II verbs all end in /t/ and /d/, though that is just like the members of Type I. Less obviously, they are all without exception basic vocabulary. Note well that this is a claim about Type II verbs and not a claim about basic vocabulary: there are basic home-and-hearth verbs in Type I, too. But there are no denominative verbs in Type II, that is, verbs like to gut, to braid, to hoard, to bed, to court, to head, to hand. There are no verbs of Latin or (a little harder to spot) of French origin; all stems like depict, enact, denote, elude, preclude, convict are Type I. Furthermore, all novel forms are inflected as Type I: all native speakers of English would presumably agree that the preterites of to sned and to absquatulate would most likely be snedded and absquatulated.
The inference from these considerations is that the absence of a "dental preterite" marker on roots ending in apical stops in Type II reflects a more original state of affairs, i.e., that in the early history of the language the "dental preterite" marker was in a sense absorbed into the root-final consonant when it was /t/ or /d/; the affix /əd/ after word-final apical stops then belongs to a later stratum in the evolution of the language. The same suffix is involved in both types, but with a 180° reversal of "strategy": other exercises of internal reconstruction would point to the conclusion that the aboriginal affix of the dental preterites was /Vd/ (where V = a vowel of uncertain phonetics, and of course an inspection of Old English directly would reveal several different stem-vowels in the mix). In modern formations, it is stems ending in /t d/ that preserve the vowel of the preterite marker; in an earlier day, odd as it may seem, the loss of the stem vowel had already taken place prehistorically whenever the root ended in an apical stop.
Not all synchronic alternation is amenable to internal reconstruction. Even though cases of secondary split (see phonological change) often result in alternations that signal a historical split, the conditions involved are usually immune to recovery by internal reconstruction. For example, the alternation of voiced and voiceless fricatives (and their reflexes) in Germanic languages, as described in Verner's law, is absolutely impervious to explanation given only an examination of the Germanic forms themselves. This is in fact a general characteristic of secondary split, though occasionally internal reconstruction can work. Primary split (see phonological change again) in principle is recoverable by internal reconstruction whenever it results in alternations, but later changes can render the conditioning irrecoverable. And not much is required to have that result.
For example, take the enormously common alternation between diphthongs (Middle English long vowels) and short vowels in modern English: divide/division /ay/ ~ /i/; serene/serenity /iy/ ~ /e/; opaque/opacity /ey/ ~ /æ/; provoke/provocative /ow/ ~ /a/ (British /ew/ ~ /ɔ/); produce/production /uw/ ~ /ə/ (Brit. /ʌ/); prove/probable /uw/ ~ /a/ (Brit. /ɔ/); pronounce/pronunciation /aw/ ~ /ə/ (Brit. /ʌ/); point/punctuation /oy/ ~ /ə/ (Brit. /ʌ/). There is nothing on the surface to suggest what might have brought these things about. (In fact, on the basis of other evidence we know that alternation between long and short vowels in English has at least seven different sources, the earliest dating back to Proto-Indo-European itself.)
- Note: apart from /ay aw oy/, the offglides of modern standard forms of English are fairly recent developments. None of the 17th century orthoepists say anything that even hints at the presence of the offglides that have become very prominent in English, and in fact there are large speech communities (Ireland, for one) where such offgliding, if it occurs at all, is phonetically very different, and in several varieties of English in the American South, the reflex of earlier /ay/ is [α:] (phonemically /α/). But it would be a misuse of technical terms to call the tonic vowels of see say sew and Sue in standard forms of English "long vowels", so for simplicity's sake "diphthong" it will have to be.
In the huge body of Latinate forms, such alternations as those seen in divide/division, recognize/recognition, compete/competitive have to do, in a nutshell, with Middle English long vowels in tonic syllables and short ones in atonic ones. In forms like divide, serene, compete, opaque, the tonic accent has always been where it is today, on a Middle English long vowel that has turned into a diphthong. But in words known to us as division, serenity, competitive, opacity, and the like, the present-day tonic vowel was atonic in Chaucer's day. The tonic accent in such words lay two syllables to the right of today's tonic syllable. At some point in English history (changes of this sort are very hard to trace in written texts), any tonic accent that was three or more syllables from the beginning of a word moved two syllables to the left. Thus serenité, competitíf (-íu, -íve, -íue) underwent what is called the Three Syllable Accent Retraction rule to become the forms we have today, with the tonic accent on a formerly atonic (and therefore short) vowel.
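The retraction rule as stated is simple enough to write down. The Python sketch below is a hedged illustration, not an established formalization: the function name `retract_accent` is hypothetical, the syllable divisions are supplied by hand, and "three or more syllables from the beginning" is read as "on the third syllable or later", an interpretation chosen because it gives the attested outcomes for the examples in the text.

```python
from typing import Sequence


def retract_accent(syllables: Sequence[str], accent: int) -> int:
    """Return the accent position after the retraction described above.

    `accent` is a 0-based index into `syllables`. An accent on the third
    syllable or later moves two syllables to the left; otherwise it stays.
    """
    if not 0 <= accent < len(syllables):
        raise ValueError("accent index out of range")
    return accent - 2 if accent >= 2 else accent


# Hand-syllabified Middle English shapes with the older accent position.
EXAMPLES = [
    (["se", "re", "ni", "te"], 3),     # serenité
    (["com", "pe", "ti", "tif"], 3),   # competitíf
    (["di", "vi", "si", "oun"], 3),    # divisioun
    (["di", "vide"], 1),               # divide: accent too early to move
]

for syllables, old in EXAMPLES:
    new = retract_accent(syllables, old)
    print("-".join(syllables), "accent:", syllables[old], "->", syllables[new])
```

In each four-syllable example the accent lands on the second syllable, the one that is tonic today, while divide is untouched because its accent never stood far enough into the word.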
Now, the facts as recounted so far owe nothing to internal reconstruction; in fact, they cannot. The tonic vowels are short in such words as division, serenity, competitive and the rest, and an inspection of the present-day forms yields no reason to suspect that the tonic accent was ever anywhere but where it is now, and no reason to locate it two syllables to the right, or anywhere else in particular. This is partly because a huge list of such forms (vision, concession, division, e.g.) have lost a syllable, subsequent to the accent retraction: prior to the three-syllable retraction event, these words all scanned /-siûn, -ziûn/ (where /û/ = long, tonic) to judge from Chaucer's orthography, visioun and the like, and his meter. At some point after the retraction, the suffixes in question underwent a reduction of syllable count via what is called synizesis: /si/ and /zi/ became /sy/ and /zy/ before vowels, and later (to cover the tracks still further) the clusters simplified to /š/ /ž/.
And if this weren't obfuscation enough, there are a large number of forms that do not conform to this historical account. Some are individual puzzles, like why nation and completion have long vowels in what were historically atonic syllables and therefore necessarily short; others are whole classes, such as all the numerous nouns in -ation (divination, peroration and so on) which likewise have the reflex of a long vowel where the rule would predict a short one. A form like national, innocent as it looks, is actually a double conundrum: from an original /nasionâl/ (and why is the root vowel short, this time?), the synizesis of /sio/ to /syo/ would have to have preceded the accent retraction for the three syllable rule in order to yield the right accent placement; but in long lists of similar forms, as alluded to above, synizesis must have taken place only after retraction, for the three-syllable rule to be valid.
- Note 1: the change of Latin -ti- and -di- to /si/ /zi/ before vowels took place long before this vocabulary was taken into English, largely through French; it was the uniquely French tradition of retaining insofar as possible standard Latin spelling that adds to the complex fit between letter and sound in the Latinate vocabulary of English. Latin itself was written phonemically; that is, vowel length apart, the letters faithfully track all the vowel reductions, consonant alternations, and so on, that preceded the establishment of standard spelling. In English, such forms underwent a progressively greater separation of pronunciation from letters via English sound laws (see Spelling pronunciation).
- Note 2: some of the inconsistencies of these alternations can be dealt with in descriptive morphophonological analysis by positing a difference between base forms with long vowels and those with short ones. Thus compete, divide, crime are equipped with inherently long vowels, which become short when atonic as a result of the accent lying elsewhere in the formation, as in competitive, division, criminal, serenity, probable. Forms like profess, regret, omit, and their like, have short vowels underlyingly and therefore have no surface forms in English with long vowels (= diphthongs).
- Whatever its merits, this approach must not be confused with historical analysis, despite the superficial similarities (if you reterm "historical forms" as "underlying forms" which, as it happens, would be irrecoverable without the aid of historical grammars). Early English and late Proto-Germanic borrowings from Latin and Greek accurately retain the vowel length of the source forms, but at the time when Latinate words of the divide/division kidney flooded into the English lexicon, none of the source languages had a contrast of vowel length, and no source language ever had a long vowel in the antecedents of English compete, state, basis, mania, data, recent, decent, provoke, provide, apex, reconcile, revere, convene and hundreds of others. Put differently, the long vowels in these forms, the source of present-day diphthongs, are all Made in England. The fact that the Middle English long vowels in the forms that give us nation, divine, crime, legal, severe and hundreds of others occur in words which, in Latin, actually did have long vowels corresponding to the Middle English ones, is beside the point: they have the same Made in England history, since there is no way that the prosodic vowel features of say Latin crīmen could possibly materialize through direct continuity as the vowel of Middle English cryme, English crime /kraym/.
The Role of Internal Reconstruction
In the case of languages whose histories are well understood, either via the comparative method or historical attestation of significant time-depth, internal reconstruction is little more than an entertaining parlor-game, at best a kind of test to see if the data and the reasoning applied to them actually "work"; that is, actually conform to what is known about the history of a language from other sources. It also serves as a reminder that, as in the example from Spanish above, the likeliest inferences from such an analysis do not necessarily recover the best history. (In the Spanish case, the result of the best analysis was correct in principle but faulty in detail.) When undertaking a comparative study of a hitherto un(der)analyzed family of languages, it is however worthwhile to get an understanding of their systems of alternations, if any, before tackling the greater complexities of analyzing entire linguistic structures. For example, the Type A forms of verbs in Samoan (as in the example above) are the citation forms, i.e., the forms in dictionaries and word lists, but when making historical comparisons with other Austronesian languages it would be a blunder to use Samoan forms with parts missing. (And an analysis of the verb sets would alert the researcher to the likelihood that many other words in Samoan have lost a final consonant.) Another way of looking at it is that internal reconstruction gives access to an earlier historical stage, at least in some details, of the languages being compared, and this can be valuable: the more time that passes, the more changes accumulate in the structure of a (living) language, and for this reason we always try to use the earliest known attestations of languages when working with the comparative method.
- Note: using internal reconstruction as a test on method is valid only if one doesn't cheat, that is, slant one's arguments to an outcome already known from other sources. But the impossible complexity of English vowel lengthening and shortening has been "analyzed" in just such a manner. Similarly, palatalization in Sanskrit reduplication (ca-kar- "did" to root kṛ-, ja-gam- "went" to root gam-) has been held to show that Indic /a/ must have two different prehistoric sources, one palatalizing and one not, thereby reconstructing a historical merger. And this is the actual fact, as recovered by the comparative method; but as for internal reconstruction, the problem with the reasoning is that the exact same pattern is seen for Sanskrit /i/ and /u/: ci-kit- "noticed", cu-kup- "was angry", and the same reasoning would argue for two kinds of pre-/i/ and pre-/u/ as well which, however, is not in accord with better evidence for the history of the language. That is, two-thirds of the conclusions honestly reached by this analysis are false, which is not a good score. Indeed, a more sober analysis (but one that as it happens is even more inaccurate, historically) would capitalize on the dissimilation seen in the reduplication of aspirates (da-dhā- "put" , ja-ghas- "swallowed", etc.) and propose, not pairs of pre-vowels, but a dissimilatory change of dorsal to palatal articulation. To repeat a point made earlier, not every alternation can be accounted for by internal reconstruction.
Internal reconstruction, when not a sort of preliminary to the application of the comparative method, is most useful in cases where the superior analytic power of the comparative method is unavailable.
Among such cases are reconstructed proto-languages themselves. [A reminder: the point of the comparative method is to determine whether two or more languages were the same language at some point in time; this "same language" is known nowadays as a proto-language. The historical claims of internal reconstruction relate to what is commonly known as a pre-language. Thus Proto-Indo-European is the fruit of the comparative method applied to the several dozen languages that make up the family; pre-Indo-European would be the term for the historical forms predating Proto-Indo-European as recovered by internal reconstruction.] The Proto-Indo-European that we can reach by the comparative method has nominatives singular of masculine and feminine nouns marked by the ending *-s, apart from two transparently secondary feminine stem-types (see laryngeal theory, end). Two other classes of exception are r-stems and n-stems, which lack *-s in the nominative singular (and the n-stems lack an n as well); and both types have, in addition, a long stem-vowel unique to the nominative singular: *ḱwon- "dog" has stems *ḱon- and *ḱun-, depending on case and number, but nominative singular *ḱwō; *swesor- "(unmarried) sister" likewise has a paradigm with stems *swesor- and *swesr- apart from the nominative singular, which is *swesōr.
It would seem to be an obvious inference that the pre-Indo-European form of these stem-types would have been **ḱwons and **swesors (two asterisks to denote a reconstruction based on a reconstruction), with simplification of the final cluster and compensatory lengthening; the loss of the *n altogether, moreover, fits with a widely-observed proclivity of nasals to drop before fricatives.
This analysis gets some support from (or, perhaps, encourages a certain analysis of) the most archaic form recoverable by the comparative method of the PIE nominative singular of the word for "heart", namely *ḱēr (stem *ḱṛd-): Greek (Homeric) kêr, Hittite HEART-ir, even Sanskrit (Vedic) hârdi. [Note: the form favored by the Greek tragic poets, kéar, is a fantasy of folk etymology.] We might see the original nominative as having been **ḱerd (neuter consonant-stems are endingless in the nominative/accusative singular) whence probably something like **ḱerr and finally the form we reached by comparative reconstruction, *ḱēr. Vedic hârdi is then the reflex of this form (with a modification of the initial consonant to *ǵh), the root-final *-d restored by leveling analogy from the oblique cases—together with a prop-vowel to keep it there, given that the form would have been unpronounceable otherwise (word-final consonant clusters were early truncated). That is, pre-Vedic *źhārdə, whence attested hârdi (and incidentally providing a hint that syllabic laryngeals in Proto-Indo-Iranian were indeed, originally, some sort of schwa that was nudged up a tongue-height notch when PInIr. *a became [ə]; see laryngeal theory.)
Internal reconstruction can also draw limited inferences from peculiarities of distribution. Even before comparative investigations had sorted out the true history of Indo-Iranian phonology, some scholars had wondered if the extraordinary frequency of the phoneme /a/ in Sanskrit (20% of all phonemes together, a stupendous total) might point to some historical fusion of two or more vowels. (In fact, it represents the final outcome of five different Proto-Indo-European syllabics two of which—the syllabic states of /m/ and /n/—can be discerned by the application of internal reconstruction.) But in such cases, internal analysis is better at raising questions than at answering them.
A possibly more fruitful case is an observation relating to the six PIE resonants (or semivowels) *y *w *r *l *m *n. These are consonantal or vocalic (i.e., syllabic), depending on environment as defined by neighboring true consonants, true vowels, and perhaps one or more other resonants. Between two vowels, in a sequence of two resonants both are consonantal, *deyw-o- "divine" for example. In all other situations, one of the resonants is syllabic, the other consonantal, in accord with general rules. With one peculiar exception. PIE *w- is found in word-initial position followed by (consonantal) *y, *r, or *l (but never a nasal): *wyet(H)- "hesitate", *wreH- "declare", *wleH- "twist". This is very peculiar. It makes *w look a little bit like a stop. (And note that initial clusters of stop + nasal are rare or non-existent in IE; the apparent root *pnew-, for example, the basis of words having to do with wheezing, sneezing, and breathing, is generally regarded as onomatopoeic, and onomatopoeic words often violate the usual patterns of distribution). Now, the remarkable rarity of PIE *b, especially in root-initial position, is a commonplace of Indo-European reconstruction. Likewise the remarkable abundance of *w- in root-initial position: there are no fewer than thirteen roots (with subdivisions) reconstructed as *wer-, eight as *wel-, ten as *wes- (and this in a lexicon notable for the rarity of homophonous roots altogether). Some of these are doubtless false reconstructions, and many of the homophonies are surely the result of undetected root-initial or root-final laryngeals. But these purely distributional facts plus the peculiar phonotactics connected with *w invite one to wonder if at least some of the reconstructed cases of *w in root-initial position were not originally *b.
- Note: this is not a claim that all cases of *wy-, *wr-, *wl- continue **b-. Once a phonological type becomes established, by whatever means, novel creations are always possible. So far as we know, for example, Old English had no diphthong [aw], and such a shape eventually appeared only as a changed later form of earlier /ū/. But not all present-day examples of /aw/ continue earlier /ū/: chow, powwow, pout, kowtow, Howitzer, etc., etc.
Lyle Campbell (who devotes a whole chapter in the book cited below to internal reconstruction) raises an interesting caution: if internal reconstruction is applied to members of a compact subgroup prior to the exercise of comparative analysis, there is a risk that a shared innovation definitive of the subgroup itself will be analyzed out of existence. His example is consonant gradation in Finnish, Estonian, and Lapp (Saami). A pre-gradation phonology can be discerned for each of the three groups via internal reconstruction, but in fact it was manifestly an innovation in the Baltic Finnic branch of Finno-Ugric, not of the individual languages, and indeed it was one of the innovations defining that branch. This fact would be missed if the comparanda of the Finno-Ugric family included as primary data the "de-graded" (if you will) states of Finnish, Estonian, and Lapp.
This is an interesting point, and an insightful one, but it does not portend any serious problems. Even if such a mistake were to be made, sooner rather than later a historian would notice the result: that nearly identical sound laws were being formulated for each of several closely-related languages. Such things do happen in actual fact, with the spread of areal features, or with commonplaces (say, devoicing in word-final position), but the whole point of setting up subgroups, branches, and so on, in the first place, is that it is more plausible that a phonological (or morphological) innovation, particularly a complex or unobvious one, took place only once in the history of the group—i.e., in the speech community of a proto-branch—rather than separately and repeatedly in a whole array of daughter languages. (And Finnic consonant gradation is in the character of a complex and unobvious innovation.) That is, the blunder warned of by Campbell is harmless enough, given that its mischief would necessarily be temporary because soon noticed and corrected.
References
- Campbell, Lyle (2004). Historical Linguistics: An Introduction, 2nd ed., Cambridge (Mass.): The MIT Press. ISBN 0-262-53267-0 (U.S.).