The Grammaticalization of Indefinite Pronouns
Abstract and Keywords
This chapter discusses the grammaticalization of indefinite pronouns, focusing on the ways in which such pronouns arise and change over time in different languages and the regularities in these changes. It first considers diachronic typology before describing four main source constructions for indefiniteness markers: the ‘dunno’ type, the ‘want/pleases’ type, the ‘it may be’ type, and the ‘no matter’ type. It then examines the six parameters of grammaticalization, three of which are paradigmatic (integrity, paradigmaticity, paradigmatic variability) and three of which are syntagmatic (scope, bondedness, syntagmatic variability). It also looks at desemanticization, with particular emphasis on three competing theories of semantic grammaticalization, before concluding with an overview of the indefinites that express the free-choice functions and their use as true universal quantifiers.
In this and the following chapter, I study the ways in which indefinite pronouns arise and change over time in different languages and the regularities in these changes. There are two main reasons for engaging in such a study of diachronic typology. First is the diachrony itself. Language change is a universal and essential feature of human language, and by studying the general laws of language change, we learn much about human language. But secondly, diachronic typology also helps us understand synchronic language states better. All languages are constantly in a process of change, in a kind of flux, and many features that do not fit neatly into a synchronic system begin to make sense once a diachronic point of view is taken. This applies both to recent innovations and to remnants of earlier regularities that are no longer synchronically motivated. Languages can carry around such synchronic irregularities for many generations, and if our goal is the explanation of linguistic structures, we have to take diachronic explanations into account.
Most importantly for my purposes, there is often a close correspondence between the generalizations obtained from synchronic and from diachronic typological studies, so that the results from such studies reinforce each other. For instance, in Hawkins’s (1983) study of word-order universals, the implicational hierarchies that account for the cross-linguistic distribution of word-order patterns also make correct predictions about possible diachronic changes, as Hawkins shows. If a language acquires new word-order patterns, it acquires them in accordance with the order of the implicational hierarchy. Quite analogously, implicational maps that account for the cross-linguistic distribution of different functions of grammatical categories also constrain the possible diachronic changes: a category can acquire a new function only if that function is adjacent on the semantic map to some function that the category already covers. Semantic change of grammatical categories is ‘incremental’ (cf. Croft et al. 1987), and grammatical categories gradually extend their uses along the paths allowed by the map.
This correspondence between synchronic and diachronic typology can also be observed in the case of indefinite pronouns. As I will show in this chapter (especially § 6.4), the extension of indefinite series to new functions proceeds along the paths permitted by the implicational map of Chapter 4.
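The adjacency constraint on semantic change can be made concrete with a small sketch. The map below is a toy graph with placeholder function names (f1 to f4) and invented edges; it is an assumption for illustration only and is not the implicational map of Chapter 4. The helper names `can_acquire` and `is_incremental` are likewise hypothetical.

```python
# Toy semantic map: an undirected graph whose nodes are functions.
# Node names and edges are illustrative assumptions, NOT the map of Chapter 4.
TOY_MAP = {
    "f1": {"f2"},
    "f2": {"f1", "f3"},
    "f3": {"f2", "f4"},
    "f4": {"f3"},
}


def can_acquire(covered, new_function, semantic_map=TOY_MAP):
    """Return True if new_function is not yet covered and is adjacent
    on the map to at least one function the category already covers."""
    if new_function in covered:
        return False
    return any(new_function in semantic_map[f] for f in covered)


def is_incremental(stages, semantic_map=TOY_MAP):
    """Check a sequence of diachronic stages (sets of functions).

    Every function added between two consecutive stages must be adjacent
    to a function covered at the earlier stage. Losses of functions are
    not checked here.
    """
    for before, after in zip(stages, stages[1:]):
        for new_function in after - before:
            if not can_acquire(before, new_function, semantic_map):
                return False
    return True


if __name__ == "__main__":
    # Extension step by step along the map is allowed ...
    print(is_incremental([{"f1"}, {"f1", "f2"}, {"f1", "f2", "f3"}]))
    # ... but skipping over an intermediate function is not.
    print(is_incremental([{"f1"}, {"f1", "f3"}]))
```

Under these assumptions, a series that covers f1 may extend to f2 but not directly to f3, which is the sense in which change is 'incremental'.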
One problem for the diachronic–typological study of indefinite pronouns at this stage is that there are very few specialized studies of diachronic change in indefinite pronouns in individual languages. While I was able to make use of quite a few studies on individual languages for the synchronic distribution, I had to rely on other kinds of evidence for the diachronic study. First, some of the large historical grammars of the major European languages do contain a limited amount of relevant information. Secondly, I use comparative evidence from closely related languages. Thirdly, etymological information, which is available for many languages, also gives us valuable insights, especially about the source construction from which an indefinite pronoun was derived.
A key concept for understanding the genesis and later development of indefinite pronouns is that of GRAMMATICALIZATION (cf. Lehmann 1982a; 1995; Heine et al. 1991; Hagège 1993; Hopper and Traugott 1993, among many others). Research in diachronic typology is to a large extent concerned with various types of grammaticalization. Grammaticalization is the unidirectional gradual diachronic change by which a lexical–syntactic source construction loses its autonomy and is integrated into the grammar. In the following section (§ 6.2), I identify four main source constructions from which the grammaticalization of indefinite pronouns started. In a next step (§ 6.3–4), I will relate these changes to general properties of grammaticalization, and discuss the consequences for different theoretical accounts of grammaticalization changes.
6.2. Source Constructions for Indefiniteness Markers
The grammaticalization processes are particularly interesting in the case of a large subclass of interrogative-based indefinite pronouns.1 Accordingly, I will concentrate the discussion on such indefinites in the following subsections. In Chapter 7 I will then discuss further diachronic sources of indefinite pronouns that cannot be subsumed under grammaticalization.
6.2.1. The ‘dunno’ type
6.2.1.1. The source construction
Some indefiniteness markers that combine with interrogative pronouns have arisen from a clause with the meaning ‘I don’t know’, or similar. This type is especially well attested in European languages. Some cases are shown in (289). (Here and in many of the following examples, one indefinite pronoun, often the one denoting a person, stands for the whole indefinite series.)
A variant of the explicit negation ‘I don’t know’ is the rhetorical question ‘who knows?’, which by way of a conversational implicature renders the same meaning. This type of source construction has been strongly grammaticalized in Lithuanian, while in the other languages where I have found it, it is quite rare and highly expressive, i.e. it is still weakly grammaticalized.
In Albanian, the strongly grammaticalized indefiniteness marker di- (e.g. di-kush ‘somebody’, di-ç ‘something’, di-ku ‘somewhere’) is identical to the root di- ‘know’. It is unclear whether this goes back to type (289) (with the negation omitted) or to type (290) (with ‘who’ omitted) or perhaps to some third type (e.g. a rhetorical polar question like ‘do I know who?’).
Another highly expressive variant of this type is an expression like ‘God knows wh-’ or even ‘the devil knows wh-’. I know of no case where such a source construction has been grammaticalized strongly.
The scarcity of attested cases of ‘dunno’-indefinites outside of Europe might lead one to suggest that such indefinites are an areal feature typical of Europe.2 But in view of the unsatisfactory documentation of most non-European languages, this conclusion may turn out to be premature.
It is not difficult to see why an expression like ‘I don’t know’ should come to mark an indefinite pronoun: indefinite pronouns are typically used when the referent is unknown to the speaker (though it need not be unknown; cf. § 3.2.4). The diachronic scenario of its development is also easy to reconstruct: the original source structure is an indirect parametric (or ‘wh-’) question embedded in the matrix clause ‘I don’t know’ where the greater part of the embedded question is omitted because it is obvious from the context (this type of contextual omission was called ‘sluicing’ in Ross (1969); cf. also von Bremen (1983: 118–19)).
In a next step, this sentence is inserted into another sentence where the interrogative pronoun occupies some syntactic position.
This is a kind of ‘syntactic amalgam’ of the type studied in Lakoff (1974) (cf. Lakoff’s example I saw you’ll never guess how many people at the party). One could propose a constituent structure as in (294) for it (cf. Espinal 1991: 748–9), where there are two separate structures with their own root node that share one constituent (the NP what). Whatever the best syntactic analysis of such constructions is, they are obviously highly marked and prone to reanalysis. The erstwhile matrix clause ‘I don’t know’ is reanalysed as an indefiniteness marker, and when this way of expressing indefinite pronouns becomes more frequent, it may undergo quite radical phonological reduction, as shown by the Bulgarian, Romanian and especially the Old Norse example.
6.2.1.2. The original meaning
Now let us ask what kind of indefinite meaning we get from the source ‘I don’t know’. Obviously we do not get the meaning ‘known to the speaker’, but can we be more specific? We can: in all the other functions distinguished in § 3.3, the referent is unknown to the speaker, but this source only gives rise to the specific-unknown function. The reason is that in all the non-specific functions it would be nonsensical for the speaker to state that he or she does not know the referent because if the referent is non-specific, nobody could possibly know it. This can also be illustrated by the impossibility of adding I don’t know wh- when the referent of the indefinite pronoun is non-specific. Thus, while (295a–b) are possible (like 292), (296a–c) are anomalous.
When we look at the way ‘dunno’-indefinites are used, we see that the prediction that they have the specific-unknown function is confirmed.
6.2.2. The ‘want/pleases’ type
6.2.2.1. The source construction
Some indefiniteness markers that combine with interrogative pronouns go back to an expression meaning ‘want’ or ‘pleases’ or similar. Examples are given in (298). (Again, the indefinite pronoun denoting a person generally represents its series.)
For these indefinite pronouns, I hypothesize the source constructions in (299) as the starting-point of the grammaticalization process.
(299a) and (299b) are semantically equivalent and differ only in that the addressee is the subject of the predicate ‘want’ in (299a) but the object of the predicate ‘please’ in (299b). The predicate (‘want/pleases’) is in both cases the predicate of a NON-SPECIFIC FREE RELATIVE CLAUSE4 which serves as an argument of the main clause.
In contrast to the previous case, this source construction does not require any syntactic restructuring before it can be turned into a grammatical marker, the reason being, of course, that the original construction consists of a subordinate clause, rather than a superordinate clause as in § 6.2.1.
The hypothesized source constructions also explain why the indefiniteness markers in this section are suffixes, whereas the indefiniteness markers from ‘I don’t know’ are prefixes. And that these indefinites, too, are based on interrogative pronouns finds a natural explanation in the fact that in many languages, non-specific free relative clauses are formed with an interrogative-based relative pronoun (cf. Lehmann 1984: 326).
As in ‘dunno’-indefinites, it is pretty obvious why source constructions like (299a–b) should be chosen for an indefinite pronoun. A sentence like (299a) is a close paraphrase of a sentence with a free-choice indefinite pronoun like You may take anything, and in § 3.3.3 we saw that languages lacking free-choice indefinites use precisely such constructions to render the same meaning.
Thus, we expect ‘want/please’-indefinites to express the free-choice meaning initially, and this is indeed what we find in many cases:
6.2.3. The ‘it may be’ type
6.2.3.1. The source construction
Many indefiniteness markers contain an element that goes back to a form of the verb ‘be’. Why this should be so is perhaps not immediately obvious, but there is a straightforward account for it, which will be presented immediately below. Consider the examples in (301).
One might ask why the source construction cannot contain a non-specific free relative clause, as in the ‘want’-type that we saw in the preceding section. This would look as in (303).
The problem is that the subject ‘it’ (which appears overtly in Russian to, French ce, and Hebrew hu) in this hypothetical source construction is unmotivated, so I assume that (302) is correct. But eventually some kind of ill-understood restructuring takes place and turns (302) into (303), which is then grammaticalized, yielding the indefiniteness markers in (301). The replacement of (302) by (303) may also be facilitated by the fact that in many languages, parametric concessive conditional clauses and non-specific free relative clauses are structurally quite similar (cf. Haspelmath and König 1998).
The typical structural features of parametric concessive conditional clauses are also reflected in the resulting indefinite pronoun. Thus, the verb is typically in some kind of subjunctive mood in concessive conditional clauses, and this is what we find in the indefiniteness markers: French soit, Russian bud’/by … bylo, Bulgarian da e, Proto-Slavic *sit (in Czech kdo-si) are all subjunctive forms of the verb ‘be’. Russian parametric concessive conditional clauses are marked by the pleonastic negator ni, and this is carried over to the indefiniteness markers by to ni bylo and -nibud’.
A common formal feature of concessive conditional clauses is the focus particle ‘also, even’ (cf. König 1988). In Lezgian and Kannada, this focus particle follows the verb, which is in the conditional mood; cf. (304).
Accordingly, the particle ‘also, even’ (-ni/-uu) also appears in this place in the Lezgian and Kannada indefiniteness markers. In European languages, by contrast, the focus particle tends to occur after the interrogative pronoun, before the verb, as for instance in Bulgarian:
As a result, we find the focus particle i in the Bulgarian indefiniteness marker (WH-to) i da e ‘any-’ between the interrogative pronoun and the verb.
Four further structural features are commonly displayed by parametric concessive conditionals, in addition to those already mentioned ((i) subjunctive mood, (ii) pleonastic negator, (iii) focus particle ‘also, even’, (iv) conditional marker): (v) a temporal adverb like ‘ever’, (vi) an additional general subordinator (‘that’), (vii) an expression meaning ‘want’, (viii) other emphatic particles like ‘now’, ‘only’ (cf. von Bremen 1983; Haspelmath and König 1998). These are exemplified in the following sentences:
It is not my task here to explain this astonishing variety of structural means associated with parametric concessive conditionals (cf. Haspelmath and König 1998, for an attempt at an explanation). But we need to be aware of the components of the source constructions in order to understand the components of the resulting indefinite pronouns. Thus, the French indefiniteness marker que ce soit contains a general subordinator (French que), which is explained by the fact that it has arisen from a source construction with a subordinator, cf. (307).
In quite a few cases, indefiniteness markers consist of the same structural components as parametric concessive conditional clauses, but show no trace of the verb ‘be’. Nevertheless, I claim that these, too, can be understood as arising from such clauses. Examples are given in (310–14).
A well-documented case of this development is French quelque, whose evolution is traced in detail in Foulet (1919).
In descriptive grammars of European languages, the identity between indefinite pronouns in (310–14) and the pronouns introducing parametric concessive conditional clauses is often described as if the indefinite pronouns were primary and their function in concessive conditionals secondary.8 However, the opposite is in fact the case: the indefinite pronouns in (310–14) are derived from the same kind of source construction discussed above (see 302), with the difference that the verb ‘be’ has also been ellipted. Although the verb ‘be’ cannot be reconstructed from the context (unlike the ellipted parts in the earlier source constructions (see 292, 299)), it does not carry much informational weight and is therefore dispensable.
Such source constructions are not entirely hypothetical, as shown by (316) from Dutch.
6.2.3.2. The original meaning
What kind of indefiniteness meaning do we get from a parametric concessive conditional source construction? As in the ‘want’ type of the previous section, the original meaning is that of free choice. While the source construction in the ‘want’ type explicitly leaves the choice to the hearer, the same effect is achieved more indirectly by the concessive conditional source construction. Concessive conditionals express a conditional relationship between a consequent and a series of antecedent conditions; but in contrast to ordinary conditionals, they entail their consequent. The antecedent conditions are therefore irrelevant for the consequent, and concessive conditionals have also been called ‘irrelevance conditionals’ (cf. König 1985). Thus, a parametric concessive conditional clause like whoever it may be states that the identity of the person in question is irrelevant, which amounts to the same as free choice. This close semantic relationship between the two meanings is also reflected in similar co-occurrence restrictions. For example, neither free-choice indefinites nor the two source constructions can be used in contexts that allow only specific reference (in 318–19, a non-habitual reading of the past tense is intended):
Here are some examples of ‘it may be’-indefinites in their original (i.e. free-choice) meaning (and see Rullmann 1995):
6.2.4. The ‘no matter’ type
6.2.4.1. The source construction
In indefinite pronouns of this type, the indefiniteness marker is derived from an expression meaning ‘it does not matter wh-’, ‘it’s all the same wh-’. Some examples are given in (321).
I know of no indefinite pronoun of this type that has been strongly grammaticalized, so the indefiniteness markers of this type are fairly transparent. The source construction is shown in (322).
As the it does not matter clause is turned into an indefiniteness marker, it may undergo simplification, especially omission of the copula (thus, German es ist gleich w- ‘it is the same wh-’ becomes gleich wer).
In this type we see again an overlap with an earlier type: in some languages, parametric concessive conditionals are expressed by means of a ‘no matter’ expression, as in (324).
However, it seems an unnecessary complication to assume that the development of the indefiniteness markers of the ‘no matter’ type proceeded via concessive conditionals. The source construction in (322) must be posited anyway because not all languages with ‘no matter’ indefinites have ‘no matter’ concessive conditionals. For example, in French there is no evidence that a n’importe-concessive conditional ever existed that could have given rise to the indefiniteness marker n’importe:
6.2.4.2. The original meaning
Since ‘no matter’ indefinites are all weakly grammaticalized, they all have the expected free-choice meaning synchronically, as exemplified by (326).
6.3.1. Grammaticalization theory
Let us now see in what way the changes described in the previous section can be subsumed under the general phenomenon of grammaticalization. A comprehensive and systematic description and discussion of the various individual aspects of grammaticalization changes is Lehmann (1982a; 1995), summarized in Lehmann (1985). Lehmann identifies three main parameters: weight, cohesion, and variability, each of which has a paradigmatic and a syntagmatic aspect. The six resulting parameters are shown in Table 6.1. A plus sign in front of a parameter means that with increasing grammaticalization, the degree to which this parameter is present increases, and a minus sign means that the degree to which a parameter is present decreases.
Table 6.1. The six parameters of grammaticalization (after Lehmann)
weight: − integrity (paradigmatic); − scope (syntagmatic)
cohesion: + paradigmaticity (paradigmatic); + bondedness (syntagmatic)
variability: − paradigmatic variability (paradigmatic); − syntagmatic variability (syntagmatic)
In principle, all of these parameters are affected simultaneously by grammaticalization changes, and there is a high degree of correlation among them. However, in each particular case there may be circumstances that make a parameter inapplicable, so not all parameters can be observed in every change. But what is strictly disallowed by Lehmann’s theory is for different parameters to change in opposite directions. Three well-known paradigm cases of grammaticalization are the change from a modal construction like cantare habeo ‘I have to sing’ in late Latin to the Romance synthetic future, e.g. Portuguese cantarei ‘I will sing’; the development of a suffixed definite article in Bulgarian (kniga-ta ‘the book’) from a demonstrative determiner (cf. Old Church Slavonic kŭniga ta ‘that book’); and the development of a comitative/instrumental case suffix in Turkish (ağaçla ‘with the tree’) from an earlier postposition (ağaç ile). These examples serve to illustrate what is meant by each of the six parameters.
INTEGRITY is the most conspicuous parameter of grammaticalization. It has two aspects, phonological and semantic. Loss of phonological integrity or EROSION means that an expression loses phonological substance (loss of segments or whole syllables) or distinctiveness (loss of stress, assimilation), as shown by all three examples above. Loss of semantic integrity or DESEMANTICIZATION means that an expression loses semantic features, is ‘bleached’, generalized, or weakened (for alternative views, see § 6.4). For example, the meaning of the definite article is weaker than and included in the meaning of a demonstrative pronoun.
Reduction of (syntactic) SCOPE means that an item that earlier combined with constituents of arbitrary complexity is increasingly restricted to a word or stem. For instance, the Latin verb habeo ‘have’ in its modal sense combined with a verb phrase, whereas the Portuguese suffix -ei combines with a verb stem.
Increasing PARADIGMATICITY means the integration into an increasingly small and tightly organized paradigm. For example, the Turkish postposition ile was part of a large paradigm of postpositions with little coherence, but by becoming a case affix it joins the small Turkish case paradigm which consists of only seven cases (including the new comitative/instrumental).
Loss of PARADIGMATIC VARIABILITY means that an item is increasingly obligatory, more dependent on grammatical rules than on communicative intentions. Thus, Bulgarian nouns have to have a definite article when its conditions (uniqueness and inclusiveness) are met, independently of the speaker’s communicative intentions.
And finally, loss of SYNTAGMATIC VARIABILITY means an increasingly fixed word order. In Latin, the verb habeo could precede or follow its complement, but in the Romance future, the future suffixes may only follow the stem.
Let us now consider the way in which the various parameters of grammaticalization are manifested in the development of indefinite pronouns as presented in § 6.2. First of all, grammaticalization changes are always unidirectional, and indefiniteness markers are no exception. No changes whereby an indefiniteness marker turns into a superordinate clause of the ‘dunno’ or ‘no matter’ types, or into a free relative or concessive conditional clause, have been attested. In the following sections we will consider each of Lehmann’s six parameters of grammaticalization in turn.
The semantic aspect of loss of integrity, desemanticization, will be treated in detail in the next section (§ 6.4). So far we have only seen the original meanings of the indefiniteness markers: ‘specific–unknown’ for the ‘dunno’-type (§ 6.2.1), and ‘free choice’ for the other three types (§ 6.2.2–4). These are the meanings that I assume for the earliest stage of the new indefiniteness markers, and they are not yet very far away from the meanings of the source constructions. But later desemanticization is amply attested, as shown in § 6.4.
Phonological erosion (the phonological aspect of the loss of integrity) is especially radical in ‘dunno’-indefinites. We find phonological changes like ne weiz > neiz (Middle High German), ne znam > nam (Bulgarian), *ne vě > ně (Old Church Slavonic), kas žino kas > kažkas (Lithuanian), or even *ne wait ik hwarir > nekkver (Old Norse). These reductions go well beyond regular sound changes, but this is a frequent feature of phonological erosion as part of grammaticalization changes and is therefore not surprising.9 Phonological erosion is less conspicuous in the other three types of indefinite, although it can be observed in some cases (e.g. Romanian -va < vrea). The reason for this probably has to be sought in their basic free-choice meaning. As was mentioned above (§§ 3.2.6, 5.7), free-choice indefinites are typically stressed, and stressed expressions are naturally more resistant to phonological reduction. This seems to apply also to the expressions of the type ‘God knows wh-’ (cf. 291) which have an emphatic value, are therefore stressed and are not reduced phonologically.
However, free-choice indefinites are of course not immune to semantic change, as will be documented below (§ 6.4.2). Once they are no longer restricted to the free-choice function, they are unstressed and therefore subject to more substantial changes. Thus Romanian -va (< vrea) is a general indefinite with non-specific and specific uses and no longer has the free choice meaning. Similarly, Russian -libo (< ljubo) and -nibud’ (< ni budi) no longer have the free-choice meaning and show some phonological reduction. Another suggestive case from the ‘dunno’-type is Lithuanian kažkas (cf. 290), which contrasts with the other indefinites of its type (Czech kdovíkdo, German wer weiss wer, and others) both formally (it is reduced) and functionally (it is not emphatic, cf. Appendix A, Section 17). In order to prove the correlation between desemanticization and phonological erosion one would have to study a large number of examples, devise a measure for the degree of desemanticization and phonological erosion, and perform a statistical analysis (as is done for futures in Bybee et al. 1991). Such an investigation is beyond the scope of the present work, but the few examples just cited show that this approach seems to be promising.
The next parameter of grammaticalization is the reduction of syntactic scope. This can also be observed in indefiniteness markers. For example, in older French the indefiniteness marker n’importe combined with prepositional phrases, as is evident from its position in front of the preposition in (327a) (Grevisse 1986: § 373).
In the contemporary language, only (327b) is possible, where the indefiniteness marker is combined directly with the noun phrase (or the pronoun). Thus, its scope, which used to extend over a prepositional phrase, has been reduced.
Similarly, in Old Church Slavonic the indefiniteness marker n- could precede a preposition, as shown in (328a). In modern Russian, this is no longer possible (328b). Again, the scope of n- has been narrowed.
Indefiniteness markers show a tendency to stand as close as possible to the pronominal stem, i.e. to have the narrowest possible scope. Suffixal indefiniteness markers often switch places with suffixal case markers after the indefiniteness marker has become an affix. Thus the Georgian suffixed indefiniteness marker -me, which used to be a case-external extrafix (cf. § 3.1.1), may now also occur in internal position. This type of change, the externalization of inflection, is not motivated by grammaticalization, but grammaticalization is responsible for creating the structures that are affected by it (cf. Haspelmath 1993b for detailed discussion).
Again, it should in principle be possible to test rigorously the prediction that the parameter of scope correlates with the others. Some anecdotal but suggestive evidence for a correlation comes from Lezgian, which has two non-negative indefiniteness markers, jat’ani and x̂ajit’ani. Both belong to the ‘it may be’-type (cf. 301), and the only difference is that jat’ani is based on the copula ja ‘be’, whereas x̂ajit’ani is based on the full verb x̂un ‘become, be’. However, there is a striking meaning difference between them: x̂ajit’ani, evidently the younger form, has free-choice meaning, but jat’ani has only functions further to the left on my implicational map. This correlates with different scopes: the functionally younger form x̂ajit’ani occurs in phrase-external position, i.e. has scope over the phrase, whereas the older form jat’ani always stands next to the interrogative word, i.e. has scope only over this word.
Analogously, in Japanese the indefiniteness marker -ka occurs inside case particles, whereas -mo and -demo occur outside. The marker -ka, which has uses to the left of the implicational map and is presumably older, has narrower scope:
There is usually a very small set of indefiniteness markers (at most four or five) that are strongly grammaticalized, but there is a much larger set of indefiniteness markers with a low degree of grammaticalization in many languages. For example, English has a core system consisting only of some-, any- and no-, but at the periphery there are various forms such as wh-ever, God knows wh-, and no matter wh-. Sometimes these weakly grammaticalized, peripheral forms are not fixed in their internal structure. For example, in older Latin there is not only the form qui-vis ‘anyone’, with the indefiniteness marker vis (2nd singular indicative of velle ‘want’), but similar expressions with other tense-mood forms of velle are attested as well:
Similarly, in modern French there is not only the weakly grammaticalized je ne sais qu- ‘I don’t know wh-’ (e.g. 297a), but other pronominalized subjects (on ‘one’, elle ‘she’) are also possible, as are different tense forms of the verb savoir ‘know’ (Grevisse 1986: § 373).
This freedom of choice is no longer possible in indefiniteness markers with a high degree of grammaticalization.
Increase of bondedness between two expressions is the gradual transition from juxtaposition (where both are independent words) to cliticization and affixation, and possibly on to internal modification. It is often hard to tell which degree of bondedness an indefiniteness marker has, but there is no doubt that indefiniteness markers start out as (sequences of) independent words and end up as affixes. To give just one example, the Russian marker by to ní bylo (e.g. in gdé by to ní bylo ‘anywhere’) still has its own stress, whereas the marker -nibud’ (e.g. in gdé-nibud’ ‘somewhere, anywhere’) is stressless and thus at least a clitic, if not a suffix. The higher degree of bondedness of -nibud’ (which is also reflected in its spelling) again correlates with its higher degree of desemanticization.
6.3.6. Paradigmatic and syntagmatic variability
With increasing grammaticalization, the items undergoing the change become increasingly obligatory, and there is less and less choice between different members of a paradigm. This parameter is not so easy to illustrate in the domain of indefiniteness markers because there are no syntactic environments that require an indefinite pronoun (in contrast to cases or agreement categories, for instance, which are obligatory under certain syntactic conditions, independently of their meaning). But one might perhaps say that, for example, weakly grammaticalized free-choice pronouns leave the speaker more freedom to select between them (e.g. between Dutch wie dan ook, wie ook, wie ook maar, onverschillig wie, om het even wie, gelijk wie, all meaning ‘anyone’), whereas strongly grammaticalized ones allow fewer alternatives (e.g., there is no alternative to Dutch iemand ‘someone’ in iemand is gekomen ‘someone came’).
Loss of syntagmatic variability, i.e. increasing fixation of order, cannot be observed in indefiniteness markers because the source constructions already show fixed word order, so this parameter is vacuous in our case.
6.3.7. The explanatory power of grammaticalization
The paths of grammaticalization that we have seen in this section allow us to formulate two types of diachronic explanation: parochial and universal. The parochial explanations concern facts of individual languages that become clear once they are viewed against the background of grammaticalization theory, e.g. the fact that the Russian indefiniteness marker -nibud’ contains the negative marker ni and the root bud’ ‘be’, or the fact that the Albanian indefiniteness marker -do is homonymous with do ‘want’. These facts are, of course, not part of the linguistic knowledge of the speakers of these languages, but in the present context that does not mean that they are automatically uninteresting.
At another level, grammaticalization also explains universal or cross-linguistically widespread properties of indefinite pronouns. For example, the fact that indefinite pronouns are so often based on interrogative pronouns is explained (at least in part) by the source structures identified in § 6.2, which contain embedded interrogative, free relative, and parametric concessive conditional clauses, and these three clause types usually contain an interrogative pronoun.10 But the strongest predictions are made by the general tenet of grammaticalization theory that there is a correlation between the degrees of grammaticalization of the six parameters of § 6.3.1. The overall prediction is that if an element is more grammaticalized than another element on some parameter, then it is also more grammaticalized (or at least not less grammaticalized) on all other parameters. We have already seen some examples of such correlations in §§ 6.3.2–6, and some more evidence will be presented in § 6.4.
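The correlation prediction can be made concrete with a small sketch. It assumes, purely for illustration, that each marker is given an ordinal score per parameter, where a higher score means "more grammaticalized on this parameter" (so the score is not the raw parameter value: a high score on integrity means much integrity has been lost). The markers and scores below are hypothetical placeholders, not measurements of any real language.

```python
# The six parameters of grammaticalization named in this chapter.
PARAMETERS = (
    "integrity",
    "paradigmaticity",
    "paradigmatic_variability",
    "scope",
    "bondedness",
    "syntagmatic_variability",
)


def ahead_on(a, b):
    """Return the parameters on which marker a is more grammaticalized than b."""
    return [p for p in PARAMETERS if a[p] > b[p]]


def consistent_with_prediction(a, b):
    """True unless each marker is ahead of the other on some parameter.

    The prediction allows ties ("at least not less grammaticalized"),
    so only a crossing pattern counts as a counterexample.
    """
    return not (ahead_on(a, b) and ahead_on(b, a))


# Hypothetical scores (assumption: ordinal scale 0-2, invented for the example).
marker_x = dict.fromkeys(PARAMETERS, 1) | {"integrity": 2, "bondedness": 2}
marker_y = dict.fromkeys(PARAMETERS, 1)
marker_z = dict.fromkeys(PARAMETERS, 1) | {"scope": 2, "bondedness": 0}

print(consistent_with_prediction(marker_x, marker_y))  # x is ahead or tied everywhere
print(consistent_with_prediction(marker_x, marker_z))  # crossing pattern
```

Under these invented scores, the first pair conforms to the prediction, while the second pair would count as a counterexample, because each marker is ahead of the other on at least one parameter.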
6.4. Desemanticization: The Semantic Side of Grammaticalization
6.4.1. Three competing theories of semantic grammaticalization
In Lehmann’s systematization of grammaticalization processes, the semantic aspects are subsumed under ‘loss of integrity’, i.e. Lehmann stresses those aspects of semantic development that have been characterized as ‘bleaching’ (Givón 1975), ‘generalization’, or ‘weakening’ (Bybee and Pagliuca 1987) of meaning. Other authors have put the emphasis on other aspects of semantic grammaticalization. Thus, Heine et al. (1991) see metaphorization as the main driving force behind semantic change associated with grammaticalization. For example, often concrete spatial expressions like ‘head’ come to be used as local relators with abstract meanings like ‘on’, ‘in front of’, or ‘before’, or a concrete spatial verb like ‘go (to)’ comes to express future meaning. Traugott (1988), by contrast, emphasizes the role of ‘pragmatic strengthening’ in the development of grammatical meaning. For instance, she assumes that the semantic extension of English while from a strictly temporal subordinator to an adversative one (‘whereas’) has to do with grammaticalization. It would be nice if the evidence of grammaticalization of indefinite pronouns helped to resolve the theoretical issue, and indeed it does.
Most of the semantic changes that are found in indefinite pronouns and that are part of the grammaticalization process are clear instances of weakening or generalization of meaning, so the evidence from indefinite pronouns favours the views of Lehmann, Bybee and Pagliuca, and Givón. In contrast, there is no evidence whatsoever for metaphorization in indefinite pronouns: the source structures of § 6.2 are already rather abstract, so there is no change from concrete to abstract here. And those instances of pragmatic strengthening that we do find in indefinite pronouns are not connected with grammaticalization. This holds for appreciative meanings like ‘some important person’ for someone, and for depreciative meanings like ‘an unimportant person’ for anyone. Indeed, such meanings are rarely conventionalized, so that there is rarely any semantic change. (These senses are discussed in § 7.5.4 below.)
None of the three theories of semantic grammaticalization has made crucial use of indefinite pronouns (although Lehmann 1995: 50–5 discusses them), so the fact that the weakening/generalization hypothesis can accommodate this new case is a point in its favour.
The semantic development of indefinite pronouns can be characterized as ‘generalization’ in that indefinite series are often diachronically extended to more functions on the implicational map of Chapter 4, thus becoming more general. As mentioned in § 4.4, the implicational map makes predictions about the route of the change: an indefinite series can extend only to functions that are adjacent to those that it already covers, and the extension is incremental, i.e. the marker is extended only to one new function at a time (cf. Croft et al. 1987 for a programmatic proposal along these lines).
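The two constraints on extension (adjacency and incrementality) can be sketched as a simple check. As a stand-in for the full implicational map of Chapter 4, which has more functions and branches, the sketch assumes a plain linear chain of the six functions discussed in § 6.4. The function names and the helper names are illustrative choices, not part of the original proposal.

```python
# Assumption: a simplified linear scale, ordered from left to right.
SCALE = [
    "specific known",
    "specific unknown",
    "irrealis non-specific",
    "question/conditional",
    "comparative",
    "free choice",
]


def neighbours(function, scale=SCALE):
    """Functions directly adjacent to the given function on the scale."""
    i = scale.index(function)
    return {scale[j] for j in (i - 1, i + 1) if 0 <= j < len(scale)}


def possible_extensions(covered, scale=SCALE):
    """Functions a series could acquire next: adjacent to some covered
    function and not yet covered."""
    covered = set(covered)
    adjacent = set()
    for function in covered:
        adjacent |= neighbours(function, scale)
    return adjacent - covered


def is_licit_path(stages, scale=SCALE):
    """Check a sequence of diachronic stages (each a set of functions).

    Each step may add at most one function, and that function must be
    adjacent to the previous stage. Loss of functions is not restricted.
    """
    for before, after in zip(stages, stages[1:]):
        added = set(after) - set(before)
        if len(added) > 1:
            return False  # not incremental
        if not added <= possible_extensions(before, scale):
            return False  # not adjacent
    return True


print(possible_extensions({"free choice"}))
print(is_licit_path([{"free choice"}, {"free choice", "comparative"}]))
print(is_licit_path([{"free choice"}, {"free choice", "question/conditional"}]))
```

On this simplified scale, a series that covers only ‘free choice’ can next acquire only ‘comparative’. A path that skips ‘comparative’ and goes directly to ‘question/conditional’ is rejected.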
6.4.2. Extension from ‘free choice’
The clearest cases of generalization of the meaning of indefinite pronouns involve indefinites whose original function was that of free choice, and which were then extended to other functions to the left of ‘free choice’ on the map. The extension thus proceeds from right to left on the scale in Fig. 6.1. This scale is identical to the map of Chapter 4 except that the two negation functions are missing. Below the scale, I give some examples of indefiniteness markers whose original function must have been ‘free choice’ because they come from one of the sources in §§ 6.2.2–4. The first three examples show different stages of extension to the left. The fourth example, Russian -libo, covers one more function on the left; but on the other hand it no longer has the original free-choice function. Russian -nibud’ has even lost the comparative function. The next two examples, Lezgian x̂ajit’ani and French quelque, in addition have the ‘specific’ function (see Foulet 1919 for the semantic development of quelque), and Czech -si only has the ‘specific’ function (but recall that its etymology is not certain, cf. § 18.104.22.168).
Thus, an original free-choice indefinite may be extended all the way to the opposite end of the scale. However, generalization of the meaning is clearly not the whole story, because indefinites that have been extended to functions in the left part of the scale lose some of the original functions in the right part. Rather than growing larger and larger, the area covered by an indefinite shifts to the left, like a window that opens up the view on a limited area of the semantic space. But the shift of the window from right to left is apparently unidirectional—indefinites do not acquire the comparative or free-choice functions by semantic extension. How can we understand this leftist orientation of indefinites if it is not motivated by generalization?
Here we have to make use of the notion of ‘weakening’. The functions on the right side of the scale are in some sense ‘stronger’ than the functions in the middle and on the left, and a change from right to left means a loss of ‘strength’, just as predicted by the weakening view of semantic grammaticalization. To express this visually, the scale in Fig. 6.1 can be represented as a ‘trough’ (cf. Fig. 6.2), where the stronger functions are above the weaker functions. The diachronic extension is restricted to a downward movement in this visualization.
6.4.3. Semantic change as weakening
After these rather abstract considerations, let us now ask what ‘weakening’ means in semantic terms.
6.4.3.1. Loss of focusing and scalarity
If we just look at the free-choice function and at the irrealis–non-specific function, it is immediately clear that semantic substance has been lost. Consider the minimal contrast in (334).
The free-choice indefinite in (334a) expresses the endpoint of a pragmatic scale, as we saw in § 5.5.5, but no pragmatic scale is associated with the simple nonspecific indefinite in (334b). Both share the feature of non-specificity—recall that free-choice indefinites are also non-specific (cf. § 3.2.6). In those cases where an original free-choice indefinite has acquired the irrealis–non-specific function, we can say that the semantic development consists in the loss of focusing and thereby of the semantic component of the pragmatic scale and its endpoint. Only the semantic component of non-specificity is preserved.11 This development must have taken place, for example, in Russian -nibud’- and -libo-indefinites.
6.4.3.2. Loss of non-specificity
Desemanticization may also be understood as loss of semantic substance in the further development from the simple non-specific function to the ‘specific–unknown’ function. Consider the minimal contrast in (335).
If an indefinite that has the simple non-specific function is extended to the ‘specific-unknown’ function, it loses the feature of non-specificity. Since the referents of non-specific indefinites are also necessarily unknown to the speaker, the ‘unknown’ feature is common to both, and it is the only feature that is left after non-specificity has been lost. This development must have taken place in Portuguese qualquer-indefinites, for example.
6.4.3.3. Loss of unknownness
Finally, even the feature of unknownness may be lost, and then the indefinite may even be used in the specific–known function. This development must have taken place, for example, in Romanian -va-indefinites. (Of course, one could also think of this change as the acquisition of the new feature ‘known’, so this extension is not a strong argument for the view that semantic change in indefinites generally means loss of semantic features.)
6.4.3.4. The comparative
In the preceding subsections, I hope it has become plausible that the extension of the functions of original free-choice indefinites can be understood as the loss of semantic substance. The developments are summarized in (336).
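The feature-loss account of the preceding subsections can also be sketched with sets. The feature labels are shorthand chosen for this illustration, and the sketch covers only the four functions for which features have been proposed above. The last step is modelled as loss of ‘unknown’, although, as noted, it could equally be viewed as the acquisition of a new feature.

```python
# Assumption: each function is modelled as a bundle of semantic features.
FEATURES = {
    "free choice": frozenset({"scalarity", "non-specific", "unknown"}),
    "irrealis non-specific": frozenset({"non-specific", "unknown"}),
    "specific unknown": frozenset({"unknown"}),
    "specific known": frozenset(),
}

# The order of extension discussed in the preceding subsections.
PATH = [
    "free choice",
    "irrealis non-specific",
    "specific unknown",
    "specific known",
]


def lost_features(source, target):
    """Features present in the source function but absent in the target."""
    return FEATURES[source] - FEATURES[target]


def is_pure_loss(source, target):
    """True if the target's features are a proper subset of the source's."""
    return FEATURES[target] < FEATURES[source]


for source, target in zip(PATH, PATH[1:]):
    lost = ", ".join(sorted(lost_features(source, target)))
    print(f"{source} > {target}: loses {lost}")
```

On this modelling, every step along the path removes exactly one feature and adds none, which is what the weakening view leads us to expect.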
But so far I have not accounted for the two intermediate functions ‘comparative’ and ‘question/conditional’. Unfortunately, the semantics of the comparative function is not clear to me, so I cannot say much about its semantic development here. However, it has been remarked several times in the literature that the comparative function is intermediate between free choice and negative polarity (cf. § 5.6); if that is correct, then whatever explains the shift from free choice to negative polarity will also explain the shift from the free-choice function to the comparative function, and from the comparative to the question/conditional function.
6.4.3.5. Negative polarity: questions and conditionals
The semantic difference between free-choice and negative polarity has been described above as the difference between non-reversed and reversed pragmatic scales. It is not clear that this difference can be described as ‘weakening’ in any way, and I can see no other reason why there should be a unidirectional development from non-reversed to reversed scales. Thus I have to admit that I have no good explanation for the fact that free-choice indefinites commonly acquire the negative-polarity function, whereas negative-polarity indefinites are not generally extended to the free-choice function.12
However, semantic weakening can be observed in questions and conditionals in a different way. Recall from § 5.7.2 that both emphatic and non-emphatic indefinites may occur in these contexts, where they have the same truth conditions but subtly different meanings, as in (337).
In § 5.7.2 we saw that the meaning difference between these sentences can be characterized as ‘presence vs. absence of scalarity’. In (337b), čto by to ni bylo is the endpoint of a pragmatic scale, and hence a more emphatic reading results (‘if you hear the slightest noise, e.g. if a cat miaows’). At some stage the indefinite in (337b) may lose its emphatic value and become equivalent to (337a). This must indeed have happened in Russian, because etymologically -nibud’ is completely analogous to by to ni bylo (§ 18.104.22.168).
Thus, the question and conditional functions serve as a bridge between the obligatorily scalar free-choice and comparative functions and the obligatorily non-scalar irrealis–non-specific function.
6.4.4. Extension from ‘dunno’
In addition to the source constructions that yield scalar-endpoint indefinites (§§ 6.2.2–4), there is also one source construction that yields specific indefinites: the ‘dunno’ type of § 6.2.1. As I showed in § 22.214.171.124, the original function of recently grammaticalized ‘dunno’-indefinites is ‘specific–unknown’. Like the scalar-endpoint indefinites, ‘dunno’ indefinites undergo semantic extension when they are strongly grammaticalized; as a result they may come to cover a substantial portion of the implicational map, and perhaps lose their original function. However, here it is much more difficult to make generalizations, because such cases are not common. In the modern European languages, only Lithuanian kaž- and Albanian di-, and perhaps Slavic ně- and Scandinavian någon (Swedish)/nokkur (Icelandic), represent strongly grammaticalized ‘dunno’ indefinites. It appears that ‘dunno’ indefinites resist stronger grammaticalization for some reason.
Thus, in addition to the right-to-left extension of Fig. 6.2, we also have to assume left-to-right extension as shown in Fig. 6.3. If there is both right-to-left and left-to-right extension on the map, does that mean that the semantic grammaticalization is not unidirectional? I do not think so. I have argued above that the most important semantic change in the grammaticalization of indefinites is semantic weakening, and this is a dimension that is not directly represented in Fig. 6.3. In the weakening of the original ‘dunno’ meaning, the most important change is that the speaker’s lack of knowledge is no longer emphasized; similarly, in the weakening of free-choice indefinites, the most important change is the loss of scalarity. Thus, instead of two opposite movements in one dimension, what seems to be going on is unidirectional movement in two dimensions, from strong (or emphatic) to weaker (or less emphatic). This can be represented by the ‘trough model’ in Fig. 6.4. It must be admitted that this visual representation cannot capture all the relevant details. But it does seem useful as an approximation, and it drives home the point that semantic grammaticalization of indefinite pronouns is primarily weakening of emphasis, not metaphorization or pragmatic strengthening.
In addition, the model in Fig. 6.4 helps us understand an observation that was made above in § 4.5 (Principle 1): in the middle of the implicational map, indefinite pronoun series always express at least two adjacent functions. This seems to be due to the fact that the functions in the middle, which are the functions at the bottom of the trough in Fig. 6.4, are ‘weakest’, i.e. least distinctive, and hence least likely to be expressed by a unique indefinite series that has no other functions.
6.5. From Free-Choice Indefinite to Universal Quantifier
In some languages, the indefinites that express the free-choice functions can also be used as true universal quantifiers, corresponding to English every, everyone, everything, etc. Since the meaning distinction between ‘any’ and ‘every’ is often quite subtle, we can be sure that we are dealing with a truly universal use only if the expression can be used in contexts that do not allow free-choice indefinites, as in (340).
But how can we be sure that the German determiner jeder is also a free-choice indefinite? True, it can be used to translate English free-choice any, as in Jeder Idiot würde das sehen ‘Any idiot would see that’. But we saw in § 3.3.4 that some languages do not have special free-choice indefinites and use universal quantifiers in their place. In the case of German, a convincing argument is that jeder may also be used in the indirect-negation function, where it cannot be paraphrased by ‘every’ (cf. 341). In such cases, jeder also combines with non-countable nouns.
Thus, German jeder spans the four functions ‘indirect negation’, ‘comparative’, ‘free choice’, and ‘universal’. We thus have reasons to add an additional function to our implicational map, which would now look as in Fig. 6.5, where the distribution of jeder is shown. I have not found many languages where one expression covers both the indirect-negation function and the universal function, but it is intuitively highly plausible that ‘universal’ should be located next to ‘free choice’ on such an extended implicational map.
This would also account for the fact that free-choice indefinites may diachronically evolve into universals, as I argued in Haspelmath (1995). Again, the evidence for this development is rather indirect, but a number of languages have universal quantifiers which consist of the same formal elements as (free-choice) indefinite pronouns and hence must go back to one of the source constructions identified in §§ 6.2.2–3 and § 7.1. Compare the following cases:
The semantic change from ‘any’ to ‘every’, which must be posited to account for these cases, also involves the loss of the semantic feature of scalarity, like the change from ‘any’ to ‘some’ posited in § 6.4.2. However, in this case the semantic feature of non-specificity is also lost, and the implicature of universal quantification is strengthened to a semantic component (in contrast to the change from ‘any’ to ‘some’, where the universal implicature is also lost, but non-specificity is preserved).
Thus, the meaning of free choice can develop in two directions: to ‘some’, and to ‘every’. Both these developments seem to be unidirectional. ‘Some’ cannot develop into free-choice ‘any’, and ‘every’ cannot develop into ‘any’, either.13
(2) A possible case outside of Europe is the Indonesian indefiniteness marker entah, e.g. entah dimana ‘somewhere’ (cf. dimana ‘where’), entah bagaimana ‘somehow’ (cf. bagaimana ‘how’). However, entah is not a verb, but a particle glossed as ‘I don’t know’, and also ‘maybe’.
(4) A free relative clause is a relative clause that does not modify a noun phrase, but constitutes a noun phrase itself (see von Bremen 1983; Lehmann 1984: v.4, among many others, for discussion). Nonspecific free relative clauses (also called ‘generalizing free relative clauses’) are semantically nonspecific in the sense of § 3.2.3, and they are easily recognized in English because only they allow the relative pronoun wh-ever, e.g. (specific) She told him what 〈*whatever〉 she had seen the day before; but (non-specific) She used to tell him what/whatever she saw.
(5) This etymology of the Czech indefiniteness marker -si, which is also found in other Slavic languages (Polish -ś, Slovak -si, Ukrainian -s’), is rather speculative, though very attractive because it fits so well into the general pattern. In this case, the etymology cannot serve as evidence for the typology; on the contrary, the typology supports the etymology.
(6) There are three types of concessive conditional clause, analogous to the three types of question (polar, alternative, parametric): (i) (polar concessive conditional clause) Even if it rains, we will go out; (ii) (alternative concessive conditional clause) Whether it rains or the sun shines, we will go out; (iii) (parametric concessive conditional clause) Whatever the weather will be, we will go out. See König (1992), Haspelmath and König (1998) for general studies of concessive conditional clauses.
(7) The lexeme ‘want’ can become (part of) an indefiniteness marker in three different ways: (i) by being the predicate of a non-specific free relative clause, as explained in § 6.2.2; (ii) by marking a parametric concessive-conditional clause, as explained in this section; and (iii) by becoming a focus particle meaning ‘at least’ (see § 7.1).
(8) Serianni (1988:507) ‘Pronomi indefiniti possono introdurre una proposizione relativa concessiva’ (‘Indefinite pronouns can introduce a relative clause with concessive meaning’, i.e. a parametric concessive conditional clause), and Geerts et al. (1984).
(9) Some etymological dictionaries are sceptical about the Old Church Slavonic etymology *ne vě kŭto > nĕ kŭto because it does not conform to the normal sound changes (e.g. Vasmer 1953–8, s.v. nekij). It is true that this etymology, which goes back at least to Miklosich (1886), is speculative, and we have no way of proving that it is correct. The point here is that the irregular sound change does not make Miklosich’s etymology any less plausible; on the contrary, it is perfectly in line with it.
(11) Gurevič (1983) describes this difference as follows. In the free-choice use, the semantic component of irrelevance is asserted, but is weakened to a presupposition in the simple non-specific use. Thus, minimal pairs like (334a–b) differ only in the assignment of presupposition and assertion. (i) (free choice) You may invite anyone [presupposition: ‘You may invite someone.’ assertion: ‘It is irrelevant who you invite.’] (ii) (simple non-specific) You may invite someone [assertion: ‘You may invite someone.’ presupposition: ‘It is irrelevant who you invite.’]
(12) One exception that I am aware of is English any (Old English ǣnig, based on ān ‘one’), which originally must have meant ‘single’ but now occurs not only in negative-polarity functions, but also in the free-choice function.
(13) Unfortunately, I know of one or two exceptions to the second generalization. Hebrew kol ‘every, any’ clearly developed in the following way: ‘totality’ (Proto-Semitic *kull) > ‘all’ > ‘every’ > ‘any’. And Turkish herhangi ‘any’ contains the root her ‘every’ (from Persian har ‘every’). These cases are quite puzzling: if this change is possible after all, why is it so rare?