Southeast Asian Classifiers

Eric Schiller

Linguistics Unlimited

 

Abstract

Southeast Asian languages typically use a combination of numeral plus classifier in quantified noun phrases. The classifier is usually a noun (compare English loaf in a loaf of bread) and is often obligatory. The paper discusses the syntax, semantics and morphology of the classifiers, as well as the distribution of the constructions in the languages of the area. The strong areal influence overrides other aspects of the grammar. Formally, the constituent containing the numeral and the classifier will be established as an N-bar, with either element eligible to act as head.

 

This is a draft of a paper which will be published in due course. It represents the paper as delivered at the 9th meeting of the Southeast Asian Linguistic Society (SEALS). It has been adapted for web presentation in a number of ways, though no content has been changed. Color has been added and the pages have been resized for more convenient onscreen viewing, and references and  footnotes have been hyperlinked. Until the paper is in the final publication form, citations should refer to this document. Before citing, please contact linguist@chessworks.com to find out if the final version is available for citation.

 

Version 1.0, 20 October 1999

Note: IPA characters may not display correctly in all browsers. The font is Lucida Sans Unicode, and any Unicode font on your system should work. For some reason the script a (ɑ) and the eng (ŋ) cause problems. I am working on a PDF version to solve these problems.

 

Contents

Introduction

Syntax

Semantics

Generic Classifiers

Areal patterns

GIP and Universals

Conclusions

Examples

Refs

 

Introduction

This paper is an update of my earlier autolexical investigations into numeral-classifier constructions in Southeast Asian languages, in which I hope to redeem outstanding promissory notes and earn the right to issue more*. I will also enhance a few features of the analysis (or eliminate the bugs, if you prefer). Acting on the assumption that all of you are familiar with the basic use of numerals in noun phrases, I'll be surveying the syntactic structures cross-linguistically. We'll also take a look at the formal semantic structure and see how well it matches with the syntax. Specifically, I hope to lead you to the following conclusions.

(1)    The numeral + classifier construction should

I.       be considered as an N-bar constituent in the syntax

II.     involve both quantification and classification in the logico-semantics

III.  conform to head-modifier relations where possible

IV.  be strongly subject to areal influences

V.     perhaps be considered double-headed in the syntax

The first should be non-controversial, but most of the grammars I consulted speak only of the positioning of the numeral "between" the noun and the classifier, without noting that the numeral is attached to the classifier by some very strong cords. The second also seems fairly obvious, but as Lehman (1990) reminds us, there are other views, though I think he does a good job of breaking some of them down. The third observation is at the heart of the theoretical part of the paper, but discussion there will be mercifully brief because it involves the fundamental principle of autolexical theory. As for the fourth, anyone familiar with the languages of Southeast Asia can readily spot the pattern using the mighty Mekong as a guide. Yet the areal influence can affect syntax in ways that some theories might find annoying, so it is best to establish it properly. Finally, I tentatively suggest that in the syntax, the numeral and classifier join together to form an N-bar, the intervening level between noun and noun phrase. This is not a trivial, theory-internal matter. I believe that it explains quite a bit and allows a parallel to be drawn with serialized verb phrases.

Along the way we'll visit some non-garden-variety classifier structures in languages covering most of the Southeast Asian language area. As a member of Haj Ross's DFW (data fanatics worldwide or data fetishists worldwide), I am obliged to do so.

Syntactic structures

In Southeast Asian languages, classifiers are constituents of the noun phrase and, unlike in English, do not require, or allow, additional morphemes within the construction.

Syntactically, the noun phrases under discussion will have a noun (N), an optional modifier of the head (A, typically an adjective), a numeral (#), a classifier (C) and a demonstrative (D). For the moment, the symbols used have no theory-specific meaning and can be taken as informal notational devices. Although I am discussing constructions with numerals, other quantifiers may be used, such as 'many', 'some', 'all' and 'few', but there is a great deal of cross-linguistic variety there, so I will confine myself to canonical classifier expressions. The following word orders are attested:[1]

(2) Attested orders for noun, adjective, numeral, classifier, demonstrative

NA#CD

D#CAN

#CNAD

DNA#C

We can immediately draw two useful conclusions.

·         The numeral always precedes the classifier[1]

·         Noun modifiers are always adjacent to the noun
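The two generalizations can be checked mechanically against the four strings in (2). The following is a minimal sketch in Python; the function names are illustrative only, and the first check is slightly stronger than the prose claim, since it tests for immediate precedence rather than precedence alone.

```python
# Attested orders from (2): N = noun, A = adjective, # = numeral,
# C = classifier, D = demonstrative.
ATTESTED = ["NA#CD", "D#CAN", "#CNAD", "DNA#C"]


def numeral_immediately_precedes_classifier(order: str) -> bool:
    """True if '#' occurs directly before 'C' in the order string."""
    return "#C" in order


def modifier_adjacent_to_noun(order: str) -> bool:
    """True if 'A' stands next to 'N' on either side."""
    return "NA" in order or "AN" in order


for order in ATTESTED:
    print(
        order,
        numeral_immediately_precedes_classifier(order),
        modifier_adjacent_to_noun(order),
    )
```

Both checks hold for all four attested orders, while an unattested string such as NAC#D would fail the first one.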

In Burmese, Okell gives some examples of round numbers where the numeral follows the classifier. I believe examples such as Burmese (c) can be analyzed as having the round number act as a classifier, i.e., two tens of yoke of oxen. Note the parallel to Burmese (d). We will return to these examples later.

So, we can simplify the structure by moving up a level, where # and C form a constituent, which we'll call CL for the moment. The noun+modifier structure is N1, a conventional N-bar. I know that our community follows many different syntactic teachings, but I hope that the motivation for an intermediate level between that of a noun and a full noun phrase is universally accepted, and refer skeptics to the work of our late colleague Jim McCawley (1988).

(3) Attested orders for n-bar, numeral+classifier, demonstrative

N1 CL D

D CL N1

CL N1 D

D N1 CL

 

I know of no example of CL D N1, and the Lahu example in the data appendix shows a particle /ve/ which I think makes it more like the English construction. Jim Matisoff  (1972) had mentioned the similarity, but I had overlooked that until recently.

There is no common ordering relation between the constituents. We can try correlating the orders with other aspects of the grammar, such as the order of verb and object, often taken as some sort of underlying parameter with great explanatory power.

(4) Verb-object order and classifier construction

Order       Languages                                 Verb-object order
CL N1 D     Hmong, Yay, Vietnamese, some Mon-Khmer    VO
D CL N1     Cantonese, Mien, Mandarin                 VO
D N1 CL     Burmese, Lahu                             OV
N1 CL D     Thai, most Mon-Khmer                      VO

So, we cannot correlate the verb-object order with the order of NP constituents. We'll return to this topic later in the discussion of Hawkins's implicational universals.
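The lack of correlation can be read directly off table (4) by grouping the noun phrase orders under each verb-object order. This is only a sketch of the tabulation; the variable names are illustrative and the data are exactly the four rows of the table.

```python
from collections import defaultdict

# Rows of table (4): (NP order, languages, verb-object order).
TABLE_4 = [
    ("CL N1 D", "Hmong, Yay, Vietnamese, some Mon-Khmer", "VO"),
    ("D CL N1", "Cantonese, Mien, Mandarin", "VO"),
    ("D N1 CL", "Burmese, Lahu", "OV"),
    ("N1 CL D", "Thai, most Mon-Khmer", "VO"),
]

# Collect the set of NP orders found under each verb-object order.
np_orders_by_clause_order = defaultdict(set)
for np_order, _languages, clause_order in TABLE_4:
    np_orders_by_clause_order[clause_order].add(np_order)

for clause_order, np_orders in sorted(np_orders_by_clause_order.items()):
    print(clause_order, sorted(np_orders))
```

VO languages turn up with three different noun phrase orders, so knowing that a language is VO does not tell us where the classifier constituent or the demonstrative will stand.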

The Syntax of Classified Noun Phrases

The figures in (5) represent the constituent structure of the syntax of noun phrases which contain classifiers. For the moment, we will stick with the notional categories as terminal nodes so that we can look at the constituent structure from a theory-neutral view.

(5) Syntactic structures of classified noun phrases

 

 

 

The first three are straightforward enough. The noun-modifier and classifier constructs are each N1. They join to form a larger N1 constituent, which is modified by a demonstrative  (D) which functions as a syntactic operator converting the N1 into a full noun phrase (N2). Burmese, the final example, has the demonstrative in the first position, to the left of the head N-bar instead of the right. The dominance relations are the same as that of the Khmer type, representing what we might take to be the normal or typical structure. As we saw earlier, there are some seemingly anomalous Burmese constructions, but you'll have to wait a bit longer before we get to those.

We have so far been treating each element as belonging to a distinct category. However, we have assumed that the structure containing the numeral and classifier is an N-bar. If so, shouldn't it have a noun as head? Most classifiers are nouns, so it seems reasonable to assign that element a head feature. On the other hand, numerals also look a lot like nouns, and are candidates for head feature N. If one has to choose a head, one might well agree with Lehman (1990) that the numeral is a better candidate. He goes on to suggest that the classifier may be cliticized to the numeral, a proposal which is not without supporting evidence. The unit of numeral + classifier does seem to be indivisible.

In fact, actual cliticization takes place in Khmer, but there the numeral is clearly the affix and the classifier is the head: mneak = muəy + neak 'one person'. More problematically, classifiers are an open class in most languages, and clitics tend to be members of a closed class. An additional objection can be raised, namely that the numeral can be deleted in some languages (Thai, Vietnamese, Cantonese, citations in Gil 1994).

I suggest the Beeblebroxian solution: treat both the numeral and the classifier as noun heads.[2] There is nothing in autolexical x-bar theory (Schiller & Need 1992) to prevent this, and it fits the default associations, which state that in the absence of any motivation to be otherwise, numerals are syntactic nouns, morphological nouns and semantic quantifiers. Classifiers are generally morphological nouns with semantics we'll get to later.

The two are tightly bound as a syntactic N-bar, are closely linked in the semantic representation, and show strong morphosyntactic bonding. The cliticization in Khmer, albeit limited to the numeral 1 (cf. English another but not *twother, *forother, etc.), does show a strong, even impenetrable bond between the numeral and the classifier. Fuller (1988) pointed out that tone sandhi applies in the numeral-classifier construction in White Hmong, more evidence of the close relationship.

On the other hand, it is important to note that the classifier N-bar and the head N-bar are not indivisible. An intervening demonstrative is possible and is seen in the Lahu example, but with a complementizing particle. Lehman cites the Thai examples with complementizer, of the type seen in Thai example (b). I'm not going to deal with relative clauses here but will issue the usual promissory note to be redeemed, perhaps, at some future gathering of the SEALS.

In Thai, noun phrases with heads, modifiers, demonstratives and quantifiers are superficially found in three orders:

(6)  Head-modifier-quantifier-demonstrative orders in Thai (Panupong 1970)

·         H > M > Q > D

·         H > M > D > Q

·         H > Q > M > D

This requires further investigation. (7-9) are from Panupong (glosses added):

(7)    H > M > Q > D

          tôn-má:y   yày   sìp   tôn   ní:
          tree       big   ten   CLF   these
          These ten big trees

This conforms to the normal N1-CL-D areal pattern.

(8)    H > M > D > Q

          phucha:y   ʔuən   thî   kamlaŋ   dɯ̀:m    biə    yù:   sɔ:ŋ   khon
          man        fat    WH    PROG     drink   beer         two    CLF
          two fat men who are drinking beer

We have a relative clause structure, which I take to be attached to a node which is not under discussion here.

(9)    H > Q > M > D

          nɯ́ə    sɔ̌:ŋ   chin    yày   nî:
          meat   two    piece   big   these
          these two big pieces of meat

The semantic structure of (9) clearly involves big modifying pieces and not meat. In the default case, we expect the syntax to parallel the semantics. We should revise my initial structures to add an additional N-bar node dominating just the classifier and its modifier. The syntactic structure is shown in (10).

(10)

 

If this structure is correct, it requires us to acknowledge that the complement of the numeral is an N-bar, not a simple N. Lehman's cliticization account would now need to be expanded to have N-bar clitics, unless, of course, the classifier is the head and the numeral is the clitic.[3]

Summing up the syntax of Thai, we can suggest the basic noun phrase rule is a set of N-bars followed by a demonstrative. This fits nicely with the observation that Thai serializes verb phrases. It might not be unreasonable to propose a rule that any constituent with a bar level of 1 may be serialized.

The Semantics of Classified Noun Phrases

The classified noun phrases we have been looking at contain 5 elements organized as in (11).

(11)

This can also be expressed as in (12).

(12)

Expression: (Quantified entity (entity (entity, property)) (quantifier (quantifier, counter))) (Deictic binder)
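The bracketing in (12) can also be written out as a nested data structure, which makes the five terminal elements and their grouping explicit. This is a sketch under one reading of the bracketing (a quantified entity, made up of an entity group and a quantifier group, combined with a deictic binder); the helper names node and leaves are illustrative.

```python
# A node is a (label, children) pair; children is a tuple of nodes.
# The order of sisters inside a tuple carries no meaning here: the
# semantic representation is hierarchical, not linear.
def node(label, *children):
    return (label, children)


EXPRESSION_12 = node(
    "Expression",
    node(
        "Quantified entity",
        node("entity", node("entity"), node("property")),
        node("quantifier", node("quantifier"), node("counter")),
    ),
    node("Deictic binder"),
)


def leaves(tree):
    """Return the labels of the terminal nodes in stored order."""
    label, children = tree
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


print(leaves(EXPRESSION_12))
```

The terminals are entity, property, quantifier, counter and deictic binder, corresponding to the noun, modifier, numeral, classifier and demonstrative of the syntax.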

What about two big pieces of meat? Or even two big pieces of red meat? As the English version suggests (I am assuming that the semantic representations among languages are structurally the same), we can modify our semantic structure by allowing the counter to consist of an entity and a property, as in (13).

(13) Expanded semantic representation

It happens that both nouns and adjectives are found as classifiers, which is hardly surprising. Both categories are used to express properties, e.g., dog, tired: tired dog | dog-tired. Although dog is an entity, entities have properties, and these can be used individually. The structure contains two property slots, and I believe this can help explain the Burmese examples (c) and (d), shown here as (14).

(14) Some anomalous Burmese classifier constructions

          nwà   ăhyìñ   hnă   hse
          ox    yoke    two   ten
          twenty yoke of oxen

          daʔhsi   biyapăliñ     hnă   loùñ
          petrol   beer bottle   2     round thing (CL)
          two beer-bottles-ful of petrol

Numerals can be used as group nouns, cf. English 'tens', 'dozens'. Under the analysis given here, the head N-bar would contain two nouns, and the second two words are the classifier N-bar. I don't have relevant data on combining these constructions with additional adjectives, though I hope someone here can perhaps provide them.

Lexical Entries for elements of the noun phrase

The lexicon contains information for each of the hierarchies on the various dimensions. The chart (15) presents relevant grammatical aspects of the lexical entries for five elements which are typically found in a noun phrase.

(15) Partial lexical entries for elements of classified noun phrases

Item            Semantic                        Syntax               Morphology
Head noun       entity k                        noun N0 [+head]      nominal nml
Adjective       property f                      modifier [N0→N1]     adjectival adj
Numeral         quantifier q                    modifier [N0→N1]     varies
Classifier      counter c [+count]              noun N0              nominal nml
Demonstrative   demonstrative D index(k, x)     syncat [N1→N2]       inert
                [Deixis {values...}]
                [Plurality {sng, plu}]
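For readers who prefer to see the chart as data, the entries in (15) can be sketched as records with one field per dimension. The class name LexicalEntry and the field names are illustrative, the arrows are written as "->", and the assignment of features such as [+head] and [+count] to columns follows one reading of the chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    item: str
    semantics: str
    syntax: str
    morphology: str


LEXICON_15 = [
    LexicalEntry("head noun", "entity k", "noun N0 [+head]", "nominal nml"),
    LexicalEntry("adjective", "property f", "modifier [N0 -> N1]", "adjectival adj"),
    LexicalEntry("numeral", "quantifier q", "modifier [N0 -> N1]", "varies"),
    LexicalEntry("classifier", "counter c [+count]", "noun N0", "nominal nml"),
    LexicalEntry("demonstrative", "demonstrative D index(k, x)", "syncat [N1 -> N2]", "inert"),
]

# Index the entries by item name for lookup.
by_item = {entry.item: entry for entry in LEXICON_15}

# The numeral and the adjective share a syntactic category in the chart.
print(by_item["numeral"].syntax == by_item["adjective"].syntax)
```

Laid out this way, it is easy to see that the numeral patterns with the adjective in the syntax while differing from it in semantics and morphology.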

 

·         The head noun is normally an entity which occupies a head noun position and is morphologically marked as nominal.

·         The adjective is usually a property, has a syntactic function of combining with nouns to form n-bars, and is marked morphologically as an adjectival form.

·         The numeral is a quantifier with the same syntactic category as the adjective, and is morphologically varied.

·         Classifiers are counters with noun syntax and tend to have nominal morphology, as supported by Khmer evidence. Khmer uses the nominalized form sɔnlək as a classifier, not the base form slək.

·         Demonstratives include a semantic function that assigns an index (binds a variable) to some appropriate entry in the context register. They are generally marked for deixis and plurality. Syntactically they combine with n-bars to form complete noun phrases, and tend to be morphologically inert.

Reconciling syntax and semantics

Keeping in mind that our semantic representation is not ordered linearly, most of the languages are harmonic, showing a similar constituent structure.

(16)

The representation in (16) shows a completely harmonic relationship between syntax and semantics, with both linear order and configuration mirrored across the interface. We find this structure in many Mon-Khmer languages. In these languages, the Head-Modifier precedence rule is Head > Modifier, and it is seen with great consistency. The languages are Verb > Object languages. For a typical Object > Verb language, we would expect the linear order to be reversed as in (17). However, this is not the case. The numeral always precedes the classifier, regardless of word order.

(17)

The best the language can do is the structure shown in (18).

 (18)

This follows the Generalized Interface Principle (Sadock & Schiller 1993), creating the best match given the linear precedence rule Num > Clf. There is no interface violation, because there is no linear precedence in semantics, only hierarchical structure. So the quantifier is unordered with respect to the counter, a sister node. Mien and Mandarin show this order.
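The step from the expected mirror-image order to the attested one can be illustrated with the informal strings used in (2). The sketch below assumes that the expected order is the exact mirror image of the Khmer-type string NA#CD and that the only repair needed is to restore numeral-before-classifier order; the function names are illustrative and this is not an implementation of the Generalized Interface Principle itself.

```python
# N = noun, A = adjective, # = numeral, C = classifier, D = demonstrative.
HEAD_INITIAL = "NA#CD"  # Khmer-type order from (2)


def mirror(order: str) -> str:
    """Reverse the linear order of all elements."""
    return order[::-1]


def enforce_numeral_before_classifier(order: str) -> str:
    """Swap an adjacent classifier-numeral pair so that '#' precedes 'C'."""
    return order.replace("C#", "#C")


mirrored = mirror(HEAD_INITIAL)
repaired = enforce_numeral_before_classifier(mirrored)
print(mirrored, repaired)
```

The mirrored string DC#AN is not among the attested orders, whereas the repaired string D#CAN is the second order listed in (2).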

Summing up the formal syntax and semantics, we can say that a classifier language requires an N-bar in the syntax corresponding to the semantic structure containing the quantifier plus a property which defines the units which are being counted. As is well known, different classifiers can be used based on semantic aspects of the head noun, and generic classifiers exist in some languages.

Generic classifiers

Generic classifiers come in many forms. There is often an animacy distinction, as in the Mien example, where the classifier for higher beings is used with respect to people. In this case, the semantics are bleached, and the contribution to the real-world semantics is minimized. The generic classifier exists so that the syntactic requirement that a numeral be followed by a classifier is satisfied. This is reminiscent of 'it' in English 'It's raining'. Semantically the generic classifier marks the item as countable, but does not supply any property.

Areal patterns

Bradley (1979) notes that the Lahu data suggest an areal influence: "Quantification required both numeral and classifier in that order, following the head noun. This Common Lahu construction is common to nearly all Southeast Asian languages, but it is not found in all TB languages; thus it seems to be an areal development." Thomas (1971) also noted an important areal pattern: "In Chinese and in most of the Vietnam languages including Chrau, the classifier is placed between the numeral and the noun. In other Southeast Asian languages it is not uncommon for the order to be noun-numeral classifier." Though these remarks do not take into account the constituent structure of the two n-bars, the ordering of the units is correctly identified.

Quantified noun phrases play an important role in commerce, and that may play a role in maintaining a stable word order with numeral preceding classifier. China and Vietnam seem to have a dominant order with classifier n-bar before head n-bar. Cambodia, Thailand and Burma have the head n-bar first, then the classifier. If the ordering of n-bar constituents has been easily affected by areal influence, then the strength of any putative word order universals must be questioned.

Of course the word "areal" raises the question: when? After all, Southeast Asian peoples tend to move around a lot. A simplified map of the region has one very prominent attribute: the Mekong River. Although there are many exceptions, no doubt the result of various migrations, as a general rule the languages with the numeral-classifier N-bar preceding the head noun are on the east side of the Mekong, and those on the west show the opposite order. This clearly cuts across language families, as the Mon-Khmer (Khmer | Chrau), Tai-Kadai (Thai | Yay) and Sino-Tibetan (Burmese | Mandarin) families show both orders.
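The claim that the pattern cuts across families can be tabulated from the six languages just named. In this sketch the order labels "head-first" and "classifier-first" are shorthand for the relative order of the head N-bar and the classifier N-bar, with values taken from the table in (4) and the discussion above; geography is deliberately left out, since the east-west split is only a general rule.

```python
from collections import defaultdict

# (language, family, order of classifier N-bar relative to head N-bar)
LANGUAGES = [
    ("Khmer", "Mon-Khmer", "head-first"),
    ("Chrau", "Mon-Khmer", "classifier-first"),
    ("Thai", "Tai-Kadai", "head-first"),
    ("Yay", "Tai-Kadai", "classifier-first"),
    ("Burmese", "Sino-Tibetan", "head-first"),
    ("Mandarin", "Sino-Tibetan", "classifier-first"),
]

# Collect the set of orders attested within each family.
orders_by_family = defaultdict(set)
for _language, family, order in LANGUAGES:
    orders_by_family[family].add(order)

for family, orders in sorted(orders_by_family.items()):
    print(family, sorted(orders))
```

Each of the three families contains both orders, so family membership does not predict the order of the two N-bars.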

 

The Mon-Khmer languages seem to have a fairly loose attachment to classifiers, and seem to have been affected by their proximity to either the Western or Eastern types of classifier (if I may be allowed the luxury of such an oversimplification). The use of classifiers is not grammatically obligatory in Mon, Khmer and many other Mon-Khmer languages. This calls into question Lehman's 1990 link between morphological properties and the use of classifiers. There is no discernable morphological difference between the classifier-rich languages of Southeast Asia and those which have less elaborate or minimal use of classifiers. If the link were real, we'd also find many pidgin and creole languages with classifier syntax, but we do not.

The Generalized Interface Principle and Word Order Universals

To account for the variety of noun phrase internal word order, Hawkins (1988) proposed a Prepositional Noun Modifier Hierarchy (19)

(19)  Prep ⊃ ((NDem ∨ NNum ⊃ NA) & (NA ⊃ NG) & (NG ⊃ NRel))

As this paper is not concerned with relative clauses and genitive constructions, we are only concerned with the first part, which states that in a prepositional language where the head noun precedes the demonstrative or the numeral, it will also precede the adjective. In other words, if the head-modifier relationship is harmonic with regard to demonstratives and numerals, it is also going to hold for adjectival modifiers. He notes (p.75) that the demonstrative and numeral are "more unstable than the genitive and the relative clause, and the adjective is more unstable than the relative clause." He later (p.295) suggests that the cross-linguistic study of noun phrases "can reveal the nature of the interaction between explanatory principles in universal grammar." Hawkins found existing x-bar syntax inadequate for his explanatory purposes. It should be noted, however, that autolexical syntax has implemented the x-bar wish list presented in his book (Hawkins 1988:201).
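The first conjunct of (19) is a nested material implication, and it can be evaluated for any combination of word order settings. The sketch below reads the conjunct as "if prepositional, then (if noun before demonstrative or noun before numeral, then noun before adjective)"; the function and parameter names are illustrative, and the two profiles tested are hypothetical rather than descriptions of particular languages.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def satisfies_first_clause(prep: bool, n_dem: bool, n_num: bool, n_adj: bool) -> bool:
    """Evaluate: prepositional implies ((N-Dem or N-Num) implies N-Adj)."""
    return implies(prep, implies(n_dem or n_num, n_adj))


# Hypothetical profile: prepositional, noun first with all three modifiers.
print(satisfies_first_clause(prep=True, n_dem=True, n_num=True, n_adj=True))

# Hypothetical profile: prepositional, noun before demonstrative,
# but adjective before noun. This is the pattern the hierarchy excludes.
print(satisfies_first_clause(prep=True, n_dem=True, n_num=False, n_adj=False))
```

Note that the conjunct says nothing about non-prepositional languages: whenever prep is false the implication holds regardless of the other settings.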

I think that it is really the interaction not of powerful principles, but rather of simple hierarchies that provides the variety of languages we so enjoy studying. The Generalized Interface Principle of autolexical grammar (Sadock & Schiller 1993) provides a simpler account of the data included in Hawkins's sample. As an implicational universal, we would suggest the old, simple, strong form:

(20)  Syntactic order of elements should follow a consistent head-modifier order.

Note the presence of the word should. In an artificial language, it might be possible to keep everything lined up neatly. Natural languages, however, have far too many conflicting demands. Phonological changes may turn words into clitics. Even numerals can become wedded to their hosts. In Schiller (1992), I discussed Khmer muəy '1', a numeral which shows aberrant behavior in many languages. In Khmer it has cliticized to many nouns, for example mdɑɑŋ 'once' from dɑɑŋ 'a time'. The numeral in its full form muəy has cliticized to the copular verb ciə to form ciəmuəy, which is a verb meaning 'to accompany' or a preposition 'with'.

If we assume that the fixed numeral > classifier order is a result of some more general aspect of information structure specifying the linear order syntax in that constituent, then all of the attested word orders conform to our expectations.

Conclusions

I have tried to show that classifier constructions should

·         be considered as paired N-bar constituents in the syntax

·         involve both quantification and classification in the logico-semantics

· &nb