Southeast Asian Classifiers
Eric Schiller
Linguistics Unlimited
Southeast Asian languages typically use a combination of numeral plus classifier in
quantified noun phrases. The classifier is usually a noun (compare English loaf in a loaf of bread) and is often obligatory. The paper discusses the
syntax, semantics and morphology of the classifiers, as well as the
distribution of the constructions in the languages of the area. The strong
areal influence overrides other aspects of the grammar. Formally, the
constituent containing the numeral and the classifier will be established as an
N-bar, with either element eligible to act as head.
This is a draft of a paper which will be published in due course. It
represents the paper as delivered at the 9th meeting of the
Southeast Asian Linguistic Society (SEALS). It has been adapted for web
presentation in a number of ways, though no content has been changed. Color has
been added and the pages have been resized for more convenient onscreen
viewing, and references and footnotes
have been hyperlinked. Until the paper is in the final publication form,
citations should refer to this document. Before citing, please contact
linguist@chessworks.com to find out if the final version is available for
citation.
Version 1.0, 20 October 1999
Note: IPA characters may not display correctly in all browsers. The font is
Lucida Sans Unicode, and any Unicode font on your system should work. For some
reason the script a (ɑ) and the eng (ŋ) have some problems. I am working on a
PDF version to solve these problems.
This paper is an update
of my earlier autolexical investigations into numeral-classifier constructions
in Southeast Asian languages, in which I hope to redeem outstanding promissory notes
and earn the right to issue more*. I will
also enhance a few features of the analysis (or eliminate the bugs, if you
prefer). Acting on the assumption that all of you are familiar with the basic
use of numerals in noun phrases, I'll be surveying the syntactic structures
cross-linguistically. We'll also take a look at the formal semantic structure
and see how well it matches with the syntax. Specifically, I hope to lead you
to the following conclusions.
(1) The numeral + classifier construction should
I. be considered as an N-bar constituent in the syntax
II. involve both quantification and classification in the logico-semantics
III. conform to head-modifier relations where possible
IV. be recognized as strongly subject to areal influences
V. perhaps be considered double-headed in the syntax
The first should be
non-controversial, but most of the grammars I referenced speak only of
positioning of the numeral "between" the noun and the classifier,
without noting that the numeral is attached to the classifier by some very
strong cords. The second also seems fairly obvious, but as Lehman (1990)
reminds us, there are other views, though I think he does a good job of
breaking some of them down. The third observation is at the heart of the
theoretical part of the paper, but discussion there will be mercifully brief
because it involves the fundamental principle of autolexical theory. Anyone
familiar with the languages of Southeast Asia can readily spot the pattern
using the mighty Mekong as a guide. Yet the areal influence can affect syntax
in ways that some theories might find annoying, so it is best to establish it
properly. Finally, I tentatively suggest that in the syntax, the numeral and
classifier join together to form an N-bar, the intervening level between noun
and noun phrase. This is not a theory-internal trivial matter. I believe that
it explains quite a bit and allows a parallel to be drawn with serialized verb
phrases.
Along the way we'll
visit some non-garden variety classifier structures in languages covering most
of the Southeast Asian language area. As a member of Haj Ross's DFW (data
fanatics worldwide or data fetishists worldwide), I am obliged to do so.
In Southeast Asian
languages, classifiers are constituents of the noun phrase and, unlike their
English counterparts, do not require, or allow, additional morphemes within the construction.
Syntactically, the noun
phrases under discussion will have a noun (N), optional modifier of the head
(M), Numeral (#), Classifier (C) and Demonstrative (D). For the moment, the
symbols used have no theory-specific meaning and can be taken as informal
notational devices. Although I am discussing constructions with numerals, other
quantifiers may be used, such as 'many', 'some', 'all' and 'few', but there is
a great deal of cross-linguistic variety there so I will confine myself to
canonical classifier expressions. The following word orders are attested:[1]
(2) Attested orders for noun, adjective, classifier, demonstrative
| NA#CD |
| D#CAN |
| #CNAD |
| DNA#C |
We can immediately draw two useful conclusions.
· The numeral always precedes the classifier¹
· Noun modifiers are always adjacent to the noun
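Both conclusions can be checked mechanically against the four orders in (2). The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it assumes the stronger reading that the numeral stands immediately before the classifier, which is what the listed orders show.

```python
# Orders listed in (2): N = noun, A = adjective (noun modifier),
# # = numeral, C = classifier, D = demonstrative.
ATTESTED_ORDERS = ["NA#CD", "D#CAN", "#CNAD", "DNA#C"]


def numeral_immediately_precedes_classifier(order: str) -> bool:
    """True if '#' is directly followed by 'C' in the order string."""
    return "#C" in order


def modifier_adjacent_to_noun(order: str) -> bool:
    """True if 'A' stands next to 'N', on either side."""
    return "NA" in order or "AN" in order


def check_orders(orders):
    """Map each order to (numeral-before-classifier, modifier-adjacent-to-noun)."""
    return {
        order: (
            numeral_immediately_precedes_classifier(order),
            modifier_adjacent_to_noun(order),
        )
        for order in orders
    }


if __name__ == "__main__":
    for order, result in check_orders(ATTESTED_ORDERS).items():
        print(order, result)
```

Every order in (2) passes both checks, while a constructed order such as NC#AD (not attested, used here only as a negative control) fails both.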
In Burmese, Okell gives
some examples of round numbers where the numeral follows the classifier. I
believe examples such as Burmese (c) can be analyzed as having the rounded
number act as a classifier, i.e., two
tens of yoke of oxen. Note the parallel to Burmese (d). We will return to these
examples later.
So, we can simplify the
structure by moving up a level, where # and C form a constituent, which we'll call CL
for the moment. The noun+modifier structure is N1, a conventional N-bar. I know
that our community follows many different syntactic teachings, but I hope that
the motivation for an intermediate level between that of a noun and a full
noun phrase is universally accepted, and I refer skeptics to the work of our late
colleague Jim McCawley (1988).
(3) Attested orders for n-bar, numeral+classifier, demonstrative
| N1 CL D |
| D CL N1 |
| CL N1 D |
| D N1 CL |
I know of no example of
CL D N1, and the Lahu example in the data appendix shows a particle /ve/ which
I think makes it more like the English construction. Jim Matisoff (1972) had mentioned the similarity, but I
had overlooked that until recently.
There is no common
ordering relation between the constituents. We can try correlating the orders
with other aspects of the grammar, such as the order of verb and object, often
taken as some sort of underlying parameter with great explanatory power.
(4) Verb-object order and classifier construction
| Order | Languages | Verb-object order |
| CL N1 D | Hmong, Yay, Vietnamese, some Mon-Khmer | VO |
| D CL N1 | Cantonese, Mien, Mandarin | VO |
| D N1 CL | Burmese, Lahu | OV |
| N1 CL D | Thai, most Mon-Khmer | VO |
So, we cannot correlate
the verb-object order with the order of NP constituents. We'll return to this
topic later in the discussion of Hawkins's implicational universals.
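The lack of correlation in (4) can be made explicit by grouping the noun-phrase orders by verb-object type. This is a minimal sketch over the rows of table (4) only; the function names are hypothetical, and a sample this small shows only that VO order does not determine NP order in these data.

```python
from collections import defaultdict

# Rows of table (4): (classifier-construction order, languages, verb-object order).
TABLE_4 = [
    ("CL N1 D", "Hmong, Yay, Vietnamese, some Mon-Khmer", "VO"),
    ("D CL N1", "Cantonese, Mien, Mandarin", "VO"),
    ("D N1 CL", "Burmese, Lahu", "OV"),
    ("N1 CL D", "Thai, most Mon-Khmer", "VO"),
]


def orders_by_verb_object(rows):
    """Group the noun-phrase orders by verb-object type."""
    grouped = defaultdict(set)
    for np_order, _languages, verb_object in rows:
        grouped[verb_object].add(np_order)
    return dict(grouped)


def verb_object_predicts_np_order(rows) -> bool:
    """True only if every verb-object type maps to exactly one NP order."""
    return all(len(orders) == 1 for orders in orders_by_verb_object(rows).values())


if __name__ == "__main__":
    print(orders_by_verb_object(TABLE_4))
    print(verb_object_predicts_np_order(TABLE_4))
```

In the table, VO languages appear with three distinct NP orders, so the predicate comes out false.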
The figures in (5)
represent the constituent structure of the syntax of noun phrases which contain
classifiers. For the moment, we will stick with the notional categories as
terminal nodes so that we can look at the constituent structure from a
theory-neutral view.
(5) Syntactic structures of classified noun phrases
The first three are
straightforward enough. The noun-modifier and classifier constructs are each
N1. They join to form a larger N1 constituent, which is modified by a
demonstrative (D) which functions as a syntactic
operator converting the N1 into a full noun phrase (N2). Burmese, the final
example, has the demonstrative in the first position, to the left of the head
N-bar instead of the right. The dominance relations are the same as those of the
Khmer type, representing what we might take to be the normal or typical
structure. As we saw earlier, there are some seemingly anomalous Burmese
constructions, but you'll have to wait a bit longer before we get to those.
We have so far been
treating each element as belonging to a distinct category. However, we have
assumed that the structure containing the numeral and classifier is an N-bar.
If so, shouldn't it have a noun as head? Most classifiers are nouns, so it
seems reasonable to assign that element a head feature. On the other hand,
numerals also look a lot like nouns, and are candidates for head feature N.
If one has to choose a head, one might well agree with Lehman (1990) that the
numeral is a better candidate. He goes on to suggest that the classifier may be
cliticized to the numeral, a proposal which is not without supporting evidence.
The unit of numeral + classifier does seem to be indivisible.
In fact, actual
cliticization takes place in Khmer, but clearly the numeral is the affix and the
classifier is the head: mneak = muəy + neak 'one person'. More problematically, classifiers
are an open class in most languages, and clitics tend to be members of a closed
class. An additional objection can be raised, namely that the numeral can be
deleted in some languages (Thai, Vietnamese, Cantonese, citations in Gil 1994).
I suggest the
Beeblebroxian solution: treat both the numeral and the classifiers as noun
heads.[2]
There is nothing in autolexical x-bar theory (Schiller & Need 1992) to
prevent this, and it fits the default associations which state that in the
absence of any motivation to be otherwise, numerals are nouns in the syntax and
the morphology and quantifiers in the semantics. Classifiers are generally morphological
nouns with semantics we'll get to later.
The two are tightly
bound as a syntactic N-bar, are closely linked in the semantic representation,
and show strong morphosyntactic bonding. The cliticization in Khmer, albeit
limited to the numeral 1 (cf. English another
but not *twother, *forother, etc.)
does show a strong, even impenetrable bond between the numeral and the
classifier. Fuller (1988) pointed out that tone sandhi applies in the
numeral-classifier construction in White Hmong, more evidence of the close
relationship.
On the other hand, it is
important to note that the classifier N-bar and the head N-bar are not
indivisible. An intervening demonstrative is possible and is seen in the Lahu
example, but with a complementizing particle. Lehman cites the Thai examples
with complementizer, of the type seen in Thai example (b). I'm not going to
deal with relative clauses here but will issue the usual promissory note to be
redeemed, perhaps, at some future gathering of the SEALS.
In Thai, noun phrases
with heads, modifiers, demonstratives and quantifiers are superficially found
in three orders:
(6) Head-modifier-quantifier-demonstrative orders in Thai (Panupong 1970)
H > M > Q > D
H > M > D > Q
H > Q > M > D
This requires further
investigation. (7-9) are from Panupong (glosses added):
(7) H > M > Q > D
tôn-má:y yày sìp tôn ní:
tree big ten CLF these
'these ten big trees'
This conforms to the
normal N1-CL-D areal pattern.
(8) H > M > D > Q
phucha:y ʔuən thî kamlaŋ dɯ̀:m biə yù: sɔ:ŋ khon
man fat WH PROG drink beer two CLF
'two fat men who are drinking beer'
We have a relative
clause structure, which I take to be attached to a node which is not under
discussion here.
(9) H > Q > M > D
nɯ́ə sɔ̌:ŋ chin yày nî:
meat two piece big these
'these two big pieces of meat'
The semantic structure
of (9) clearly involves big modifying
pieces and not meat. In the default case, we expect the syntax to parallel the
semantics. We should revise my initial structures to add an additional N-bar
node dominating just the classifier and its modifier. The syntactic structure
is shown in (10).
(10)
If this structure is
correct, it requires us to acknowledge that the complement of the numeral is an
N-bar, not a simple N. Lehman's cliticization account would now need to be
expanded to have N-bar clitics, unless, of course, the classifier is the head
and the numeral is the clitic.[3]
Summing up the syntax of
Thai, we can suggest the basic noun phrase rule is a set of N-bars followed by
a demonstrative. This fits nicely with the observation that Thai serializes
verb phrases. It might not be unreasonable to propose a rule that any
constituent with a bar level of 1 may be serialized.
The classified noun
phrases we have been looking at contain 5 elements organized as in (11).
(11)
This can also be
expressed as in (12).
(12)
Expression: (Quantified entity (entity (entity, property)) (quantifier (quantifier, counter))) (Deictic binder)
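The bracketing in (12) can be written out as a nested data structure. The following is a minimal sketch with hypothetical class and field names, filled in with the glosses of Thai example (7). The field order is only a notational convenience: the semantic representation is hierarchical, not linear.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity: str  # e.g. 'tree'
    prop: str    # property of the entity, e.g. 'big'


@dataclass(frozen=True)
class Quantifier:
    quantifier: str  # e.g. 'ten'
    counter: str     # contribution of the classifier


@dataclass(frozen=True)
class QuantifiedEntity:
    entity: Entity
    quantifier: Quantifier


@dataclass(frozen=True)
class Expression:
    quantified: QuantifiedEntity
    deictic_binder: str  # e.g. 'these'


def bracket(expr: Expression) -> str:
    """Render the structure as a labelled bracketing in the style of (12)."""
    q = expr.quantified
    return (
        f"(({q.entity.entity}, {q.entity.prop}) "
        f"({q.quantifier.quantifier}, {q.quantifier.counter})) "
        f"({expr.deictic_binder})"
    )


# Glosses of Thai example (7): tree big ten CLF these.
EXAMPLE_7 = Expression(
    QuantifiedEntity(Entity("tree", "big"), Quantifier("ten", "CLF")),
    "these",
)

if __name__ == "__main__":
    print(bracket(EXAMPLE_7))
```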
What about two big pieces of meat? Or even two big pieces of red meat? As the
English version suggests (I am assuming that the semantic representations among
languages are structurally the same), we can modify our semantic structure by
allowing the counter to consist of an entity and a property, as in (13).
(13) Expanded semantic representation
It happens that both
nouns and adjectives are found as classifiers, which is hardly surprising. Both
categories are used to express properties, e.g., dog, tired: tired dog | dog-tired. Although dog is an entity, entities have properties, and these can be used
individually. The structure contains two property slots, and I believe this can
help explain the Burmese examples (c) and (d), shown here as (14).
(14) Some anomalous Burmese classifier constructions
nwà ăhyìñ hnă hse
ox yoke two ten
'twenty yoke of oxen'
daʔhsi biyapăliñ hnă loùñ
petrol beer-bottle 2 round-thing(CL)
'two beer-bottles-ful of petrol'
Numerals can be used as
group nouns, cf. English 'tens', 'dozens'.
Under the analysis given here, the head N-bar would contain two nouns,
and the second two words are the classifier N-bar. I don't have relevant data
on combining these constructions with additional adjectives, though I hope
someone here can perhaps provide them.
The lexicon contains
information for each of the hierarchies on the various dimensions. The chart
(15) presents relevant grammatical aspects of the lexical entries for five
elements which are typically found in a noun phrase.
(15) Partial lexical entries for elements of classified noun phrases
| Item | Semantic | Syntax | Morphology |
| Head noun | entity k | noun N0 [+head] | nominal nml |
| Adjective | property f | modifier [N0→N1] | adjectival adj |
| Numeral | quantifier q | modifier [N0→N1] | varies |
| Classifier | counter c [+count] | noun N0 | nominal nml |
| Demonstrative | demonstrative D index(k, x) [Deixis {values...}] [Plurality {sng, plu}] | syncat [N1→N2] | inert |
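The entries in (15) lend themselves to a simple record structure with one field per dimension. This is a minimal sketch with hypothetical names. It assumes that the adjective and the numeral share the syntactic category [N0 -> N1], as the prose states, and it abbreviates the demonstrative's semantics to its indexing function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    item: str
    semantics: str
    syntax: str
    morphology: str


# One entry per row of (15). The arrow notation is written in ASCII.
LEXICON = {
    "head noun": LexicalEntry("head noun", "entity", "noun N0 [+head]", "nominal"),
    "adjective": LexicalEntry("adjective", "property", "modifier [N0 -> N1]", "adjectival"),
    "numeral": LexicalEntry("numeral", "quantifier", "modifier [N0 -> N1]", "varies"),
    "classifier": LexicalEntry("classifier", "counter [+count]", "noun N0", "nominal"),
    "demonstrative": LexicalEntry("demonstrative", "index(k, x)", "syncat [N1 -> N2]", "inert"),
}


def same_syntax(first: str, second: str) -> bool:
    """True if two items carry the same syntactic specification."""
    return LEXICON[first].syntax == LEXICON[second].syntax


if __name__ == "__main__":
    print(same_syntax("adjective", "numeral"))
    print(same_syntax("head noun", "classifier"))
```

Under this encoding the adjective and numeral compare as equal in the syntax, while the head noun and classifier differ only in the [+head] feature.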
· The head noun is normally an entity which occupies a head noun position and is
morphologically marked as nominal.
· The adjective is usually a property, has a syntactic function of combining with
nouns to form n-bars, and is marked morphologically as an adjectival form.
· The numeral is a quantifier with the same syntactic category as the adjective,
and is morphologically varied.
· Classifiers are counters with noun syntax and tend to have nominal morphology, as
supported by Khmer evidence. Khmer uses the nominalized form sɔnlək as a classifier, not the base form slək.
· Demonstratives include a semantic function that assigns an index (binds a
variable) to some appropriate entry in the context register. They are generally
marked for deixis and plurality. Syntactically they combine with n-bars to form
complete noun phrases, and tend to be morphologically inert.
Reconciling syntax and semantics
Keeping in mind that our
semantic representation is not ordered linearly, most of the languages are
harmonic, showing a similar constituent structure.
(16)
The representation in
(16) shows a completely harmonic relationship between syntax and semantics,
with both linear order and configuration mirrored across the interface. We find
this structure in many Mon-Khmer languages. In these languages, the
Head-Modifier precedence rule is Head > Modifier, and it is seen with great
consistency. The languages are Verb > Object languages. For a typical Object
> Verb language, we would expect the linear order to be reversed as in (17).
However, this is not the case. The numeral always precedes the classifier,
regardless of word order.
(17)
The best the language
can do is the structure shown in (18).
(18)
This follows the
Generalized Interface Principle (Sadock & Schiller 1993), creating the best
match given the linear precedence rule Num > Clf. There is no interface
violation, because there is no linear precedence in semantics, only
hierarchical structure. So the quantifier is unordered with respect to the
counter, a sister node. Mien and Mandarin show this order.
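The reasoning can be illustrated with a toy linearization. This is a minimal sketch and not an implementation of the Generalized Interface Principle. It assumes that the expected order in (17) is the mirror image of the harmonic Head > Modifier order, and that (18) differs from it only in restoring Num > Clf. The function names are hypothetical.

```python
# Harmonic Head > Modifier order, as in (16): noun, adjective, numeral,
# classifier, demonstrative.
HARMONIC_HEAD_FIRST = ["N", "A", "#", "C", "D"]


def mirror(order):
    """Reverse the linear order, as expected under Modifier > Head."""
    return list(reversed(order))


def enforce_numeral_before_classifier(order):
    """Swap an adjacent 'C #' pair into '# C', leaving everything else in place."""
    result = list(order)
    for i in range(len(result) - 1):
        if result[i] == "C" and result[i + 1] == "#":
            result[i], result[i + 1] = result[i + 1], result[i]
    return result


def best_match(order):
    """Mirror the harmonic order, then apply the fixed rule Num > Clf."""
    return enforce_numeral_before_classifier(mirror(order))


if __name__ == "__main__":
    print("".join(mirror(HARMONIC_HEAD_FIRST)))
    print("".join(best_match(HARMONIC_HEAD_FIRST)))
```

The plain mirror image yields DC#AN, which is not among the orders in (2). Restoring Num > Clf yields D#CAN, which is attested and corresponds to the D CL N1 pattern.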
Summing up the formal
syntax and semantics, we can say that a classifier language requires an N-bar
in the syntax corresponding to the semantic structure containing the quantifier
plus a property which defines the units which are being counted. As is well
known, different classifiers can be used based on semantic aspects of the head
noun, and generic classifiers exist in some languages.
Generic classifiers come
in many forms. There is often an animacy distinction, as in the Mien example,
where the classifier for higher beings is used with respect to people. In this
case, the semantics are bleached, and the contribution to the real-world
semantics is minimized. The generic classifier exists so that the syntactic
requirement that a numeral be followed by a classifier is satisfied. This is
reminiscent of 'it' in English 'It's raining'. Semantically the generic classifier marks the
item as countable, but does not supply any property.
Bradley (1979) notes
that the Lahu data suggest an areal influence: "Quantification required
both numeral and classifier in that order, following the head noun. This Common
Lahu construction is common to nearly all Southeast Asian languages, but it is
not found in all TB languages; thus it seems to be an areal development."
Thomas (1971) also noted an important areal pattern: "In Chinese and in
most of the Vietnam languages including Chrau, the classifier is placed between
the numeral and the noun. In other Southeast Asian languages it is not uncommon
for the order to be noun-numeral classifier." Though these remarks do not
take into account the constituent structure of the two n-bars, the ordering of
the units is correctly identified.
Quantified noun phrases
play an important role in commerce, and that may play a role in maintaining a
stable word order with numeral preceding classifier. China and Vietnam seem to
have a dominant order with classifier n-bar before head n-bar. Cambodia,
Thailand and Burma have the head n-bar first, then the classifier. If the
ordering of n-bar constituents has been easily affected by areal influence,
then the strength of any putative word order universals must be questioned.
Of course the word
"areal" begs the question: when? After all, Southeast Asian peoples
tend to move around a lot. A simplified
map of the region has one very prominent attribute: the Mekong River. Although
there are many exceptions, no doubt the result of various migrations, as a
general rule the languages with numeral-classifier N-bar preceding the head
noun are on the east side of the Mekong, and those on the west show the opposite
order. This clearly cuts across language families, as the Mon-Khmer (Khmer |
Chrau), Tai-Kadai (Thai | Yay) and Sino-Tibetan (Burmese | Mandarin) families
show both orders.
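The cross-family claim can be checked with a small lookup. This is a minimal sketch with hypothetical names. The position assigned to each language is an assumption read off table (4), the Thomas (1971) quotation for Chrau, and the country-level generalization above, not independent data.

```python
# (family, language, position of the numeral-classifier N-bar relative to
# the head N-bar). Positions are assumptions drawn from the text.
LANGUAGES = [
    ("Mon-Khmer", "Khmer", "after"),
    ("Mon-Khmer", "Chrau", "before"),
    ("Tai-Kadai", "Thai", "after"),
    ("Tai-Kadai", "Yay", "before"),
    ("Sino-Tibetan", "Burmese", "after"),
    ("Sino-Tibetan", "Mandarin", "before"),
]


def positions_per_family(rows):
    """Collect the set of classifier positions found in each family."""
    result = {}
    for family, _language, position in rows:
        result.setdefault(family, set()).add(position)
    return result


def families_with_both_orders(rows):
    """List the families in which both positions are found."""
    return sorted(
        family
        for family, positions in positions_per_family(rows).items()
        if positions == {"before", "after"}
    )


if __name__ == "__main__":
    print(families_with_both_orders(LANGUAGES))
```

All three families come out with both orders, which is the sense in which the pattern cuts across genetic groupings.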
The Mon-Khmer languages
seem to have a fairly loose attachment to classifiers, and seem to have been
affected by their proximity to either the Western or Eastern types of
classifier (if I may be allowed the luxury of such an oversimplification). The
use of classifiers is not grammatically obligatory in Mon, Khmer and many other
Mon-Khmer languages. This calls into question Lehman's 1990 link between
morphological properties and the use of classifiers. There is no discernable
morphological difference between the classifier-rich languages of Southeast
Asia and those which have less elaborate or minimal use of classifiers. If the
link were real, we'd also find many pidgin and creole languages with classifier
syntax, but we do not.
To account for the
variety of noun phrase internal word order, Hawkins (1988) proposed a
Prepositional Noun Modifier Hierarchy (19)
(19) Prep ⊃ ((NDem ∨ NNum ⊃ NA) & (NA ⊃ NG) & (NG ⊃ NRel))
As this paper is not
concerned with relative clauses and genitive constructions, we are only
concerned with the first part, which states that in a prepositional language where
the head noun precedes the demonstrative or the numeral, it will also precede
the adjective. In other words, if the
head-modifier relationship is harmonic with regard to demonstratives and
numerals, it is also going to hold for adjectival modifiers. He notes (p.75)
that the demonstrative and numeral are "more unstable than the genitive
and the relative clause, and the adjective is more unstable than the relative
clause." He later (p.295) suggests that the cross-linguistic study of noun
phrases "can reveal the nature of the interaction between explanatory
principles in universal grammar." Hawkins found existing x-bar syntax
inadequate for the explanatory purposes. It should be noted, however, that
autolexical syntax has implemented the x-bar wish list presented in his book
(Hawkins 1988:201).
I think that it is
really the interaction not of powerful principles, but rather of simple
hierarchies that provides the variety of languages we so enjoy studying. The
Generalized Interface Principle of autolexical grammar (Sadock & Schiller
1993) provides a simpler account of the data included in Hawkins's sample. As an
implicational universal, we would suggest the old, simple, strong form:
(20) Syntactic
order of elements should follow a consistent head-modifier order.
Note the presence of the
word should. In an artificial
language, it might be possible to keep everything lined up neatly. Natural
languages, however, have far too many conflicting demands. Phonological changes
may turn words into clitics. Even
numerals can become wedded to their hosts. In Schiller (1992), I
discussed the Khmer muəy '1', a numeral which shows aberrant behavior in many
languages. In Khmer it has cliticized to many nouns, for example mdɑɑŋ 'once' from dɑɑŋ 'a time'. The numeral in its full form, muəy, has cliticized to the copular verb ciə to form ciəmuəy, which is a verb meaning 'to accompany' or a
preposition 'with'.
If we assume that the
fixed numeral > classifier order is a result of some more general aspect of
information structure specifying the linear order syntax in that constituent,
then all of the attested word orders conform to our expectations.
I have tried to show
that classifier constructions should
· be considered as paired N-bar constituents in the syntax
· involve both quantification and classification in the logico-semantics