Sociolinguistic Perspectives

OXFORD STUDIES IN SOCIOLINGUISTICS Edward Finegan, General Editor EDITORIAL ADVISORY BOARD Douglas Biber Alessandro Duranti John R. Rickford Suzanne Romaine Deborah Tannen




Simply put, sociolinguistics is the study of language in social use. Its special focus is on the relationships between language and society, and its principal concerns address the form and function of linguistic variation across social groups and across the range of communicative situations in which women and men deploy their verbal repertoires. In short, sociolinguists examine discourse as it is constructed and co-constructed, shaped and reshaped, in the interactions of everyday life, and as it reflects and creates the social realities of that life. While some linguists study the structure of sentences independent of who is speaking or writing and to whom, independent of what precedes and what follows in a discourse, and even independent of setting, topic, and purpose, sociolinguists investigate linguistic expression precisely as it is embedded in its social and situational contexts. In other words, sociolinguists analyze language as it functions in the workday lives of the people using it. Among language observers who are not professional linguists, almost all interest in linguistic matters likewise focuses on language in use, for only in use can the patterns of language variation be seen to reflect the intricacies of social structure and to mirror the situational and strategic influences that shape human discourse.

In offering a platform for studies of language use in communities around the globe, Oxford Studies in Sociolinguistics invites significant synchronic or diachronic treatments of discourse and of social dialects and registers, whether oral, written, or signed. The series is host to studies that are descriptive or theoretical, interpretive or analytical. While volumes in the series usually report original research, an occasional one synthesizes or interprets existing knowledge. Occasionally, as with the present volume, a collection of valuable and influential papers by a distinguished linguist may appear in the series.
While the series aims for a style that is accessible beyond linguists to other humanists and social scientists, some volumes will hold appeal for students and other readers keenly interested in the language of human affairs: for example, in the discourse of doctors or lawyers engaging their clients and one another with their specialist registers, or of women and men striving to fathom the sometimes baffling character of their shared interactions. By providing a forum for innovative and valuable studies of language in use, Oxford Studies in Sociolinguistics aims to influence the agenda for linguistic research in the twenty-first century and, meanwhile, to provide an array of insightful and provocative analyses to help launch that agenda.

The present volume contains two dozen of Charles A. Ferguson's insightful and valuable sociolinguistics papers, including some classics. In editing the volume, Thom Huebner has worked closely with Charles Ferguson in selecting papers and particularly in providing a framework for understanding them in the contexts of their original composition. Anyone interested in the richness of language variation in social life will find in this collection some of the most thoughtful sociolinguistic writing of the past four decades. We are pleased to have these papers as the latest contribution to Oxford Studies in Sociolinguistics.





I first met Charles A. Ferguson some ten years ago during an extended stint as a visiting assistant professor at Stanford University, from 1984 until 1988. My initial intimidation by his scholarly reputation soon gave way to the most deep-seated admiration for him, both as a scholar and as a human being. Sitting on dissertation committees with him, I never ceased to be amazed at his breadth and depth of knowledge about language and languages. Interacting with presenters at conferences or other formal presentations of academic papers, he would gently, through tactful, self-effacing questioning, lead the presenter and the rest of us in the audience to new insights and implications for broader issues of language structure and use. I, along with hundreds of his students, have had the privilege of getting to know a man of impeccable academic rigor, professional integrity, and human kindness.

My first professional collaboration with Ferguson was in 1988, when he asked me to coauthor an article with him for a conference on foreign language research cosponsored by the National Foreign Language Center, the Rockefeller Foundation, and the European Cultural Foundation held in Bellagio, Italy. Shortly afterward, I was both honored and humbled when he suggested that we organize a conference on Second Language Acquisition and Linguistic Theories at the 1989 LSA Linguistic Institute at Stanford. Work on publication of the proceedings of that conference afforded me an opportunity to continue my tutelage in language and its structure, use, and acquisition. It was during that time that I suggested that he consider compiling another volume of his collected works. It was several years, however, before he could free up his own calendar of scholarly commitments to consider the task. And a task it was. The sheer volume of his work made selection for such a compilation extremely difficult. We met, on average, once a week at his home for well over one year.
My role was to read and synthesize his work and to write overview sections for each group of papers selected. His role was to provide the context in which each was created and to educate me in the many areas of sociolinguistics in which, unhappily, I was less than totally conversant. That opportunity has been a highlight of my own professional experiences and development. So, first and foremost, I must extend my deepest gratitude to Professor Ferguson for that opportunity and for the hospitality he so graciously extended to me during that time. Whatever inadequacies and downright inaccuracies may exist in the overview sections are clearly the fault of a less than stellar student and are no reflection on the teacher.

I would also like to thank Peter Lowenberg, who sat in on many of the sessions, for the insights he provided on those areas of sociolinguistics with which I was less familiar. Professor Braj Kachru initially provided feedback on the selection of papers and offered support throughout the editing process. Professors John Rickford and Joshua Fishman read early drafts of the overview sections and provided important critical commentary. They, too, are to be exonerated for any inaccuracies. The School of Humanities and the Arts at San Jose State University, through Dean John Crane, provided a small research grant to help with the editing of the volume. Melissa Groo provided the editorial assistance. I would also like to thank the editors of this series for their encouragement and help. Finally, special thanks are due to Manito Regio and Patricia Schmitt for physical therapy and other assistance rendered to Professor Ferguson during the final stages of the preparation of this volume, while he was still recovering from a stroke he experienced in May 1992.

San Francisco, July 1994




T. H.


Introduction

Part I: Speech Communities and Language Situations
1. Diglossia (1959)
2. Language Development (1968)
3. The Role of Arabic in Ethiopia: A Sociolinguistic Perspective (1970)
4. Religious Factors in Language Spread (1982)
5. Literacy in a Hunting-Gathering Society: The Case of the Diyari (1987)
6. South Asia as a Sociolinguistic Area (1991)

Part II: Register and Genre
7. Baby Talk in Six Languages (1964)
8. Absence of Copula and the Notion of Simplicity: A Study of Normal Speech, Baby Talk, Foreigner Talk and Pidgins (1971)
9. The Collect as a Form of Discourse (1976)
10. The Structure and Use of Politeness Formulas (1976)
11. Sports Announcer Talk: Syntactic Aspects of Register Variation (1983)
12. Genre and Register: One Path to Discourse Analysis (1985)

Part III: Variation and Change
13. The Sociolinguistic Variable (s) in Bengali: A Sound Change in Progress? (with Afia Dil) (1979)
14. Standardization as a Form of Language Spread (1987)
15. From ESSES to AITCHES: Identifying Pathways of Diachronic Change (1990)
16. Then They Could Read and Write (1990)
17. Individual and Social in Language Change: Diachronic Changes in Politeness Agreement in Forms of Address (1991)
18. Variation and Drift: Loss of Agreement in Germanic (1991)

Part IV: Language Planning
19. The Language Factor in National Development (1962)
20. On Sociolinguistically Oriented Language Surveys (1966)
21. Sociolinguistic Settings of Language Planning (1977)
22. National Attitudes Toward Language Planning (1979)
23. Conventional Conventionalization: Planned Change in Language (1987)
24. Language and National Development (1988)

Ferguson Bibliography, 1985-1995
Names Index
Topics Index
Languages Index

Sociolinguistic Perspectives


Few linguists in the world today have not been influenced in some way by the work of Charles A. Ferguson. In areas of study ranging from Arabic linguistics to applied linguistics, from child language acquisition to language planning, from language and religion to language universals, from Bengali syntax to American sports announcer talk, seminal papers bear his authorship. He has held academic appointments at universities on five continents (North America, Asia, Europe, Australia, and South America). A festschrift on the occasion of his sixty-fifth birthday (Fishman et al. 1986) had editors from five continents, and as he has often pointed out with no small degree of satisfaction, from four different religious traditions. Regardless of the direction his wide interest in language has taken him, his work has been characterized by uncommon diligence, breadth of knowledge, and intellectual integrity. And although he has never adhered steadfastly to a single theoretical framework, he consistently reminds us of the facts of language structure and use that a theory of linguistics must account for. This latter emphasis is especially true of Ferguson's contributions to sociolinguistics, which has carried his undeniable influence from its inception as a (sub)discipline.

Sociolinguistic Beginnings: Ferguson's Role

The inception, development, and institutionalization of a field of inquiry such as sociolinguistics involve the adoption of a label to set it apart from related fields, the publication of a body of literature articulating its assumptions, goals, and methodologies, the design of courses to transmit this body of information to subsequent generations, and the establishment of organizations and conferences to facilitate communication among its adherents. Sociolinguistics in North America can trace its roots as far back as the larger discipline of linguistics on the continent. Although in Europe, linguists such as Meillet (1926), Firth (cf. 1957), Cohen (1956), and Sommerfelt (1938) never abandoned their interest in meaning and social context, the school of linguistics that eventually dominated North American linguistics, following the formation of the Linguistic Society of America in 1924 and publication of its journal Language in 1925, was Bloomfield's brand of structuralism, with its emphasis on investigating form apart from meaning and on the "discovery procedures" needed to do so. Edward Sapir, influenced by earlier work of Franz Boas, was the dominant foil to this approach, and a major influence on both the development of the field of sociolinguistics and the papers in this collection.



The term socio-linguistics can be traced at least back to a paper presented in 1949, and published three years later (Currie 1952; cf. Currie 1984). Subsequent use of the term, with and without the hyphen, is found in a number of books (e.g., Weinreich 1953, 99) and articles, notably in Word. Word, the journal of the Linguistic Circle of New York, edited by Uriel Weinreich, became in the decade of the 1950s an important outlet for scholarship on the relationship between language and society, and in 1959 was the original venue for Ferguson's often cited and reprinted article "Diglossia."

Several important collections of readings also contributed to the development of the field. With the possible exception of Hoijer (1954; published through the American Anthropological Association), these did not begin to appear until Hymes's 1964 Language in Culture and Society, a classic containing contributions from every major figure in the field at the time. Fishman's Readings in the Sociology of Language, similarly broad in its range of contributors and approaches, appeared in 1968. Giglioli's Language and Social Context, still in print and widely used, was originally published in 1972. The Hymes and Giglioli volumes both included Ferguson's "Diglossia"; the Fishman volume included Ferguson's "Myths about Arabic."

Courses are another indicator of the institutionalization of a (sub)discipline. In the spring of 1966, Fishman and Vera John offered a course at Yeshiva University called "Language and Behavior II," which listed as required texts Gumperz and Hymes 1964 and Herzler 1965, with Hymes 1964 and Jespersen 1964 (originally published in 1946) listed as "Other Useful Texts." At that summer's LSA Linguistic Institute in Los Angeles, Labov taught "Introduction to Sociolinguistics," reflecting a heavy emphasis on the contributions of the field to grammatical theory and linguistic diachrony.
In the fall of 1964, Kachru offered a course by the same title at the University of Illinois, which included as texts Firth 1957, Hoijer 1954, Hymes 1964, and Jespersen 1946. But two years before that, at the 1962 LSA Linguistic Institute at the University of Washington, Ferguson taught "Linguistics 580L: Sociolinguistics," a course he was to repeat the following summer at Washington and in the 1965-66 academic year at Georgetown. Readings for the original version of the course included Cohen 1956, Ferguson and Gumperz 1960, Guxman 1960, Kloss 1952, Meillet 1928, and Weinreich 1953. This may have been the first course taught in North America entitled simply "Sociolinguistics."

The emergence of a field can also be measured by organizations and conferences whose purpose is to further the goals of the discipline. In this respect, too, Ferguson has played a central role. As founder and director of the Center for Applied Linguistics from 1959 until 1966, when he accepted the first full-time appointment in Linguistics at Stanford University, Ferguson was instrumental in bringing together people from a variety of disciplinary backgrounds and philosophical points of view who shared an interest in the study of language within its social context. In this role, Ferguson was appointed chair of the newly formed Committee on Sociolinguistics of the Social Science Research Council in 1963. Other charter members of the committee were Joseph Greenberg, Everett Hughes, Thomas Sebeok, and John Useem. The following year, Joshua Fishman, Einar Haugen, Dell Hymes, Nathan Keyfitz, and Stanley Lieberson were included in its membership, along with Ferguson, Hughes, and Useem. Ferguson remained a member during each year of the committee's existence.

In the ten years following its founding, the Committee on Sociolinguistics sponsored a number of notable activities:

• 1966 Conference on Language Problems of Developing Nations (Fishman, Ferguson, and Das Gupta 1968).
• 1966 workshop on the planning of courses suitable for both linguists and other social scientists, resulting in Hymes's (1966) manuscript.
• Project on Acquisition of Communicative Competence (Ervin-Tripp 1969), from which arose both the Working Papers of the Language Behavior Research Laboratory at Berkeley and the Field Manual (Slobin 1967).
• First Conference on Pidginization and Creolization of Languages, cosponsored by the University of the West Indies (Hymes 1971).
• Conference on Language as Data and as Obstacle in Cross-Cultural Sociological Research (Grimshaw 1969).
• Workshop on sociolinguistic aspects of student-teacher communication, cosponsored by the Language Research Foundation of Cambridge, Massachusetts, in 1971.
• Interdisciplinary conference on Language Input and Acquisition in 1974 (Snow and Ferguson 1977).

The committee was also instrumental in creating the first journal of sociolinguistics, Language in Society, begun in 1972 under the editorship of Dell Hymes.

The first major activity of the Committee on Sociolinguistics was the 1964 Seminar on Sociolinguistics at the LSA Summer Institute at Indiana University, chaired by Ferguson, although this was not the first conference on sociolinguistics. In the spring of that year, William Bright had organized a conference at UCLA (Bright 1966). But even at that LSA seminar, there were still more signs of the inception of a (sub)discipline: the emergence of particular schools, and methodologies and labels distinguishing among them. Among members of the seminar in Bloomington in the summer of 1964 were Fishman, Gumperz, and Labov.
Hymes also visited briefly during the eight-week institute. All advocated positions on the study of language in social context consistent with those they advocate today. Micro-analyses of linguistic variables that can be shown to be tied to social constructs are identified with Labov's interest in developing a better theory of linguistics, one that can account for variation and change. Macro-analyses of the demographics of language use are justifiably associated with Fishman's efforts to create a more general social theory, one that would include attention to language. The ethnography of speaking is a product of Hymes's willingness to cross disciplinary boundaries to create a new field of inquiry. And Gumperz, perhaps more than any of those mentioned above, has consistently embedded claims about theory and methodology within higher clauses such as, "In Sociolinguistics, we . . .," thus staking out the field as something different from both sociology and linguistics. (Papers from that seminar by Kloss, Fishman, Friedrich, Labov, Hunt, Lieberson, Haugen, and Bright appeared in the spring 1966 issue of Sociological Inquiry edited by Lieberson. Also arising from that seminar, in collaboration with the Center for Applied Linguistics, was an interim bibliography of the field, Pietrzyk 1967.)

Unlike many major figures in the development of sociolinguistics, Ferguson is comfortable with any number of approaches, so long as they are systematically aimed at explication of language in society. In fact, he could be described in terms recently used for one of his intellectual progenitors, Edward Sapir. Ferguson is not one to sponsor a school and is not associated with the development of any particular method, but, like Sapir, he "established a charter for the free intellectual play of personalities more or less akin to his own" (Joos on Sapir 1957, 25; cited in Harris 1993, 21). That catholicity of outlook was responsible in 1964 for his bringing together the range of disciplinary and theoretical viewpoints that were represented at that first Summer Research Seminar on Sociolinguistics in Bloomington, a Fergusonian characteristic that he brought with him from the Center for Applied Linguistics, and that today allows him to traverse the range of approaches found in what has come to be known as sociolinguistics.

Themes

As one reviews Ferguson's work in sociolinguistics, and in other areas as well, several themes recur.

Theory

Ferguson's reluctance to build an all-encompassing theory of language in society is explained in a chapter of Enkvist, Ferguson, Hajicova, and Ladefoged (1992), for which he assumed primary authorship:

Sociolinguistics has never been a unified field, and most researchers do not try to construct, or even imagine, a single comprehensive theory of language-in-society. This is partly because the phenomena can be approached from many perspectives and the principal goals of different theoreticians would be quite different. One way to characterize different approaches is roughly in terms of the size of the phenomena to be investigated, and the terms "macro-sociolinguistics" and "micro-sociolinguistics" are sometimes used in this way. (Enkvist et al. 1992, 46)

This is not to say that Ferguson is atheoretical. As an undergraduate scholarship student at the University of Pennsylvania, he majored in philosophy and looked to philosophers of language and symbolic logicians to talk about language. His undergraduate thesis was an axiom set for linguistics as a science. He came to believe, however, that the study of human language had more to gain from descriptive work with language in use, "real language" as opposed to armchair philosophizing about language. While he was a graduate student in oriental studies at Penn, he was exposed to Rulon Wells, who rented a room from his family. Ferguson worked under scholars such as E. A. Speiser, H. Ranke, G. Levi della Vida, W. Norman Brown, and his primary adviser Zellig Harris. His master's thesis was on Moroccan Arabic verbs, his doctoral dissertation on the phonology and morphology of Bengali. But his experiences through a fellowship from the Intensive Language Program of the American Council of Learned Societies to study Moroccan Arabic and his later sojourn (1946-55) with the Foreign Service Institute provided ample caution against building theories of language structure divorced from language use.

If Ferguson feels that a single comprehensive theory of sociolinguistics is outside the scope of sociolinguistic inquiry, what might he consider legitimate terrain for theory building? The papers in this volume repeatedly illustrate Ferguson's strategy toward that end.

Data and Method

Ferguson's skepticism about comprehensive theories is grounded in two firmly held beliefs about the relationship between theory and data: that theory should derive inductively from data, and that linguists are often guilty of positing too much theory on the basis of too little data. He identifies Bloch (1948) and Bloomfield (1926) as particular offenders in this regard (personal communication, 8/11/93). But he also includes himself in that criticism. In his work on universals of baby talk (cf. 1978), Ferguson studied more than twenty languages and concluded that features of baby talk found to be common to all of them must be universal. But he now concludes, in light of later work by ethnographers on child socialization processes, that even with this sample he was wrongly overgeneralizing.

The papers included here are characteristic of Ferguson's work in other areas in that he finds evidence for theory construction or linguistic explanation in all kinds of language behavior. Here again, we see the influence of Sapir, who in his "Psychological Reality of Phonemes" (1949) includes data from mistakes, foreign accents, and so on.

Nor is Ferguson's approach to data collection constrained by a single methodology. "God Wishes in Syrian Arabic" (1983), originally written while Ferguson was at the School of Oriental and African Studies in London in 1975, is one of the few examples in which Ferguson interviews informants on appropriate use of politeness formulas in Syrian Arabic. His informants were a Damascus Arabic employee of the British Broadcasting Company and his wife. One methodological question was how to select the formulas to study. Ferguson selected a formula that allowed for some variation: "God do X to you." Through observation and introspection, Ferguson found about thirty such formulas. He then asked the husband and wife how to use them. From their disagreement, Ferguson learned that any formula with "God keep you" could be used to initiate a request for a favor. While Ferguson would probably not endorse the methodology now, it is an example of research that begins with a set of forms and proceeds to find functions for them.

Despite his eclectic approach to data collection and analysis, two of Ferguson's favorite strategies can be found in these papers. One is to look carefully at one small piece of language in its social context, and to begin to build a theory of that, with implications for the building of larger, more comprehensive theories. For example, sociolinguists could construct a theory of forms of address (Ferguson 1991). This would include the kinds of forms allowable (e.g., titles, names, epithets, pronouns), the kinds of agreement patterns found, their social functions, how they are acquired (cf. Ferguson 1978b), and their diachronic patterns of change.

The case study, and even more fruitful for Ferguson, the comparative case study, is another recurring strategy in this collection. Whether the object of the study is language situations, registers and genres, variation and change, or language planning, comparative case studies highlight points of divergence and convergence that preclude theory building or claims of universality.

Universals

Ferguson's deep respect for inquiry into language universals (e.g., Ferguson 1978c and the preface to that volume (1978a), which outlines Ferguson's evolving interests in universals) comes in large part from his close interaction with Joseph Greenberg at Stanford. Again, this interest has taken him in a direction somewhat independent of that followed by the majority of American linguists. For Ferguson, the search for universals involves not so much induction of the formal generative mechanisms common to all languages, based on the range of grammatical sentences intuited by native speakers; rather he looks for empirical evidence for universals in diachronic change through normal transmission, contact-induced change in pidgin situations, and developmental change in first and second language acquisition. He looks for universals not only of phonology and syntax, but also of register and genre. He looks, for example, both at the universality of genre across time and languages (1976b) and at the points of identity across registers (e.g., 1977a; 1977b; Ferguson and DeBose 1977). Rather than attributing these universals to a unique language faculty, Ferguson looks for links between language and other, more general human and nonhuman behaviors. He speculates that some forms of language behavior may arise from an "innate predisposition," comparing interjections and ritualized exchanges to "the bowings and touchings and well-described display phenomena of other species" (1976a, 138).

The papers in this volume illustrate not only Ferguson's interest in the universal, but also his deep-seated interest in two other linguistically relevant levels of language behavior: "those of particular languages or varieties [aspects of language that are socially constructed] and [what is unique to] individuals" (1979, 142-43).

Conventionalization

It is the tension between the individual and the social that gives rise to "the process by which members of a community somehow come to share the sound-meaning pairings that constitute their means of verbal communication" (1994, 1). This "conventionalization" process typically involves those parts of language that are to some degree stable, transmitted diachronically via natural means, and by definition, not universal. If there were accurate labels and if there were such a school, a Fergusonian sociolinguist would be primarily concerned with "the problem of how individual competence relates to a system that is conventionalized, i.e., shared by members of a community" (Enkvist, Ferguson, Hajicova, and Ladefoged 1992, 10-11), which is the primary focus of these papers. The locus of investigation of these questions is "the socially shared langue."

"A sociolinguistically localized grammar, although at present far beyond our techniques of analysis and presentation, must be an ultimate goal for linguists who see themselves as students of language in society." (1976, 108)

Two points of embarkation toward that goal are the study of differences in language use across individuals and of the use of highly conventionalized linguistic forms.



The pursuit of grammar in the context of use has little room for the notion of the idealized native speaker. Such a grammar includes by nature a great deal of variation, much of it individual, perhaps related to personality variables rather than to social variables. The inclusion of this individual variation in a general theory of language (a position that Ferguson argues for) calls into question even such traditionally held notions as "native speaker" and "mother tongue." It does so, first, because varying patterns of acquisition, use, and loss of first and subsequent languages among individuals blur the boundaries between native and nonnative; second, because universalist explanatory principles should account for all linguistic behavior, not just that of some privileged, mythological native speaker. "In fact, the whole mystique of native speaker and mother tongue should probably be quietly dropped from the linguists' set of professional myths about language" (1982a, vii). Thus, individual variation represents the antithesis of conventionalization. Perhaps that is why it is so often ignored by those branches of linguistics concerned with the socially constructed aspects of language. But this has not diminished Ferguson's enthusiasm for the significance of individual differences in both language acquisition and use (cf. Ferguson 1975a, 1979).

Routinized Language

In diametric opposition to individual variation, routinized language represents the prototypical case of conventionalization. While not discounting the validity, usefulness, and importance of the creativity of language, Ferguson aligns himself with the work of Fillmore (1979) and Pawley and Syder (1976) in stressing both the frequency and importance of prefabricated routines. But Ferguson's interests in routinized language are more encompassing, incorporating all linguistic forms for which a template constrains variability, including conversational turns and genre types as well as formulaic expressions. And these interests are of longer standing.



The roots of this notion of routinized language can be traced to Ferguson's interest in folk literature, which began with his friendship with W. D. Preston, a linguist and folklorist from Indiana University who visited Penn for several years in the late 1940s. Together they produced a long article on Bengali proverbs (Ferguson and Preston 1946a) and planned and drafted parts of similar studies of Bengali children's rhymes and riddles. A version of the latter appeared in the Journal of the American Oriental Society (Ferguson and Preston 1946b), but the children's rhymes article was never completed for publication. Ferguson was fascinated by several aspects of the folk literature materials. First, he was interested in specifying the formal linguistic structures of each genre, including register differences (e.g., among the 107 Bengali proverbs elicited from one Bengali speaker, Dr. B. V. Mukerji of Calcutta, were proverbs in Sanskrit as well as in a variety of Hindi). Second, he was interested in the actual patterns of use in the speech community in relation to their apparent literal meaning. Thus, for example, to understand the roles of the Arabic proverb "the lid fits the pot," it is essential to know that the proverb is used almost exclusively in relation to a propitious match between two spouses; and that it is one of a number of such proverbs in Arabic and other languages. Although most of Ferguson's investigations of Arabic and Bengali folk literature never appeared in print, something of his interests appeared in papers published by his students or colleagues, mostly Foreign Service personnel who attended classes or symposia conducted by him during his stay at the Foreign Service Institute, including his stint in Beirut. (Examples of such published papers include: Mak 1949; Ferguson and Echols 1952; Allen 1955; Ferguson and Rice 1960; Parker 1971.) 
Ferguson's work on routinized language reiterates the universality and diversity of routinized speech and the need for a linguistic theory broad enough to accommodate it. The legacy of this concern for formatted speech is found in Coulmas (1981) and in the body of literature on the role of routines and formulas in first and second language acquisition.

Fortuitous Incidents

These themes find root, as do many of Ferguson's research themes, in a series of personal experiences in the world of Arabic studies. For example, his interest in the problem of conventionalization started with his observations about the differences between the vocalism of Lebanese dialects and that of Damascus Arabic. As is generally acknowledged, most varieties of Lebanese Arabic have basically three contrasting short vowel segments /a i u/ with associated allophonic and morphophonemic variants (cf. Cantineau 1939). This statement is not meant to deny the existence of unusual local variants and the greater variety of long vowels. In any case, the typical Lebanese vocalism is strikingly different from that of Damascus Arabic, which apparently has six contrastive short vowel segments (/a e i o u ə/) in a complex pattern of partial complementation (cf. Cowell 1964; Grotzfeld 1965). The facts of language use are that Lebanese and Damascus Arabic speakers
converse freely and easily without any awareness of linguistic differences, except that speakers of one variety sometimes attribute intonational distortion to speakers of the other variety, much like the attribution of "sing-song, drawling" and so forth to speakers of "other" varieties in various European languages. The specific incidents that triggered Ferguson's focus on conventionalization were his own perception of the conflicting patterns of contrasts. He first became familiar with many kinds of Lebanese Arabic, including different kinds of Beirut Arabic, which represent an amalgam of subvarieties unlike the relatively uniform phonological structure of Damascus Arabic (DA), and a mildly divergent village variety (Bismizzin; cf. Jiha 1964). After his familiarization with Lebanese sounds, when Ferguson was faced with Damascus speakers whose sound system he had decided to investigate, he found that the DA informants perversely (to his mind) attempted to persuade him that such pairs as sitti, "my lady, grandmother, respectful address to an older woman"; sitti, the number "six"; katabu, "he wrote it"; and katabu, "they wrote" were distinguished. Ferguson immediately jumped to the conclusion that the DA speakers were inventing a nonexistent contrast because of their lexical and morphological knowledge and their awareness of spelling differences between the corresponding words in written Arabic. His conclusion, however, received a sharp setback when he introduced the DA speakers to some of his American students of colloquial Arabic. Some of the American students clearly heard the differences that the DA speakers claimed to be making, and before long Ferguson accepted structural transcriptions of the type sətti, "my lady"; sətte, "six"; and katabo, "he wrote it"; katabu, "they wrote." Being embarrassed by the better perception of his phonetically untrained students, Ferguson tried some informal experiments by asking Lebanese speakers to identify the DA contrasts.
They were—reassuringly—just as unable as he had been to differentiate consistently the sound contrasts, although all of them were familiar with the lexical, morphological, and orthographic differences. Some time later Ferguson tried the reverse strategy: How did the DA speakers perceive the Lebanese noncontrasts? As might have been expected, they all insisted that they heard the distinctions that were not there in the Lebanese forms. No attempt was made to do experimental tests under carefully controlled conditions; the Ferguson "experiments" were crude and impressionistic, comparable in sophistication to the early Spanish-English contrastive tests (Marckwardt 1944 and 1946), and equally unreliable. But they were sufficient to raise for Ferguson the troublesome question: Just what are the phonological contrasts that are presumed to be the foundation of the linguistic structure, the sound-meaning pairing of the speech community? If speakers of mutually intelligible varieties of "the same language" are operating with incommensurate inventories, just what is a phonological system? Ferguson's casual observations of Arabic in use also led him to an interest in formulaic speech. For example, the motivation for his "Root-echo Responses in Syrian Arabic Politeness Formulas" (1967) was an incident in Beirut, when in response to a conductor's use of a formulaic tfaddal, "Please," he hypothesized a response, sahtayn, "Two healths," to which the conductor in turn responded with the third part of the triadic pattern, 'ala qalbak, "On your heart." The next step
was to articulate rules for the underlying triadic response pattern. This suggested to Ferguson that for most of these formulas there is a definite response that is not always transparent. More interesting, if a formula has a certain root in it, and there is no conventional specific reply, then one uses the root in reply. He then tested the rules hypothesized in other contexts. Ferguson (1976a) describes his on-the-spot analysis and response during one such occasion while negotiating to buy an article of clothing in a Jerusalem market. But the insights Ferguson draws from these incidents transcend mere linguistic description or syntactic analysis. For example, Ferguson views the type of "discourse agreement" found in Syrian politeness formulas as on a par with subject-verb agreement (cf. Ferguson and Barlow 1988, 7-8). And in "The Blessing of the Lord be Upon You" (1977c), Ferguson suggests that Palestinian politeness formulas may be longer standing than previously thought, predating the Hebrew. From these incidents he manages to see not just the trees and the forest, but also the animal life in the trees and the place of the forest in the larger terrain.

Conclusion

Ferguson may not be as frequently cited internationally as Fishman. He may not be as dedicated to the development of a particular theory or methodology as Labov. And he may not deal at levels of abstraction that Hymes does. However, the work represented in the papers in this volume spans all three approaches. What is uniquely Fergusonian is the inclusion of a fourth dimension, derived from his solid training in Sanskrit, Greek, Latin, and Oriental languages at the University of Pennsylvania, in descriptive linguistics there under Harris, and in his interest not only in language but also in the people who use it. That perspective is one that consistently looks for the relationship between diachronic language change and language development, phonology and syntax, social conventionalization and cognitive processing, and language universals and individual differences. The present volume includes those contributions by Ferguson that, in the editor's view, have had the most impact on the field of sociolinguistics. At the same time, it also contains more recent papers that reveal the lines of development that his own thinking has followed on these issues. The papers here share many common themes and demonstrate what might be called a Fergusonian approach to linguistic and sociolinguistic inquiry. The book is divided into four sections, each containing six papers on a common theme: speech communities and language situations, register and genre, variation and change, and language planning. Each section of the book is introduced by the editor's assessment of the present state: theory making, methods and directions of research, and implications for related fields. The references at the end of each paper are presented in the format of the original publication. The dates in the contents reflect the year of first publication, except for previously unpublished papers (papers 12, 18, 23, 24), in which case reference is to the year of first oral presentation.



References

Allen, Arthur B. 1955. "Some Iraqi proverbs and proverbial phrases." Journal of the American Oriental Society 75:122-25.
Bloch, Bernard. 1948. "A set of postulates for phonemic analysis." Language 24:3-46.
Bloomfield, Leonard. 1926. "A set of postulates for the science of language." Language 2:153-64.
Bloomfield, Leonard. 1939. Linguistic Aspects of Science (International Encyclopedia of Unified Science 1:4). Chicago: University of Chicago Press.
Bright, William, ed. 1966. Sociolinguistics: Proceedings of the UCLA Sociolinguistics Conference, 1964. The Hague: Mouton.
Cantineau, Jean. 1939. Remarques sur les parlers de sedentaires syro-lebano-palestiniens. Bulletin de la Societe de linguistique de Paris 40.1:80-88.
Cohen, Marcel. 1956. Pour une sociologie du langage. Paris: Albin Michel.
Coulmas, Florian, ed. 1981. Conversational Routine: Explorations in Standardized Communication Situations and Prepatterned Speech. The Hague: Mouton.
Cowell, Mark W. 1964. A Reference Grammar of Syrian Arabic. Washington, D.C.: Georgetown University Press.
Currie, Haver C. 1952. "A projection of sociolinguistics: The relationship of speech to social status." Southern Speech Journal 18:28-37.
Currie, Haver C. 1984. "An expanded abstract of notes on personal involvement in sociolinguistics." Sociolinguistics 14.2:10-15.
Enkvist, Nils Erik, Charles A. Ferguson, Eva Hajicova, and Peter Ladefoged. 1992. Linguistic Research in Sweden. Uppsala: Swedish Science Press.
Ervin-Tripp, Susan. 1969. "Summer workshops in sociolinguistics: Research on children's acquisition of communicative competence." Items 23.2:22-26.
Ferguson, C. A. 1967. Root-echo responses in Syrian Arabic politeness formulas. In D. G. Stuart, ed., Linguistic Studies in Memory of R. S. Harrell. Washington, D.C.: Georgetown University Press.
Ferguson, C. A. 1975. Applications of linguistics. In R. Austerlitz, ed., The Scope of American Linguistics. Lisse, The Netherlands: Peter de Ridder Press.
Ferguson, C. A. 1976a. "The structure and use of politeness formulas." Language in Society 5:137-51. (Also in F. Coulmas, ed. Conversational Routine. The Hague: Mouton.)
Ferguson, C. A. 1976b. The collect as a form of discourse. In W. J. Samarin, ed., Language in Religious Practice. Rowley, Mass.: Newbury House. (Also in M. A. Jazayery, E. C. Polome, and W. Winter, eds., Linguistics and Literary Studies in Honor of Archibald A. Hill. Lisse, The Netherlands: Peter de Ridder Press.)
Ferguson, C. A. 1977a. Baby talk as a simplified register. In C. E. Snow and C. A. Ferguson, eds., Talking to Children. Cambridge: Cambridge University Press.
Ferguson, C. A. 1977b. Simplified registers, broken language and Gastarbeiterdeutsch. In C. Molony, H. Zobl, and W. Stolting, eds., Deutsch im kontakt mit anderen Sprachen. Kronberg/Ts: Scriptor Verlag.
Ferguson, C. A. 1977c. The blessing of the Lord be upon you. In A. Juilland, ed., Linguistic Studies Offered to Joseph Greenberg. Saratoga, Calif.: Anma Libri.
Ferguson, C. A. 1978a. Introduction. In Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, eds., Universals of Human Language. Vol. 3, Word Structure. Stanford: Stanford University Press.
Ferguson, C. A. 1978b. Phonological processes. In Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, eds., Universals of Human Language. Vol. 3, Word Structure. Stanford: Stanford University Press.
Ferguson, C. A. 1978c. Talking to children: A search for universals. In Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, eds., Universals of Human Language. Vol. 3, Word Structure. Stanford: Stanford University Press.
Ferguson, C. A. 1979. Phonology as an individual access system: Some data from language acquisition. In Charles J. Fillmore, Daniel Kempler, and William S-Y Wang, eds., Individual Differences in Language Ability and Language Behavior. New York: Academic Press.
Ferguson, C. A. 1982a. Foreword. In Braj B. Kachru, ed., The Other Tongue: English Across Cultures. Urbana: University of Illinois Press.
Ferguson, C. A. 1983a. " 'God wishes' in Syrian Arabic." Mediterranean Language Review 1:65-83.
Ferguson, C. A. 1991. Individual and social in language change: Diachronic changes in politeness agreement in forms of address. In Robert L. Cooper and Bernard Spolsky, eds., The Influence of Language on Culture and Thought: Essays in Honor of Joshua A. Fishman's Sixty-Fifth Birthday. Berlin: Mouton de Gruyter.
Ferguson, C. A. 1994. Dialect, register, and genre: Some working assumptions about conventionalization. In D. Biber and E. Finegan, eds., Sociolinguistic Perspectives on Register. New York: Oxford University Press.
Ferguson, C. A., and M. Barlow. 1988. Introduction. In M. Barlow and C. A. Ferguson, eds., Agreement in Natural Language: Approaches, Theories, Description. Stanford: Center for the Study of Language and Information and the University of Chicago Press.
Ferguson, C. A., and C. E. DeBose. 1977. Simplified registers, broken language, and pidginization. In A. Valdman, ed., Pidgin and Creole Linguistics. Bloomington, Ind.: Indiana University Press.
Ferguson, C. A., and John Gumperz, eds. 1960. Linguistic Diversity in South Asia [International Journal of American Linguistics 26:3, Part II.] Bloomington: Indiana University Research Center in Anthropology, Folklore, and Linguistics.
Ferguson, C. A., and J. M. Echols. 1952. "Critical bibliography of spoken Arabic proverb literature." Journal of American Folklore 65:255.67-84.
Ferguson, C. A., and W. D. Preston. 1946a. "107 Bengali proverbs." Journal of American Folklore Society 57:365-86.
Ferguson, C. A., and W. D. Preston. 1946b. "Seven Bengali riddles." Journal of the American Oriental Society 66.4:299-303.
Ferguson, C. A., and F. A. Rice. 1960. "Iraqi children's rhymes." Journal of the American Oriental Society 80.4:335-40.
Fillmore, Charles J. 1979. On fluency. In Charles J. Fillmore, Daniel Kempler, and William S-Y Wang, eds., Individual Differences in Language Ability and Language Behavior. New York: Academic Press.
Firth, J. R. 1957. Papers in Linguistics: 1934-1951. London: Oxford University Press.
Fishman, Joshua, ed. 1968. Readings in the Sociology of Language. The Hague: Mouton.
Fishman, Joshua A., Charles A. Ferguson, and J. Das Gupta, eds. 1968. Language Problems of Developing Nations. New York: Wiley and Sons.
Fishman, Joshua, Andree Tabouret-Keller, Michael Clyne, Bh. Krishnamurti, Mohamed Abdulaziz, eds. 1986. The Fergusonian Impact: In Honor of Charles A. Ferguson on the Occasion of His Sixty-Fifth Birthday. Berlin: Mouton de Gruyter.
Giglioli, Pier Paolo, ed. 1972. Language and Social Context. Baltimore: Penguin Books.
Greenberg, J. H., C. A. Ferguson, and E. Moravcsik, eds. 1978. Universals of Human Language. Stanford: Stanford University Press.
Grimshaw, A. D. 1969. "Language as obstacle and as data in sociological research." Items 23.2:17-21.
Grotzfeld, Heinz. 1965. Syrisch-arabisch Grammatik. Wiesbaden: Otto Harrassowitz.
Gumperz, John, and Dell Hymes, eds. 1964. "The Ethnography of Communication." American Anthropologist 66.6, pt. 2: 137-54.
Guxman, M. M. 1960. Voprosy formirovanija i razvitija nacionalnyx jazykov. Moscow.
Harris, Randy Allen. 1993. The Linguistics Wars. New York: Oxford University Press.
Herzler, Joyce O. 1965. A Sociology of Language. New York: Random House.
Hoijer, Harry, ed. 1954. Language in Culture. (Comparative Studies of Cultures and Civilizations, no. 3; Memoirs of the American Anthropological Association, no. 79.) Chicago: University of Chicago Press.
Hymes, Dell, ed. 1964. Language in Culture and Society. New York: Harper and Row.
Hymes, Dell. 1966. Teaching and training in sociolinguistics. Unpublished ms.
Hymes, Dell, ed. 1971. Pidginization and Creolization of Languages. Cambridge: Cambridge University Press.
Jespersen, Otto. 1964 (1946). Mankind, Nation and Individual from a Linguistic Point of View. Bloomington: Indiana University Press.
Jiha, M. 1964. Der arabisch Dialekt im Bismizzin. Beirut.
Joos, Martin. 1957. Introduction. In M. Joos, ed., Readings in Linguistics. Washington, D.C.: American Council of Learned Societies.
Kloss, Heinz. 1952. Die Entwicklung neuer germanischer Kultursprachen. Munich.
Mak, Dayton S. 1949. "Some Syrian Arabic proverbs." Journal of the American Oriental Society 69:223-28.
Marckwardt, A. H. 1944. "An experiment in aural perception." English Journal 33:212-14.
Marckwardt, A. H. 1946. "Phonemic structure and aural perception." American Speech 21:106-11.
Meillet, Antoine. 1926. Linguistique historique et linguistique generale. Paris: Champion.
Parker, Richard B. 1971. "Lebanese proverbs." Journal of American Folk Lore 58:104-14.
Pawley, Andrew, and Frances Syder. 1976. Creativity vs. memorization in spontaneous discourse: The role of institutionalized sentences. Unpublished ms. New Zealand: University of Auckland.
Pietrzyk, Alfred, ed. 1967. Selected Titles in Sociolinguistics: An Interim Bibliography. Washington, D.C.: Center for Applied Linguistics.
Sapir, Edward. 1949. The psychological reality of phonemes. In D. G. Mandelbaum, ed., Selected Writings of Edward Sapir in Language, Culture, and Personality. Berkeley: University of California Press.
Slobin, Dan I., ed. 1967. A Field Manual for Cross-Cultural Study of the Acquisition of Communicative Competence. Rev. ed. Berkeley, Calif.: ASUC Bookstore.
Snow, Catherine E., and Charles A. Ferguson, eds. 1977. Talking to Children: Language Input and Acquisition. Cambridge: Cambridge University Press.
Sommerfelt, Alf. 1938. La langue et la societe. Oslo: H. Aschehoug.
Weinreich, Uriel. 1953. Languages in Contact. New York: Linguistic Circle of New York.

Speech Communities and Language Situations

Each of the six papers in this section describes from a slightly different perspective one or more specific "language situations." Ferguson's original reference to "language situation" was "without any intention of its being a technical term" (Ferguson 1991, 221-22), although it has since come to be used as such. Ferguson cites Nikol'ski (1976) as an example of a definition consistent with his own intentions. In these papers, Ferguson uses "language situation" to refer to an aggregate of language varieties (dialect and register) and their patterns of acquisition, use, and modalities by and among various linguistic communities within a particular geographical region, whether national (e.g., Greece, Haiti), subnational (e.g., the Lake Eyre region of Australia), or supranational (e.g., South Asia). The intent is to unveil patterns of language use in an attempt to shed light on and raise questions about the sources for these patterns and implications for language acquisition, language maintenance, language change, language shift, language planning, and linguistic theory. This use of the term has perhaps been more fruitfully mined in European than in American sociolinguistics (cf. Brauner and Ochotina 1982). As with so many of the insights Ferguson brings to his work, the original motivation for his now famous diglossia article was a casual observation of language in use: that in some language situations, newspaper articles would be written in one variety of a language (the high or H variety), while political cartoons in the same newspapers would be written in another (the low variety, L). Ferguson did not invent the term diglossia; he borrowed it from the French Arabist W. Marcais (1930-31). But he brought to the term a detailed description of a specific type of language situation, and a refutable set of hypotheses relevant to an understanding of processes of language change, to assumptions about synchronic variation, and to social science in general.
The initial popularity of the article can be attributed to a number of factors. It contains a good deal of morphological, lexical, and phonological detail, thereby appealing to linguists. And, as pointed out in the Introduction to the present collection of papers, it came at a time when some linguists were beginning to look at the social context of language. Several observers have even dated the beginning of modern sociolinguistics to this article (e.g., Denison 1970). Its enduring influence can be measured in many ways. It has been reprinted a number of times, both in English (e.g., Giglioli 1972; Hymes 1964; Ferguson 1971a) and in translation (e.g., Italian: Giglioli 1974; Spanish: Garvin and de Suarez 1974; Romanian: Ionescu-Ruxandoiu and Chitoran 1975; German: Steger 1982; Raith et al. 1986; Portuguese: Fonseca and Neves 1974). At least two important linguistics journals have devoted entire issues to the topic (Langages 61, in 1981 and Southwest Journal of Linguistics 10.1, in 1991). Hudson's (1992) extensive bibliography on diglossia contains 1,092 entries in sixteen languages since 1959. While it includes 30 books and monographs and 42 doctoral dissertations, it admittedly omits works on diglossia written in the languages of the diglossic situations, most notably Arabic and Greek (e.g., Badawi 1973; Perides 1963). A more recent bibliography contains almost three thousand entries (Fernandez 1994). The concept of diglossia has provided particularly fertile grounds for South Asian sociolinguists (cf. deSilva 1976; Shapiro and Schiffman 1983; Krishnamurti 1986). The generation of such an impressive body of research can only be attributed to the centrality of the sociolinguistic issues raised in Ferguson's original paper. Although Ferguson lists nine specific characteristics of a diglossic language situation (involving the functions, prestige, literary heritage, acquisition, standardization, morphology/syntax, lexicon and phonology of the H and L varieties), other scholars since 1959 have interpreted diglossia more widely. Some of these differences are clearly the result of a misinterpretation of Ferguson's original intent. For example, diglossia has been interpreted (wrongly) as an impediment to national development (deSilva 1976). And Fasold (wrongly) interprets Ferguson's notion of speech community to mean "all those within the borders of some country who speak the same language" (1984, 43).
Here Fasold may be confusing speech community with language situation. Ferguson makes no apologies for applying the term language situation at one level of generality to geographical units that correspond with national boundaries. If sociolinguistic insights are to be applied to language planning, political entities cannot be ignored. The most widespread expansions and reinterpretations of diglossia, however, revolve around two issues: the binary question and the relatedness question. For example, Gumperz (1964), Fishman (1967), Platt (1977), and Abdulaziz (1978) have all used the term to refer to more than two language varieties. Fasold lists "double overlapping diglossia," "double-nested diglossia," and "linear polyglossia" as "types of multiple language 'polyglossia' " (1984, 44-50). These and other scholars have also used the term to refer to a much wider range of linguistic relatedness of the various codes. In some cases, diglossia has come to refer to standard-with-dialects situations, or to bilingual situations, or both. A summary of this expansion of the term and arguments for that expansion can be found in Fasold (1984, 40-52), who places the "classic diglossia" described by Ferguson as a midpoint on a continuum of "broad diglossia," which includes varieties of greater or lesser linguistic relatedness, with "superposed bilingualism" at one end and style-shifting at the other. Nevertheless, Ferguson has steadfastly held to the original concept involving two distinct varieties of the same language. Indeed, some have interpreted the
framework presented in Fishman (1972) and Fasold (1984) to suggest that diglossia becomes "an aspect of essentially all language situations and thus no longer refers to a type of language situation" (cf. Gilbert 1975, 235-36, cited in Johnson 1986, 337). Ferguson endorses Britto's (1986) proposal of "optimal distance" to exclude from diglossia "H-L situations of 'super-optimal' and 'sub-optimal' distance" (Ferguson 1986, xxii). Ferguson's stance was motivated by his desire to see an eventual taxonomy of language situation types that would form the basis for an understanding of the relationship between language situations and linguistic structures, the paths of historical change of those structures, and the implication of these for linguistic theory. Where sociolinguists have broadened the notion of diglossia beyond its original intent, this focus on linguistic outcomes has sometimes been lost. For example, when Fishman, thirty years after Ferguson's article, echoed Ferguson's original call for "a common quest to determine when and to what extent it [diglossia] does obtain, how it came into being, what its consequences are relative to a particular focus of interest, [and] which factors tend to strengthen or weaken it" (1989, 196, cited in Hudson 1992, 612), one may safely assume that he is referring more to the variables in the social context and less to the linguistic forms and outcomes that are such an important part of Ferguson's classification. In "Language Development," Ferguson places the notion of diglossia within a larger framework. His intent is to provide a framework for isolating the question of degree of development of a particular language. 
Dismissing the use of linguistic (i.e., phonological, morphological-syntactic, semantic) structures as an indication of linguistic superiority, he looks to the language situation to identify three processes of language development: graphization, standardization (including diglossia and forms of "multimodal standardization"), and modernization (including expansion of registers, genres, and the lexicon). The discussion provides an inventory of features in the language situation that must be taken into account in the development of theories of language change and in the planning of language change. The last four papers in this section represent four particular interests of Ferguson that go back to his university, or in the case of religion to his pre-university, days. The first of these interests, Arabic, dates back to Ferguson's fellowship with the Intensive Language Program of the American Council of Learned Societies as an undergraduate and graduate student (cf. Belnap and Haeri n.d.). Over his career, the list of publications in this area includes numerous articles and reviews, two jointly authored textbooks, and two edited volumes, including Ferguson 1960. In "The role of Arabic in Ethiopia," the description of language situations for the purposes of language planning and language policy decisions takes a step toward the more explicit. This paper was a product of the Survey of Language Use and Language Teaching in Eastern Africa, a project funded by the Ford Foundation. It provides a framework for developing a sociolinguistic profile for a given nation (rather than any other demographic, societal, cultural, or psychological framework) and applies that framework to the language situation in Ethiopia. The taxonomy of language types includes major languages, minor languages, and languages of special status. Language functions (cf. Stewart 1968) include use as an official
language, as a lingua franca, as a language of education or of wider communication, and so on. Here Ferguson describes the use of Arabic as representing a diglossic situation within a larger, more complex language situation. Another area of special interest is the influence of religion on language variation, conventionalization, standardization, and change. In this respect, Ferguson is an active teacher and scholar, having taught courses on language and religion at Michigan (1973), Stanford (1974), and Georgetown (1985) and having lectured in this area at the Graduate Theological Union in Berkeley as well as at theological schools in Australia and Papua New Guinea. But Ferguson is also an action-oriented researcher, serving from 1964 until 1979 as a member of the Lutheran Society for Worship, Music and the Arts, and from 1967 until 1978 as a member of the Liturgical Texts Committee (as the chair of its Subcommittee on Prayers) of the Inter-Lutheran Commission on Worship. "Religious factors in language spread" is one of numerous articles by Ferguson on language and religion (see Religious Bibliography at the end of this overview). In this article he draws on a number of language situations to examine the complex of religious factors that interact with other social factors (economic, political, demographic, geographical) to give rise to, facilitate the spread of, or impede such sociolinguistic phenomena as register and dialect variation, language maintenance and shift, the development of literacy, and language attitudes. Because these factors come into play in most contexts of language use, an understanding of them is prerequisite both to informed language planning decisions and to the development of a theory of language in society. If that article takes a global view of religious factors in language situations, "Literacy in a Hunting-Gathering Society" focuses on one aspect of the language situation in a specific region of a country.
Religion is an important consideration in this article as well, but the primary focus is on the development of literacy. Although Ferguson is not as widely associated with work on literacy per se as he is with Arabic studies, language and religion, or South Asian linguistics, even in his more general sociolinguistic work, there are frequent references to writing systems, their forms, functions, and acquisition. This paper, however, is one of about a half dozen that Ferguson has written directly on the topic. The roots of Ferguson's interest in issues of literacy can be traced to his interest in religion. Ferguson often in his papers either unveils some little known facts about a particular language situation or challenges widespread assumptions about the nature of literacy. For example, in "Saint Stefan of Perm and applied linguistics" (Ferguson 1967) he describes the role of a single man, St. Stefan of Perm, in the development of vernacular literacy. In "Contrasting patterns of literacy acquisition in a multilingual nation" (Ferguson 1971b), he lists nine features of literacy instruction that most scholars would agree upon, but then proceeds to show how each is violated in the case of Ethiopia. In "Patterns of literacy in multilingual situations" (Ferguson 1978), he examines a situation often overlooked in sociolinguistics, one in which two different writing systems can exist side-by-side for the same language. Ferguson's 1977 work appeared after a trip to the People's Republic of China, when he was invited to write about literacy there; and his 1990 work, reprinted in Part III of this volume, represents yet another perspective on literacy.

Speech Communities and Language Situations


Underlying his detailed reconstruction of incipient literacy among the Diyari in the Lake Eyre region of Australia in the latter half of the nineteenth century in "Literacy in a Hunting/Gathering Society" is Ferguson's skepticism of strong claims made about the cognitive effects of literacy development. A language situation is described in terms of the various language groups present, their means of interlanguage communication, and the variation (dialect and register) within language groups. Functions of literacy are shown to be consistent with other cultural values of the introducing group. Ferguson also points out the economic, social, and religious factors that contributed to the transmission of literacy and to the eventual decline of Diyari as a spoken language. Comparison with other language situations suggests general themes, conditions, and principles under which vernacular literacy is successfully introduced and takes hold. The final area of special interest represented in this section, South Asian linguistics, is among Ferguson's longest standing interests. His first published article (Ferguson 1945) was in this area. Since then he has published nearly twenty others, he has co-edited a major volume on the topic (Ferguson and Gumperz 1960), and there has been at least one volume of South Asian linguistics dedicated to him (Krishnamurti 1986). In "South Asia as a Sociolinguistic Area," Ferguson highlights some features of language use that make South Asia unique. In the process he demonstrates how features of language use just as well as language structure can cluster in areal relationships. Not only does the paper deepen the reader's understanding of the region, it also suggests that this type of research into the language situation of a larger geographical region can have implications for theories of language change and cultural diffusion in general.

References

Abdulaziz, M. H. 1978. Triglossia and Swahili-English bilingualism in Tanzania. In J. Fishman, ed., Advances in the Study of Societal Multilingualism. The Hague: Mouton, pp. 129-52.
Badawi, S. 1973. Mustawaya-t al-'arabiyya al-mu'a-sira fi misr. Cairo: Daar al-Ma'a-rif.
Belnap, Kirk, and Niloofar Haeri, eds. N.d. Structuralist Papers on Arabic Linguistics by Charles A. Ferguson. Leiden: Brill. Forthcoming.
Brauner, S., and N. V. Ochotina, eds. 1982. Studien zur nationalsprachlichen Entwicklung in Afrika: Soziolinguistische und sprachpolitische Probleme. Berlin: Akademie-Verlag.
Britto, F. 1986. Diglossia: A Study of the Theory with Application to Tamil. Washington, D.C.: Georgetown University Press.
Denison, H. 1970. Linguistics 7: Sociolinguistics. Communities of Speech. Times Literary Supplement July 25, 1970, pp. 829-30.
De Silva, M. 1976. Diglossia and Literacy. Mysore: Central Institute of Indian Languages.
Fasold, R. 1984. The Sociolinguistics of Society. Oxford: Basil Blackwell.
Ferguson, C. A. 1991. "Diglossia revisited." Southwest Journal of Linguistics 10.1:214-34.
Ferguson, C. A. 1990. Then they could read and write. In L. Bouton and Y. Kachru, eds., Pragmatics and Language Learning Monograph Series 1:7-19. Urbana: Division of English as an International Language, University of Illinois.



Ferguson, C. A. 1986. Foreword. In F. Britto, Diglossia: A study of the theory with application to Tamil. Washington, D.C.: Georgetown University Press. Pp. xxi-xxiii.
Ferguson, C. A. 1978. Patterns of literacy in multilingual situations. In J. Alatis, ed., International Dimensions of Bilingual Education (Georgetown University Round Table on Languages and Linguistics 1978). Washington, D.C.: Georgetown University Press. Pp. 582-90.
Ferguson, C. A. 1977. Aspects of literacy teaching in the People's Republic of China. In T. P. Gorman, ed., Language and Literacy: Current issues and research. Tehran: Institute for Adult Literacy Methods. Pp. 227-33.
Ferguson, C. A. 1971a. Language Structure and Language Use. Stanford, Calif.: Stanford University Press.
Ferguson, C. A. 1971b. Contrasting patterns of literacy acquisition in a multilingual nation. In W. H. Whiteley, ed., Language Use and Social Change. London: Oxford University Press. Pp. 234-53.
Ferguson, C. A. 1967. Saint Stefan of Perm and applied linguistics. In J. A. Fishman, C. A. Ferguson, and J. Das Gupta, eds., Language Problems of Developing Nations. New York: Wiley and Sons. Pp. 253-65.
Ferguson, C. A., ed. 1960. Contributions to Arabic Linguistics. Cambridge, Mass.: Harvard University Press.
Ferguson, C. A. 1945. "A chart of the Bengali verb." Journal of the American Oriental Society 65.1:54-55.
Ferguson, C. A., and J. Gumperz, eds. 1960. Linguistic Diversity in South Asia. (International Journal of American Linguistics 26.3, pt. 2.) Bloomington, Ind.: Indiana University Press.
Fernandez, M. 1994. Diglossia: A comprehensive bibliography, 1960-1990, and supplements. Amsterdam: John Benjamins.
Fishman, J. A. 1989. Language and Ethnicity in Minority Sociolinguistic Perspective. Clevedon: Multilingual Matters.
Fishman, J. A. 1972. The Sociology of Language. In J. Fishman, ed., Advances in the Sociology of Language I. The Hague: Mouton, pp. 217-404.
Fishman, J. A. 1967. "Bilingualism with and without diglossia; diglossia with and without bilingualism." Journal of Social Issues 32:29-38.
Fonseca, M. S., and M. F. Neves, eds. 1974. Sociolinguistica. Rio de Janeiro: Eldorado.
Garvin, P. L., and Y. L. de Suarez. 1974. Antologia de Estudios Etnolinguisticos y sociolinguisticos. Mexico City: Instituto de Investigaciones Antropologicas, Universidad Nacional Autonoma de Mexico.
Giglioli, P. P., ed. 1972. Language and Social Context. Harmondsworth: Penguin. Linguaggio e Societa. Bologna: Il Mulino.
Gilbert, G. 1975. "Review of The Sociology of Language, by J. Fishman." Language 51:234-36.
Gumperz, J. 1964. "Linguistic and social interaction in two communities." American Anthropologist 66.6:137-53.
Hudson, A. 1992. "Diglossia: A bibliographic review." Language in Society 21.4:611-74.
Hymes, D. 1974. Foundations in Sociolinguistics. Philadelphia: University of Pennsylvania Press.
Hymes, D., ed. 1964. Language in Culture and Society. New York: Harper and Row.
Ionescu-Ruxandoiu, L., and D. Chitoran, eds. 1975. Sociolingvistica. Orientare actuale. Bucuresti: Editura didactica si pedagogica.
Johnson, B. C. 1986. Diglossia in Africa. In Bh. Krishnamurti, ed., South Asian Languages: Structural Convergence and Diglossia. Delhi: Motilal Banarsidass. Pp. 337-49.
Krishnamurti, Bh., ed. 1986. South Asian Languages: Structural Convergence and Diglossia. Delhi: Motilal Banarsidass.
Marcais, W. 1930-31. "La diglossie arabe. La langue arabe dans l'Afrique du Nord, etc." L'Enseignement Public 97:401-9; 105:20-39, 121-78.
Nikol'ski, L. B. 1976. Sinchronnaja sociolingvistika. Moscow.
Perides, M. 1965. Evolution of the Modern Greek Language and Its Present-day Form. Athens.
Platt, J. 1977. "A model for polyglossia and multilingualism (with special reference to Singapore and Malaysia)." Language in Society 6.3:361-78.
Raith, J., R. Schulze, and K.-H. Wandt, eds. 1986. Grundlagen der Mehrsprachigkeitsforschung: Forschungsrahmen, Konzepte, Beschreibungsprobleme, Fallstudien (ZDL Beihefte 52). Wiesbaden: Franz Steiner.
Shapiro, M., and H. Schiffman, eds. 1983. Language and Society in South Asia. Foris Publications.
Stewart, W. A. 1968. A sociolinguistic typology for describing national multilingualism. In J. A. Fishman, ed., Readings in the Sociology of Language. The Hague: Mouton.
Steger, H., ed. 1982. Anwendungsbereiche der Soziolinguistik (Wege der Forschung 319). Darmstadt: Wissenschaftliche Buchgesellschaft.

Bibliography of Work on Language and Religion by C. A. Ferguson

General Issues

1973. "Some forms of religious discourse." International Yearbook for the Sociology of Religion 8:224-35.
1982. Religious factors in language spread. In R. L. Cooper, ed., Language Spread. Pp. 95-106. Bloomington, Ind.: Indiana University Press.
1985. The study of religious discourse. Georgetown University Round Table on Language and Linguistics 1985, pp. 205-13.
1990. Then they could read and write. In L. Bouton and Y. Kachru, eds., Pragmatics and Language Learning Monograph Series 1, pp. 7-19. Urbana: Division of English as an International Language, University of Illinois.

Sacred Texts

1965. "Holy Cross Day: A set of Lutheran propers." Response 6:172-76.
1968. The morning suffrages in contemporary English. Response 9:131-38.
1974. King James English as the language of American revelation. Paper presented at the VIII World Congress of Sociologists, Toronto.
1976. The collect as a form of discourse. In W. J. Samarin, ed., Language and Religious Practice. Pp. 101-109. Rowley, Mass.: Newbury House. (Also in M. A. Jazayery et al., eds., Linguistic and Literary Studies in Honor of Archibald A. Hill, vol. 1. Pp. 127-37. Lisse: Peter de Ridder Press.)
1977. The blessing of the Lord be upon you. In A. Juilland, ed., Linguistic Studies Offered to Joseph Greenberg. Pp. 21-26. Saratoga, Calif.: Anma Libri.
1983. "'God wishes' in Syrian Arabic." Mediterranean Language Review 1:65-83.
1985. Prayer of the people: Group construction of a religious genre of 'formatted discourse.' Unpublished ms.

Saints and Saints' Lives

1966. "Saints' names in American Lutheran church dedications." Names 14:76-82.
1967. St. Stephen of Perm and applied linguistics. In To Honor Roman Jakobson. Pp. 643-53. The Hague: Mouton. (Also in J. A. Fishman et al., eds., Language Problems of Developing Nations. Pp. 253-65. New York: John Wiley and Sons.)
1969. "St. Yared, the deacon, Ethiopian hymnist." Response 10:136-38.
1970. Nine Saints of Ethiopia. Unpublished ms.
1986. Devotional reading and science fiction: The medieval saint's life as a form of discourse. In B. F. Elson, ed., Language in Global Perspective. Pp. 113-22. Dallas: Summer Institute of Linguistics.
In progress. Sacred biography: Life stories of Christian bible translators and alphabet makers. To be submitted to Biography.

1 Diglossia

In many speech communities two or more varieties of the same language are used by some speakers under different conditions. Perhaps the most familiar example is the standard language and regional dialect as used, say, in Italian or Persian, where many speakers speak their local dialect at home or among family or friends of the same dialect area but use the standard language in communicating with speakers of other dialects or on public occasions. There are, however, quite different examples of the use of two varieties of a language in the same speech community. In Baghdad the Christian Arabs speak a 'Christian Arabic' dialect when talking among themselves but speak the general Baghdad dialect, 'Muslim Arabic', when talking in a mixed group. In recent years there has been a renewed interest in studying the development and characteristics of standardized languages (see especially Kloss, 1952, with its valuable introduction on standardization in general), and it is in following this line of interest that the present study seeks to examine carefully one particular kind of standardization where two varieties of a language exist side by side throughout the community, with each having a definite role to play. The term 'diglossia' is introduced here, modeled on the French diglossie, which has been applied to this situation, since there seems to be no word in regular use for this in English; other languages of Europe generally use the word for 'bilingualism' in this special sense as well. (The terms 'language', 'dialect', and 'variety' are used here without precise definition. It is hoped that they occur sufficiently in accordance with established usage to be unambiguous for the present purpose. The term 'superposed variety' is also used here without definition; it means that the variety in question is not the primary, 'native' variety for the speakers in question but may be learned in addition to this. 
Finally, no attempt is made in this paper to examine the analogous situation where two distinct (related or unrelated) languages are used side by side throughout a speech community, each with a clearly defined role.)

This paper originally appeared in Word 15:325-40 (1959), reprinted by permission of the International Linguistic Association. A preliminary version of this study, with the title "Classical or colloquial, one standard or two," was prepared for presentation at the symposium on Urbanization and Standard Language: Facts and Attitudes, held at the meeting of the American Anthropological Association in November 1958, in Washington, D.C. The preliminary version was read by a number of people and modifications were made on the basis of comments by H. Blanc, J. Gumperz, B. Halpern, M. Perlmann, R. L. Ward, and U. Weinreich.

It is likely that this particular situation in speech communities is very widespread, although it is rarely mentioned, let alone satisfactorily described. A full explanation of it can be of considerable help in dealing with problems in linguistic description, in historical linguistics, and in language typology. The present study should be regarded as preliminary in that much more assembling of descriptive and historical data is required; its purpose is to characterize diglossia by picking out four speech communities and their languages (hereafter called the defining languages) which clearly belong in this category, and describing features shared by them which seem relevant to the classification. The defining languages selected are Arabic, Modern Greek, Swiss German, and Haitian Creole. (See the references at the end of this Reading.) Before proceeding to the description it must be pointed out that diglossia is not assumed to be a stage which occurs always and only at a certain point in some kind of evolution, e.g., in the standardization process. Diglossia may develop from various origins and eventuate in different language situations. Of the four defining languages, Arabic diglossia seems to reach as far back as our knowledge of Arabic goes, and the superposed 'Classical' language has remained relatively stable, while Greek diglossia has roots going back many centuries, but it became fully developed only at the beginning of the nineteenth century with the renaissance of Greek literature and the creation of a literary language based in large part on previous forms of literary Greek.
Swiss German diglossia developed as a result of long religious and political isolation from the centers of German linguistic standardization, while Haitian Creole arose from a creolization of a pidgin French, with standard French later coming to play the role of the superposed variety. Some speculation on the possibilities of development will, however, be given at the end of the paper. For convenience of reference the superposed variety in diglossia will be called the H ('high') variety or simply H, and the regional dialects will be called L ('low') varieties or, collectively, simply L. All the defining languages have names for H and L, and these are listed in the accompanying table.

                                    H is called               L is called
Arabic       Classical (= H)        'al-fusha                 'al-'ammiyyah, 'ad-darij
             Egyptian (= L)         'il-fasih, 'in-nahawi     'il-'ammiyya
Sw. German   Stand. German (= H)    Schriftsprache            [Schweizer] Dialekt, Schweizerdeutsch
             Swiss (= L)            Hoochtuutsch              Schwyzertuutsch
H. Creole    French (= H)           francais                  creole
Greek        H and L                katharevusa               dhimotiki



It is instructive to note the problems involved in citing words of these languages in a consistent and accurate manner. First, should the words be listed in their H form or in their L form, or in both? Second, if words are cited in their L form, what kind of L should be chosen? In Greek and in Haitian Creole, it seems clear that the ordinary conversational language of the educated people of Athens and Port-au-Prince respectively should be selected. For Arabic and for Swiss German the choice must be arbitrary, and the ordinary conversational language of educated people of Cairo and of Zurich city will be used here. Third, what kind of spelling should be used to represent L? Since there is in no case a generally accepted orthography for L, some kind of phonemic or quasi-phonemic transcription would seem appropriate. The following choices were made. For Haitian Creole, the McConnell-Laubach spelling was selected, since it is approximately phonemic and is typographically simple. For Greek, the transcription was adopted from the manual Spoken Greek (Kahane et al., 1945), since this is intended to be phonemic; a transliteration of the Greek spelling seems less satisfactory not only because the spelling is variable but also because it is highly etymologizing in nature and quite unphonemic. For Swiss German, the spelling backed by Dieth (1938) was selected; though it fails to indicate all the phonemic contrasts and in some cases may indicate allophones, it is fairly consistent and seems to be a sensible systematization, without serious modification, of the spelling conventions most generally used in writing Swiss German dialect material.
Arabic, like Greek, uses a non-Roman alphabet, but transliteration is even less feasible than for Greek, partly again because of the variability of the spelling, but even more because in writing Egyptian colloquial Arabic many vowels are not indicated at all and others are often indicated ambiguously; the transcription chosen here sticks closely to the traditional systems of Semitists, being a modification for Egyptian of the scheme used by Al-Toma (1957). The fourth problem is how to represent H. For Swiss German and Haitian Creole standard German and French orthography respectively can be used even though this hides certain resemblances between the sounds of H and L in both cases. For Greek either the usual spelling in Greek letters could be used or a transliteration, but since a knowledge of Modern Greek pronunciation is less widespread than a knowledge of German and French pronunciation, the masking effect of the orthography is more serious in the Greek case, and we use the phonemic transcription instead. Arabic is the most serious problem. The two most obvious choices are (1) a transliteration of Arabic spelling (with the unwritten vowels supplied by the transcriber) or (2) a phonemic transcription of the Arabic as it would be read by a speaker of Cairo Arabic. Solution (1) has been adopted, again in accordance with Al-Toma's procedure.

Function

One of the most important features of diglossia is the specialization of function for H and L. In one set of situations only H is appropriate and in another only L, with the two sets overlapping only very slightly. As an illustration, a sample listing of possible situations is given, with indication of the variety normally used:

                                                        H    L
Sermon in church or mosque                              x
Instructions to servants, waiters, workmen, clerks           x
Personal letter                                         x
Speech in parliament, political speech                  x
University lecture                                      x
Conversation with family, friends, colleagues                x
News broadcast                                          x
Radio 'soap opera'                                           x
Newspaper editorial, news story, caption on picture     x
Caption on political cartoon                                 x
Poetry                                                  x
Folk literature                                              x

The importance of using the right variety in the right situation can hardly be overestimated. An outsider who learns to speak fluent, accurate L and then uses it in a formal speech is an object of ridicule. A member of the speech community who uses H in a purely conversational situation or in an informal activity like shopping is equally an object of ridicule. In all the defining languages it is typical behavior to have someone read aloud from a newspaper written in H and then proceed to discuss the contents in L. In all the defining languages it is typical behavior to listen to a formal speech in H and then discuss it, often with the speaker himself, in L. (The situation in formal education is often more complicated than is indicated here. In the Arab world, for example, formal university lectures are given in H, but drills, explanation, and section meetings may be in large part conducted in L, especially in the natural sciences as opposed to the humanities. Although the teachers' use of L in secondary schools is forbidden by law in some Arab countries, often a considerable part of the teachers' time is taken up with explaining in L the meaning of material in H which has been presented in books or lectures.) The last two situations on the list call for comment. In all the defining languages some poetry is composed in L, and a small handful of poets compose in both, but the status of the two kinds of poetry is very different, and for the speech community as a whole it is only the poetry in H that is felt to be 'real' poetry. (Modern Greek does not quite fit this description. Poetry in L is the major production and H verse is generally felt to be artificial.) On the other hand, in every one of the defining languages certain proverbs, politeness formulas, and the like are in H even when cited in ordinary conversation by illiterates.
It has been estimated that as much as one-fifth of the proverbs in the active repertory of Arab villagers are in H (Journal of the American Oriental Society, 1955, vol. 75, pp. 124 ff.).



Prestige

In all the defining languages the speakers regard H as superior to L in a number of respects. Sometimes the feeling is so strong that H alone is regarded as real and L is reported 'not to exist'. Speakers of Arabic, for example, may say (in L) that so-and-so doesn't know Arabic. This normally means he doesn't know H, although he may be a fluent, effective speaker of L. If a nonspeaker of Arabic asks an educated Arab for help in learning to speak Arabic the Arab will normally try to teach him H forms, insisting that these are the only ones to use. Very often, educated Arabs will maintain that they never use L at all, in spite of the fact that direct observation shows that they use it constantly in all ordinary conversation. Similarly, educated speakers of Haitian Creole frequently deny its existence, insisting that they always speak French. This attitude cannot be called a deliberate attempt to deceive the questioner, but seems almost a self-deception. When the speaker in question is replying in good faith, it is often possible to break through these attitudes by asking such questions as what kind of language he uses in speaking to his children, to servants, or to his mother. The very revealing reply is usually something like: 'Oh, but they wouldn't understand [the H form, whatever it is called].' Even where the feeling of the reality and superiority of H is not so strong, there is usually a belief that H is somehow more beautiful, more logical, better able to express important thoughts, and the like. And this belief is held also by speakers whose command of H is quite limited. To those Americans who would like to evaluate speech in terms of effectiveness of communication it comes as a shock to discover that many speakers of a language involved in diglossia characteristically prefer to hear a political speech or an expository lecture or a recitation of poetry in H even though it may be less intelligible to them than it would be in L.
In some cases the superiority of H is connected with religion. In Greek the language of the New Testament is felt to be essentially the same as the katharevusa, and the appearance of a translation of the New Testament in dhimotiki was the occasion for serious rioting in Greece in 1903. Speakers of Haitian Creole are generally accustomed to a French version of the Bible, and even when the Church uses Creole for catechisms and the like, it resorts to a highly Gallicized spelling. For Arabic, H is the language of the Qur'an and as such is widely believed to constitute the actual words of God and even to be outside the limits of space and time, i.e. to have existed 'before' time began with the creation of the world.

Literary Heritage

In every one of the defining languages there is a sizable body of written literature in H which is held in high esteem by the speech community, and contemporary literary production in H by members of the community is felt to be part of this otherwise existing literature. The body of literature may either have been produced long ago in the past history of the community or be in continuous production in another speech community in which H serves as the standard variety of the language. When the body of literature represents a long time span (as in Arabic or Greek) contemporary writers, and readers, tend to regard it as a legitimate practice to utilize words, phrases, or constructions which may have been current only at one period of the literary history and are not in widespread use at the present time. Thus it may be good journalistic usage in writing editorials, or good literary taste in composing poetry, to employ a complicated Classical Greek participial construction or a rare twelfth-century Arabic expression which it can be assumed the average educated reader will not understand without research on his part. One effect of such usage is appreciation on the part of some readers: 'So-and-so really knows his Greek [or Arabic]', or 'So-and-so's editorial today, or latest poem, is very good Greek [or Arabic].'

Acquisition

Among speakers of the four defining languages adults use L in speaking to children and children use L in speaking to one another. As a result, L is learned by children in what may be regarded as the 'normal' way of learning one's mother tongue. H may be heard by children from time to time, but the actual learning of H is chiefly accomplished by the means of formal education, whether this be traditional Qur'anic schools, modern government schools, or private tutors. This difference in method of acquisition is very important. The speaker is at home in L to a degree he almost never achieves in H. The grammatical structure of L is learned without explicit discussion of grammatical concepts; the grammar of H is learned in terms of 'rules' and norms to be imitated. It seems unlikely that any change toward full utilization of H could take place without a radical change in this pattern of acquisition. For example, those Arabs who ardently desire to have L replaced by H for all functions can hardly expect this to happen if they are unwilling to speak H to their children. (It has been very plausibly suggested that there are psychological implications following from this linguistic duality. This certainly deserves careful experimental investigation. On this point, see the highly controversial article by Shouby (1951), which seems to me to contain some important kernels of truth along with much which cannot be supported.)

Standardization

In all the defining languages there is a strong tradition of grammatical study of the H form of the language. There are grammars, dictionaries, treatises on pronunciation, style, and so on. There is an established norm for pronunciation, grammar, and vocabulary which allows variation only within certain limits. The orthography is well established and has little variation. By contrast, descriptive and normative studies of the L form are either non-existent or relatively recent and slight in quantity. Often they have been carried out first or chiefly by scholars outside the speech community and are written in other languages. There is no settled orthography and there is wide variation in pronunciation, grammar, and vocabulary. In the case of relatively small speech communities with a single important center of communication (e.g., Greece, Haiti) a kind of standard L may arise which speakers of other dialects imitate and which tends to spread like any standard variety except that it remains limited to the functions for which L is appropriate. In speech communities which have no single most important center of communication a number of regional L's may arise. In the Arabic speech community, for example, there is no standard L corresponding to educated Athenian dhimotiki, but regional standards exist in various areas. The Arabic of Cairo, for example, serves as a standard L for Egypt, and educated individuals from Upper Egypt must learn not only H but also, for conversational purposes, an approximation to Cairo L. In the Swiss German speech community there is no single standard, and even the term 'regional standard' seems inappropriate, but in several cases the L of a city or town has a strong effect on the surrounding rural L.

Stability

It might be supposed that diglossia is highly unstable, tending to change into a more stable language situation. This is not so. Diglossia typically persists at least several centuries, and evidence in some cases seems to show that it can last well over a thousand years. The communicative tensions which arise in the diglossia situation may be resolved by the use of relatively uncodified, unstable, intermediate forms of the language (Greek mikti, Arabic al-lugah al-wusta, Haitian creole de salon) and repeated borrowing of vocabulary items from H to L. In Arabic, for example, a kind of spoken Arabic much used in certain semiformal or cross-dialectal situations has a highly classical vocabulary with few or no inflectional endings, with certain features of classical syntax, but with a fundamentally colloquial base in morphology and syntax, and a generous admixture of colloquial vocabulary. In Greek a kind of mixed language has become appropriate for a large part of the press. The borrowing of lexical items from H to L is clearly analogous (or for the periods when actual diglossia was in effect in these languages, identical) with the learned borrowings from Latin to Romance languages or the Sanskrit tatsamas in Middle and New Indo-Aryan. (The exact nature of this borrowing process deserves careful investigation, especially for the important 'filter effect' of the pronunciation and grammar of H occurring in those forms of middle language which often serve as the connecting link by which the loans are introduced into the 'pure' L.)



Grammar

One of the most striking differences between H and L in the defining languages is in the grammatical structure: H has grammatical categories not present in L and has an inflectional system of nouns and verbs which is much reduced or totally absent in L. For example, Classical Arabic has three cases in the noun, marked by endings; colloquial dialects have none. Standard German has four cases in the noun and two non-periphrastic indicative tenses in the verb; Swiss German has three cases in the noun and only one simple indicative tense. Katharevusa has four cases, dhimotiki three. French has gender and number in the noun, Creole has neither. Also, in every one of the defining languages there seem to be several striking differences of word order as well as a thorough-going set of differences in the use of introductory and connective particles. It is certainly safe to say that in diglossia there are always extensive differences between the grammatical structures of H and L. This is true not only for the four defining languages, but also for every other case of diglossia examined by the author. For the defining languages it may be possible to make a further statement about grammatical differences. It is always risky to hazard generalizations about grammatical complexity, but it may be worthwhile to attempt to formulate a statement applicable to the four defining languages even if it should turn out to be invalid for other instances of diglossia (cf. Greenberg, 1954). There is probably fairly wide agreement among linguists that the grammatical structure of language A is 'simpler' than that of B if, other things being equal,

1. the morphophonemics of A is simpler, i.e. morphemes have fewer alternants, alternation is more regular, automatic (e.g., Turkish -lar~-ler is simpler than the English plural markers);
2. there are fewer obligatory categories marked by morphemes or concord (e.g., Persian with no gender distinctions in the pronoun is simpler than Egyptian Arabic with masculine-feminine distinction in the second and third persons singular);
3. paradigms are more symmetrical (e.g., a language with all declensions having the same number of case distinctions is simpler than one in which there is variation);
4. concord and rection are stricter (e.g., prepositions all take the same case rather than different cases).

If this understanding of grammatical simplicity is accepted, then we may note that in at least three of the defining languages, the grammatical structure of any given L variety is simpler than that of its corresponding H. This seems incontrovertibly true for Arabic, Greek, and Haitian Creole; a full analysis of standard German and Swiss German might show this not to be true in that diglossic situation in view of the extensive morphophonemics of Swiss.



Lexicon Generally speaking, the bulk of the vocabulary of H and L is shared, of course with variations in form and with differences of use and meaning. It is hardly surprising, however, that H should include in its total lexicon technical terms and learned expressions which have no regular L equivalents, since the subjects involved are rarely if ever discussed in pure L. Also, it is not surprising that the L varieties should include in their total lexicons popular expressions and the names of very homely objects or objects of very localized distribution which have no regular H equivalents, since the subjects involved are rarely if ever discussed in pure H. But a striking feature of diglossia is the existence of many paired items, one H one L, referring to fairly common concepts frequently used in both H and L, where the range of meaning of the two items is roughly the same, and the use of one or the other immediately stamps the utterance or written sequence as H or L. For example, in Arabic the H word for 'see' is ra'a, the L word is saf. The word ra'a never occurs in ordinary conversation and saf is not used in normal written Arabic. If for some reason a remark in which saf was used is quoted in the press, it is replaced by ra'a in the written quotation. In Greek the H word for 'wine' is inos, the L word is krasi. The menu will have inos written on it, but the diner will ask the waiter for krasi. The nearest American English parallels are such cases as illumination ~ light, purchase ~ buy, or children ~ kids, but in these cases both words may be written and both may be used in ordinary conversation: the gap is not so great as for the corresponding doublets in diglossia. Also, the formal-informal dimension in languages like English is a continuum in which the boundary between the two items in different pairs may not come at the same point, e.g., illumination, purchase, and children are not fully parallel in their formal-informal range of usage. 
A dozen or so examples of lexical doublets from three of the sample languages are given below. For each language two nouns, a verb, and two particles are given.

Greek
H               Gloss             L
ikos            house             spiti
idhor           water             nero
eteke           gave birth        eyenise
ala             but               ma

Arabic
H               Gloss             L
hiða'un         shoe              gazma
'anfun          nose              manaxir
ðahaba          went              rah
ma              what              'eh
'al'ana         now               dilwa'ti

Creole
H               Gloss             L
homme, gens     person, people    moun (not connected with monde)
ane             donkey            bourik
donner          give              bay
beaucoup        much, a lot       apil
maintenant      now               kou-n-ye-a

It would be possible to present such a list of doublets for Swiss German (e.g., nachdem ~ no 'after', jemand ~ öpper 'someone', etc.), but this would give a false picture. In Swiss German the phonological differences between H and L are very great and the normal form of lexical pairing is regular cognation (klein ~ chly 'small', etc.).

Phonology It may seem difficult to offer any generalization on the relationships between the phonology of H and L in diglossia in view of the diversity of data. H and L phonologies may be quite close, as in Greek; moderately different, as in Arabic or Haitian Creole; or strikingly divergent, as in Swiss German. Closer examination, however, shows two statements to be justified. (Perhaps these will turn out to be unnecessary when the preceding features are stated so precisely that the statements about phonology can be deduced directly from them.)
1. The sound systems of H and L constitute a single phonological structure of which the L phonology is the basic system and the divergent features of H phonology are either a subsystem or a parasystem. Given the mixed forms mentioned above and the corresponding difficulty of identifying a given word in a given utterance as being definitely H or definitely L, it seems necessary to assume that the speaker has a single inventory of distinctive oppositions for the whole HL complex and that there is extensive interference in both directions in terms of the distribution of phonemes in specific lexical items. (For details on certain aspects of this phonological interference in Arabic, cf. Ferguson, 1957.)
2. If 'pure' H items have phonemes not found in 'pure' L items, L phonemes frequently substitute for these in oral use of H and regularly replace them in tatsamas. For example, French has a high front rounded vowel phoneme /ü/; 'pure' Haitian Creole has no such phoneme. Educated speakers of Creole use this vowel in tatsamas such as Luk (/lük/ for the Gospel of St Luke), while they, like uneducated speakers, may sometimes use /i/ for it when speaking French. On the other hand /i/ is the regular vowel in such tatsamas in Creole as linet 'glasses'. In cases where H represents in large part an earlier stage of L, it is possible that a three-way correspondence will appear. For example, Syrian and Egyptian Arabic frequently use /s/ for /θ/ in oral use of Classical Arabic, and have /s/ in tatsamas, but have /t/ in words regularly descended from earlier Arabic not borrowed from the Classical. (See Ferguson, 1957.)
Now that the characteristic features of diglossia have been outlined it is feasible to attempt a fuller definition. DIGLOSSIA is a relatively stable language situation in which, in addition to the primary dialects of the language (which may
include a standard or regional standards), there is a very divergent, highly codified (often grammatically more complex) superposed variety, the vehicle of a large and respected body of written literature, either of an earlier period or in another speech community, which is learned largely by formal education and is used for most written and formal spoken purposes but is not used by any sector of the community for ordinary conversation. With the characterization of diglossia completed we may turn to a brief consideration of three additional questions: How does diglossia differ from the familiar situation of a standard language with regional dialects? How widespread is the phenomenon of diglossia in space, time, and linguistic families? Under what circumstances does diglossia come into being and into what language situations is it likely to develop? The precise role of the standard variety (or varieties) of a language vis-a-vis regional or social dialects differs from one speech community to another, and some instances of this relation may be close to diglossia or perhaps even better considered as diglossia. As characterized here, diglossia differs from the more widespread standard-with-dialects in that no segment of the speech community in diglossia regularly uses H as a medium of ordinary conversation, and any attempt to do so is felt to be either pedantic and artificial (Arabic, Greek) or else in some sense disloyal to the community (Swiss German, Creole). In the more usual standard-with-dialects situation the standard is often similar to the variety of a certain region or social group (e.g., Tehran Persian, Calcutta Bengali) which is used in ordinary conversation more or less naturally by members of the group and as a superposed variety by others. Diglossia is apparently not limited to any geographical region or language family. 
(All clearly documented instances known to me are in literate communities, but it seems at least possible that a somewhat similar situation could exist in a non-literate community where a body of oral literature could play the same role as the body of written literature in the examples cited.) Three examples of diglossia from other times and places may be cited as illustrations of the utility of the concept. First, consider Tamil. As used by the millions of members of the Tamil speech community in India today, it fits the definition exactly. There is a literary Tamil as H used for writing and certain kinds of formal speaking and a standard colloquial as L (as well as local L dialects) used in ordinary conversation. There is a body of literature in H going back many centuries which is highly regarded by Tamil speakers today. H has prestige, L does not. H is always superposed, L is learned naturally, whether as primary or as a superposed standard colloquial. There are striking grammatical differences and some phonological differences between the two varieties. (There is apparently no good description available of the precise relations of the two varieties of Tamil; an account of some of the structural differences is given by Pillai (1960). Incidentally, it may be noted that Tamil diglossia seems to go back many centuries, since the language of early literature contrasts sharply with the language of early inscriptions, which probably reflect the spoken language of the time.) The situation is only slightly complicated by the presence of Sanskrit and English for certain functions of H; the same kind of


complication exists in parts of the Arab world where French, English, or a liturgical language such as Syriac or Coptic has certain H-like functions. Second, we may mention Latin and the emergent Romance languages during a period of some centuries in various parts of Europe. The vernacular was used in ordinary conversation but Latin for writing or certain kinds of formal speech. Latin was the language of the Church and its literature, Latin had the prestige, there were striking grammatical differences between the two varieties in each area, etc. Third, Chinese should be cited because it probably represents diglossia on the largest scale of any attested instance. (An excellent, brief description of the complex Chinese situation is available in the introduction to Chao (1947, pp. 1-17).) The wen-li corresponds to H, while Mandarin colloquial is a standard L; there are also regional L varieties so different as to deserve the label 'separate languages' even more than the Arabic dialects, and at least as much as the emergent Romance languages in the Latin example. Chinese, however, like modern Greek, seems to be developing away from diglossia toward a standard-with-dialects in that the standard L or a mixed variety is coming to be used in writing for more and more purposes, i.e. it is becoming a true standard.
Diglossia is likely to come into being when the following three conditions hold in a given speech community: (1) There is a sizable body of literature in a language closely related to (or even identical with) the natural language of the community, and this literature embodies, whether as source (e.g., divine revelation) or reinforcement, some of the fundamental values of the community. (2) Literacy in the community is limited to a small elite. (3) A suitable period of time, of the order of several centuries, passes from the establishment of (1) and (2).
It can probably be shown that this combination of circumstances has occurred hundreds of times in the past and has generally resulted in diglossia. Dozens of examples exist today, and it is likely that examples will occur in the future. Diglossia seems to be accepted and not regarded as a 'problem' by the community in which it is in force, until certain trends appear in the community. These include trends toward (1) more widespread literacy (whether for economic, ideological or other reasons), (2) broader communication among different regional and social segments of the community (e.g., for economic, administrative, military, or ideological reasons), (3) desire for a full-fledged standard 'national' language as an attribute of autonomy or of sovereignty. When these trends appear, leaders in the community begin to call for unification of the language, and for that matter, actual trends toward unification begin to take place. These individuals tend to support either the adoption of H or of one form of L as the standard, less often the adoption of a modified H or L, a 'mixed' variety of some kind. The arguments explicitly advanced seem remarkably the same from one instance of diglossia to another. The proponents of H argue that H must be adopted because it connects the community with its glorious past or with the world community and because it is a naturally unifying factor as opposed to the divisive nature of the L dialects. In addition to these two fundamentally sound arguments there are usually pleas based on the beliefs of the community in the superiority of H: that it is more beautiful,



more expressive, more logical, that it has divine sanction, or whatever their specific beliefs may be. When these latter arguments are examined objectively their validity is often quite limited, but their importance is still very great because they reflect widely held attitudes within the community. The proponents of L argue that some variety of L must be adopted because it is closer to the real thinking and feeling of the people; it eases the educational problem since people have already acquired a basic knowledge of it in early childhood; and it is a more effective instrument of communication at all levels. In addition to these fundamentally sound arguments there is often great emphasis given to points of lesser importance such as the vividness of metaphor in the colloquial, the fact that other 'modern nations' write very much as they speak, and so on. The proponents of both sides or even of the mixed language seem to show the conviction—although this may not be explicitly stated—that a standard language can simply be legislated into place in a community. Often the trends which will be decisive in the development of a standard language are already at work and have little to do with the argumentation of the spokesmen for the various viewpoints. A brief and superficial glance at the outcome of diglossia in the past and a consideration of present trends suggests that there are only a few general kinds of development likely to take place. First, we must remind ourselves that the situation may remain stable for long periods of time. But if the trends mentioned above do appear and become strong, change may take place. Second, H can succeed in establishing itself as a standard only if it is already serving as a standard language in some other community and the diglossia community, for reasons linguistic and non-linguistic, tends to merge with the other community. 
Otherwise H fades away and becomes a learned or liturgical language studied only by scholars or specialists and not used actively in the community. Some form of L or a mixed variety becomes standard. Third, if there is a single communication center in the whole speech community, or if there are several such centers all in one dialect area, the L variety of the center(s) will be the basis of the new standard, whether relatively pure L or considerably mixed with H. If there are several such centers in different dialect areas with no one center paramount, then it is likely that several L varieties will become standard as separate languages. A tentative prognosis for the four defining languages over the next two centuries (i.e. to about AD 2150) may be hazarded: Swiss German: Relative stability. Arabic: Slow development toward several standard languages, each based on an L variety with heavy admixture of H vocabulary. Three seem likely: Maghrebi (based on Rabat or Tunis?), Egyptian (based on Cairo), Eastern (based on Baghdad?); unexpected politicoeconomic developments might add Syrian (based on Damascus?), Sudanese (based on Omdurman-Khartoum), or others.


Haitian Creole: Slow development toward unified standard based on L of Port-au-Prince. Greek: Full development to unified standard based on L of Athens plus heavy admixture of H vocabulary.
This paper concludes with an appeal for further study of this phenomenon and related ones. Descriptive linguists in their understandable zeal to describe the internal structure of the language they are studying often fail to provide even the most elementary data about the socio-cultural setting in which the language functions. Also, descriptivists usually prefer detailed descriptions of 'pure' dialects or standard languages rather than the careful study of the mixed, intermediate forms often in wider use. Study of such matters as diglossia is of clear value in understanding processes of linguistic change and presents interesting challenges to some of the assumptions of synchronic linguistics. Outside linguistics proper it promises material of great interest to social scientists in general, especially if a general frame of reference can be worked out for analysis of the use of one or more varieties of language within a speech community. Perhaps the collection of data and more profound study will drastically modify the impressionistic remarks of this paper, but if this is so the paper will have had the virtue of stimulating investigation and thought.

References on the Four Defining Languages The judgements of this paper are based primarily on the author's personal experience, but documentation for the four defining languages is available, and the following references may be consulted for further details. Most of the studies listed here take a strong stand in favor of greater use of the more colloquial variety since it is generally writers of this opinion who want to describe the facts. This bias can, however, be ignored by the reader who simply wants to discover the basic facts of the situation.

References Modern Greek Hatzidakis, G. N. (1905), Die Sprachfrage in Griechenland, Chatzedaka, Athens. Kahane, H., Kahane, R. and Ward, R. L. (1945), Spoken Greek, Washington. Krumbacher, K. (1902), Das Problem der modernen griechischen Schriftsprache, Munich. Pernot, H. (1898), Grammaire Grecque Moderne, Paris, pp. vii-xxxi. Psichari, J. (1928), 'Un Pays qui ne veut pas sa langue', Mercure de France, 1 October, pp. 63-121. Also in Psichari, Quelques travaux . . . . , Paris, 1930, vol. I, pp. 1283-1337. Steinmetz, A. (1936), 'Schrift und Volksprache in Griechenland', Deutsche Akademie (Munich), Mitteilungen, pp. 370-379.



Swiss German Dieth, E. (1938), Schwyzertütsch Dialäktschrift, Zurich. Greyerz, O. von (1933), 'Vom Wert und Wesen unserer Mundart', Sprache, Dichtung, Heimat, Berne, pp. 226-247. Kloss, H. (1952), Die Entwicklung neuer germanischer Kultursprachen von 1800 bis 1950, Pohl, Munich. Schmid, K. (1936), 'Für unser Schweizerdeutsch', Die Schweiz: ein nationales Jahrbuch 1936, Basel, pp. 65-79. Senn, A. (1935), 'Das Verhältnis von Mundart und Schriftsprache in der deutschen Schweiz', Journal of English and Germanic Philology, vol. 34, pp. 42-58.

Arabic Al-Toma, S. J. (1957), 'The teaching of Classical Arabic to speakers of the colloquial in Iraq: a study of the problem of linguistic duality', Doctoral dissertation, Harvard University. Chejne, A. (1958), 'The role of Arabic in present-day Arab society', The Islamic Literature, vol. 10, no. 4, pp. 15-54. Lecerf, J. (1932), Littérature Dialectale et renaissance arabe moderne (Damascus, 1932-3), pp. 1-14. Majallat al-majma' al-'ilmi al-'arabi (Dimashq), vol. 32, no. 1, 'Adad xass bilmu'tamar al-awwal lilmajami al-lugawiyyah al-'ilmiyyah al-'arabiyyah (Damascus, January 1957). Marçais, W. (1930-31), Three articles, L'Enseignement Public, vol. 97, pp. 401-9; vol. 105, pp. 20-39, 120-33.

Haitian Creole Comhaire-Sylvain, S. (1936), Le Creole haitien, Wetteren and Port-au-Prince. Hall, R. A., Jr. (1953), Haitian Creole, Menasha, Wis. McConnell, H. O., and Swan, E. (1945), You Can Learn Creole, Port-au-Prince.

Other References Chao, Y. R. (1947), Cantonese Primer, Harvard University Press. Ferguson, C. A. (1957), 'Two problems in Arabic phonology', Word, vol. 13, pp. 460-78. Greenberg, J. H. (1954), 'A quantitative approach to the morphological typology of language', in R. Spencer (ed.), Method and Perspective in Anthropology, University of Minnesota Press, pp. 192-220. Pillai, M. (1960), 'Tamil—literary and colloquial', in C. A. Ferguson and J. J. Gumperz (eds.), Linguistic Diversity in South Asia, Indiana University Research Center in Anthropology, Folklore and Linguistics: Publication 13, pp. 27-42. Shouby, E. (1951), 'The influence of the Arabic language on the psychology of the Arabs', Middle East Journal, vol. 5, pp. 284-302.

2 Language Development

Discussions of language problems of developing countries cover a wide range of problems such as national multilingualism, language education policies, and languages as symbols of group identity. Many of these issues can be dealt with by the conceptual frameworks used in the study of social organization, political systems, or economic processes. Some questions, however, relate to the state of a language itself, as shown by observations that such and such a language is "backward" or "inadequate" or that a particular language needs "purifying," "reforming," "modernizing," or some other forms of improvement. This kind of issue is closer to the conceptual framework of linguistics, although not commonly dealt with by professional linguists, and it may be useful to offer a linguistic perspective on it.

This paper was originally published in J. A. Fishman, C. A. Ferguson, and J. Das Gupta, eds., Language Problems of Developing Nations. New York: John Wiley and Sons (1968), reprinted with permission of John Wiley and Sons.

Linguistic Structure The traditional twofold task of linguistics is to make statements that hold true for all languages everywhere and at all times (a general theory of language) and to make statements about particular varieties of language under particular conditions (characterization of languages, e.g., grammars, language histories, etc.). In either case linguists have been concerned with what kinds of sequences are "pronounceable" in a given language (or in all languages), what kinds of sequences are grammatically possible in a given language (or in general), and what kinds of sequences are meaningful in a language (or in general). Although the range of possible variation remains within the definite limits of the general characteristics of languages, now commonly called "language universals" (Greenberg, 1966a, 1966b; Uspenskij, 1965), there is an astonishing diversity of possible structures as exemplified by the thousands of different languages now in existence and the several hundred for which there is historical documentation. Accordingly, it has often been tempting to regard one kind of linguistic (phonological-grammatical-semantic) structure as being in some way superior to or more advanced than others. As time has passed, however, linguists have increasingly become convinced that there is no simple scale of superiority in structure and no simple evolutionary line along which known linguistic structures could be placed. In this fundamental sense there is as yet no convincing evidence that the total structure of one language is better than that of another in that it is easier to acquire (as a first language), less ambiguous, more efficient for cognitive processes, or more economical of effort in oral use, let alone more "logical," "expressive," or the like (Ray, 1963, Ch. 10). The three great world languages, Russian, Chinese, and English, differ greatly in structure. Russian has a relatively stable vowel system, a pervasive feature of palatalization, and a complex inflectional morphology. Chinese has a phonology and lexicon largely based on the syllable, it has distinctive tone, and it has almost no morphological machinery. English has a very variable vowel system and is somewhere between the other two in morphological complexity. There is at present no known way to rate the respective structures of the three languages as wholes.1 The assumption is now standard in linguistics that all known languages apparently constitute roughly the same kind of symbolic behavior system, in spite of this great variety, and that there are at the present time no "primitive" languages exemplifying the type of earlier stage in language behavior that must have existed hundreds of thousands of years ago (Hymes, 1961, pp. 75-76).

Features of Development If judgments of backwardness or limited development of a language cannot be made on the basis of linguistic structure, how can they be made? The view adopted here is that there are at least three dimensions relevant for measuring language development: graphization, the reduction to writing; standardization, the development of a norm which overrides regional and social dialects; and, for want of a better term, modernization, the development of intertranslatability with other languages in a range of topics and forms of discourse characteristic of industrialized, secularized, structurally differentiated, "modern" societies. Hymes (1961) offers a more comprehensive approach for evolutionary study of language; here we follow a more limited approach comparable to developmental studies in political sociology or economics rather than the evolutionary approach of anthropology. Ferguson (1962) uses two dimensions: (a) "utilization in writing," which combines graphization and modernization, and (b) standardization. Haugen (1966) offers a four-way matrix of development form-function-society-language giving (a) selection of norm, (b) codification, (c) elaboration of function, (d) acceptance; of these, elaboration corresponds roughly to modernization, and the other three are aspects of standardization. Fishman (1968) emphasizes that the processes of language development are not single events but involve repeated elaboration and recodification. Sjoberg (1964) offers a suggestive but oversimplified matching of stages of language development with preliterate, preindustrialized civilized, transitional, and industrialized nations.

Graphization The regular use of writing in a speech community, like such other innovations as the use of a steel knife in a stone-age society, has repercussions throughout the culture and social organization. The relative permanence of written records makes possible the transmission of more material from generation to generation; the transportability of written records makes possible communication with a larger number of people; and the immediate fixing in written form makes possible more complex sequential thought on the part of individuals. In this essay, however, we are concerned with the effect of writing on the development of language itself. The first point to be made is that the use of writing adds another variety of language to the community's repertory. The vocabulary, grammatical structure, and even the phonological structure of the language as used in writing begin immediately, as it were, to have a life of their own. Linguists like to point out that speech is primary and writing secondary and that written language is always in some sense a representation of speech. Although this is true in a general way, and is worth repeated emphasis to correct widespread misconceptions, the fact is that writing almost never reflects speech in an exact way, written language frequently develops characteristics not found in the corresponding spoken language, and it may change along lines quite different from changes in the spoken language. After the spread of writing, varieties of the spoken language can no longer be described in vacuo; they will interact with the written form to a greater or lesser degree, and the linguistic analyst must note spelling pronunciations, lexical displacements, and grammatical fluctuations which originate in or are reinforced by written usage. It is remarkable that communities, as they begin the regular use of writing, generally do not feel that ordinary, everyday speech is appropriate for written use. 
Sometimes this may be because the community already makes use of a classical language, but sometimes it merely transfers to the new medium some of the attitudes already present in the community toward the language of higher levels of discourse such as formal speeches, religious rituals, and the like. It may be assumed that all speech communities show linguistic differentiation along a casual/noncasual dimension (Voegelin, 1960), and many communities will regard the new use of writing as far along toward the noncasual end, only much later coming to recognize the value of written representation of casual speech. Two well-described examples of this tendency may be found in the beginnings of modern Bengali prose and the early use of modern English as a regular means of written communication (Das, 1966, pp. 17-22; Jones, 1953, Ch. I). It is sometimes asserted that the existence of a written variety inhibits language change, thus constituting an important influence for uniformity through time comparable to the kind of regional and social uniformity implicit in standardization. Evidence on this point is conflicting and the question merits systematic study (cf. Zengel, 1962).
The second point to be made is that the use of writing leads to the folk belief
that the written language is the "real" language and speech is a corruption of it. This attitude seems to be nearly universal in communities which have attained the regular use of writing. It is only the occasional perceptive observer, or in more recent times the professional linguist, who sees the relationship in other terms. To the extent that after the passage of time the written form of the language proves to be the more conservative, the spoken may be regarded with some justification as derived from it, but the picture is invariably more complicated than this, since isolated relic areas may be less innovative than the written language, or the original dialect base of the written language may have been a highly divergent variety of the language. The importance of this folk belief for language development lies in the way it limits the kind of conscious intervention in the form of language planning that the community will conceive of or accept. Much time and effort is often spent on questions of orthography and language reform, in the tacit assumption that changes in the written language will be followed automatically by changes in speech. Some reforming zeal is also expended on bringing pronunciation in line with existing written norms. Insofar as these various efforts are part of a standardization process which responds to the communicative needs of the speech community, they may result in actual change, especially if they do not conflict with the basic phonological and grammatical structure of the language, but often the efforts fail, at least in part because the beliefs do not correspond to the realities of the written-spoken relationship.

Standardization Language standardization is the process of one variety of a language becoming widely accepted throughout the speech community as a supra-dialectal norm—the "best" form of the language—rated above regional and social dialects, although these may be felt appropriate in some domains. The process of standardization in language is often mentioned in works on general linguistics and many books on the history of particular languages deal with the process, but general treatments of standardization are rare (cf. Kloss, 1952, Ch. I; Guxman, 1960; Ray, 1963). The concept of standardization also includes the notions of increasing uniformity of the norm itself and explicit codification of the norm. It is sometimes extended also to include such notions as the introduction of writing, the expansion of lexicon, and even the choice of one language instead of another as an official or national language, but it will not be understood in these senses here. Various aspects of the process of standardization can be documented for scores of languages in the past and it is in progress in many languages today. While standardization is recognized as a dimension in language development as viewed here, there are a good number of instances where a language has been highly standardized and has then regressed to a state of dialect diversity without a standard and may even have been restandardized on a different basis later. This is especially clear with languages with a very long written history such as Egyptian, but other cases are also well-known (Pulgram, 1958, Ch. 23). Such regressions, comparable to returning from a high level of technology to a lower one, are known


Speech Communities and Language Situations

in other aspects of cultural evolution and do not invalidate the developmental viewpoint. Although at this point in history there can be no certainty about the nature of the final achievement of standardization as a stage in language development, it seems possible to interpret the various forms of standardization as moving toward an ideal state when the language "has a single, widely accepted norm which is felt to be appropriate with only minor modification or varieties for all purposes for which the language is used" (Ferguson, 1962, p. 10). If this interpretation is followed, a number of special types of language standardization can be viewed as way-stations in the developmental process. Examples of these special types include diglossia, where the supradialectal norm is not used for ordinary conversation, as in Arabic and Tamil (Ferguson, 1959), and multimodal standardization, where competing supradialectal norms exist, as in Eastern and Western Armenian (Garibian, 1960), Hindi-Urdu (Gumperz and Naim, 1960), and Norwegian Bokmål-Nynorsk (Haugen and Chapman, 1964, pp. 365-366).

The process of language standardization is not well understood and needs both case studies and attempts at generalization so that some testable hypotheses can be advanced, but at least two points can be made on the basis of present knowledge. First, there are many paths of standardization and a number of sociolinguistic variables to be investigated in connection with the different paths. Second, in most of the well-known cases of language standardization in Europe since the Renaissance, a number of features keep recurring, although they are not all present in each case:

1. The basis of the standard was the speech of an educated middle class in an important urban center.
2. The standardizing language was displacing another language from its position as normal written medium.
3. One writer or a small number of writers served as acknowledged models for literary use of the standardizing language.
4. The standardizing language served as a symbol of either religious or national identity.

Some of these features are also evident in language standardization in other parts of the world and at other times, but the examination of other cases would probably require adding some features to the list.

Modernization The modernization of a language may be thought of as the process of its becoming the equal of other developed languages as a medium of communication; it is in a sense the process of joining the world community of increasingly intertranslatable languages recognized as appropriate vehicles of modern forms of discourse. This view of modernization—and indeed the very term itself—should not disguise the fact that this process is not really new or "modern": it is essentially the same process that English went through in the fifteenth century or Hungarian in the

Language Development


nineteenth when the language was extended to cover topics and to appear in a range of forms of discourse for which it was not previously used, including nonliterary prose and oral communication such as lectures and professional consultation. Two important forms of discourse in contemporary modernization are the news and feature stories of the press and radio. The process of modernization thus has two aspects: (a) the expansion of the lexicon of the language by new words and expressions and (b) the development of new styles and forms of discourse. The second aspect has less often been discussed than lexical expansion and would repay study. Interestingly enough, the new forms of discourse that must be developed seem in themselves to be less distinctive for the speech community than the more literary forms (oral or written) preceding them. Thus the poetic structures (meter, rhyme, assonance, allusion, stanza-form, etc.) of a given language may be highly distinctive and difficult to transfer to other languages, whereas the structures of nonliterary prose (paragraphing, ordered sequences, transitions, summaries, cross-references, etc.) tend to be universal and highly translatable.

Lexical expansion is required in order to treat new topics, and this seems to take place most effectively when the tempo of change is not too fast, the practitioners who need the vocabulary are involved in its creation, and there are sufficient lines of communication among the users of the new terminology to achieve consistency. In this area, too, there has been little systematic study. The efforts of language planners generally focus on the production of glossaries and dictionaries of new technical terms and on disputes about the proper form of new words, when the critical question seems to be that of assuring the consistent use of such forms by the appropriate sectors of the population.
When the lexical expansion of modernization actually takes place, it may not be at all in accord with the carefully prepared glossaries of the planners. Probably the use of new terms and expressions in such places as secondary school textbooks, professional papers, and conversation among specialists is far more important than the publication of extensive lists of words. Case studies of rapid lexical expansion in recent times should be made to determine the factors accounting for success in cases like Japanese and Hungarian and relative lack of success in cases like Hindi and Arabic. On the issue of the source of new vocabulary and the methods of word creation, one important point seems to be that a technical vocabulary can be equally effective whether it comes from the language's own processes of word formation or from extensive borrowing from another language. The issue of purism can be a critical one in the sense that feelings may be strong and disagreements sharp, but it seems almost totally irrelevant to the final success of the lexical expansion process. Of the two examples cited, Hungarian followed almost exclusively the path of internal creation, whereas Japanese used extensive borrowing from English as well. This issue is important for social psychological research in finding the factors involved in the attitudes adopted, and it has a kind of importance for linguistic research in that it may involve changes in word structure and the distribution of sounds, but it seems of less importance for understanding the process of language development itself (cf. Ray, 1963, pp. 36-44 and references).


Summary

Among the many language aspects of national development that could be the object of study and measurement, it is possible to isolate the question of the degree of development of a particular language. Language development in this sense is viewed here as having three conceptually distinct components: (a) graphization, the use of writing; (b) standardization, the use of a supradialectal norm; and (c) modernization, the development of vocabulary and forms of discourse. Graphization adds to the language a new variety, which in relation to the spoken varieties tends to be slower to change, is generally regarded by the users as more fundamental, and can serve as a better means of standardization and modernization. Standardization brings to a language the kind of integration and uniformity needed for large-scale communication, but there are various paths of standardization, and analysis of these and the relevant social variables is needed. Modernization provides the language with the specialized subvocabularies and forms of discourse corresponding to the highly differentiated functions the language must fulfill in a modern society. Finally, all three components of language development can be the object of language planning (Haugen, 1966, and references) although the factors making for success and failure in such planning are not clear.

Note 1. This does not, of course, exclude the possibility that particular features of particular languages may be rated as more regular, easier to acquire, etc. For example, the representation of the grammatico-semantic category of number (singular-plural) is very complex and irregular in Russian, much simpler in English, and very simple and relatively unimportant in Chinese. It is reasonable to assume—and there is some confirmatory evidence—that children learning these languages acquire the notion of plurality at roughly the same ages but that it takes the Russian child longest to attain full mastery of the plural inflection, the English-learning child less time, and the Chinese child least. There is no way to weight this isolated phenomenon against the total structure of the language. Children seem to master the basic structures of their languages at roughly the same age, no matter what language they are acquiring.

References

Das, Susirkumar. 1966. Early Bengali Prose. Calcutta: Bookland.
Ferguson, Charles A. 1959. Diglossia. Word, 15:325-340.
Ferguson, Charles A. 1962. The Language Factor in National Development. Anthro. Ling., 4(1), 23-27. Reprinted in F. A. Rice (ed.), Study of the Role of Second Languages, Washington, Center for Applied Linguistics, 1962, pp. 8-14.
Fishman, Joshua A. 1968. Sociolinguistics and the Language Problems of the Developing Countries. In J. A. Fishman, C. A. Ferguson and J. Das Gupta (eds.), Language Problems of Developing Nations. New York: John Wiley and Sons, 1968, pp. 3-16.
Garibian, A. S. 1960. Ob armjanskom nacional'nom literaturnom jazyke. In M. M. Guxman (ed.), Voprosy formirovanija i razvitija nacional'nyx jazykov, Moscow 1960, pp. 50-61.
Garvin, Paul, and Madeleine Mathiot. 1960. The Urbanization of the Guarani Language: A Problem in Language and Culture. In A. F. C. Wallace (ed.), Men and Cultures, Philadelphia: University of Pennsylvania Press, pp. 783-790.
Guxman, M. M. (ed.). 1960. Voprosy formirovanija i razvitija nacional'nyx jazykov. Moscow.
Greenberg, Joseph H. 1966a. Language Universals. In T. A. Sebeok (ed.), Current Trends in Linguistics III. The Hague: Mouton, pp. 60-112. (Reprinted separately with slight revisions by Mouton.)
Greenberg, Joseph H. (ed.). 1966b. Universals of Language, 2nd ed. Cambridge: MIT Press.
Gumperz, John J., and C. M. Naim. 1960. Formal and Informal Standards in the Hindi Regional Language Area. In C. A. Ferguson and J. J. Gumperz (eds.), Linguistic Diversity in South Asia, Int. J. Amer. Ling., 26(3), Pt. III (Bloomington, Ind.).
Haugen, Einar. 1966. Dialect, Language, Nation. Amer. Anthro., 68, 922-935.
Haugen, Einar, and Kenneth G. Chapman. 1964. Spoken Norwegian, rev. ed. New York: Holt, Rinehart and Winston.
Hymes, Dell H. 1961. Functions of Speech: An Evolutionary Approach. In F. Gruber (ed.), Anthropology and Education, Philadelphia: University of Pennsylvania Press, pp. 55-83.
Jones, R. F. 1953. The Triumph of English. Stanford, Cal.: Stanford University Press. (Repr. paper 1966.)
Kloss, Heinz. 1952. Die Entwicklung neuer germanischer Kultursprachen von 1800 bis 1850. Munich: Pohl.
Pulgram, Ernst. 1958. The Tongues of Italy; Prehistory and History. Cambridge, Mass.: Harvard University Press.
Ray, Punya Sloka. 1963. Language Standardization. The Hague: Mouton.
Sjoberg, Andree F. 1964. Writing, Speech, and Society: Some Changing Relationships. In Proc. 9th Intl. Cong. Ling. The Hague: Mouton, pp. 892-898.
Uspenskij, B. 1965. Strukturnaja tipologija jazykov. Moscow.
Voegelin, C. F. 1960. Casual and Noncasual Utterances within Unified Style. In T. A. Sebeok (ed.), Style in Language. Cambridge: MIT Press, pp. 57-68.
Zengel, Marjorie Smith. 1962. Literacy as a Factor in Language Change. Amer. Anthro., 64, 132-139.

3 The Role of Arabic in Ethiopia: A Sociolinguistic Perspective

National Sociolinguistic Profile Formulas

One method of presenting the sociolinguistic setting of a language is to include it in a formula representing the sociolinguistic profile of a nation or other political entity (Ferguson, 1966; Uribe Villegas, 1968). This method differs from others in that it selects a political entity rather than any other demographic, societal, cultural, or psychological framework, and in that it uses a particular taxonomy of language types and functions (Stewart, 1968). This method makes no strong claims for predictive value and omits important sociolinguistic data relevant for assessment of the 'roles' of languages in a nation; it does, however, offer a convenient way of making gross sociolinguistic comparisons among nations and it seems to have considerable heuristic value in suggesting lines of investigation and data collection often overlooked in the establishment of national language policies.

Briefly summarized, the method consists of (1) identifying the number of major and minor languages and languages of special status in the nation and (2) representing them in an additive formula using capital and lower case letters standing for language types and functions respectively. A third, more informative, expansion of the formula specifies the languages by name, so that a separate key can provide information on degree of linguistic distance among them and dialect diversity within them; if necessary, information can be added on the diversity of writing systems used. A sample national profile formula in alternative expansions might read:

   1. 2Lmaj + 6Lmin + 1Lspec
   2. (Sow + Sei) + (5Vg + Sge) + Crl

Formula (1) states that in the nation in question there are two major languages, six minor languages, and one language of special status.
Expanded formula (2) specifies the major languages as two Standard languages (S), one of which is official (o) and also serves as an important lingua franca within the country (w) and the other is used extensively in education (e) and serves as the nation's means of communication with other countries (i). It further specifies the minor languages as five vernaculars (V) which primarily serve to identify their speakers as members of particular ethnic or other sociocultural groups (g) and one standard language which not only serves this function but is also used in education. Finally, it specifies the language of special status as a Classical (or dead Standard) language used chiefly for certain religious (r) and literary (l) purposes. Further details of the method, with more precise defining criteria for the various categories, can be found in the articles cited.

This paper originally appeared in the Georgetown University Monograph Series on Languages and Linguistics 23.355-70 (1970), reprinted by permission of Georgetown University Press.
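The two expansions of the sample formula can be generated mechanically from a list of languages. The following Python sketch is only an illustration of the notation as described above, not part of the method as published: the names `Language`, `summary_formula`, and `expanded_formula` are hypothetical, and the rule of omitting the count when only one language carries a given label is inferred from the sample formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    category: str   # "maj", "min", or "spec"
    type_: str      # capital letter for the language type: S, V, C
    functions: str  # lower-case function letters, e.g. "ow"


def summary_formula(langs):
    """Expansion (1): how many languages fall in each category."""
    parts = []
    for cat in ("maj", "min", "spec"):
        n = sum(1 for lang in langs if lang.category == cat)
        if n:
            parts.append(f"{n}L{cat}")
    return " + ".join(parts)


def expanded_formula(langs):
    """Expansion (2): type and function labels, grouped by category."""
    groups = []
    for cat in ("maj", "min", "spec"):
        counts = {}  # label -> number of languages, in order of first appearance
        for lang in langs:
            if lang.category == cat:
                label = lang.type_ + lang.functions
                counts[label] = counts.get(label, 0) + 1
        terms = [f"{n}{label}" if n > 1 else label for label, n in counts.items()]
        if not terms:
            continue
        # Parenthesize only when a category contributes more than one term.
        groups.append("(" + " + ".join(terms) + ")" if len(terms) > 1 else terms[0])
    return " + ".join(groups)


# The hypothetical nation of the sample formula.
SAMPLE = (
    [Language("maj", "S", "ow"), Language("maj", "S", "ei")]
    + [Language("min", "V", "g")] * 5
    + [Language("min", "S", "ge")]
    + [Language("spec", "C", "rl")]
)

print(summary_formula(SAMPLE))   # 2Lmaj + 6Lmin + 1Lspec
print(expanded_formula(SAMPLE))  # (Sow + Sei) + (5Vg + Sge) + Crl
```

A third expansion that names the languages would need only an additional name field on each record.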

Language Situation in Ethiopia

Like many other nations of Africa, Ethiopia is a highly multilingual country, although it differs from most other African nations in having an indigenous language constitutionally recognized as its official language. The currently available body of data is not adequate for definite identification of the major and minor languages of the country, but an approximation can be made on the basis of the present estimates of the Language Survey of Ethiopia, subject to correction as more extensive and accurate information becomes available. The Ethiopian profile formula reads:

   1.  5Lmaj + 13Lmin + 3Lspec
   2.  (3S + 2V) + (13V) + (1C + 1S + Arabic)
   2a. (Sowe + Sie + Sgw + Vgw + Vg) + (13 Vg) + (1Cr + 1Sw + Arabic)

L maj (in approximate order of sociopolitical importance)

   Sowe     Amharic           (Ethio-)Semitic
   Sie      English           Indo-European (Germanic)
   Sgw      Tigrinya          (Ethio-)Semitic
   Vgw      Galla (-Oromo)    E. Cushitic
   Vg       Somali            E. Cushitic

L min (in alphabetical order)

   V1g      Afar              E. Cushitic
   V2g      Anyuak            Nilo-Saharan
   V3g      Beja              N. Cushitic
   V4g      Chaha Gurage      (Ethio-)Semitic
   V5g      Derasa            E. Cushitic
   V6g      Gumuz             Nilo-Saharan
   V7g      Hadiyya           E. Cushitic
   V8g      Janjero           Omotic
   V9g      Kefa              Omotic
   V10g     Kembata           E. Cushitic
   V11g     Sidamo            E. Cushitic
   V12g     Tigre             (Ethio-)Semitic
   V13g     Wellamo           Omotic

L spec

   Cr       Geez              (Ethio-)Semitic
   Sw       Italian           Indo-European (Romance)
   Arabic   Arabic            Semitic
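As a consistency check, formulas (1) and (2) can be recomputed from the language lists. The Python sketch below is illustrative only: the data are transcribed from the lists above, with labels taken from formula (2a), while the tuple layout and the function names `formula_1` and `formula_2` are assumptions made for the example. Arabic is left without a type label because the chapter has not yet assigned one at this point.

```python
from collections import Counter

# (category, label, language); a label of None marks the still-unexpanded Arabic entry.
LANGUAGES = (
    [
        ("maj", "Sowe", "Amharic"),
        ("maj", "Sie", "English"),
        ("maj", "Sgw", "Tigrinya"),
        ("maj", "Vgw", "Galla"),
        ("maj", "Vg", "Somali"),
    ]
    + [
        ("min", "Vg", name)
        for name in (
            "Afar", "Anyuak", "Beja", "Chaha Gurage", "Derasa", "Gumuz",
            "Hadiyya", "Janjero", "Kefa", "Kembata", "Sidamo", "Tigre", "Wellamo",
        )
    ]
    + [
        ("spec", "Cr", "Geez"),
        ("spec", "Sw", "Italian"),
        ("spec", None, "Arabic"),
    ]
)


def formula_1(langs):
    """Count the languages in each category."""
    counts = Counter(category for category, _, _ in langs)
    return " + ".join(f"{counts[c]}L{c}" for c in ("maj", "min", "spec"))


def formula_2(langs):
    """Count language types (first letter of each label) within each category."""
    groups = []
    for cat in ("maj", "min", "spec"):
        members = [(label, name) for c, label, name in langs if c == cat]
        by_type = Counter(label[0] for label, _ in members if label)
        terms = [f"{n}{t}" for t, n in by_type.items()]
        # Languages without a label are carried over by name.
        terms += [name for label, name in members if label is None]
        groups.append("(" + " + ".join(terms) + ")")
    return " + ".join(groups)


print(formula_1(LANGUAGES))  # 5Lmaj + 13Lmin + 3Lspec
print(formula_2(LANGUAGES))  # (3S + 2V) + (13V) + (1C + 1S + Arabic)
```

Both results agree with the formulas given in the text.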

Amharic is a standard language, with a writing system of its own (the Geez syllabary with a few additions) and literature going back to the fourteenth century; it serves as the medium of instruction in all government primary schools, as the primary language of oral and written communication in the government and the armed forces, and as the only Ethiopian language whose function as a lingua franca is national in scope; it is declared in the Constitution of 1955 as the official language of the Empire.

English is the medium of instruction in all government secondary schools and higher education; it is an important spoken and written medium in government communication; it is the language of upward socioeconomic mobility. It has been publicly recognized by the government as the nation's second language, and serves as its chief medium of communication with other countries.

Tigrinya is a standard language, using essentially the same writing system as Amharic; it has a small literature, and the publication of newspapers in Tigrinya antedates that of Amharic. Formerly the medium of instruction in primary schools in the Eritrea region, in which role it is being replaced by Amharic, it still serves as a lingua franca in many parts of that area.

Galla is a vernacular with considerable dialect diversity which does not seem to be moving toward standardization; it is not normally written but is spoken as a mother tongue by more people than any other language in Ethiopia. In certain parts of the country it serves as a lingua franca.

Somali is a vernacular spoken over a large but sparsely settled area. It has considerable dialect diversity, but mutual intelligibility among the dialects is high and there is some trend towards standardization. It has a large oral literature but is rarely written; in neighboring Somalia, where it is the mother tongue of 90% of the population, Arabic or European languages are used for writing (Andrzejewski, 1962).
The minor languages are all vernaculars used by ethno-linguistic communities of at least 100,000 members. Most are clearcut languages, but several, e.g. Wellamo (-Gofa-Gemu-Kullo- . . .) and Gumuz (-Sese-Disoha-Dakunza-Sai- . . .), might be regarded as dialect clusters. Afar is often considered together with the closely related language Saho. Chaha Gurage may not be spoken by 100,000, but it is included as probably the most important representative of the cluster of languages called Gurage which taken together may have nearly a million speakers.

Geez is a classical language known from inscriptions as far back as the fourth century B.C.; its periods of literary flowering were between the seventh and thirteenth centuries, long after it had ceased to be a spoken language. Today it serves as the liturgical language of the Ethiopian Orthodox Church; it is the vehicle of traditional Ethiopian ecclesiastical and historical literature and is still used for the composition of poetry. Geez uses a syllabary of some 250 characters derived from the writing system of the South Arabic inscriptions.

Italian has no official or publicly recognized status in the nation, but there are several thousand people for whom it is the mother tongue and there is a fairly active Italian press. In its standard form (with some dialect differences brought from Italy) it serves as a lingua franca among some sections of society, particularly in the Eritrea area. In a pidginized form it serves as a lingua franca at a different level in scattered areas of Ethiopia. The use of Italian seems to be declining in favor of English and Amharic.

Varieties of Arabic Arabic, as a great world language spoken by some hundred millions of people over the enormous area from Morocco to the Persian Gulf and attested in literature for nearly a millennium and a half, offers a bewildering range of variation. First there is the Classical written language extending from pre-Islamic poetry to modern technical journals: this variety shows essentially the same sound system and morphology but with considerable variation in vocabulary, syntax, and forms of discourse. Next there is Colloquial Arabic, the chain of regional dialects which constitute the Arabs' mother tongue today. The extent of variation among these dialects is greater than that between what are recognized in other circumstances as separate languages (e.g. Norwegian and Swedish), but the speakers of these dialects have a strong sense of linguistic unity, and a speaker of Arabic recognizes that speakers of other dialects are also speaking Arabic. These two varieties, Classical and Colloquial, exist side by side in the Arabic speech community in a diglossia relationship (Ferguson, 1959a; Gumperz, 1962; Fishman, 1968). Among the regional dialects some may be regarded as 'prestige dialects' (cf. Johnstone, 1967, pp. xxix-xxx), notably those of important urban centers such as Cairo, Beirut-Damascus-Jerusalem, Baghdad (Muslim variety) and northern Moroccan cities. Arabic speakers, within the areas of influence of these prestige dialects, may in the course of their lives adjust their own dialect in the direction of the prestige dialect or even be bidialectal (e.g. Blanc, 1964). Intermediate between the two varieties or sets of varieties, relatively 'pure' Classical and Colloquial, there are many shadings of 'middle language'. 
These intermediate forms, some highly fluctuating and transitional, others more stable, represent two tendencies: classicization, in which a dialect is modified in the direction of classical, and koineization, in which dialects are homogenized by the modification or elimination of features which are felt to be especially distinctive of a particular regional dialect (Blanc, 1959). Some of these intermediate varieties may be viewed collectively as a 'pan-Arab koine' (cf. Johnstone, 1967, pp. xxv-xxx), and indeed the Arab world seems to be developing such a koine for at least the third time in its known history (pre-Islamic poetic koine, koine of early centuries of the Muslim era, modern koine; cf. Ferguson, 1959b).



Finally, in certain areas and under certain social conditions where Arabic has been used for limited purposes by people of other mother tongues, it has developed pidginized forms in which the lexicon and overt grammatical categories of the language have been drastically reduced. The best-known examples are the Turku of the Lake Chad area and Central Africa, and the 'Bimbashi' Arabic which spread southward from the Sudan (Heine, 1968, and references).2

Arabic in Ethiopia

Having reviewed the method of sociolinguistic profile formulas, the general language situation in Ethiopia, and the nature of sociolinguistic variation within Arabic, our task is now to identify the kinds of Arabic and their respective functions in Ethiopia in such a way that this information can be represented in the total profile formula for the nation.

Since at least as far back as the fourth millennium B.C. there has been traffic and communication across the Red Sea, between southern Arabia and the coast of eastern Africa including the Ethiopian area. And since at least the seventh century of the Common Era, this has involved the appearance of speakers of Arabic (as opposed to South Arabian languages) on African soil. This process of temporary and permanent immigration of Arabic speakers from Yemen and the southern coast of Arabia has continued into the nineteenth and twentieth centuries. The immigrants have brought both language and religion, and Arabic and Islam have spread to African populations, partly separately and partly in close connection. Also, peoples further south along the East African coast and inland who have become Muslim, as a result of influence from Yemen and southern Arabia, have moved northward, bringing with them the use of Arabic for various purposes within their basically non-Arabic-speaking society. The best example may be the constantly expanding population of Somali tribes, all of whom have been Muslim since the beginning of the sixteenth century.

Since at least as far back as the second millennium B.C., there has been traffic and communication between Egypt and the Ethiopian area. With the coming of Christianity into Ethiopia in the fourth century, religious ties with the church in Egypt formed a special line of communication, and in medieval times a large part of the literary production in Geez consisted of translations from Arabic works used by the Coptic Christians of Egypt.
In the nineteenth century, Egyptian political influence extended down the Red Sea on to the Eritrean lowlands and the city-state of Harar, and this also directly affected the spread of Arabic and Islam, separately and together. Finally, since at least the nineteenth century there has been movement of Arabic-speaking Muslims from the Sudan into Ethiopia. In addition to groups of Arabic mother tongue, many have been speakers of other languages who used Arabic as a lingua franca.

This rapid and drastically oversimplified historical account of the spread of Arabic into Ethiopia cannot do justice to the complex story, which deserves research and study in itself, but it can give some indication of the varied strands of influence involved in the present-day use of Arabic in the nation. One aspect of Arabic influence on Ethiopian language, the presence of Arabic loanwords, has received treatment in a number of studies by Leslau (e.g. Leslau, 1957).

Arabic as Mother Tongue

It is not possible to estimate with any high degree of accuracy the number of native speakers of Arabic resident in Ethiopia, although it must run in the tens of thousands. The total number is, however, relatively small, and by this criterion Arabic cannot be included in the L min of the formula. The varieties of Arabic in use by the mother-tongue speakers are roughly comparable to those in use in other parts of the Arabic-speaking world, i.e. there is a diglossia situation in which the speakers acquire the Colloquial in childhood and then superpose some amount of Classical Arabic for written and formal oral use. The kinds of Colloquial in use in Ethiopia seem to cluster around two norms, one of which may be labeled 'Yemeni', the other 'Sudanese'. Neither of these two varieties is homogeneous in Ethiopia and there is fluctuation and use of intermediate varieties, but Arabic speakers generally recognize the existence of the two major types, which differ in pronunciation, certain details of morphology, and in a considerable number of lexical items, including some items of basic vocabulary. The two varieties in any case are to a high degree mutually intelligible. As an illustration of the nature of the difference, we may cite material elicited from two Ethiopian speakers of Arabic. Both had essentially the same sound system, but differed, for example, in their reflexes of Classical /q/ and certain other consonants:

[Table of consonant reflexes in Classical Arabic and in the two speakers' varieties; the column layout is not recoverable. Surviving entries: /q/, /g/, /d/, /ð, d/, /θ, t/.]


In matters of morphology, for example, the 'Sudanese' had the ending -ta for the first and second person singular of the past tense while the 'Yemeni' had -t for both, but in some styles of speech used -tu for the first person and -ta for the second. Or, the equivalent of 'this' was da after the noun for the 'Sudanese' and haða before the noun for the 'Yemeni'. On the standard 100-word list of basic vocabulary used in the Survey, the two informants had different words on about thirty items, although this may be misleading, since for a number of these the other word would also have been familiar either as a synonym or from Classical use. Examples of the differences:

              Yemeni          Sudanese        Classical
   'foot'     xuff            rijl            rijl
   'man'      rajul           zol             rajul
   'sit'      jalas           ga'ad           jalisa, qa'ada
   'water'    moya (masc.)    moya (fem.)     ma'
   'what'     'es             sunu            ma
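The "about thirty items" figure is a simple count of glosses on which the two informants gave different words. A minimal Python sketch of that count follows, using only the five example items quoted above; the dictionary layout and the function names are assumptions made for illustration, and forms are compared by spelling alone, so the gender difference in 'water' is not counted.

```python
# gloss -> (Yemeni form, Sudanese form), transcribed from the examples above
WORDS = {
    "foot": ("xuff", "rijl"),
    "man": ("rajul", "zol"),
    "sit": ("jalas", "ga'ad"),
    "water": ("moya", "moya"),  # same form; the speakers differ only in gender
    "what": ("'es", "sunu"),
}


def differing_items(words):
    """Return the glosses for which the two speakers used different forms."""
    return [gloss for gloss, (yemeni, sudanese) in words.items() if yemeni != sudanese]


def share_differing(words):
    """Proportion of glosses with different forms."""
    return len(differing_items(words)) / len(words)


print(differing_items(WORDS))  # ['foot', 'man', 'sit', 'what']
print(share_differing(WORDS))  # 0.8
```

The five items were selected to illustrate differences, so the proportion here is far higher than the roughly thirty in one hundred reported for the full list.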



Arabic as a Religious Language Every Muslim in the world, regardless of mother tongue, learns at least a few expressions in Arabic, such as greetings (e.g. some version of Assalamu 'alaykum 'Peace be on you'), invocations (e.g. Bismillah 'in the name of God'), a statement of faith ('There is no god but God, and Muhammad is God's messenger') and prayers, including the Fatiha, the opening surah of the Qur'an. Additional study of Islamic precepts requires memorization of further Arabic material, especially the Qur'an, and ideally the mastery of Arabic to read the traditional works of theology, jurisprudence, ethics, traditions of the Prophet, and so on. In Ethiopia there are great differences from one region to another, one ethnic background to another, and one individual to another, in the amount of Arabic a Muslim acquires for primarily religious reasons. The mastery of a few greetings and so on is relatively insignificant in the total language economy of Ethiopia, but certain aspects of the religious use of the language deserve special attention. In the first place, many thousands of Muslims every year become literate in Arabic by studying with a traditional teacher (mu'allim) or attending some kind of traditional school (madrasa);3 typically this is their initial (or, in some cases, only) acquisition of literacy since it normally takes place before entry into a 'modern' government or private school. Secondly, there may be more than a hundred thousand Muslims in Ethiopia who do not speak Arabic well, but who make use of Arabic to the extent of reciting long passages from Arabic works, carrying on stereotyped conversational exchanges in a religious context, or following to some extent a sermon or exhortation in Arabic.

Arabic as Lingua Franca More important than the preceding two points, in terms of extent of active use of Arabic in Ethiopia, is the widespread use of Arabic as a means of oral communication between speakers of different languages. There is no doubt that Amharic is the most important lingua franca in Ethiopia as a whole, but a number of other languages serve as lingua francas in limited areas, not only major languages like Tigrinya and Galla as mentioned above but even quite minor languages such as Wetawit (Berta) in the Beni Shengul region of western Ethiopia. The use of Arabic as a lingua franca only partially follows regional lines; it tends to coincide more with religious boundaries. Arabic is used as a lingua franca mostly among Muslims of various mother tongues. Some indication of the range of use of Arabic as a lingua franca is given by the questionnaire replies of twenty freshmen at Haile Selassie I University who claimed knowledge of Arabic (October 1969). These twenty students, of about twenty-one years of age, came from six different provinces, and represented ten different tongues. Twelve of the students claimed to speak Arabic 'fluently', six 'with difficulty', and two 'only a little'. While we cannot assume that these findings are representative of the users of Arabic throughout the country, they clearly show that Arabic can function widely as a lingua franca.



There are of course many Muslims in Ethiopia who are unable to converse in Arabic, so that it cannot be regarded as a normal secondary language for Muslims, but it is probably true that hundreds of thousands of Muslims in the country (as many as a million?) are able to use some kind of spoken Arabic as a means of oral communication, whereas the number of non-Muslims able to do so is very small. The kind of Arabic spoken in this way tends to cluster around 'Yemeni' and 'Sudanese' norms, but it often fluctuates more than mother tongue Arabic, mixes regional dialects, and incorporates features of Classical Arabic. Finally, we must take note of the fact that an indeterminate (although fairly small) number of Muslims who cannot use Colloquial Arabic as a means of conversation have learned enough Classical Arabic in madrasa, mosque, radio, and reading to be able to use it to a limited extent as a lingua franca, and with some hesitation we may add 'w' also to the 'C' part of Arabic in the formula: Crlw: Vgw.

Arabic as Trade Jargon Many of the Arabic-speaking immigrants to the Ethiopian areas through the centuries have been merchants, and Arab traders, shopkeepers, and small merchants can be found in many parts of Ethiopia. In communication between Arab merchant and customer, often a rudimentary, pidginized form of Arabic is used, and this use of Arabic is not so strongly limited to Muslims as the more general lingua franca use just described. Although there has been as yet no systematic study of this kind of Arabic, impressionistic observation notes some of the usual features of pidginized Arabic, such as the m. sg. for all persons of the verb, and so on. Some indication of the use of Arabic in trade transactions is given in the freshman student responses. Of the twenty students, eighteen checked 'usually use Arabic' or 'may use Arabic' in the market, in shops, or both (one student did not answer the question). Next to religious use (prayers, preaching), the trade use (market, shops) was most often checked in the 'usually use Arabic' column (religious use: twelve checks; trade use: nine checks).

Arabic in the Ethiopian Formula The material presented above on the types and functions of Arabic in Ethiopia may be summarized by an entry for Arabic in the national profile formula as:

Crlw: Vgw: (Pt)

This formula is to be interpreted as follows: there is a Classical form of the language which serves religious and literary purposes and is in a diglossia relationship with vernacular varieties of the language, the use of which serves as a mark of social group identity (i.e. Islam); both forms of the language, as well as the intermediate varieties characteristic of diglossic languages, serve as a lingua franca in the country. Less certain is the existence of a pidginized form of the language used primarily as a trade jargon.


Speech Communities and Language Situations

Attitudes toward Arabic We may assume that every community has some shared beliefs about language and attitudes toward language. In multilingual countries we can assume that some of these beliefs and attitudes will be about the appropriateness of the use of particular languages for different purposes as well as about esthetic and moral values inherent in one language and its uses in comparison with others. In order to understand fully the role of Arabic in Ethiopia, it would be desirable to have information on the attitudes of Ethiopians toward Arabic and its use in comparison with their attitudes toward other languages. Previous studies of attitudes toward Arabic (Ferguson, 1959; Nader, 1962) have been based on participant observation in communities of Arabic mother tongue, and studies of the role of Arabic in a multilingual society have been concerned with Arabic as a national language in relation to a European former colonial language (e.g. Gallagher, 1968) or to a local minority language (e.g. Jernudd, 1968). Accordingly, there is little precedent for a study of attitudes toward Arabic in a nation where it serves as a secondary lingua franca and religious language. A few predictions might be hazarded on the basis of the description above but field investigation is required for any dependable conclusions. Some meagre indications of the attitudes toward Arabic held by users of the language in Ethiopia can be found in the results of the questionnaire. To the question 'What languages would you like your children to know?' the twenty university freshmen and the seventy Dire Dawa respondents gave overwhelming preference to English, Arabic, Amharic, and French (82, 81, 61, 57 votes respectively), the other languages named being mostly mother tongues. This at least testifies to the importance they attach to knowledge of Arabic. 
The responses to the questions about which languages seemed most pleasant and most unpleasant gave preference to Arabic and English as the most pleasant, and apart from 17 votes for Gurage gave no clear pattern of languages regarded as unpleasant (scattered votes or no language named). Again, this gives some indication of a favourable attitude toward Arabic. The answers to a complex question on language preferences for different uses give some slight additional information. Arabic was not consistently preferred to English, mother tongue, or Amharic for any use, although the largest number of top preference votes for use of Arabic was for talking about religion. This suggests that the use of Arabic as a lingua franca is not out of some kind of preference for that language, but because it is favoured by the existing language competences of the people communicating.4 Finally, the answers to the questions about the use of Arabic in government schools and on the radio are of interest. The votes were overwhelmingly in favor of the teaching of Arabic as a subject in government schools, the use of Arabic in broadcasting to Ethiopians, and the recitation of the Qur'an over the Ethiopian radio. The vote was indecisive on the question of teaching the Qur'an in the schools (8 yes, 9 no, 3 no vote). Whatever else may be their attitudes about Arabic, the students seemed to want more use of Arabic under government auspices.

This very little bit of information about language attitudes is tantalizing, and points to the need for a broader investigation with other techniques. Even with fuller information on the attitudes of Ethiopians who use Arabic as a secondary language, any attempt at characterizing the position of Arabic in the nation or predicting future trends would fail without investigation of the attitudes of those in the country who have Arabic as their mother tongue as well as the attitudes of the vast majority of Ethiopians who have little or no knowledge of Arabic at all.

Notes

1. This paper is in the nature of an interim report on one subproject of the Language Survey of Ethiopia. The Survey is part of the five-nation Survey of Language Use and Language Teaching in Eastern Africa supported by the Ford Foundation. This paper was presented in preliminary form at the Conference on Ethiopian Languages held in Addis Ababa, October 1969. Even in its present form it provides very little information not already well-known to many Arabists and specialists in Ethiopian affairs. What merit it may have probably lies in the attempt to communicate this information in such a way that it can be readily assimilated by social scientists, linguists, or interested laymen and can thereby serve as the basis for more extended research or policy making.

2. The entire range of linguistic variation in Arabic has been studied chiefly by descriptions of 'pure' varieties and studies of local variation in a given dialect area. (For a summary of the research see Abboud, in press.) Studies of variation in some kind of social context have been extremely rare (e.g. Blanc, 1960 and 1964; Mitchell, 1957). We are certainly far from having sociolinguistically sophisticated studies of verbal interaction of small groups, studies of the sociolinguistic patterns of whole communities such as villages or social institutions, or large-scale studies of whole nations or the whole Arab world. It may be hoped that the new generation of Arab linguists will undertake studies which will utilize such fruitful sociolinguistic constructs as domain, network, social situation, role relationship, and interaction type (Fishman, 1968).

3. Of twenty university freshmen who claimed knowledge of Arabic (Addis Ababa, October 1969), all but three claimed some reading knowledge. Fourteen reported having learned to read in a madrasa, which they reported having attended for periods ranging from two to eight years (mean five).

4. The preferences for other languages are of some interest. English was the most strongly preferred for the largest number of uses: 15 out of the 20 gave it top preference for seeing movies and reading books for fun, and 11 and 13 respectively for reading newspapers and listening to news broadcasts. Amharic was not consistently given preference above mother tongue or English for any use, but was preferred by five respondents for talking during sports or for writing letters. As might be expected, the mother tongue was strongly preferred for listening to songs; more surprising was the vote on talking about religion, in which mother tongue preferences exceeded Arabic.

References

Abboud, P. F. (1969), 'Arabic dialects', in T. A. Sebeok et al. (eds.), Current Trends in Linguistics, vol. 5: Southern Asia and North Africa, Mouton.
Andrzejewski, B. W. (1962), 'Speech and writing dichotomy as the pattern of multilingualism in the Somali Republic', in Colloque sur le Multilinguisme, Brazzaville.
Bender, M. L. and Cooper, R. L. (1969), 'The prediction of between-language intelligibility', mimeograph, Addis Ababa.
Blanc, H. (1960), 'Stylistic variations in spoken Arabic: a sample of interdialectal educated conversation', in C. A. Ferguson (ed.), Contributions to Arabic Linguistics, Harvard University Press.
Blanc, H. (1964), Communal Dialects in Baghdad, Harvard University Press.
Ferguson, C. A. (1959a), 'Diglossia', Word, vol. 15, pp. 325-40.
Ferguson, C. A. (1959b), 'The Arabic Koine', Language, vol. 35, pp. 616-30.
Ferguson, C. A. (1966), 'National sociolinguistic profile formulas', in W. Bright (ed.), Sociolinguistics, Mouton.
Ferguson, C. A. (1969), 'Myths about Arabic', Languages and Linguistics Monograph Series, vol. 12, pp. 75-82, Georgetown University.
Fishman, J. A. (1968a), 'Societal bilingualism: stable and traditional', in Bilingualism in the Barrio, US Office of Education. Revised version of 'Bilingualism with and without diglossia; diglossia with and without bilingualism', J. Soc. Iss., vol. 23, part 2, pp. 29-38, 1967.
Fishman, J. A. (1968b), 'Sociolinguistic perspective on the study of bilingualism', in Bilingualism in the Barrio, Washington, D.C., US Office of Education.
Fishman, J. A. (1968c), 'The relationship between micro- and macro-sociolinguistics in the study of who speaks what language to whom and when', Georgetown University Monograph Series on Languages and Linguistics, vol. 23, pp. 47-58.
Gallagher, C. F. (1968), 'North African problems or prospects; language and identity', in J. A. Fishman et al. (eds.), Language Problems of Developing Nations, Wiley.
Gumperz, J. J. (1962), 'Types of linguistic communities', AL, vol. 4, part 1, pp. 28-40.
Heine, B. (1968), 'Afrikanische Verkehrssprachen', Infratest Schriftenreihen zur empirischen Sozialforschung, Bd. 4, Köln.
Jernudd, B. (1968), 'Linguistic integration and national development', in J. A. Fishman et al. (eds.), Language Problems of Developing Nations, Wiley.
Johnstone, T. M. (1967), Eastern Arabian Dialect Studies, Oxford University Press.
Leslau, W. (1957a), 'The phonetic treatment of the Arabic loanwords in Ethiopia', Word, vol. 13, pp. 100-23.
Leslau, W. (1957b), 'Arabic loanwords in Amharic', BSOAS, vol. 19, pp. 221-44.
Leslau, W. (1957c), 'Arabic loanwords in Argobba', JAOS, vol. 77, pp. 36-9.
Lukas, J. (1936), 'The linguistic situation in the Lake Chad area in Central Africa', Africa, vol. 9, pp. 332-49.
Mitchell, T. F. (1957), 'The language of buying and selling in Cyrenaica; a situational statement', Hesperus, vol. 44, pp. 31-71.
Nader, L. (1962), 'A note on attitudes and the use of language', AL, vol. 4, part 6, pp. 25-9.
Stewart, W. A. (1968), 'A sociolinguistic typology for describing national multilingualism', in J. A. Fishman (ed.), Readings in the Sociology of Language, Mouton.
Trimingham, J. S. (1952), Islam in Ethiopia, Cass.
Uribe Villegas, O. (1968), 'Instrumentos para la presentación de las situaciones sociolingüísticas', Revista Mexicana de Sociología, vol. 30, pp. 863-4.

4 Religious Factors in Language Spread

The distribution of major types of writing systems in the world correlates more closely with the distribution of the world's major religions than with genetic or typological classifications of language, a fact which has often been noted by sociolinguists and others interested in the spread of writing systems. This correlation between religions and writing systems does not result from any inherent relationship between religious practices or beliefs and the processes of reading and writing. Rather, the present distribution of writing systems is largely a result of the fact that in many instances the spread of a major religion has simultaneously introduced the use of writing into a nonliterate speech community. So it has happened that wherever Western Christianity has spread in nonliterate communities, it has introduced a variety of the Latin script for writing local, previously unwritten languages; and wherever Islam has spread to nonliterate communities, it has introduced a variety of the Arabic script for writing previously unwritten languages. Sometimes, when a major religion has spread to a literate community, the effect of the new religion has been to replace a local writing system without replacing the languages spoken in the community, as when the Arabic alphabet replaced the other ways of writing Persian or Malay, or when the Latin alphabet replaced other ways of writing Old Norse or Philippine languages. One of the most impressive associations of a writing system with a religion is the use of the Hebrew alphabet by the Jews. For nearly two millennia Jews have used the so-called square Hebrew letters (derived from the earlier Aramaic alphabet) to represent their mother tongues and other languages used in Jewish communities, so that Hebrew script has been used by Jews to write varieties of Arabic, Persian, Spanish, and other languages, and is regularly used for both Hebrew and Yiddish today. 
In this case, the principal reason for the correlation between religion and writing system is the historical fact that Jews have traditionally become literate first in Hebrew, and then as they find it appropriate to make written use of other languages within their communities, the writing system of Hebrew is extended to them (Gold 1980). This indirect relation between religions and the spread of writing systems gives some indication of the indirect relation between religion and the spread of languages in general. In this paper we can only explore a few common types of language spread in which religious factors play an important role. Such exploration may help us to understand better the processes of language spread as well as the role of religious factors in human social behavior. For the purposes of this paper we define language spread as the increase over time in the number of users or the amount of use of a given language or language variety, such increase being typically—although not necessarily—at the expense of the use of other languages, or language variety(ies). Religion will be understood in its general sense of beliefs or practices related to ultimate concerns, comprehensive integrative value systems, or supernatural phenomena.

This paper was originally published in R. L. Cooper, ed., Language Spread. Bloomington, Ind.: Indiana University Press (1982), reprinted by permission of Indiana University Press.

Language Variation Register One place to begin the exploration is with an examination of religious uses of language. Every speech community may be assumed to have special characteristics of language structure and language use which are appropriate for religious purposes, in religious settings, or on religious occasions. Such characteristics may involve differences in register, e.g., a special variety used in public religious ceremonies which is different from ordinary conversational language or nonreligious formal language. A religious register in a given speech community may even be a totally different language from the ordinary language of the community. The distribution of religious registers at a given time in a community must be the result of the spread (or receding) of language varieties during earlier periods. Let us examine one particular type—the spread of the language of sacred texts. When a Japanese Buddhist priest in a California Buddhist church recites a sutra in Pali with his English-speaking congregation, this is a fine example of the spread of a particular language variety over enormous distances in space and time. When accounts of the Buddha and his sayings were collected and came to be accepted as the canon of Buddhist scripture, they were in a Middle Indo-Aryan language, Pali, whose exact provenience is not clear. When the Pali scriptures were used in worship in India and Ceylon, the language functioned as a special religious register in many speech communities where related Indo-Aryan languages were the worshipers' mother tongues. When Buddhism spread to areas such as Burma, Thailand, China, and Japan, the sacred scriptures went along. Buddhist missionaries and scholars translated Pali and Sanskrit texts into other languages, but just about everywhere at least some uses of Pali were kept. 
In these new areas, the Pali language, still functioning as a religious register, was no longer related at all to the language of the worshipers, but retained its aura of sacredness. This sequence of events, with minor variations, has occurred again and again in the spread of religions with sacred texts.

Stage I: The language of the texts is close to the spoken language of the religious community.
Stage II: As the spoken language changes, the language of the sacred texts remains more or less intact and has a special religious aura.
Stage III: As the religion spreads to other areas, the language of the sacred texts functions as a sacred language in speech communities linguistically unrelated.

An Islamic counterpart to the Japanese Buddhist priest's use of Pali would be a Pakistani mullah, native speaker of—let us say—Panjabi, who recites from the Holy Koran in Classical Arabic. It sometimes happens that a translation of the sacred text into another language achieves a status which creates a new sacred language which in turn spreads as a religious register in the same way. Thus the Vulgate translation of the Bible into Latin and the general preservation of Latin as the language of learning and religion gave Latin a sacred status in public worship in communities where it represented an archaic but related variety (e.g., Italy) and in places where it was distantly related or not related at all to the mother tongue of the people (e.g., Germany, Hungary), and the use of Latin in public worship continued to the 1960s. The Japanese Buddhist priest in California who provided the original example can also provide an example of a second sacred language translated from an earlier one. In current practice most of the ceremonial language—including the chanting of sutras—when it is not in modern Japanese or English translation, is in Classical Chinese, in the Japanese pronunciation, and the original Pali passages are very limited in number (Hanayama 1969, 45). This "layering" effect of successive sacred languages in the same ceremonial context is actually widespread among the world's major religions, reflecting successive periods in the spread of the respective religions. The extensive literature on the religious use of Latin is a mine of valuable material for the sociolinguist. 
It is, however, generally necessary to apply comparative dimensions, both with other sacred languages and with other types of register variation and functional allocation of different languages within speech communities. (Lentner 1964 offers an historical survey up to 1563; Korolevsky 1957 discusses languages other than Latin in Catholic worship; many publications related to the Second Vatican Council reexamine the issues.)

Dialect Another topic to explore is dialect or language differentiation that depends on the religion of the language users. Every speech community in which there are significant religious cleavages may be assumed to have linguistic reflection of those cleavages in ordinary conversational language. The most familiar case is that in which the dialect variation based on religion is the result of earlier geographical variation which has been displaced by the movement of religious groups and subsequently identified with religious differences. The distribution of dialect differences in Konkani, an Indo-Aryan language spoken by a million and a half people in several areas of Western India, is a good example. Miranda's study of the situation asserts: "Generally, Hindus and Christians living in a given area speak considerably different dialects" (Miranda 1978,
87). A few of these differences are related to religious terminology, differences in customs for eating, dress, etc., and a few are due to the greater influence of Portuguese or Christian Konkani. The major differences, however, including several striking phonological isoglosses, reflect patterns of migration from different places at different times. In fact the patterns of migration have been such that in some instances the Hindu varieties in one locality may be closer to the Christian varieties in another locality than to the Hindu varieties there. Miranda concludes ". . . what was originally a regional differentiation has been transformed into a social differentiation . . . with an interesting complication that the distinctive characteristics associated with Hindu and Christian dialects in one area are the reverse of those in the other area" (89). The Konkani case is of special interest also because it illustrates with unusual clarity the perennial problem of deciding what constitutes a separate language for discussion of language spread. The need for extralinguistic, sociological criteria in defining languages has been recognized for a long time (cf. Kloss 1952, Ferguson & Gumperz 1960), but the fluidity of users' identification of language status depending on changes in their perception of economic, political, religious or other values is not so often noted (but cf. Das Gupta 1974 and Duran 1974). Some varieties of Konkani shade off into varieties of Marathi, and linguistic arguments can be made that Konkani and Marathi should be regarded as two separate languages or that they should be regarded as varieties of a single language. This has been a controversial issue, with political implications, ever since the incorporation of Goa into the Indian Union in 1961, and religion is an important factor. 
Probably the majority of mother-tongue speakers of Konkani at the present time regard Konkani as an independent language, but the proportion is doubtless much higher among Christians. If it had not been for the spread of Christianity which resulted from the Portuguese colonial operation in Goa, the issue of the independent linguistic status of Konkani would probably not have arisen. A perfect counterpart to the dialect variation by religion in Konkani—although not to the question of one language or two—is the Christian-Jewish-Muslim dialect variation in Baghdadi Arabic described in Blanc 1964. Here again the present-day striking phonological and morphological differences among the three varieties can be shown to have originated in geographical variation which subsequently, as a result of the movement of people, acquired social significance in the same locale. Before leaving the topic of language varieties correlated with religious identification, we must note the obvious corollary to our introductory observation about writing systems. Since the spread of writing systems tends to reflect spread of religious systems, it would follow that the use of two writing systems of the same language would tend to reflect religious cleavages in the community, and this is indeed the case. Many (most?) instances of "multigraphism" of a language represent either synchronic religious differences or vestiges of earlier religious differences. In addition to pointing to such traditional examples as Serbo-Croatian or Hindi-Urdu, we can refer to the written use of Konkani. In Goa proper, where nearly half of all the Konkani speakers are located, Christian writers use Latin letters and Hindu writers use the Nagari alphabet, although outside Goa the writing system employed for Konkani usually is that of the surrounding language, regardless of religion.

Language Maintenance and Shift Immigration From exploration of the religious use of language in societies and its implications for language spread, we may move to the topic of religious influence on the maintenance and shift of languages in addition to or apart from their religious use. Whenever the use of a language or language variety is spreading, it may be assumed that religious beliefs and practices will in some measure affect the rate and extent of the spread. Let us examine one particular type of language spread—that resulting from voluntary immigration of individuals and groups into a different speech community, the intent of the immigrants being to locate new homes and remain indefinitely. In such cases of immigration, various patterns of language maintenance and shift are possible, depending on a wide range of economic and social factors including attitudes and expectations on the part of both the immigrants and the host population. A fine example is the substantial immigration of Germans and Japanese to Brazil. Both groups have been gradually shifting to Portuguese as a mother tongue, but the point to be made here is that their religious organizations, Christian and Buddhist respectively, have not only maintained German and Japanese in connection with religious activities but have established special schools and other activities for maintaining knowledge of the original homeland language. This pattern of linguistic conservation on the part of immigrant churches is a widespread phenomenon, very familiar to us in the American experience (cf. Fishman et al. 1966). One is tempted to claim that this is a universal phenomenon, i.e., that the reverse pattern will occur rarely or not at all. One does not expect an immigrant church to take leadership in facilitating a shift of language. Yet any such "universal" hypothesis fails to match the complexities of possible patterns of language shift in immigrant communities. 
Among the South Asian communities which immigrated into East Africa in the early years of the twentieth century, very complex patterns of language shift occurred. First of all, the Asians brought with them the regional, religious, and linguistic diversity of their homelands. Then, in their immigrant adaptation, they dropped relatively little of their linguistic diversity but added to their repertoires various African languages, particularly Swahili. All the groups tended to retain religious registers (including several classical languages), to retain such South Asian lingua francas as "Hindustani" and English as Asian community lingua francas in the new setting, and to add the use of African lingua francas such as Swahili and, again, English for use with Africans and Europeans. In all this complex set of changes in language use, it is difficult to discern an overall effect of language conservation on the part of institutional religion. The Sikh religion among the Panjabi speakers acted as a preserver of the use of the distinctive Gurmukhi writing system for Panjabi, which was not learned by Hindu and Muslim Panjabis. This preservation was, however, primarily for the reading of Sikh religious literature, not to maintain the use of Panjabi as a written language in the community. On the other hand, the Christian religion among the Konkani speakers acted to accelerate the shift from Konkani to English as mother tongue. This kind of shift had precedents in those Goans who had shifted to Portuguese or English in Goa, but it was speeded up in East Africa, and the religious identification was a strong factor. It may still be safe to assume that in voluntary migration for socioeconomic reasons, religious affiliation or commitment will tend to be language-conservative (i.e., maintenance-oriented), to the greatest extent for the language of sacred texts, next greatest for the language of public ritual and explanation of the texts, and also for the mother tongue language of ordinary conversation. This conservation may be considerably modified by such factors as the presence/absence of coreligionists in the host population, the existence of shared lingua francas, and the ideological stance of the religion with regard to language (see 4 below). In the case of the Christian Goans in East Africa all these factors were at work. The Goans found European and African Roman Catholics in East Africa. Many of them already had some knowledge of English as a lingua franca in India, and their religion did not ascribe sacredness to Konkani for sacred texts or public ceremonies. In fact a worldwide shift in ideological stance in the Catholic church contributed to the shift to English by Goans in East Africa, namely the changeover to vernacular services in the 1960s. The shift in East Africa was from Latin to Swahili and English, and for the Goans this meant one more extension of the use of English. (Konkani is now used in churches only as a supplementary language for the non-English speakers.)

Colonization Another topic to explore in the study of religious factors in language maintenance and shift is the type of spread which is associated with intentional colonization, i.e., the sending of groups of people to new areas for the purpose of maintaining outlying units of the parent community, assuring access to resources or lines of communication required by the parent community, or simply extending the political, economic, or religious power of the parent community to other societies. Intentional colonization is an old phenomenon in human history, and probably took place also in prehistory. Two early examples in Western history were the planting of Phoenician and Greek colonies around the Mediterranean. As in many colonizations before and after them, those colonizers transplanted both language and religion to the new localities. Although the primary motivations were apparently economic, the colonizers were effective spreaders of their language and religion, tending to impose them on the "less developed" peoples with whom they came in contact. In the case of the Phoenician colonization, the language and religion survived in the biggest colony, Carthage, for centuries after both had died out in the homeland. In the case of the Greeks, it was a major characteristic of the Hellenistic period that Greek language and religion (along with other aspects
of Greek culture) spread largely from Greek colonies. We can compare this phenomenon to colonizations of other times and places, including, of course, the spread of European languages and religions to the Western Hemisphere, Asia, Africa, and Oceania from the fifteenth to the twentieth centuries. The question we are concerned with here, however, is the relatively narrow one of the way language and religion are tied together in colonization. The relation between spread of religion and spread of language in the process of colonization is not simple and direct, since there are well documented, roughly comparable instances of colonization in which religion has spread but not language (Spanish colonization of the Philippines), and others in which language has spread but not religion (French colonization of Algeria). Yet the two are often tied together. Important explanatory factors in the spread of religion and language in colonization include: the number of colonists in proportion to local population included in the colony, the colonists' attitude toward incorporation of the local population into their society, the position of religion in the colonizing society, and the ideological stance of the religion with regard to language (see 4 below). One very common pattern of language use in colonial religious activities, typical of British colonization in Asia and Africa in the late eighteenth and early nineteenth century, was the three-fold use of local vernaculars, local lingua francas, and English (the colonial language) in missionary schools whose primary purpose was the spreading of the Christian religion. This pattern, in which religious activities directly affected the spread of lingua francas and English in the colonies, had important consequences for language use which are still evident in the independent nations that replaced the colonies. 
The establishment of this pattern resulted from a unique constellation of factors in British colonization: a missionary enterprise separate from the civil authorities, the religious ideology of propagating religion through the mother tongue, a commitment to lingua francas and English based on "practical" considerations rather than ideology, and the availability of English-speaking missionary personnel prepared to learn local languages. This pattern of language use in colonial religious activities, typical of British colonizations in Asia and Africa, is only one among many. In order to understand its origin and its results in a general theoretical framework it would be necessary to have comparative studies of colonization in many times and places, under many policies and with different consequences, and among different kinds of religions and different kinds of languages. At this point we can only note that, in contrast to the fairly general, conservative, language-maintaining, intracommunity orientation of religion in voluntary immigration, the role of religion in colonization is often extracommunity-oriented and aimed at changing the language repertoires of speech communities. See Heath and Laprade (1982) for a case study of religion and language spread in colonization.

Religious Ideology of Language

Many factors extraneous to religion may influence or even be determinative in the effect of religion on language spread. Religion as such may be secondary or even marginal in comparison with economic, political, demographic, or other factors


accounting for language spread. But in at least one respect the religious factor has a certain primacy: the total religious system includes a set of beliefs and attitudes about language. Thus if a particular proselytizing religion has as part of its articles of faith that its adherents must pray in a certain sacred language, this will tend to have a different language spread effect from that of another proselytizing religion which insists its adherents must pray in their mother tongue. We will put aside here the historical question of how a particular set of beliefs about language becomes part of a religious system, and explore several of the principal types of such beliefs. We may safely assume that all religious belief systems include some beliefs about language. At the very least a religious system specifies that some ways of talking are better than others: it is hard to imagine any set of religious beliefs which includes no "shoulds" about verbal behavior. Religions may have myths about the origin of human language, the diversity of human language, or one particular sacred language; they may specify characteristics of the language of inspiration or possession; they may prescribe the language of ritual interaction and private meditation—the possibilities of religious beliefs about language seem endless. Here let us explore only attitudes toward mother tongue, sacred texts, and the language of religious practice (including formal teaching of religion). It is not unusual for a religious community to have great respect for the language of the (real or presumed) founder of the religion or the mother tongue of the group in which the religion emerged (historically or mythically). Thus in the Sikh community the respect for Panjabi is great. It is noticeably greater than the respect which non-Sikh Panjabis have for their language. 
Hindu and Muslim Panjabis often report their mother tongue as Hindi or Urdu respectively, and these other languages traditionally function as the H varieties of diglossic situations, with Panjabi as L. Yet Sikhism, which started as a universal, proselytizing religion that would reconcile Hindus, Muslims, and others, does not insist on converts learning Panjabi as a language of conversation. English-speaking American converts or Swahili-speaking African converts may be encouraged to study some Panjabi in order to appreciate the Granth Saheb, their holy book, but they are not expected to acquire Panjabi as an everyday language. Religious factors alone are rarely decisive in bringing about a shift in mother tongue; but they may be strong secondary factors associated with other aspects of culture change. The attitude toward the language of sacred texts, for those religious systems which have sacred texts, varies considerably. Thus Islam and Hinduism, speaking broadly, attach sacredness and significance to the very language of the scriptures, so that translations of the Holy Koran or the Vedas have traditionally been disapproved or viewed with alarm. Judaism tends to share this view of the language of the sacred text, but translation and commentary aids in Aramaic were produced to help in understanding the texts, and in the second or third century B.C.E. a Greek version of the Scriptures was produced for the large Greek-speaking Jewish community in Egypt. On the other hand Christianity and Buddhism have from their very beginnings encouraged the translation of their sacred texts into other languages, and even

though both religions have exhibited strong attachments to particular sacred languages (cf. the Latin and Chinese examples in 2.1 above), their theology and their practice have not shown the same concern for exactness of text and inherent holiness of language which Islam and Hinduism, and to a lesser extent Judaism, have shown. These differences in religious ideology about language have had their expected effects on the spread of religious languages. Wherever Islam has gone, some knowledge of Arabic has gone with it, and the Koran in Arabic is recited. Wherever Buddhism has gone, knowledge of the Eight-fold Path has gone, but no one sacred language has gone: sutras for the scriptures may be recited in Pali, Tibetan, Chinese, or some other language. Thus, all Brahman priests, no matter what their sect or mother tongue, can make use of some Sanskrit if they meet together in Benares, and all Muslim pilgrims to Mecca make some use of Arabic, but there is no common language for Buddhist or Christian clergy to recite the Three Treasures or the Our Father. Many religious systems have no sacred texts in the sense of a holy book, but presumably all religions prescribe certain public uses of language as an aspect of religious practice. (Samarin 1976 offers a stimulating array of studies of language in religious practice.) This aspect of religious ideology may have more direct implications for language spread than the other two. When a religion spreads, for whatever reasons, this ideological stance is likely to be a strong determinant of an accompanying spread of language. Traditional language preferences for corporate worship, religious teaching, or public interaction (e.g., greetings), and the like must have an effect on language maintenance and shift. 
The use of Yiddish as the language of religious teaching in Hasidic communities and the use of German in corporate worship in Hutterite communities have helped to maintain these languages against the encroachment of other languages, to take two often-cited examples. Yet in this no-man's-land between mother tongues and the language of sacred texts there are few studies relating religion and the spread of language. It is typical that none of the studies in the Samarin book attempts to relate the patterns of language use to the spreading or receding of the languages or varieties described.

This brief exploratory paper suggests the following conclusions:

a. Religious factors may be highly significant in processes of language spread (almost every paper in this volume mentions religion), but there is no simple, direct relationship between religion and language.
b. The role of religious factors in language spread may be largely dependent on the place of religion with respect to more powerful factors of language spread: economic, political, demographic, geographical.
c. When the spread of religion is tied to the use of a holy book and a traditional writing system, this complex is likely to be a strong factor in the spread of written forms of language.
d. The effect on language spread of religious ideology about language is a promising and little-investigated topic for research.


References

Blanc, Haim. 1964. Communal dialects in Baghdad. Cambridge: Harvard University Press.
Das Gupta, Jyotindra. 1974. Ethnicity, Language Demands, and National Development. Ethnicity, Vol. 1, no. 1, pp. 65-72. New York: Academic Press, Inc.
Duran, James. 1974. The Ecology of Ethnic Groups from a Kenyan Perspective. Ethnicity, Vol. 1, no. 1, pp. 43-64. New York: Academic Press, Inc.
Ferguson, Charles A. & John J. Gumperz (eds.). 1960. Linguistic diversity in South Asia. Bloomington, Ind.: Indiana University Research Center in Anthropology, Folklore, and Linguistics.
Fishman, Joshua A., et al. 1966. Language Loyalty in the United States. The Hague: Mouton.
Gold, David L. 1980. The speech and writing of Jews in the U.S.A. In Charles A. Ferguson and Shirley Heath (eds.), Language in the U.S.A. Pp. 462-93. Cambridge: Cambridge University Press.
Hanayama, Shoyu. 1969. Buddhist handbook for Shin-shu followers. Tokyo: Hokuseido Press.
Heath, Shirley Brice & Robert Laprade. 1982. Castilian colonization and indigenous languages: The cases of Quechua and Aymara. In Robert L. Cooper (ed.), Language Spread: Studies in Diffusion and Social Change. Pp. 118-47. Bloomington, Ind.: Indiana University Press.
Kloss, Heinz. 1952. Die Entwicklung neuer germanischer Kultursprachen von 1800 bis 1950. Munich: Pohl & Co.
Korolevsky, Cyril. 1957. Living languages in Catholic worship. London: Longmans, Green.
Lentner, Leopold. 1964. Volkssprache und Sakralsprache. Vienna: Herder.
Miranda, Rocky V. 1978. Caste, religion and dialect differentiation in the Konkani area. International Journal of the Sociology of Language, 16:77-91.
Neale, Barbara. 1974. Language use among the Asian communities. In W. H. Whiteley (ed.), Language in Kenya. Pp. 263-317. Nairobi: Oxford University Press.
Samarin, William J. 1976. Language in religious practice. Rowley, Mass.: Newbury House.

5 Literacy in a Hunting-Gathering Society: The Case of the Diyari

In recent years, beginning with the often-cited article of Goody and Watt (1963), a renewed and growing interest has developed in the study of literacy as a phenomenon of human societies, and strong claims have been made about the cognitive effects of the shift from oral to literate culture (see, e.g., Goody 1977; Ong 1982; but cf. Frake 1983). Although many of the recent literacy studies are ethnographic in perspective (e.g., papers in Goody 1968; Schieffelin and Gilmore 1986), we still have very few descriptive studies of the introduction of literacy into particular nonliterate societies. One such study (Ransom 1945) appeared in the first volume of this journal. In it the anthropologist author attempted to reconstruct and interpret the sequence of events as vernacular literacy was introduced among the Alaskan Aleut by Russian Orthodox missionaries early in the nineteenth century; a valuable follow-up article appeared thirty years later (Black 1977). The present paper attempts a similar reconstruction of the nineteenth-century introduction of vernacular literacy among an Australian Aboriginal group by German Lutheran missionaries. It is intended to be sufficiently detailed to suggest useful comparisons with other cases and thus to contribute to the formulation of a better theoretical framework for the study of this kind of cultural change and its implications. In the 1860s some three thousand to five thousand people lived in the Lake Eyre region of south-central Australia, the northeast corner of the present state of South Australia. Among these people about a dozen languages were spoken, of which the principal one was a cluster of language varieties referred to as Diyari (Austin 1981). A hundred years later these languages were almost completely extinct, and only a handful of the descendants of the earlier inhabitants remained.
During that period of rapid human and linguistic decline, however, an unusual episode of the introduction of literacy took place as a direct result of the activity of Lutheran missionaries, who for fifty years (1867-1917) operated a mission settlement on Lake Killalpaninna in the Cooper's Creek district (Hebart 1938:188-93; Proeve and Proeve 1952; Brauer 1956:223-31; Scherer 1966, 1979; Bonython 1985; Jones and Sutton 1986:22-53). In this paper I attempt to reconstruct the story of how the members of the Diyari-speaking community acquired a written version of their language and how the written use of their language spread throughout the community.1 The fuller story of the economic, social, and religious history of the Diyari and the overall fortunes of the community will be discussed to the extent that it is directly relevant to the more narrowly focused story of their literacy. Thus this paper is intended to offer a case study of one of the relatively rare instances of some form of vernacular literacy taking hold in a hunting-gathering society. "Vernacular literacy" here means the ability to perform reading and writing behaviors in one's native language as a means of exchanging messages within a social group; "taking hold" means that literacy becomes a part of the shared cultural resources of the society and is not merely a marginal phenomenon activated only by direct involvement with an impinging alien culture. The many instances of writing being introduced into a nonliterate culture and remaining marginal or disappearing altogether raise the question of the conditions that lead to the full incorporation of literacy into the cultural repertoire of a society.

This paper was originally published in the Journal of Anthropological Research 4.3:223-37 (1987), reprinted by permission of the Journal of Anthropological Research.

Language among the Diyari

The language situation in the Lake Eyre region of Australia in the 1860s was complex: the people were for the most part seminomadic, and their areas of movement overlapped, so that the various language groups were not neatly delimited geographically. The various groups, or segments of them, came together for ceremonies of initiation, rituals for increase of foods or rain, funeral rites, and other ceremonies celebrating traditional myths. Some marriages took place between speakers of different languages. Members of different language groups also interacted to a limited extent in trade, especially in the Kopperanna and Killalpaninna locations, exchanging such items as netted bags, red ocher, stone slabs, feathers, and wooden spears, shields, and boomerangs. The means of interlanguage communication in these diverse settings were apparently various multilingualisms, conventionalized manual signs, and to some extent the use of Diyari as a lingua franca. The local populations, like those in many parts of Australia, readily learned additional languages as they were needed; the skill of learning to understand (to "hear") another language was valued, as was the ability to imitate other people's pronunciation. Another language was learned by observation and use under "natural" conditions, not by formal teaching. (On patterns of multilingualism in Australian Aboriginal communities, see Brandl and Walsh 1982.) Even among members of a single language group, such as those for whom Diyari was the primary language, differences of pronunciation, vocabulary, and patterns of address depended on one's place of origin, age, sex, totemic clan ("skin" as it is called in current Aboriginal English), and individual family history. Every member of the Diyari community belonged to one of two exogamous matrilineal moieties and inherited his or her mother's skin (mardu) as a determinant of permitted marriage partner, eating taboos, and various patterns of social interaction. One also inherited one's father's skin (pinthara), which gave ties to a particular totemic spiritual being (muramura) and related locations, features of the landscape, myths, and ceremonies. The people were doubtless sensitive to the variations in language correlated with such differences among users and uses and were accustomed to skillful exploitation of these variations in their communicative strategies with one another. From the Europeans' point of view, the most unfamiliar aspects of language use and attitudes toward language were the relatively easy multilingualism, the complex systems of kinship and ceremonial use of language, and the total lack of writing or any other representation of speech and of an associated formal "classroom" style of language instruction.

The German Lutheran Mission to the Diyari

In 1838 about two hundred German Lutherans arrived in South Australia to settle; they were joined by others in 1839 and 1841. All were fleeing from the state-imposed union church of Prussia that was attempting to amalgamate the Lutheran and Reformed traditions and was persecuting Lutheran separatists. Most of them were farmers and local artisans from districts of Prussia that are now part of Poland. The German Lutherans of South Australia split over doctrinal issues, but their two main synods joined in a Confessional Union from 1864 to 1874 and in effect jointly sponsored the mission work among the Diyari. On October 9, 1866, in the Langmeil Lutheran Church in Tanunda, two missionary pastors, J. F. Gossling and E. Homann, both trained at the mission seminary in Hermannsburg, Hanover, were commissioned to undertake missionary service with the Aborigines in the Diyari area. Two lay missionaries were also commissioned: Hermann H. Vogelsang, a blacksmith experienced in construction and farm work, and J. Ernest Jacob, a millworker who had emigrated to Australia in the hope of doing mission work among the Aborigines. After a hard three-month journey to Lake Killalpaninna, they set up a mission station, which they named Hermannsburg.2 Under the strenuous physical conditions and with the severe cultural clash and mutual suspicion between Aborigines and Europeans, the mission at first led a precarious existence (Proeve and Proeve 1952). Indeed, Pastor Gossling left within a year.
In accordance with the mission policy and training curriculum established by Louis Harms, founder of the Hermannsburg Mission Society, the new missionaries proceeded to study the local language and to devise a writing system for it so that missionaries and Aborigines could make written use of the vernacular.3 Director Harms had extensively studied ancient and modern languages and required high standards of literacy and linguistic knowledge on the part of missionary pastors. He took for granted their need for study of the local language in order to preach the gospel (Proeve and Proeve 1952). Already by 1869 the missionaries had made sufficient progress in acquiring the Diyari language to be able to start a school and have a primer printed for use in teaching. Another Hermannsburg-trained missionary pastor, C. Schoknecht, reached Australia in 1871 and joined the mission, and in spite of "two years of disappointments and hardships" before he left, he was able to compile a short grammar and vocabulary of Diyari. Although this was never published (the original manuscript is in the Melbourne Museum), it was used by missionaries who came later.

Missionary Linguists

One important factor in the introduction of literacy among the Diyari was the interest and competence of individual missionaries in the language and culture of the people. Over the years the Killalpaninna Mission had the services of three outstanding missionary-ethnographer-linguists of the Lutheran missions in Australia and Papua New Guinea: John Flierl, J. G. Reuther, and Carl F. T. Strehlow. John Flierl was trained at the mission seminary in Neuendettelsau, Bavaria, which sent a stream of pastors and missionaries to North America, Brazil, New Guinea, and elsewhere (Koller 1924). He arrived at the mission in 1878 and remained until 1885.4 Under his energetic leadership the mission settlement was built up, it was renamed the Bethesda Mission, the first converts were baptized, and Diyari translations of a catechism and Bible history were published. J. G. Reuther came to the Bethesda Mission in 1888 and was the principal missionary there until 1906. During his time literacy became well established: adults as well as boys and girls were taught to read and write. Church and school were conducted in Diyari, though the missionary staff, including wives, spoke German to one another and conducted additional services in German for themselves and any others (Europeans or Aborigines) who were present and wanted to join them. Pastor Reuther set himself the task of translating the New Testament into Diyari; this was completed and printed in 1897 (Reuther and Strehlow 1897), the first published translation of the New Testament into an Australian Aboriginal language. Reuther was an indefatigable collector of words, artifacts, stories, songs, and all sorts of traditional lore. His thirteen volumes of ethnographic notes are a treasure house of knowledge on Aboriginal customs and beliefs (Reuther 1981). They include four volumes on the grammar and vocabulary of Diyari.
Modern linguists have some reservations about Reuther's orthography, but they acknowledge the general accuracy of his grammatical and lexical observations. (For a vivid account of Reuther's ministry at Bethesda, see Scherer 1979.) Strehlow, the third outstanding missionary-linguist to work at the Bethesda Mission, arrived in 1892 and remained only two years.5 In the short time Strehlow was at Killalpaninna, he helped Reuther with his Diyari New Testament and gave him important encouragement to carry on with his anthropological studies. Strehlow and Reuther had been seminary classmates at Hermannsburg; each believed in the importance of knowing the language and culture of the people to whom they were bringing the gospel, and each was an eager collector and interpreter of Aboriginal myths and ceremonial lore. At the same time, however, both were stern disciplinarians and regarded many ceremonies as idolatrous; although they attempted to learn about these and understand them, they would neither take part

in them nor encourage their performance. Both men fostered the production of written materials in the local language. These three, Flierl, Reuther, and Strehlow, were not the only missionaries who contributed to the study of Diyari language and culture and encouraged literacy. Pastor Schoknecht's work was already noted. Missionary C. A. Meyer, an elder of the Langmeil congregation who was commissioned for mission work among the Diyari in 1875 and who for some ten years was in effect the business manager of the mission, worked steadily on Bible translation and also managed to accumulate an impressive scholarly library. Pastor W. Riedel, who served the Bethesda Mission from 1908 to 1914, compiled a Bible history in Diyari; according to Peter Sutton (personal communication), he also produced a handwritten Diyari-German dictionary that is still in existence.

Literacy from Missionaries to Aborigines

Missionaries

Literacy was a central feature of the German Lutheran missionaries' lives and mission policies at the Bethesda Mission. The use of handwritten and printed material was part of a whole complex of values which included: fixed places of living, working, teaching, and worshiping; cleanliness and order, full clothing, and working for one's living; and truthfulness, public expression of faith in God, and fixed times for prayer and divine services. The missionaries regularly wrote letters and other communications to relatives, church officials, and others with whom they had friendly or business relations. They followed their printed Bible and church Agende of liturgy and hymns; they often wrote out sermons, and many kept a daily journal. The literacy practices of the missionaries' wives and the nonordained men and women of the mission staff must have been at least as salient to the Aborigines and quite likely more directly influential. Books, letters, financial records, and records of useful information (e.g., names and addresses, recipes, and inventories) were all in evidence. Vogelsang brought along books of medical remedies; C. F. Meyer kept a daily record of weather and temperature. A number of the women kept diaries or wrote memoirs (see Proeve 1946; Hardy 1984). The lay missionaries provided continuity for the missions: Vogelsang and his wife stayed with the Bethesda Mission for over forty years and became known affectionately and respectfully as "Father" and "Mother" Vogelsang; one of their sons eventually became the mission teacher. Jacob and his wife also served for forty years. All these people and other Europeans and white Australians associated with the mission were literate and made daily use of their literacy. The missionaries, both pastors and laymen, had large families, and children abounded.
Although an occasional child might be committed to the care of others or sent away to school, many grew up playing and studying with Aboriginal children and learned to speak Diyari well. Also, they participated in family literacy events that put high value on their ability to read. For example, at the Vogelsangs' daily morning and evening devotions, the children vied with one another in taking their turns at reading portions of Scripture aloud (Proeve 1946). While many of these literacy attitudes and practices of the mission may have been widespread in German-speaking Lutheran communities of the nineteenth century, some were directly related to the mission policies of Director Harms of the Hermannsburg institution (Wendebourg 1910). Although the primary purpose of the mission was proclamation of the gospel rather than education as such, it was felt that the Christian school would grow naturally out of the Church, and a school was to be begun as soon as possible after the gathering of a congregation.

Aborigines

For the Aborigines in the area, the peculiar phenomenon of literacy in the mission community must have been a relatively insignificant part of a whole way of life that clashed with their own at almost every point. Only gradually must they have come to understand what it was and what it might do for them, with their very different customs and needs. One early story is an isolated example of the Aborigines' perceptions of literacy:

Mr. Ernst Jacob, who was out on the run looking after the sheep, sent word to "Father" Vogelsang, who was in charge of the store, that he would like him to send him some tobacco. A native boy was sent with the parcel together with a letter in which it was mentioned how many plugs of tobacco were forwarded. The boy, knowing what the package contained, could not resist the temptation, and took a few plugs and hid them in order to pick them up on his return trip. The boy was questioned about the missing tobacco by Mr. Jacob, and immediately admitted his guilt. The boy was at a loss to understand how the letter, which was closed up, had managed to see him taking the tobacco. (Proeve 1946:21)

As the Diyari came to realize that messages could be sent to others or left for others to receive, they must have thought of ways such messaging could serve their needs. Most of the applications of literacy in the mission would have been of little use to the Diyari, but some kinds of messages would have had immediate appeal. With the Aborigines' great interest in places, details of landscape, and rights to use or "ownership" of totems and ceremonies, visual messages could have been used to signal to others where someone was, had been, or intended to go, or where a ceremony would take place. It is not too farfetched to speculate that the toas, the mysterious directional signposts that the Diyari produced during the mission period (Jones and Sutton 1986), were a result of their realization that visual messages could be left. This could be true even if the toas never had much actual use as markers but came to serve as artifacts to be traded for other things desired, including the payment of money by white collectors. In general, the teaching of reading and writing in the classroom must have seemed a useless exercise, to be persisted in only because it led to food and other valued associations with the missionaries. As some Diyari began to accept the Christian religious teaching, however, the reading of Scripture became highly significant; and as some of them began to work as drovers and laborers, both on the mission and on ranches and other sites, they would have found the sending of letters and keeping of simple records valuable for exchange of family and community news and carrying out their work. Even though the classroom emphasis was more on reading than writing and the missionaries do not seem to have instituted news bulletins or initiated other mechanisms of personal as opposed to church and school literacy among the Aborigines, the Diyari gradually came to use letters in Diyari to communicate with absent missionaries and with one another. Correspondence between Diyari and mission friends is well attested. After her husband died, Mother Vogelsang moved south to spend her last years, and Diyari people back at the mission corresponded with her in Diyari. The next generation continued this practice. For example, Rebecca, a Diyari woman, exchanged letters and cards with Vogelsang daughters. A few of these bits of writing are extant, some at the South Australia Museum, which hopes to collect more (Jones and Sutton 1986:138), and others will probably emerge in private holdings and mission records. Apparently none have been published or described in any detail in professional publications. Examples of letters between Aboriginals are much less likely to survive, since the recipients generally did not live under conditions suitable for preserving such message bearers and had no tradition of keeping such ephemeral personal possessions. Nevertheless, such letters do exist, and the Museum is hopeful of obtaining some. Although literacy in English became an important part of the school curriculum, especially after the Englishman Harry Hillier joined the mission staff as teacher in 1900,6 literacy in Diyari was already established.
One Diyari speaker, Sam Dintibunna, who attended the mission school in the early 1890s and was literate in Diyari but not English, was able to write out several myths in Diyari, which were translated and published in the 1930s (Fry 1937). In one of the rare indications available on details of Diyari literacy behavior, Fry (1937) noted that the texts written by Sam Dintibunna showed inconsistent spellings and made no use of capital letters or punctuation, apart from an occasional straight line to mark the end of a sentence. Aboriginal converts who became strongly identified with the church life of the mission and took on roles of catechist or evangelist (in some other mission districts even becoming ordained pastors) naturally tended to make greater use of literacy. One mission-educated Diyari, John Pingilina, one of the first converts, played an important role in Lutheran mission activity, not only among his own people, but also at Hope Valley in Queensland, where he was sent with Missionary Meyer. There Pingilina learned some of the local language, Guugu Yimidhirr, and some of the local people learned his language. For a number of years he was a mainstay of the Lutheran missions in Queensland, later serving at Bloomfield. Pingilina is known to have written letters in Diyari back to his people. The last living Aboriginal who was educated at the Bethesda Mission school is Ben Murray. He was born in 1891 of Arabana mother and "Afghan" father and grew up among speakers of Diyari. Later he traveled widely and had varied experiences as a camel driver, stockman, and fruitpicker; he fought in World War I. In addition to Diyari he speaks at least Dhirari, German, and English and is literate in



at least Diyari and English. Austin (1981) reported that his "high intelligence and sophistication made him an extremely valuable language helper." He is still able to read and write in Diyari, although now with failing eyesight. A fuller account of Diyari literacy should include discussions of the level of mastery of orthographic conventions, differences between spoken and written varieties, possible influences of one variety of the language on the other, and details of the actual use of literacy in the Diyari community. Such an account can be given only after the collection of written texts, a combing of the primary historical source documents for relevant comments, and the recording of focused interviews with the remaining speakers of Diyari and those who remember stories about the later days of the mission and events involving the use of Diyari.

Literacy and the Decline of Diyari

Of the estimated 260 or so languages spoken in Australia before the arrival of Europeans, only about 150 are still spoken, and the number continues to decrease. Thus the descriptive grammars of Aboriginal languages are often salvage operations in which the author must rely on a few elderly informants rather than a viable speech community of primary speakers. This is the case with Diyari (Austin 1981), although the data available in this case are more plentiful than for many such grammars. Studies of language "death" (e.g., Dorian 1981; Dressler and Wodak-Leodolter 1977) have shown that a language may die out by various means, such as reduction in the number of speakers for economic or biological reasons (e.g., epidemics, loss of fertility), reduction in the range of uses of the language, and heavy borrowing from another language. In the case of Diyari, a number of factors were at work. During the years 1885-1908 the Bethesda Mission was a thriving institution, largely because of fairly good rainfalls, a substantial sheep-raising operation, and the availability of government rations for the Aborigines. Even during this period, however, the Diyari faced several obstacles to the viability of their language community. European ranchers and settlers occupied traditional Diyari areas, Diyari men moved from traditional Diyari areas to tribally mixed locations, some took up positions with ranches and other operations in the region, and European diseases made serious inroads into the Diyari population. The mission also was almost constantly in debt, and these debts began to rise alarmingly.
In subsequent years, severe drought conditions, the toll of epidemics of introduced diseases, Aboriginal mobility, personnel problems of the mission, changes in the local scene, and the spread of English all contributed to a rapid decrease in the numbers of Diyari speakers and decreasing use of Diyari by those who could speak it (on the population decline, see Jones and Sutton 1986:42-43). It is not clear whether there was any reinforcing communal use of literacy in the sense of group literacy events apart from church services, but the use of literacy remained an individual practice that could serve communicative and social functions with others who had attended the mission school and interacted intensively within the mission community. Even as the number of speakers declined and the range of



uses became more restricted, the individual competences endured as a shared mark of membership in the mission-influenced Diyari-speaking population.

Comparative Observations

The presentation of a preliminary version of a case study, as this must be until further research can be carried out, is of value chiefly for the themes it suggests for generalization and the hints it may offer for the discovery of general principles of successful and unsuccessful introductions of literacy in similar cultural contexts. The themes mentioned in these closing observations are derived chiefly from the Diyari account and the information available from the somewhat similar case of the introduction of literacy among the Aleut by nineteenth-century Russian Orthodox missionaries (Ransom 1945). In 1741 the Bering expedition discovered the Aleutian Islands and their hunting-fishing population speaking a language related to Eskimo. In the next decades Russian fur traders cruelly exploited the Aleuts, forcing them into a kind of mobile serfdom, and their numbers declined from perhaps twenty thousand to about a fifth of that by the time the missionary-linguist Veniaminov arrived on the scene in 1824. Russian Orthodox Christianity had already been introduced in the Aleutians, but Veniaminov built up the Church, studied the Aleut language and culture, devised an orthography based on the Russian (Cyrillic) alphabet, and produced a series of original and translated works for the Aleut: a primer, a catechism, a Bible history, a guide to Christian living that went through many editions (Veniaminov 1840a), and translations of the Gospels. In his research on the Aleut language and his translations, Veniaminov worked closely with Ivan Pan'kov, an Aleut leader (Black 1977). In Russian Veniaminov wrote a grammar of Aleut, an Aleut dictionary, and three volumes of ethnographic description (Veniaminov 1840b).7 Literacy in Aleut became well established, being taught in church schools.
Perhaps more important, it came to be transmitted in a male line, in accordance first with traditional beliefs about who would receive one's spirit after death, later on by the Christian model of godfather to godson. Lay readers assisted in the Russian services and conducted worship in the absence of priests. News bulletins posted outside the church building became customary and have persisted to the present (now mostly in English). Letters were exchanged. Hunters, often isolated for long periods of time, kept diaries including lists of their catches. The Russian church and Aleut literacy continued for decades after Alaska was sold to the United States in 1867, but eventually American schools were started. These forced the spread of English language and literacy and fought the existing pattern, which was for them, in the words of the American linguist Krauss (1973:803), "the wrong language, the wrong alphabet, and (perhaps most of all) the wrong religion." At present (1986) Aleuts number perhaps a thousand, and many of them do not speak their ancestral language. Although this account does not do justice to the full story, it is enough to show that some of the same factors seem to have been operative as in the Diyari



case. In a previous study of a fourteenth-century introduction of vernacular literacy by the Christian missionary Stefan of Perm (Ferguson 1968), I called attention to the linguistic choices that had to be made by the missionary initiators of culture change: which language? which variety of that language? what kind of writing system? what materials for reading? Such choices do indeed strongly affect the "taking hold" of literacy, but in the present case study I would like to draw attention to other questions of social function and ideology. It is often claimed that the people of a particular culture and/or language will most readily accept those new elements of another culture/language that best fit the host structure. Without necessarily defending this claim against arguments of circularity and the lack of appropriate metrics of structural matching, it seems reasonable to hypothesize that the deliberate introduction of a new technological complex such as writing and its associated behaviors and values will succeed best when it (1) builds on an existing pattern in the host society, (2) meets some apparent needs in the society, or (3) is closely connected with another complex that is being successfully introduced. Further, it seems likely that the agents of change themselves might operate with hypotheses of this kind in mind, consciously or unconsciously. I have found no evidence that the missionaries to the Diyari or the Aleut sought to relate literacy in any way to local patterns or local needs. (In the fourteenth-century case, however, Stefan, in devising the writing system, incorporated some of the people's traditional decorative designs.) Instead, these missionaries saw literacy as an essential part of the proclamation of the gospel and the establishment of Christian public worship. 
It must be noted that they were successful in this aspect of literacy use: the anthropologist Ransom, writing on the basis of fieldwork in 1936-37 and 1940 when English was everywhere dominant, could report that "the Biblical literature is thoroughly familiar to every adult male, who has usually read and reread most of the published Aleut books" (Ransom 1945:337). The new religion brought with it a totally new pattern of ceremony, drastically different from Diyari corroborees or Aleut shamanism, but as people accepted the new faith, the pattern of public reading in religious rituals became fixed and probably remained the central use of vernacular literacy in both communities. The posting of announcements on a sheltered wooden board outside the church building caught on very early among the Aleut and persisted as a traditional Aleut literacy use. However this practice began, it met a definite communicative need in the community and was successful. It is surprising that no comparable use seems to have been instituted in Lutheran missions in Australia, where it might well have reinforced local literacy. The writing of letters is nowhere reported as a practice deliberately encouraged by the missionaries; but the local people noticed that the missionaries and officials wrote letters, and when a distinct need arose, this use was added among both Diyari and Aleut. It seems reasonable that this use of literacy would be much more acceptable or appealing to hunting-gathering societies than record keeping, leisure book reading, or putting traditional myths or songs on paper. A common feature of the introduction of literacy among the Diyari and the



Aleut was the role of the missionary-linguist. In each case the missionary was committed to the use of the local language and learned to speak it with considerable fluency. In each case he produced several books in the language which could be used immediately in the community and wrote a grammar and dictionary as tools for teaching and research. In each case the missionary-linguist spent at least ten years with the people and lived among them with his family and other mission staff. Finally, in each case the missionary was fascinated by the local culture, studied it thoroughly, and wrote copious notes on it. However, we must not overlook that the missionary—in spite of his interest in the culture and his identification with the people, often in opposition to government officials and other fellow countrymen—was uncompromisingly opposed to many aspects of the traditional religion. This was clearly the case with Stefan of Perm, Veniaminov of Alaska, and Reuther of Australia. In both the Diyari and Aleut cases, the active involvement of certain members of the host society in religious activities was significant. This was true in the case of John Pingilina, as already noted, and was even more important in the case of the Aleut, where two "creole" priests (Russian father, Aleut mother) produced some of the early published materials. A final theme to be noted is the question of who in the society became literate and how literacy was transmitted from generation to generation as part of the total culture. In early nineteenth-century Russia, literacy has been characterized as "restricted" (Goody 1968); perhaps only about 10 percent of the population was literate, and the literate sector included the (male) products of clerical education. In Germany in the 1860s, literacy was very widespread, perhaps as high as 90 percent, and in most of the country schooling was universal for both boys and girls. 
(For estimates of numbers and distribution of literates in Germany and Russia in the nineteenth century, see Cipolla 1969.) From the actions of the two sets of missionaries, it seems that the German Lutherans concentrated on getting all the young people into school on the model of schooling in Germany, while the Russian Orthodox missionaries were concerned with training adult males who could take a role in the church liturgy, although they had no objection to women acquiring literacy on their own. Missionaries, both German Lutheran and Russian Orthodox, and aborigines, both Australian Diyari and Arctic Aleut, had sharp sex-role differences in religious life. In the German case, however, literacy instruction seems to have been tied more to school and everyday life than in the Russian case, where it was tied more closely to religious ceremony. In the Aleut community, the teaching and learning of literacy became part of the traditional male heritage and assumed some of the strong values associated with pre-Christian male roles. Ransom (1945) noted the Aleut's very different attitudes toward English literacy, which was resented and was accepted reluctantly as necessary for survival. Literacy in Aleut (and Russian), on the other hand, was highly regarded, and those who wrote exhibited great care and pride in both penmanship and their skill in handling the language. In a recent study of literacy in Samoa, Huebner (1986) follows Spolsky, Engelbrecht, and Ortiz (1983) in citing five conditions for the successful introduction of vernacular literacy: (1) willingness by those introducing literacy to have literacy



in the vernacular; (2) perceived utility of literacy by traditionally influential members of the community; (3) the establishment of native functions for literacy; (4) the continued widespread use of the vernacular as a spoken language; and (5) the support of the maintenance of vernacular literacy by a powerful educational system under local control. Each of these hypothesized factors needs further amplification and modification, and they can be examined in the light of such case studies as the one presented here. For example, what is the effect of an ideological assumption by the agents of change that literacy is an essential instrument in the social values they are trying to spread? Should the fifth factor be reformulated to refer to a culturally anchored mechanism of intergenerational transmission rather than an "educational system"? To what extent does the addition of some form of vernacular literacy itself contribute to the continued use of the vernacular as a spoken language? These few comparative observations give an indication of the variety of factors involved in the introduction of literacy into a nonliterate, hunting-gathering society, and they point up the need for more and better case studies. Historical materials and present-day reminiscences can be utilized for more extended treatment of the Diyari and Aleut cases, and other cases can also be explored from a similar sociohistorical perspective; but the great need is for detailed case studies of ongoing instances of the introduction of literacy. Existing anthropologically oriented studies of the varying uses and functions of literacy in different societies (e.g., Goody 1968; Scribner and Cole 1981; Schieffelin and Gilmore 1986) offer relatively little on the actual processes of literacy introduction and the factors involved in the successes or failures of vernacular literacy to "take hold."
This preliminary version of a case study will have served its function if it stimulates more and better studies of the phenomenon of literacy introduction, important as it is for theories of cultural diffusion, language change, and cognitive development.

Notes

1. This paper is one of a series of studies on the literacy aspects of Lutheran missions in Australia in the nineteenth century. It has benefited from conversations with Michael Aune, Helene Burns, Gwen Denial Hall, John Haviland, Shirley Brice Heath, Philip Jones, Walter Pitts, P.A. Scherer, and Peter Sutton, but responsibility for errors and inadequacies of fact or interpretation is my own.
2. Fuller accounts of the beginnings of the mission, including a change of site, interaction with Moravian missionaries, and many details not directly relevant to the introduction of vernacular literacy, are given in the historical references previously cited.
3. Similar mission policies were also in evidence in the activities of earlier German Lutheran missionaries who worked among Aboriginal groups in southern Australia; these men were trained at other mission seminaries, especially the Dresden (later Leipzig) Mission Society (see Brauer 1956:143-71). However, these earlier efforts were short lived, and literacy evidently did not take hold in these Aboriginal communities.
4. Flierl left in 1885 for New Guinea, where he was to spend most of his life. There he coped with a highly multilingual situation in which the mission eventually developed literacy in various vernaculars, three regional "church languages" (Kate, Yabem, and Geraged), English-based pidgin (Tok Pisin), and English.
5. After leaving Bethesda Mission, Strehlow took charge of the Lutheran mission among the Aranda people of central Australia, which had in its turn been named Hermannsburg. In that second Australian Hermannsburg—and unlike the Diyari mission it has kept the name up to the present (Leske 1977)—Strehlow produced a New Testament in Aranda and completed a small grammar, a large dictionary, and a seven-volume ethnographic work on the Aranda.
6. Some teaching of English must have been done before Hillier arrived. Several visitors noted that the English of the mission Diyari was "better" than that of the local, uneducated Australians (Jones and Sutton 1986:39). This seems to have been the case at several Lutheran missions, where the kind of English taught was influenced by the educated, academic, second-language English of the German missionaries.
7. Veniaminov later became Bishop Innokenty of Alaska and in his last years was Metropolitan of Moscow and head of the whole Russian church; finally, in 1977, almost a hundred years after his death, he was officially recognized as Saint Innokenty, "the enlightener of the Aleuts and apostle to America" (see Garrett 1979).

References

Austin, P., 1981, A Grammar of Diyari, South Australia. Cambridge, Eng.: Cambridge University Press.
Black, L.T., 1977, Ivan Pan'kov—an Architect of Aleut Literacy. Arctic Anthropology 14(1):94-107.
Bonython, E., 1985, Where the Seasons Come and Go. 2nd ed. Adelaide: Illawong Pty Ltd.
Brandl, M.M., and M. Walsh, 1982, Speakers of Many Tongues: Toward Understanding Multilingualism among Aboriginal Australians. International Journal of the Sociology of Language 36:71-81.
Brauer, A., 1956, Under the Southern Cross: History of the Evangelical Lutheran Church of Australia. Adelaide: Lutheran Publishing House. Facsimile reprint 1985, Adelaide: Lutheran Publishing House.
Cipolla, C.M., 1969, Literacy and Development in the West. Harmondsworth, Eng.: Penguin.
Dorian, N., 1981, Language Death: The Life Cycle of a Scottish Gaelic Dialect. Philadelphia: University of Pennsylvania Press.
Dressler, W., and R. Wodak-Leodolter, eds., 1977, Language Death. International Journal of the Sociology of Language 12. The Hague: Mouton.
Ferguson, C.A., 1968, St. Stefan of Perm and Applied Linguistics. Pp. 235-65 in Language Problems of Developing Nations (ed. by J.A. Fishman, C.A. Ferguson, and J. Das Gupta). New York: John Wiley and Sons.
Frake, C.O., 1983, Did Literacy Cause the Great Divide? American Ethnologist 10:368-71.
Fry, H.K., 1937, Dieri Legends. Folk-lore 48:187-206, 269-98.
Garrett, P.D., 1979, St. Innocent, Apostle to America. Crestwood, N.Y.: St. Vladimir's Seminary Press.
Goody, J., ed., 1968, Literacy in Traditional Societies. Cambridge, Eng.: Cambridge University Press.
Goody, J., 1977, The Domestication of the Savage Mind. Cambridge, Eng.: Cambridge University Press.
Goody, J., and I. Watt, 1963, The Consequences of Literacy. Comparative Studies in Society and History 5:304-26, 332-45. Reprinted 1968, in Literacy in Traditional Societies (ed. by J. Goody), Cambridge, Eng.: Cambridge University Press; and 1972, in Language and Social Context (ed. by P. Giglioli), Harmondsworth, Eng.: Penguin Books.
Hardy, O., 1984, Like a Bird on the Wing: The Story of Luise Homann Based on Her Journal and Other Sources. Adelaide: Lutheran Publishing House.
Hebart, T., 1938, The United Evangelical Lutheran Church in Australia (U. E. L. C. A.); Its History, Activities and Characteristics, 1838-1938 (English version ed. by J.J. Stolz). North Adelaide: Lutheran Book Depot. Facsimile reprint 1985, Adelaide: Lutheran Publishing House.
Huebner, T., 1986, Vernacular Literacy, English as a Language of Wider Communication, and Language Shift in American Samoa. Journal of Multilingual and Multicultural Development 7:393-411.
Jones, P., and P. Sutton, 1986, Art and Land: Aboriginal Sculptures of the Lake Eyre Region. Adelaide: South Australian Museum with Wakefield Press.
Koller, W., 1924, Die Missionsanstalt in Neuendettelsau: Ihre Geschichte und das Leben in ihr. Neuendettelsau, Ger.: Verlag des Missionshauses.
Krauss, M.E., 1973, Eskimo-Aleut. Pp. 796-902 in Linguistics in North America (ed. by T.A. Sebeok). Current Trends in Linguistics 10. The Hague: Mouton. Reprinted 1976, New York: Plenum Press.
Leske, E., ed., 1977, Hermannsburg: A Vision and a Mission. Adelaide: Lutheran Publishing House.
Ong, W.J., 1982, Orality and Literacy: The Technologizing of the Word. London: Methuen.
Proeve, E.H., 1946, Three Missionary Pioneers. Tanunda, Austl.: Auricht's Printing Office.
Proeve, E.H., and H.F.W. Proeve, 1952, A Work of Love and Sacrifice: The Story of the Mission among the Dieri Tribe at Cooper's Creek. [Adelaide?]: Lutheran Mission Booklet.
Ransom, J.E., 1945, Writing as a Medium of Acculturation. Southwestern Journal of Anthropology 1:333-44.
Reuther, J.G., 1981, The Diari (trans. by P.A. Scherer et al.). 13 vols. AIAS microfiche no. 2. Canberra: Australian Institute of Aboriginal Studies.
Reuther, J.G., and C. Strehlow, trans., 1897, Testamenta Marra (New Testament in Diyari). Tanunda, Austl.: G. Auricht.
Scherer, P.A., 1966, Mission among the Dijari (Dieri): Looking Back on a Hundred Years of Bethesda Mission. Lutheran Herald 46:276-78, 291, 304-5, 314-15, 317, 333-35, 348-50, 366-68. Adelaide.
Scherer, P.A., 1979, Donor of Aboriginal Heritage. The Lutheran 13:284-87. Adelaide.
Schieffelin, B., and P. Gilmore, eds., 1986, The Acquisition of Literacy: Ethnographic Perspectives. Advances in Discourse Processes 21. Norwood, N.J.: Ablex.
Scribner, S., and M. Cole, 1981, The Psychology of Literacy. Cambridge, Mass.: Harvard University Press.
Spolsky, B., G. Engelbrecht, and L. Ortiz, 1983, Religious, Political and Educational Factors in the Development of Biliteracy in the Kingdom of Tonga. Journal of Multilingual and Multicultural Development 4:459-69.
Veniaminov, I., 1840a, Ukazanie puti tsarstvie nebesnoe (Road guide to the Kingdom of Heaven). St. Petersburg.
Veniaminov, I., 1840b, Zapiski ob ostrova Unalashkinskago Otdela (Notes on the islands of the Unalaska District). St. Petersburg: Russian American Company. Reprinted 1887, ed. by I. Barsukov, St. Petersburg: Most Holy Synod, Moscow; English trans. 1984, by L.T. Black and R.H. Geoghegan, ed. by R.A. Pierce, Kingston, Ont.: Limestone Press.
Wendebourg, W., 1910, Louis Harms als Missionsmann: Missionsgedanken und Missionstaten des Begründers der Hermannsburger Mission. Hermannsburg, Ger.: Verlag der Missionshandlung.

6 South Asia as a Sociolinguistic Area

In 1956, Emeneau's article, 'India as a Linguistic Area', appeared in Language, and it was followed by a number of additional studies, by Emeneau and others, of features of language STRUCTURE widely shared throughout South Asia that set that area apart from other regions of the world (e.g. Andronov 1964, Emeneau 1974, Kuiper 1967, Masica 1976, Ramanujan and Masica 1969). In all these studies the concept of Sprachbund or linguistic area was used in its established meaning of a multilingual area having languages of different families, in which many of the shared structural features are held to represent convergence that has resulted from long periods of language contact. This kind of 'areal' similarity that arises from mutual influences is contrasted with 'genetic' similarity between languages, which results from communicative continuity from a single origin at an earlier point in time. Thus, certain present-day similarities between Indo-Aryan and Dravidian languages of South Asia in phonology, inflectional categories, syntax, and lexical semantics do not go back to similarities between Old Indo-Aryan and Proto-Dravidian but have resulted from lexical borrowing, L1-L2 language transfer, and other processes operative in long-term bilingualisms. Less attention has been paid to shared features of language USE that set South Asia apart from other regions of the world. This kind of regional unity, which may be called a 'sociolinguistic area', arises in the same way. In fact, some of Emeneau's commonalities are sociolinguistic in this sense, and all the authors who write about South Asia as a linguistic area at least assume that the shared features of structure have come about diachronically through particular patterns of language use.
This paper was originally published in E. Dimock, B. Kachru, and Bh. Krishnamurti, eds., Dimensions of South Asia as a Sociolinguistic Area: Papers in Memory of Gerald Kelley. New Delhi: Oxford and IBH (1991), reprinted with permission of Oxford University Press.

It is of value, however, to focus specifically on certain shared features of language use apart from features of structure—to the extent that this can be done—simply because such similarities may be of great significance in characterising different regions and in understanding general processes of language change. To take an example from another part of the world, the three major languages of Southwest Asia (the 'Middle East'), Arabic, Persian, and Turkish, are genetically unrelated and their most striking similarity in structure is the huge mass of



borrowed Arabic lexical items and some phonological, morphological, and syntactic features closely associated with them. However, these languages also share many features of language use where the lexicon is not shared. Taking a simple example, they all differentiate between one expression for 'please' that is offering a service (have some, come in, have a seat) and an expression for 'please' that is requesting a favour (open the door, pass the salt, go slower). The expressions in the three languages are quite different, not loanwords or loan translations, but they are equivalent in use, translation equivalents on the proper occasions. Such similarities—and there are many—reflect the centuries-long language and culture contact among speakers of Arabic, Persian, and Turkish. To the extent that such similarities set off this area from other regions of the world they contribute to its being a sociolinguistic area. We may note in passing that the languages of Europe, which also have very frequent use of expressions for 'please', tend not to differentiate the two types, and the languages of South Asia tend to have much less frequent use of expressions for 'please' altogether. Pandit's book of 1972, India as a Sociolinguistic Area, is actually a collection of sociolinguistic papers, the published form of a series of lectures. He certainly agreed with the notion of South Asia as a sociolinguistic area, but he made no attempt to define the concept 'sociolinguistic area', and the excellent papers in the volume present only limited local examples of diachronic convergence and synchronic variation that would fit into the larger scheme. Shapiro and Schiffman 1983 is completely devoted to a sociolinguistic characterisation of South Asia, and it discusses in considerable detail some of the phenomena identified in the present paper, but it also does not deal explicitly with the concept of sociolinguistic area.
It is D'souza who has attempted to clarify the concept and apply it to South Asia, in her dissertation (D'souza 1987) and in other papers (e.g. D'souza 1986, in press). Her eight examples of features of similarity (D'souza forthcoming) overlap somewhat with the features presented here. The purpose of this paper is to highlight a few of the features of language use that mark off South Asia as a special region of the world, partly as a stimulus for myself and others to pay more attention to features of language use that cluster in areal relationships, as a complement to areal studies focusing more narrowly on linguistic structure. Partly also, the paper is intended to sharpen our understanding of what it is that makes South Asia unique, and indeed gives South Asia its fascination for so many scholars, including Gerald Kelley, who studied both Indo-Aryan and Dravidian languages and cultures and strongly identified with the peoples of South Asia.

Aspects of South Asian Language Use

Many aspects of language use can be investigated in order to characterise one sociolinguistic area as opposed to others, and in this brief paper just eight aspects will be mentioned as highly salient in the identification of South Asia. Three are 'macro-sociolinguistic', two deal with types of linguistic variation, and three deal



with specific communicative subsystems. No attempt will be made to do more than sketch the main points under each aspect, although in several cases reference can be made to detailed, systematic treatments of the topic.

Macro-sociolinguistic Aspects

Multilingual Repertoires

Every South Asian country is multilingual in the sense that two or more languages are in regular use on the national scene of government, politics, education, and the military, and all except Bangladesh have substantial mother-tongue communities of two or more indigenous languages. India gives 15 languages official status in its constitution. South Asia is multilingual in another sense: in most urban centers and at the boundaries of language areas, large numbers of individuals are actively multilingual. Many millions of South Asians make daily use of three or more languages, the choice of language set by function. Individual and group migrants often preserve their home language for many generations in addition to acquiring the language of their new location. Even though there are large areas—especially rural agricultural ones—that are predominantly monolingual, there is widespread acceptance of the naturalness of multilingualism. As a result of the British colonial period in South Asia, the English language is part of multilingual repertoires in every South Asian country (cf. Kachru 1969, Sridhar 1989, Ferguson 1989). This overall pattern of multilingualism in South Asia is older than our earliest records (Southworth 1977:21), and the official recognition of multilingualism in governmental communication goes at least as far back as the emperor Ashoka in the third century B.C. No other region of the world has had such a long-continued pattern of socially accepted, governmentally institutionalised multilingualism. In recent decades a number of scholars have produced descriptions, analyses, and interpretations of South Asian multilingualism; representative examples, in chronological order, are: Hodson 1936, Ross 1965, Sharma and Kumar 1977, Shapiro and Schiffman 1983: 177-193, and Srivastava 1988.
The complex patterns of individual and societal multilingualism in the South Asian area continue to merit sociolinguistic research, especially because the bulk of research on multilingualism has been based on European and American contexts, and the sometimes very different patterns of South Asia provide a necessary corrective.

Ancient Literacy

South Asia is one of the very few parts of the world where a tradition of literacy has been in continuous existence for well over 2000 years. During most of that time the ability to read and write was acquired by a relatively small proportion of the population, perhaps as low as two percent, and no part of South Asia achieved anything like 'mass literacy' until the nineteenth and twentieth centuries. The influence of literacy in South Asian societies has, however, been much more intense and pervasive than these facts might suggest, and several features of South Asian literacy are highly distinctive of the area. Gough 1968a discusses the history of South Asian literacy in contrast to the history of literacy in China and in ancient Greece. Several specialized studies examine literacy in particular communities or regions in South Asia; for example, Gough 1968b details the history and present status of literacy in Kerala, a relatively highly literate part of South Asia, and De Silva 1979 compares the situation in Telugu and Sinhala in an attempt to find an explanatory sociolinguistic framework. Components of the literacy situation in South Asia include:

a. Sectoral division of competence in literacy and attitudes toward it. Some sectors of the population are traditional custodians of sacred and learned texts, some sectors make use of literacy for business and administrative purposes, some sectors attach little prestige or usefulness to literacy for their own occupations or ritual status.
b. Linguistic layering. The notion of vernacular literacy, as opposed to literacy in a Classical language, has emerged repeatedly in South Asian history, and many communities have layers of Classical, vernacular, and dialect literacies, with differential functions.
c. Oral transmission. Although the tradition is disappearing, many still attach greater prestige to, and have greater confidence in, memorized texts orally transmitted than printed texts.
d. School connectedness. Literacy is widely regarded as primarily an aspect of formal schooling rather than a resource for everyday living.

Diverse Scripts

Official recognition and public use of different systems of writing go back at least to the third century B.C. in South Asia, and the number of different scripts has slowly continued to increase ever since. At the present time three families of scripts are employed in the area, their distribution coinciding generally with the religion of those who introduced the particular writing system. Thus, it tends to be the case that scripts of the devanagari type are in use in predominantly Hindu populations, Arabic-type alphabets have been introduced in Muslim communities, and versions of the Roman alphabet have been introduced by Christians. Writing systems in South Asia tend to coincide with language boundaries, but there are cases of the same script being used for more than one language and the same language being written in more than one script. The widespread use of devanagari-type and Arabic-type scripts is a conspicuous feature of the South Asian sociolinguistic area, but it must be noted that individuals tend to have strong script loyalty and only very rarely does an individual acquire both a devanagari-type and an Arabic-type writing system. It may also be noted that the two types of script are extremely different in letter-shapes and principles of organization: one goes left-to-right, the other right-to-left; one writes all the vowels, the other omits most of them; on lined paper one is written down from the line, the other is written between the lines (note that Roman letters are written on, i.e. up from, the line).

Linguistic Variation

Dialect Variation

It is a working assumption of sociolinguistic research that all speech communities exhibit variation in form correlating with differences in location and social grouping, and dialect variation in South Asian languages has been the subject of a number of studies. Dialect variation that is conditioned by features of social stratification has been a special focus of sociolinguistic research. Although interest in the relation between social groupings and language is of long standing (cf. the discussion in Cohen 1956: 168-213), quantitative variationist study begins with Labov 1964. The bulk of this latter research treats social stratification in terms of social classes, i.e. groupings based on income, occupation, and education; the bulk of the social dialect study in South Asia, however, is in terms of castes, i.e. groupings based on ascribed status, ritual purity, endogamy, commensality, and occupation. Because of the pervasiveness of caste groupings in South Asia, not only among Hindus where it is traditionally ideologized, but also among Muslim, Buddhist, and other communities, sociolinguistic research has often focused on 'caste dialects' (e.g. Bloch 1910, Bright 1960, Gumperz 1958, McCormack 1968, Ramanujan 1968). Such studies typically try to match caste divisions of a community with linguistic variation in it. Pattanayak 1975 cautions against preoccupation with caste, insisting on consideration of other social variables that may at times outweigh caste or interact in complex ways with caste differentiation. Fortunately there have been at least two attempts to summarize the research on caste-related linguistic variation in South Asia in a broader framework of dialect variation (Bean 1974, Shapiro and Schiffman 1983: 150-176).
On the basis of research so far, it seems that caste membership may be the single most important parameter of social dialect variation in South Asia, although such other parameters as education, sex, religion, and urban/rural are highly significant under certain conditions. Most researchers report no simple caste-language correlation and delineate two patterns of caste-related variation, one associated generally with northern areas of South Asia, the other with southern areas. The 'northern' pattern has the major differentiation between the lowest, so-called 'untouchable' castes and the rest of the community, and sometimes deep differences among the untouchable castes themselves (cf. Gumperz 1958). This linguistic division between most of the society and what may be called 'undercastes' that fall outside the ideologized system is also apparent in the 'southern' pattern but here, in addition to the linguistic line between undercastes on one side and the higher castes on the other, there is a strong line between the highest castes (typically the Brahmins of Hindu society) and the middle and lower castes. These patterns of caste-related linguistic variation (two-level and three-level) are typical of South Asia and no other large sociolinguistic area of the world. The details are quite varied and the diachronic origins are unclear, so that much work needs to be done in order to understand the dynamics of sociolinguistic stratificational variation in South Asia and to relate it to the dynamics of variation elsewhere. Even from the existing work, however, it is clear that this is a major characterizing aspect of South Asia as a sociolinguistic area.

Register Variation

It is a working assumption of sociolinguistic research that every individual (and thus every speech community) has as part of his or her linguistic repertoire a range of variation associated with different occasions of use. This kind of variation, often called 'register variation', that marks USE is contrasted with 'dialect variation' that marks the USER as belonging to particular social groupings. Research on register variation in South Asian languages has been limited; for example, a few studies have appeared on 'baby talk' (BT), the variety of speech addressed primarily to young children (e.g. Bhat 1967, Dil 1975, Kelkar 1964). The features these studies report (sizable BT lexicons unrelated to the corresponding adult terms, hypocoristic affixes, and phonological simplification processes) are not particularly distinctive of South Asia, similar phenomena occurring in many societies in other regions. The study of one kind of register variation in South Asia is, however, well developed: the various phenomena labelled 'diglossia.' More than a score of studies of particular situations are available, mostly on Bengali, Sinhala, Tamil, and Telugu. More significantly, repeated attempts have been made to summarize South Asian cases, explore the conceptual framework, and even develop a theory of diglossia. Particularly valuable are the papers on diglossia in Krishnamurti et al. 1986, the volume by Britto 1986, and such innovative studies as Singh 1986. South Asia has many situations in which the literary version of the language differs considerably from the ordinary spoken variety. As Caldwell observed in 1856 (cited by Singh 1986), "it is a remarkable peculiarity of Indian languages that as soon as they begin to be cultivated, the literary style evinces a tendency to become a literary dialect distinct from the dialect of common life, with a grammar and vocabulary of its own."
Deshpande (1986) claims that diglossias in South Asia go at least as far back as the third century B.C. Some of the diglossias in South Asia are of the narrowly defined, classic type of Arabic, for example Tamil. Others differ from that type in having a literary variety that is not regularly used for formal spoken purposes, for example Telugu or Bengali. Some have three distinguishable levels (e.g. Bengali sadhu bhasha, chalit bhasha, and local dialect; Sinhala Literary, Formal Colloquial, and Colloquial). The mega-diglossia of Hindi-Urdu with its local dialects, submerged languages, and dialectal literary varieties has no match in other sociolinguistic areas of the world, but it seems to fit well in the multilingual, multiple diglossic South Asian scene. The careful description and dissection of South Asian diglossias and similar situations is contributing directly to the understanding of language variation and language contact.



Subsystems

Kinship Terms

The kinship terminology itself as well as the use of kinterms in reference and address and the culturally transmitted attitudinal and behavioural patterns vis-a-vis one's kin vary from one part of South Asia to another and from one social group in South Asia to another, but certain widely shared commonalities in the use of kinterms characterize the whole South Asian area as distinct from other regions of the world. There have been many studies of kinship terminology and behaviour toward kin for specific communities in South Asia; Bean's study of a Kannada-speaking village is an example of an excellent, thorough treatment of kinship system and kin address behaviour (Bean 1978: 45-82). However, no study of the area as a whole is available. The outstanding book of Karve 1953 offered a wealth of information, both historical and contemporary, based on the author's own fieldwork, wide-ranging inquiries, and historical research. Although it analyzed all the major languages of India and covered a number of 'tribal' communities and unusual social groups, it paid almost no attention to the large Muslim and Buddhist communities of South Asia. Shapiro and Schiffman (1983) in their discussion of the ethnosemantics of South Asia (228-238) summarise critically a number of kinship studies, but make no attempt at synthesis, partly because of the considerable differences in conceptual frameworks and research methods of the various authors. Here we may just note a handful of striking areal features, with the expectation that specialists could probably add a considerable body of area-wide patterns and tendencies.

a. South Asian communities have rich kinship terminologies. A given speech community may use over 100 kinterms in everyday interaction, together with a complex system of subtly varying patterns of address and expression of deference, hostility, persuasion, and the like. For example, most South Asian systems have different single terms for father's elder brother and father's younger brother, for a sister older than oneself and a sister younger than oneself.
b. Terms of address typically neutralize some of the distinctions made in reference. For example, in most South Asian systems cousins of the same generation are addressed in the same way as siblings; in some systems the term for 'father' is used also to address grandfather, father's elder brother, and husband's father.
c. Most reciprocal relationships are asymmetrical, i.e. one member is senior, the other junior. In general, the senior is addressed by kinterm, the junior by personal name.
d. Many kinterms derived from Sanskrit have diffused throughout South Asia, sometimes amounting to as high as 50% of kinterms in a non-Indo-Aryan speech community. For example, the term mama is nearly universal in South Asia as the term for mother's brother.
e. Generally a wife does not address or refer to her husband either by kinterm ('husband') or by name. She may use a special particle of address, or make use of an expression such as 'father of X', X being their first child, or some other circumlocution.
f. There is not a neat match between kinship terminology and patterns of marriage. For example, some communities forbid marriage between any cousins of the same generation (a 'northern' Hindu pattern), some allow cross-cousin marriage, e.g. marrying one's father's sister's or mother's brother's child (a 'southern' pattern), some allow parallel cousin marriage, e.g. marrying one's father's brother's child (a Muslim pattern). Yet almost all communities address cousins of the same generation as siblings, so that two who address each other as brother and sister may come to marry each other even though no South Asian community endorses sibling marriage.

Forms of Address

Ever since the publication of the Brown and Gilman classic paper on pronouns of address (Brown and Gilman 1960), the topic of forms of address has been an active area of sociolinguistic research, culminating in attempts to describe whole systems of address including pronouns, kinterms, personal names, titles, epithets, and address particles and to provide universalist frameworks of description and explanation (Bean 1978, Braun 1988, Parkinson 1985). Pronouns of address and larger systems of address that include pronouns have been investigated in a number of South Asian languages (e.g. Bean 1978, Chandrasekhar 1970, Das 1968, Jain 1969, Karunatillake and Suseendirarajah 1975, Mehrotra 1985, Shanmugam Pillai 1972, Singh 1989). Shapiro and Schiffman (1983) attempted a summary of several of these studies (244-254), but no one has attempted a characterization of the area as a whole. South Asian pronoun systems of address differ markedly, for example, from those of European languages or those of Southeast Asian languages. European languages generally have two levels of respect grading in the second person pronouns, the higher (more formal, more respectful) typically having its origin either in an earlier plural form or in an abstract noun or third person pronoun. Respect grading corresponding to these second person pronouns of address is rarely found in the first and third person pronouns. Subject marking affixes on verbs typically agree with the independent pronouns, and in the absence of independent subject pronouns they carry the address category (for fuller treatment and references, see Ferguson 1991). Southeast Asian languages generally do not emphasize in their grammatical systems the differences between first, second, and third person, and the sets of personal pronouns are not very clearly set off from nouns of pronominal reference.
These languages generally do not show overt morphological marking of person in the verb, and thus have no grammatical agreement between subject and verb. This widespread use of status and kinterm pronominals was cited by Haas as one of three prominent characteristics of the Southeast Asian linguistic area (Thompson 1965: viii). South Asian languages, unlike those of Europe or Southeast Asia, typically have personal pronoun systems with three levels in the second person (e.g. Hindi tuu, tum, aap; Jaffna Tamil nii, niir, niimkal), two levels in the third person, and subject-verb agreement in person and respect level. The use of the second person pronouns and such other forms of address as kinterms, names, and titles serves as a very delicate indicator of relative status of the participants in a communication situation and as an expression of communicative strategies of the interactants. The use also varies extensively dialectally (by region and social stratification) and in register (occasions of use). The patterns of use, in spite of the great variability, show impressive areal similarities resulting from long-continued contact among different languages and cultures. South Asian forms of address are an important feature of the sociolinguistic area, and they constitute a promising topic of research for discovering principles of sociolinguistic convergence through contact.

Politeness Formulas

All human societies apparently have a stock of interpersonal verbal routines such as greetings, thanks, congratulations, condolences, and curses, but speech communities vary greatly in the nature and amount of such verbal routines. Following Jespersen, such routines have been called 'politeness formulas' (Ferguson 1976) in spite of the obvious facts that they need not be polite, and that other dimensions than politeness may be at issue; they have also been called 'conversational routines' (Coulmas 1981a), in spite of the fact that they may be exchanges outside the structure of conversation, and may even be regarded as substitutes for conversation. Whatever they are called, they are prime candidates for criterial features in the characterization of sociolinguistic areas, as illustrated by the introductory example in this paper. In the very brief treatment here, only formulas of greeting and expressing thanks will be discussed, and we may note in passing, as an indication of the difficulty of achieving a universal framework for the analysis of politeness formulas, that in some speech communities greeting and thanking are regarded as a single category, at least in the sense that a single cover term is used for both and there is no clear boundary between them (Goody 1972; cf. also the close relation between apologies and thanks in Japanese, Coulmas 1981b). Studies of greetings and thanks have been published for several South Asian languages and particular speech communities (e.g. Mehrotra 1985 on greeting in Hindi and Apte 1974 on thanking in Marathi and other South Asian languages). One of the most striking characteristics of South Asian greetings, as opposed, for example, to greetings in the Arab world or in Western Africa, is the relative sparseness of their use and variety.
Mehrotra's study of Hindi greetings offers a great variety of formulas and variation, dependent on the status of the greeter and respondent, nature of the occasion, and other factors. In spite of this, however, the comparativist observer notices immediately the fact that verbal greetings are exchanged less frequently, at less length, and with less variability dependent on time of day and degree of formality than in either of these other two sociolinguistic areas. Another prominent feature of South Asian greeting formulas is their tie to religious differences. Throughout the area the primary conditioning of variation in greetings is probably the religious affiliation of the speaker, the respondent, or both. Thus, in many South Asian speech communities there is no religion-neutral form of greeting in the mother tongue without resorting to English borrowings: a given greeting is Hindu, Muslim, Buddhist, or whatever. The formulas for expressing thanks in South Asian languages are also relatively sparse in use and variety. Perhaps the most salient feature of thanks in South Asia is, however, their relative formality or social distance compared to thanks in many other regions. The frequent use of thank you or other expressions of thanks between family members or close friends in much of the English-speaking world, for example, seems misplaced to many South Asians. Expressing thanks verbally, they would feel, is not needed in situations of intimacy and even implies a reduction in intimacy, a social distancing of some kind. On the other hand, at public ceremonies, such as professional conferences or guest lectures, a formal vote of thanks is obligatory in South Asian contexts, whereas a corresponding expression of thanks in North American contexts is informal or even optional, and the term vote of thanks is unknown. The coming of Islam to South Asia brought along some of the extensive inventory of politeness formulas of Iran and the Arab world (cf. Beeman 1986), and Muslim communities in South Asia probably make somewhat greater use of thank you's than Hindu communities of the same mother tongue, but the pattern is basically the South Asian one.

Conclusion

The eight features of language use that are listed and briefly characterized in this paper are sufficient to constitute South Asia as a sociolinguistic area just as fully as the features of language structure identified and described by Emeneau and other authors set South Asia apart as a linguistic area in the familiar sense. The list is in no way exhaustive and the characterizations are in no way definitive, but the aim of the paper will be achieved if the discussion somewhat broadens understanding of the language situation in South Asia and stimulates researchers to be more explicit about patterns of language use in connection with the study of areal phenomena.

References

Andronov, M. (1964) On the typological similarity of Indo-Aryan and Dravidian. Indian Linguistics 25.119-126.
Apte, Mahadev L. (1974) "Thank you" and South Asian languages: A comparative sociolinguistic study. International Journal of the Sociology of Language 3.67-89.
Bean, Susan S. (1974) Linguistic variation and the caste system in South Asia. Indian Linguistics 35.277-293.
Bean, Susan S. (1978) Symbolic and pragmatic semantics: A Kannada system of address. Chicago: University of Chicago Press.



Beeman, W.O. (1986) Language, status, and power in Iran. Bloomington: Indiana University Press.
Bhat, D.N. (1967) Lexical suppletion in baby talk. Anthropological Linguistics 9:5.33-36.
Bloch, Jules (1910) Castes et dialectes en tamoul. Mémoires de la Société de Linguistique de Paris 16.1-30.
Braun, F. (1988) Terms of address: Problems of patterns and usage in various languages and cultures. Berlin: Mouton de Gruyter.
Bright, William (1960) A study of caste and dialect in Mysore. Indian Linguistics 21.45-50.
Britto, Francis (1986) Diglossia: A study of the theory with application to Tamil. Washington, DC: Georgetown University Press.
Brown, Roger W. and A. Gilman (1960) Pronouns of power and solidarity. Style in language, ed. by T.A. Sebeok. Cambridge, MA: MIT Press.
Chandrasekhar, A. (1970) Personal pronouns and pronominal forms in Malayalam. Anthropological Linguistics 12:7.246-255.
Cohen, M. (1956) Pour une sociologie du langage. Paris: Editions Albin Michel.
Coulmas, Florian (ed.) (1981a) Conversational routine: Explorations in standardized communication situations and prepatterned speech. The Hague: Mouton de Gruyter.
Coulmas, Florian (1981b) Poison to your soul. Thanks and apologies contrastively viewed. Conversational routine, ed. by F. Coulmas. The Hague: Mouton de Gruyter.
Das, S.K. (1968) Forms of address and terms of reference in Bengali. Anthropological Linguistics 10:4.19-31.
Deshpande, Madhav M. (1986) Sanskrit grammarians on diglossia. South Asian languages: Structure, convergence, and diglossia, ed. by Bh. Krishnamurti, C.P. Masica, and A.K. Sinha, 312-321. Delhi: Motilal Banarsidass.
De Silva, M.W.S. (1979) Vernacularisation of literacy: The Telugu experiment. Hyderabad: International Telugu Institute.
D'souza, Jean (1986) Language modernisation in a sociolinguistic area. Anthropological Linguistics 28.455-471.
D'souza, Jean (1987) South Asia as a sociolinguistic area. Unpubl. Ph.D. diss., University of Illinois at Urbana-Champaign.
D'souza, Jean (forthcoming) Characterising a sociolinguistic area.
Dil, Anwar (1975) Bengali baby talk. Word 27.11-27.
Emeneau, Murray B. (1956) India as a linguistic area. Language 32.3-16.
Emeneau, Murray B. (1974) The Indian linguistic area revisited. International Journal of Dravidian Linguistics 3.2-134.
Emeneau, Murray B. (1980) Language and linguistic area: Essays by Murray B. Emeneau, ed. by A.S. Dil. Stanford, CA: Stanford University Press.
Ferguson, Charles A. (1976) The structure and use of politeness formulas. Language in Society 5.137-151.
Ferguson, Charles A. (1989) South Asian English: Imperialist legacy and regional asset. Paper presented at the International Conference on English in South Asia, Islamabad. (To appear)
Ferguson, Charles A. (1991) Individual and social: Diachronic changes in politeness agreement in forms of address. The influence of language on culture and thought, ed. by R.L. Cooper and B. Spolsky. Berlin: Mouton de Gruyter.
Goody, E. (1972) 'Greeting', 'begging', and the presentation of respect. Interpretation of ritual, ed. by J.S. La Fontaine. London: Tavistock.
Gough, K. (1968a) Implications of literacy in traditional China and India. Literacy in traditional societies, ed. by J. Goody. Cambridge: Cambridge University Press.



Gough, K. (1968b) Literacy in Kerala. Literacy in traditional societies, ed. by J. Goody. Cambridge: Cambridge University Press.
Gumperz, John J. (1958) Dialect differences and social stratification in a North Indian village. American Anthropologist 60.668-682.
Hodson, T.C. (1936) Bilingualism in India. Transactions of the Philological Society of London 1936.85-91.
Jain, Dinesh K. (1969) Verbalisation of respect in Hindi. Anthropological Linguistics 11:3.79-97.
Kachru, Braj B. (1969) English in South Asia. Current trends in linguistics, Vol. 5: Linguistics in South Asia, ed. by T.A. Sebeok. The Hague: Mouton de Gruyter.
Karunatillake, W.S. and S. Suseendirarajah (1975) Pronouns of address in Tamil and Sinhalese: A sociolinguistic study. International Journal of Dravidian Linguistics 4.83-96.
Kelkar, Ashok (1964) Marathi baby talk. Word 20.40-54.
Krishnamurti, Bhadriraju, C.P. Masica, and A.K. Sinha (eds.) (1986) South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass.
Kuiper, F.B.J. (1967) The genesis of a linguistic area. Indo-Iranian Journal 10.81-102.
Labov, William (1964) Phonological correlates of social stratification. The ethnography of communication, ed. by J.J. Gumperz and D. Hymes. Menasha, WI: American Anthropological Association.
Masica, Colin P. (1976) Defining a linguistic area: South Asia. Chicago: University of Chicago Press.
McCormack, William (1968) A causal analysis of caste dialects. Studies in Indian linguistics, ed. by Bh. Krishnamurti. Poona and Annamalainagar: Centres of Advanced Study in Linguistics.
Mehrotra, Rajaram R. (1985) Sociolinguistics in Hindi contexts. Berlin: Mouton de Gruyter.
Pandit, Prabodh B. (1972) India as a sociolinguistic area. Poona: University of Poona.
Parkinson, D. (1985) Constructing the social context of communication. Berlin: Mouton de Gruyter.
Pattanayak, Debi P. (1975) Caste and language. International Journal of Dravidian Linguistics 4.1-104.
Ramanujan, A.K. (1968) The structure of variation: A study in caste dialects. Structure and change in Indian society, ed. by M. Singer and B.S. Cohn. Chicago: Aldine.
Ramanujan, A.K. and C. Masica (1969) Toward a phonological typology of the Indian linguistic area. Current trends in linguistics, Vol. 5: Linguistics in South Asia, ed. by T.A. Sebeok et al.
Ross, A.D. (1965) Some social implications of multilingualism. Towards a sociology of culture in India, ed. by T.K.N. Unnithan, I. Deva, and Y. Singh. New Delhi: Prentice Hall of India.
Shanmugam Pillai, N. (1972) Address terms and the social hierarchy of the Tamils. Proceedings of the First All-India Conference of Dravidian Linguists, ed. by V.I. Subramoniam. Trivandrum: Dravidian Linguistic Association of India.
Shapiro, Michael, and H.F. Schiffman (1983) Language and society in South Asia. Dordrecht: Foris.
Sharma, P.G. and S. Kumar (eds.) (1977) Indian bilingualism. Delhi: Hindi Kendriya Sansthan.
Singh, Uday N. (1986) Diglossia in Bangladesh and language planning problems. The Fergusonian impact, ed. by J.A. Fishman et al. Berlin: Mouton de Gruyter.



Singh, Uday N. (1989) How to honour someone in Maithili. International Journal of the Sociology of Language 75.87-107.
Southworth, Frank C. (1977) Functional aspects of linguistic heterogeneity. Indian bilingualism, ed. by P.G. Sharma and S. Kumar. Delhi: Hindi Kendriya Sansthan.
Sridhar, Kamal K. (1989) English in Indian bilingualism. New Delhi: Manohar.
Srivastava, Ravindra N. (1988) Societal bilingualism and bilingual education: A study of the Indian situation. International handbook of bilingualism and bilingual education, ed. by C.B. Paulston. New York: Greenwood Press.
Thompson, L.C. (1965) A Vietnamese grammar. Seattle: University of Washington Press.


Falling somewhere between formulaic routines and individual variation is the patterned, systematic variation triggered by aspects of the social context such as participants, situation, or message type. Ferguson's papers on diglossia illustrate how a social group will tend to develop identifying markers of language structure and language use different from those of other social groups. This is definitely one potential source for insights into the processes of conventionalization. But even within social groups, language shows variation across communicative situations and message types. Ferguson sees variation across communicative situations as an issue of register, and variation across message types as one of genre.

Register Variation

A communicative situation can be identified by its recurrent pattern in a society, in terms of participants, setting, communicative functions, and so on. Over time, these communicative situations tend to develop identifying markers of language structure and use, more or less formatted, different from the language of other communicative situations (Ferguson 1985). Register is language identified in terms of its situation of use. Ferguson's approach to investigating this question of conventionalization in register variation is typical of the strategy of inquiry found in much of his work. Rather than attempting to address the question broadly, Ferguson investigates the phenomenon in a limited, manageable, but potentially revealing, domain, a domain often considered on the periphery of legitimate inquiry by mainstream linguists. Thus, while he now considers "Arabic Baby Talk" (1956) an example of how not to do research (he elicited intuitions from native Arabic speakers on how Arabic speakers speak to children, rather than actually collecting samples of mothers talking to children), the initial catalysts for the paper were Sapir (e.g., 1929) and his own observations from everyday encounters and casual observations of how Arabic speakers address children. Eight years later, "Baby Talk in Six Languages" appeared, an attempt to initiate cross-linguistic studies of marginal phenomena, but phenomena that are stable, conventionalized, and culturally transmitted. Here again we see the multiple emphases characteristic of much of Ferguson's work: on what is common to all varieties of baby talk, on what is subject to variation (not only across languages but across individual speakers within languages), and on patterns of diffusion. Ferguson's interest in baby talk has helped to generate a reexamination of its role in the development of the child's communicative competence (cf. Gleason and Weintraub 1978; Snow 1979). Ferguson recognizes the importance of register variation for language development:

it may well be the case that human infants exhibit register variation even before they produce vocalizations recognizable as the beginnings of language. . . . In an important sense, then, register variation may be seen not as a refinement in the use of language but as a principal source of language structure itself. (Ferguson 1982, 58)

But Ferguson's interests are broader. Among the characteristics of baby talk are prosodic, grammatical, lexical, phonological, and discourse features which are realized in specific and perhaps unique ways in particular speech communities and languages but whose general features and uses may be universal. In comparing it to the Stanford gorilla's use of sign with a novice signer, Ferguson speculates that "In adjusting our speech for talking to children, . . . we are essentially exercising deep-seated biological capabilities even more universal than we thought" (Ferguson 1978, 217). In 1968, for the international conference on pidginization and creolization in Jamaica, Ferguson coined the term foreigner talk (FT) for another form of simplified register. The paper for that conference, "Absence of Copula and the Notion of Simplicity" (Ferguson 1971), focuses on one feature of baby talk register and explores the manifestation of that feature within a framework of language typology, and in a variety of special registers. The purpose is to propose hypotheses about the notion of simplicity (cf. Ferguson 1959) that go beyond the "linguist's intuitive notions about languages" (Ferguson 1971, 145). Again, the focus is on the implications for diachronic change, language universals, language acquisition, and language loss. In "Toward a Characterization of English Foreigner Talk" (Ferguson 1975), Ferguson attempts to "establish the existence of at least one variety of foreigner talk as part of the total communicative competence of speakers of American English" (Ferguson 1975, 11). Here he not only identifies characteristics of FT shared by another simplified register (baby talk); he also reveals the negative attitudes associated with its use, as both condescending and a presumed impediment to learning. But again, there is an emphasis on diachrony: When and how is this learned, and how long has it been a part of the repertoire of the American English-speaking community?
For Ferguson, the term foreigner talk invites comparisons with other simplified registers and emphasizes the conventionality of simplified registers rather than their idiosyncrasies (Ferguson 1981). As a part of the total repertoire of competent speakers of a language, yet distinct from other registers within that repertoire, simplified registers must be included within the linguist's field of inquiry. His work in this area has led the way in exploring the nature of language addressed to second language learners (e.g., Clyne 1981; Gass and Madden 1985; for a review, cf. Long 1981).

"Sports Announcer Talk: Syntactic Aspects of Register Variation" (Ferguson 1983) represents an extension of Ferguson's investigation of register variation to a new occasion of use. Again, Ferguson argues for the inclusion of register analysis in the mainstream of linguistics and in the development of linguistic theory.

The understanding of how register variation is shared and conventionalized, how it is transmitted and acquired, and how it changes through time is at least as fundamental in understanding the phenomenon of human language as the understanding of how the phonological-syntactic-semantic systems of speech communities are conventionalized, acquired, and changed. (Ferguson 1983, 154)

But in this case Ferguson provides one of his finest examples of how to approach register analysis. First, he locates the register by identifying the situational or functional features that seem to characterize a recognizable kind of language. In this case, those features involve what the discourse does, the essentials of the roles of the participants, and the body of shared knowledge and values. Next, he refines the preliminary location by repeatedly checking the defining features for identifying both the register and variation within the register. He examines syntactic features such as inversions, result expressions, heavy modifiers, tense, and formatted speech or routines. Finally, Ferguson suggests focusing the description in terms of either the targeted register, or the lines of variation along which structure and use covary. Ferguson concludes the paper with a brief discussion of diffusion of the register and the resultant cross-linguistic variation.
This impressive body of scholarship by Ferguson on registers, and specifically the simplified registers of baby talk and foreigner talk, has also generated work on other registers, including those of amateur radio operators, academic notetaking, and courtroom speech (cf. Ferguson 1985).

Genre Variation

Just as register involves variation across situations of use, genre involves variation across message type. Although "Sports Announcer Talk" looks at the baseball sportscast in terms of register, it could as easily be analyzed as genre. In "Genre and Register: One Path to Discourse Analysis," Ferguson compares typical register analysis as practiced by anthropologists and linguists with genre analysis as often found in literary criticism. Genre also has been the object of investigation in other disciplines, each with its special focus: oral literary forms for the folklorist; expository forms for the rhetorician. For the linguist doing genre analysis, and certainly for Ferguson, the domain includes all of these, as well as more mundane forms found in everyday occasions of use. Ferguson is sympathetic to diachronic issues in his approach to the analysis of genre, a term he equates roughly with Hymes's message type. Here, Ferguson aligns himself with Bakhtin and ethnographers of speaking, but he is also interested in both the universal and the socially identified variability. For Ferguson, the distinction between register analysis, involving the question "What kind of language is appropriate for this message type?" and genre analysis, involving the question "What is the internal structure of this message type?" can be first found in Ferguson and Preston (1946). There they analyzed the structure of 107 Bengali proverbs in terms of both register (Sanskrit for more religious proverbs; Hindi/Urdu for lower-class proverbs; and the majority in Bengali) and genre (e.g., the structural pattern of question and answer). The observation in "Root-Echo Responses in Syrian Arabic Politeness Formulas" (Ferguson 1967) that Arabic proverbs are largely in Colloquial Arabic and that fewer are in Classical Arabic is a comment on register. But the analysis of the two-part and optional three-part exchange structure is one of genre. While the distinction between genre and register is harder to draw in "The Structure and Use of Politeness Formulas" (Ferguson 1976a), it is much clearer in "The Collect as a Form of Discourse" (Ferguson 1976b). Ferguson acknowledges that analysis of the register used in the collect is within the domain of what a theory of language should be able to explain. If the collect were in German, would the language be the same? Difference in language is a matter of "register." But regardless of the language used, the collect as genre (with a prescribed beginning, middle, and end) would be the same. Ferguson is interested in the "systematic conditions on grammaticality or acceptability that cut across what every linguist would regard as separate languages or seem to have a quite nonlinguistic locus" (Ferguson 1976b, 102). A common approach to diachronic change in linguistics is to take phonology or syntax as primary, looking at diachrony in terms of its effect on the system of language.
Here Ferguson takes genre as primary and looks at a given diachronic change in terms of its effect on the form of discourse. He shows how a syntactic change in language has triggered a major alteration in the basic structure of the collect. The result is a description of both the universality and the variability of a given discourse form through time and across languages. (For a similar review, see Fowler 1982.) Ferguson's unpublished paper "Prayer of the People" (1984) contains many examples of the separation between register and genre. Ferguson shows that there is a structure to this particular prayer. This is a statement of genre. But which language is used (thou vs. you, for example) is a question of register. A fair question to ask in genre analysis is "What kind of register is appropriate for this genre?" This distinction is one not always recognized or made by discourse analysts. The distinctions between dialect, register, and genre are most clearly laid out in the last paper in this section, "Dialect, Register, and Genre: Working Assumptions about Conventionalization." To understand the processes of conventionalization, it is imperative to assume a perspective that includes both the diachrony and the variability of dialect, register, genre, and language attitudes. Such an approach also gives equal merit to that which is universal, not only synchronically but also diachronically, whether one is concerned with whole language diachrony (i.e., historical linguistics), zero to full language diachrony (i.e., child language), or first language to second language diachrony (second language acquisition).



References

Clyne, M. G., ed. 1981. "Foreigner Talk." International Journal of the Sociology of Language 28.
Ferguson, C. A. 1956. Arabic baby talk. In M. Halle, ed., For Roman Jakobson. The Hague: Mouton.
Ferguson, C. A. 1959. Diglossia. Word 15.2:325-40.
Ferguson, C. A. 1964. Baby talk in six languages. American Anthropologist 66.6, pt. 2:103-14.
Ferguson, C. A. 1967. Root-echo responses in Syrian Arabic politeness formulas. In D. G. Stuart, ed., Linguistic Studies in Memory of R. S. Harrell. Washington, D.C.: Georgetown University Press.
Ferguson, C. A. 1971. Absence of copula and the notion of simplicity: A study of normal speech, baby talk, foreigner talk and pidgins. In D. Hymes, ed., Pidginization and Creolization of Languages. Cambridge: Cambridge University Press.
Ferguson, C. A. 1973. Language problems of variation and repertoire. Daedalus 102.3:37-46. (Also in E. Haugen and M. Bloomfield, eds., Language as a Human Problem. New York: W. W. Norton, 1974.)
Ferguson, C. A. 1975. Toward a characterization of English foreigner talk. Anthropological Linguistics 17:1-14. (Also in B. W. Robinett and J. Schachter, eds., Second Language Learning. Ann Arbor: University of Michigan Press.)
Ferguson, C. A. 1976a. The structure and use of politeness formulas. Language in Society 5:137-51. (Also in F. Coulmas, ed., Conversational Routine. The Hague: Mouton.)
Ferguson, C. A. 1976b. The collect as a form of discourse. In W. J. Samarin, ed., Language in Religious Practice. Rowley, Mass.: Newbury House. (Also in M. A. Jazayery, E. C. Polome and W. Winter, eds., Linguistics and Literary Studies in Honor of Archibald A. Hill. Lisse, The Netherlands: Peter de Ridder Press.)
Ferguson, C. A. 1978. Talking to children: A search for universals. In J. Greenberg, C. A. Ferguson, and E. Moravcsik, eds., Universals of Human Language. Stanford: Stanford University Press.
Ferguson, C. A. 1981. "Foreigner talk" as the name of a simplified register. International Journal of the Sociology of Language 28:9-18.
Ferguson, C. A. 1982. Simplified registers and linguistic theory. In L. Obler and L. Menn, eds., Exceptional Language and Linguistics. New York: Academic Press.
Ferguson, C. A. 1983. Sports announcer talk: Syntactic aspects of register variation. Language in Society 12:153-72.
Ferguson, C. A. 1984. Prayer of the people: Group construction of a religious genre of "formatted discourse." Unpublished ms.
Ferguson, C. A., ed. 1985. Discourse Processes, Special Issue: Special Language Registers 8.4.
Ferguson, C. A. 1994. Some working assumptions about conventionalization: Dialect, register, and genre. In D. Biber and E. Finegan, eds., Sociolinguistic Perspectives on Register. Oxford: Oxford University Press.
Ferguson, C. A. 1985. Genre and register: One path to discourse analysis. Paper presented at the Second Barbara Gordon Memorial Lecture in Linguistics at Florida International University, Miami. [Chapter 12, this volume]
Ferguson, C. A., and W. D. Preston. 1946. 107 Bengali proverbs. Journal of the American Oriental Society 66:4.299-303.
Fowler, Alastair. 1982. Kinds of Literature: An Introduction to the Theory of Genres and Modes. Cambridge, Mass.: Harvard University Press.
Gass, S. M., and C. G. Madden, eds. 1985. Input in Second Language Acquisition. Rowley, Mass.: Newbury House.
Gleason, J. B., and S. Weintraub. 1978. Input language and the acquisition of communicative competence. In K. E. Nelson, ed., Children's Language, vol. 1. New York: Gardner Press.
Long, M. H. 1981. Input, interaction and second language acquisition. In H. Winitz, ed., Native and Foreign Language Acquisition, Annals of the New York Academy of Sciences 379:259-78.
Sapir, Edward. 1929. Nootka baby words. International Journal of American Linguistics 5:118-19.
Snow, C. 1979. Conversations with children. In P. Fletcher and M. Garman, eds., Language Acquisition. Cambridge: Cambridge University Press.

7 Baby Talk in Six Languages

Occasionally linguists have turned their attention to the description of marginal systems within languages, such as animal calls, hesitation forms, or baby talk. Such phenomena have sometimes been studied because of purely linguistic interest in synchronic description: they often have elements of sound or form which do not occur in the "normal" central system of the language or have unusual arrangements or frequencies of occurrence of elements which do occur in the central system. This kind of study is of particular relevance to the question of the monosystemic nature of languages versus polytypical analyses of "coexistent" systems. These marginal phenomena have also sometimes been studied from a psychological point of view, in relation to questions of language acquisition or language function. The present paper approaches the analysis of baby talk from a rather general taxonomic, linguistic interest. The intention is to initiate cross-language studies of marginal phenomena of this kind which will lead to a general characterization of them and to a framework for the characterization of single-language marginal phenomena in such a way that synchronic classification and historical explanation become possible. By the term baby talk is meant here any special form of a language which is regarded by a speech community as being primarily appropriate for talking to young children and which is generally regarded as not the normal adult use of language. English examples would include choo-choo for adult train, or itty-bitty for little. In most cases the baby-talk item can also be used in some other situation with special value; in some cases (e.g., peek-a-boo) the item has no counterpart in normal language since it refers to an activity or object appropriate chiefly for children. 
The method used here will be the comparison of baby-talk phenomena in six languages, selected for variety of linguistic structure and sociolinguistic setting within the limits of available material: (Syrian) Arabic, Marathi, Comanche, Gilyak, (American) English, Spanish. The first two are major languages of Asia with millions of speakers and strong literary traditions; the second two are of small nonliterate communities, one New World, one Old World; the last two are major European languages. The primary source materials for the first four languages are the articles of Ferguson (1956), Kelkar (1964), Casagrande (1948), and Austerlitz (1956); the material on English and Spanish was compiled from informants for this study.1

This paper was originally published in J. Gumperz and D. Hymes, eds., The Ethnography of Communication (American Anthropologist 66.6 [Part 2]), reprinted by permission of the American Anthropological Association.

Assumptions

Before proceeding to examination of the material, certain assumptions of this study should be made explicit since they are not in agreement with general views of baby talk. Here it is assumed that baby talk is a relatively stable, conventionalized part of a language, transmitted by "natural" means of language transmission much like the rest of the language; it is, in general, not a universal, instinctive creation of children everywhere, nor an ephemeral form of speech arising out of adults' imitation of child speech. Like other marginal systems such as animal calls, however, baby talk tends to show somewhat different patterns of diffusion from the normal language: for example, particular baby-talk items are often present in contiguous but genetically unrelated languages. The assumption of relative stability as opposed to ad hoc creation is suggested by such cases of historical documentation as in Arabic where there is a record of Arabic baby talk used at the beginning of the nineteenth century which is very much like Arabic baby talk today.2 An even more impressive case is the persistence of baby talk words for food, drink, and sleep for some two thousand years in the Mediterranean area. The Roman grammarian Varro (116-27 B.C.)3 cites Latin bua and papa or pappa as baby talk for 'drink' and 'food' respectively, and the use of Latin naenia 'dirge, lament' in the baby-talk meaning of 'lullaby' is attested. At the present time the general Arabic baby talk for 'drink' is mbu or mbuwa. The baby-talk word for 'food' is papa throughout the Spanish-speaking world; this is regarded by some speakers of Spanish as a special use of the adult word for potatoes, but it is attested in Spanish before the introduction of potatoes. The modern Moroccan Arabic baby-talk word for 'bread' is bappa (or babba, or pappa). A common Arabic baby-talk word for 'sleep' or 'lullaby' is ninni or ninne, which occurs also in Italian.
The details of diffusion are quite unclear, but there can be little doubt of a historical connection between the Latin words and the contemporary Arabic, Spanish, and Italian ones.

The assumption that baby-talk items are conventionalized and culturally transmitted, not universal, can be appreciated from a glance at Table 7-1. There are similarities in the structure of these items, which will be commented on below, but any simple notion of universality is refuted by such contrasts as the Syrian Arabic and Spanish baby-talk items for 'father' (baba : tata), 'baby' (bubbu : nene), 'food' (mamm : papa), 'little' (nunu : tiquitito).4

The assumption that most baby talk is taught as such by adults to children can be validated in an impressionistic way by simple observation. Adults inform the baby that a train is a choo-choo and a dog a bow-wow and in effect drill the child in such items until he produces his version of them. The alternative explanation, that millions of children independently create items like choo-choo and bow-wow instead of the hundreds of equally satisfactory onomatopoeias that could be imagined, is clearly unsatisfactory. It is, of course, true that adults sometimes do imitate an item of child speech and it gets accepted in a family; it is also true that there are resemblances between features of child speech and features of baby talk and that adults often feel that baby-talk items are imitations of child speech, but the general assumption seems safe that adults usually initiate baby talk, using the material familiar to them as appropriate for this. There are instances of baby-talk words becoming incorporated in normal language, e.g., English tummy, several Gilyak items (Austerlitz 1956: 271-2), Spanish pininos.

Material

Baby talk includes at least three kinds of material: (1) intonational and paralinguistic phenomena which occur with normal language as well as with other baby-talk material; (2) morphemes, words, and constructions modified from the normal language; and (3) a set of lexical items peculiar to baby talk. Intonational features have been noticed by many authors, and even casual observers may notice the higher overall pitch, preference for certain contours, and special features such as labialization which occur in baby talk in a number of languages. Much of this is subsumed under the term Ammenton. Very little systematic description of this kind of baby-talk material has as yet been attempted,5 and it will not be discussed further here. The baby-talk material derived from normal language shows considerable variability in the six languages, but a number of patterns of modification, phonological or grammatical, are sufficiently common to be of interest.

Modifications of Normal Languages

Phonology6

Simplification of consonant clusters (e.g., English tummy for stomach) is attested for all except Arabic and may well occur there too. There is an interesting variation in this: Gilyak has many final clusters and, even though it simplifies them, its final clusters in baby talk are more complex than those of baby talk in the other languages. Replacement of r by another consonant (e.g., English wabbit for rabbit), either by a liquid l, y or w or by an apical stop t or d, occurs in all six languages. The replacement by l in several languages is surprising since some linguists feel that trills are more "basic" than laterals in that there are many languages with trills and no laterals but few the reverse. Replacement of velars by apicals (e.g., English tum on for come on) is attested for all except Arabic and Gilyak, and considering the frequency of velars in the Arabic and Gilyak baby talk it seems likely that this replacement does not occur in these. Some kind of interchange among sibilants, affricates, and stops (e.g., English soos for shoes) occurs in all but Comanche and Gilyak, but is of three different types: (a) hushing sibilants replaced by hissing sibilants (Arabic, Marathi, English); (b) sibilants replaced by [c] (Marathi, Spanish); (c) affricates replaced by stops (Marathi). The most interesting of these is probably the replacement of [s] by [c] (e.g., Spanish becho for beso) since the latter is felt by some linguists to be a less "basic" sound and this replacement seems very unnatural for English speakers. In Spanish baby talk the use of [c] for [s] is widespread and in fact is an identifying feature of baby talk; of the languages discussed here it occurs also in Marathi, and it is attested for Japanese baby talk as well. Distant nasal assimilation is attested for Marathi, Gilyak, and Spanish (e.g., Spanish mamoch for vamos), and may also occur in the others. Examples of loss of unstressed syllables occur in English and Spanish (e.g., Spanish tines for calcetines).7

Grammar

At least one diminutive or hypocoristic affix is of frequent occurrence in each language. This may be a regular diminutive form (as Spanish -ito, -ita or Comanche -ci) or a form used chiefly in baby talk and only infrequently in normal language (e.g., Gilyak k/q, Marathi -[k]ula/ -ukla, Arabic -o, English -ie). Greater use of nouns rather than pronouns and verbs is general: equational clauses without verbs replace normal construction with copula or verb (e.g., English dollie pretty for the doll is pretty), and third person constructions replace first and second person ones (e.g., English daddy wants for I want). In two of the languages, Arabic and Marathi, a shift in gender is used as a mark of endearment; i.e., a feminine noun, pronoun, adjective, or verb form is used in reference to a boy or vice versa. For example, in Arabic wen ruhti ya binti? 'Where did you go (fem.), little girl?' said to a boy; inta zu 'an? 'Are you (m.) hungry (m.)?' said to a girl.8 In Marathi the examples are with the use of a feminine ending on a boy's name and vice versa.

Lexicon

The number of lexical items given in the references varies from about 25 to over 60. The commonest topics reported are: kin names, nicknames and the like; body parts and bodily functions; basic qualities like "good," "bad," "little," "dirty"; and the names of animals and nursery games. About 30 such items common to most of the six languages are listed below, classified under four headings; in several cases attested items modified from adult words are entered when there is no special word.



Characteristics

Baby-talk words either as modifications of normal words or as special lexical items show certain general characteristics. In the first place, baby-talk items consist of simple, more basic kinds of consonant, stops and nasals in particular, and only a very small selection of vowels. One would expect that the rarer, more peculiar consonants or the consonants which tend to be learned later would not be found in baby talk, and generally this is true but there are some exceptions. Gilyak, for example, uses four phonemically distinct nasals in baby talk, and a variety of velars as mentioned above. Arabic has many baby-talk items with pharyngeal spirants although these are often assumed to be learned late in Arabic. The best example is the fact that labial emphatics exist in Arabic baby talk and may well be the first emphatics learned by Arabic children even though they are marginal in the adult language. A second phonological characteristic is the predominance of reduplication, both of parts of words and of whole words, in the baby talk of all six languages. For several of these languages reduplication plays a grammatical role of some sort in the adult language, but the reduplication in baby talk is generally separate and unrelated to the use in the normal language. Reduplication can probably be regarded as a feature of baby talk throughout the world. Each of the six languages has a typical ("canonical") form of baby-talk items. There is variation, dependent at least in part on the canonical forms of morphemes in the corresponding adult language, but the commonest form is CVC, i.e. a monosyllable beginning and ending with a consonant, with CVCV as next most common. Many items have CVCCV with a double consonant in the middle even if this is not common in the adult language.
As an example of the variation conditioned by normal canonical forms we may cite Spanish: in adult Spanish, monosyllabic words of the shape CVC are extremely rare, and this form seems not to occur in Spanish baby talk, where CVCV is the commonest form. On the grammatical side, apart from the reduplication and canonical forms already mentioned among phonological characteristics, the most striking features are the absence of any inflectional affixes, the presence of a special baby-talk affix and the use of words in different grammatical functions. The semantic fields showing a special baby-talk vocabulary most commonly represented include kin, food, body parts, and animals. It must be noted that the features listed here as characteristic of baby-talk items are in general characteristic of the one-vocable utterances ("monoremes") used by children at the stage of linguistic development between the stage of call-sounds and other prerepresentational items and the stage of two-vocable utterances where words and sentences emerge.9 Common characteristics include reduplication; primitive affixes; food, animals, toys, etc., as referents. In view of this similarity one is tempted to make the hypothesis that every language community provides a stock of baby-talk items which can serve as appropriate material for babies to imitate in creating their monoremes but which do not interfere with the normal words of the language and can gradually be discarded

Table 7-1. Lexical Items Commonly Found in BT

             Arabic   Marathi   Comanche   Gilyak       English   Spanish
KIN
1. mother    mama     (m)ai     —          yma          mommy     mama
2. father    baba     baba      ?api?      da(j), dyj   daddy     tata
3. baby      bubbu    bal       nini?      (nena)       ba-by     nene

4. food 5. drink, water 6. sleep

mamm mbu (wa) ninni, ninne

tata-? papa-/ —

mama, nana — qoq


(i)si toto bau calcal pipi nuni, nunu cimi

?a?h, ?asi — ?ana , nana — cici.? wi?asl ta?sl

— dink sleepy-bye night-night wee-wee, pee-pee poop(oo) — ow, booboo footsie — — —

papa (a)guita tuto, meme

7. urination

m mm m papa nini, 3°3° gai (gai) mumu, su


8. 9. 10. 11. 12. 13. 14.

defecation bath hurt walk, foot breast, milk penis vagina

ka o gullu gullu wawa dade zeze, zizz —

hisa, cisa [a?a] ypyp ykyk onk, amqamq mynk, myny coc(k) bew, elna

pipi; pichi, chichi popo, kaku — yaya, coco patita, pininos — — —

1. Marathi and English have many baby-talk words for 'mother' and 'father.' Marathi mai, OiOi, ai (regular adult word), m mi (English loan), 'mother'; baba, nna, dada, tatya, tata, 3ppa, nana, aba, bhau; p pa, d di (English loans), 'father.' English mom, ma, momma, mommie; dad, daddy, dada, pop. Casagrande says "There is no special baby word [for mother] in common use," the regular adult pia being used (1948: 12); mama ? is included in the alphabetized listing, however, identified as English. Spanish baby talk mama (also mami) is stressed on the first syllable; mama with final stress is a somewhat informal adult word.
2. Comanche ?api ? also means 'father's brother' and 'father's friend'; baby talk toko ?, toto ? 'grandfather' (adult counterpart 'mother's father') is sometimes used for 'father.' Spanish tata may also be used for 'grandfather.'
3. Arabic bubbu and Comanche nini? are also used for 'doll.' Marathi bal is also an adult word, but is usually in baby talk with a special intonation, and other adult words for 'baby' are not used in baby talk. The Gilyak nena is glossed only 'doll.' A feminine nena occurs in Spanish, although in Chile nene may be used for both sexes.
4. English apparently has no common baby-talk word for 'food'; yum-yum 'delicious' is sometimes used.
5. Marathi papa also means 'kiss.'
6. Arabic has variants such as ?a??a ninni (-e), ?o??o ninni (-e). Spanish hacer tuto is attested for Chile, hacer meme (or mimi) for Mexico.
7. Spanish pichi, chichi are not attested for Mexico.
8. Chile: popo is 'anus,' sometimes 'vagina,' never 'defecation' or 'feces'; kaku is not attested for Mexico.
9. Arabic gullu gullu is from McCarus.
10. Spanish tener una yaya (or yayita) is attested for Chile, hacerse coco for Mexico.
11. Gilyak amqamq is 'walk'; Gonk (variants Gon, GonGon, Gono) is 'legs and feet.' Spanish patita 'foot' is attested for Mexico and Chile; in the sense of 'walking, taking steps' Chile has andando patita, Mexico hacer pininos.


Table 7-1 (Continued) Arabic QUALITIES 15. nice 16. bad, don't! 17. dirty 18. hot 19. cold 20. nothing left 21. little


dahh didde, (hu)mm kixx, ka ?uhh hu bahh nunu

aw aw


chan chan ? um-a ? ?a?cha.? ha(?) ? yakk, isi? ax hay? ?iti.? gar gar ?ici koko --pitukla —



23. cat

nawnaw, biss

mau, mini

24. 25. 26. 27. 28. 29. 30.

kuku bu bu tiss naww, bt?? eno ha?ha? kurr nahh

ciu bagul-bua bhur kukk, bua kokru hokurr khau

bird goblin going out peek-a-boo carry on back noise, ear goodies, candy



Gilyak ulak — alqalq — — ap(k)a —



p(r)it-tie nino (? ? ) alveolar click [yix] fuchi, chocho burnie ssss — fio a(ll) gone cabo teenie tiquitito (-weenie), itty-bitty

gyck, gycy

doggie, bow-wow wa?6 ? — pussy(-cat) kitty(-cat) kaka ? bic-( )aq birdie muki ? humk boogeyman — — bye-bye — — peek-a-boo mama ? aci, (b)apu piggy-back — — — kok6.? — —

guau guau, gua gua cuchito, michi, bicho pipi cuco, coco mamoch calle oneta upa — uches

15. Marathi chan chan is a reduplicated form of adult chan. Spanish nino (adult lindo) is attested only for Chile. 16. Arabic didde means 'don't or I'll slap you, spank you'; (hu)mm means 'don't touch.' 17. Spanish chucho is a baby talk form of sucio 'dirty'; possibly fuchi is also related to this. 18. Spanish ssss is accompanied by a gesture of shaking the fingers loosely as though just burnt. 19. Marathi gar gar is a reduplicated form of adult gar. 22. Chile guau guau, Mexico gua gua. 24. Marathi ciu is glossed 'house sparrow.' Comanche kaka is also 'headlouse.'

25. Comanche has several words for frightening children; the muki ? is some kind of giant owl, the mumu ? is darkness or thunder, the ?ini ? is a noxious insect or small animal like a snake or scorpion. Chilean Spanish sometimes has cuca, feminine of cuco. 27. English peek-a-boo is chiefly American; the usual British form is bo-peep. 28. Marathi kokru ho- means 'play lamb,' i.e., be carried piggy-back. Comanche mama ? is glossed "horse; said by a child when he wants to be carried on someone's back."



Register and Genre

as real words emerge in the children's speech. The child may, and often does, create his monoremes from other sources such as sound imitation or fragments of adult utterances, but the baby-talk items tend to be one of the principal sources. The baby-talk lexicon of a language community may thus play a special role in the linguistic development of its children: the facilitation of each child's acquisition of a set of monoremes from which he can go on to the beginnings of real grammar. Experimental confirmation of this hypothesis would be difficult; perhaps the most relevant data would come from societies with radically different attitudes toward child language learning. (Cf. Voegelin and Robinett 1954.)

Function Under what circumstances and with what intentions is baby talk used? The published material is very limited on this point. There are, however, several situations or purposes mentioned in the articles or by informants, and these may be considered. Perhaps the primary purpose is felt to be teaching a child to talk; that is, people asked why or when they use baby talk will say that they use it when talking to young children to make it easier for them to learn to talk. If asked in more detail they may explain that what they are saying in baby talk is easier for the child to learn and that it is clearer, i.e., easier for the child to hear. Also, especially in the Marathi material, whenever there is a choice between two ways of saying something, baby talk uses the more colorful, more "marked" in the linguistic sense. This feeling is obviously incorrect in details (is pussy so much easier than cat?) and too vague in formulation, but it seems to reflect in a folk-wisdom way the function hypothesized above. A moment's consideration, however, shows this is not the only time baby talk is used. It is used, for one thing, in talking to infants who are not yet learning to talk, and it is apparently used in talking to pets in every one of these six language communities. Obviously one is not teaching the infant or the pet to talk. Secondary uses of baby talk generally seem to reflect a desire on the part of the user to evoke some aspect of the nurturant-baby situation in which the primary use of baby talk occurs. This evocation may be from the side of the baby. For example, a child who has just gotten past the use of baby talk by his parents may then revert to baby talk (in fact, even use baby talk that he has not used before) in order to get attention or to be treated in some way as a baby. Also, adults use baby talk in reporting children's speech; in several language communities (e.g.
Marathi, Norwegian) baby talk is often used to represent child speech in written literature such as novels and stories. The evocation of the nurturant-baby situation may also be from the side of the nurturant. For example, the use of baby talk to pets or small infants seems to show the kind of protectiveness and affection characteristic of the nurturant's relation with the baby. The Marathi author notes that the speaker gets a sense of pleasure from doing this. In Marathi, English, and Spanish, lovers' use of baby talk is attested, and in

Baby Talk in Six Languages


this case it may not always be clear whether it is the protectiveness of the nurturant or the dependence of the baby that is evoked. It is worth noting that Kelkar reports, on the basis of observation in multilingual situations, that adults who are using baby talk with other adults do not use baby talk in anything but their own language. It seems very likely, however, that this varies depending on a number of factors; it is in any case related to the important general issue of relationship-signaling styles in a second language. Finally, it is clearly documented for several languages that baby talk is used in certain kinds of songs, riddles, and word-play on the part of adults which bear little direct relationship to the uses with children (Austerlitz 1956: 272-3).

Variability and Diffusion The fact of variability in baby talk was mentioned above; it requires further comment here. First, there is great family variation: an item gets used in a certain family and becomes well entrenched there but does not spread beyond that. There are also examples of items spreading from one family to another but not becoming general. Second, there is the areal diffusion previously referred to. Baby-talk items often diffuse within an area rather than according to the lines of genetic relationship followed by the great mass of linguistic phenomena. A good example is the baby-talk word [kix] meaning 'dirty, don't touch' and the like. This word, with slightly different forms depending on the phonological systems of the respective languages, occurs in almost every language of the Middle East. It is attested (McCarus 1963) for Arabic, Kurdish, Persian, and Syriac although these languages represent two different language families, Semitic and Indo-European (Iranian branch). The word [kix] is not attested for Turkish, which has no phoneme of the [x] type. Another good example is the use of a word like wawa, uwwa, or vava in the meaning 'hurt, sore, injury' throughout the Middle East (Arabic, Syriac, Turkish, Persian, Armenian, Greek), with a [v] in languages like Persian and Greek that have no regular phoneme of the [w] type. The explanation for this kind of diffusion might lie in the fact that the baby-talk items are not well integrated into the grammatical system of the language even though they are fairly well integrated into the phonological system. Because of this lack of integration it is clearly easier to borrow these terms from one language to another, but presumably social factors in addition to this linguistic factor should be sought as explanation.
This kind of variability, being relatively independent of genetic relationship, offers a chance for the study of distribution of baby-talk items on a statistical basis throughout the world and the kind of analysis of statistical universals of one sort or another that Jakobson has tried (Jakobson 1962), at least with mama and papa, suggesting certain reasons for their occurrence with far more than chance frequency in languages of the world. It is a rare pleasure for the linguist to have a language phenomenon which can be studied all across the world without need for corrections from the genetic relationships that are involved.



Another way in which baby talk can vary from one language to another is the size of the lexicon or the range of variation of a particular part of the lexicon. Actually one of the surprising features of the present study is the similarity of baby-talk phenomena in the six languages considered, when one might have assumed that there would be serious cultural differences in the kinds of items that would appear in baby talk and the situations in which they would be used. Further study along this line, however, would be useful. One other point of variability should be mentioned, the differences in attitude toward public use of baby talk. In our society baby talk is mentioned with an air of apology by adults talking seriously, and one feels a good bit of embarrassment in citing examples of baby talk. Also in our society it is quite widely believed that the use of baby talk inhibits learning of the language. That is, people feel that if they use too much baby talk at home, the child is not going to learn the normal language properly. This belief is presented explicitly in books on child development, although there seem to be no experimental data which would substantiate it.10 In the Arab world, however, there seem to be no such feelings. Adults may discuss baby talk perfectly easily, and they use it freely if it is appropriate. There seems to be no trace of the notion that use of baby talk may inhibit the acquisition of the adult language. Among both Americans and Arabs, however, it seems to be felt that baby talk is more appropriate for women to use than men.

Summary Baby talk is a linguistic subsystem regarded by a speech community as being primarily appropriate for talking to young children; it consists of intonational features, patterned modifications of normal language, and a special set of lexical items. The special lexical items typically number between 25 and 60 and cover kin names and appellations, bodily functions, certain simple qualities (e.g., dirty, pretty, hot, cold), and vocabulary concerning animals, nursery games, and related items. Baby-talk words typically contain stops, nasals, and a limited selection of vowels, have the structure CVC or CVC(C)V, are frequently reduplicated, and often have a diminutive suffix characteristic of baby talk in that language. Baby-talk words are not universal, but are transmitted much like other language phenomena in the community. Baby talk seems to serve in each language community as a special source for children's pregrammatical vocables, enabling them to create items at that stage which they can discard as they acquire true words and grammar. Baby talk in addition to this primary use is also used to talk to infants and pets and between adults in situations with "baby" aspects. Baby-talk items are fairly well integrated into the phonological system of the language, but are so unrelated grammatically to the normal that on the one hand they show considerable variability within a speech community and on the other hand tend to diffuse readily across language boundaries regardless of genetic relationships. A given baby-talk system may be characterized in terms of internal structure by the size of the special lexicon and the range of variability. Externally it may be characterized by the extent of its secondary uses and the attitude toward its public use.



Notes

1. As an additional source for Syrian Arabic, McCarus' notes were used; they also provided information on baby-talk items in Iraqi Arabic, Turkish, Kurdish, Persian, Syriac, and Alexandrian Greek. Further checking of Arabic was done with Mr. and Mrs. Moukhtar Ani of Damascus. Kelkar provided some additional Marathi information in a personal communication. Chief informants for the Spanish were Mrs. Raquel Saporta of Chile and Miss Yolanda Lastra of Mexico; English items came from the author and his colleagues. Susan Ervin-Tripp read the manuscript and made valuable suggestions.
2. Sabbagh's sketch (Sabbagh 1886) of colloquial Syrian and Egyptian Arabic, written in 1812, has five baby-talk words (voweling uncertain): bahh 'all gone,' dahh 'shiny nice,' uhh 'hot,' ncigg 'goo,' said to elicit smile and first word, mnahh 'sweet, goodies.' All these are in use in Syrian Arabic today (modern form for the last two nkigg, nahh).
3. Varr. ap. Non. 81. 2 cum cibum ac potionem buas ac pappas vocent et matrem mammam patrem tatam (Heraeus 1904, repr.: 170-172).
4. This notion of universality is found even in such careful works as Lewis (1957: 80): "In fact, baby language is an international language. If we make a short list of the earliest words actually spoken by children, with their meanings, we have a vocabulary that every one will recognize."
5. Kelkar pays considerable attention to intonation in his study 3.2.
6. The careful account of the phonological characteristics of Norwegian baby talk in Haugen (1942: viii-x) includes most of the characteristics listed here.
7. Surprisingly enough, Spanish baby talk shows distinctive use of stress, e.g. pipi 'bird': pipi 'urination.' Also, several baby-talk items differ from other adult words only in stress; for example, baby talk mama 'mother' and papa 'food' differ from informal adult mama, papa 'mother,' 'father,' and baby talk gua gua differs from adult Caribbean Spanish guagua 'bus' and Bolivian gua gua, 'child.' Spanish baby talk has both CVCV ( = CVCV) and CVCV as canonical forms.
8. Arabic examples are from McCarus (1963).
9. Some monoremes persist as vocables in more complex utterances, but the notion of a monoreme stage in language development seems valid. A convenient recent account of the characteristics of monoremes is in Werner and Kaplan (1963: 134-137). Full recognition of the similarity between baby talk and actual items of child language is found in Jakobson (1962: 539): "Nursery coinages are accepted for wider circulation in the child-adult intercourse only if they meet the infant's linguistic requirements. . . ."
10. This notion appears even in careful reviews such as McCarthy (1954: 536): ". . . baby-talk used by adults in the child's environment often makes for preservation of infantile speech habits." A more balanced statement on this point appears in Lewis (1957: 89): "But a mother who, because of a theory that baby-language is too 'babyish'—not 'correct language'—refuses to speak it to her child may be doing him harm, retarding his language development. On the other hand, if baby language is spoken to a child for too long in his life he may be retarded in another way—his speech may remain childish at a time when he should have grown out of this."

References

Austerlitz, Robert. 1956. Gilyak nursery words. Word 12:260-279.
Casagrande, Joseph B. 1948. Comanche baby language. International Journal of American Linguistics 14:11-14.
Ferguson, Charles A. 1956. Arabic baby talk. In For Roman Jakobson, Morris Halle et al., eds. The Hague, Mouton.
Haugen, Einar. 1942. Norwegian word studies, Vol. I, Part III (Baby talk, pp. vi-x). Mimeographed, on deposit Library of Congress.
Heraeus, Wilhelm. 1904. Die Sprache der römischen Kinderstube. Archiv für lateinische Lexikographie 13:149-172. Repr. in Kleine Schriften von Wilhelm Heraeus. J. B. Hoffmann, ed. Heidelberg, Carl Winter, 1937, pp. 158-180.
Jakobson, Roman. 1962. Why "mama" and "papa"? In Selected writings. Vol. I. The Hague, Mouton.
Kelkar, Ashok. 1964. Marathi baby talk. Word 20:40-54.
Lewis, M. M. 1957. How children learn to speak. London, George G. Harrap.
McCarthy, Dorothea. 1954. Language development in children. In Manual of child psychology, Leonard Carmichael, ed. New York, John Wiley.
McCarus, Ernest. 1963. Near Eastern baby talk. Unpublished notes.
Sabbagh, Mikha'il. 1886. Miha'il Sabbag's Grammatik der arabischen Umgangssprache in Syrien und Aegypten, H. Thorbecke, ed. Strassburg, K. J. Trübner.
Voegelin, C. F. and Florence M. Robinett. 1954. 'Mother language' in Hidatsa. International Journal of American Linguistics 20:65-70.
Werner, Heinz and Bernard Kaplan. 1963. Symbol formation. New York, John Wiley.

8 Absence of Copula and the Notion of Simplicity: A Study of Normal Speech, Baby Talk, Foreigner Talk and Pidgins

The purpose of this paper is to examine one feature of human language in a general typological framework in order to obtain some insights into the notion of grammatical simplicity. The feature in question is the presence in some languages, or special varieties or registers of a single language, of an overt connecting link, or COPULA, between nominal subjects and complements in equational clauses of the type X is Y1 as compared with the absence of such a link in other languages or other varieties of the same language. Thus, English My brother is a student and Japanese Ani wa gakkusee desu differ from Russian Moj brat student(om) or Arabic 'Axi tilmioun by having a copula (is, desu) which has no overt equivalent in the latter two languages. Similarly, English Your mother is outside or has gone out may correspond to baby talk Mommy bye-bye with no copula, or French La machine est grande 'The machine is big' corresponds to Haitian Creole Machin na gro.

This paper was originally published in D. Hymes, ed., Pidginization and Creolization of Languages. Cambridge: Cambridge University Press (1971), reprinted by permission of Cambridge University Press.

Normal Speech It may safely be assumed that all natural languages have grammatical machinery for equational clauses, but the details vary considerably from one language to another. There has been very little systematic study of clause types across languages, and future investigations may show the inadequacy of the crude classification used here, but it seems helpful for the purposes at hand. There seem to be two main types of language as far as equational clauses are concerned. Type A has a copula in all normal neutral equational clauses; the absence of the copula is limited to certain set expressions or signals a particular style or register, such as proverbs (e.g. Nothing ventured, nothing gained). In such languages the copula generally functions very similarly (i.e. has similar patterns of allomorphs, exhibits similar grammatico-semantic categories, occurs in similar constructions) to the members of the major word class of verbs. It generally differs from verbs, however, in certain respects, in some languages so much as to constitute a separate word class, in other languages in such a way as to belong to a distinct sub-class of verbs ('auxiliaries'). In Indo-European languages of type A the copula typically has a unique pattern of suppletion (e.g. Latin es-~fu-). In type A languages the copula often appears also in existential clauses of the type There is/are X, although they may have special constructions with the copula (e.g. English there is/are), or not use it at all (e.g. French il y a), in such clauses. Type B languages normally have no copula in equational clauses. The copula is invariably absent in a main clause when both members of the clause (subject and complement) are present, the clause is timeless or unmarked present in time, the complement is attributive (i.e. adjectival rather than nominal), and the subject is third person. In many type B languages the absence of a copula goes beyond these minimum limits. For example, probably in most type B languages the copula is absent with first and second person subjects as well as third (e.g. Russian Ja student 'I am a student'), although in some the absence is limited to the third person (e.g. Hungarian En diak vagyok 'I am a student' but O diak 'He is a student').2 In many type B languages the copula is absent also when the complement is a noun or pronoun, as in the Russian and Arabic examples previously cited, although in some the absence is limited to adjectival complements (e.g.
Haitian Creole Chwal yo parese 'The horses are lazy' but Chwal yo se etalo 'The horses are stallions') (McConnell 1953:20). Again, many type B languages have no copula in either main or dependent clause but some have it only in dependent clauses, e.g. Bengali Se chatro 'He is a student' but Se jodi chatro hoe . . . 'If he is a student . . .' (Sableski 1965). In all type B languages there seem to be conditions under which a copula must be used. The most widespread such condition is when a tense other than present is called for. Thus, English My brother was a student has Russian and Arabic equivalents with an overt was in Moj brat byl student, 'Axi kana tilmioan.3 Also, most type B languages seem to use a copula if only one member of the equational clause (subject or complement) is present, or if because of an inverted word order the copula would be in an 'exposed' position.4 Thus, Haitian Creole Machin na gro 'The machine is big' but Se gro 'It is big' (with copula se); Chwal yo na cha 'The horses are in the field' but Kote chwal yo ye? 'Where are the horses?' (with copula ye). Finally, in type B languages when emphasis is put on the semantic link, as in definitions and exclamatory pronouncements, a copula equivalent is used, either a special verb (e.g. 'stands', 'is found') or a pronoun (e.g. 'he', 'they'), or a verb 'to be' which is normally used in other tenses or in existential clauses. Thus, Russian cto jest' istina? 'What is truth?'. In type B languages there is often a special negative construction used in equational clauses without copula and not elsewhere in the language. Thus Arabic and Bengali have special negative copulas, lays-~las- and n ~ no- respectively,



which are used only here: Arabic Laysa (lastu) tilmioan 'He is not (I am not) a student'; Bengali Se chatro n e 'He is not a student', Ami chatro noi 'I am not a student'. Some, however, have the same negative formative in these clauses that appears in the negation of verbal predicates (e.g. Russian Ja ne student, Haitian Creole Machin na pa gro). Type B languages typically have a different verb or verb equivalent for existential clauses, e.g. Bengali ach- 'exist, be', Russian jest' 'there is/are', Haitian Creole ge, and sometimes they have still another special form of clause negation for this, e.g. Bengali nei, Russian net, Haitian Creole na pwe. Bengali illustrates the full range of possibilities here (cf. Sableski):

eta boi            'this is a book'
eta boi n e        'this isn't a book'
ekhane boi ache    'there are books here'
ekhane boi nei     'there aren't any books here'
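The conditions described in this section can be restated procedurally. The following Python sketch is only an illustrative summary of the prose above, not part of the original analysis. The class name `Clause`, its field names, and the three result labels are hypothetical, and the "copula expected" branch reflects tendencies that the text attributes to most, not all, type B languages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """Hypothetical feature bundle for an equational clause in a type B language."""
    main_clause: bool = True
    both_members_present: bool = True
    present_or_timeless: bool = True
    adjectival_complement: bool = True
    third_person_subject: bool = True
    exposed_position: bool = False
    emphatic: bool = False


def type_b_copula(clause: Clause) -> str:
    # Conditions under which the text says a copula (or equivalent) is used:
    # non-present tense, a missing member, exposed position, or emphasis.
    if (not clause.present_or_timeless
            or not clause.both_members_present
            or clause.exposed_position
            or clause.emphatic):
        return "copula expected"
    # Minimum conditions under which the copula is said to be invariably absent.
    if (clause.main_clause
            and clause.adjectival_complement
            and clause.third_person_subject):
        return "copula absent"
    # Everything else (e.g. nominal complement, first person) differs by language.
    return "varies by language"


# Haitian Creole "Machin na gro": all minimum conditions hold.
print(type_b_copula(Clause()))
# Russian "Moj brat byl student": past tense calls for an overt copula.
print(type_b_copula(Clause(present_or_timeless=False)))
# Russian "Ja student" vs. Hungarian "En diak vagyok": first person, nominal complement.
print(type_b_copula(Clause(third_person_subject=False, adjectival_complement=False)))
```

The "varies by language" outcome is deliberate: it marks the cases where, as the Russian and Hungarian examples show, type B languages diverge.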

Simplified Speech It may be assumed that every speech community has in its verbal repertoire a variety of registers, that is, modes of speech, appropriate for use with particular statuses, roles, or situations (cf. Halliday et al., 1967, ch. 4). It may further be assumed that many, perhaps all, speech communities have registers of a special kind for use with people who are regarded for one reason or another as unable to readily understand the normal speech of the community (e.g. babies, foreigners, deaf people). These forms of speech are generally felt by their users to be simplified versions of the language, hence easier to understand, and they are often regarded as imitation of the way the person addressed uses the language himself. Thus, the baby talk which is used by adults in talking to young children is felt to be easier for the child to understand and is often asserted to be an imitation of the way the children speak. Such registers are, of course, culturally transmitted like any other part of the language and may be quite systematic and resistant to change. Unfortunately they have not been studied very much; for summary and references on baby talk, cf. Ferguson 1964. A register of simplified speech which has been even less studied, although it seems quite widespread and may even be universal, is the kind of 'foreigner talk' which is used by speakers of a language to outsiders who are felt to have very limited command of the language or no knowledge of it at all. Many [all?] languages seem to have particular features of pronunciation, grammar, and lexicon which are characteristically used in this situation.
For example, a speaker of Spanish who wishes to communicate with a foreigner who has little or no Spanish will typically use the infinitive of the verb or the third singular rather than the usual inflected forms, and he will use mi 'me' for yo 'I' and omit the definite and indefinite articles: mi ver soldado 'me [to-] see soldier' for yo veo al soldado 'I see the soldier'. Such Spanish is felt by native speakers of the language to be the way foreigners talk, and it can most readily be elicited from Spanish-speaking informants by asking them how foreigners speak.5



Similarly, Arabs sometimes use a simplified form of the language in talking to non-native speakers, such as Armenian immigrants. This form is sometimes referred to as the way Armenians talk and can be elicited by asking for Armenian Arabic. It is characterized by such features as the use of the third person masculine singular of the imperfect of the verb for all persons, genders, numbers, and tenses (e.g. ya'rif 'he knows' for 'you know', 'I know', etc.) and the use of the long forms of the numbers 3-10 with a singular noun instead of the normal contracted form of the number with a plural noun (e.g. date sa a for tlat saat 'three hours'). Some Armenians and other non-native speakers of Arabic do sometimes use these expressions, but it is not clear whether this comes as a direct result of interference from their own languages or results at least in part from imitation of Arabs' use of foreigner talk. In both baby talk and foreigner talk the responses of the person addressed affect the speaker, and the verbal interaction may bring some modification of the register from both sides. The normal outcome of the use of baby talk is that as the child grows up he acquires the other normal, non-simplified registers of the language and retains some competence in baby talk for use in talking with young children and in such displaced functions as talking to a pet or with a lover. The usual outcome of the use of foreigner talk is that one side or the other acquires an adequate command of the other's language and the foreigner talk is used in talking to, reporting on, or ridiculing people who have not yet acquired adequate command of the language. If the communication context is appropriate, however, this foreigner talk may serve as an incipient pidgin and become a more widely used form of speech. Baby talk and foreigner talk are not the only forms of simplified speech. 
English, for example, has special usages for telegrams and formal instructions which resemble baby talk and foreigner talk in omitting definite article, prepositions, and copula, and the resemblance of these usages to early childhood language behavior has been noticed (Brown and Bellugi 1964:138-9). The conventional nature of these usages, which native speakers explain as being more economical of space, time, or money, is shown by their use where the limitations are irrelevant, as with instructions printed on a package where there is plenty of empty space or choices of wording in telegrams where either wording is below the number of words allowed at minimum cost.

Simplicity The notion of simplicity in language and language description has been a perennial issue in linguistics as in other disciplines, and there is little agreement on what constitutes simplicity. Some recent work in linguistics has been concerned with a 'simplicity metric' in evaluating alternative grammars or partial grammars. The notion of simplicity in language itself, however, is only indirectly related to this. In the present paper we are concerned with the concept of simplicity in language, i.e. the possibility of rating some part of a language (e.g. a paradigm, a construction, an utterance, a clause type, a phonological sequence) as in some sense simpler than another comparable part in the same language or another language. For sample statements of this sort, cf. Ferguson (1959:333-4). The notion of simplicity in language is important in several ways, since it may be related to theories of language universals, language acquisition, and language loss. Jakobson and others have assumed that, other things being equal, the simpler of two comparable features is likely to be the more widespread among languages of the world, the earlier acquired in child language development, and the later lost under pathological conditions. Even though the last of these assumptions may offer great difficulties because of the varied nature of pathological conditions, there seems to be some validity for the first two.6 Accordingly, the creation of taxonomies involving the dimension simple-complex and investigation of these across many languages offers promise in the development of the general theory of language. Also, any full-scale description of a language should identify simple versus complex (i.e. primary versus derivative) along a number of dimensions and thus offer predictions about possible orders of acquisition of the respective features. This process of prediction and empirical confirmation offers an opportunity for checking the validity of grammars which goes outside the linguists' intuitions about languages. For examples of predictions of this kind, cf. Ferguson 1966.7 The present paper suggests an additional approach to the study of simplicity in language, viz. the investigation of simplified registers, such as baby talk and foreigner talk, which give some indication of what folk grammatical analysis rates as relatively simple or easy versus complex or difficult. For discussions of simplicity in pidgins, see Samarin (1962:59-60), Ferguson (1963:119-20).

Hypotheses Even on the basis of the largely impressionistic and anecdotal accounts of simplified speech now available, it is possible to hazard some universal hypotheses. For example, 'If a language has an inflectional system, this will tend to be replaced in simplified speech such as baby talk and foreigner talk by uninflected forms (e.g. simple nominative for the noun; infinitive, imperative, or third person singular for the verb)'. Several such hypotheses might even be subsumed under a more general hypothesis of the form: 'If a language has a grammatical category which clearly involves an unmarked-marked opposition,8 the unmarked term tends to be used for both in simplified speech'. This general hypothesis may raise more problems than it solves at this point in our understanding of grammatical systems, but it illustrates the kind of hypotheses which may be generated in the study of language universals. A fairly specific kind of universal hypothesis is the central point of this paper. In pairs of clauses differing by presence and absence of a copula in a given language, speakers will generally rate the one without the copula as simpler and easier to understand. Also, studies of child language development seem to show that children, apart from some marginal cases, first make equational clauses without a copula and only later (if the language has a copula) acquire the construction with the copula. Thus, even though the linguistic analyst may find that in the full normal speech absence of the copula is to be regarded as a deletion and hence grammatically more complex than its presence, and even though languages which lack a copula in equational clauses may have quite complicated patterns of allomorphy and distribution of synonyms in verbs 'to be', it seems wise to make the assumption that other things being equal absence of the copula is simpler than presence of the copula.9 Therefore, given that languages can be classified into two types according to their equational clauses, type A with copula and type B without copula, then:

Hypothesis 1 In languages of type A, the copula in equational clauses will tend to be omitted in simplified speech such as baby talk and foreigner talk.

Although this hypothesis says nothing about equational sentences in languages of type B, it predicts that speakers of a language of type A will tend to omit the copula when they are attempting to simplify their speech. Specifically it predicts that simplified registers in regular use in the speech community will tend to omit the copula, e.g. baby talk, foreigner talk, telegraph language, newspaper headlines. Going a step further, the hypothesis would suggest that a pidgin language whose lexical source was a type A language would tend to omit the copula. The wording of the hypothesis in terms of possibility ('will tend to') rather than in absolute terms ('will') is based on the existence of empirical data showing considerable variation in the extent to which the copula is actually omitted. For example, in French baby talk the copula seems to be omitted much less often than in English baby talk, although etre as an auxiliary is often left out (Papa parti 'Daddy bye-bye'). Also, of the Portuguese-based Creoles used in the Far East in the sixteenth century some apparently had a copula while others did not (Whinnom 1965). A further subhypothesis can be made with regard to the degrees of likelihood of omission of the copula under different conditions. This hypothesis is based on the descriptive statements made about type B languages, although their relation to the notion of simplicity is unclear. Hypothesis 2 In simplified speech of languages of type A, the copula is more likely to be omitted under each of the following conditions than otherwise:

main clause
subject and complement both present
non-emphatic
timeless or unmarked present
third person subject
adjectival complement
non-exposed position

The presentation of these two hypotheses constitutes in effect the outline of a research project to examine the omission of copulas in baby talk, foreigner talk, and pidgins to find the extent to which the hypotheses would be disconfirmed, confirmed in principle, or even quantified. Some encouragement as to possible results comes from recently presented evidence (Labov 1967) that certain varieties of English which frequently omit the copula do not do so in clauses where the standard language does not permit contraction, i.e. in instances of emphasis, exposed position, or absence of one member of the clause.

Concluding Observations

For the linguist interested in typology and language universals this paper suggests the usefulness of a taxonomy of copula and copula-like constructions in the world's languages and the elaboration of hypotheses of synchronic variation and diachronic change in this part of language. The copula seems of particular interest because of the universality of equational clauses, the widespread patterns of polysemy and suppletion and possible exceptions to general hypotheses of the status of markedness in grammar.

For the linguist interested in child language development, the paper repeats earlier suggestions that the notion of simplicity may be a useful one in accounting for the development of grammar in the child, repeats the point (Ferguson 1964) that baby talk is largely initiated by adults on the basis of existing patterns, and suggests further that the telegraphic style used by young children may in part be based on the fact that adults in their attempt to simplify their speech (i.e. use baby talk) tend to omit items such as the copula, prepositions, articles, and inflectional endings.

For the linguist interested in pidgins and Creoles, the most important suggestion of the paper is probably the view that the foreigner talk of a speech community may serve as an incipient pidgin. This view asserts that the initial source of the grammatical structure of a pidgin is the more or less systematic simplification of the lexical source language which occurs in the foreigner talk register of its speakers, rather than the grammatical structure of the language(s) of the other users of the pidgin. Such a view would not, of course, deny the grammatical influence of the other language(s), but would help to explain some of the otherwise surprising similarities among distant Creoles by setting the starting point in a universal simplification process.
It differs from the view held by some scholars from Schuchardt to the present that 'the Europeans deliberately and systematically simplified and distorted their language to facilitate communication with the non-Europeans' (Goodman 1964:124) by emphasizing the conventional, culturally given aspect of the linguistic simplification and by recognizing with Bloomfield the interaction 'between a foreign speaker's version of a language and a native speaker's version of the foreign language' (quoted in Goodman 1964:12).

Notes

1. The equational clause type includes a number of semantic (and in some languages grammatically distinct) sub-types such as identity (Her father is the President of the University), class membership (Your friend is a fool), attribution of a property (The towel is wet). For the purposes of the present article these distinctions are generally disregarded, and the terms 'equational clause' and 'copula' are used to refer to any or all of them unless otherwise specified. For discussion of equational clauses, see Elson and Pickett (1962:112-13); sample definitions in specific languages, cf. Sableski (1965); Sebeok (1943).

2. It has been pointed out that in those early Indo-European languages which have equational clauses without copula, this is normal only in the third person. Cf. Meillet (1906-8:20).

3. Bally called attention to this feature of languages without copula in a more general discussion of zero and ellipsis (Bally 1922:1-2).

4. For the term 'exposed', cf. Hall (1953:66n.); for latter read former.

5. For examples of this kind of Spanish, see Lynch (1955), in which an Englishman is portrayed as using this kind of foreigner's Spanish; e.g. Osted moi buena conmigue . . . Mi no olvida nunca. 'You very good with me . . . Me not forget(s) never' (187).

6. On the question of order of acquisition, it is, of course, necessary to recognize that other things are not equal and that acquisition may run not only from simple to complex but from less effort to more effort, from heavy affect to light affect, or from high frequency to low frequency, and that interference from other parts of the language or another language may be involved.

7. The possibility must be noted that the speaker may, in the case of language development, reorganize his internal grammar in such a way that what was previously primary may become derivative and vice versa. Thus a speaker who learns Handschuh as a monomorphematic lexical item meaning 'glove' may later identify it as Hand 'hand' plus Schuh 'shoe' in a compound-word construction. Similar reorganizations of grammatical constructions make it hazardous to relate a line of derivation or the ordering of a set of rules to an actual developmental sequence, but the grammar will surely offer clues which can be checked against empirical data.

8. For an extensive discussion of marked-unmarked categories in grammatical universals see Greenberg (1966:25-55).

9. In making this judgment of simplicity on the basis of certain phenomena of child language development and 'simplified' registers, no claim is made about simplicity in grammar writing or in cognitive adequacy. To say a certain construction is 'simpler' in the sense used here says nothing directly about its value in communication or other language functions: no one would maintain that Russian and Arabic in which the copula is omitted are less adequate or more 'primitive' than Japanese and English which regularly use the copula.

9 The Collect as a Form of Discourse

Linguists and grammarians through the centuries have generally regarded a grammar as a characterization of the possible (i.e., grammatical, pronounceable, writable) utterances of a particular language. Sometimes this aim has been reduced, as when a linguist has tried to characterize only a limited corpus; sometimes it has been expanded, as when a linguist has tried to characterize a set of dialects or languages by the same grammar. Recently the notion of grammar has been consciously extended, in a direction sometimes implicitly suggested in the past, to a characterization of the utterances appropriate or acceptable under various sociolinguistic, psychological, or communicative conditions. No matter which of these views is taken, the linguist faces the same crucial question. What is the locus of the grammar; exactly what is this "language" which he is characterizing? The problem of how to define a language as opposed to a dialect or a family of languages is an old one which remains inadequately resolved (see, for example, Ferguson and Gumperz, 1960, or many other treatments), but it will not be examined here. An even more troublesome problem is the existence of systematic conditions on grammaticality or acceptability which cut across what every linguist would regard as separate languages or seem to have a quite nonlinguistic locus. We will examine this question by making some observations about a particular form of discourse, a very slight example, but one which may nevertheless be instructive: the collect. Contemporary attempts at discourse analysis have generally been intended to show features of a particular language which should be incorporated in a total grammar of the language, although some also have been intended to suggest universal features of grammar. 
Only rarely have linguists been interested in characterizing the changes in a particular form of discourse as it continues through time, or tracing the features of a form of discourse as it passes from one language to another. These latter interests have been more evident in the work of folklorists and students of comparative literature. The present paper is intended as a very small exercise in showing continuity of a form of discourse across a language boundary and through the history of a language.

This paper was originally published in W. J. Samarin, ed., Language in Religious Practice. Rowley, Mass.: Newbury House, 1976, reprinted by permission of Heinle and Heinle Publishers.

The form of discourse to be discussed is the traditional brief prayer, uttered by the minister on behalf of the congregation near the beginning of the mass, which generally sets the theme for the day or season being observed. This prayer, which is a characteristic of the Western Church, apparently first emerged in Latin sometime in the 3rd to 5th Centuries of the Christian era, the earliest known collection being the Leonine Sacramentary (named after Pope Leo the Great, 440-461 A.D.). From the earliest examples to the present day, this prayer, called simply "the prayer" (oratio) or "the collect" (collecta, collectio, origin disputed), has exhibited a very clear structure of form and content. A full collect has five parts: 1. an invocation, i.e., an address to God; 2. a "basis" for petition, i.e., some quality of God or some action attributed to him; 3. the petition or desire itself; 4. the purpose or reason for making the request, i.e., the good result which would follow the granting of the petition; and 5. a formulaic ending. Of these five parts, the second or the fourth is sometimes absent, and occasionally both are missing. This structure is represented by Formula One.

1. Collect → Invocation (+ Basis) + Petition (+ Purpose) + Ending

Details about the range of variation possible within each part and correlations among the form or content of the parts will be discussed only to the extent necessary for our purpose here. There is a sizeable literature on these matters and they are evidently amenable to rigorous formal analysis.1

The collects have had a long and complex history in the Roman Church, in some places or at some times being greatly expanded, but always the basic structure outlined above has been present, and in fact many collects have been retained without change since the Leonine Sacramentary. At the time of the Reformation those Churches which kept the main features of the mass, such as the Church of England and the Lutheran Churches of the continent, translated and adapted the historic collects for use in vernacular services of worship. The first collection of these prayers to appear in English was in the Prayer Book of 1549, and many of them appear in the same form, apart from modernization of spelling and punctuation, in Anglican, Lutheran, and other English-language service books in use at the present time. Translations and adaptations of Reformation collects are also to be found in dozens of other languages, sometimes dating back to the 16th Century, often prepared in the 19th Century in connection with Protestant missionary efforts. In recent decades, movements of liturgical renewal have given rise to new collects and new variations of old collects in many different modern languages, for use by Christians of many denominational affiliations. In this chapter we will pay some attention to all three major layers of collects: Latin, Reformation, and contemporary.

Discourse Grammar

Formula One above could be regarded as a kind of phrase-structure rule specifying the base form of a collect. If we assume that the constituents are ordered as listed, we could then identify possible expansions and appropriate lexical categories, and we could devise transformational rules which would characterize the full range of acceptable orders and constructions which appear in collects.2 For example, in Latin collects the Invocation may consist of a single word, e.g., Deus (God) as in the collects for Christmas (midnight), Ash Wednesday, Epiphany, 1st, 2nd and 4th Sundays in Lent, etc., or it may be expanded in typical noun-phrase fashion, e.g., Omnipotens sempiterne Deus (Almighty everlasting God) as in the collects for Christmas (early), 2nd and 3rd Sundays after Epiphany, or Palm Sunday. The Invocation may be preceded by certain other elements outside the noun phrase, most commonly an imperative (sometimes followed by a pronominal object), less commonly a direct object with its modifiers (very rarely a direct object alone). Another preposeable element quaesumus (we beg, please!) is quite rare as the first word, but frequently appears in second position between the preposed imperative or direct object and the Invocation. As a kind of surface constraint, we may note that the maximum amount of material that may precede the noun of the Invocation seems to be direct object (or imperative) + quaesumus + Adjective, and, in this case, apparently only a single adjective occurs, not two or more:

Excita, Domine . . . (Stir-up, O-Lord) 2nd Sunday in Advent
Converte nos, Deus salutaris noster . . . (Change us, God salvation our) 1st Sunday in Lent
Ecclesiam tuam, Domine . . . (Church your, O-Lord) St. John
Concede, quaesumus, omnipotens Deus . . . (Grant, we-beg, Almighty God) Christmas
Preces nostras, quaesumus, Domine . . . (Prayers our, we-beg, O-Lord) Quinquagesima
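The reading of Formula One as a phrase-structure rule can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names FORMULA_ONE, expansions, and is_well_formed are hypothetical and not part of the original analysis, and the sketch covers only the base order of constituents, not the transformational rules or surface constraints (such as preposed imperatives or quaesumus) mentioned above.

```python
from itertools import product

# Formula One: Invocation (+ Basis) + Petition (+ Purpose) + Ending.
# Each entry is (constituent name, obligatory?), listed in base order.
# These names and this encoding are illustrative assumptions.
FORMULA_ONE = [
    ("Invocation", True),
    ("Basis", False),
    ("Petition", True),
    ("Purpose", False),
    ("Ending", True),
]


def expansions(rule=FORMULA_ONE):
    """Return every constituent sequence the rule licenses, in base order."""
    optional = [name for name, obligatory in rule if not obligatory]
    results = []
    # Try every combination of including or omitting the optional parts.
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        results.append(
            [name for name, obligatory in rule if obligatory or included[name]]
        )
    return results


def is_well_formed(sequence, rule=FORMULA_ONE):
    """Check whether a sequence of constituent labels matches the base form."""
    return list(sequence) in expansions(rule)


if __name__ == "__main__":
    for seq in expansions():
        print(" + ".join(seq))
```

With two optional constituents the rule licenses four base forms, matching the statement that the Basis or the Purpose is sometimes absent and occasionally both are missing. A fuller treatment would add the lexical categories and reordering rules that the text describes for Latin.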

In spite of the fact that indefinitely many different collects may be composed, it is abundantly clear that there are severe, systematic constraints on order and construction, and the form of the collect could probably be captured by some kind of sentence grammar. It is not immediately apparent, however, where this particular sentence grammar belongs in a full grammar of Latin sentences, since some of its rules would be closely related to, or even subsumed under, rules of very general applicability, while others would be limited to related forms of discourse (e.g., prayers in general) or only valid for the collect itself. The collect poses one of the dilemmas of
the sentence-vs.-discourse issue in a very clear way. Theoreticians have tended to see the function of grammar writing as the characterization or generation of sentences, and discourse analysis as concerned with utterances larger than sentences. Elsewhere (Ferguson 1967) I have given a clear example of grammaticality of the typical sentence-syntax sort which goes beyond a single sentence. Here I draw attention to the piece of discourse analysis which is limited to single sentences. Sentencehood is not the crucial differentiator between grammar in the usual sense and discourse "grammar," or else the differentiation itself is questionable.3

Transfer Grammar

When Archbishop Cranmer and his associates undertook to render the Latin collects into English equivalents for use in public worship, they faced all the usual problems of the translator. The two languages, like any pair of languages, differed in phonology, syntax, and lexicon, and in the relations between semantic value and the respective linguistic systems. The whole question of the commensurateness of linguistic systems has been discussed from many points of view (contrastive analysis, machine translation, language universals), and it will not be treated here. Also, of course, the existence of doctrinal differences interfered with direct translation. But in this section let us examine differences which are neither narrowly linguistic nor theological.4

One important difference between the Latin collects and Reformation English collects is the question of style, or perhaps better, register. What special features of the respective languages are appropriate for use in public prayers as opposed to other uses of language? Some stylistic features are common to both, such as the use of paronomasia or word play in which, for example, the stem or root appears in several places with different endings or an affix is echoed with different stems. This will of course differ in detail: for example, the Early-Modern English (EME) prayer register had more alliterative repetition while Latin had more complex interlocking patterns of word order. Also, a word play in one language often cannot be reproduced in the other simply because of the lexical and phonological differences. The feature which we will examine here is the EME practice of word pairing, in which two synonymous words or phrases are used in direct coordination or in a parallel or chiastic position apparently to express a single notion. This is part of a more general registral difference of conciseness vs. elaboration in the Latin and EME prayer languages.
The Roman collect, like some other parts of the liturgy, tended to be very terse: a minimum of words tightly arranged with clause-final cadences (cursus); the Reformation English collect, like the vernacular devotional literature on which it was based, was free and expansive, somewhat wordy with a rhythmic flow involving sequences of short and long words. Latin collect style also had characteristic word pairs, but they were fewer in number and more often antitheses rather than synonyms. The word pairing which characterizes Reformation-English collects and prayer-book style in general may have originated in the pairing of Romance loanwords with the Anglo-Saxon glosses attested for Middle English, but in any case it was a striking feature of elaboration compared to the Latin originals. An example cited by Brook (1965:129-130) illustrates these registral differences (traditional collect for 1st Sunday after Epiphany, now replaced in Roman usage):

. . . ut et quae agenda sunt videant et ad implenda quae viderint convalescant


. . . that they may both perceive and know what things they ought to do and have grace and power faithfully to fulfil the same . . .

For word play: videant, viderint; faithfully, fulfil. For pairs of words: perceive and know; grace and power.

It is of some interest to note that this English pairing, whatever its origin, is very similar to the word pairing found in the poetic portions of the Old Testament, particularly the Psalms,5 and in all likelihood the two kinds of pairing were mutually reinforcing in the development of "liturgical English." What is of particular interest in this chapter, however, is where the description of registral features belongs in a grammar. Registral and stylistic features may cut across the various components of a grammar (phonology, syntax, lexicon) in complex ways, but clearly the identification of registers begins very early in child language development (cf. Weeks 1970) and some of the fundamental registral differences may be universally present in the linguistic competence of the members of a speech community. In any case, we need to try experimental versions of registral "grammars" either in the form of tagging appropriate elements and rules of a conventional grammar or in the direct formulation of registral constraints and regularities.

Diachronic Grammar

Collects have been translated into English or composed directly in English since medieval times, and English collects are still being produced at the present time. As mentioned above, the two great periods of English collect composition are the 16th Century and the 20th. During the course of the four centuries between these periods, the English language changed in many respects, not only in linguistically well-recognized ways such as phonology or syntax but also in the characteristics and distribution of registers and styles. As an example of more narrowly linguistic change, we may note that second-person relative clauses, which were perfectly acceptable in 16th-Century English, now are marginal or nonexistent for most speakers of English. As an example of registral or stylistic change, we may note that the current preference in prayers is for greater simplicity, in the sense of fewer words and fewer subordinate clauses. The commonest way of expressing the Basis component of the collect in Latin is a relative clause dependent on the noun of the Invocation. For example, the collect for Epiphany begins:

Deus qui hodierna die unigenitum tuum gentibus stella duce revelasti . . .

which was translated into EME:

O God, who on this day by the leading of a star didst reveal thine only-begotten Son to the Gentiles, . . .

In modern English this kind of clause is at the very margins of acceptability. When second-person relative clauses appear in contemporary liturgical texts they often have third-person agreement (Almighty God, who knows us to be . . .); usually they are avoided altogether. A contemporary Roman Catholic version of the Epiphany collect reads: "O God, on this day you revealed your Son to the peoples of the earth through the guidance of a star . . ." The syntactic change eliminating such second-person clauses is relatively unimportant in the total historical syntax of English, but it has great effect on the collect as a form of discourse.6 In the first place, it often results in breaking the collect into something more than one sentence, and, in some cases, it is probably the direct cause of the elimination or revision of the Basis in a collect. Some contemporary versions of collects have removed the Basis from the collect and set it separately as a kind of introductory statement or "bid" to prayer preceding the collect itself. This change is chiefly due to the unacceptability of second-person relative clauses. Thus, two experimental versions of a new collect for the Thursday after Ash Wednesday begin:

(1) Lord, our God, You walk before us and give us guidance in everything we do. Stay with us, and be our . . .
(2) Let us pray to the Lord, our God, who walks before us and gives us guidance in everything we do. Stay with us, and be our . . .

Once a new format with preceding bid is attempted, it then becomes possible to set the Purpose in that position rather than the Basis, as is done in some contemporary versions, and this move toward a major alteration in the basic structure of the collect was apparently triggered by the syntactic change.
The general preference for simplicity, interestingly enough, tends to take contemporary English collects nearer to the original Latin forms in wordiness and total length although the grammatical structure of English is still a bar to the kind of tight arrangement typical of the Latin style. The use of word pairing, however, has become so much a part of liturgical English that it frequently persists in contemporary collects, and the simplification comes more from omission of adjectives and adverbs and the dropping of other elements such as the resumptive "the same" (through the same thy Son, Jesus Christ our Lord, . . . ) in the ending formula for collects in which Christ has been mentioned.7

The writing of diachronic grammar is often viewed as the specifying of successive stages, and this may be done by providing relatively static grammars and statements of relationship between them or by giving a baseline grammar and succession of changes. In writing a diachronic "grammar" of the English collect, both these procedures would be possible and instructive, but at least one other kind of grammar would be of obvious value: a characterization of the structure common to all periods. Such a grammar would serve to place the English collect within the universe of all collects and at the same time specify its differences from non-English collects. Perhaps the fundamental problem here, as in all treatments of variation, is how to recognize and present the significant equivalences. At one level, the second-person relative clause of EME is equivalent to the contemporary independent declarative sentence, but at another level it is different and the EME structure is closer to the Latin. The attempt to write a diachronic grammar of a small form of discourse such as the collect might be revealing and instructive for the writing of larger-scale diachronic grammar, whether of the sentence type, discourse type, or more pronounced sociolinguistic focus.

Locus of the Grammar

In each of the preceding sections we have commented on structural regularities which could be presented in the form of a grammar, but in each case the locus of the grammar has not been a homogeneous, full, natural language. It has been a form of discourse in a particular language, a form of discourse in two languages, or a form of discourse at different stages of the same language. Indeed, a very natural object-language for a grammar would be the class of all possible collects in any language of Christian worship at any time from the 3rd to 20th Centuries. The definition of this "language" is hardly more problematic than the definition of a homogeneous, full, natural language, and the latter as well requires a sociological component.

Among the many variables of interest are code (i.e., which language or language variety is used), time period (e.g., century), doctrine, day or occasion, liturgical setting. We have already seen how different languages and different historical periods would correlate with differences in collect grammars. Differences in doctrine explain, for example, the radical shift in collects for saints' days between Latin collects and Reformation collects in any language, all reference to intercession of saints being removed from the latter. Differences in day or season account for such general textual differences as the alleluia qualities of collects in the Easter season, and are, of course, a crucial factor in determining the wording of particular collects. Differences in liturgical setting refer to the relation of the collect to other variable parts of the liturgy, to the formulaic expressions which precede and follow collects, to whether the collect is intoned or spoken by the minister or read aloud in unison by the congregation, and many other similar phenomena. Differences in this realm are also related to variation in the actual texts of collects.
In the point of view adopted here, much of the basic grammar of the collect is independent of the variables given above, but the range of variation to be covered by the total grammar must be in terms of variables such as these. Perhaps the most novel point for the linguist is the notion that most of the basic grammar and even some fairly superficial details of this form of discourse are essentially language independent. For the collect as a form of discourse we might start with a definition of the sociolinguistic setting (assembly of Christians, leader, theme-prayer of day or season, etc.) and then construct a grammar and a set of conditions under which specified variation takes place. In her important early paper on sociolinguistic variables, Ervin-Tripp noted: "One of the major problems for sociolinguists will be the discovery of independent and reliable methods for defining settings" (Ervin-Tripp 1964). This problem is just as difficult as ever, but it still holds as much promise now as it did then. The present study has merely suggested it, giving a simple example, and has suggested that sociolinguistically located grammar, although at present far beyond our techniques of analysis and presentation, must be an ultimate goal for linguists who see themselves as students of language in society.

10 The Structure and Use of Politeness Formulas

The purpose of this paper1 is to examine with some care the little snippets of ritual used in everyday encounters between people, expressions like good morning, or thank you, or God bless you said when someone sneezes, or bye-bye said to an infant by a departing guest. All human speech communities have such formulas, although their character and the incidence of their use may vary enormously from one society to another. Strangely enough this universal phenomenon has been very little studied by linguists or anthropologists or other students of human behaviour. These politeness formulas (as I call them) are, in the words of Erving Goffman, 'among the most conventionalized and perfunctory doings we engage in and traditionally have been treated by students of modern society as part of the dust of social activity, empty and trivial' (Goffman 1971: 90). Goffman, in his intentionally irritating way, seems to attribute our failure to study these interpersonal rituals to the general decline of religion in modern times. He says, 'Only our secular view of society prevents us from appreciating the ubiquitousness and strategy of their location, and, in turn, their role in social organization' (89). Without in any way accepting his explanation for the dearth of systematic study of politeness formulas, I join him in bewailing it, and find in his works some of the most insightful treatments of them.2

Given the present popularity of the kind of investigation of animal behaviour which goes under the name of 'ethology' and the propensity on the part of a number of reputable scientists and sometimes less reputable popularizers of ethology to explain human behaviour by reference to the activities of geese, stickleback fish, chacma baboons, and so on, it is even more surprising that no one has attempted to spell out in detail the biological substrate of the universal human exchange of politeness formulas.
I say this because it seems to me fairly plausible that this human phenomenon is related phyletically to the bowings and touchings and well-described display phenomena of other species.3 Yet the tendency has been to point to 'greeting' behaviour of numerous animal species and then jump to interpretations of human religion, esthetics, and philosophy rather than proceed to systematic study of apparently simpler and more obviously related human behaviour. I am thinking, in particular, of the stimulating and instructive discussion on the Ritualization of Behaviour in Animals and Men conducted by Julian Huxley under the auspices of the Royal Society of London ten years ago (Huxley 1966). The biologists provided excellent accounts of ritualization and its presumed evolutionary advantages, and the social scientists discussed topics ranging from mother-child interaction to the ritualization of international relations, but there was a noticeable gap in that no one provided accounts of observed interpersonal rituals of the politeness formula sort. It is surely a matter of interest that just as all human societies apparently use ritualized, non-verbal signals, they also all have verbal ones, politeness formulas, which are used in conjunction with the non-verbal ones but are yet related in linguistic structure to the language as a whole which is used by the community.

Coming from another direction, it is also surprising that the intense interest among linguists and psychologists in the innate aspects of human linguistic competence, innate devices for acquiring language, and the like, has not led toward consideration of possible innate predispositions to the use of interjections and ritualized exchanges in which a given formula triggers an automatic response. Interest has focused more on innate grammatical relations, innate grammar evaluation mechanisms, innate representation of the principles of universal grammar. Little interest has been shown in how such a complex and delicate capacity could have evolved in our hominid or pre-hominid predecessors. One place to look might well be the universally operative phenomenon of politeness formulas.

This paper was originally published in Language in Society 5:137-51 (1976), reprinted by permission of Cambridge University Press.
For those neurolinguists who are interested in localization of brain functions, it is worth noting the evidence that aphasics with lesions in the left hemisphere who have trouble with speech in general may use ritualized politeness formulas, like the related hesitation forms, non-referential introducers and conventionalized interjections, with impressive fluency. I am not, of course, seriously suggesting that human language had its origin in politeness formulas4 (although that would be no sillier than other hypotheses that have been taken seriously), but I would like to persuade reluctant students of language, of whatever disciplinary or theoretical orientations, that politeness formulas deserve their attention.

At present most accounts of politeness formulas are probably appendices or short chapters in grammars, although accounts may also be found in ethnographies and in guidebooks for travellers, officials, and missionaries. The accounts in grammars are usually limited to lists of formulas, with the briefest indication of their use and a sentence or two which says how important the formulas are in dealing with the 'natives'. As a typical example let me cite Banfield and Macintyre's Grammar of the Nupe Language. It contains lists of over 50 formulas and appropriate replies (108-13), and its 'sentence of importance' reads: 'It is a very important subject, as salutations, etc., play a very large part in native life and customs, and the foreigner who can make the customary polite inquiries and return the proper answers to such inquiries, will hold a high position in the estimation of the people with whom he comes in contact' (108). Social scientists are only
now beginning to recognize the importance and complexity of the use of politeness formulas in modern Western conversation and their usefulness to foreigners learning them, as ethnomethodologists and others subject American and English patterns to close scrutiny (e.g. Schegloff 1968, Schegloff & Sacks, 1973). Very rarely do we have a straightforward account 'of native customs' which gives exact texts of the formulas and appropriate conditions of response (an example is Mercier's study of Moroccan politeness). Even rarer is a formal analysis such as Irvine's grammar of Wolof greetings (Irvine 1974) or a discursive sociolinguistic analysis such as Apte's description of expressions of gratitude in Marathi and Hindi (Apte 1974). Studies which attempt comparative and general theoretical treatment of politeness formulas are usually limited to discussion of greetings, and I find it of some passing interest that four of the five important studies I am familiar with are themselves examples of ritualization. J. Huxley, R. Firth, E. Goody, and E. Goffman all are constrained to explain greetings in terms of a three-fold function, or threefold condition of occurrence. I doubt that these authors are Trinitarian in their theology, but they all seem resolutely trinitarian in scholarly explanation, even though their respective trinities do not match very well. Huxley says that the three functions of ritualization are to improve the signal and therefore communication, to reduce intra-specific damage, and to strengthen sexual and social bonding. Firth sees three 'major social themes' of greetings and farewells: attention-production, identification, and reduction of anxiety in social contact. Goody finds three 'general functions' attached to greeting: to open a sequence of communicative acts, to define and affirm identify and rank, and to manipulate a relationship to achieve a specific result. 
Goffman claims that there are three 'general circumstances' in which supportive interchanges, as he calls them, take place: business, accident and ceremony, i.e. people are in contact because of other things they have to do, or by chance being in the same area, or deliberately for the purpose of one or both of the individuals to perform a supportive ritual. The fifth, less ritualized, study by Hilary Callan (Callan 1970, Chapter 7) is perhaps closest to the viewpoint adopted here, but without the focus on language analysis.

Use of Formulas

Let me introduce the topic by two personal experiences, one which convinced me of the importance of politeness formulas and one which demonstrated that at least some of the dependencies among adjacent formulas have the nature of productive syntactic rules familiar from elsewhere in grammars. The first was an informal experiment (if I dare use that word) which I conducted many years ago with my secretary at that time. To see what the result would be, I simply did not reply verbally to her good morning. Instead I smiled in a friendly way and through the rest of the day behaved as usual. The next morning I did the same thing. That second day was full of tension. I got strange looks not only from the secretary but from several others on the staff, and there was a definite air of 'What's the matter with Ferguson?' I abandoned the experiment on the third day because I was afraid


Register and Genre

of the explosion and possible lasting consequences. Of course it might not have been as serious as for the distracted night heron that Lorenz reports forgot to make his bow of greeting at the nest and was attacked by his own young (Lorenz 1937), or an unfortunate female gentoo penguin who neglects to bow in greeting to her mate when he is defending their territory (Roberts 1940). But it was serious. The importance of our trivial, muttered, more-or-less automatic polite phrases becomes clear when they are omitted or not acknowledged.[5] The other experience was more complicated. I have recorded it elsewhere (Ferguson 1973), but it is worth repeating in this context. I was buying an article of clothing in an Arab market in Jerusalem years ago at a time when I was beginning my study of the Arabic spoken in that area. A passerby stopped to watch and enjoy the bargaining process, and when the purchase was completed he said to me mabruk. I did not know that formula, which is normally addressed to the owner of a new possession such as clothing, car, or house, but clearly some response was in order. Now Arabic has a sizeable number of what have been called 'root-echo responses' (Ferguson 1967), formulas each of which is an appropriate response to the occurrence of a particular triconsonantal root such as slm or ʿfw in the preceding formula. In my limited experience with Syrian Arabic, I had learned that an appropriate response to id mbarak 'happy holiday' (lit. 'blessed holiday') was 'alla ybarik fik 'God bless you' and I had also heard this response to the expression hallit lbarake said about someone whose fruit trees were producing well. Probably 'alla ybarik fik was the root-echo response to brk. I tried it, and the smile showed I had given the right reply. The whole analysis took only a split second, and was just like getting an instance of grammatical concord or case government right. 
It is a good example of the kinds of 'rules' which govern the use of politeness formulas, although in detail it may be quite unlike any 'rules' for formulas in English. The greeting good morning is an excellent, uncomplicated example of a politeness formula. It is highly stereotyped and can be altered only with the definite recognition on the part of the speaker and hearer that it is being altered for some special effect. The adjective must be good, just as with birthday it is happy or with Christmas it is merry in American English. Substitutions of one of these adjectives for another in such formulas would mark intended humorous effect, or recognizable attempt to avoid a cliche, or the dialect of another part of the English-speaking world. Incidentally, good morning is in origin not an 'affirmation', as Firth identifies it, but a 'welfare-wish' may you have a good morning, but it can be treated as an affirmation and given a facetious response such as what's good about it? The formula good morning is one of a small closed set with good afternoon, good evening, and good night in structure, although the uses do not match exactly. It is interesting to watch the process of re-emergence of the formula good day which had almost completely vanished from American English. In the last decade it has become normal under certain conditions, such as picking up a hitch hiker, for the two persons in an encounter to use as their farewell exchange have a good day followed by you too. The occasions for use of good morning seem simple and obvious to the native
speaker of English, but in fact they are fairly complicated. It is said at a certain time of day; English, like Arabic or Gonja, is one of the languages which have salutations for different periods of the day, not one of the languages without this temporal variation, such as Bengali or Wolof. The appropriate time of day for good morning varies regionally (although not as much as for good evening which differs sharply in different parts of the United States) but generally is between waking up in the morning and the midday meal. It is said only on the first encounter of two people in the morning and is not repeated at subsequent encounters (as hi or hello can be, sometimes with the addition of again or a comment). Its use implies a certain degree of formality in the occasion, and hence it is not normally appropriate for two university students seeing each other for the first time that morning, walking from one class to another. And, of course, it can be used metaphorically, or by displacement, on a wrong occasion to point up a particular aspect of the encounter. So we can say good morning sarcastically to someone who oversleeps and wakes up in the middle of the afternoon, or to someone who comes home at 2 a.m. when expected the previous evening. All these appropriateness conditions must somehow be acquired by the native speaker of English and are correspondingly problems for the foreign learner of English. The corresponding Syrian Arabic formula sabah lxer is used roughly the same way as English good morning but with many differences in detail, especially in kinds of response. Let us, however, examine briefly an Arabic salutation whose occasions for use are highly problematic for the English speaker. In the general Syrian dialect area the formula naʿiman is part of the stock of polite salutations and its expected reply is the root echo alla yinʿam ʿalek. 
Careful observation of its use would show that it is addressed to someone who has just had a bath, a haircut, or a shave, or has just awakened from a nap. The ordinary adult speaker of the language uses these formulas without conscious rule-appreciation, and upon being asked usually cannot give right away the full list of appropriate occasions. Like good morning, naʿiman can be used metaphorically. For example, to a student who asks a question in class which has just been answered, the instructor might say naʿiman implying that the student was napping during the previous discussion, and the class would immediately appreciate the joke. The social anthropologist or culture historian who might want to find a link connecting the various occasions of use of naʿiman would probably find it in the institution of the public bath, and indeed the isogloss which marks the distribution of this formula and its counterparts in the Arabic-speaking world and other Middle Eastern speech communities probably reflects the wide distribution and popularity of that institution. In Morocco the corresponding formula bsahht k or b ssahha uraha 'by your health' or 'by your health and relaxation' is apparently still used after another scene of the bath, bloodletting (Mercier 1957). The point to be made here is the old, familiar anthropological and linguistic one that although a particular phenomenon is universal in human societies (in this instance the phenomenon of exchange of politeness formulas), the structures and incidence of use are so culture-specific and tied to the cultural history of the particular society or group that the structural or functional universals must be sought at other levels.


Patterns of Response

In both English and Syrian Arabic an appropriate response to good morning is an exact copy, with sometimes a slight change in intonation. This kind of full echo response is so common in formula exchanges that one is tempted to claim its universality, or at least the high probability that every speech community has at least some politeness formulas which have this kind of response. Another widespread form of response is the simple acknowledgement by an affirmative or interrogative word which says in effect 'formula noted, your turn now' without repeating or modifying the initiating formula. An English example is the yes? response to the summons use of hello (Goffman 1971: 104). Of greater interest to the linguist are modified echo responses, in which there is addition or deletion or permutation or the ringing of paradigmatic changes of some kind. The Arabic root-echo response already mentioned is an example. A simpler one is the well-known response to assalamu alaykum 'peace be on you' which is wa alaykumu ssalam 'and on you be peace'. Finally, many speech communities have one or more generalized or 'general purpose' responses appropriate for a number of different initiating formulas with or without an echo component. Thus, apparently, any of the numerous topical salutations of Gonja may have the response awo 'it is cool' and for many of them the response alanfia 'good health' is appropriate. Syrian Arabic has a general purpose response alla yihfazak 'God keep you' which seems to be appropriate whenever there is no particular specified response, or as an addition to or further response to another welfare-wish with God as the subject. (This formula does, however, also have its own realm of specific appropriateness. For example, if someone's children are mentioned and the appropriate formula is used alla yxallilak (or -lo, etc.) yahun 'God keep them for you' (or 'for him', etc.) 
the use of alla yihfazak seems to be the prescribed response rather than a generalized use.) The Nupe salutations and greetings as reported by Banfield & Macintyre illustrate all these types of response: some formulas take just simple acknowledgement of 'yes' (hin, eba or to); others are echoed, e.g. formulas consisting of or beginning with oka 'I greet you (for)' have oku in the response; and there is a general purpose reply to all welfare wishes involving God, namely ami 'amen' (op. cit.). Many Arabic exchanges of greetings follow the simple principle of 'the same or more so'. Thus the common informal 'hello' of Syrian Arabic is marhaba (original meaning 'welcome'), and the responses most often heard are, in descending order of frequency marhaba, marhabten, mit marhaba, and mardhib, i.e. 'hello, two hellos, a hundred hellos, hellos'. This principle of response received endorsement in the Holy Koran itself, which says, in effect, (Surah IV, verse 86) 'If someone greets you, either return the greeting or greet him better, for God takes everything into account'. We must, of course, remind ourselves that responses according to this principle are just as conventionalized as other formulas. My occasional attempts to reply with '10 hellos' or '200 hellos' or '1000 hellos' were met with amused smiles or irritation. Unlike our English good morning, the Syrian Arabic good morning has an array of responses exemplifying the 'same or

The Structure and Use of Politeness Formulas


more so' principle. Two fairly common responses are mit ssbah '100 mornings' (sc. 'good') and sabah lxerat, the plural of the formula. Other good mornings which are appropriate as responses but not as initiators and are presumably felt to be stronger include: sabah nnur 'morning of light' which is one of the commonest of all responses, and mornings of flowers of various kinds, used between familiars as expressions of good will and humor, e.g. sabah lward 'morning of roses' and sabah lfull 'morning of jasmine'. Incidentally, the 'same or more so' principle is just as evident among Christian Arabs as Muslims, and many people are not aware of the Koranic reference. It seems likely that this greeting principle was already in existence in Semitic languages at the time of Muhammad, but it would be interesting to trace its spread in relation to the spread of Islam and the Arabic language. The 'same or more so' principle is distinct from the 'you-too' kind of response which is familiar to us and used in many speech communities. Syrian Arabic examples[6] are sallim idek '(God) keep your hands', w'idek 'and your hands' which are thanks and response in connection with a favor or act of service done with the hands, and the salutation and response used at several annual festivals such as New Year's or Christian Easter: kull sane w inte salim 'all year and you in sound health', w inte salim 'and you in sound health'.
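
The response types surveyed in this section can be summarized, very schematically, as a lookup. The following Python sketch is an illustration under stated assumptions, not an analysis: the type names, the tables, and the function classify are hypothetical, the transliterations are given without diacritics, and only a handful of the formulas cited above are included. It also reflects the observation that 'same or more so' replies form a closed conventional set rather than a productive rule.

```python
from enum import Enum
from typing import Optional


class ResponseType(Enum):
    FULL_ECHO = "full echo"
    ACKNOWLEDGEMENT = "simple acknowledgement"
    MODIFIED_ECHO = "modified echo"
    GENERAL_PURPOSE = "general purpose"


# Hypothetical, deliberately tiny tables built from examples in the text.
FULL_ECHO_FORMULAS = {"good morning", "marhaba"}
ACKNOWLEDGEMENTS = {("hello", "yes?")}
MODIFIED_ECHOES = {
    ("assalamu alaykum", "wa alaykumu ssalam"),
    ("marhaba", "marhabten"),    # 'two hellos'
    ("marhaba", "mit marhaba"),  # 'a hundred hellos'
}
# Simplification: a general purpose response is accepted after any initiator.
GENERAL_PURPOSE = {"awo", "alla yihfazak"}


def classify(initiator: str, response: str) -> Optional[ResponseType]:
    """Return the response type, or None if the pair is not conventional here."""
    if response == initiator and initiator in FULL_ECHO_FORMULAS:
        return ResponseType.FULL_ECHO
    if (initiator, response) in ACKNOWLEDGEMENTS:
        return ResponseType.ACKNOWLEDGEMENT
    if (initiator, response) in MODIFIED_ECHOES:
        return ResponseType.MODIFIED_ECHO
    if response in GENERAL_PURPOSE:
        return ResponseType.GENERAL_PURPOSE
    return None


print(classify("good morning", "good morning"))
print(classify("marhaba", "mit marhaba"))
print(classify("marhaba", "10 hellos"))  # an improvised, unlisted reply
```

An improvised reply such as '10 hellos' falls outside every table, which is the sketch's way of saying that it is not one of the conventionalized responses.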

Variation

The kinds of variation within languages and across languages which have been illustrated raise the fundamental question of the general conditions and general types of variation found. A great deal of variation is in the nature of fuller forms vs. shorter forms, and Goffman has suggested the concept of 'attenuation rules' for analysis of this dimension of variation. He talks about the abbreviated forms of 'passing greetings' and other kinds of reduced exchanges. This concept, picked up by others such as Irvine, is useful but it is misleading to the extent that it implies a norm of full forms and various kinds of attenuation depending on specifiable conditions. A conceptual framework which allows both elaboration and reduction from the norm seems more productive. The familiar Arabic (Islamic) greeting assalamu alaykum has as its normal response, as we have seen, wa alaykumu ssalam. In some instances, however, a fuller form may be used such as wa alaykumu ssalam warahmatullah 'and on you peace and the mercy of God' but this is recognized as being an elaboration. And in Morocco, at least, an attenuated reply wa alaykum 'and on you' is attested, for example in reply to a non-Muslim who has (either in ignorance or deliberately) used assalamu alaykum in greeting (Mercier 1957). In general the structure of politeness formulas varies in constituency and intensity in correlation with a number of social dimensions. At least four of these dimensions are operative also in 'greeting' behaviour of other animals, especially birds and primates, and seem likely to be universal in human societies (cf. Callan 1970, 117-22; Goodall 1971, 239-40; Irvine 1974, 168-70).


length of time elapsed since previous encounter
distance between communicators
number of individuals in the relevant groups
relative social status of the communicators

The nature and amount of the variation is not predictable in any universal sense. For example, in some human societies the superior initiates the greeting (e.g. Moroccan Arabic), in others the inferior does so (e.g. Gonja), and in still others the social dominance differentiation is more complex (e.g. American English), but what is universal is the correlation between structure of formula and the social (or sociotemporal, sociospatial) dimensions. The relationship between formulas and social status may, of course, be viewed in either direction—that the social status is naturally reflected in the use of politeness formulas or that the function of the formulas is to mark the social status; certainly in some accounts of animal behavior the 'greeting' behavior seems to be the chief defining characteristic of the social hierarchy which is attributed to the group. Goffman and Firth in their somewhat different ways both feel that greetings and farewells constitute a natural unit and should be considered together. Firth defines greeting and parting behavior in their social sense as 'recognition of an encounter . . . as socially acceptable' and the 'recognition that the encounter has been acceptable' respectively (Firth 1972, 1). Goffman expresses the view that 'greetings mark the transition to a condition of increased access and farewells to state of decreased access' and is able to include them both in his definition of 'access rituals' which 'mark a change in degree of access' (Goffman 1971: 107). If the focus of the investigation is on the encounter itself it is certainly legitimate to investigate the behavioral brackets around it as similar and related phenomena, but this intended natural classification obscures the relation which is sometimes even closer between one or the other of these and other politeness formulas which are used in the course of encounters or even as expressive elements in monologs. Goody's study of Gonja gives ample testimony to this. 
In Gonja the same word choro must be translated as both 'greet' and 'thank' and includes verbal greetings, visits and other physical activities, and prestations. Certainly some of the structural features of formula exchange (internal as well as patterns of response and turn taking) are equally evident in greetings and thank yous, apologies, pardons, wishes for health, condolences, topical blessings, curses, and a host of other usages. It is true that of all these it is greeting behavior that has the clearest counterpart in animal behavior—even farewells are less well attested and problematic— and it would probably be a fruitful hypothesis to explore that all politeness formula use originated in 'greeting behavior' of some kind. The truth remains that we are too ignorant of the behavioral, specifically the linguistic, facts in this whole area to make an early claim about natural units. We need much more patient and careful description of the structure and use of politeness formulas in different communities and different languages. Before leaving this essentially synchronic, descriptive section I would like to mention the linguistic phenomena of embedding and replacement in formula constructions. What happens when one exchange must fit inside another by the
nature of the occasion? Or when two responses are both appropriate for different reasons? The phenomena are reminiscent of similar ones in sentence grammars and discourse analysis. If we took seriously the suggestion that human speech started with formulaic exchanges, we could see in these constructions the forerunners of substantial parts of syntax. The notion of embedding in politeness formulas is discussed by Goffman in connection with such occasions as taking one's leave at the end of a farewell party, where the exchange has to terminate the immediate occasion and also refer to the expected long-term absence, or parting after an encounter which includes an introduction, where the farewell must be preceded by an acknowledgement of the introduction. A nice example of embedding and the deletion or replacement of one formula by another that 'outranks' it may be seen in Syrian Arabic farewells which include wishes for recovery from sickness. The normal farewell exchange in Syrian Arabic is a triad of formulas A:B:C such that if A is said first, the addressee must reply with B, and the original speaker may (optionally) reply with C. If B is said first then C is normally obligatory. A is (b)xatrak 'by your leave', said by the person who is leaving; B is ma ssalame 'with peace', said by the one who is staying; C is 'alla ysallmak 'God keep you', a root-echo response to slm.[7] The normal exchange about recovery of health is mʿafa nsalla '(may you be) strengthened, God-willing', with the response 'alla yʿafik 'God strengthen you', the root-echo response to ʿfw. At a farewell where the recovery exchange is also appropriate, there are several ways of continuing the exchanges. Here are two examples:

I

Patient (leaving the office after receiving a prescription): mitsakkir ya daktor, xatrak. 'Thank you doctor. By your leave.'
Doctor: mʿafa nsalla, ma ssalame. 'Strengthened, God willing. With peace.'

II

Visitor: nsalla bitkun sahhet. 'God willing you will have recovered.'
Patient: mamnunak ktir. 'Thank you much.'
Visitor: mʿafa, xatrak. 'Strengthened! By your leave.'
Patient: 'alla yʿafik wyihfazak. 'God strengthen you and keep you.'

In the first exchange the farewell sequence is complete A:B, but the recovery wish goes without response; in the second exchange it is the farewell which is incomplete and the recovery wish which has its response. When the recovery wish is embedded within a farewell exchange, ma ssalame normally provides full closure and the response to the recovery wish is not said, as in I. In II, however, the patient apparently wants to express special appreciation of his friend's coming to see him, and he takes the option of replying to the recovery wish and adding the general purpose response 'alla yihfazak, which here has the effect of closing the farewell in place of the ma ssalame which is normally required. Such patterns of embedding and deletion can be represented by rules similar to those of intrasentence syntactic patterns, with the usual inclusion of optional rules whose selection is conditioned by social factors and communicative intent.
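
As a rough illustration only, the two examples can be checked against the exchange rules with a short Python sketch. The labels W (recovery wish), R (its root-echo response), and G (the general purpose response), together with all function names, are assumptions introduced here for convenience; only A, B, and C come from the description of the farewell triad. The division of example II into turns is likewise an assumed reading that follows the commentary above. The sketch records which formulas occur in each turn and asks whether an opening formula is answered by another speaker in a later turn.

```python
from typing import Dict, List, Tuple

# A turn is (speaker, formulas used in that turn). Labels are shorthand:
# A 'by your leave', B 'with peace', C 'God keep you' (the farewell triad);
# W recovery wish, R its root-echo response, G general purpose response.
# W, R, G and all function names are hypothetical conveniences.
Turn = Tuple[str, List[str]]


def is_plain_farewell(seq: List[str]) -> bool:
    """A:B, A:B:C (C optional after A:B), or B:C (C expected after initial B)."""
    return seq in (["A", "B"], ["A", "B", "C"], ["B", "C"])


def answered(turns: List[Turn], opener: str, reply: str) -> bool:
    """True if the first use of `opener` is followed, in a later turn
    by a different speaker, by `reply`."""
    for i, (speaker, formulas) in enumerate(turns):
        if opener in formulas:
            return any(other != speaker and reply in later
                       for other, later in turns[i + 1:])
    return False


def analyze(turns: List[Turn]) -> Dict[str, str]:
    """Report the status of the farewell and recovery exchanges."""
    if answered(turns, "A", "B"):
        farewell = "complete"
    elif answered(turns, "A", "G"):
        farewell = "closed by G in place of B"
    else:
        farewell = "incomplete"
    recovery = "answered" if answered(turns, "W", "R") else "unanswered"
    return {"farewell": farewell, "recovery": recovery}


example_1 = [("patient", ["thanks", "A"]), ("doctor", ["W", "B"])]
example_2 = [("visitor", ["wish"]), ("patient", ["thanks"]),
             ("visitor", ["W", "A"]), ("patient", ["R", "G"])]

print(analyze(example_1))
print(analyze(example_2))
```

The choice between B alone and R plus G is left to the caller, since the text attributes it to social factors and communicative intent rather than to the formulas themselves.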


Diachronic Considerations

Like any other special subsystem in language, politeness formulas should be examined diachronically to see whether they behave in special ways different from the main body of the language. The baby talk register, for example, which is a special subsystem in language, turns out to be unexpectedly conservative (key items may remain in use for millennia), and it shows specific features of diffusion and serves as a source for certain kinds of lexical items in the matrix language (Ferguson 1964, 1975; Byron 1968; Crawford 1970; Oswalt 1975). Politeness formulas have at least three diachronic characteristics of interest: weakening, archaism, and areal diffusion. Politeness formulas, in so far as they are non-referential in meaning and important for their presence or absence on the appropriate occasion rather than for the exact meaning carried by their constituent parts, are subject to the special weakenings (aphesis, contraction, erosion) which expressions of that type such as titles, asseverative particles and the like undergo (cf. Jespersen 1922: 266-8, 273, for discussion of 'extreme weakening' of greetings, etc., which he relates to weakenings in non-verbal greeting gestures). In the past forty years it has been possible to observe the weakening from How are you to Hi! in American English. At the first stage the full form alternated with such shortenings as [haway ] [hay ] until the second of these became the commonest form, represented in writing as Hiya!, and began to lose its connection with the original How are you. In the next stage Hiya! alternated with Hi! until the latter attained its present dominance and for most speakers has no relation to How are you, which is now used in different functions from its shortened form. 
On the other hand, politeness formulas, in so far as they constitute a folk-literature genre similar to proverbs, riddles, and nursery rhymes, tend to include archaic forms and constructions which have disappeared from ordinary conversational speech. Many Syrian Arabic politeness formulas are wholly or partially Classical Arabic in form, just as a considerable proportion (possibly 40%) of current Syrian Arabic proverbs are Classical. It must be noted that the contradiction between the tendency to extreme weakening and the tendency to archaism is only apparent. For example Classical Arabic in sa'a llahu 'if God wills' is continued as a Syrian Arabic formula preserving the conditional particle in which is rare in many varieties of Syrian Arabic and the verb sa'a 'wish' which is similarly marginal, yet the whole formula has been weakened to nsalla or nsalla. A parallel instance in English would be goodbye, said to come from an earlier form of God be with you; it preserves an archaic construction but is phonetically modified and eroded. The third feature of diachrony to be noted is the strong tendency for the structure and use of politeness formulas to diffuse with other elements of culture across language boundaries. Thus, for example, a striking number of Arabic greetings and thank you formulas have spread along with Islam to speech communities which have not shifted to Arabic. In South Asia, where patterns of politeness formulas are quite different from those in the Arab world, Muslim populations use Perso-Arabic formulas quite at variance with the usage of Hindu speakers of the same language. The best known example of such an Islamic formula is undoubtedly assalamu alaykum, which in one form or another has gone wherever Islam has gone. In the Middle East the diffusion of use as opposed to structure of politeness formulas is particularly clear. In Arabic, Persian, and Turkish there are often close counterparts in formula use even when the forms themselves are not borrowed. For example, European languages generally have one common expression for 'please' which may be used either for requesting a favor or offering a service, but Middle Eastern languages generally have two sharply different expressions corresponding to these two 'meanings'. In teaching Arabic to a European this distinction requires explanation, but in teaching Arabic to a non-Arabic-speaking Middle Easterner it is necessary only to give equivalents, e.g. Arabic tfaddal = Persian befarma'id = Turkish buyrunuz, all 'please' in the sense of offering service or special consideration to the addressee. The archaism and cross-language diffusion of politeness formulas may result in long persistence of a formula in a community despite substantial change in language or religion. Thus, for example, the exchange of blessing formulas in connection with good harvests (cf. p. 141 above) has probably persisted for well over two millennia in the Middle East, as attested by the final verse of Psalm 129, which says in reference to the wicked who will have no harvest, 'no one passing them will say, "The Lord's blessing on you!" - "We bless you in the name of the Lord" '.

Acquisition

Any discussion of language structure and use is incomplete if it does not include some account of acquisition both for the indirect evidence which ontogenesis offers on the structure itself and for the clues it may give to the understanding of diachronic processes leading to or from the patterns in question. Accordingly, let us ask how children, as they grow up, learn politeness formulas and how to use them. The recent study of Gleason and Weintraub (1975) on the acquisition of 'routines' offers an excellent start toward an answer to this question. The authors are interested in the general question of how politeness formulas and other ritualized routines are acquired, although the behavior they observed in detail was limited chiefly to the Trick or treat sequence used by American children at Hallowe'en as they go from house to house in costumes to collect gifts of candy and fruit. Gleason and Weintraub claim that routines are acquired differently from the rest of language in that they are explicitly taught by parents, who prompt their use with the markers Say and later What do you say? and who ask after the occasion What did you say? (Cf. also the conversation recorded in Sacks 1972, where a mother prompts her child to reply to a greeting). They point out that bye-bye, which is the earliest routine to be learned, may even be marked by Say when the child is too young to speak and is only expected to open and close its fist in a primitive motion of waving.[8] Firth refers to similar observations of Baganda children being drilled in the movements and gestures which accompany greetings and farewells even before they could speak (Firth 1972, 33).


Gleason and Weintraub note another feature of the acquisition of formulas, 'the general failure of adults to provide expatiations of expansions based upon them. An adult teaching a lexical item and a concept embeds it in a number of forms: See the doggie? That's a doggie. The doggie is eating his dinner. But bye-bye and other early routines, including politeness formulas (Thank you in particular) and greetings, do not spark any explanatory discussions' (Gleason & Weintraub 1975). Their point is that such routines have little internal structure or variability and little in the way of underlying cognitive structure compared with less ritualized speech and are to be learned as appropriate for a situation rather than to express a referential message. The point is doubtless well taken for such expressions as bye-bye or thank you, but one wonders how the patterns of the more complex interchanges of formulas are included. It is clear that only a beginning has been made in the developmental study of the structure and use of politeness formulas. The acquisition of politeness formulas is related to the general question of the role of unanalyzed units in language development, 'prefabricated routines' as Roger Brown has called them. Linguists and psycholinguists, in their concern to understand the astonishingly rapid acquisition of the complexities of phonological and grammatical systems and the creative aspects of language, have tended to neglect the role of phonologically or semantico-syntactically unanalyzed chunks which the child learns and uses in a kind of interim strategy until he gradually decomposes them into their constituents and frees these constituents for 'creative' recombination and extension. Kenji Hakuta in his study of a Japanese child acquiring English demonstrates the importance of this strategy in second language acquisition (Hakuta 1975: 19-50), but insufficient attention has been given to it in first language acquisition. 
Adult use of unanalyzed routines such as politeness formulas is evidence that for some parts of language this strategy remains available throughout the lifetime of the language user.9

Summary

This paper has claimed that the use of politeness formulas is universal in human societies and has documented the highly patterned nature of such formulas and their use in particular speech communities. It has also considered the way they change through time and the way they are acquired in the language socialization of the individual. It has not, however, attempted to construct a theory of politeness formulas or to find a place for them within some larger theory. In fact, theories of quite different kinds may draw upon the data of politeness formulas or serve as a framework for their description. Some observers relate them to general theories of ritual (e.g. Firth 1971), others to a general theory of politeness (e.g. Brown and Levinson 1974) or to universals of encounters and greetings (Youssouf, Grimshaw & Bird 1975); Goffman uses them to find the ground rules of his social order of public encounter. In particular, politeness formulas pose problems for linguists and ethnographers of communication. They constitute another set of facts about human language which the linguists must somehow fit into their theories of grammar or turn over to others who are better able to deal with them, and they constitute one more of the seemingly endless array of patterned uses of language which the ethnographers must describe, analyze and explain.

9. Discussions of phonologically unanalyzed word-shapes, their critical role in early speech development, and their apparent persistence in adult phonology may be found in Ferguson & Farwell 1975 and Ingram, forthcoming, Chapter 2.

11 Sports Announcer Talk: Syntactic Aspects of Register Variation

This paper was originally published in Language in Society 12:153-72 (1983) as one of a series of invited papers commemorating a decade of Language in Society. It is reprinted here by permission of Cambridge University Press.

Introduction

One or two sentences of a radio announcer's report of a game in progress will usually be sufficient to identify the particular kind of talk being used, different from all other kinds of radio talk, such as straight news, sermons, soap operas, or talk shows.1 One clue to the identification is the subject matter and the specialized vocabulary of the sport, but the same topics and lexicon may also be used in after-the-fact news broadcasts, editorials, or interviews on talk shows. A more distinctive clue is the prosodic pattern, i.e., the features of tempo, rhythm, loudness, intonation, and other characteristics of voice. This clue is so powerful2 that it can often serve to distinguish not only sports announcing from other radio talk but even baseball from football announcing when the segmental phonetic characteristics and, hence, the actual words of the broadcast are muffled or masked. In this paper, some attention will be paid to these lexical and prosodic clues, but the primary purpose is to explore a third clue, the syntactic differences between sports announcer talk (SAT) and other kinds of discourse, especially the mythical "normal" discourse variously referred to as vernacular, common core, unmarked, or ordinary adult conversation.

The analytic approach adopted here is that of register variation, although it is recognized that other approaches might also be instructive for investigating the phenomena of SAT. For example, a structural analysis of the genre or form of discourse of sports announcing could provide an account of the sportscast, which is indeed a highly structured and well recognized genre of contemporary mass media discourse. Such broadcasts start with background information about the game, the occasion, the teams, and so forth; conclude with interviews of players and coaches; and include components of direct reportage, comment, advertising commercials, and other elements in relatively fixed proportions and relatively fixed sequence. Subgenres of radio and television sportscasts and variation by different sports, by college vs. professional, and by other parameters are also distinctively patterned. The sportscast is a discourse genre as identifiable as the sonnet, the bread-and-butter letter, the knock-knock joke, the professional paper in linguistics, or any of the hundreds of such forms of discourse in the total repertoire of communities of users of English. No attempt at this kind of structural analysis will be made in this paper, beyond the identification of different components as the loci of syntactic structures. The primary aim is to characterize the language of sportscasting, not the genre of the sportscast. Such other approaches as speech act analysis or conversational analysis could also be instructive, although they are more obviously useful for discourse that is closer to interactive dialogue than the relatively monological, essentially narration-cum-interpretation nature of SAT. A rhetorical or pragmatic analysis might be helpful in explaining the possible origins of various aspects of sports announcer talk or some of the reactions of listeners to SAT but would not touch the basic conventionalization of the SAT register and the fact that features may be selected by the speaker as appropriate to the register and not necessarily in order to fulfil an immediate rhetorical function.3

The value of studying SAT lies in the increased understanding it may bring of language variation that is not dialectal (regional or social), not stratificational (e.g., formal-informal), and not based on different ways of saying the same thing ("free variation"). Register variation, in which language structure varies in accordance with the occasions of use, is all-pervasive in human language. Every utterance is situated in social context, and the form of the utterance represents a choice on the part of the speaker or writer as to the nature of that context. Even the choice to speak as neutrally as possible, or to be the least tied to a particular context, is a significant determinant of register variation. "To speak at all is to choose a register which will index the moment" (Haviland 1979:389). The understanding of how register variation is shared and conventionalized, how it is transmitted and acquired, and how it changes through time is at least as fundamental in understanding the phenomenon of human language as the understanding of how the phonological-syntactic-semantic systems of speech communities are conventionalized, acquired, and changed. To put this in more concrete terms: What are the ways sports announcers are supposed to talk, so that members of the speech community will notice whether they do it well or badly? How widespread is knowledge of the markers of this kind of talk? How do people learn to do it? How did this way of talking develop? What relation does it have to other ways of talking? Can it spread from one speech community to another, even across language boundaries? How does this way of talking in American English compare with similar ways of talking in other languages and other speech communities?

The primary data for this study consist of thirteen recordings of segments of American baseball and football games and two recordings of Japanese baseball games, totalling over ten hours of sportscasting, made in autumn 1981 by members of a sociolinguistic seminar. In addition, two published texts of British sportscasts made in 1969 were consulted, one (association) football, the other boat racing (Scheffer 1975). Finally, members of the seminar drew on their own knowledge and intuitions of American and Japanese sportscasting, sports reporting in newspapers, and informal conversation on sports.



Locating the Register

General discussions of register variation, especially those with some claim to the construction of a theory or model, usually provide a taxonomic grid of several major dimensions or parameters, such as field, mode, participants, tenor, and so forth (e.g., Leech 1966, Gregory 1967, Halliday 1968, Ellis & Ure 1969). Some studies of register variation are content to label a particular register or set of registers (e.g., Steel 1971, Henzl 1974, Ferguson 1977). Many register analysts, including some of the authors cited, find these two approaches unsatisfactory. The parameter approach often fails when applied to particular cases, because the situational or functional characteristics that correlate with structural differences in the language may cut across, lie outside, or be nested within the proposed parameters. The label approach is unsatisfactory because it offers no general framework for the total pattern of register variation in a single language or speech community or for comparisons between languages or communities or for "universals" of register variation.

The approach followed here is to try to "locate" a presumed register by identifying situational or functional features that seem to characterize a recognizable kind of language. The preliminary location is then refined by repeatedly checking the characterizing features to find evidence that they are linguistically determinative, not only of the register to be analyzed, but also of other related varieties that share linguistic features, and for identifying subvariation within the register. As the location becomes clearer, the linguistic description can become more precise. The description may be focused on the targeted register (or something like it, if the locating process alters the analyst's views) or on lines of register variation along which structures and uses covary in patterned ways. 
The register description may then be further utilized for diachronic, comparative, acquisitional, or pathological studies.

As a first approximation, sportscasting is the oral reporting of an ongoing activity, combined with provision of background information and interpretation. This location differentiates the presumed register from such other related varieties as the oral reporting of completed activities or the written reporting of either ongoing or completed activities. Even a brief inspection of such varieties as radio news reports after a game and newspaper accounts of the same game gives confirmatory evidence that there are systematic linguistic differences between those and the sportscasting that has been targeted. It should be noted that the very considerable amount of discourse analysis of narration is primarily focused on reporting after the fact (or fictive narration in which the events never took place) rather than on reporting events taking place at the time of the discourse, and, accordingly, the various schemata, story grammars, and the like available in the research literature are not helpful here. The two phases of discourse in this register, the announcing and the commentary, are characterized by somewhat different linguistic features and are even recognized in the folk taxonomy, as the "play-by-play" and the "color commentary."

As a second attempt at location, sportscasting is a monolog or dialog-on-stage directed at an unknown, unseen, heterogeneous mass audience who voluntarily choose to listen, do not see the activity being reported, and provide no feedback to the speaker. This location differentiates the register from such other related varieties as television sportscasting, the reporting of a game in progress to a blind friend, or the patter of the announcer at a circus, all of which would be included in the first approximation. A quick inspection of such texts shows systematic linguistic differences among them.4 The audience specification here identifies fairly well the typical addressees of the mass media, but it is difficult to find linguistic characteristics corresponding to such a broad specification, and in spite of common assumptions to the contrary, there may not be an identifiable overall register for the language of the mass media. The allocation of the two phases (announcing and commentary) is related to the choice between monolog and dialog: if there is only one announcer he does both, but if there are two, one typically does the announcing and the other the commentary, with interesting boundary phenomena between the two roles and the two kinds of talk.5 Incidentally, the lack of listener feedback may have been the original source of the avoidance of silence in sportscasting and other kinds of radio talk. The announcer's maximum stretch of silence is very short and the time between moves in the reported activity must be filled with speech, the counterpart of ordinary conversation where the addressee is expected to emit signals of attention at frequent intervals. The dread of silence seems less severe in some other kinds of broadcasting, and in sportscasting it is less in some other countries.

As a third contribution to the location, sportscasting is a variety of discourse in which the level of arousal or excitement varies significantly during the discourse, and the course of this level, as well as other features of the variety, is determined by quite specific bodies of knowledge and values assumed to be shared by speaker and addressees. 
The radio announcer uses the technical jargon of the activity being reported, including numerous idioms and slang terms suitable for informal conversation; he also interprets events in terms of an established set of values about what constitutes good playing, moments of risk, significant points of heightened competition, players' career goals, and the like. The speaker assumes that members of the audience will want periodic updates on the course of the game, whether because they have just tuned in, are listening to the radio in addition to doing other things, or have simply lost track of the score or the place in the game. This location differentiates the register from such kinds of talk as the reporting of ongoing solemn public ceremonies, but classifies it with the reporting of political conventions. It offers a useful means of subclassification, since reporting of different kinds of activities involves different expectations about the audience's knowledge and values. Some sports or other activities are assumed to be widely known (e.g., baseball, football), others are assumed to need more explanation (e.g., Japanese wrestling, chess matches), and for the less-known activities, the announcer is permitted to acknowledge his own ignorance and may call on specialists. These examples also point up the fact that the rhythm of the activity itself is determinative of the kind of language used: how long do the units of the activity (quarters, innings, etc.) last, how much time passes between significant moves in the activity?6

SAT could probably be located still more exactly, and a broad investigation of many related varieties as well as variation within SAT could probably clarify the nature of the variation included, but this three-step approximation will serve for our purposes. The three successive specifications come from no preestablished theoretical framework but from the intuitions of native speakers and nonnative observers, as modified by discussion of recorded texts. In attempting to make sense of this, we note that the first seems to specify what the discourse does, or what the members of the speech community see the discourse as doing. Even without extensive field testing we assume that many Americans will accept our specifications of what the sports announcer is doing and will recognize the existence of a special way of talking that goes along with sportscasting. This is not to say that a folk taxonomy of kinds of talk will correctly analyze the register variation we want to understand, any more than folk opinions reveal the complex patterning of phonological or syntactic structures in language. This step of looking for a kind of discourse that is readily definable to the members of the community does, however, seem a good heuristic device, and we may well follow the procedure often recommended by syntactic analysts of starting with the clearest cases, where consensus among speakers' intuitions is most readily forthcoming. The second specification gives the essentials of the roles of the participants in the discourse. For many kinds of talk it is crucially important to know about the relative statuses and communicative intentions of the participants, but in this kind of mass media communication such features are largely zeroed out. In SAT, this is even more true than in the register of international amateur radio talk, for which Gibbon (1981) shows participant equality and the great importance of channel features. The third specification deals with the body of knowledge and values shared by the participants and with related aspects of the topic of the discourse. 
In terms of parameters often suggested for register analysis, the specifications deal in a rough way with mode, participants, and topic, in that order. But in the present, primitive stage of register theory, it may be more productive to proceed in a more intuitive way by taking salient "locating" characteristics first, however they may be identified with proposed parameters.

Syntactic Characteristics

The stretches of SAT that seem most distinctive syntactically, i.e., that differ most from other kinds of radio talk and from the registers of casual conversation, are the reporting of events during the game and the segments of briefing directly attached to them. This briefing consists largely of background information on individual players (e.g., where they are from, how they have been doing this season or in this game, features of their playing style) but includes also recaps of scores, information on teams, and other material. Since these stretches also constitute the essential core of the sportscast, it is appropriate in our search for syntactic characteristics to look first at them. Six types of syntactic phenomena will be discussed here, and it must be noted that no rigid line is drawn between syntactic features and closely related collocative or selectional features of lexical usage. Other types of syntactic phenomena could be examined, and indeed others were treated in the seminar, but these will serve as illustrative examples of what a full-scale register analysis might include.

Simplification

One of the most striking features of SAT is the frequent use of sentences lacking certain expected elements, most commonly (a) sentence-initial noun phrase or noun phrase plus copula, and (b) post-nominal copula. The absence of sentence-initial material, sometimes called prosiopesis, is a long-recognized feature of various registers of spoken and written English (cf. Jespersen 1922). The prosiopesis of SAT seems to have at least two sources in more general or earlier register variation in English. One is the lowering of pitch and loudness for parenthetical expressions, of which it is an extreme form; the other is the structural simplification characteristic of a number of recognized registers (e.g., headlinese, foreigner talk [cf. Ferguson in press]). In advertising language, where it is also quite frequent, Leech attributes it to an overcolloquialization, "a tendency to go beyond colloquialism in simulating the conditions of friendly, personal communication" (1966:9). Bowman, in her study of "fragmentary" sentences of spoken English, concludes that the "ellipsis of initial weak-stressed syllables . . . seems to contribute an air of informality" (1966:66). An explanation of its role in SAT could only come from extensive investigation, by such techniques as historical study of the evolution of SAT since the early days of broadcasting, comparative study of SAT under different conditions, and opinion surveys or experimental studies of listener reaction. It seems likely, however, that prosiopesis in SAT serves to "index the moment" as nonleisurely (you have to speak rapidly and concisely), informal (you mustn't sound too bookish), exciting (like the attention-getting language of headlines or advertising copy), and vignette-quality (like captions of pictures). It may also simply represent the erosion of less stressed elements of recurrent formulas, as Jespersen originally pointed out, so that Fastball! for [It's a] fastball is like Bye! for Good-bye! 
Whatever its exact role, however, the phenomenon deserves careful analysis, as Straumann pointed out decades ago (Straumann 1935:38-42). The material deleted is actually very limited in type: in almost all cases it can be reconstructed either as a personal pronoun, subject of the immediately following verb, or as a pronoun plus copula, before a noun complement. Examples:

Pronoun Subject
1a. [He] hit 307 [He] had 6 homeruns
b. [He] pops it up
c. [It] bounces into the seats
d. [It] hit on the foul line
e. [It] bounced to second base
f. [They] scored three times

Pronoun plus Copula
2a. [It's a] fastball. [It's a] strike. [It's] one and one
b. [It's] a little bloop
c. [It's a] pitch to uh Winfield
d. [It's] a breaking ball outside
e. [It's] a bloop single
f. [It's a] swing and pop-up foul
g. [He's a] guy who's a pressure player



In these examples, as throughout our data (although exceptions appear in other published data), the indefinite article is omitted along with the copula if the noun complement has no modifier before it, but is expressed if there is an adjectival or nominal modifier, for example, [It's an] out! vs. [It's] a big out! This rule seems quite delicate: the fast of a fastball is not an adjective but a constituent of a compound, whereas breaking of a breaking ball counts as a modifier as in 2a and 2d.

Copula deletion in SAT is also very limited in its conditions of occurrence. It takes place most often after a single-word proper name at the beginning of a sentence, typically the name of a player. Occasionally, instead of a name, the subject is a common noun or common noun with modifier, identifying the player. These are typical examples from part of one baseball game:

3a. Klutz [is] in close at third.
b. Milburn, with good speed, [is] at first.
c. McCatty [is] in difficulty.
d. The A's [are] now hoping to get . . .
e. Milburn [is] remaining at first.
f. Runners [are] leading from first and third.
g. McCatty [is] in a tough spot.
h. A's right hander [is] pitching from . . .

Many of the examples have an -ing form, and all of them could be captions on pictures, suggesting that this construction is closely related to the participial constructions (in -ing and -ed) of headlines, captions, and note-taking. The conditions of copula omission in headlines have repeatedly been studied, and manuals for headline writers recommend it under various conditions (for analysis and a review of the literature, see Mardh 1980:158-80). Apart from the emphatic function attributed to it by some authors (Jones arrested more emphatic than Jones is arrested), the phenomenon in SAT seems to be a special case of a general type of shortening or simplification used in a range of registers, including headlines. The communicative functions and determinative factors differ somewhat from one simplified register to another, but the overall family resemblance is striking, as first noticed by Straumann (1935).

The simplification similarities between event-reporting sentences in SAT and event-reporting headlines in newspapers presumably reflect, at least in part, functional similarities. In our seminar, however, it was noted that the similarities between these two registers in Japanese do not match the English ones. Some features, such as copula omission, occur in both headlinese and SAT in Japanese, but the two differ in other respects. For example, particle omission (e.g., case markers) is common in SAT but rare in headlines; headlines have much more nominalization, that is, use of Chinese-character nouns without verbs. This last contributes to the somewhat literary flavor that Japanese headlines have that is absent in English headlines and in SAT in either language. Cross-language register analysis, done in depth, would be an excellent way of examining hypotheses about form-function relations in register variation (cf. Ure 1972).
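The article rule noted earlier in this section for pronoun-plus-copula deletion can be made concrete with a small sketch. It is offered only as an illustration of the generalization as stated, not as a tested account of the data. The function name and its arguments are hypothetical, and the sketch assumes that the analyst has already decided what counts as a premodifier and what counts as part of a compound (so that fastball is passed in as a single head noun, while breaking in breaking ball is passed in as a modifier).

```python
def reduce_copular_report(article, modifiers, head):
    """Return the reduced form of "It's <article> <modifiers> <head>".

    article   -- "a" or "an" (or "" if the full sentence has none)
    modifiers -- adjectival or nominal modifiers standing before the head noun
    head      -- the head noun; a compound such as "fastball" is one word here

    Hypothetical helper: it encodes only the rule described in the text.
    """
    # The pronoun plus copula ("It's") is dropped in every case.
    if modifiers:
        # A premodified complement keeps its article: "[It's] a big out".
        words = [article] + list(modifiers) + [head]
    else:
        # A bare complement loses the article as well: "[It's an] out".
        words = [head]
    return " ".join(w for w in words if w)


if __name__ == "__main__":
    print(reduce_copular_report("an", [], "out"))          # bare noun
    print(reduce_copular_report("a", ["big"], "out"))      # adjective + noun
    print(reduce_copular_report("a", [], "fastball"))      # compound as head
    print(reduce_copular_report("a", ["bloop"], "single")) # nominal modifier
```

The sketch deliberately ignores material following the head noun (as in a pitch to Winfield), since the rule as stated refers only to modifiers that precede the noun.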

Inversions

One of the most characteristic features of SAT, setting it apart from most spoken English, is the frequent use of inversions, that is, structures in which the predicate precedes the subject, as in the present sentence. Such orders are common in many genres of written English but, as remarked by Green (1980:584), sportscasting is "one of the few situations where inversions are used in speech with any appreciable frequency." Examples from the corpus:

4a. Holding up at third is Murphy.
b. Over at third is Murphy.
c. Tagging at third is Nettles to score.
d. And all set again is Pat Haden.
e. And out right is Drew Hill.
f. Coming left again is Diamond.
g. On deck is big Dave Winfield.
h. Pete goes to right field and back for it goes Jackson.
i. And here once again ready to go back to pass is Haden.

In SAT inversions, the subject is typically a player's name, and the verb is the copula or, less often, a verb of motion such as come or go. The preposed predicate is typically a locational expression, often containing a present participle. Most of the participle plus copula constructions could be regarded as present progressives if put in normal word order (Diamond is coming), but in the inversions, they seem to function as predicate adjectives (cf. ready . . . is, holding up at third is). In discussing the possible origins of this construction, our seminar decided that the most likely source was the fact that it gives the speaker a little more time to ascertain the identity of the player whose action is being reported. The announcer may be able to see the play and describe it before he is sure who the player is. Green also sees this as the function of the construction, and she notes that in her data (from basketball), a number of other constructions also appear which "allow naming the agent to be deferred" thus "eas[ing] the sportscaster's encoding problem considerably" (1980:586). This is a plausible explanation, coinciding with speakers' intuitions, but it must be noted that in many instances of inversion, especially in baseball, the identity of the player is perfectly evident throughout. The inversions must be regarded as a register-marking feature freely used even when the practical stalling function is not needed. An excellent example is 4i, where the once again indicates the announcer knows who the player is in spite of the unusually long preposed predicate.

Result Expressions

In narrating an ongoing activity, the speaker may want to indicate that an event he reports leads to a particular state, which he then names. Such a tie could be expressed in a variety of ways: leading to, resulting in, which makes (made) it a, and so it's (it was) a, and many other possibilities. Two ways are very frequent in SAT and rare in other kinds of talk and, hence, are clear syntactic markers of the register: for + noun and to + verb.7 Examples:

For Phrases
5a. Joe Ross's caught it for a touchdown
b. He throws for the out.
c. Has it for the out.
d. Washington backhands it to throw across for the out.

To Phrases
e. And he just keeps alive, reaching out to foul-tip one back.
f. And it gives us a double to Mumphries to lead things off.
g. There's a strike on the outside corner to make it 2 and 1.

One example, 5d, has both. These "result" uses of for and to are semantically distinct from such ordinary "purpose" uses as Miami worked hard for that touchdown, we pause 5 seconds for station identification, plenty of time to get under it, they need one more runner to bring the tying run. The close semantic relation between purpose constructions and result constructions is often shown by formal resemblances in the world's languages (e.g., Latin ut clauses), but in this instance, the formal identification is characteristic of a particular register. The for and to expressions are not limited to a particular sport, but seem to occur across the board (e.g., baseball, football, basketball, swimming, track). It may be that these constructions arose in response to the need to save time or appear to save time, but they are also found in newspaper sports reports, where they may appear in longer, more complicated sentences. Examples:

Brown, a junior from Sacramento, returned a kickoff 93 yards for the Ducks' only touchdown in a 39-7 loss last week at Washington State.

Mike Cooper hit two more free shots with 44 seconds to play to tie the game and set the stage for Nixon.

Most analyses of register give a great deal of attention to lexical specialization, but this is usually in terms of full lexemes (nouns, adjectives, verbs) or fixed idiomatic phrases, rather than grammatical formatives (prepositions, inflectional affixes, etc.). The use of to and for result expressions in SAT is an example of register variation which falls somewhere between the use of special lexicon required for the subject matter (e.g., technical terminology) and the use of special lexicon suggesting the right level or genre (e.g., formal or poetic equivalents of conversational terms). In this instance, the specialization consists of choosing one out of several possible alternative ways of expressing the meaning and making that way the norm, the particular meaning being a commonly recurring one in the register. This specialized result sense of for and to is not identified in the OED or Webster's Third,8 but it seems thoroughly ensconced in sports reporting in Britain and the United States, and may be extended to analogous situations, for example, (announcer at a marine show): The seal leaps 6 feet in the air, to get the fish (said just after the seal gets the fish). A beautiful displaced example occurs in the Preface to the Encyclopedia of Baseball: "These he parlayed with an unquenchable thirst for big league data to amass one of the most exhaustive files in baseball history" (Treat 1968).

Heavy Modifiers

In sportscasting, it is a fairly frequent occurrence that the announcer includes a brief, incidental identification along with the name of a player, or less often an umpire, referee, or someone else referred to in the broadcast. The identification is typically a characteristic of his playing (e.g., left-hander) or his previous performance (e.g., batting average), but may be other kinds of descriptive material, occasionally just the name of his current role in the game (e.g., center fielder). In conversational accounts of games, such identifications are most often interpolated sentences (he's the center fielder) or fragments (you know, the center fielder). In the SAT register, however, the speaker freely uses syntactic devices more typical of written English, such "heavy modifiers" as appositional noun phrases, nonrestrictive relative clauses, or preposed adjectival constructions. The use of these heavy modifiers, like the use of inversions, contrasts with the conversational style of most SAT, which is even hyperconversational in its frequent deletions and routine formulas. Some instances of heavy modifiers illustrate the tension between these two characteristics; the clause may get so long that the speaker has to pick up the antecedent again, as in 6d, or the apposition may be continued with an and, as in 6h, neither of which would happen in the written English of a news story. In addition to the postposed devices, the sportscaster also makes use of preposed series of adjectives of the kind that are rare in conversational speech (6j-l). Examples of heavy modifiers for identification, taken from the transcripts:

6a. Warren Cromartie, the left-handed hitter, swings . . .
b. Eddie Yost, a crackerjack, who was not a power hitter, . . .
c. David Winfield, the 25-million-dollar man, who is hitting zero, five, six in this World Series, . . .
d. Steve Yeager, who won Sunday's game with the dramatic homerun on the heels of Guerrero's shot, Yeager coming up.
e. according to Paul Pryor, the plate umpire.
f. coming left again is Diamond, who caught that 34-yard pass
g. where Winfield could deliver him with a ball to the outfield, Big David, beleaguered not only because of his failure to hit in the Series as well as he did in the regular season, but for other reasons as well.
h. Bobby Watson, with a 3-run homerun last night and a single, and he has
i. Larry Milburn, 3 for 4 yesterday, did not face . . .
j. The quiet Texan Tommy John delivers . . .
k. First-base umpire Larry Barnett waited a while before . . .
l. left-handed throwing Steve Howe, who in the mini-playoffs or the playoffs just preceding this one, came out . . .

The referential function of adding incidental background information about persons mentioned in the discourse could be handled in SAT by using the conversational devices of interpolation or separate sentences, and this does happen (6i), but it is interesting that more formal, literary devices are used in the register. The appropriate use of heavy modifiers (and inversions) is a mark of the skilled user of SAT. Most listeners to sportscasts are probably unable to use such devices in running speech without considerable practice, even though they may be thoroughly familiar with the devices in written English.

Tense Usage

Linguists analyzing the uses of English verb forms have repeatedly commented on the use of the simple present and the present progressive in the language of sportscasting (e.g., Close 1962, Palmer 1965, Hirtle 1967, Leech 1971, Scheffer 1975), and tense usage seems an appropriate question to examine in order to characterize SAT. The basic facts are clear: in direct reporting of events the sportscaster uses the simple present to refer to actions of short duration regarded as taking place at the moment of speaking (Washington backhands it) and the progressive to refer to actions of extended duration (they're bringing that ball back to the 27-yard line) or in summing up the game or season (the Expos are perking). When a rapid action is regarded as having already happened, it is reported with the simple past (or the present perfect), often in recapping or adding descriptive material to an action already reported in the present: there goes Haden back to pass . . . throws it . . . and Haden threw that ball high. Thus, the simple present is the characteristic form of the verb in direct reporting in SAT when the sport being reported is one with a succession of rapid events (e.g., baseball, football) and the progressive when the sport is continuous (e.g., boat racing, horse racing). This use of tenses seems to be in full accord with the general analyses of the semantic values of English verb categories. The competent user of English readily recognizes the different meanings of

7a. they line up in slot back formation
b. they're lining up in slot back formation
c. they lined up in slot back formation

8a. he steps up to the plate
b. he's stepping up to the plate
c. he stepped up to the plate

Members of our seminar noted that the simple present is used more in direct reporting and the progressive is used more in background briefing, and it is tempting to look for a functional explanation here (cf. also Weber 1982), but this distribution seems to be a natural reflection of the nature of the events being reported rather than a use of different forms to achieve the distinct goals of reporting and providing additional descriptive material. Examples from the transcripts:

9a. Burt ready, comes to Winfield and it's lined to left but Baker's there and backhands a sinker then throws it to Lopez.
b. they line up into a slot back formation right
c. they're bringing that ball back to the 27-yard line . . . that takes it back to the 27.
d. He's hit immediately by Jim Stucky. Stucky [hit] him around the shoulder.
e. Eric Gregg is umpiring at first base.
f. Now the pitcher is throwing out of the sunlight into the shady area of the ballpark.

Routines

A central concern of modern linguists has been the rule-governed nature of language and the creativity of the speaker in constantly producing, and of the hearer in constantly comprehending, novel utterances formed in accordance with the rules. Many sociolinguists, on the other hand, have been concerned with the regularities or "rules" in the variable selection of alternative ways of "saying the same thing." Both have paid little attention to the apparently noncreative "routines" in which the same alternative is constantly chosen and is felt to be appropriate for the occasions of its use, i.e., when the possibilities of creativity are not exploited. This lack of interest is surprising in view of the extensive use of such routines as politeness formulas, proverbs, cliches of various kinds, and so forth, sometimes amounting to twenty percent of ordinary conversational discourse (Sorhus 1977). Hymes, in his classic paper on the ethnography of speaking, discussed linguistic routines (Hymes 1962), but most linguistically oriented work on routines is quite recent (e.g., Coulmas 1981). In analyzing the characteristics of a register, one of the areas to investigate is the use of particular routines and idioms characteristic of that register as opposed to others. Gibbon (1981) proposes "idiomaticity" as a starting point for a general approach to the description of functional variation in language, but the particular register (international amateur radio talk) by which he exemplifies his approach is so far removed from ordinary conversation that it does not illuminate very well the uses of routines in registers consisting largely of common core English. SAT has many routines embedded in it, prefabricated stretches of discourse ranging from idiomatic phrases to fairly lengthy routines. As an example, we will describe here one specific routine limited to sportscasts of baseball, the giving of the "count," that is, the number of balls and strikes of a player at bat.
Baseball has other routines as well, and other sports have corresponding routines, so that the description of the count will help to characterize SAT as a whole. First, let us note that the word count in baseball is used only in this one sense; it is not used for the score of the game, or as an indication of which inning the game is in, or as a statement of the number of hits in relation to the number of times a player has been at bat, all of which are numerical statements frequently made in announcing a baseball game.9 The term count is used in several sports (e.g., boxing, bowling), in each case with just one sense. It is as though count were one of a set of numbering expressions that any sport or game may select from and assign to a particular meaning. The term count and the routine itself are used also in familiar conversation about baseball and in newspaper reporting, but the full range of variants occurs only in SAT. Given the rules of English syntax and the required lexicon, there are very many ways to express the number of balls (0-4) and strikes (0-3) at a given moment of the player's time at bat. If we assume, however, that the announcer must give this information briefly, in summary form, the number of possible ways to express the information is reduced by some large factor. But even if the limit is one simple sentence, he still has a sizeable number of options. For example, the announcer could give the balls first or the strikes first, he could choose various connectors (e.g., against, to, and, versus), he could use cardinal or ordinal numbers, and so on. In fact, however, the routine allows very little variation. Here is an illustrative sample of counts from the tapes:

10a. One and one.
b. Count of one and one to M.
c. One and oh.
d. Two and oh.
e. Oh and one.
f. It's one and one.
g. Nothing and one count.
h. Nothing across.
i. One and one the count to R.
j. Three and two.
k. ... to make it two and one.
l. Two balls and two strikes.
m. Count's full now, three and two.
n. One and two.
o. And it's still one and two.

The order is invariably balls before strikes, cardinal numbers only are used, the connector is invariably and, and zero is normally oh but may be nothing if it refers to balls.10 When the announcer wishes to report the count before any pitches have been delivered, instead of Oh and oh he may say Nothing across; the count of three and two is also called Full count. These two special synonyms are often used right before or right after the usual count. We may assume that the fullest form of the count is:

11. The count is N balls and N strikes to X (name of player).

This full form is not attested in our tapes and must be extremely rare. The usual form is:

12. N and N.

The forms found in SAT but not in ordinary conversation are inversion, inversion with copula deletion, and nominal constructions. Thus, the following may occur in SAT only:

13a. One and one's the count.
b. One and one the count.
c. Count of one and one.
d. One and one count.

Listeners from the general public apparently comprehend and accept these forms but do not use them actively in their own speech. The nominal constructions are another example of the caption form discussed above under Simplification. The use of and as the connector distinguishes the count from the other two routinized pairings of figures in baseball: game score (to or no connector) and number of hits out of times at bat (for).11 Thus, when the two figures are identical, there is a three-way contrast:

14a. 2 and 2 = two balls, two strikes (count)
b. 2 (to) 2 = two runs, two runs (score)
c. 2 for 2 = two hits, two times at bat (player's record)

The insertion of the count into the reporting of a baseball sportscast is as formulaic as the hundreds of conversational routines described in such works as Goffman (1971) and Coulmas (1981). Unlike those routines, however, there is no dialogistic interpretation; announcer and listener are not interacting in some pattern of exchanging politeness to accomplish a communicative goal. The routinized way of reporting the count is an example of the general human tendency to routinize recurrent messages in recurrent communicative settings, eliminating both redundancy and stylistic variety. The emergence and conventionalization of registers includes not only the specialized use of particular lexical items and syntactic constructions, but the use of prefabricated modules inserted as needed at appropriate points in the use of the register. Such routines serve, then, not only as convenient, streamlined, unobtrusive counters in well-worn patterns of discourse but also as markers of the register itself. The structure of routines is of special interest in the study of linguistic diffusion, since they may be easier to transfer than features of phonology or syntax, which are more closely integrated into larger systems. Routines, as oversized lexical items, may be transferred essentially by lexical borrowing. Also, routines, being in effect small discourse genres, may be transferred by structural analogy without necessarily carrying phonological and grammatical substance with them, as in the case of literary genres or units of religious ritual moving from one language to another (Ferguson 1976). For these reasons, routines may be transferred across language boundaries more directly than other aspects of a register.
When the playing of baseball spread from speakers of American English to speakers of Japanese, lexical borrowings were part of the diffusion, and in due course the Japanese sportscast evolved, combining features of American sportscasting with features of related Japanese forms of discourse. The point to be made here is that the count routine was borrowed more or less intact whereas the basic language of the sportscast is Japanese. The Japanese count uses some English vocabulary (one, two, three, four, strike, ball, no, nothing), either fully naturalized to Japanese pronunciation (/wan tuu surii hoo sutoraiku booru noo nasin/) or approximating English, depending on the speaker's knowledge of English or membership in an English-using social group (Bloch 1950). The count differs in detail, however, by listing strikes before balls, by having no connector corresponding to and, and by having no plural ending on strike or ball. One interesting correspondence with the American English routine is the use of nothing only for balls, not for strikes, even though the balls are given in second place. Examples from a Japanese radio broadcast:

15a. Two strike two ball.
b. No strike two ball.
c. Two strike nothing.
d. Two nothing.
e. Two two.

The Japanese announcer uses the fuller form with strike and ball more often than the American announcer, but the minimum forms are also used.
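The count conventions described above for American English and Japanese sportscasts are regular enough to be written out in a few lines of code. The Python sketch below is only an illustration of those stated regularities, not part of the original analysis: the function names (`english_count`, `english_synonym`, `japanese_count`) and the `nothing_for_balls` and `full` switches are hypothetical, and the fallback to the fuller Japanese form when there are no strikes is an assumption, since a minimal form for that case is not attested in the examples.

```python
# Number words used in the count; zero is normally "oh" in American English.
WORDS = {0: "oh", 1: "one", 2: "two", 3: "three", 4: "four"}


def _check(balls, strikes):
    # Ranges follow the text: balls 0-4, strikes 0-3.
    if balls not in range(5) or strikes not in range(4):
        raise ValueError("balls must be 0-4 and strikes 0-3")


def english_count(balls, strikes, nothing_for_balls=False):
    """Usual American form 'N and N': balls first, cardinals, connector 'and'."""
    _check(balls, strikes)
    # "nothing" may replace "oh", but only when it refers to balls.
    if balls == 0 and nothing_for_balls:
        ball_word = "nothing"
    else:
        ball_word = WORDS[balls]
    return f"{ball_word} and {WORDS[strikes]}".capitalize()


def english_synonym(balls, strikes):
    """Special synonyms for two counts; None for every other count."""
    return {(0, 0): "Nothing across", (3, 2): "Full count"}.get((balls, strikes))


def japanese_count(strikes, balls, full=True):
    """Japanese form: strikes first, no connector, no plural ending."""
    _check(balls, strikes)
    if full or strikes == 0:
        # Assumption: with zero strikes only the fuller form is produced.
        strike_part = ("no" if strikes == 0 else WORDS[strikes]) + " strike"
        ball_part = "nothing" if balls == 0 else WORDS[balls] + " ball"
    else:
        strike_part = WORDS[strikes]
        ball_part = "nothing" if balls == 0 else WORDS[balls]
    return f"{strike_part} {ball_part}".capitalize()
```

For instance, `english_count(1, 0)` yields "One and oh" and `japanese_count(2, 0, full=False)` yields "Two nothing", matching examples 10c and 15d.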

Discussion and Conclusion

In this paper we have located a register, "sports announcer talk," and we have examined a few of its syntactic characteristics. In the first operation, we acknowledge that there is no foolproof way of identifying registers and indeed that the concept itself is very flexible, yet the description of what seem to be fairly clearcut registers is surely a useful step toward understanding the nature of register variation. In the second operation, we acknowledge that there is no foolproof way of deciding which linguistic features of a register are the most important, yet the study of salient characteristics of particular registers in comparison with other registers and with the same or related registers at different time periods and in different speech communities is the familiar path of linguistic inquiry. The combination of characteristics that constitutes SAT, including the six syntactic features described here, functions as a synchronic subsystem in English, and listeners evaluate sportscasters in large part on their ability to use this conventionalized register in accordance with the accepted norms, although other factors are also involved. As is the case with such synchronic systems, however, the component parts may have different origins, fulfill different functions, and serve simultaneously as parts of other systems. For example, the simplification features of SAT are shared with other simplified registers of the economy type, such as headlines, note-taking, and telegraphese (Ferguson in press), and they may have originally entered the register from the language of headlines and captions, possibly as a way of sounding exciting. They also appear to some extent in casual conversation and thus may serve the purpose in SAT of sounding informal and nonliterary.
On the other hand, the use of inversions and heavy modifiers is shared with literary registers of English, and they may have emerged in response to more direct functional needs of SAT. The inversions may have come in response to the need to postpone mentioning the agent's name until the announcer is sure of it or a need to heighten the listeners' feeling of suspense or both. The heavy modifiers may meet the need to pack more background information into the event-reporting sentences. The striking tense usage in SAT may simply be the normal use of English tenses as required by simultaneous reporting, but it sharply differentiates this register from straight news broadcasts or accounts in newspapers. The special result expressions and set routines seem to grow directly out of the special communicative needs of sportscasting, in accordance with general principles of the development of patterns of recurrent messages in the specialized language of particular social groups. Yet all of these characteristics fit together and interact in the construction of sentences and larger units of discourse in sportscasting, and they all may be employed as conventionalized register-markers, not only in sportscasting as such but also in such secondary uses of SAT as making fun, play-acting, or creating an atmosphere. The speculations offered here about particular functions and origins of the characteristics must remain speculations until experimental studies of synchronic functions are carried out and the history of SAT is investigated. As noted in the Introduction, SAT is a promising topic for historical study, since it took shape in the early days of radio broadcasting in the 1920s and recordings can be followed from that time to the present. Similarly, some of the major international sports, such as basketball, baseball, and hockey, are relatively recent in origin and their spread to other languages and other SATs can be followed from recordings. On the other hand, the most international of sports, soccer (or football, as it is called in most countries), has been played in some form for centuries and in the nineteenth century spread so widely that it is now played in over 140 countries and has sportscasting in many languages; comparative/historical analysis would be correspondingly difficult. Looking at SAT from a broader, cross-language perspective could prove instructive for understanding the phenomenon, as has been suggested by the bits of information from Japanese baseball SAT. In fact, baseball SAT might be easier than some of the others to investigate, because it is of such limited distribution in the world (in spite of the "World Series"). The game is played and sportscasts are made in only a few languages: English, French, Japanese, Korean, Portuguese, Spanish, and perhaps several others. In any case, SAT is a register that is learned differentially throughout the speech community.12 In the United States, a passive knowledge of most aspects of SAT is typically acquired by teenage boys as they become familiar with popular sports and learn to listen to TV and radio sportscasting.
In some instances, boys as young as five years of age become familiar with technical terms and such routines as the count in baseball; this is especially true if their father is a coach or in some other way is involved in keeping scoresheets and the like. Acquisition by girls is also widespread but usually less thoroughgoing than with boys. Active use of SAT begins with high school announcing of games and imitations of sportscasters. There have been no longitudinal studies of the acquisition of SAT, but there is certainly a hierarchy reflected in the order of acquisition, since many American adults are familiar with the main features of baseball talk but are unaware, for example, of the strict differences among and, to, and for as connectors of numbers in baseball routines. Since most of SAT is not explicitly taught but is picked up in an unplanned, "natural" manner, research would probably show regularities of the type reported in the acquisition of other registers and routines (cf. Andersen 1977, Baumann 1977). This brief attempt at locating and characterizing sports announcer talk adds another study to the still quite small stock of descriptions of language varieties made in the tradition of register analysis.13 As such, it emphasizes the extent of conventionalization in register variation as opposed to analyses that emphasize the functional basis for details of linguistic structure in stretches of discourse. In human language, processes of conventionalization are always operating, as social groups develop shared norms of the varieties of language appropriate for different occasions and as individuals acquire and modify these norms.



Comparative, historical, and acquisitional studies of register variation offer a valuable approach to understanding fundamental properties of human language and processes of language change.

Notes

1. This paper has made extensive use of the work done in a sociolinguistic seminar devoted to register analysis, offered at Stanford University in 1981-1982; and the eight members of the seminar are close to being co-authors: J. Anderson, C. Ely, D. M. Hoffman, T. Ishiguro, P. R. Kozelka, Y. Takahashi, M. Teutsch-Dwyer, D. R. Wylie. They have not seen the final version, however, and are not to be blamed for its shortcomings.

2. Some analysts have claimed that prosodic characteristics or the whole "tone of voice" are the primary clues to stylistic or registral variation in speech (e.g., Crystal 1975:97, 1976).

3. Functional analysis and the more static register analysis are complementary but do not exhaust the phenomena of situational variation in language: "Situativ bestimmte Variation ist bis jetzt weder im Konzept der Funktionstile noch im Konzept der Register befriedigend beschrieben worden" (Hartung & Schonfeld 1981:98).

4. Radio and television sportscasting are very similar and the slight differences are correspondingly interesting (cf. Weber 1982).

5. References to sportscasting in the linguistic literature, especially by British authors, often use the term "commentary" to refer to what is called "announcing" in the United States, and what is called "commentary" in the United States, if discussed at all, is referred to as advising and summarizing (e.g., Crystal & Davy 1969:138).

6. Sportscasting belongs in Goffman's category of "action override," one of the three "modes" of announcing he recognizes (Goffman 1981:232-34), but he devotes little attention to it. It belongs in Crystal and Davy's category of "unscripted commentary," which they describe as "a spoken account of events which are actually taking place, given for the benefit of listeners who cannot see them" (Crystal & Davy 1969:125).

7. In a count of the to construction in written journalistic English, it was found to occur 49 times in sports news as opposed to 2 times in ordinary news reporting (Wallin 1982).

8. A construction referred to as the "to of outcome" is identified in Quirk et al. (1972:754), for example, I awoke one morning to find the house in an uproar. This construction is closely related to the SAT to of result but differs in that (a) the events reported in the two clauses are in a temporal sequence, and (b) it is possible to insert only before the to, whereas with the to of result the clauses are typically simultaneous and only cannot be inserted.

9. Many of the technical terms of baseball are defined in the official rules, but count is not defined; it is simply taken for granted and is used in the statement of rules ("If the count is . . .").

10. The expression of zero in sports talk is quite varied, but highly conventionalized. For example, zip can be used in reporting scores of games, love in tennis, and flat in giving times; nil, common in British sports talk, does not appear in American usage.

11. The order in which the figures are presented in game scores and hits out of times at bat is fixed. The practice of reading the higher score first is a nearly universal convention in American and Japanese sportscasting, but in the reading of scores in Great Britain and some other countries (e.g., Germany), the home team score is read first and the intonation with which the first score is read signals to the listeners which team has won (Cruttenden 1974). In Japanese sportscasting, the higher figure is given first also in reporting the number of hits out of times at bat, i.e., the equivalent of "out of five, three" instead of the English three for five; this is in accord with the regular SOV, postpositional word order of Japanese.

12. Data on the acquisition of SAT were supplied by T. Andersen, R. Cooper, J. Mason, and A. Turner, high school students in Auburn, Al.

13. This study is an example of the kind of partial treatment discounted by those who want larger and better theories in sociolinguistics. It is merely one of the "fragmentary 'grammars' of various kinds of sociolinguistic co-variation which will, it is hoped, turn out to be insightful and, in due time, amenable to synthesis" (Neubert 1974). Such partial treatments, however, may be more constructive contributions to the development of sociolinguistics than elaborate but premature attempts at theory.

12 Genre and Register: One Path to Discourse Analysis

Two powerful tools of analysis and understanding available to the student of human language are the analysis of types of discourse and the analysis of how language varies depending on the occasions of its use. The former, the study of discourse types, is what is traditionally called "genre analysis." The latter, the study of language variation by use, is referred to by some as "register analysis." The purposes of this paper are to clarify the methods of genre analysis and register analysis and to suggest that they are two aspects of a sociolinguistically oriented discourse analysis that offers a major approach to the study of human language.

This paper is a version of a talk given first in Miami as the Second Barbara Gordon Memorial Lecture in Linguistics at Florida International University on March 19, 1985, and later in Dallas and at the University of California, Davis. It formed the basis for the opening lectures of a course given on register and genre at the LSA Linguistic Institute at Stanford during the summer of 1987.

Genre

Although there are parallels in Chinese and Indian traditions, and probably elsewhere, the Western tradition of genre analysis, which will be considered here, is the stream of philosophical and literary analysis that began with Aristotle. After Aristotle it flowed through Cicero, Horace, and Quintilian, and after a half-submerged life during the Middle Ages it reappeared as a strong river at the time of the Renaissance. Since then it has flowed continuously in European literary studies, with occasional periods of flourishing when it came to dominate the literary scene, as in the eighteenth century and again in the twentieth century. Books and articles devoted to genre analysis or in direct opposition to genre analysis form a large part of recent writing on literary theory. Despite the mixed views of what genre means or how important the notion is, almost all these works on genre are derived from the Aristotelian tradition. I cannot presume to sketch the history of literary theory or even to give an impression of the numerous theories and countertheories in literary circles today. What I propose to do is examine with some care what Aristotle had to say about genres and then point out a number of ways Aristotle's approach makes a modern sociolinguist feel uncomfortable. In this sociolinguistic response to Aristotle, I hope we can gain a better understanding of the place of conventional discourse types in the use of human language. Aristotle started his treatise on Poetics in such a straightforward fearless way.1 "I propose," he wrote, "to treat of poetry in itself and of its various kinds." The word poetry is probably to be understood as more like "literary works" in meaning, since Aristotle included Socratic dialogues but excluded Empedocles' philosophical verse. As far as "its various kinds" are concerned, he proposed to inquire into the "structure of the plot" (what we might call the "content") and into the "number and nature of the parts" (what we might call the "form"). The first step in Aristotle's analysis was what we would now call a feature system. He identified several dimensions or features in terms of which his "kinds" of literature, the "genres," could be characterized. Each feature had two or more values that could be entered into the feature matrix of each genre, and although he gave his examples in discursive prose, he would not be tempted to use the familiar graphic representations. Thus, Aristotle wrote that genres differ in three respects (the medium, the objects, and the manner), and among the values of the medium were [±verse], among the values of objects were [±higher types of characters], and among the values of manner were [±narration], where [-narration] means presenting the characters as "living and moving among us." Aristotle has subfeatures such as different meters, narration in one's own person or by talking as another personality, and so on, but these three Aristotelian generic features will suffice to illustrate his method. He compared the feature structure of Homer's works with those of Sophocles and Aristophanes.


               Verse    High Characters    Narration
Homer            +             +               +
Sophocles        +             +               -
Aristophanes     +             -               -
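The three-feature comparison just given can also be written down as a small data structure. The Python sketch below is only an illustration: the dictionary layout and the helper name `differing_features` are assumptions of this example, and the value for Aristophanes on narration is taken to be minus on the ground that comedy, like tragedy, presents its characters directly rather than narrating.

```python
# Feature matrix for the three authors compared above (True = "+", False = "-").
GENRE_FEATURES = {
    "Homer": {"verse": True, "high_characters": True, "narration": True},
    "Sophocles": {"verse": True, "high_characters": True, "narration": False},
    "Aristophanes": {"verse": True, "high_characters": False, "narration": False},
}


def differing_features(a, b):
    """Return, in alphabetical order, the features on which two authors differ."""
    fa, fb = GENRE_FEATURES[a], GENRE_FEATURES[b]
    return sorted(name for name in fa if fa[name] != fb[name])
```

Under these values Homer and Sophocles differ only in manner (narration), while Sophocles and Aristophanes differ only in the objects they represent (high characters).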

After the feature analysis, Aristotle went on to discuss the origins of literature and the history of particular genres. As to general origins, he had his own kind of innatist hypothesis. Literature sprang from two causes, each lying "deep in our nature": the instinct of imitation (mimesis, the term is still with us) and the instinct for harmony and rhythm. His account of the origin and development of tragedy is a fascinating historical analysis, taking the genre from its earliest improvisation growing from dithyrambic poetry through a succession of changes in form and content. He concluded that "having passed through many changes, it found its natural form, and there it stopped." After the diachronic sections, Aristotle proceeded to give a synchronic analysis. In the case of tragedy he gave his famous definition and a list of the six elements of plot, character, diction, thought, spectacle, and song, and also a list of the separate parts such as prologue, episode, exode, and so on. He discussed the elements in some detail, especially the plot with its moments of reversal, recognition, and suffering. After this thorough treatment of a single genre Aristotle proceeded to consider "what the poet should aim at": he offered some forty or fifty statements of what a writer of tragedy should do to produce the "specific effect of tragedy," "perfect according to the rules of art." The remainder of the Poetics as it comes down to us consists of the detailed analysis of another genre, the epic, and finally a comparison of tragedy and epic to decide which one is better. Aristotle, on the basis of a series of careful arguments, decides: "It plainly follows that tragedy is the higher art, as attaining its end more perfectly." This summary of Aristotle's genre analysis does not do justice to the detail and subtlety of his work, but it is sufficient to show what a valuable resource it is for the linguist who wants to do discourse analysis today. Aristotle looked for community-validated units of discourse, his "kinds" or genres, he offered a way of characterizing the various genres, he attempted to understand the systematic relations among them, and he tried to make sense of diachronic changes in them and in the total system or repertoire of genres. This seems like a good start and indeed many have tried to build on the foundation he laid. Why is it that present-day linguists or sociolinguists are not attracted to this promising enterprise? Or, as I said earlier, what makes sociolinguists uncomfortable about it? Three characteristics of Aristotle's analysis, and possibly a fourth, cause most of the discomfort. First of all, it is Greek (i.e., parochial). Not that there is anything wrong with having an insightful analysis of Greek literary genres in the fourth century B.C.E. Not that there is in general anything wrong with having a good description of literary genres in one language, but Aristotle's presentation makes no effort to cast the Greek analysis in a broader framework.
One line of evidence to validate an analysis must surely be to show how it can be compared with analyses of other languages or speech communities, or within the same language or speech community at some time in the past. During the heyday of American structuralism in linguistics, when languages were thought to vary infinitely, and each language was to be analyzed in its own terms, a unified frame of reference was found at least in a consensus on discovery procedures. At a comparable stage of genre analysis the linguists could at least agree on the methods of identifying genres, discovering their variants, and adumbrating the overall structure of genre repertoires. But just as core linguistics moved into a period of concern with linguistic universals and cross-language generalizations (Ferguson 1978), the sociolinguistically oriented discourse analyst would hope that universalist theories of genre structure and use would emerge. Aristotle's successors have sometimes broadened this field of inquiry to include many languages and various times and places, but when they have sought to be universalist, their categories have usually been too aprioristic and/or tied too directly to the literatures of Europe. It is only recently, with the renewed interest in genres, that some analysts have seen the problem in universal terms. One of the best modern books on genre, Fowler (1982), makes a conscious choice of using primarily English examples, in the hope that "one extensive literature may stand as an examplar of literature itself" (p. vi). Linguists have had the experience



of using English as the source of understanding universal grammar, and have learned some of the possible pitfalls. Miner (1986) is a good example of current attempts to face this problem of the parochial versus the universal. With a few key illustrations from very different literatures, he demonstrates the seriousness of the problem and offers some tentative steps toward its solution as he sees it. The second discomfiting characteristic of Aristotle's Poetics is the fact that it is devoted exclusively to literature. Aristotle wanted to clarify what it is that poets do and analyze the products of their work. And genre analysis from Aristotle's day to the present has almost always been concerned with literary genres only, as though these were the only conventionalized discourse types characterized by features of form, content, and use. The literary theorists are legitimately concerned with the question of what it is that makes literary texts different from nonliterary texts, and it is the job of the literary critic or historian or theorist to define, characterize, and interpret the literary genres of a community. It is also reasonable that the literary scholar should attempt to analyze the changes that take place in genres over time, including the reinterpretation of old genres by the community, and the diffusion of a genre from one community to another, or one language to another. Literary genres, however, are only a subset of the genres of a community, and the sociolinguist tends to assume that they grow out of and are based on speech genres and nonliterary written genres. Sociolinguists operate with a number of basic working assumptions (Ferguson 1993), such that the emergence of a significant social grouping within a community will tend to be reflected in some way in the structure and use of language in the community. 
The features may be relatively slight or all-pervasive, they may be categorical or gradient, located in phonology or syntax, and so on, but sociolinguists expect to find some relationship between language and social grouping in any speech community. Similarly, sociolinguists assume that whenever a particular situation recurs often in a community, and certain types of messages are felt to be appropriate for the situation, the form of the message tends to become conventionalized, thus on the one hand signaling the existence of the message and on the other hand offering a framework within which the language users can exercise their creativity. In short, the emergence of genres is a fundamental characteristic of human language use. Literary scholars have tended to be reluctant to admit the existence of genres outside "literature," although there are works devoted to "marginal" or "minor" genres. On the other hand, modern linguists analyzing discourse have generally steered clear of traditional genre analysis, largely, I suspect, because of their insistence on the primacy of spoken language and in particular the language of everyday conversation rather than the language of literary works. A valuable connecting link between literary scholars' concern with genre and the sociolinguists' assumptions about the conventionalization of variation is found in folklorists' interest in the genres of folk literature and the phenomena of their performance. The folklorists' work has indeed been picked up both by some structuralist literary scholars (e.g., Todorov 1972) and by some sociolinguist researchers (e.g., Bauman 1977), but remains marginal to both. Perhaps the easiest place to join is the analysis of "minor" folkloric genres that later recur in both everyday



speech and literary works, such as the riddle (cf. Todorov 1978, 223-45; Bauman 1977, 25-28). The third cause of sociolinguists' discomfort with the Aristotelian tradition is its evaluative nature. The genre analyst typically—in fact almost invariably—includes somewhere in the analysis a clear positive or negative evaluation of the genre itself, particular instances of the genre, or particular interpretations of the genre. Aristotle was not able to describe or interpret without evaluating. Whether one takes the positivist stance typical of American structuralism, with its emphasis on objectivity and nonjudgmental description, or the so-called post-modern stance of the impossibility of universalist evaluation, the result is the same: Aristotle was not only "culture-bound," missing the opportunity of comparing Greek with other literatures of the time, but he was irremediably prescriptive, finding (his) subjective evaluation of genres essential to understanding their formal context.

Excursus on a Genre

In a large, complex, highly internally differentiated society, it may be expected that a complex, highly internally differentiated repertoire of genres will have been developed and that the processes of genre formation will be pervasive and, at least in many sections of social interaction, will be rapidly adaptive to new differentiation. Let us take as an example from the English-speaking American academic community, the genre or family of genres called the professional article: a written composition of limited length, prepared for publication, normally without financial remuneration, submitted to a "refereed journal," accepted, edited, and published. This message-type occurs frequently and is recognized as important both for the furtherance of the academic field in which it functions and for the individual professional careers of authors and editors. It must also fulfil the function of identifying by its generic characteristics the particular fields, subfields, and journals, as well as the individual authors and editors. Even the most cursory examination of this genre will reveal a large range of structural characteristics that are responsive to these various functions and have developed over time and continue to develop in the sensitive socialization and conventionalization processes of the community. In illustration of this perspective I examined just one year's worth of professional articles in just four leading journals: one each in linguistics and psychology (Language and the American Psychological Review) and two journals in the field of literary studies (PMLA and Genre). For the brief characterization provided here, I will pay attention only to the number of authors per paper, the use of subtitles, and the components of the first footnote. One could easily, however, discuss other salient generic features (internal discourse structure, patterns of punctuation or capitalization, methods of cross-referencing, etc.). 
On the one hand, part of professional socialization is the acquisition of skills in recognizing and producing these appropriate features; on the other hand, the features themselves are constantly shifting as the processes of conventionalization operate. Authors. In the two journals in literary studies, every article is written by a



single author: the score is forty-four to zero. In Language, the score is fourteen single-authored articles to five articles of multiple authorship; the American Psychological Review contrasts sharply with nine to nine. Title. On the incidence of unitary titles versus title, colon, subtitle, the split is different: the psychologists and literary scholars have 50 percent or more of the compound titles, the linguists only one such out of nineteen articles. First Footnote. The differences are most striking regarding footnotes. All the journals have footnotes although they differ in number and content, but the first footnote, which generally is a metacomment on the whole paper and its preparation, is distinctive. For the psychologists, it has no identifying sign apart from its location at the bottom of the first page of the article and its smaller size of type. For the linguists, it has an asterisk and the succeeding footnotes are numbered. For the literary scholars, it is simply marked "1" as the first note. But it is the content that is most elaborately conventionalized. The major component types are almost in full complementation across professions. Every one of the American Psychological Review articles includes as the final element in the first footnote a sentence that reads "Requests for reprints may be sent to . . ." followed by the full name and mailing address of one of the authors (or the only author). None of the literary journals have this component. Of the linguistic articles, only one has this in the first footnote: it is a psycholinguistic article of multiple authorship! Almost all the literary articles have a bibliographical citation in footnote 1: in Genre, twenty-five out of twenty-six, and the one exception is an article translated from French. 
None of the APR articles and only two of the Language articles have a citation in the first footnote, and these two are both "philological" in the sense that they are dealing with evidence from older written texts. Psychologists and linguists agree in having an expression of thanks as a final part of the first footnote (Language 16/19, APR 13/18); only one of the literary articles has a "thank you" in the first footnote, which has an asterisk and precedes the first numbered footnote. Finally the linguists are unique in having mea culpa components in the first footnote: four of the Language articles have a statement that the author accepts blame for errors and exonerates those whom he has thanked and in one case the funding agencies. Typical phrasings are: "None of these people is responsible for the errors," "remaining errors are my own." Because I am more accustomed to reading linguistic articles than the others, it seemed perfectly natural to have thank you's, acknowledgments of financial support, indications of previous versions of the article, and acceptance of responsibility for errors, and I was quite surprised to find that the literary studies do no thanking in the footnotes, and neither set of nonlinguists includes the attribution of self-blame. This brief characterization does not include all the identifying generic characteristics of even the first footnotes, but it does show how detailed the genre specification can be. I wondered how representative Language was among linguistic journals, so I tried a year's worth of Linguistic Inquiry. To my amazement, the numbers were almost identical.
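
The counts reported in this excursus can be restated as simple proportions. What follows is a minimal sketch in Python and not part of the original study: the dictionary layout and the helper names `share` and `multiple_author_share` are assumptions made purely for illustration, and only figures stated in the text above are used (the two literary journals are pooled, as in the text).

```python
# Authorship counts as reported in the text (one year of articles per journal).
authorship = {
    "literary (PMLA + Genre)": {"single": 44, "multiple": 0},
    "Language": {"single": 14, "multiple": 5},
    "American Psychological Review": {"single": 9, "multiple": 9},
}

# First-footnote expressions of thanks: (articles with thanks, total articles).
thanks = {
    "Language": (16, 19),
    "American Psychological Review": (13, 18),
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def multiple_author_share(counts: dict) -> float:
    """Percentage of a journal's articles that have more than one author."""
    total = counts["single"] + counts["multiple"]
    return share(counts["multiple"], total)


if __name__ == "__main__":
    for journal, counts in authorship.items():
        print(f"{journal}: {multiple_author_share(counts)}% multiple authorship")
    for journal, (with_thanks, total) in thanks.items():
        print(f"{journal}: {share(with_thanks, total)}% thanks in first footnote")
```

Put this way, the contrast among the fields is easy to see at a glance: no multiple authorship in the literary journals, roughly a quarter in Language, and half in the American Psychological Review.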



Register

The history of register analysis is quite different from that of genre analysis. Although it must often have been noticed in the course of history that people speak differently on different occasions depending on who their addressee is, the setting in which the conversation takes place, the subject matter spoken about, the level of formality that the speakers assume, and other factors, the systematic analysis of this kind of variation is relatively recent, and the term register apparently was first used in this sense by a British philologist, T. B. W. Reid (Reid 1956). Reid's article was critical of the then newfangled structural linguistics that seemed to him to be so concerned with the structural characteristics of whole languages that the linguists could not cope with what he called the "register" variation. Although there may have been some validity to Reid's criticism of some structuralist descriptions of the day, it was actually structuralist researchers who began to treat register variation in their linguistic descriptions (cf. Ellis and Ure 1969; Ure and Ellis 1977). It was M. A. K. Halliday who first distinguished dialect variation (reflecting the "user") from register variation (reflecting the "use") (cf. Halliday 1968), and this user/use distinction in types of variation has lasted to the present (cf. Ghadessy 1988; Ferguson 1993). Ironically, Reid's 1956 article is now remembered principally as the origin of the term register in its current meaning, what was surely a casual, almost accidental, choice of term for Reid's original purposes. A few British linguists took up the analysis of register variation in a serious way, notably Jeffrey Ellis, who was most concerned with theoretical formulations, and Jean Ure, who carried out active research on register variation, especially in classroom settings in multilingual communities (e.g., Ure 1988). She is well known for her use of the concept of "lexical density" as a register feature. 
The most thorough treatment of the purposes and methods of register research by these two sociolinguists (Ure and Ellis 1977) never reached the readership it should have had because of delays in publication and the fact that its first appearance (1974) was in Spanish rather than English, French, or German. Most of the British work on register variation is in the tradition of Hallidayan "systemic grammar," and the linguists generally make use of Halliday's three parameters for features of register variation: field, tenor, and mode. (For a standard description, cf. Hudson's Section 2.4 "Registers" in Hudson 1980, 48-55). "Field" refers to the nature of the social activity in progress, including but not limited to the topic being spoken (or written) about. "Tenor" refers to the status and roles of the participants (e.g., doctor-patient, lawyer-client, mother-child, salesperson-customer, etc.). "Mode" refers to the channel or means of communication (e.g., letter/radio broadcast, short story, friendly conversation, planned versus unplanned discourse). Sometimes included in mode is the purpose or goal of the communication, sometimes confusingly referred to as the "functional tenor." Actually, in the more developed version of Halliday, now often called functional systemic linguistics, these three contextual parameters have been incorporated in a much larger system of relationships between language and context of situation which posits the natural determination of contexts and text so the linguistic structures co-evolve with cultural context and communicative function.
For an up-to-date account of the current conceptualization of functional systemic linguistics compare Hasan (1980). (Paolillo's concept of "functional articulation" is rather close to this view of register variation in language [Paolillo 1991].) Comparative and historical studies of register variation as well as accounts of L1 and L2 acquisition of register variation are relatively rare, although one study of L2 acquisition has had wide acceptance. Lavandera (1975) showed that Italian immigrants to Argentina who spoke Spanish as L2 had a narrower range of register variation than native speakers of Argentinian Spanish. This principle of nonnative speakers having reduced ranges of register variation has been widely accepted among sociolinguists without much additional empirical study. Studies of register variation are most often synchronic descriptions of particular "registers" that for one reason or another are regarded as especially interesting or significant. Very little attention is usually paid to placing such a register within the total repertoire of register variation in a speech community, and almost no attention to how the register variation changes over time.2 If modern sociolinguists were looking for causes for "discomfort" with register analysis as typically practiced, they could easily identify two: the lack of attention to diachrony and to total register repertoire. It must be noted that some register analysts deliberately use other terms in place of register: Crystal and Davy (1969) chose the already overextended term style. Gregory (1967) chose diatypic variety. Also, some register analysts identify a different, usually larger, set of parameters of register variation. 
For example, Biber analyzes register variation under three (Biber 1986) or seven dimensions (Biber 1988), each of which comprises a number of linguistic "features" (total of 41 features examined in 1986, 67 in 1988). The three 1986 dimensions were: (1) interactive versus edited text, (2) abstract versus situated context, and (3) reported versus immediate style. The 1988 seven were (1) interactive, exact informational versus affective, generalized context, (2) abstract versus situated context, (3) referential explicitness, (4) overt expression of persuasion, (5) abstract versus nonabstract information, (6) on-line informational elaboration, and (7) academic qualification/hedging. There is relatively good agreement between the two sets of dimensions in terms of both the clustering of linguistic features and the interpretation in communicative functions. The counterparts are roughly as follows:



The Biber method of analysis includes a sophisticated factor analysis for which the data and functional interpretations seem particularly appropriate. Thus, a research study that began with the question of discovering the fundamental differences between spoken and written English ended up suggesting a general model of characterizing different English text types. It needs to be applied to other languages for which similar data (i.e., large, computerized corpora of spoken and written texts) are available, and it calls for the construction of a general theoretical framework of linguistic variation as well as a general method of exploiting such data.

Conclusion

If we summarize the characteristics of studies in genre analysis as opposed to studies in register analysis, we find a striking contrast, an almost complete complementarity.

Register Analysis                       Genre Analysis
purely synchronic                       diachronic perspective
non-evaluative ("descriptive")          evaluative ("prescriptive")
isolated formal linguistic features     ordered component parts
usual practitioners: sociolinguists     usual practitioners: literary scholars
particularist                           universalizing perspective
sometimes comparative/contrastive       typically a single language
usually spoken texts                    usually written texts

These two approaches to variation analysis belong together, and should be included in a comprehensive sociolinguistic framework of variation analysis, in the spirit of Gregory (1967). In short, Aristotle's and Reid's respective concerns might be met by a unified sociolinguistic analysis of variation that would include both genres and register variation and give them both their rightful place in the linguistic and literary theories of the future.

Notes

1. Quotations from Aristotle are taken from Adams 1971.
2. An important exception to the synchronic perspective is the impressive historical study of variation in three English genres by Biber and Finegan (1988).

While the papers in part II deal with variation among individuals and between speech communities in an attempt to discover both the universal and the particular, those papers focus primarily on register and genre variation. The papers in part III examine variation and change at a number of different levels of linguistic analysis. They reflect, as do the previous set of papers, Ferguson's curiosity about the relationship among the universal, the socially conventionalized, and the individual in language, and how they influence language change. The papers raise two important questions also generated by the study of "marginal phenomena" like formated speech and simplified registers: What is the object of linguistic description? And what is the psychological reality of linguistic descriptions? (Ferguson 1982) But Ferguson is also concerned with the social dimension of language structure and use. Every speech community exhibits variation in language use—different kinds of individual, social, and national repertoires. Social issues arise "from changes in those patterns and from people's expectations and commitments about potential changes" (Ferguson 1973, 29). In these papers, we see Ferguson as mediator. While respecting the motivating questions, methodologies, and findings of a variety of disciplines and approaches to the study of language and society, he consistently reminds the reader of questions left unanswered by each. In suggesting extensions of research, he mediates between the linguist and other social scientists, between the traditions of the past and issues in contemporary linguistics, between micro- and macrosociolinguistic approaches, and among the historical linguist, the sociolinguist, the applied linguist, the syntactician, and the phonologist. 
Ferguson and Dil (1979) draw the attention of both linguists and social scientists to a sociolinguistic variable whose investigation has implications for both a theory of sound change and for theories of social diffusion. They illustrate this through a detailed examination of the phonological variable (s) in Bengali. For the social scientist, this variable is an indicator of the influence of education, regional and communal identification, and routes of modernization. To the linguist, the shift from /s/ to /s/ represents a universal tendency based on physiology, language processing features, and communication systems. The shift from /s/ to /s/, by contrast, is a particular tendency based on prestige, cultural innovation, and standardization. For Ferguson, a theory of linguistics must be able to account


for the particular as well as the universal, which involves a consideration of both linguistic influences (e.g., orthography, borrowings, foreign language interference) and social influences (e.g., regional differences, communal identity, and social status). Variation in the routes of diachronic change can also be seen by detailed comparisons of the seemingly same phenomenon in two languages. Ferguson (1990a), for example, traces the shift from /s/ to /h/ to /∅/ in both Greek and Spanish. This identity of variation is only apparent, however, when the linguist takes into consideration lexical, morphological, and syntactic, as well as phonological information. Generalizations of diachronic change are arrived at inductively through this cross-language study. And although this might result in a picture of language change that is less parsimonious than one that works deductively from a search for universals, this does not seem to bother Ferguson, who is more concerned about getting the details right. In fact, we get a sense that he relishes reminding linguists concerned with the elegance of theory of the complicated nature of language. Such cross-linguistic studies of variation can provide similar insights at the morphological level. Ferguson (1991b) puts variationist methodology within a historical perspective. He shows, through an examination of drift toward loss of agreement in English and Swedish, how two cases of the same type of morphological simplification differ greatly from one another in detail. Furthermore, they both involve subtrajectories whose direction of movement could hardly be called simplification or natural change in any existing theory of markedness. Universals of language change can only be uncovered through an understanding of how drifts originate and the various routes they follow. This entails not only historical evidence, but quite possibly evidence from processes of pidginization and second language acquisition as well. 
For Ferguson, the search for universals can only be successful through a careful examination of what is unique in individual cases. This is illustrated in his paper "Then They Could Read and Write" (1990b), an examination of the conventionalization of literacy. Ferguson calls for sociolinguists and ethnographers to take an historical perspective in looking at the "pragmatics of literacy." This includes linguistic choices made in the process of introducing literacy into a speech community, choices such as the language and dialect to be written and the script chosen. But it also includes consideration of the source and scope of literacy and the institutional support for it. Only in this way can we avoid crude dichotomies such as the popular oral/literate one, which obscure the complexity of any given language situation. Another way to examine conventionalization is through empirical research of the variationist sort into that process in progress (Ferguson 1987, 120). Using examples from varieties of Arabic, Ferguson identifies at least three tendencies involved in the spread of favored varieties in standardization: koineization, variety shifting, and classicization. The overall processes involved in language spread are focus (i.e., the institutionalization of prestige forms) and diffusion (the dissolution of norms). Individual and social factors that influence shift include age, sex, perceived similarity of forms to a prestige variety, and evaluative judgments of the nature of the form.
The papers in this section exemplify Ferguson's desire to bridge macro and micro approaches to the study of variation and change. He sees a need for linguists, even for those who study language within its social context, to bring together "the research stream of relatively large-scale analysis of speech community characteristics and the stream of relatively small-scale analysis of individual behavior and dyadic and small group interaction . . ." (Ferguson 1991a, 187). Ferguson (1991a) examines mismatches in agreement in features of politeness in Persian and Portuguese, and illustrates how morpho-syntactic change can result from a stretching of a grammatical pattern to meet communicative needs. Although a widespread phenomenon in human language behavior, it is rarely noted either in grammars of specific languages or in syntactic theory. A combination of macro and micro perspectives on "pragmatic considerations" (i.e., communicative functions of language) will provide a much clearer insight into the process of conventionalization than either approach will alone. But in formulating a theory of language change, including language standardization, the influence of language planning must also be taken into consideration. While Ferguson is optimistic about the linguist's role in influencing the direction of language change (Ferguson 1987, 130), he also recognizes that different paths of development can result as much from different cultural patterns as from the intentions of the linguist (Ferguson 1990b, 17). Methodologically, Ferguson is sympathetic to the quantitative, microsociolinguistic methods popularized by Labov (e.g., 1966; 1972) and others. He also recognizes the important role of macro-sociolinguistic approaches in the tradition of Fishman (e.g., 1968). 
In his own work, however, he seems most comfortable with informant work in the anthropological tradition of descriptive linguists and the careful scholarly examination of secondary sources in the tradition of European philologists. At a time when so much of the focus on linguistic research is on intricacies of the theoretical mechanisms used to generate descriptions of language, it is refreshing to encounter a linguist with Ferguson's detailed knowledge of the intricacies of the structure and use of so many varied languages.

References

Ferguson, C. A. 1973. Problems of variation and repertoire. In M. Bloomfield and E. Haugen, eds., Language as a Human Problem. New York: W. W. Norton.
Ferguson, C. A. 1982. Simplified registers and linguistic theory. In L. Obler and L. Menn, eds., Exceptional Language and Linguistics. New York: Academic Press.
Ferguson, C. A. 1987. Standardization as a form of language spread. In P. Lowenberg, ed., Language Spread and Language Policy: Issues, Implications, and Case Studies (Georgetown University Round Table on Languages and Linguistics 1987). Washington, D.C.: Georgetown University Press.
Ferguson, C. A. 1990a. From ESSES to AITCHES: Identifying pathways of diachronic change. In


13 The Sociolinguistic Variable (s) in Bengali: A Sound Change in Progress? (with Afia Dil)

In recent years William Labov has identified phonological variables such as the variable (r), the degree of implementation of post-vocalic /r/ in the English of New York City, and in a series of studies has shown their importance for understanding the social functioning of language and the mechanisms of linguistic change (Labov 1972).1 The present paper identifies a phonological variable of this kind in Bengali: the variable (s), the phonetic range of sibilance, and shows the value of extensive study of the phenomenon. A number of phonological studies of Bengali have discussed the status of the sibilants in Bengali. Chatterji 1926 provided a considerable amount of basic data, including historical material; Ferguson and Chowdhury 1960 gave additional data and an interpretive summary of the situation; Chowdhury 1960 included several acute observations on the sibilants; and, finally, Dil 1972 gives new data and examples. The languages of South Asia have relatively few fricatives in their sound inventories, and in many instances the only fricative is a single sibilant. However, three types of sibilants occur in the area, and South Asian writing systems generally have separate symbols for them regardless of the phonology of the spoken languages. The three types will be referred to here, following traditional Indic terminology, as dental, palatal and retroflex, and symbolized as s, ś, ṣ, respectively. The first of these is similar to English s and the second and third to varieties of English sh; the symbols s and š will be used here for these two basic types in English and other languages. No attempt will be made to specify the phonetic details beyond this. For a summary of sibilant oppositions in South Asia, cf. Ramanujan and Masica 1969, 567-8; for the difficulties in phonetic specification of these sounds cf. Ladefoged, 47-9. 
This paper originally appeared in Studies in the Linguistic Sciences 9.1:129-37.

In historical studies of Indo-Aryan languages, the status of the sibilants has often been taken as one of the crucial indicators of historical relationship. Thus, for example, the East Magadhan language from which modern Bengali, Assamese, Oriya, and the Bihari languages are descended, differed from other contemporary Indo-Aryan languages in that the three sibilant phonemes of Sanskrit had merged into a single sibilant phoneme pronounced as a palatal "shibilant" except where it was assimilated to neighboring sounds (Chatterji 1926, Pattanayak 1968). It seems likely that variation in the pronunciation of sibilants has been a focal point of language change throughout the history of Indo-Aryan languages from earliest times to the present. It is also likely that variation in sibilance has been a marker of social differentiation in these languages during the same period, but evidence for this is less clear and it has not been the subject of much philological or linguistic study. A thorough investigation of the situation in Bengali may be of value in the study of South Asian languages in general by suggesting patterns of sociolinguistic change at other times and places on the subcontinent, and it may have even broader significance for sociolinguistic theory. Most linguistic, phonetic and pedagogical studies of Bengali deal primarily with Standard Colloquial Bengali (SCB) and pay relatively little attention to the great range of regional, social and communal variation. Although this procedure has some disadvantages in that it tends to obscure certain historical and synchronic relationships among the dialectal varieties, it is nevertheless convenient both because Standard Colloquial is the accepted norm for educated conversation throughout the Bengali-speaking world and because it is the best described variety of the language. The central facts of SCB sibilance are quite clear: the SCB sibilant /s/ has the dental pronunciation [s] before certain dental obstruents, notably /t r/ and tauto-syllabic /n/, while elsewhere, i.e. before other consonants, vowels, or boundaries, it has the palatal pronunciation [š]. 
A more detailed phonetic specification of the two Bengali variants is given in Kostic and Das 1969, 210-22. In addition to these central facts, there are many subsidiary details of variation which at times may even obscure the central facts. The nature of the subsidiary variations will be discussed below under six principal factors: 1) orthoepy, 2) learned borrowings, 3) foreign language interference, 4) regional provenience, 5) communal identity, and 6) social status. 1. Orthoepy. The Bengali writing system has separate graphemes for three sibilants: <s> /d nto 'dental s', <ś> /tal bbo 'palatal s', and <ṣ> /murdhonnas / 'retroflex s'. In Bengali spelling these are rarely interchangeable. Most words have only a single acceptable spelling of sibilants and this spelling is in general etymological, reflecting the spelling of Sanskrit etyma. As may be expected from the central facts of SCB pronunciation, the spelling poses a problem for Bengali school children, and considerable effort is expended to inculcate correct spelling. The discrepancy between orthography and pronunciation is at least marginally relevant to the description of the phonological variable of sibilance in that some speakers on some occasions attempt to "correct" their pronunciation to bring it into agreement with the spelling. This generally means that some attempt is made to distinguish between the dental and palatal sounds. Less effort is put on distinguishing palatal and retroflex, probably because the retroflex letter is rarer than the other two and commonly occurs in clusters, e.g. <st>, in which no contrast is possible with another sibilant in traditional spelling and contrast in pronunciation is possible only marginally in Arabic/English loanwords. This attempt to make one or more distinctions in "correct" speech which are absent in

ordinary language use is, however, relatively unimportant since only very few people attempt it (puristic teachers, Sanskrit scholars, etc.) and even those who do try it actually do so only on certain occasions and with limited success.2 Perhaps of somewhat greater significance is the fact that a distinction is generally made in the spelling of foreign words, such as proper names or actual loanwords. Thus, for example, a word borrowed from English which is spelled with "s" in English and is so pronounced in English will normally be spelled with a dental <s> in Bengali, while one spelled with "sh" and so pronounced will be spelled with a palatal <ś>. More on this phenomenon in the following sections. One other detail of pronunciation may be noted here under orthoepy. Speakers of SCB sometimes use the [š] variant before a dental obstruent at a grammatical boundary so that in some styles of pronunciation some speakers would distinguish between aste 'slowly' with [s] and as-te 'to come' with [š]. This varies along some orthoepic dimension such as 'carefulness of speaking' but may be affected by other factors as well. 2. Learned borrowings. The Bengali lexicon contains many learned borrowings from Sanskrit, so-called 'tatsamas' and 'semi-tatsamas', which are more or less well integrated into the phonological system. Many of these learned borrowings have phonotactic characteristics, such as consonant clusters, which do not obey the constraints of the ordinary vocabulary of the spoken language ('tadbhavas'). This phenomenon is of particular interest for the (s) variable because a number of learned borrowings have initial s-clusters not found in tadbhava vocabulary, and the educated speakers of Bengali who make use of these words normally pronounce them with [s]. Examples include sk ndho 'shoulder', sk ndho 'fall', sp s to 'clean'. 3. Foreign language interference.
It is not possible to provide a reliable estimate of the number of native speakers of Bengali who speak one or more other languages in addition. Weinreich's analysis of Indian bilinguals on the basis of the 1951 census (Weinreich 1957) suggested that Bengal as a whole may be less bilingual than other areas in South Asia, but whatever the exact figures on multilingual Bengalis in India and Bangladesh, the use of English and Urdu (or "Bazaar Hindustani" or some other variety of the family of languages and dialects referred to as Hindi) is widespread among them. The knowledge of these two foreign languages is related to the pronunciation of sibilants in Bengali. Both English and standard Hindi and Urdu have an s-š opposition, quite pervasive, of high functional load, and only rarely neutralizable in English, and of similar though somewhat lesser salience in Hindi-Urdu. Also, some Eastern varieties of Hindi have only a single sibilant which is s-like in phonetic value. Accordingly, many Bengali speakers use the source-language sibilant in at least some of the loanwords of whose provenience they are aware. For example, a common Bengali word for 'movies' is /sinema/ which many Bengalis pronounce with English [s] either very often or always, while other Bengalis will pronounce a Bengali [š]. Any estimation of the extent of this kind of influence from English is very difficult. Since English is taught as a subject in nearly all secondary schools, is the medium of instruction in some schools, and is an important language in higher education, government, commerce, technology and many domains of language use, it might be expected that some preliminary estimates could be made by compiling educational statistics. Even such preliminary estimates would, however, be of little value because one must consider the variety of English spoken in the various settings of English use (cf. Kachru 1969). Some Bengalis carry over into their English essentially the central facts of Bengali sibilant pronunciation, thus having no effective contrast between /s/ and /š/. Others are aware of the contrast in English and use it sporadically or in certain speech styles. An individual observed by one of us made no distinction in normal use in his English although he had an s-š distinction in his Urdu and knew there should be such a distinction in English; he would exemplify the English distinction by pronouncing the two words see and she when this was a matter of discussion, but not otherwise. It is not at all unlikely that individuals may have several styles or registers in both Bengali and English reflected in differential pronunciation of the sibilants. Clearly the situation is very complex but it seems likely that the pronunciation of sibilants in English loanwords in Bengali would turn out to be an excellent indicator of such dimensions as amount of education, identification with "modernizing" trends, and the like. The situation with regard to Urdu loanwords is somewhat different and will be discussed in more detail in Section 5, but it may be noted here that Bengali-Urdu bilinguals are particularly likely to keep their s-š distinction in the pronunciation of Urdu loanwords of Perso-Arabic origin, especially proper names and expressions related to religion, and this has extended far beyond bilingual usage to be identifiable as a feature of Muslim Bengali (Dil 1972). 4. Regional provenience. Millions of Bengalis grow up learning to speak a variety of language which is very close to or identical with SCB, but probably most speakers of the language acquire first a regional dialect which varies in many significant respects from SCB.
Although local dialects tend to be readily intelligible to speakers of neighboring dialects, the extremes of dialect variation within the Bengali-speaking world may be intelligible only with the greatest of difficulty. A full description of the sibilants of local dialects would go far beyond the range of this study but some factors must be noted. An educated Bengali's pronunciation of SCB may have a regional coloring related to the local dialect so that even though he is very fluent in handling the Standard form, the careful observer can detect his region of origin, and the pronunciation of sibilants is one of the most useful diagnostic features in this. The palatal pronunciation of the sibilant was mentioned above as a characteristic of Bengali and closely related languages. In some dialect areas of Bengali, however, sound changes have occurred which have altered this picture considerably. For example, throughout quite a large territory in East Bengal the original sibilant in initial position, and to some extent in other positions, has either become the simple aspirate /h/ or has disappeared completely. In dialects where the sibilant has undergone this change, a new sibilant phoneme has arisen from the voiceless, aspirated, palatal affricate /ch/. In these dialects typically the palatal affricates have become more dental so that the original unaspirated /c j/ are pronounced [ts dz] while the voiceless aspirate has become [s]. The details vary considerably from place to place but this pattern is typical (Chatterji 1926; Ray et al. 1966; Goswami 1961). For speakers of these dialects the acquisition of the Standard Colloquial as the

superposed variety required for education, formal public use and the like results in a range of variation in sibilant phonetics quite different from that of dialects whose phonology is more like that of SCB, and it seems likely that their pronunciation of /c ch j s h/ would turn out to be an excellent indicator of degree of mastery of SCB and amount of regional identification. Since the majority of the speakers of these dialects are Muslims this regional factor is in part confounded with the communal factor described below. 5. Communal identity. The Bengali-speaking world is, apart from a very small minority, divided into a Hindu community and a Muslim community. Even if an individual Bengali has no strong personal commitment to Hinduism or Islam he is still unmistakably marked in many ways as belonging to one or the other. For example, Hindus typically have family names and one compound given name, while Muslims typically have no family name and from three to six given names, and almost all Bengali given names are distinctively Hindu or Muslim. Although there is a strong common core of shared Bengali language and culture, there are pervasive differences between the two communities. Hindu and Muslim Bengalis share a common Bengali language, fundamentally the same in phonology, syntax and lexicon, but there are substantial lexical differences, for example, in kinship terms, in greetings and titles, in expressions relating to food and clothing, and in other semantic areas where there are behavioral differences between the two communities (Dil 1972). One typical pattern of lexical difference is to have a Common Bengali term used by both communities and one or more Muslim variants of the term commonly used by the Muslim community but little used or even unknown among Hindus (e.g. Common Bengali /dim/, Muslim /anda/ and /b eda/, 'egg'). Another pattern is to have two completely different expressions because the objects named differ (e.g.
the items of clothing, Hindu /dhuti/, Muslim /lungi/). In a few instances there are simply two words for the same thing, one Hindu and one Muslim, known to both but used communally (e.g. Hindu /jɔl/, Muslim /pani/, 'water'). In phonology the most striking communal difference is in the pronunciation of sibilants. The Bengali language has a number of words of Perso-Arabic origin which have come into the language, often by the indirect route of Urdu. Many of these words are completely naturalized in Bengali and are used without communal identification. Some are, however, chiefly used by Muslims, and most of the Muslim lexical variants in Bengali are of Perso-Arabic origin. In many Perso-Arabic lexical items such as proper names, certain high frequency words, and items referring directly to religion, Muslim speakers typically have a phonemic difference between /s/ and /š/. They also have an independent /z/ phoneme in words of this kind, while for Hindu speakers [z] occurs only as a variant of /j jh/ before dentals in casual speech. The Hindu-Muslim communal difference in sibilance (i.e. the presence of additional /s z/ phonemes in Muslim variants) is heavily loaded with social significance and very resistant to change even when the attempted change is in the direction of orthoepy. For example, a Muslim speaker may find it impossible to use the [š] pronunciation in a Common Bengali word if the Muslim pronunciation has [s], and the Hindu speaker may find it very difficult to produce the [s] pronunciation in a word when his knowledge of the orthography or foreign provenience



would recommend it. In large part these reactions are unconscious, but sometimes they may be made explicit.3 Clearly this is a fruitful area for investigation of a phonological variable of social significance. 6. Social status. Linguistic variation related to social stratification is widely attested in South Asian languages and has been given systematic treatment by a number of linguists, as in the now classic paper Gumperz 1958, and the articles collected in Ferguson and Gumperz 1960. Most of these studies have used caste stratification as the principal social variable, but recent studies are beginning to pay more attention to the relation between degree of education and caste ranking in speech behavior, e.g. Pandit 1969. For Bengali, there have been almost no studies of language attitudes (Mukherjee 1976, Singh 1976). Because of the lack of information, it is impossible to discuss social variation in sibilance with any precision. On the basis of casual observation by the two authors, however, one important remark can be made: in a variety of settings, rural and urban, dental [s] pronunciations are used for SCB palatal [š] by speakers of lower socioeconomic status, and efforts are made in classrooms and other prescriptive settings to alter this pronunciation toward the Standard. It seems that this dental pronunciation represents an incipient sound change, by which the historical palatal sibilant is becoming a dental one. This change probably reflects a "language universal" in the sense of a phonological tendency likely to appear at any time when the conditions make it possible. The universal may be stated as follows: When there is only one sibilant phoneme in a language, its principal allophone or clarity-norm variant will tend to be [s]-like, i.e. alveolar, with relatively high-pitched noise in the spectrum (Jakobson 1968, 55).
Accordingly, if, by some kind of sound change, the only sibilant in the language has come to be palatal, retroflex, labialized or in some other way phonologically marked, this sibilant will tend to be replaced by something more like a simple [s]. This sound change in Bengali, if it is taking place, would be working its way up from lower social strata, i.e. it would be a change 'from below' like the raising of front vowels in New York English (for general discussion cf. Labov 1972, Ch 9).

Conclusions

The kind of sound change represented by Bengali š > s should be recognized as a common type of change4 and carefully investigated where it seems to be in operation. The distinguishing characteristics of the type are three:

a. A phonetically 'marked' phoneme, which has somehow arisen in a language, loses its marking; it either merges with its unmarked counterpart in the language or it changes its phonetic value without merger.
b. The change begins in a relatively low social stratum and spreads throughout the community.
c. The change is stigmatized by speakers of a prestige dialect, and the normative influences of education and social advancement work against the change.

An almost exact parallel sound change is the shift of interdental fricatives to stops (θ > t, ð > d). The interdentals are marked and when they come into existence in a language they tend to be replaced by the less marked stops, as has happened in Arabic and other Semitic languages and in most of the Germanic languages. This change seems to be taking place in contemporary English. The process has been best described for New York English, where the stop values are most frequent in lower middle class pronunciation and in casual speech and are heavily stigmatized (Ferguson 1978). Research on the Bengali phonological variable (s), the phonetic range of sibilance, is promising both for the linguist who wants to understand the processes of sound change and for the social scientist who wants to understand Bengali social change. Such research would

a. Provide insights into the distribution of social forces and the processes of social change in Bengali society. The (s) variable is a sensitive indicator of the influence of education, the degree of regional and communal identification, and the routes of modernization.
b. Deepen our understanding of linguistic change by observing the interaction of opposing phonological trends. One (š > s) is a universal tendency based ultimately on the shape and functioning of the human vocal tract and universal features of language processing and communication systems, and the other (s > š) is a particular tendency based on lines of prestige, cultural innovation, and standardization operating within the Bengali-speaking world.

Notes

1. For clarity of presentation phonemic transcriptions are enclosed in slant lines /s/, Bengali graphemic transcriptions in angle brackets <s>, English orthography in double quotes "s", and phonetic transcriptions in square brackets [s].
2. Similar phenomena are found in many literate speech communities throughout the world. Other examples from Bengali include the attempt to pronounce aspirated consonants in cases where the aspiration is either optional or totally absent in the spoken language.
3. This same kind of resistance to change across a communal difference may be seen in the Bengali variable (r), the presence of contrasting /r ṛ/ in Hindu Bengali and their merger to /r/ in Muslim Bengali (and in some regional varieties of Hindu Bengali). Muslim speakers who lack the r-ṛ distinction find it difficult to perceive and produce this difference despite its presence in the orthography and acknowledged correctness in the Standard. In both the (s) and (r) variables the communal difference is complicated by regional and social factors, but it is nevertheless highly significant.
4. Kroch 1978 offers a description of this kind of change and seems to regard it as the principal type of sound change. Kroch's description is very useful, but it seems unlikely that this is the predominant type of sound change. Some innovations are not stigmatized, prestigious groups may be linguistically innovative (Labov 1972, ch. 9), and sound change need not be 'natural' (Ohala 1978).


14 Standardization as a Form of Language Spread

The Study of Standardization in Progress

'Language spread' is a useful and stimulating concept that seems to orient and unify many lines of sociolinguistic research, and in this paper it will be used to organize some thoughts about the process of language standardization and the kinds of research most likely to lead to better understanding of that process. Probably like other papers at this Georgetown University Round Table on Languages and Linguistics, this paper will start from Cooper's definition. By 'language spread' is meant 'an increase over time in the proportion of a communicative network that adopts a given language or language variety for a given communicative function' (Cooper 1982:6). By 'language standardization' is meant 'the process of one variety of a language becoming widely accepted throughout the speech community as a supradialectal norm, the "best" form of the language, rated above regional and social dialects, although these may be felt to be appropriate in some domains' (Ferguson 1966:31). It seems clear from these two definitions that standardization must be a type of language spread, characterizable in terms of the variables included in the respective definitions. Presumably, one could analyze particular historical instances of standardization in terms of the speech communities and communication networks involved, the languages and language varieties at issue, and the communication functions fulfilled by the spreading variety and by the varieties being supplemented or replaced. Presumably, one could then proceed to generalize across large numbers of instances, in the spirit of typologically based language universals research, and thus progress to a general model or set of alternative models valid in general for language standardization.
Indeed, this is the approach suggested in Ferguson 1966, proposed more explicitly and with greater elaboration in Malkiel 1984, and essentially adopted in such comparative collections as Guxman 1960 and Scaglione 1984.

This paper has been reprinted from Peter Lowenberg, ed., Language Spread and Language Policy: Issues, implications, and case studies (Georgetown Round Table on Languages and Linguistics 1987), by permission of Georgetown University Press.
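Cooper's definition, quoted above, is at bottom a statement about a proportion changing over time, and it may help to see it operationalized. The following Python sketch is an illustration only: the data layout (each network member as a mapping from communicative function to the variety used), the function names, and the toy data are all assumptions introduced here, not part of Cooper's or Ferguson's work.

```python
# Hypothetical illustration of "an increase over time in the proportion of a
# communicative network that adopts a given variety for a given function".

def adoption_proportion(network, variety, function):
    """Share of network members who use `variety` for `function`.

    network: list of dicts, each mapping a communicative function
             (e.g. "schooling") to the variety that member uses for it.
    """
    if not network:
        raise ValueError("network must contain at least one member")
    adopters = sum(1 for member in network if member.get(function) == variety)
    return adopters / len(network)


def has_spread(snapshots, variety, function):
    """True if the proportion is higher in the last snapshot than in the first."""
    first = adoption_proportion(snapshots[0], variety, function)
    last = adoption_proportion(snapshots[-1], variety, function)
    return last > first


# Toy data: the same four-member network observed at two points in time.
earlier = [{"schooling": "local"}, {"schooling": "local"},
           {"schooling": "standard"}, {"schooling": "local"}]
later = [{"schooling": "standard"}, {"schooling": "local"},
         {"schooling": "standard"}, {"schooling": "standard"}]

print(adoption_proportion(earlier, "standard", "schooling"))
print(adoption_proportion(later, "standard", "schooling"))
print(has_spread([earlier, later], "standard", "schooling"))
```

Keeping the function of use as an explicit argument mirrors the definition: a variety may spread for one function while receding for another, so a single overall proportion would blur exactly the distinction that matters for standardization.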



Certainly, I am not going to reject this approach, since it offers a promising direction of research that can bring deeper understanding of the social factors at work in standardization as well as in the converse process of disintegration or dialect differentiation. We can be reasonably sure that if we choose the most diverse instances (to get an idea of the limits) and try for representative instances of hypothesized typological categories, we will find some useful generalizations, even if the accidents of history often fail to provide the kind of documentation needed to reconstruct the path of spread at the requisite level of detail. The approach proposed in this paper, however, is that of empirical research, of the variationist sort, in language situations of standardization in progress. Just as the variationist perspective has been fruitful in the study of language change in general, it is likely to provide new insights for students of standardization and language planning.

Tendencies Apparent in Standardization

First, let us note that the spread of a favored variety in standardization is always a more complex process than the definitions suggest, and at least three tendencies are apparent. One tendency is 'koineization' or the reduction of dialect differences, both by dialect leveling, i.e. the avoidance of salient markers of particular dialects, and by simplification, i.e. the reduction in inventory and regularization in alternations that in other contexts is an aspect of pidginization. One well-documented component of koineization is the avoidance of 'stigmatized' forms, i.e. forms that for one reason or another have come to be regarded as 'bad' or 'wrong', marking disfavored social groups or occasions of use. A second tendency is 'variety shifting', in which specific linguistic features come to be viewed as marking identity with particular social groups ('dialect shifting') and particular communicative functions or occasions of use ('register shifting'), and individuals adopt such features as part of their 'acts of identity' in producing utterances (LePage and Tabouret-Keller 1985). When this variety shifting is tending toward the spread of a supradialectal norm it is, of course, standardization par excellence; if it is tending toward fragmented norms it is dialect diversification. A third tendency is 'classicization', or the adoption of features considered to belong to an earlier prestige norm. This spreading may be from spoken forms identified by the hearer as belonging to the targeted norm, or directly from written texts representing that norm, or from the speaker's own innovations attempting construction of the presumed norm. In the context of the so-called 'creole cycle,' this form of spread is the decreolization phase.
All these tendencies may appear simultaneously, sometimes even in the use of the same form by the same person, but they deserve analytic autonomy because of their different social dynamics and different sociolinguistic outcomes. The intensity of their operation and the degree of group consciousness of these tendencies can be interpreted in terms of the overall processes of 'focus', by which institutionalization of prestige norms takes place, and 'diffusion', by which such norms are dissolved (for this terminology, cf. LePage and Tabouret-Keller 1985:187).

Standardization and Differentiation

In a speech community or set of linguistically related speech communities that remain(s) in place over long periods of time, it is possible to have a succession of periods of focus with standardization and periods of diffusion with dialect differentiation. It is also possible, of course, to have a period of focus which results in dialect differentiation and separate standardizations, if the focus is in terms of smaller communities rather than the overall community. The paradigm case of the former type of successive periods of standardization and differentiation is the Egyptian example of the 'standardization cycle', as recently reviewed and interpreted in Greenberg (1986). His characterization of this four-thousand-year cycle is worth citing in full:

An originally basically unified language develops regional dialects which if unimpeded will diverge in the course of time into mutually unintelligible languages. However because of social and political factors one of the dialects, in modified form, becomes the basis of a new common language, a koine which tends to supersede the original dialects. In a community with writing the common language acquired the additional prestige which accrues to literary use. In the course of time the spoken koine develops local dialects so that ultimately, if linguistic unity is to be preserved, a new common language must develop on the basis of a dominant dialect of the old koine (Greenberg 1986:273).

The paradigm case of the second type, that of successive periods of standardization resulting in separate local standardizations, is the example of Latin and the Romance languages, as generally recognized in the standard handbooks and introductions to Romance linguistics. For example, Elcock 1960 notes the stylistic differences between 'urban' and 'rustic' Latin passing into the 'learned' and 'popular vernacular' Latin of the Middle Ages and then 'From the pattern of unrecorded Romance as it must have been in the ninth century, certain local speeches, widely separated in the limited geographical concepts of the time, were to assume the role of standard languages' (Elcock 1960:334). The accounts of such historical instances are helpful as summations of countless individual events over considerable periods of time, and in this respect are like the neogrammarian 'sound laws' which summate complex verbal behaviors over time but do not elucidate the interactional mechanisms that lead to the regularities. We are fortunate in having a growing number of detailed studies of standardization in progress in one of the world's largest speech communities: the Arabic-speaking 'nation' al-'ummah al-'arabiyyah, which includes a score of sovereign nations from Morocco to the Persian Gulf. This 'super' speech community has long been regarded as a typical example of the language situation of diglossia in which there are two functionally distinct norms, a superposed H variety, the Modern Standard Arabic (MSA), and a mother-tongue L variety, Colloquial Arabic, which exists in a series of local vernaculars (Ferguson 1959a, Altoma 1969, Diem 1974). Almost everywhere in the Arab world, however, communicative tensions that arise in the diglossia situation are resolved by the use of 'relatively uncodified, unstable, intermediate forms of the language' (Ferguson 1959a). Some observers identify a number of intermediate levels ranging from the MSA or a more traditional Classical Arabic to a 'plain colloquial', 'vernacular', or 'colloquial of the illiterate' (e.g. Blanc 1960, Badawi 1973, Meiseles 1980). Other observers (e.g. El-Hassan 1978, Mitchell 1980, Mahmoud 1984) posit a single intermediate variety, Educated Spoken Arabic (ESA) which is, in Mitchell's words, 'created and maintained by the constant interplay of written and vernacular Arabic' (Mitchell 1986:13), and has within it a range of variation [± formal], and if [-formal] then either careful or casual (Mitchell 1986:17). The rapid changes now taking place in Arabic are the focus of study by an impressive number of sociolinguistically oriented researchers (cf. Daher 1986 for a review of some of this literature; note especially Jernudd and Ibrahim 1986a, 1986b), and we are in the favorable position of being able to follow aspects of the three standardizing tendencies in operation. In Ferguson (1959a) the tentative prognosis for Arabic was slow development over two centuries toward several standard languages, more or less like the model of Latin and the Romance languages. In Ibrahim and Jernudd (1986) the confident prognosis is for 'the emergence of a new, international koine . . . compatible with emerging national or subregional dialects of what will remain one Arabic,' more or less like the standardization cycle of the Ancient Egyptian language.
In view of the inexact terms of the two prognoses and the present state of our knowledge, it is not clear in what ways the two prognoses differ, let alone which one of the two might be valid. Much more important at this stage than defending alternative prognoses is careful analysis of the language behavior of Arabic users, not only with the more general sociolinguistic goals of understanding the functioning of language in its social context and understanding the sources, paths, and outcomes of language change, but also with the more specific sociolinguistic goal of understanding the particular process of language spread called standardization.

Toward a Frame of Reference

The remainder of this paper will be devoted to examining examples of the three standardizing tendencies I have identified here. Before proceeding, however, it is worth noting that linguists who are focusing on one part of the range of variation typically pay so little attention to other parts of the range that a full picture of the variation and directions of change never appears. Thus analysts of change in MSA (e.g. Blau 1981) generally leave out of consideration not only the dialects but even the 'various mixtures of Modern Standard Arabic with the local dialect' (Blau 1981:247). Analysts of the vernacular (e.g. Holes 1983) consider the 'crosscutting influences of the locally prestigious dialect . . . and of the supradialectal variety of Arabic (MSA)' (Holes 1983:437-38), but do not attempt to describe unmistakably 'educated' spoken Arabic or oral or written MSA itself. Analysts of Educated Spoken Arabic (e.g. Mitchell 1986) leave out of consideration both the 'spoken prose' end of the range of variation and the 'speech of the uneducated and of the illiterate' (Mitchell 1986:14) at the other end. Sociolinguists studying language change must be grateful for the detailed studies of Arabic now at last appearing, but models of change of the attempted generality, for example, of Milroy and Milroy (1985) cannot be adequately formulated or tested in the Arabic-speaking world until both a fuller range of variation is investigated in at least some region and the verbal interaction of social networks of language users is recorded and analyzed in much more detail.

Koineization

Leveling in the sense of avoidance of disfavored alternatives is harder to document than the adoption of preferred alternatives, which is treated here under variety shifting. Nevertheless, observers have noted that speakers of Arabic tend to avoid particular lexical, phonological, and grammatical characteristics that they feel would identify them as speaking a too local or too uneducated Arabic or that they fear will be misunderstood or regarded as comic by their listeners. Thus Blanc (1960) notes that Iraqi speakers conversing with Arabs from other areas avoid using timman for 'rice' or 'aku 'there is, there are'; that Aleppo speakers avoid their strong local 'imalah (in many words they have i for a). Similarly, Moroccans conversing with Eastern Arabs may avoid hut 'fish', khal 'black', shal 'how much', and other salient localisms. Those Syrian and Lebanese speakers who have long vowels in the masculine singular imperative may avoid this form in talking with other Arabs. Such examples are easy to multiply in this anecdotal way, but careful investigation of the avoidance phenomenon is badly needed. Equally interesting are the cases where a severely local form is not avoided, as in Blanc's observation that the Syrians persisted in using fi(h) 'there is, there are' in inter-Arab conversation even though it is non-MSA whereas, as pointed out, Iraqis seem to avoid 'aku (and Moroccans kain). Perhaps the Syrians are secure in knowing that Egyptians also use fi(h). Such hierarchies of preference are not at all well understood, and we can hope that some of them will be explained in detail as research continues. The other aspect of koineization, simplification, is somewhat easier to specify, even if actual processes of spread are not clear. Many authors have listed simplifying changes in varieties of Arabic.
For example, Ferguson (1959b:619-20) identified a set of specific simplifying trends as a 'drift' or general direction of change in Arabic, and excluded them from the argumentation for the formation of the koine that was hypothesized in that article as the source of most modern dialects. A recent study of the Arabic of northeast Arabia and the river valleys of Iraq and Khuzestan (Ingham 1982) devotes a whole chapter to 'reductional changes'. Versteegh (1984), which takes a more extreme view than Ferguson (1959b) by positing a pidgin as the source of most modern dialects, lists numerous examples of simplifying processes.



The following are typical examples of simplification in Arabic that appear in various lists.

Phonology:
  Mergers: j merges with y; d merges with d; i,u merge to .
  Shortenings: unstressed long vowels shortened; long in closed, nonfinal syllables shortened.
  Cluster elimination: 'helping vowels' inserted in word-final two-consonant clusters.

Morphology and syntax:
  Inflectional reduction: dual lost in verbs, adjectives, nouns; fem. pl. forms lost in verbs and pronouns.
  Pattern reduction: trend to 'strict' agreement between subject and verb, eliminating difference between pre- and postverbal patterns.
  Pattern regularization: stem and affix alternation (indicative, subjunctive, jussive, imperative; m. f. sg. pl.) of 'hollow' verbs unified.

Insofar as such simplifications appear in various localities or social strata and spread throughout the Arabic-speaking world, they are part of the koineizing tendency of a pan-Arab language standardization process. Insofar as they are more locally accepted and differentiated from one another, they represent the development of regional standardization processes (e.g. j/y merger in the Gulf region, partial mergers of masculine and feminine second person singular pronoun forms and verb forms in Lebanon and Morocco).

Variety Shifting

The adoption of preferred variants is the essence of standardization, as indeed it is the essence of all language change. The problem is to discover in each language situation why certain variants are preferred by certain users under certain conditions and how such lines of preference move through the speech community and result in shared irreversible systemic change. Sociolinguists also want to discover the linguistic and social constraints that operate in general in these processes of change, i.e. to discover the principles that explain possible outcomes of different language situations. The process of standardization, as a special case of language spread, which is in turn a type of language change, is a process of convergence whereby differently favored variants in different sectors of the community or on different occasions of use come to coincide. In the Arab world a number of recent studies of such convergence agree in showing that the dominant lines of convergence are toward regional standards, namely, prestigious urban educated speech patterns of various communicative centers, rather than toward a single unified prestige norm for the Arab world as a whole. The simplest illustration is the frequently discussed phonological variable (q). The almost universally accepted norm of MSA pronunciation is a voiceless uvular stop [q], but in many parts of the Arab world a regional prestige norm has a



voiced velar stop [g] or a glottal stop [ʔ]. Local reflexes of Old Arabic /q/ include not only these three sound types but also velar and postvelar (nonuvular) stops and several affricates. Wherever careful, detailed variation studies have been undertaken, the local pronunciation is being displaced by the relevant regional standard [g] or [ʔ], and this holds true even where the original local reflex is [q]. Thus in Bahrain the low prestige rural Shiites who have traditionally said [q] are clearly shifting to [g], which is the pronunciation of the urban educated Sunnis (Holes 1986). In Amman, which is in the process of becoming the center of a new regional standard, a variety of local reflexes, including [q], are yielding to [ʔ], the reflex of urban Palestinian and Syrian Arabic, a major component of the newly evolving Amman urban dialect (Abd-el-Jawad 1986). This example of the reflexes of Old Arabic /q/ moving toward regional standard prestige norms rather than the pan-Arab MSA norm (cf. Ibrahim 1986) is not a new phenomenon, having been noted for decades. It was reported in passing, for example, in Cantineau's descriptive grammar of the Palmyra dialect (Cantineau 1934). The Colloquial reflex of /q/ in that isolated local dialect was [q] and this was clearly the dominant pronunciation among the 6,000 inhabitants of the town. Cantineau noticed, however, that two other pronunciations were in use: the [g] of the surrounding beduin and those young Palmyrene men who admired the skills and bravery of the beduin and the motorized camel corps in the area; and the [ʔ] of Damascus, the national capital, used by a handful of urban outsiders in Palmyra and by some Palmyrenes who identified with school, government, and 'middle class' values on the national scene.
Cantineau did not provide much information on actual language use of individuals and social groups, so that it is now not clear whether these aberrant pronunciations were transitory phenomena limited to particular words or particular occasions of use, or whether they represented the beginning stages of a set of changes spreading through the community. The important point here is that the MSA/Classical /q/ was yielding to local prestige norms at that time and place as it is now doing in large areas of the Arab world. Other phonological, morphological, and syntactic examples of variety shifting toward regional standards can be cited. Although careful studies of these phenomena are not yet plentiful, a large proportion of the language change in progress in the Arab world seems to fall in this category. It must be noted, however, that the /q/ example is somewhat misleading in its apparent simplicity. The factors of age, sex, agreement with MSA, and several kinds of evaluative judgments (e.g. 'old-fashioned', 'tough, daring') all interact in complex ways, along with lexical variables such as the marking of particular words as foreign, local, formal, and the like (Holes 1983 presents a revealing account of a set of Bahraini phonological variables undergoing shifts of this sort).

Classicization

The adoption of variants from a superposed, formal, traditional norm at the expense of local dialectal variants is often an important ingredient in the process of standardization, and, in the case of Arabic, dialectal convergence with MSA or Classical is a well-attested phenomenon. This classicizing tendency is most obvious in lexicon, either in borrowing words into Colloquial or in classicizing dialectal words by restructuring them phonologically and/or morphologically. Thus /zami'a/ 'university' in Syrian Arabic is recognizable as a classicism by syllable structure and vowel quality (a corresponding original Colloquial form would be /zam'a/); /jami'a/ 'university' in Bahrain Arabic is recognized as a classicism by having invariant /j/ instead of the j ~ y, which is the local Colloquial counterpart of Classical/MSA /j/. The range of stylistic variation utilized in the pronunciation of individual words is very great, and the factors affecting the choice of variants are complex. A simple example is the word for Egypt or Cairo in many dialect areas, where the pronunciation /ms(i)r/ is understood as MSA. The usual pattern seems to be to vary the pronunciation depending on the educational level of one's interlocutor, the formality of the occasion, etc. Some speakers may have only /mas(i)r/ in their verbal repertoire, but no one seems to use the MSA variant exclusively. The frequency of use of the two pronunciations also varies by degree of education and social status of the speakers, reminiscent of the phonological variables explored by Labov. The situation differs, however, in that this is not a general a~i alternation but is limited to the particular word, and the MSA variant is part of a codified norm found in various sources of authority, including dictionaries, whereas the other variant is not given such overt recognition. Mitchell (1986) offers a number of more complex examples of this kind of stylistic variation in phonology, morphology, and lexical suppletion. Classicization is convergent in principle when there is a single codified classical norm.
This is largely the case in Arabic, since the superposed MSA is much more homogeneous throughout the Arab world than the local dialects and there are no overall classical norms competing with it; fluctuation within MSA is to all intents and purposes part of the variation discussed in this paper. In the Arabic situation, however, and in many other situations of standardization, the classicizing tendency is much more lexical and formulaic in implementation and much more stylistically variable on a formal-informal or literary-colloquial dimension than the interdialectal koineizing variety shifting.

Conclusion

The frame of reference suggested in this paper and the examples cited may in the long run turn out to be inaccurate, inappropriate, or wrongheaded (although the author sincerely hopes otherwise), but they will have served the intent of the paper if they point up the value of the study of standardization in progress as opposed to historical cases of standardization. The framework seems more or less applicable to earlier instances, and in some cases crucial features of convergence seem to have been adequately identified and significant trends of koineizing, variety shifting, and classicization have been described. Also, in some cases alternative possible outcomes have been the subject of informed and insightful speculation. But in no case do we have data of the richness of present-day variationist studies of language change, and thus in no case can we really see the process of standardization from the perspectives of universal constraints and the transition, embedding, evaluation, and actuation problems (Weinreich, Labov, and Herzog 1968). This year, 1987, is being celebrated in many countries as the bicentennial of the birth of Vuk Stefanović Karadžić, the great Serbian 'language reformer', and the case of Serbian standardization is one that has been discussed in countless books and articles over the years. In comparing it with Arabic, one notices immediately the diglossic situation with which it began: the H variety of so-called 'Slavo-Serbian' and the L variety consisting essentially of local South Slavic dialects ancestral to modern Serbo-Croatian. Some people at the time favored the development of the H variety as the standard, some favored a so-called 'middle style', and some, including Vuk, favored the spoken language of the people. Eventually, the last is what won out. Nowadays one rarely considers what might have happened if the process of standardization had resulted in a common South Slavic standard which included all the varieties ancestral to modern Bulgarian, Serbo-Croatian, Slovenian, and Macedonian. On the other hand, discussion continues about the viability of the bimodal standard of Serbian and Croatian. The parallel with Arabic is far from complete, but at least we may note that South Slavic as a whole, like the Arabic 'nation' as a whole, had no obvious economic, cultural, and political center that could serve as the source of a new, unifying standard. The one element in the Serbian picture that has not been mentioned in the case of Arabic standardization is the existence of conscious language planners or 'reformers', such as Vuk, whose persistent advocacy and successive publication of folk poetry, grammar, dictionary, and New Testament translation had an important influence on the whole standardization process.
Any full-scale study of Arabic standardization would have to include investigation of such deliberate attempts to influence the outcome, and we are fortunate to have already a few scattered studies of this kind (e.g. Benabdi 1986), but these studies do not include data on the effects of the planning. Unfortunately, variationist sociolinguists interested in language change generally ignore this issue, although it seems reasonable to assume that any 'theory of language change [including language standardization] is incomplete if it does not allow for the possible influence of language planning' (Ferguson 1983). The discussion of the present paper is intended to lead to a single programmatic conclusion. If we want to understand the standardization process, as an important type of language spread and hence of language change in general, the most obvious research strategy is to study the process in operation, and the most obvious way to do so is to collect detailed data in the variationist tradition as a kind of baseline and add subsequent comparable data at regular intervals in the future. Standardization in Arabic is one such situation to study.

15 From ESSES to AITCHES: Identifying Pathways of Diachronic Change

The first great discovery of modern linguistics was that sound change tends to be regular and by tracing the pathways of particular sound changes it is possible to trace the communicative continuities in language behavior that have come to be called genetic relationship among languages. A subsequent insight was that any language variety, whether of an individual or of a social group, tends to have regularities of internal relationships and dependencies that constitute a linguistic system, different in patterned ways from the systems of all other human language varieties. Finally, in recent years, linguists have emphasized the universal characteristics of both synchronic systems and diachronic changes, characteristics that underlie or constrain the variability of human languages. Joseph Greenberg's has been one of the clearest voices insisting on the necessary complementarity of diachronic and synchronic considerations and the importance of the empirical search for typologically based universals of human language (cf. Greenberg 1966, 1978). The present paper is an attempt to explore some aspects of sound change in the spirit of Greenberg's theoretical position. One of the most powerful tools in the armamentarium of linguists engaged in the study of diachronic phonology is the often implicit notion that some changes are phonetically more likely than others. Thus if a linguist finds a systematic correspondence between [g] and [dʒ] in two related language varieties, it will be reasonable to assume that the stop is the older variant and the affricate the younger one until strong counterevidence is found. The linguist makes such an assumption because experience with many languages has shown that the change of [g] to [dʒ] is fairly common and tends to occur under certain well-documented conditions whereas the reverse change is unusual and problematic.
Reprinted by permission of John Benjamins Publishing Company, Amsterdam, from Studies in Typology and Diachrony, edited by William Croft, Suzanne Kemmer, and Keith Denning (1990).

This line of argumentation has been employed, either explicitly or implicitly, since the earliest days of modern historical linguistics.1 Because of the importance of this methodological tool, one might expect that general treatises and introductory textbooks on historical linguistics would devote considerable space to a presentation of the relative probabilities of various possible sound changes, as well as explanatory factors accounting for them. Also, because of the centrality of alternations and processes to the field of phonological theory, one might expect that general treatises and introductory textbooks in phonology would devote considerable space to this topic. Unfortunately, authors of books on historical linguistics or phonological theory have a great deal of other ground to cover, and this simple but important concern tends to be neglected. As a consequence, inexperienced students given synchronic or diachronic problems for analysis are often delayed in finding appropriate solutions because they lack a "feel" for what is plausible; they have no place to turn for a convenient list of possible changes with indications of relative probabilities and conditions favoring or disfavoring them. Some explicit, systematic attention given to this topic might do more than give assistance to inexperienced students, however. It might even contribute to the construction of more detailed and more explanatory models of phonology and phonological change. This brief paper will present some general statements about diachronic changes involving sibilants and h-sounds, generalizations of the kind that experienced linguists tend to make use of but not to collect and present. Similarly illustrative generalizations could be made not only about other segments but about syllable types, harmonies of various kinds, accents, and a broad range of phonological phenomena, including so-called 'low level' rules. I am starting with two assumptions that are intended to reflect the Greenberg position. One is that any synchronic phonological phenomenon represents a time slice in diachronic change, i.e. there is a source from which it has come and a range of possible outcomes toward which it is moving.
And this assumption is held to be equally valid whether the phenomenon is regarded from the perspective of elements in a system or relations between elements, in more current terminology 'representations' vs 'rules' (Anderson 1985). In spite of what seems to me the obvious value of such an assumption, linguists often ignore or disregard its validity in respect of particular phonological phenomena. Lass, for example, expresses a negative view succinctly: 'a segment does not know where it came from' (Lass 1984:178). Yet just the opposite view seems more useful as a working assumption. Every segment provides, by its phonetic characteristics, allophonic variation, phonotactic limitations, morphological alternations, relative frequency, and a host of details of acquisition, dialect variation, etc., a set of clues as to its source and its possible directions of change, and in some instances it offers a remarkably clear picture of its diachrony. A single, noncontroversial example will clarify this point. If a language variety has a voiced labiodental fricative as a phonological segment, two of the most likely sources are (a) the strengthening of a bilabial semivowel /w/ and (b) the phonologization of voiced allophones of a voiceless labiodental fricative /f/. French and English may be taken as examples of (a) and (b) respectively as principal sources of their /v/s, although both have additional sources as well; both languages also have a contrasting voiceless /f/. In such cases one might expect that the incidence of initial /v/ would be much less in the (b) language than in the (a) language simply because initial position is an unlikely locus for previous allophonic voicing but a likely locus for fortition. This is, of course, an oversimplified account of the issues, but it is enough to point to the usefulness of the assumption.2 It should come as no surprise that English-learning children acquire their /v/ relatively late whereas French-learning children acquire their /v/ earlier by virtue of the greater number of instances of initial v- in common words likely to be used with children (Ingram 1986). The children do not know the history of their respective /v/s, and the segments do not 'know' their own history, but the segments display characteristics that reflect that history and under favorable conditions can be used to reconstruct it. The second assumption is that the course of any language change proceeds within universal human constraints: perceptual, articulatory, cognitive, social. Although any change in language structure or language use occurs within particular systemic contexts, the course of the change is multiply constrained by universal factors so that some patterns of change tend to be more highly favored than others under specifiable conditions. This assumption is the warrant for empirical research looking for inductive cross-language generalizations.

Sibilant to Aspiration

Diachronic change of sibilant to h-sound is attested for a number of languages, and synchronic alternation of [s] and [h] in a number of languages may plausibly be interpreted as such a change in progress. The questions to be asked are: What are typical sources and outcomes of [s] and [h] in human languages? and: What are typical pathways of change from [s] to [h]? These questions are not new. They were asked decades ago, and Grammont's work is a good example of an attempt at some generalizing answers in the period before the modern explosion of data from the world's languages and the development of detailed formal models of phonology (Grammont 1933/1950:164,188,192,206). Merlingen 1977 (191-198) is a good example of a more recent treatment that makes use of data from many languages and pays attention to phonological theory, although his own model is, as he notes, "in starkem Gegensatz zu den gerade herrschenden Strömungen" (p. III). When the change of [s] to [h] is mentioned in recent general or introductory books of phonology or diachronic linguistics, it is usually in connection with a discussion of weakening processes or strength hierarchies (e.g. Hooper 1976:217-218; Hock 1986:81). None of these discussions, however, state solid cross-language [s] to [h] generalizations from a general perspective.

Sources and Outcomes

A voiceless sibilant, i.e. a voiceless dental or alveolar fricative of the [s] type, is very common in the world's languages: over 80% of the 317 languages in the UCLA Phonological Segment Inventory Database (UPSID) have an [s] in their phonemic inventory, and of the 37 UPSID languages with only a single fricative, that fricative is an [s] in 31 (Maddieson 1984:44). On the whole these sibilants are relatively stable. Insofar as evidence is available on their history, an [s] generally continues an earlier [s] over long periods of time. New instances of [s], allophonic or distinctive, may come from various sources, such as: simplification of an affricate (e.g. ts > s), devoicing of a [z] (e.g. in word final position), affrication of a [t] (e.g. t > ts > s or tt > tst > ss), 'reversion to the unmarked' (e.g. > s) or, of course, from borrowing. Furthermore, [s] may change, among various possible outcomes, to a [z] (e.g. intervocalic voicing), to some kind of [r] sound (e.g. intervocalic s > z > r or rs > rr), to a hushing sibilant (e.g. palatal or retroflex assimilation), or to an aspirated stop (e.g. Sanskrit asti > Pali atthi). Overwhelmingly, however, [s] persists. The single best attested change of [s] is probably the weakening to [h] and related phenomena, the topic of this paper. In phonological development in childhood, an /s/ is generally acquired fairly early, often after the first several stops and nasals; if the language has both an /f/ and an /s/ the /s/ may be acquired earlier as a phonologically distinct segment, but later as a phonetically fully adequate articulation (Ferguson 1978a). The glottal fricative [h] is also common in the world's languages: over 60% of the UPSID languages have an [h] in their inventory. Unlike [s], [h] is a highly unstable sound diachronically. It comes from many possible sources. Lass 1980 hazards the estimate that 'the majority of /h/ in present-day languages can be traced back to the lenition of other obstruents' (179). Probably the commonest source of [h] is an earlier voiceless velar fricative that may have come in turn from a stop (i.e. k > x > h), but there are many cases of other stops and fricatives as sources of [h], and, much less commonly, sonants or glottal stop. Also, [h] may arise 'spontaneously' before an initial vowel or as a 'hiatus breaker' between vowels. One of the best attested origins is the change of [s] > [h] that is the topic of this paper. In phonological change, the segment [h] may become lengthening of the preceding vowel, spreading of 'murmur' or some tonal feature, and ultimately often zero, as in [s] > [h] > Ø.
In phonological development in childhood, the glottal sounds [h] and [ʔ] appear quite early, often in 'protowords' that have no adult model; when the ambient language has an /h/ this tends to be acquired quite early in words.

Alternate Pathways

The change [s] > [h] seems to be a prime example of a natural weakening process or lenition, in full accord with the appealing Vennemann definition cited by Hyman (1979:165): 'a segment X is said to be weaker than a segment Y if Y goes through an X stage on its way to zero,' although the characterization in feature terms of the change itself and its phonetic conditioning may offer the phonologist some challenges. The change has been represented, conveniently enough for our purposes, in Anderson and Durand 1986, as follows:



What is not usually noted, however, about this change, although it seems quite clear from the various descriptions, is that there are two main pathways of actual implementation. One pathway, which we may call the Greek type, starts in intervocalic position, then proceeds to word-initial position (at first especially when the preceding word ends in a vowel), and then to various preconsonantal positions, and last, if at all, to word-final position. The other pathway, which we may call the Spanish type, starts with syllable-final positions, first word-internal preconsonantal, then word-final when the following word begins with a consonant, then word-final with following vowel, then to other positions, and last, if at all, to word-initial position. Particular instances of these changes differ in details related to the phonology and morphology of the language variety in question, the overall pattern of phonological change in the variety, or aspects of the social pattern of diffusion of the change, but there seems no doubt that two different basic types of [s] > [h] change exist. The labels 'Greek type' and 'Spanish type' are convenient mnemonics, since the change in ancient Greek has been discussed in historical Indo-European linguistics for roughly a century and the Spanish change is by far the best described of all the known instances of [s] > [h]. In fact, apart from the large and still growing literature on the 'short a' phenomena of English (e.g. Ferguson 1975, Labov 1981, Harris 1987a, Kiparsky 1988), the aspiration and deletion of /s/ in dialects of Spanish may be the most extensively treated of all sound changes being investigated from an empirical, variationist perspective (e.g. Navarro Tomás 1948; Cedergren 1973; Terrell and Tranel 1978; Terrell 1979; Lipski 1984, 1986). The s > h process in children's phonological development has not been investigated as such, but the substitution of [h] for adult model /s/ is relatively rare.
Errors of English-learning children attempting /s/ are typically either the common 'phonemic' substitutions [ ], [ ], [ts], or 'distortions,' such as lateral fricatives, that are perceived by adults as some kind of /s/. Adult clusters with /s/ plus sonant are sometimes produced by children as preaspirated, murmured, or voiceless sonants that could be interpreted as h plus sonant, and /s/ plus voiceless stop is often produced as a preaspirated or lengthened stop and thus could also be interpreted as an instance of s > h. It would be worth investigating more closely children's production of adult /s/ in various positions in order to determine whether further parallels to diachronic s > h > Ø occur. Examples of changes of the Greek type that have received the most attention are the early Indo-European changes of [s] > [h] in Greek, Iranian languages, and Armenian. It is still not clear whether these are all examples of the same change which diffused across languages or are independent changes (perhaps some kind of drift phenomenon?), cf. Szemerényi 1968. Although these examples are all at least in part reconstructed, Indo-Europeanists seem very sure about them. Descriptions of Greek type [s] > [h] changes in progress are, however, not rare, either in terms of synchronic alternations or accounts of changes of shallow time depth preceding the present time. Examples cited in Merlingen include several Altaic languages in Eastern Siberia (e.g. Yakut and Evenki), several Austronesian languages (e.g. Tongan, Tahitian), and several West African languages (e.g. Lomongo, Kpelle).



Examples of changes of the Spanish type include varieties of Malay, the well-known Sanskrit alternation of /s/ with visarga, and the historical loss of final /s/ in various Romance languages (cf. Longmire 1976). Although the change is widespread, has been intensively studied in Spanish dialects, and has at times served as a basis for phonological theorizing about weakening processes, the Spanish type of [s] > [h] change seems to be attested less frequently than the Greek type in the world's languages. Although both types of [s] > [h] change are weakening in the Vennemann sense, it is only the Spanish type that fits fully into Donegan & Stampe's characterization of weakening processes as summarized by Kiparsky (1988:377): 'usually context-sensitive and favored in unstressed position, in the syllable coda, and in casual speech.'
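
The two orderings described in this section can be made concrete in a short sketch. The Python fragment below is an illustration only and not part of the original analysis: the position labels and the names GREEK_TYPE, SPANISH_TYPE, and is_consistent are assumed for the example, and the strict ordering is an idealization, since particular instances differ in detail and the last stages may never be reached. Each pathway is treated as an ordered list of positions, and the check asks whether the set of positions in which [s] > [h] has been observed forms an initial stretch of that list.

```python
# Illustrative sketch only: labels and orderings paraphrase the prose
# description of the two pathways; all names here are hypothetical.
GREEK_TYPE = [
    "intervocalic",
    "word-initial",
    "preconsonantal",
    "word-final",
]

SPANISH_TYPE = [
    "word-internal preconsonantal",
    "word-final before consonant",
    "word-final before vowel",
    "other positions",
    "word-initial",
]


def is_consistent(affected: set[str], pathway: list[str]) -> bool:
    """Return True if the affected positions form an initial stretch of the pathway.

    An empty set counts as consistent (the change has not yet begun).
    """
    return affected == set(pathway[: len(affected)])
```

On this idealization a variety showing aspiration only intervocalically and word-initially fits the Greek type, whereas one showing it only in word-final position does not.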

The Greek Case

One of the distinctive characteristics of the Greek branch of Indo-European is its treatment of /s/; for a convenient summary cf. Palmer 1980:235-239. Proto-Indo-European initial prevocalic s- was replaced by h-, and PIE intervocalic -s- was deleted completely. Linguists hypothesized that the intervocalic -s- had first become -h- and then been lost, with consequent vowel contractions. This hypothesis was confirmed when Linear B was deciphered and Mycenaean Greek, which goes back several centuries before alphabetic inscriptional Greek, was shown to have had intervocalic -h- in the right places. Thus the conclusion is reasonably drawn that the change of [s] > [h] took place first intervocalically and then in word-initial position before a vowel. Traditional examples are hepta 'seven' (cf. Lat. septem); zeo 'boil' (cf. Skr. yasati); and there are scores of additional examples. Earlier /s/ is retained in final position and in clusters with voiceless stops, both initially and medially: genos 'kind' (cf. Lat. genus); treis 'three' (cf. Lat. tres); esti 'is' (cf. Skr. asti); steikho 'climb' (cf. Skr. stighnoti); hesperos 'evening' (cf. Lat. vesper). The treatment of /s/ in clusters with sonants is somewhat more complicated, but equally instructive. In initial clusters of /s/ + sonant (m, n, l, r, j, w) the /s/ is lost; evidence from early inscriptions suggests that first the /s/ became [h] or the cluster became a voiceless sonant. In medial clusters of /s/ + sonant the /s/ is lost, with either lengthening of the preceding vowel or gemination of the sonant, depending on the dialect. Sonant + /s/ clusters also yielded geminate sonants with loss of /s/, loss of sonant, and other outcomes, depending on the dialect. A few more observations will help to round out the picture. In word-final position Greek normally permitted only a vowel, n, r, or s, and no clusters except for ks, ps, ls.
Final /s/ has persisted from ancient times to the present,3 and final /n/ has varied with zero for most of that time (Ferguson 1975:182-183). Although the change of [s] to [h] may have started as an implementational rule of allophony, other sources of [s] and [h] soon led to full contrast between them in initial position (e.g. sos 'your': hos 'who') and the occurrence of some intervocalic /s/s. Initial /s/s came from borrowing, from earlier *tj and *tw clusters (e.g. sebomai 'respect' cf. Skr. tyajati; se 'you (sg, acc)' < *twe), and elsewhere, and, as already noted, /s/ occurred in clusters with stops, where it did not contrast with /h/. In vowel-initial words with intervocalic /s/, the h < s appears in initial position, e.g. heuo 'burn' < *euso, cf. Lat. uro, Skr. osati, in a kind of prosodic 'hopping' similar to the phenomena of the frequently discussed Grassmann's law. Some intervocalic /s/s are the aorist and future morphemes; the aorist -s- remained after stops in any case and either was 'analogically reintroduced' after vowels or resisted the change because of its morphological identity. Finally, there are some rare doublets in initial position, e.g. hus ~ sus 'swine.' The s-forms in these cases are sometimes explained as borrowings from a presumed non-Greek Indo-European language that preceded Greek in the area (Hiersche 1970: 34-35), but they could just as easily be regarded as 'lag' words which the change was finally affecting. At this distance in space and time it is difficult to ascertain the social path of diffusion of the change throughout the Greek-speaking community. On the basis of some epigraphic evidence and comments by ancient writers, the second phase of the change, i.e. h > Ø, or psilosis as the Greeks called it, began in the urbanized, Ionic settlements in Asia Minor. This was the dialect area where the loss of digamma /w/ began, and it seems likely that the s > h phase also started there. The Greek example of s > h is reasonably clear and convincing, at least in outline, but sceptics could argue that it is all based on ancient texts and reconstructions, as are the stock examples from the history of Iranian languages and Armenian. Accordingly it is reassuring to find that modern examples with data from living speakers are very similar. The s ~ h alternation in modern Yakut is a good example of an [s] > [h] change in progress (cf. Krueger 1962).4 Yakut has initial and final /s/ and /s/ in some medial clusters, but in intervocalic position older /s/ has become [h] and is represented by a different symbol in the orthography:

s-      saa 'gun'         suohu 'cattle'
-s      taas 'stone'      kiis 'girl, daughter'
-h-     kihil 'red'       ohox 'stove'
-sC-    kustar 'ducks'    baliksit 'fisherman'

This alternation also appears at some morpheme and word boundaries. When a stem ends in /s/ or a suffix begins with /s/, the /s/ appears as [h] when preceded and followed by a vowel, e.g. bis- 'cut' + -abin 1st sg pres = bihabin 'I cut'; saa 'gun, rifle' + -sit agentive = saahit 'rifleman.' Also, in some instances of compound words or 'close-knit' phrases, word-final -s is replaced by -h when followed by a word beginning with a vowel, e.g. kiis 'girl, daughter' + ogo 'child' = kiih ogo 'girl (child)'; agis 'eight' + at 'horse' = agih at 'eight horses.' Finally, in a few words initial s- varies with h-, e.g. the very common word suox ~ huox 'not, there is not.'

Apparently at the end of the 18th and beginning of the 19th century some Yakut speakers, particularly those in the city of Yakutsk, perhaps because of language contact with other languages in the area, began to pronounce intervocalic /s/ as [h] (cf. Bohtlingk 1851/1964:158-159). By the present time all Yakut speakers use intervocalic [h] and are beginning to pronounce some initial /s/s as [h]. It must be noted that many Yakut speakers also speak Russian to a greater or lesser extent, and some Russian loanwords with intervocalic [s] are beginning to appear, but they have not affected the s ~ h alternation in native words.
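The boundary alternations described for Yakut can be restated as a small rule sketch. This is only an illustration under simplifying assumptions: the forms are the ASCII transliterations used in the text, the vowel set is limited to the five letters that occur in those examples, and the function names are hypothetical. It does not model the lexically restricted initial s- ~ h- variation or the behavior of Russian loanwords.

```python
# Vowel letters as they occur in the transliterated examples (an assumption).
VOWELS = set("aeiou")


def join_morphemes(stem, suffix):
    """Concatenate stem and suffix; an s between two vowels surfaces as h."""
    form = stem + suffix
    out = []
    for i, ch in enumerate(form):
        between_vowels = (
            0 < i < len(form) - 1
            and form[i - 1] in VOWELS
            and form[i + 1] in VOWELS
        )
        out.append("h" if ch == "s" and between_vowels else ch)
    return "".join(out)


def join_close_phrase(first, second):
    """In a close-knit phrase, final -s after a vowel becomes -h before a vowel-initial word."""
    if (
        len(first) > 1
        and first.endswith("s")
        and first[-2] in VOWELS
        and second[0] in VOWELS
    ):
        first = first[:-1] + "h"
    return first + " " + second


print(join_morphemes("bis", "abin"))     # bihabin
print(join_morphemes("saa", "sit"))      # saahit
print(join_close_phrase("kiis", "ogo"))  # kiih ogo
print(join_close_phrase("agis", "at"))   # agih at
```

Initial s-, final -s before a pause, and s in clusters (kustar, baliksit) are left unchanged by both functions, which matches the distribution given in the text.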

The Spanish Case

One of the most salient phenomena of dialect differentiation in modern Spanish is the variable pronunciation of standard /s/; for a convenient summary cf. Alcina Franch and Blecua 1975:340-354. Much peninsular Spanish has a somewhat retracted, [ʃ]-like pronunciation, noticeably different from French or Italian [s]s or, for that matter, [s]s of other parts of the Spanish-speaking world; also, in many varieties of Spanish /θ/ has merged with /s/, in some areas with intermediate pronunciations. Most striking, however, is the variation between [s] and [h] or zero which is widespread in Andalusia (southern part of Spain) and in the Caribbean and coastal areas of Central and South America. It is still not clear whether this is all one change which diffused as a result of Andalusian migration to the New World or is a set of independent changes (perhaps some kind of drift phenomenon?); certainly the changes are now progressing relatively independently in the various areas although also continuing to show common features.

The earliest evidence for the [s] > [h] > 0 change in Spanish is generally accepted as dating from the 17th century, although some authors put it even earlier (cf. Alcina Franch & Blecua 1975:348), and it seems clear that two patterns of diachronic sequence are quite general, although not exceptionless: (a) syllable-final preconsonantal first, then word-final; (b) aspiration first, then deletion.5 The factors affecting the course of the change are similar everywhere, but every community that has been studied carefully has a somewhat different pattern, reflecting partly just how far the change has progressed, but also the particular weighting of the various conditioning factors in that speech community.
The [s] > [h] change under way in Spanish dialects is in many ways a paradigm example of the kind of sound change that Kroch and others have suggested as the principal, or even sole, type of sound change: phonetically motivated ('articulatorily reduced'), originating in less prestigious groups, characteristic of rapid, casual speech, and resisted by prestige groups and careful speakers (Kroch 1978:25). The phonetic constraints that have been shown to be operative in [s] > [h] > 0 in Spanish are: word length (particularly monosyllables vs polysyllables), following segment (particularly consonant vs vowel vs pause), and whether the preceding and/or following vowel is stressed or unstressed. Examples from Cuban Spanish will serve to illustrate these three factors.

Word Length

Middle class educated speakers of Spanish from Havana analyzed in Terrell 1979 showed only 6% deletion of word-final /s/ in monosyllables as compared to 36% in polysyllables. The strong effect of this factor is evident in the comparison of such high frequency items as the adverbs entonces 'then' and mas 'more'; the former showed 78% deletion, the latter only 3%. Terrell even hypothesizes that 'for many speakers entonces has been restructured in the lexicon without a final /s/' (604).

Following Segment

For the same population, word-final /s/ undergoes either aspiration or deletion in 98% of the instances when the following word begins with a consonant, in 82% of the instances when the following word begins with a vowel, and in only 39% of the instances when the word is followed by pause.
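Because the figures for following segment are stated as weakening rates (aspiration plus deletion combined), it can help to restate them as retention rates and to make the ordering of the contexts explicit. The short sketch below does only that arithmetic on the three percentages quoted above; the variable and function names are illustrative and come from no published analysis.

```python
# Percent of word-final /s/ tokens aspirated or deleted, by what follows,
# as quoted in the text for the Havana speakers.
weakening_pct = {"consonant": 98, "vowel": 82, "pause": 39}


def retention_pct(context):
    """Percent of tokens in which [s] is retained in the given context."""
    return 100 - weakening_pct[context]


# Contexts ordered from most to least favorable to weakening.
ranked = sorted(weakening_pct, key=weakening_pct.get, reverse=True)

for context in ranked:
    print(context, weakening_pct[context], retention_pct(context))
```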

Stress

For the same population, /s/ is retained before an unstressed vowel only 19% of the time and before a stressed vowel 39% of the time. Researchers have not as often examined the role of stress on the preceding vowel, but this seems to be even greater (cf. discussion in Guy 1981:139-140).

In addition to the phonetic factors that constrain the weakening of /s/ in Spanish dialects, several grammatical factors are apparent. The one most often discussed is the information load of the -s plural marking in the noun phrase: in Spanish the head noun, determiners, and adjectives all typically mark the plural with a final -s. A number of studies have reported that the most likely place of /s/ retention is the earliest slot in the noun phrase, which is usually a determiner (los, las, unos, unas). This is certainly interpretable as a tendency to preserve grammatical information (cf. Kiparsky's Distinctness Condition, Kiparsky 1972), but it must be noted that even in varieties of Spanish where this tendency is clearly operative, instances of the complete absence of the plural marker throughout the noun phrase still occur (Poplack 1980). A number of linguists have called attention to parallels between the present-day Spanish phenomenon and the historical loss of final -s, including plural -s, in French and other Romance languages (e.g. Longmire 1976; Terrell and Tranel 1978).

The other grammatical factor is the morphological identity of the final -s subject marker of 2nd person informal singular in verb forms. Since Spanish is a so-called 'pro-drop' language in which the independent subject pronoun is often not present, the tendency to retain the grammatical information of the -s ending may be expected to be a factor in the deletion of this -s.
Studies have generally shown relatively little extra retention of this -s, or none at all, in the absence of the pronoun, but in her study of Puerto Rican Spanish Hochberg 1986 found greater use of the pronoun tú when the verbal -s was deleted.

In spite of the great amount of variationist study of the Spanish s > h change(s), it is difficult to ascertain the social path of diffusion of the change(s) throughout the Spanish-speaking world, but some interesting observations can be made. There is general agreement that in Spain the change began in Seville, in the lower classes, and spread out from there, extending to upper classes in Seville and Andalusia and to lower classes in some locales outside Andalusia. It must be noted that the merger of /θ/ and /s/ also started in Seville in the same way a century or so earlier, and its spread has tended to coincide in a rough sort of way with the spread of aspiration and deletion of /s/.6 A particularly interesting finding is that wherever differences between male and female speech have been examined in relation to the aspiration and loss of /s/, either the sex of the speaker has seemed to have no significant effect or it has been found that female speakers retain the /s/ more than males, regardless of age or social class (e.g. Spain: parts of Granada, Jaen; America: Panama, Buenos Aires). This pattern is in line with the widespread tendency of women to use standard variants more frequently than men, at least in complex, urbanized societies (Trudgill 1972).7

The Spanish case of s > h is clear, but one could ask whether any similar cases have been noted in other languages, other times and other places. Since this kind of change of [s] to [h] is purely allophonic or 'post-lexical,' it is not likely either to be represented in the orthography of the language or to be discussed by grammarians or other observers, unless it comes to be socially marked ('stigmatized' or 'prestigious'). A somewhat similar change was, however, duly noted and described by Sanskrit grammarians and was even given representation in the orthography (Fry 1941). Sanskrit /s/ appeared as [h] in syllable-final position before voiceless obstruents (except for the homorganic stops /t/ and /th/) and in absolute sentence-final position.

Sanskrit has 20 stops (5 places of articulation, ± voice, ± aspiration) and at least 3 lexically distinct nasals, 3 sibilants, 4 semivowels /y, r, l, v/, and /h/, as well as 6 vowels /a, ā, i, ī, u, ū/ and two pairs of diphthongs /e, ai, o, au/.
In word-initial position it permits many 2-consonant clusters, typically of stop plus semivowel but also of sibilant plus stop or semivowel, and some rare 3-consonant clusters. Word-finally (the 'weak' position) the language normally permits only single consonants, and these are often pronounced in a weaker way. The permitted finals are m, n, p, t, ṭ, c, k (and underlyingly s, r). As already noted, the only fricatives in the phonological inventory are the three sibilants, dental /s/, palatal /ś/, and retroflex /ṣ/; the sonant /r/ functions in certain respects as the voiced counterpart to /s/.

Sanskrit /s/ in absolute word-final position, i.e. end of sentence or line of verse, weakens to (voiceless) [h], which the grammarians named visarjaniya or visarga 'off-glide.' Orthographically it has a special symbol, usually transliterated ḥ, which is distinct from the symbol for the (voiced or murmured) /h/. A weakened variant of word-final /s/ also appears before (word-initial) voiceless obstruents. This weakened variant, also spelled with visarga, is described by the grammarians as a weak voiceless fricative that varies in place of constriction in accordance with the following consonant: labial before /p, ph/, retroflex before /ṣ, ṭ, ṭh/, palatal before /ś, c, ch/, and velar before /k, kh/. Note that /s/ is not weakened before the dentals /s, t, th/, although an /s/ preceding another /s/ is sometimes spelled with visarga. Note also that final /r/ under many of these same conditions is represented by visarga.
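The grammarians' description of word-final /s/ amounts to a lookup keyed on what follows. The sketch below restates it under stated assumptions: the consonant keys use the standard transliteration (ś palatal, ṣ retroflex, ṭ retroflex stop), the function name and the wording of the returned labels are invented for illustration, and contexts the passage does not discuss (for example, following voiced sounds) are deliberately left unhandled.

```python
# Place of the weakened fricative, keyed by the following voiceless obstruent.
PLACE_BEFORE = {
    "p": "labial", "ph": "labial",
    "ṣ": "retroflex", "ṭ": "retroflex", "ṭh": "retroflex",
    "ś": "palatal", "c": "palatal", "ch": "palatal",
    "k": "velar", "kh": "velar",
}

# Before the homorganic dentals, /s/ is described as not weakened.
DENTALS = {"s", "t", "th"}


def final_s_variant(following=None):
    """Describe word-final /s/ given the next word's initial consonant.

    Pass None for absolute final position (end of sentence or verse line).
    """
    if following is None:
        return "voiceless [h] (visarga)"
    if following in DENTALS:
        return "[s] retained"
    if following in PLACE_BEFORE:
        return "weak " + PLACE_BEFORE[following] + " fricative (written with visarga)"
    return "not covered by this description"


print(final_s_variant())      # voiceless [h] (visarga)
print(final_s_variant("kh"))  # weak velar fricative (written with visarga)
print(final_s_variant("t"))   # [s] retained
```

The occasional visarga spelling of /s/ before another /s/, mentioned in the text, is an orthographic variant that this sketch does not model.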



Comparison of Cases

Although each case of s > h > 0 reported in any detail is unique in some respects, the two major types identified here can be compared, and the four examples chosen will serve for exemplification.

Phonological Status

One of the striking differences between the Greek type and the Spanish type is that the latter seems to start and remain an allophonic change, although it may result in neutralization (Skr. /s/ and /r/; Sp. /s/ and /r, x/). It is, as noted, constrained in various ways so that there is a great deal of 'free variation' of the kind that variationists try to capture with variable rules, but it never seems to result in clear-cut lexical contrasts. The Greek type, on the other hand, although it may start as a 'free variant' allophonic change, seems to result fairly soon in lexical contrasts.

Conditioning Factors

In both the Greek and Spanish types of change the phonetic environment and certain kinds of morpheme boundaries are conditioning factors, as are also style levels (casual/formal) and social groupings (e.g. class, area). When these are examined more closely, however, there are striking differences in detail between the two types.

In the Greek type the crucial phonetic condition is the presence of a following vowel, to which the /s/ assimilates in the sense that the oral stricture is lost and the fricative noise is reduced, thus becoming more vowel-like. If this is what is happening, one can hypothesize that the aspiration that appears in the Greek type of s > h > 0 is more likely to be voiced or murmured than the aspiration in the Spanish type. To my knowledge, however, we lack reliable phonetic data on this point, which poses a reasonable challenge to experimental phoneticians.

In the Spanish type the crucial phonetic condition is the syllable-final position of the /s/. The Spanish s > h > 0 seems to function in the language as part of a general erosion of final features, segments, and syllables, the diminution of phonation in the weakest part of the word. If this is what is happening, one can hypothesize that the aspiration that appears in the Spanish type of s > h > 0 is more likely to be a voiceless fricative assimilating in place of articulation to the following consonant. The descriptions of the Sanskrit grammarians (cf. Whitney 1872, 1962) are evidence in favor of this hypothesis, but it needs reliable phonetic data from ongoing variation.

What originally attracted me to the issue of the difference between Greek and Spanish s > h was the fact that both languages also have undergone intervocalic spirantization of stops, and it seemed legitimate to wonder whether these changes were part of a larger set of related changes.
As a result of this brief investigation, it seems that the s > h was in neither case closely related to the spirantization, for which the crucial phonetic condition was the presence of a preceding vowel (Ferguson 1978b), and in any case other instances of the changes, such as Yakut and Sanskrit, have no corresponding spirantization. Hooper seems justified in claiming that Spanish s > h is an example of weak-position weakening, bearing no relation to the sonority hierarchy evident in the spirantization (Hooper 1976:217). The Greek s > h > 0 seems more similar to the somewhat earlier Greek w > 0 in mode of operation and path of spread than to the later Greek spirantization. The Spanish s > h > 0 is, if anything, more similar to the earlier Spanish initial f- > h- > 0 than to the Spanish spirantization, although the f weakening started in and spread from Old Castile, not Andalusia.

The speaker of a language undergoing the Spanish type of change finds the Greek type unimaginable, and probably vice versa. A native Spanish-speaking linguist has put this feeling into words: ". . . inconcebible pensar que haya un dialecto hispánico en que se elida, digamos /s/ en el verbo saber, porque resultaría haber, otro verbo" [it is inconceivable that there should be a Hispanic dialect in which, say, the /s/ of the verb saber is elided, because the result would be haber, another verb] (Guitart 1978:88).

Diachrony versus Synchrony

One of the issues on which as yet no consensus has been reached in modern phonological studies is the synchronic status of sound changes in progress. If the beginning state and the end state of a sound change can be confidently identified, what is the status of the variation during the period (which may be decades or centuries) when the change is taking place? In terms of current generative models, when in the course of a change s > h > 0 are lexical items restructured to have /h/ or nothing in their underlying representations? The tendency has been for analysts to keep the earlier stage intact as long as possible in the synchronic description, even when they take a fairly strong 'concrete' position in the abstractness controversies.
A strongly argued minority view, however, insists on more stringent criteria for evidence of synchronic rule operation. Thus Kiparsky 1973 has Grassmann's Law and s-Aspiration as synchronic rules in Classical Attic Greek, whereas Bubenik 1983 analyzes them as diachronic stages no longer operative as rules. No attempt will be made here to resolve this issue, since it involves divergent views of theoretical goals, canons of evidence, and notions of 'psychological reality.'

Afterword

This paper has offered a limited exploration of what seems to be a single pattern of phonological change [s] > [h] > 0, as a small example of a research strategy in diachronic phonology. The particular strategy that is advocated is the systematic cross-language study of identifiable patterns of phonological change in order to find evidence for their respective probabilities of occurrence and for the conditions that favor or disfavor them. This strategy, which is intended to reflect the spirit of Greenberg's position on synchronic and diachronic universals, shows promise of contributing both to the understanding of diachronic phenomena and to the construction of general theories of phonology.

The exploration of this one pattern has shown that what seems to be a single pattern of sound change may actually be two or more different patterns that cut across various proposed dichotomies, such as strengthening vs weakening processes, lexical vs postlexical rules, gradual vs discrete changes, or changes from above vs changes from below. The identification of two or more different types of change where our linguistic preference for parsimony would have called for one 'should not cause surprise or despair,' as Kiparsky observes in a similar connection (Kiparsky 1988:384). It gives us the opportunity to tease out generalizations and research objectives we might otherwise have missed. Since we cannot all be Greenbergs and keep in our heads and/or our personal notebooks vast amounts of data from different languages, this research strategy underlines the need for large-scale reliable data bases of linguistic phenomena that theorists may use to construct and test promising hypotheses about language.

As to the findings of the paper, it is quite likely that Greenberg has long since noticed the facts reported here, has arrived at a suitable interpretation of them, and has jotted it down in a notebook or placed it in a footnote of a publication I have not seen. Even if he has, I hope it will have done no harm to recommend here once again the usefulness of searching for cross-language inductive generalizations about diachronic change as one of the research strategies available for linguists.

Notes

This paper has benefitted from conversations with G. Guy, T. Huebner, P. Kiparsky, M. Moyer, and others. A preliminary version of the paper was read by K. Denning and G. Guy, and their comments were helpful; they cannot be held accountable, however, for my failure to heed their suggestions. Responsibility for errors of fact, interpretation, and presentation is mine.

1. Often nowadays the argument is simply taken for granted, with perhaps an appeal to what is assumed to be shared knowledge. Typical is a comment such as this one after a series of changes mentioned: "These are all changes which present no particular difficulty" (Bynon 1977:56).
2. For a comparable discussion of /h/s of different origins, cf. Tiersma 1975.
3. Thumb (1964:22-23) lists loss of final -s in the modern dialects only in Lower Italy and in the Tsakonian dialect.
4. Unfortunately, Baraskov 1953, on which Krueger based much of his phonological description, was not available to me, although several more recent Soviet publications were. Apart from a few minor comments on orthography or dialect, however, these did not add appreciably to Krueger's description of the s ~ h variation.
5. Although terms such as 'sibilant depletion' and 'disappearance' might be less ambiguous ('aspiration' suggests an aspirated s as in Burmese and 'deletion' suggests speaker intention), 'aspiration' and 'deletion' are the usual terms in the literature on Spanish and they will be employed here.
6. A recent study (Moyer 1988) has provided evidence that the weakening of /s/ is more stigmatized in Spain than the merger of /θ/ and /s/ and is correspondingly lost sooner by working-class Andalusian in-migrants in Barcelona.
7. Guy 1981 notes the same phenomenon for the s-weakening in the Popular Brazilian Portuguese of working-class speakers in Rio de Janeiro. On the basis of the age distribution of the variation, however, he concludes that an s > h change is not now in progress, and that the women must have been the leaders in a kind of decreolizing restandardization.

16 Then They Could Read and Write

My concern in this paper is with the changes in language brought about by the introduction of literacy into a society. In investigating this topic, one of the most useful research strategies available to us is the development of detailed case studies of either contemporary or historical situations in which societies become literate. In this paper, we will concern ourselves with four such instances, each of which involves the creation of a writing system appropriate to the vernacular language of a nonliterate society as part of an attempt to spread or strengthen Christianity among its members. The four cases we will consider include the development of the Armenian alphabet (by Mesrop), the Glagolitic alphabet (by Constantine), the Permian alphabet (by Stefan), and the Aleut alphabet (by Veniaminov). Each of these four cases differs from the others in the degree to which the introduction of literacy proved successful. With this in mind, each will be discussed in terms of four factors that are likely to control the path of vernacular literacy development in societies of various types and, therefore, to promote or to retard its chances for success: 1) the linguistic choices to be made; 2) the source of the literacy initiative; 3) the scope of literacy within the society; and 4) the extent of the institutional support for literacy within the society and the means employed to transmit the new literacy skills among the members of the society and from one generation to the next. From our study, we will conclude that to understand the course of development of vernacular literacy in any nonliterate society, it is necessary to look at the pragmatics of that literacy. We will also become aware of the value of this type of study of well documented historical cases as a source of data.
And, finally, we will argue for the importance of balancing the present emphasis in literacy studies on the development of individual literacy with investigations of social learning in cases of incipient and developing societal literacy.

Reprinted from L. Bouton and Y. Kachru, eds., Pragmatics and Language Learning Monograph Series 1:7-19, by permission of the Division of English as an International Language, University of Illinois, Urbana, Ill.

In 1983 the Journal of Pragmatics published a special issue on "Linguistic problems of literacy" edited by Coulmas, an indication of the recent upsurge of linguistic and sociolinguistic interest in written language. For many linguists this represents a decided change in research topics, since the tradition, in American linguistics especially, has been to emphasize the primacy of speech and relegate writing to a very secondary role, hardly appropriate for linguistic research. Some of the current research on literacy is connected with rather far reaching claims that have been made about the social and cognitive differences between orality and literacy, between nonliterate speech communities and literate ones (cf. Charbonnier 1973, Goody and Watt 1968, Ong 1982; but cf. Frake 1983). These claims are often phrased in terms of a great change that takes place in a society when literacy is introduced, and it is on that point that my concerns in this paper are focused. What DOES happen when literacy is introduced into a nonliterate society?

Incidentally, what I mean by literacy here is societal literacy, i.e. the regular use of writing and reading to exchange messages in a society, 'regularly' meaning repeatedly on certain types of occasions for certain types of communicative function. My concerns are primarily with pragmatics, i.e. with what changes in patterns of language USE take place with the introduction of literacy, although generalizations about changes in language STRUCTURE brought about by literacy or in connection with the spread of literacy are equally interesting and deserve better investigation than they have received.

My working assumptions are (a) that every instance of the introduction of literacy will have unique characteristics, and (b) that some generalizations will hold across many or even all instances. In the present state of our knowledge about these phenomena, one of the most useful research strategies is the production of case studies, i.e. detailed descriptions of particular instances, in order to be led to insights about possible generalizations and also to check on some of the claims that have already been made.
The most obvious kind of case study is the description of literacy being introduced into a contemporary nonliterate society, and we doubtless need many such studies. Another type of case study is the description of historical instances of the introduction of literacy at various times in the past (cf. Ferguson 1968, 1987). Coulmas, looking primarily for structural changes in language resulting from literacy, has rejected historical studies because they do not offer reliable data on spoken language (Coulmas 1983:47), but if we are primarily interested in language use, some carefully selected historical studies may be very informative.

One attractive type of introduction of vernacular literacy (i.e. literacy in the mother tongue of the community) is the stimulus diffusion model in which an individual in a nonliterate society gets the idea of literacy from culture contact. In these cases the individual, who typically becomes a kind of culture hero, has not acquired literacy himself from others but figures out, on his own, some kind of writing for his language. The case most often discussed as an example of this is probably Sequoyah's Cherokee syllabary, for which a considerable amount of documentation is available (cf. Walker 1981). It would be very instructive if someone were to make a careful comparison of a number of documented cases of this kind in North America, West Africa, and Southeast Asia.



Literacy for Evangelization

The particular kind of vernacular literacy introduction I have selected for this paper is the creation of a writing system incidental to the spreading or strengthening of Christianity in a nonliterate society. The four cases I have chosen are from the history of Eastern Christianity, since a tradition of vernacular literacy as an aid to evangelization has persisted for many centuries in the Eastern churches in contrast with the long tradition of Latin literacy in the West (cf. Korolevsky 1957). In each case one man is credited with the invention of the new writing system and a biography of the inventor written shortly after his lifetime is available. The biography, along with other documentation, is sufficient to allow a reasonable reconstruction of language use in the target community before and after the introduction of vernacular literacy. The four instances are as follows:

1. Invention of the Armenian alphabet by Mesrop (also known as Mashtots) at the beginning of the fifth century;
2. Invention of the Glagolitic alphabet for the Slavs of Greater Moravia by Constantine (also known by his religious name of Cyril) in the middle of the ninth century;
3. Invention of the Permian alphabet by Stefan in the fourteenth century;
4. Invention of the Aleut orthography by Veniaminov (also known by his religious name of Innocent) in the nineteenth century.

It must be noted that cases 2, 3, and 4 represent a continuing conscious tradition, each inventor being aware of the preceding instances, as well as having some ideas about literacy as such and about the invention of the ancient Greek alphabet at a much earlier period. All four of these introducers of vernacular literacy came to be recognized as saints by various churches, and the biographies of 1, 2, and 3 are classics of hagiographical literature in their respective churches. (For translations, discussions, and bibliographies, cf.
the following selected references: the literature on Cyril and his brother Methodius is especially large, with hundreds of books and articles in many languages; for Mesrop, cf. Ashjian 1962, Koriun 1964, 1985, Nersoyan 1985-86 and Peeters 1929; for Cyril, Angelov 1969, Dvornik 1933, Lacko 1969, Vodopivec 1986; for Stefan, Ferguson 1968, Hamalainen 1950, Stipa 1961; for Veniaminov, Black 1977, Garrett 1979, Ransom 1945.) The economic and sociopolitical situations of the four instances differ considerably although in each case the nonliterate society into which vernacular literacy was introduced was less powerful than the surrounding states. Two of the societies were hunting-gathering peoples with little political organization, one was an established kingdom with some literacy in other languages (the Armenians), and the remaining case of Slavs consisted of an array of incipient states whose national boundaries were fluid. The subsequent histories of the four cases are also quite different. The Armenian alphabet took hold very quickly, and came to be used in a variety of ways including as the vehicle of a substantial national literature. Armenian literacy came to be more widespread in that society than literacy in neighboring societies and both the alphabet itself and the pattern of literacy have remained distinctive of



Armenian ethnicity to the present day. In short, the vernacular literacy started by Mesrop was completely successful. The Glagolitic alphabet and Slavic literacy, after a promising beginning in Greater Moravia, was largely replaced there by Latin literacy, and later on by Slavic literacy with Latin letters, as a result of the destruction of the kingdom of Greater Moravia and the dominance of German-speaking missionary efforts in the area. Cyril's literacy was transplanted to Bulgaria where it took root and has continued to the present day. The Glagolitic alphabet was soon replaced in Bulgaria and elsewhere by the Cyrillic alphabet, which was much closer to Greek in letter shapes, and Glagolitic survived in regular use only among some Croatian Christians. The literacy initiated by Cyril and Methodius had a tremendous effect on Slavic culture, in that the texts they produced and the whole tradition of Old Church Slavic became the basis of almost all South and East Slavic literacy and had influence even among Western Slavs. Yet it cannot be said that Cyril's introduction of vernacular literacy succeeded in the way it was intended. The Permian literacy introduced by Stefan was the least successful of the four cases. The Permian language was used for a time in the life of the Church, and the texts in Old Permian are among the most ancient monuments of Uralic languages; also, the Permian alphabet apparently served an important symbolic function for the Permian nation. But Stefan's ecclesiastic and political successors rejected his vernacular literacy and in the absence of an institutionalized pattern of transmission his literacy disappeared from use. The Aleut literacy introduced by Veniaminov took hold quickly and firmly and persisted through periods of lack of support from the Russian side and indifference and active opposition from the American side. 
Although the number of speakers of the language has declined to fewer than a thousand, the original vernacular literacy persisted to the last, as did the religious identification with the Orthodox Church.

Selected Explanatory Factors

In an early study (Ferguson 1968), I emphasized four choices that the initiator of literacy must make: which language? which variety of that language? what kind of writing system? what materials for reading? In a more recent study (Ferguson 1987) I discussed other social and sociolinguistic factors suggested by Spolsky and others as summarized in Huebner 1986. For the present paper I have selected four rather general factors that seem likely to be ultimately explanatory in accounting for the path of vernacular literacy development in societies of various types. This set of factors is essentially a revised version of the factors discussed before.

Linguistic Choices

In the cases selected for study one important decision has already been made. The language to be written is the mother tongue of the community to be converted or strengthened in their Christianity, but this choice is undoubtedly a major factor in


Variation and Change

determining the course of literacy development in a given society. The choice of which variety of the language to represent in the written form can also be of importance and one can easily document cases where the spread of literacy is slowed or stopped by the choice of a variety not acceptable to the people who are to become literate in it. In the Armenian case a relatively unified literary language came into existence, probably based in large part on the dialects of Taron (modern Mush), the area that both Mesrop and Sahag, head of the Armenian church, came from. This literary language eventually came to occupy the position of the high variety in a diglossic situation that lasted for centuries, finally yielding to two modern standard varieties of Armenian, Western and Eastern, representing two clusters of modern dialects. In the Slavic case, the variety of Slavic chosen was the kind of South Slavic spoken in Thessalonica, where Cyril and Methodius were raised. This also came to be a relatively unified literary language that came to occupy the high position in a diglossic situation that included a large part of the Slavic speaking world. It eventually yielded to modern standard languages (e.g. Russian, Bulgarian, Serbocroatian). In the Permian case, the variety chosen was presumably that spoken around Ust'ug where Stefan was born, and some dialectal variation appears in the small heritage of written texts that have survived. When a new literacy was created after the Revolution, the old alphabet had long since been forgotten by the Permians, and a new Cyrillic-based alphabet was created. Today there are two Permian standard languages, one (Komi) in the North, corresponding roughly to Stefan's language, and the other (Permyak-Komi) in the South. In the Aleut case, Veniaminov probably used primarily the variety of Fox Islands Aleut spoken by John Pankov, his bilingual Aleut helper and co-worker. 
The choice of a writing system and its adaptation to the phonology and morphology of the target language has received much attention from linguists and sociolinguists. Some of the most fascinating historical detective work in language research has been devoted to discovering the sources of letter shapes and writing conventions in new literacies. Stefan's Permian alphabet is a good example. When I first studied this system I was able only to note that Stefan made use of some of the traditional decorative motifs used in Permian crafts and that the alphabet was strikingly different from the alphabet of the powerful Russians who were moving into the Permian area. But patient, painstaking work by the Uralicist Stipa (1961) showed convincingly that Stefan must have made some use of a particular Iranian writing system that had reached areas of Northern Europe where Finnic tribes were settled. Although this work is often, as I noted, fascinating, it seems likely that details of the writing system are less important in determining the course of literacy development than many other factors.

Source of the Literacy Initiative

The driving force behind the introduction of vernacular literacy may be chiefly from inside the society, as influential members of the society or innovating agents of social change push for literacy in terms of recognized needs and planned goals

Then They Could Read and Write


of the society. It may, on the other hand, come from outside agents of change such as conquerors, colonizers, missionaries, or traders. Most often perhaps (especially in the more successful cases?), the source is both internal and external. In our first case, the source seems to have been almost completely internal. In the fourth and fifth centuries the Armenian nation was in two parts: the larger Eastern half was the Armenian kingdom, subordinate to the Sassanian Persian empire, and the Western half was a part of the Roman empire. Most of the Armenians were probably already converted to Christianity, but large areas of the pre-Christian religion still existed. The principal language of the Sassanian empire was Pahlavi; the principal language of the Eastern Roman empire at that time was Greek. Most Armenians were probably illiterate, monolingual speakers of Armenian, but a significant number of church leaders, government officials, merchants, and others were literate in Greek, Pahlavi, or Syriac. Greek and Syriac were the languages of the Scriptures and the liturgy in the Armenian Church, since Christianity had come to the Armenians from Greek-using and Syriac-using churches. A group of young monks, led by Mesrop, with the full approval of the head of the Armenian Church, Sahag (= Isaac), and the support of King Vramshapouh, decided that Armenian vernacular literacy was needed in order to convert the non-Christians, to deepen Christian faith and practice among the Armenian Christians, to maintain national unity across the political boundaries, and to keep from being assimilated either to Greek or Syriac types of Christianity.
The invention and revision of the alphabet, different in appearance from either Greek or Syriac letters, the systematic literacy campaigns and schools that were set up, and the translations and original writings that were produced in Armenian all flowed from the decisions and activities of Mesrop and his associates in response to what they saw as urgent needs for the strengthening of Armenian Christianity. At the opposite end of our four cases was that of the Aleuts, in which the introducer of vernacular literacy was a total outsider. Veniaminov had to learn the local language, devise an appropriate writing system for it, and by example, personal devotion, and unceasing hard work, bring his vision for the development of the Aleuts into reality. The path of development of Aleut literacy was very different from that of Armenian literacy and the factors that led to its success were correspondingly different. In both cases, however, the nature of the source of the initiative was a crucial factor, and in both cases the knowledge, creativity, farsightedness, and determination of the principal introducer apparently played a considerable role. In the other two cases the source is harder to characterize. Cyril's family was of the well-to-do status of high Greek-speaking governmental officialdom in the Eastern Roman empire, but he was born and raised in Thessalonica where many Slavs had settled and he was a fluent speaker of the local variety of South Slavic. Stefan was apparently of Russian origin and culture but he was born and raised in an area where Permian was spoken, and apparently was fluent in that language from an early age. Both Cyril and Stefan were strongly attracted to monastic life and the study of book learning and foreign languages. Both were fascinated by the problem of creating a suitable writing system for an unwritten language, and both were enthusiastic about vernacular literacy as a means of evangelization.



In spite of the difficulty of pinpointing exactly the effects of the source of the initiative in the various cases, it seems likely that this is an important factor to be investigated thoroughly in other studies of the introduction of literacy.

Scope of Literacy

In many societies literacy is severely limited in its functions and/or its distribution among members of the society. The term restricted literacy (Goody 1968) has been used for such situations, as opposed, for example, to expanding literacy or mass literacy. In some instances of restricted literacy, such as the Tuareg of the Sahara or the Moro of Mindanao, the writing system has been in use for a long period of time, and the once wider functions have been narrowed, or fossilized, to a few uses such as in amulets and genealogical lists, and literacy is not regarded as a common means of exchanging personal messages or as a means of transmitting historical or technological information or as a form of diversion or enjoyment of literary texts. In some cases of restricted literacy the existence of the writing system may have great symbolic value for group identity (cf. Blood 1988), even though the actual use of literacy may be minimal. I mention this possibility because it seems likely that Permian literacy, after its introduction, stabilized in some such way. The first texts produced are known to have been liturgical and scriptural, but of the few extant texts a surprising number consist just of the alphabetic symbols in order or are the identifying labels on icons (Orthodox tradition calls for the written identification of the divine figures or saints painted on icons). There is little evidence that the Permian script was ever used for personal letters, record keeping, or original literature.
The Permians were probably proud of their script, and Stefan's biographer was greatly impressed by the creative achievement of inventing the writing system, but quite possibly, the vernacular literacy, although used in connection with religious rites and of great symbolic value, "never became a part of the shared cultural resources of the society" but remained "a marginal phenomenon activated only by direct involvement with an impinging alien culture" (Ferguson 1987). The story of Aleut literacy was quite different. The Aleuts not only became strongly attached to the use of reading and writing in the religious contexts in which it was introduced, but came to exchange letters in it, keep inventories of possessions (such as furs obtained by hunters), and to depend on a weekly announcement board for community news. Even though the original set of religious texts remained the primary focus of Aleut literacy, and most adult males until quite recently (Ransom 1945) had read and even reread all of them, other uses of literacy became well established and part of traditional Aleut culture. In the Armenian and Slavic cases, the introducers initiated training programs and established schools, primarily to prepare clergy but also, especially among the Armenians, to spread literacy more widely in the society. In both cases a substantial program of translation of religious works was undertaken and original liturgical hymns were composed, and within several decades new saints' lives were being written and original theological treatises were produced. It is clear that



Mesrop himself wrote letters to many groups and individuals to keep people informed, and the use of Armenian in church records as well as political records became established; it is less clear to what extent these other uses of literacy emerged in the early decades of Cyril's literacy. Incidentally, many of the students in Armenian and Slavic schools were also taught Greek (and for the Armenians, Syriac) as a necessary tool for studying religious classics and for translating them.

Institutional Support and Transmission

One of the most important factors in the development of literacy is the provision of means for maintaining literacy competences in individuals and for the society once some form of literacy has been introduced. People concerned with literacy development often refer to this factor as the retention problem. From the activist point of view the question is asked: What should one do to assure retention? From a more basic research perspective the question would be: What patterns of literacy behavior tend to arise or are deliberately instituted that result in the transmission of literacy from one generation to the next? The literacy planners make two principal recommendations: provide suitable reading materials and start schools. In each of our four cases certain essential sacred texts were translated as the first reading materials, yet the patterns of literacy were strikingly different. The most interesting of the four is surely that of Aleut literacy, which persisted against heavy demographic, economic and political odds. At the very beginning of Aleut literacy, people got the idea that having knowledge of reading and writing was highly desirable, and could be transmitted as a valuable gift to a younger person with whom you had strong ties. Possession of the knowledge made it possible in the first instance to take active part in the new religious ceremonies. Many of the Russians working for the trading company were illiterate; literacy in Russian was largely limited to high officials, company clerks and the clergy. In traditional Aleut culture there was a custom of individuals identifying other persons to whom they felt their spirit would be transferred after death, and these identifications made for close ties between the individuals. As the Aleut shifted over to Christian faith, this traditional pairing was replaced by the godparent-godchild relationship.
As a result, Aleut adults who could read and write would sometimes find it appropriate to pass on this highly regarded knowledge to adolescent godchildren. Because of the conditions of Aleut life, regular attendance at school was not possible for everyone, and when Russian schools eventually disappeared entirely, this more personal mechanism for transmitting literacy remained intact and effective. When Orthodox clergy were not available, literate Aleut deacons and readers could conduct regular services, and most males and many females could engage in other literate activities; Aleut books and documents were treasured possessions. The pattern of transmission established in Armenian literacy already in the fifth century was based on an articulated system of formal education, including teacher training schools to prepare the vartabeds, a new type of clergy who were teachers or 'doctors'. Learning to read and write was not limited to ecclesiastical



personnel; although it did not reach the levels of the mass literacy that appeared in various parts of the world in the eighteenth and nineteenth centuries, its institutionalization in the society was thoroughgoing. In general, the pattern of literacy transmission instituted by Eastern Orthodox tradition, as exemplified in the stories of Cyril, Stefan, Veniaminov, and a number of other missionary figures, was provided primarily for participation in religious ceremonies. Literacy acquisition by other members of the recipient society, although not excluded, was not the focus of attention. This reflects the state of literacy in Orthodox nations before the twentieth century. The difference in transmission patterns that evolved among Permians and Aleuts quite possibly had more to do with the local cultural conditions of the two societies than with the implicit aims of the introducers of vernacular literacy. In any case, it is clear that the pattern of institutional support is an important factor in determining the path of literacy development.

Conclusion

This once-over-lightly treatment of four cases of vernacular literacy is ample evidence, if such is needed, of the variation in patterns of literacy introduction even in cases that are selected for their close similarity in their motivations for social change and the nature of the beginning texts produced. The linguist may, with full justification, focus on the linguistic choices involved: the language variety chosen, the writing system adopted, and the registers and genres of early texts. But to understand the course of development of any vernacular literacy (including such narrowly linguistic topics as the path of standardization over time, the layering effects of spoken and written varieties, and the changes in the structure of the vernacular itself), one must look at the pragmatics of literacy. What is the source of the literacy initiative, how strong is it and how is it tied to other social forces and to key individuals? What is the scope of literacy uses, how restricted or expandable are they, how closely intertwined with communicative needs and social values? What means of institutional support and mechanisms of transmission emerge or are created for the new literacy behaviors in the society? Certainly the grand dichotomies of orality/literacy and nonliterate/literate seem far too crude and unfruitful. From my perspective we need much more in the way of detailed descriptions of the introduction of various kinds of literacies and the ways they function in various kinds of societies. Again, from my perspective, these descriptions should include historical case studies. Unfortunately we cannot send a Jack Goody or William Labov back in time, and there were no Scribner & Cole teams or Chafes or Tannens as researchers in those other times and places, but for some cases at least there is rich documentation, and insightful analyses are possible.
One last word about literacy and language learning, to justify my choice of topic for this paper in a volume of this sort. The language learning perspective on language change is greatly neglected. All language change involves learning; users of a language learn to do things differently, and the acquisition of new forms and functions is a constant phenomenon in the human use of language. In particular,



the introduction of literacy into a nonliterate community means, among other things, that members of the community learn new language behaviors that take their place among the existing patterns of language structure and use. In more linguistic jargon, they add new representations and new rules to their competence. Recent research in learning to read and write is overwhelmingly concerned with individual learning and with established literacies. That research needs complementing with the study of social learning in cases of incipient and developing societal literacy. I suggest that this is an important area in the almost unlimited field of research in pragmatics, and one with an ultimate payoff in our understanding of the processes of language learning.

17 Individual and Social in Language Change: Diachronic Changes in Politeness Agreement in Forms of Address

Tying together findings from macro- and micro-level research has been recognized as a major problem since the earliest days of explicitly sociolinguistic research, and Joshua Fishman's has been one of the clearest voices calling attention to it (Fishman 1972). The present paper explores some aspects of this problem in one traditional area of sociolinguistic research: forms of address. It does so from the pragmatic perspective of politeness phenomena and in terms of a recognized area of linguistic theory: grammatical agreement. It describes some mismatch phenomena in politeness agreement with second person pronouns in two languages, Persian and Portuguese, for which both macro- and micro-studies are available. Finally, it offers some cross-language generalizations that can be tested from other languages. The study of forms of address has been a fruitful area of sociolinguistic research, beginning with the frequently cited and reprinted classic paper of Brown & Gilman (1960) that has served as stimulus and model for hundreds of subsequent studies.1 From the original focus on the use of different second person pronouns this research has moved steadily to the analysis of larger systems of address forms that include pronouns, kinterms, names, titles, epithets, and interjections (cf. Bean 1978, Parkinson 1985). The most recent information on the state of research and theory is generally found in the publications of the Kiel research project on address forms, especially Braun 1988 and Winter 1984. Braun et al. 1986 provides an annotated bibliography of over 1100 items. Recent attempts at explanatory theory include Braun 1988: 7-67, 253-296; Brown & Levinson 1987: 198-204 et passim; Joseph 1987.

This paper originally appeared in The Influence of Language on Culture and Thought: Essays in honor of Joshua A. Fishman's sixty-fifth birthday, edited by R. L. Cooper and B. Spolsky, pp. 183-197. Berlin: Mouton de Gruyter.

Politeness phenomena in language have become an increasingly recognized research topic from a variety of perspectives, the best known general treatment being that of Brown & Levinson (1978, 1987), which offers a detailed model that is intended to be universal, based on the notions of positive and negative politeness and face threatening acts (FTAs). Brown & Gilman (1989) analyze some forms of address phenomena using this model. Recent interest in grammatical agreement can be conveniently dated to the "universals" paper by Moravcsik (1978). Agreement phenomena, by which one grammatical element matches another in terms of some categorial feature, present a challenge to contemporary theories of grammar, and linguists of different theoretical approaches and interested in different languages have struggled with agreement issues (cf. ESCOL 1984, Barlow & Ferguson 1988). Agreement in features of politeness has, however, received very little attention, often not being noted in lists of possible agreement features (e.g. Pullum 1985: 80-81, Lapointe 1988: 71).

Second Person Politeness Agreement

The two commonest patterns of grammatical agreement in languages are subject-verb agreement in such features as person, number, and gender and internal noun phrase agreement in such features as number, gender, case, and definiteness. Politeness agreement in forms of address may appear in either or both of these types in a given speech community. For example, if the speech community makes use of a language that has two or more second person subject pronouns of different politeness levels, the verb may show agreement with them in its subject markers. Thus in Bengali the verbs in (1) abc agree with the subject pronouns in politeness level.

(1) you speak
a. tui bolis     you ('inferior') speak
b. tumi bolo     you ('ordinary') speak
c. apni bolen    you ('honorific') speak

Likewise, a language that has two or more subject pronouns of different politeness levels may show agreement between these subject pronouns and coreferential object pronouns or possessive adjectives or pronouns. Thus, in Bengali the possessives in (2) abc agree with the subject pronouns.

(2) you forgot your book
a. tui tor boi bhule gechis
   you(i) your(i) book forgetting went
b. tumi tomar boi bhule goecho
   you(o) your(o) book forgetting went
c. apni apnar boi bhule goechen
   you(h) your(h) book forgetting went

These two forms of agreement in Bengali are quite straightforward, being similar to those in hundreds of languages. In Bengali, as tends to be the case in general, the number of grammatical categories overtly marked in the independent pronouns is greater than the number of categories overtly marked in the subject affixes, i.e. some features are neutralized in the verb affixes. For example, the pronouns are obligatorily marked for number (sg. tui, tumi, apni; pl. tora, tomra, apnara), but there is no number distinction in the verb. Also, the honorific verb forms do not distinguish between second and third person honorific as the independent pronouns do (e.g. apni, tini bolen 'you(h), he/she(h) speak(s)'). Bengali is a so-called "pro-drop" language in which the independent subject pronoun need not be present but the subject marker on the verb must be. Also Bengali normally has no overt copula in present or timeless constructions, as in (3) abc, although in other tense/aspect categories a verb of being (ho-, ach-, thak-) appears, with appropriate subject marker suffixes (Ferguson 1972).

(3) are you his brother?
a. tui-ki or bhai
   you(i)-Q that.one's brother
b. tumi-ki or bhai
   you(o)-Q that.one's brother
c. apni-ki or bhai
   you(h)-Q that.one's brother

Analytic problems arise concerning mismatches in agreement features, directionality of agreement, and the agreement status of omitted copula or verb arguments, and such problems would be treated differently depending on the analyst's grammatical theory or method of analysis. These syntactic phenomena are, however, in no way unusual or anomalous. Also, the conditions for occurrence or non-occurrence of the omissible items are, as usual, based on discourse structure or pragmatic considerations, as are the conditions for the selection of the appropriate politeness level of pronouns of the same person. Thus, these two examples of politeness agreement in Bengali make clear the kind of phenomena to be discussed in this paper.

Some speech communities make use of languages that do not show politeness levels in pronouns but indicate politeness in forms of address by other means.
For example, differences in the form of personal names in address may be very similar in communicative function to differences in politeness level in second person pronouns, as was pointed out in Brown & Ford (1961). More generally, politeness levels conditioned by the identity of speaker and addressee and their conversational strategies may appear in a variety of linguistic and nonlinguistic behaviors. Patterned cooccurrences of various politeness markers in forms of address often belong with lexical collocational constraints, systems of textual cohesion, or patterns of social interaction, but sometimes they belong clearly in the realm of grammatical agreement, as in the examples discussed here.
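
The Bengali agreement facts in (1) and (2) can be restated as a small lookup table. The following is a minimal sketch, not an analysis: the forms are those cited above, while the dictionary layout, the key names, and the function name `agrees` are assumptions introduced only for illustration.

```python
# Illustrative table of the Bengali second person forms cited in (1) and (2).
# The layout and all identifiers here are hypothetical, chosen for this sketch.
BENGALI_2ND_PERSON = {
    "inferior": {"pronoun": "tui", "possessive": "tor", "verb_speak": "bolis"},
    "ordinary": {"pronoun": "tumi", "possessive": "tomar", "verb_speak": "bolo"},
    "honorific": {"pronoun": "apni", "possessive": "apnar", "verb_speak": "bolen"},
}


def agrees(pronoun: str, other_form: str) -> bool:
    """Return True if other_form belongs to the same politeness level as pronoun."""
    for forms in BENGALI_2ND_PERSON.values():
        if forms["pronoun"] == pronoun:
            return other_form in forms.values()
    raise ValueError(f"unknown second person pronoun: {pronoun}")


if __name__ == "__main__":
    print(agrees("tumi", "bolo"))   # subject-verb agreement, as in (1b)
    print(agrees("tui", "apnar"))   # mismatch in politeness level
```

A mismatch in this sense is simply a pairing of forms drawn from different rows of the table, which is the kind of phenomenon examined later in the paper.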

Diachronic Change

One of the most common diachronic changes in pronominal systems is the introduction of a morphological or lexical distinction between the ordinary second person singular pronoun and a more deferential, formal, "polite" second person singular. Another common change is from predominantly nonreciprocal patterns to more reciprocal or symmetrical patterns. Presumably changes of these kinds are part of (and thus signal) significant social changes such as some kind of social class differentiation or the appearance of new social roles, and the changes make their way through the community from certain innovators and transmitters to ever larger sectors of the community. Like other sociolinguistic changes, a pronominal change of this kind presumably moves from occasional variable behavior by a few people in limited contexts toward categorical behavior by larger numbers of people in a broad range of contexts. In the narrow aspect of linguistic form, the innovation of a polite 2nd sg may take any of a number of different routes, such as the use of an honorific noun (e.g. 'excellence', 'grace', 'highness'), a third person pronoun (e.g. 'she', 'they'), or a word for 'self' as the new polite form, contrasting with the original second person singular pronoun. The most common of the various routes seems to be the use of the original second person plural to serve as a polite singular (Brown & Levinson 1987: loc. cit., Head 1978, Svennung 1958:373-393). If this first step is taken (i.e. plural for polite singular), alternative succeeding possibilities may be selected. For example, the change in linguistic form may stop here, as in French, where tu is the original 2nd sg and vous is now three ways ambiguous, meaning 2nd pl ordinary, 2nd sg polite, and 2nd pl polite. Another route is the creation of new plurals for the two singulars, while the original 2nd pl loses its plural sense. This is the case in Bengali, where tui is the original 2nd sg and tumi the original 2nd pl and there are now two plurals, tora, the plural of tui, and tomra, the plural of tumi (Chatterji 1926:816-820).
In this particular case a third level has come into existence from another source: the word apni 'self' has come to be used as more polite than tumi, and it has acquired the plural apnara (Chatterji 1926:846-851). English is an example of a language in which a still different route is taken, one quite unusual among the languages of the world. After the plural ye/you came to be used as a polite singular and the objective you absorbed the nominative ye, the original plural swallowed up the singular almost completely, leaving the singular thou/thee with only very marginal uses limited to certain registers and certain social groups (Byrne 1936, Leith 1983:106-110, Brown & Gilman 1989). With whatever route is taken, alternative agreement patterns are possible. For example, in languages such as French and Russian where the original plural pronoun is multiply ambiguous, the predicate agreement patterns may require singular agreement for real-world singulars and plural agreement for real-world plurals. Comrie 1975 examines some of the variety of predicate agreement patterns with "polite plurals" in ten languages and suggests important cross-language generalizations. His generalizations, however, have nothing to say about the value of semantic disambiguation, and, for the purposes of the present paper, have nothing at all to say about the communicative functions of the patterns. Corbett (1988:50) recognizes the general issue and recommends research: "The choice between agreement options is influenced by a range of sociolinguistic factors . . . analysis will require either extensive and careful informant work or the scanning of large corpora (or preferably both)".
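
The difference between the French and Bengali outcomes described above can be made concrete by listing, for each pronoun, the number and politeness readings it now carries. This is only a rough sketch under stated assumptions: the readings are taken from the description in the text, while the data layout, the labels, and the helper name `is_ambiguous` are hypothetical.

```python
# Readings (number, politeness level) for each pronoun, following the text.
# The labels and the structure of these tables are assumptions for illustration.
FRENCH = {
    "tu": {("sg", "ordinary")},
    "vous": {("pl", "ordinary"), ("sg", "polite"), ("pl", "polite")},
}

BENGALI = {
    "tui": {("sg", "inferior")},
    "tora": {("pl", "inferior")},
    "tumi": {("sg", "ordinary")},
    "tomra": {("pl", "ordinary")},
    "apni": {("sg", "honorific")},
    "apnara": {("pl", "honorific")},
}


def is_ambiguous(system: dict, form: str) -> bool:
    """A form is ambiguous if it has more than one (number, politeness) reading."""
    return len(system[form]) > 1


if __name__ == "__main__":
    for name, system in (("French", FRENCH), ("Bengali", BENGALI)):
        ambiguous = sorted(f for f in system if is_ambiguous(system, f))
        print(name, "ambiguous forms:", ambiguous)
```

On this representation French vous carries three readings, whereas every Bengali form carries exactly one, which is why predicate agreement can do disambiguating work in the former but not the latter.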

A few detailed studies of historical change in pronominal address systems are available, chiefly from Germanic and Romance languages. Particularly valuable for our purposes are Lapesa's articles on the history of Spanish second person pronouns and related phenomena (Lapesa 1970, 1978). At least one study (Head 1981) attempts to relate present-day variation in pronouns of address to the nature and rate of diachronic change, by comparing the rate of decrease in the use of V in addressing parents in different urban centers, as the system moves toward TT for parent-child interaction.

Macro and Micro

In spite of Fishman's efforts, the research stream of relatively large-scale analysis of speech community characteristics and the stream of relatively small-scale analysis of individual behavior and dyadic and small group interaction rarely meet. One area in which serious attempts have been made to connect the two kinds of research is that of phonological change, where sociolinguists are searching for a general theoretical model (e.g. Labov 1972, Milroy & Milroy 1985). The productive research methods used in the original Brown & Gilman paper were varied but were largely 'macro' in effect, providing considerable information on the forms and uses of pronouns of address in English, French, German, Italian and Spanish (and a little on other languages) and in addition offering a general semantic framework for analysis and a hypothesis of a universal historical process of change. The authors made use of (a) direct observation of behavior in long conversations with speakers of these languages, (b) elicitation of informants' self report by questionnaires and interviews, (c) analysis of written texts: letters, legal proceedings, plays, and other literary texts, and (d) consultation of secondary sources: language histories and published descriptions of pronoun semantics in particular languages. Some later investigators have made use of more 'micro' research methods, chiefly the analysis of recorded spontaneous conversation and the sequential analysis of conversation in modern dramas. A few investigators have even used experimental intervention (cf. Lambert & Tucker 1976). The studies of Persian and Portuguese examined here are among the very few that have combined macro and micro levels of analysis. The Persian study, reported in Baumgardner 1982, was based in the first instance on a 108-item questionnaire administered in an open-ended interview format to 125 native speakers of Persian, born in Tehran or having lived there at least 20 years.
The respondents varied in social class, sex, and age (3 x 2 x design). The self-report data from these interviews were recorded. The micro data consisted of some 20 hours of telephone conversations of a 23-year-old female with 29 different interlocutors: family, friends, and strangers (wrong numbers). The woman herself and many of the interlocutors were among the respondents of the questionnaire-interviews, making it possible to check self-report against actual behavior.

One study of Brazilian Portuguese, reported in Head 1976 and elsewhere, was based on written responses to questionnaires from 137 young adults, all high school teachers, university students, or both. Thus, the respondents were very similar in age, educational level, and income, but from different geographical areas. Four large cities (Sao Paulo, Rio de Janeiro, Porto Alegre, Salvador) and a smaller city inland in the state of Sao Paulo are represented in the data. Another study, reported in Jensen 1981 and elsewhere, was based on "many hundreds of written questionnaires and many hours of recordings obtained in the capital city and interior areas of the states of Rio Grande do Sul, Sao Paulo, Rio de Janeiro, and Ceara" (Jensen 1981:53). The micro data on Brazilian Portuguese come from the extensive dialogue in a popular modern novel, Jorge Amado's Dona Flor e Seus Dois Maridos ('Dona Flor and her two husbands'). This novel has a very large cast of characters of varied ages and social classes, and the dialogue takes place under a great variety of circumstances, from quite usual ones to scenes of sheer fantasy. Jensen (1982) rates the dialogue as "natural, colloquial" and claims that "the conventions used to depict social relationships through dialogue [are] the same ones recognized and used by the community."

The Case of Persian

The pronominal forms of address in Persian have not been described in detail until recently. Grammars and dictionaries of modern Persian usually treat the second person pronouns and related address forms very briefly and without discussion of sociolinguistic variation or change in progress. Recent treatment of the Persian pronominal forms of address may be dated to Hodge (1957), which deals with a range of sociolinguistic variation in modern spoken Persian and includes explicit treatment of forms of address. The first systematic study with a substantial number of subjects and quantitative data is Jahangiri's unpublished dissertation (Jahangiri 1980). Batani (1976) contains a paper on forms of address based on observations in Tehran, and Beeman (1986), based on extended participant observation in the city of Shiraz, a large village near Shiraz, and a number of other sites in Iran, includes an informative and insightful account of the subject (147-151). Baumgardner, however, offers the fullest treatment, although unfortunately his paper on radio dramas (1978) and his Ph.D. dissertation (1982) remain unpublished.

Early Modern Persian had the simple singular-plural opposition to : soma with corresponding object and possessive forms, both independent and enclitic/affixal, and agreeing subject markers as suffixes on the verb. This system remains in use in some rural dialects today, but most Modern Persian, including the standard spoken and written varieties, exhibits the innovation of using the plural soma also as a polite singular. An elaborate politeness system of forms of address which includes lexical replacements for pronouns, such as jenab-e ali 'exalted sir' for 'you', and a set of "self-lowering" and "other-raising" lexical replacements for some common verbs is in place (Baumgardner 1982, Beeman 1986); here, however, only the pronoun forms proper will be discussed.
The possessive/objective suffixes and verb subject markers corresponding to to and soma are as in Table 17-1.

Table 17-1. Persian Second Person Pronouns

independent pronouns    poss/obj suffixes    subject marker suffixes
to                      -oet                 -i                         'you (sg)'
soma                    -etan                -id                        'you (pl; pol sg)'

A new form, somaha 'you (pl)', appears in informal speech, resolving the singular : plural ambiguity of soma, but apparently not differentiating the to and soma politeness levels. There is no evidence of the appearance of *toha as unambiguous T plural and concurrent restriction of somaha to the value of V plural, but this possibility for future development remains open. It must be noted that the -id ending alternates with -in. This difference, which apparently earlier was a regional dialect variation reflecting two different developments of earlier Persian verb morphology, has become a register variation, so that in Tehran and elsewhere today the -id is a more formal variant alongside the more informal colloquial -in.

The mismatch in grammatical agreement which is the focus of this paper is the occurrence of the independent pronoun soma with (a) the verb ending -i appropriate for to agreement and/or (b) the possessive/objective suffix -oet instead of -etan.

The results of the extensive macro research of Baumgardner (1982) show that the use of TT, VV, TV, VT patterns in dyadic interaction varies in relation to differences in age, gender, social class, and the presence of 'outsiders' (in the course of family interaction). Since the research was limited to Tehran and vicinity, no regional differences were noted. The data came from self-report in response to (a) an elaborate 100-item questionnaire, each item containing multiple subquestions, (b) 8 specific attitudinal questions, and (c) a 21-item information sheet to identify the respondent in terms of a variety of possibly pertinent factors. This result offers a detailed, informative analysis of the relationship of Iranian (pre-Revolutionary) social organization and patterns of verbal communication. This kind of analysis contrasts sharply with the few lines of description of 2nd person pronouns offered by grammars of modern Persian, which provide almost no indication of the range of uses of to and soma.
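The agreement mismatch defined above amounts to a small decision rule, which the following Python sketch spells out. It is an illustration only and not part of Baumgardner's analysis. The forms are the romanized ones of Table 17-1; the function name, the argument names, and the label M (for the mixed use of soma with T morphology) are assumptions made for the example. The combination of to with V morphology is not described in the text, so the sketch refuses to classify it rather than guess.

```python
# Illustrative sketch only: forms follow Table 17-1 as printed; the function
# and argument names are hypothetical, not taken from the source study.
T_VERB_ENDINGS = {"-i"}
V_VERB_ENDINGS = {"-id", "-in"}  # -in is the informal alternant of -id
T_SUFFIX, V_SUFFIX = "-oet", "-etan"


def classify_address(pronoun, verb_ending=None, poss_obj_suffix=None):
    """Label one instance of singular address as 'T', 'V', or 'M' (mixed)."""
    has_t = verb_ending in T_VERB_ENDINGS or poss_obj_suffix == T_SUFFIX
    has_v = verb_ending in V_VERB_ENDINGS or poss_obj_suffix == V_SUFFIX
    if pronoun == "to":
        if has_v:
            # Not discussed in the text, so deliberately left unclassified.
            raise ValueError("to with V morphology is not described")
        return "T"
    if pronoun == "soma":
        # soma with the verb ending -i and/or the suffix -oet is the mismatch.
        return "M" if has_t else "V"
    raise ValueError(f"unknown pronoun: {pronoun!r}")
```

Under these assumptions, soma with -id or -in is classified as plain V, and soma with -i or -oet as mixed.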
This kind of data, however, does not deal with how the T/V system is used "in ways which vary from the idealized usages associated with the demographic categories of social class, sex, and age" (Baumgardner 1982:169). Also it does not show how a speaker uses the system, for example, "to express a transient attitude or feeling, or manipulatively, to get the addressee to act in a desired manner" (Baumgardner loc. cit.). In particular, it does not reveal the agreement mismatches that are the topic of the present paper. Baumgardner's microanalysis is largely congruent with the self-report data, but it introduces the systematic use of mixed forms (M), i.e. the use of soma with T morphology. Thus in addition to the four patterns examined in the macro analysis, there are five patterns in which at least one of the interlocutors used a mixed form: MM, MT, MV, TM, VM. These new patterns neatly subdivide patterns of the macro analysis. The speaker, for example, split the VT patterns she used in
talking with her parents and a set of uncle/aunt/in-law relations into MT for parents and VT for the others. In each case the use of the mixed form combined intimacy (close kin) with respect (older age). The micro analysis also showed switches during a conversation that revealed stages in changing from one pattern to another as the signaling of intimacy increased, and revealed that particular components of the genre (telephone conversation) were the loci of the incipient change of levels. Some speakers of Persian do not admit to using this kind of mismatch; others recognize its use when attention is called to it. It has very occasionally been noted in descriptions of Persian (e.g. Lazard 1957), and at least one modern Iranian novelist, Ismael Fasih, has used it to literary effect (Baumgardner 1982:169). It is a striking example of the stretching of a grammatical pattern for communicative functions, a phenomenon that is probably widespread in human language behavior but rarely noted in grammars or in discussions of syntactic theory. At the same time, we may hypothesize that the use of mixed forms is an indication of a change in progress toward increased use of the reciprocal TT pattern, a kind of diachronic change reported in the original Brown and Gilman paper, and hypothesized also in the Portuguese case to be described next (cf. Head 1981).
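The inventory of dyadic patterns in the Persian data can be checked by simple enumeration. The Python sketch below is an illustration under stated assumptions, not a reproduction of Baumgardner's procedure. The first letter of each pattern is the form used by the speaker and the second the form received. The helper macro_equivalent is hypothetical: it assumes that a mixed form surfaces as V in self-report, which the text confirms only for the split of VT into MT and VT.

```python
from itertools import product

FORMS = ("T", "V", "M")  # M = mixed form (soma with T morphology)

# Every ordered pair of forms: speaker's form first, form received second.
all_patterns = ["".join(pair) for pair in product(FORMS, repeat=2)]

# The four patterns visible to the macro (self-report) analysis.
macro_patterns = [p for p in all_patterns if "M" not in p]

# The five patterns in which at least one interlocutor uses a mixed form.
mixed_patterns = [p for p in all_patterns if "M" in p]


def macro_equivalent(pattern):
    """Hypothetical mapping: treat M as a refinement of V (e.g. MT -> VT)."""
    return pattern.replace("M", "V")


print(sorted(macro_patterns))  # ['TT', 'TV', 'VT', 'VV']
print(sorted(mixed_patterns))  # ['MM', 'MT', 'MV', 'TM', 'VM']
```

Three forms give nine ordered pairs, which is why the four macro patterns and the five mixed patterns together exhaust the possibilities.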

The Case of Portuguese

The pronominal forms of address in Portuguese have been the subject of many studies (e.g. Braun 1988:77-99, Head 1976, Kilbury-Meissner 1982; extensive bibliography in Head 1976), and at least two extensive sociolinguistic studies of forms of address in Brazilian Portuguese were conducted in the 1970s (cf. Head 1976, Jensen 1981). Portuguese (and predecessor varieties of Latin) had the simple singular-plural opposition tu : vos with corresponding object and possessive forms and agreeing subject-markers as suffixes on the verb. At some point a noun phrase "your mercy" came to be used as a polite singular that was in due course shortened to voce and acquired a plural voces. As voce lost some of its significance of politeness or high status it moved toward equivalence with tu, and a new polite form o senhor 'the lord' or 'the gentleman' (feminine a senhora) came to be the polite second person singular pronoun and in due course os senhores, as senhoras its plural. Thus Portuguese seems to have three levels: tu, voce, and o senhor, with corresponding plurals. The verb endings agreeing with voce and o senhor are, however, those of the third person, so that when the independent pronoun is not present, as is often the case in 'pro-drop' Portuguese, the verb forms are ambiguous between 2nd and 3rd person and also between the voce and o senhor level of second person.

The forms and functions of Portuguese pronouns of address are, however, much more complex than the three-level model sketched here, which is that of the standard written language. For example, additional polite forms are in use (e.g. vosmice, Vossa Excelencia) and various ways of avoiding specifying the level of politeness occur. Also, the amount of regional and social variation is great. In ordinary conversation and writing many speakers of Portuguese hardly use the original vos forms at all except for certain ritual formulas (e.g. certain versions of the Lord's Prayer) and set phrases (e.g. Vossa Senhoria 'Dear Sir' in business letters), and many speakers of Portuguese simply do not know the verb endings with vos. Further, in some geographical areas and social groups the tu form is not used in speech at all and the verb endings agreeing with it may not be in the active competence of speakers. In some areas the informal : polite opposition is essentially between voce and o senhor, whereas in other areas it is between tu and o senhor. Details of this great variability, at least for some areas and social classes, appear in Head 1976, Jensen 1981, and elsewhere.

What is of special interest for the present paper is the use of mixed forms of address, the phenomenon referred to in Portuguese grammar and prescriptive statements as mistura de tratamento. Although not mentioned in some grammars and condemned in pedagogical and other prescriptive texts, this mismatching of categories, the "combination of pronouns from different personal components of the traditional paradigm" (Head 1976:301), is probably quite widespread. It is immediately evident in recordings of spontaneous conversation and is well attested in the dialog sections of dramas and novels (Jensen 1977, 1981, 1982). The mistura is of two types, disagreement between subject and verb and disagreement between subject pronoun and object pronoun or possessive. The second type is not always a matter of sentence-internal grammatical agreement in the sense of sentences (2) abc above, but includes the use of a subject pronoun of one level and the object pronoun of a different level by the same speaker to the same addressee in the same discourse.
The first type is relatively unimportant for the purposes of this paper since it is marginal (limited chiefly to the use of tu with 3rd person endings) and is arguably part of a general syntactic change in progress in Portuguese by which subject-verb agreement is being lost (Naro 1981). It is of interest that the marginal opposition between the use of tu with 3rd person agreement and tu with 2nd person agreement (referred to by Jensen as tu and tu + s respectively) is used by individual (educated) speakers and authors in several ways. The tu + s pattern often carries a poetic, literary, or 'exalted' flavor or represents imagined conversations or talking to oneself, in contrast to the more natural, conversational tone of tu with 3rd person agreement. This opposition is, of course, not available to speakers who lack full competence in the 2nd person verb forms.

The second type of "disagreement" in Portuguese is "the speaker's freedom to choose among various members of the paradigm nominally characteristic of the tu, voce, or o senhor sets to create a wide range of potentially subtle degrees of address" (Jensen 1982:253). The freedom of choice is in some contexts quite wide. For example, Jensen (1981:59) lists fourteen fully acceptable versions of the sentence "You were there; I saw you", and others may also occur. One pattern is the use of te (the object clitic form of tu) with voce or o senhor. This pattern of softening the strict agreement is quite common, and some of the respondents in questionnaire studies admit using it when asked specifically about it. Jensen suggests that "within the family the power semantic may lead to the use of o senhor with a parent but the form will be accompanied by the te of solidarity"
(1981:60). He also reports that even speakers who rarely or never use tu as a subject pronoun may on occasion use the object form as a softener for either voce or o senhor: "questionnaire data show that in certain dyads Sao Paulo speakers choose te as the most popular object pronoun to accompany voce (almost half— 48.6 percent—of occurrences of voce toward a teacher would be accompanied by te)" (57).

Discussion and Conclusions

In addition to the data reported here from Persian and Portuguese, evidence for instances of agreement mismatch in politeness levels can be found in other languages, such as Hindi (cf. Jain 1973) and Spanish (cf. Kany 1951), but apparently no micro analyses have been carried out on other languages that would demonstrate the communicative functions served by the mismatches. Also, detailed historical studies have been published of the changes in pronominal forms of address over time in various languages (e.g. for Spanish, Lapesa 1970 with numerous references). In such historical studies, however, even when times and places of variation in agreement are noted, attention is focused on phonological and morphological factors, not possible communicative functions of the mismatches. Thus these studies of Persian and Portuguese offer a valuable insight into the way agreement patterns may be stretched for communicative purposes and the way such pragmatically motivated variation may simultaneously be part of long-term morpho-syntactic change. Examination of the data from Persian and Portuguese suggests the following conclusions: four substantive cross-language generalizations, one general hypothesis of theoretical import, and a proposal for preferred research strategies.

Substantive Generalizations

1. If two or more politeness levels of second person pronouns exist in a language and participate in agreement patterns, whether subject : verb or subject pronoun : possessive/objective pronoun, "mismatch" or "disagreement" patterns may arise, typically (but not exclusively) a subject pronoun of higher politeness level with a verb form or possessive/objective pronoun of lower level, to express intermediate levels or conflicting components of politeness. Examples: Persian soma 'you (pl; pol sg)' → verb + -i 'you (sg ord)' (instead of -id 'you (pl; pol sg)'). Portuguese o senhor 'you (pol sg, subj)' → te 'you (sg ord, indir obj)' (instead of lhe or ao senhor).

2. Mismatching politeness agreements tend to be out of awareness and stigmatized, and are thus more readily discernible by "micro" than by "macro" research methods. Example: Many Persian speakers deny using the soma -i mismatch, but recordings of spontaneous conversations reveal principled use of it.

3. At the same time that politeness mismatch patterns of agreement serve communication functions, they are typically aspects of a change in progress in the pronoun system of the language, often a long-term change "simplifying" the system, i.e. reducing the number of overt grammatical categories and the amount of allomorphy, or a change increasing the "solidarity force" as against the "power semantic". Example: The Portuguese o senhor : te mismatch is part of the change from VT to TT pattern for children speaking with parents.

4. Diachronic changes in pronominal systems that mark politeness levels operate within universal typological (e.g. "markedness") constraints. Examples: Use of an original 2nd pl pronoun as pol sg is a common (probably the most common) source of a pol sg in the emergence of politeness levels; the converse (sg for pol pl) is rare or nonexistent. Greater number of politeness levels in the sg than in the pl is common; greater number in pl is rare or nonexistent.

General Hypothesis

Diachronic change in a language may proceed by having communicative functions ("pragmatic considerations") override strict syntactic patterns ("rules") in the short term, as part of a long-term syntactic reorganization.

Research Strategy

Macro methods (e.g. questionnaires, interviews) are valuable for showing distribution of politeness agreement phenomena in a sociolinguistic community, but are inadequate for discerning discrepant patterns. Micro methods (e.g. analysis of spontaneous conversation) are valuable for showing phenomena out of awareness, but are inadequate or misleading for showing large-scale, long-term patterns of change. In order to document fully and reliably the kinds of morpho-syntactic change hypothesized here it is necessary to dovetail the two types of research. For example, if a change in the pattern of forms of address is apparent from macroanalysis, one can proceed to a carefully focused microanalysis to determine whether an agreement mismatch is part of the process, and conversely, if a mismatch in politeness agreement is apparent from microanalysis, one can proceed to a carefully focused macroanalysis to determine whether the mismatch is part of a larger process of change in patterns of address. The relation between the stretching of a grammatical rule for pragmatic considerations and the larger structural changes in grammar and/or social interaction remains the explicandum, but even the discovery and description of a simple mismatch in grammatical agreement as part of a larger diachronic change can offer a contribution to the understanding of the "relationship between macro- and microsociolinguistic research" (Fishman 1972).



Notes

This paper was read at Stockholm University in October 1989, and its final version has benefitted from comments made by G. Guy, who read a version of the paper. Unfortunately, however, I bear full responsibility for errors of fact or interpretation that remain in it.

1. Following Brown & Gilman I use T and V as the abbreviations for the less and more polite forms of 2nd person pronouns, T/V as a shorthand abbreviation for a 2nd person pronoun system with politeness levels, and TT, TV, VT, and the like as indication of the patterns of use in dyadic interaction such that the first letter stands for the form used by EGO and the second letter the form EGO receives from the interlocutor.

2. No attention is paid here to politeness levels in 1st and 3rd person pronouns even though they may be components of the same overall politeness system. Such levels may be involved in some of the issues raised here, and in a large-scale study they would have to be addressed. In the present paper, however, the focus has been kept as narrow as possible for the basic points to be made, so long as the omitted material would not conflict with the conclusions.

References

Barlow, M., and Ferguson, C. A. (eds.). 1988. Agreement in natural language: Approaches, theories, descriptions. Stanford, CA: CSLI.
Batani, M. R. 1976. Problems in general linguistics: Ten articles by Mohammed Reza Batani. Tehran: Agah. [In Persian.]
Baumgardner, R. J. 1978. A new T/V series: Persian pronouns of address. Paper read at the Summer meeting of the LSA, University of Illinois, Urbana-Champaign.
Baumgardner, R. J. 1982. Sociolinguistic aspects of Persian pronouns of address: A macro/micro analysis. Unpubl. Ph.D. diss., University of Southern California.
Bean, S. 1978. Symbolic and pragmatic semantics: A Kannada system of address. Chicago.
Beeman, W. O. 1986. Language, status, and power in Iran. Bloomington: Indiana University Press.
Braun, F. 1988. Terms of address: Problems of patterns and usages in various languages and cultures. Berlin, New York: Mouton de Gruyter.
Braun, F., A. Kohz, and K. Schubert. 1986. Anredeforschung: Kommentierte Bibliographie zur Soziolinguistik der Anrede. Tübingen: Narr.
Brown, P. and Levinson, S. C. 1978. "Universals in language usage: Politeness phenomena," in E. N. Goody (ed.), Questions and politeness: Strategies in social interaction. Cambridge: Cambridge University Press.
Brown, P. and Levinson, S. C. 1987. Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
Brown, R. W. and Ford, M. 1961. "Address in American English." Journal of Abnormal and Social Psychology 62:375-85.
Brown, R. W. and Gilman, A. 1960. "Pronouns of power and solidarity," in T. A. Sebeok (ed.), Style in language. Cambridge, MA: MIT Press.
Byrne, St. G. 1936. Shakespeare's use of the pronouns of address: Its significance in characterization and motivation. New York: Haskell House.
Chatterji, S. K. 1926. The origin and development of the Bengali language. 2 vols. Calcutta: Calcutta University Press.
Comrie, B. 1975. "Polite plurals and predicate agreement." Language 51:406-18.
Corbett, G. G. 1988. "Agreement: A partial specification based on Slavonic data," in M. Barlow & C. A. Ferguson (eds.), Agreement in natural language. Stanford, CA: CSLI.
ESCOL. 1984. Proceedings of the First Eastern States Conference on Linguistics. Columbus: Ohio State University.
Ferguson, C. A. 1972. "Verbs of 'being' in Bengali with a note on Amharic," in J. W. Verhaar (ed.), The verb 'be' and its synonyms. Part 5. Dordrecht: D. Reidel.
Fishman, J. A. 1972. "The relationship between micro- and macro-linguistics in the study of who speaks what language to whom and when," in J. B. Pride & J. Holmes (eds.), Sociolinguistics. Harmondsworth, Middlesex: Penguin.
Head, B. F. 1976. "Social factors in the use of pronouns for the addressee in Brazilian Portuguese," in J. Schmidt-Radefeldt (ed.), Readings in Portuguese linguistics. Amsterdam: North Holland.
Head, B. F. 1978. "Respect degrees in pronominal reference," in J. H. Greenberg et al. (eds.), Universals of human language. Vol. 3: Word structure. Stanford, CA: Stanford University Press.
Head, B. F. 1981. "Variation and rate of change in the diffusion of new patterns of address," in D. Sankoff & H. Cedergren (eds.), Variation omnibus. Carbondale & Edmonton: Linguistic Research.
Hodge, C. T. 1957. "Some aspects of Persian style." Language 33:355-69.
Jahangiri, N. 1980. A sociolinguistic study of Tehrani Persian. Unpubl. Ph.D. diss., University of London.
Jain, D. 1973. Pronominal usage in Hindi. Unpubl. Ph.D. diss., University of Pennsylvania.
Jensen, J. B. 1976. "A investigação de formas de tratamento e a telenovela: A Escalada, Part I." Revista Brasileira de Lingüística 4, 2:45-73.
Jensen, J. B. 1981. "Forms of address in Brazilian Portuguese: Oriental honorifics or standard European?," in B. H. Bichakjian (ed.), From linguistics to literature: Romance studies offered to Francis M. Rogers. Amsterdam: John Benjamins.
Jensen, J. B. 1982. "Dona Flor and her five forms of address." Luso-Brazilian Review 19:251-66.
Joseph, J. E. "Subject relevance and deferential address in Indo-European languages." Lingua 73:259-77.
Kany, C. E. 1951. American Spanish syntax. 2nd ed. Chicago: University of Chicago Press.
Kilbury-Meissner, U. 1982. Die portugiesischen Anredeformen in soziolinguistischer Sicht. Hamburg: Helmut Buske Verlag.
Laberge, S. and Sankoff, G. 1980. "Anything you can do," in G. Sankoff, The social life of language. Philadelphia: University of Pennsylvania Press.
Labov, W. 1972. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Lambert, W. E., and Tucker, G. R. 1976. Tu, vous, usted. Rowley, MA: Newbury House.
Lapesa Melgar, E. 1970. "Personas gramaticales y tratamientos en español," in Homenaje a M. Pidal IV, Revista de la Universidad de Madrid 19, 74:141-167.
Lapesa Melgar, E. 1978. "Las formas verbales de segunda persona y los orígenes del 'voseo'," in C. H. Magis (ed.), Actas del III Congreso Internacional de Hispanistas. México, DF: Colegio de México.
Lapointe, S. G. 1980. A theory of grammatical agreement. Ph.D. diss., University of Massachusetts, Amherst.
Lazard, G. 1957. Grammaire du persan contemporain. Paris: Klincksieck.
Leith, D. 1983. A social history of English. London: Routledge & Kegan Paul.
Milroy, J. and Milroy, L. 1985. "Linguistic change, social network and speaker innovation." Journal of Linguistics 21:339-384.
Moravcsik, E. 1978. "Agreement," in J. H. Greenberg et al. (eds.), Universals of human language. Vol. 4: Syntax. Stanford, CA: Stanford University Press.
Naro, A. J. 1981. "The social and structural dimensions of a structural change." Language 57:63-98.
Parkinson, D. B. 1985. Constructing the social context of communication. Berlin: Mouton de Gruyter.
Pullum, G. K. 1985. "How complex could an agreement system be?," in G. Alvarez et al. (eds.), ESCOL '84. Columbus, Ohio: Ohio State University.
Svennung, J. 1958. Anredeformen: Vergleichende Forschungen. Uppsala: Almqvist & Wiksell; Wiesbaden: Harrassowitz.
Winter, W. (ed.). 1984. Anredeverhalten. Tübingen: Narr.

18 Variation and Drift: Loss of Agreement in Germanic

It is very difficult to know the where and the how of the process of morphological simplification. (Leith)

Grammatical simplification takes place not suddenly and from one cause, but gradually and from a variety of causes. (Jespersen)

This paper benefited from the discussion that followed an oral presentation at the Department of English, Lund University, April 1991. Helpful comments also were received from those who read an earlier version: William Croft, Nils Enkvist, Joseph Greenberg, Gregory Guy, and Elizabeth Traugott. Remaining errors and inadequacies are the author's responsibility.

Two questions raised in the classic "empirical foundations" monograph (Weinreich, Labov, and Herzog 1968) are the inspiration for this paper: (1) Why do changes in a structural feature take place in a particular language at a given time, but not in other languages with the same feature, or in the same language at other times? and (2) How does a given historical change acquire fresh significance when viewed as part of a long-range trend? It has repeatedly been noted that the Germanic languages in their history from proto-Germanic to the present seem to have been moving, at different rates and by differing pathways, from a relatively "synthetic" grammatical structure to a relatively "analytic" structure (i.e., from a complex inflectional morphology toward an ever simpler one), so that the phenomena of grammatical agreement amount to less and less and inflectional affixes are replaced in their syntactic and semantic roles by separate words such as prepositions and adverbs and by features of word order (cf. Berndt 1956; Werner 1984). This long-range shift in structural type may be called "drift" in one of the senses in which Sapir's term has been used (Sapir 1921; Venneman 1975; Itkonen 1977; Malkiel 1981; Andersen 1990). The particular aspect of the drift that will be the focus of this paper is the gradual loss of grammatical agreement.

The earliest attested Germanic languages (and hence the reconstructed proto-Germanic) had a fairly complex pattern of subject-verb agreement, in person and number, and internal noun-phrase agreement, in gender, number, and case. Present-day Germanic languages have much reduced systems of agreement. When a given language is moving toward morphological simplification, whether from internal or external causes, it may be assumed that the drift will tend to be manifested in the agreement system by (a) reduction in the number of categories in which agreement occurs and in the number of alternative patterns of agreement operating and (b) regularization of agreement, including regularization of allomorphy of markers and regularization of the conditions of occurrence of alternative agreement patterns (Ferguson 1989, 13-14); for these two basic components of "simplification" in language, compare Muhlhausler 1974.

In this paper we will examine the drift toward loss of agreement in two Germanic languages, English and Swedish, and in two sets of agreement phenomena, subject-verb agreement and the agreement of the so-called strong/weak alternative forms of adjectives within the noun phrase. Modern English has lost almost all the subject-verb agreement phenomena of Old English but retains with vigor the -s ending in the present tense for agreement with third person singular subjects; ordinary spoken modern Swedish, on the other hand, has lost all traces of subject-verb agreement. In contrast, English has lost all traces of the Old English strong/weak declensions of adjectives, but Swedish still retains a much reduced, but solidly in place, version of the contrast. The time frames for the histories of English and Swedish do not match exactly. The first written sources available for English are Mercian texts from the eighth and ninth centuries. Old Swedish texts go back only to the thirteenth century; before that time inscriptions and documents are not sufficiently differentiated varieties of Common Scandinavian to be labeled linguistically Swedish.
Despite this mismatch in time, the limitation on coverage to two languages and two subtopics, and the inevitable omission of a great deal of the detail involved, the paper makes some useful points on the relationship between variation and drift. The paper is written from the general variationist perspective of Labov, but no attempt is made to use the familiar Labovian research methodology as such. First subject-verb agreement will be discussed and then adjective agreement in definiteness; for each, changes will be discussed as successive reductions, unexpected innovations, and present-day variation. This will be followed by a section of "general considerations" and a final brief "moral."

Subject-Verb Agreement

First Reduction

In proto-Germanic the verb had separate forms for each person and number (i.e., six distinct forms in the present tense).1 By the time Old English and Old Swedish appear, however, both of them have reduced the number of distinct verb forms and thereby the number of agreement categories. The two innovations are, however, different. In Old English the three persons of the plural have fallen together, and the old verb form for 3pl is used for 1pl and 2pl as well. In Old Swedish, on the other hand, the three persons of the singular have fallen together, and the old 2sg form is used also for 1sg and 3sg (cf. Table 18-1). In both languages this first innovation gives an indication of the eventual outcome. In English it is the -th/-s of the plural and 3sg that ends up being the one remnant of subject agreement in the verb, and in Swedish it is the -r of the 2sg that ends up being the invariable present tense suffix for all persons and numbers. Because these two innovations took place before the first recorded texts, there is no way to discover the details of the variation that must have accompanied the changes.

It is a sober reminder of the inadequacy of current notions of markedness or naturalness that of the two languages, beginning from roughly the same structure and both "simplifying," one collapsed the three persons of the plural and the other the three persons of the singular. It is often assumed that the plural is more marked than the singular in declensions and conjugations and therefore less likely to show as many distinctions (e.g., of case or person), and many languages can be cited to show the strength of this assumption. Consequently, it would seem more natural for the plural categories to fall together (as in English) rather than the singular categories (as in Swedish). Also, it is often assumed that the third person is the least marked of the three persons, and is therefore most likely to have no overt subject marker in the verb, to be the person in which a copula is most likely to be omitted, and the person whose subject marker is most likely to spread to the other persons. Consequently, it would seem more natural for the third person forms to spread to (i.e., to merge with or replace) the others, as in the English merger in the plural endings. The Swedish merger fits this tendency in part because the 2sg ending spreads first to 3sg then to 1sg (relevant forms attested in Common Scandinavian inscriptions, Haugen 1976, 158).
The patterns of variation that appear when mergers such as these take place are well worth studying because presumably phonetic factors, morphological-semantic factors, lexical factors, and social transmission factors are all involved, and natural constraints or "universal tendencies" are operative. It is only by detailed case studies that cross-linguistic generalizations can be verified about the processes of diachronic change.

Second Reduction The next change in English came apparently from a sound change that operated without regard for inflectional simplification. Unstressed vowels merged: a o u e fell together and were represented by e in the spelling (presumably sooner or later pronounced as shewa). One outcome of this pervasive change in late Old English and early Middle English was the merger of the pl ending -ath and the 3sg ending -eth, a merger that was to play a role in English morphology from then until the present. Variation in the spelling of unstressed vowels is found in many documents of late Old English and early Middle English, but this variation apparently does not reflect directly the spoken variation that was taking place or had previously taken place. It can be assumed that the change started in some sector of the population and spread, that it was accelerated in certain phonetic environments as opposed to
others, and that its phonological status varied. The written variation has been of value, however, for historical linguists in that the occurrence of hypercorrect "wrong" vowels in a text shows that the merger has already taken place in the speech of the writer or author or copyist (cf. Pyles and Algeo 1982, 152). The written variation is even sufficient to indicate, with careful statistical analysis, that the merger of unstressed vowels took place after the change of final -m to -n and overlapped with the dropping of final -n (cf. Moore 1928). It is even possible to hazard some hypotheses about the phonological status of the final reduced -e (Minkova 1983). Once again, the patterns of variation that appear when mergers such as this take place are worth studying, if for no other reason than that the phenomenon of merger of unstressed vowels is so common in the world's languages. It is indeed so common that some have been tempted to raise it to a kind of universal, but like many other cases of markedness or naturalness, low probability counterinstances do occur, as in the falling together of STRESSED short vowels in many varieties of spoken Arabic in the Syria-Lebanon-Palestine area. In any case, the traditional view is fairly convincing that this falling together of grammatical categories resulted from a sound change and was not morphologically conditioned like the first reduction. The reduction of unstressed vowels (i.e., the neutralization of phonological oppositions that are distinctive in stressed vowels) has remained an active process in modern English, evident especially in loanwords from other languages and in L2 acquisition by speakers of English. On the Swedish side, the plural verb endings persisted until early Modern Swedish times. 
In the sixteenth and seventeenth centuries a merger of the personal endings of the plural began its long, gradual course, partly phonetic, partly morphological, complicated by dialect differences, register differences, and writing conventions. Some of this variation will be summarized in the section entitled "Present Variation" below.

Unexpected Innovations While morphological simplification was in full swing in late Old English, two not obviously simplifying innovations took place whose effects are still working themselves out in modern English. One was the morphologically conditioned sound change of -th to -s in verb endings; the other was the limitation of subject-verb agreement to the 3sg present tense. For the change to s, careful statistical analysis of the variation in late Old English texts has made it possible to trace the path of its diffusion and some of the favoring and disfavoring factors that operated in its spread (cf. especially Berndt 1956). It is clear that the change came from the northern dialects and moved into the central and eventually even southern dialects in the process of the overall standardization of spoken and written English. The change from th to s is "natural" in the sense that it is from a more marked to a less marked sound, what has been called "regression to the unmarked," but it is much less common in ordinary diachronic change than th to r. The th to s change is much more typical of transfer in second-language acquisition or bilingualism. It is tempting to hypothesize that one factor in this change was the presence in the north of England of a population of "Danes" (i.e., speakers of varieties of Common Scandinavian who had settled there) and the probable widespread English/"Danish" bilingualism, but this source seems very unlikely because the Scandinavian language(s) at that time still retained the interdental fricative, although not in the verb endings where the change took place. In any case, the change of th to s was not a sound change in the neogrammarian sense at all, since it was limited to these verb endings. At most we can suppose that the spread of such a change, once initiated, would take place more readily than a less natural change such as s to th. Careful counts of the occurrences of the variants in extant texts have shown morpho-syntactic (Berndt 1956), lexical, and discoursal factors operative in the spread of this sibilation. The chief morpho-syntactic factors were person and number of the verb, the presence versus absence of the independent subject pronoun, and the distance between subject and verb. In the early northern texts, if we set aside the 1sg and 2sg present tense forms (1sg preserved the vocalic ending and 2sg already ended in -s as opposed to southern -st), the rank order of frequency of -s forms in 1pl, 2pl, 3pl, and 3sg exactly matched the rank order of frequency of occurrence of subject pronouns (as opposed to their omission). In the 3pl with nonpronominal subjects, the incidence of -s forms was greater the greater the distance between subject and verb. The chief lexical factors were the relative frequency and the colloquial versus learned quality of the verb. Variation studies of these lexical factors in Old English and Middle English times have not been systematically carried out, but several studies of early Modern English authors provide valuable clues.
The -s form occurs in written texts more often in certain stock phrases common in the spoken language such as me seems, how comes it. Conversely, the auxiliaries hath and doth, which were surely pronounced with s in speech, tend to preserve the th in writing by virtue of their high frequency in traditional written texts. The most clearcut evidence for discoursal factors comes from comparisons of prose and poetry by the same author and comparisons of formal scientific or religious texts (higher frequency of the th variant) with imaginative, informal texts and texts, such as drama, representing speech (higher frequency of the s variant). For a discussion see Jespersen 1968 and Stein 1990; for the general issue of genre as a factor in such variation see also Devitt 1988. It is striking that very similar lexical/discoursal factors are operative in the occurrence of nonstandard -s endings in the present-day English of Reading: "vernacular" verbs and have and do as auxiliaries (Cheshire 1982). The only phonological factor noted by Jespersen and others in these studies is the tendency of verb stems ending in a sibilant to retain the th (e.g., passeth vs. passes). It is not clear whether this reflects spoken usage or a writing convention. Complicating the variation between -th and -s was the variation between presence vs. absence of any ending at all. This will be discussed under the second innovation in the following section. The limitation of agreement to the 3sg is an important feature of the standardization of English. It is an instance of an unnatural, marked construction becoming
accepted as a sign of the standard language as opposed to various nonstandard dialectal variants. By Middle English times the spread of various changes had resulted in divergent patterns of present tense inflection in the Northern, Midland, and Southern dialect areas. Every one of these changes represented a morphological simplification, but the final outcome (limitation to a 3sg subject marker on the verb) represented a compromise among the changes that is not so obviously simplifying. The two simplifying trends affecting the 3sg marker that were evident in Middle English times (and indeed continue to the present) were (a) the loss of endings under various conditions and (b) the spread of one ending to other numbers/persons. The ultimate end of (a) would be a present tense without subject markers; the ultimate end of (b) would be the same ending on all persons and numbers (i.e., an ending in effect marking the present tense) or merger of the endings for particular persons/numbers across various moods and tenses. The trend of loss of endings apparently began with the reduction or dropping of plural endings when the verb was immediately followed by the subject pronoun, as binde we for we bindath, and then spread to the singular and to constructions in which the pronoun preceded. Along with this reduction went the dropping of final -e and -en that took place at various times and places. The trend of generalizing a particular ending included the spread of -(e)s to all of the present tense except 1sg in the Northern dialects and the spread of the -th of the plural and the 3sg to the first person in the South. The generalizing trend in the Midlands was to spread the plural -en of the present subjunctive to the present indicative and to the preterit. Thus it was only the Midland dialects that maintained a distinction between the plural (-en) and the 3sg (-eth).
Given the amount of information available on the variation in Old and Middle English times, we can only assume that the present standard pattern of 3sg -s resulted from the overlapping operation of (a) the spread of the Midland pattern of distinct 3sg agreement as standard, and (b) the change of -th to -s.

Present Variation The first reduction in both English (plural merger) and Swedish (singular merger) has apparently left no residue in the modern languages, and the second reduction in English (vowel merger) likewise ran its course to completion. The change of -th to -s in English has almost run its course, but a few remnants remain. The limitation to 3sg agreement in English is now everywhere standard (apart from an old-fashioned use of don't for doesn't in some standard spoken varieties and a few other isolated exceptions), but nonstandard alternatives abound. The plural merger in Swedish has been completed in recent decades and part of this reduction is well documented in statements of official language policy. The old interdental ending has survived in some of the folk varieties of English in England as a regional variant that is steadily yielding to the standard sibilant. These regional varieties (e.g., Somerset and Dorset) are in the area of the Anglo-Saxon kingdom of Wessex, where the change to s arrived very late, and one could speculate that the interdental has simply persisted in these dialects.

It is clear from many studies of documents over several centuries of variable use that the -th form was used more in formal discourse than in informal, more in writing than in speaking, more in poetry and drama than in prose, and more in public documents than in private ones. The -th form was apparently used in writing long after it had died out of the "vernacular," the ordinary spoken language of the writers and readers. Jespersen (1968, 186-92) provides a summary of the variation between -th and -s in British English, noting the role of the King James version of the Bible, and reporting the usage of various authors from Chaucer to the nineteenth century. Stein 1990 reviews a number of texts from the mid-sixteenth century to the end of the seventeenth, noting especially the rise in use of -s between 1590 and 1600. In Devitt's study of American documents she reports that the use of -s grew from 33 percent in the period 1640-60 to 98 percent in 1790-1810, and she observes that "private records use the -s consistently and frequently whereas public records maintain variable usage through 1810" (Devitt 1989). Apart from the few remnants in local nonstandard dialect forms in Great Britain, all that is left of the -th now are more or less frozen forms in several archaizing registers. The greatest residue of -th is probably still in religious registers, although in the 1960s new Bible translations, books of worship, hymns, and other forms of religious discourse underwent a massive shift from the traditional thou language with its many archaic features (including -th) to a variety of English much closer to ordinary written and formal spoken English. Thus, this simple change of -th to -s took roughly one thousand years to reach completion.
As noted, it was not a pure sound change, but it followed the familiar trajectory of starting in a marginal community, being taken up by lower-middle-class speakers, and diffusing to educated vernacular and ultimately formal speech and writing. The limitation to 3sg agreement has had an even more long drawn out and troubled history, which has not yet come to an end. The presence of a subject marker only in the least marked person and number is certainly an anomaly as the world's languages go, and the emergence of such a pattern deserves investigation. It is not surprising that the trends of loss of the ending and spread of the ending to other numbers/persons have continued in nonstandard varieties up to the present. The old reduction or loss of the ending when the verb was immediately followed by the pronoun was extended to presence of the pronoun anywhere in the clause, as in some varieties in modern Scotland (Jespersen 1968, 187), and was extended to completion, as in East Anglia (Trudgill 1974, 55-63). The old condition of higher percentage of dropping when the subject is an expressed pronoun is still documented for Early Modern English (Bailey et al. 1989) and apparently for some varieties of Vernacular Black English today. The conflict between the standardizing 3sg -s and a vernacular trend of loss of endings has not yet run its course. The old spread of the ending to all persons and numbers, which appeared in various parts of Britain in the past, has persisted (or been reactivated) in various parts of Britain today, "including parts of the north of England and especially the southwest and south Wales" (Hughes and Trudgill 1979, 17); it also appears in the use of what some linguists have called "hypercorrect -s" in nonstandard varieties of both black and white English in the United States (cf. McDavid 1980, 43-44).


This use of -s in other than 3sg is reinforced by the vivid narrative use in the 1sg (then I says . . .) heard in many parts of the English-speaking world. The conflict between the standardizing 3sg -s and a vernacular trend of the spread of a single ending has also not yet run its course. The merger of the three plural endings in modern Swedish and their ultimate disappearance in favor of the singular ending has been traced in some detail (Bergman 1968, 189-92; Wessen 1970, 283-88). The original present tense plural verb endings in Old Swedish were -um (later -om), -in (later -en), -a (Table 18-1), and three trends operated over time to reduce the three-way contrast. One was the reduction of endings to -e, another was the spread of the third person ending -a to other persons, and the third was the use of the singular ending in place of the plural endings. The reduction to -e began with the 1pl ending when the verb was immediately followed by the subject pronoun (vi barom, bare vi) and then generalized to other constructions (cf. the Old English parallel development). By the sixteenth and seventeenth centuries the -e ending was common in writing for both 1pl and 2pl. During this same period the use of 3pl -a began to appear for the other persons also. The use of the singular form for the plural apparently began in the seventeenth century in the everyday spoken language of the Svealand region, and spread gradually until it became the spoken norm almost everywhere in Sweden. The written language, however, continued along a more conservative path. Just as the full forms of verb stems that were shortened in speech were typically spelled out in writing, a separate ending for the plural remained in written use well into the twentieth century. It is possible to follow the changes in prescriptivist norms for the dropping of the plural in writing.
In the 1930s many authors began to use the singular in representations of dialogue even though they continued to use the plural forms in narrative and expository passages. In 1945 the Swedish Press Agency (Tidningarnas Telegrambyra) decided to allow the singular in everything but material of a strictly official nature. In 1952 the national Board of Education (Skoloverstyrelsen) approved the use of singular verb forms with plural subjects in written work. The same year the cabinet office gave limited approval; in 1967 full approval. In 1966 the courts converted to use of the singular form. Thus, in a mere three hundred years the spoken forms won out, and there is no subject-verb agreement left in Swedish.

Table 18-1. Present Tense Verb Forms in Old English and Old Swedish

                Old English bindan "bind"    Old Swedish binda "bind"
Singular  1     binde                        binder
          2     bindest                      binder
          3     bindeth                      binder
Plural    1     bindath                      bindum
          2     bindath                      bindin
          3     bindath                      binda

Strong/Weak Adjectives Unexpected Innovation The drift toward simplification in morphology had begun already during the period of proto-Germanic and continued during the period of separation into the major branches of Germanic languages, as evidenced by changes in inflectional morphology of both nouns and verbs. One innovation assumed to have been made in proto-Germanic times, however, was in the opposite direction. The inflectional morphology of adjectives was reorganized into a system in which each adjective was declined in two ways, a "strong" set of endings used primarily in indefinite noun phrases and a "weak" set used primarily in definite noun phrases; of these two sets the weak one had fewer distinct forms. This double declension of adjectives is highly unusual, and it poses problems for almost any general account of grammatical agreement; Lapointe (1985) struggles with the analysis of this system in modern Norwegian, and Cooper 1986 in modern Swedish. Sample sets of forms from Old English and Old Swedish are shown in Table 18-2. How did this innovation come about? We can only assume that some population of proto-Germanic or pre-proto-Germanic speakers hit upon this way of marking definiteness at the same time that they were beginning to develop definite articles; in fact, because Gothic has the strong/weak adjective system but no definite article, the emergence of the former presumably preceded the emergence of the latter.2

First Reduction In the Middle English period three trends led to a great simplification in adjective endings: these were the merging of unstressed vowels (mentioned in the section entitled "Second Reduction" above), the loss and regularization of endings leading to a relatively uniform contrast of singular versus plural in nouns and adjectives, and the loss of the gender system. The result of these trends in late Middle English was a two-way contrast between zero and -e, as follows:

Strong  sg: god   pl: gode
Weak    sg: gode  pl: gode

The loss of the gender system is a particularly interesting phenomenon because it often accompanies morphological simplification, and in the history of English it may well be responsible for the early loss of the strong/weak adjective declension compared to other Germanic languages that still retain it. In many Indo-European languages the gender system shows regularization and reduction over time. If we assume that proto-Indo-European had a three-gender system, as in Sanskrit, Greek, Latin, Gothic, Old Icelandic, and so on, then it seems there are three routes followed in the gradual elimination of the system. One is the reduction first to a two-way contrast between animate and neuter and then loss, the second is reduction first to a two-way contrast between masculine and feminine and then loss, and the third is a weakening across the board. Examples of the first type


Table 18-2. Strong and Weak Adjective Forms in Old English and Old Swedish

                 Old English god "good"        Old Swedish goþer "good"
                 m       n       f             m       n       f
Strong
Singular  nom    god     god     god           goþer   gott    goþ
          acc    godne   god     gode          goþan   goþan   goþa
          gen    godes   godes   godre         goþs    goþs    goþar
          dat    godum   godum   godre         goþum   goþu    goþi
          inst   gode    gode    gode
Plural    nom    gode    god     gode          goþi    goþ     goþa
          acc    gode    god     gode          goþa    goþa    goþa
          gen    godra   godra   godra         goþa    goþa    goþa
          dat    godum   godum   godum         goþum   goþum   goþum

Weak (Old English)
                 m       n       f
Singular  nom    goda    gode    gode
          acc    godan   gode    gode
          gen    godan   godan   godan
          dat    godan   godan   godan
Plural    nom    godan   godan   godan
          acc    godan   godan   godan
          gen    godra (godena), all genders
          dat    godum, all genders

Weak (Old Swedish): goþi, goþa; goþi, goþu



include Hittite, and among the Germanic languages, Dutch and Continental Scandinavian, including Swedish. Examples of the second type include the Romance languages and many of the Indo-Aryan languages (e.g., Hindi and Kashmiri at the masculine-feminine stage and Bengali with complete loss). English is the third type: certain gender markers came to signal co-reference, grammatical case, and discourse tracking functions until they either disappeared or lost the gender marking function completely (cf. Jones 1988). In Swedish the reduction in the strong/weak adjective system, which occurred in the late Old Swedish period, was similar to the English first reduction. The reduction and merger of vowels was much less important in the history of Swedish, but the other two trends were important: loss and merger of inflectional endings, and reduction of gender categorization. The reduction of case, gender, and number markers in nouns and adjectives began in the earliest stage of Old Swedish, but it accelerated dramatically in the late Old Swedish period. The reduction from three genders to two in nouns and adjectives was pretty well completed in Old Swedish times. The dative disappeared, the genitive ending was attached only
to the head noun and not to the adjective, the masculine singular nominative -er was often dropped so that masculine and feminine were both goþ, the accusative came to be identical with the nominative, and so on. The weak declension came to have -e in the masculine nominative singular and plural and -a everywhere else, thus being reduced to two forms, as in the Middle English adjective declension.

Second Reduction In early Modern English the loss of final -e, which began in the Northern dialects in Middle English times and spread throughout the language, eliminated the only marker distinguishing singular from plural as well as strong from weak forms in the adjective, which now shows no form of agreement with its head noun. In Modern Swedish the strong/weak contrast persists, with the ending -a generalized to all weak forms, contrasting with the strong forms of bare stem for common gender, stem plus -t for neuter, and -a for plural.

Present Variation The variation in the weak forms of the adjective that has been most commented on is the fluctuation between -e and -a, which is partly based on linguistic form (e.g., comparatives, ordinals, polysyllabic adjectives), partly regional, partly stylistic. It may represent conflicting norms (i.e., change in progress). The -e ending, once the masculine nominative singular and plural, now often serves the purpose of referring to male human beings. I do not know if Swedish variationists have studied this tiny bit of variation; it will be of interest for the general topic of this paper if it turns out to be a step toward the total elimination of the strong/weak difference in adjective inflection.3

General Considerations Drift If by "drift" we mean a recognizable long-term succession of changes tending in the same direction in a language/variety or a set of related languages/varieties, then there is little doubt that such drifts can be identified. A drift toward morphological simplification is a familiar phenomenon, reported from many languages in different parts of the world. It is so common that some linguists tend to view it as the natural pattern of language change, the default drift, as it were. Other (and some of the same) linguists relate it to the process of pidginization, which is then said to differ from the drift primarily by the much greater speed with which it operates. In fact, however, instances of morphological simplification differ greatly from one another in detail and in what takes place in other aspects of the language(s). Also, and more important, some clear examples of drift are morphologically elaborating. For example, most of the Finno-Ugric languages have tended over time
to develop additional case clitics and affixes, especially local and directional cases and some with more abstract meanings, and also to develop pervasive vowel harmonies. The proto-language may have had the germ of these developments, but they seem to constitute drifts. Or, a case more familiar to me, the various Indo-Aryan languages over time have developed toward strict Subject-Object-Verb (SOV) word order, ever more elaborate phrasal verb systems (= complex predicate, cf. Butt 1993), and certain ergative constructions. Modern theories of morphology have attempted to account for types of morphological change, and as they pay more attention to explicit cross-linguistic comparisons and long-term drifts we may expect further enlightenment. At the present time, however, neither the European Natural Morphology of Dressler and his colleagues (cf. Dressler 1987) nor the somewhat similar American theory of Bybee, sometimes also called Natural Morphology (cf. Bybee 1985), offers much real insight for understanding the various forms of drift. Meyerthaler's claim that drift is predictable in terms of a general evolutionary formula based on the concepts of self-organizing systems and markedness values (Meyerthaler 1987, 36-37) does not apply to drifts of elaboration or drifts that are long-term but not clearly simplifying or elaborating. The most appealing notion is that there is something about the structure of a particular language and the speakers' perception of that structure that offers the basis for the drift and enables us to accept the unconscious social rationality of Sapir's original characterization: "the unconscious selection on the part of speakers of those individual variations that are cumulative in some special direction" (Sapir 1921, 166).
Whether this structural characteristic is called "ground plan," "parameter setting," or "system-dependent morphological naturalness," we are not yet anywhere near being able to state it to our satisfaction for any particular language. Furthermore, if someone does succeed in stating it satisfactorily, the problem remains of explaining why a particular language can have daughter languages that differ from each other in rate and route of change and even in the direction of the drift itself. Instances of drift persist over centuries, and one of the obvious goals of linguistic research must be to understand how they originate and the routes they are likely to follow. It is clear that a drift does not proceed in a straight line but that it zigs and zags, regresses here and advances there, while the overall trend continues. It is also clear that the drift consists of many smaller components, little currents and eddies, crosscurrents and countercurrents, that somehow combine to constitute the persistent main current of change. In identifying a drift, one danger for the linguist is to broaden or narrow the definition beyond what the phenomena justify. Thus, in naming the Germanic drift studied here "simplification" or "reduction of categories" it is easy to overlook certain phenomena of "elaboration." Two familiar examples of adding categories in Germanic are the development of definite and indefinite articles, which occurred in all the Germanic languages except Gothic and in many other Indo-European languages (Haugen 1976, 296-300), and the development of the medio-passive in -s or -st, which occurred in all the Scandinavian languages (Haugen 1976, 309-10). The morphological simplification drift in Germanic has steadily reduced (by mergers and losses) the suffixes of number, gender, person, and case, but has so far left intact certain other inflectional machinery (e.g., comparatives, tenses). In some instances an innovation seems to add a new resource for expressing the same categories. An example of this is the development of do-support in English, attested also in various other Germanic languages. This construction of a form of a verb "to do," "to make" with a nonfinite main verb expands the earlier pattern of negation, adds an additional means of forming questions to the older means of inversion and intonation change, which still remain, and adds an additional means of expressing emphasis besides those of word order dislocation and stress change, which still remain. (For a full discussion, cf. Ellegard 1953; Kroch 1989; convenient table in Traugott 1972, 199.) Another example is the development of the medio-passive in -s or -st in the Scandinavian languages, including Swedish, which offers a means of expressing passive and reflexive in addition to the constructions with a passive auxiliary and a reflexive pronoun, both of which, however, still remain. Perhaps the most striking example of a completely new grammatical category is the development of the so-called progressive verb forms in English, which have actually introduced a new aspectual category into the language. Starting from use with present tense active forms, the progressive has gradually expanded into a full paradigm of all tenses both active and passive, at least for action verbs: "Next month that house will have been being built for two years and no end is in sight" (cf. Mosse 1938). Of these examples, only the Scandinavian suffixed definite article and medio-passive added new inflectional affixes.
The other examples involve no morphological elaboration and are consistent with definition of the drift as morphological simplification, including segmentalization (Traugott 1972, 17) and auxiliaries replacing affixes. What is much more interesting is the occasional clear instance of an innovation that goes contrary to the main drift and persists along with it for long periods of time. The two examples in the drift examined here are the creation of the strong/ weak declensions of adjectives in proto-Germanic and the creation of the -s suffix for third person singular agreement in English. This kind of innovation will be discussed in the next section. We have seen that morphological simplification, exemplified in this paper by reduction in the agreement system, is a drift that affects all the Germanic languages, and we have prescinded from hypothesizing a cause for its origin. We can, however, note that the drift is accelerated in certain times and places, and we can ask what the causes might be for the acceleration. Many observers have noted that Icelandic and Faeroese are the most slowly moving languages in this drift and that they have been the most isolated from contact with other languages. Accordingly, we may note that the acceleration of the drift in English and Swedish coincides roughly with periods of heavy influence from another language; for English it was Norman French and for Swedish it was Middle Low German. This kind of external causation is a hypothesis worth investigating, but it must be noted that the morphological simplifications that took place did not reflect morphological



phenomena in the influencing language. English lost its gender system despite the fact that French had a well-established gender system; Swedish lost its case inflections despite the fact that Middle Low German had a richer case system. Thus the causative factor, if there was one, was not transfer of features from one language to another, but some more general effect such as a simplifying process from a number of partial bilinguals with imperfect learning of their second language. Thomason and Kaufman (1988, 263-331) in their detailed case study of English and other coastal Germanic languages agree with the position taken here in downplaying the simplifying effects of French and Low German, but they do not examine the particular phenomena of -th to -s 3sg agreement and the rise of strong/weak adjectives in relation to the morphological simplification drift of Germanic languages.

Innovation When an innovation takes place in a language variety it is often either "natural" (i.e., in accordance with well-attested patterns of change that follow "universal tendencies") or is identifiably part of a recognized drift. Such innovations are often puzzling in that they may occur or fail to occur in typologically similar languages and may take place at quite different rates of change, but modern work in variationist analysis and theories of grammar are beginning to elucidate them. Of much greater interest are highly marked innovations and innovations that seem to go against a recognized drift. These "unexpected" innovations seem to follow the same trajectory as the more "natural" ones: they originate in some social group and certain speech styles and apply to some forms that show the requisite structural description, and then they spread to other social groups, other speech styles, and other linguistic forms. As noted above, the morphological simplification drift of Germanic languages as exemplified by English and Swedish offers two such unexpected innovations: the strong/weak adjectives of proto-Germanic and the 3sg verb agreement of English. The strong/weak adjective innovation is an example of what I think of as an "elegance" innovation (i.e., one that introduces a neat, pleasingly designed pattern that "catches on" chiefly by virtue of its appeal to the human system-perceiving and system-constructing capacities). An elegance innovation that acquires sufficient autonomy may persist for long periods in spite of counterpressures. The notion of "autonomy" here is analogous to the notion of lexical autonomy used by Zager and Bybee and others (Bybee 1985, 57); it involves semantic identity, frequency of occurrence, and formal salience. 
My assumption is that the strong/weak adjective innovation tied together (a) an incipient notion of definiteness, (b) the existing difference between pronominal and substantival declensional endings, and (c) the pattern of internal noun phrase agreement in gender, number, and case; and it tied them together in a neat, memorable, "autonomous" pattern that would prove resistant to change for centuries. It began to unravel in English only as the gender system was on its way out and all inflectional suffixes were weakening; it disappeared completely with the loss of all adjectival inflection (apart from comparatives and superlatives). It seems likely to persist in Swedish until there is



further reduction in the gender categorization of nouns and in what remains of inflectional morphology. The 3sg agreement innovation of Middle English is an example of what may be called a "group identity marking" innovation. Because innovations typically start within a social grouping of some kind (e.g., by class, ethnicity, sex, age, activity, or a combination of such factors), they also, incidentally, mark the users' membership in that grouping ("sociolinguistic markers": Labov 1972b, 237-51; Hartung 1987). The group identity marking function can become highly salient, and may be a factor in the spread of a particular variety, as in the process of language standardization. The most likely explanation of the spread of the 3sg agreement in -s is, as noted above, that it resulted from the interaction of three trends: (a) the unique form for the 3sg in Midland dialects, (b) the change of -th to -s, and (c) the spread of some varieties of Midland English as part of the development of standard English. Although the 3sg -s ending has become fully established in the standard varieties of English throughout the world, its anomalous, highly marked status has presumably been a factor in keeping it from spreading to all varieties of English. The two competing trends of total loss of the -s and spread of the -s to the first and second persons and the plural are evident in many (nonstandard) varieties of English around the world. In some instances they continue local features that have never become standardized, and in other instances they are new emergences of "natural" tendencies that harmonize with the drift of morphological simplification. This conflict of patterns is likely to continue for centuries; a major change could come if at some time a new standardizing center were to arise that had a less marked pattern.

Modularity One of the great insights of structuralism was that languages are in some sense tightly structured whole systems in which every little element has a place in relation to all other elements. This represented a great advance over the previous more atomistic views of language. It no longer made sense to examine, for example, the nature of short o through the history of a language without trying to understand the place of short o in the vowel system (and the whole phonology, morphology, onomastics, onomatopoeia, etc.) at particular times in the history. The same insight made it incumbent upon the linguist writing a grammar to tie everything together in some way. Yet the linguist who wants to write a grammar or, for that matter, wants to construct a general theory of language, finds problems with this structuralist insight. Some parts of the grammar seem to form little systems that cohere together quite closely and have little to do with other parts. Also, all language varieties are in a state of variation and change, and the variation and change affect some parts of the language before others and may range from very localized in the grammar to very pervasive and massively reorganizing. With the appearance of the notion of modularity in current linguistic theory, the questions have become: What parts of the total grammar constitute separate



"modules" with some degree of analytic independence? and What is the nature of the relationship between the different modules and between any module and the whole grammar? These questions become even more difficult if we go a step further and ask of our hypothesized modules and their interrelationships that they be not merely artifacts of our theory of grammar but in some way reflections of the actual language processing by hearers and speakers. There seems to be a general consensus that morphology and syntax constitute two such major modules, even though their boundaries may be indistinct. But there is no such consensus about other smaller systems and systems that cross morphology and syntax. Bybee, for example, apparently regards each paradigm in a language as a submodule with its own internal structure and some degree of independent psychological processing, although she does not use exactly this formulation. In this paper the expression "system of agreement" has been used freely and easily. Is there such a submodule in the languages being analyzed? In all languages? One way to obtain behavioral evidence of the existence of such a module separate from the inflectional morphology and lexicon of a language is to test the responses of speakers when trying to acquire competence in another language whose agreement phenomena differ from those of their language. Thus, if speakers of Arabic who use feminine singular agreement of adjective, pronoun, and verb with inanimate plural nouns are faced with learning another language that has no such pattern, do they sometimes exhibit transfer errors (parameter-setting errors) that reflect the feminine singular agreement pattern? This is a good test because the agreement pattern is not tied to actual morphological forms. In fact, speakers of Arabic do occasionally exhibit errors of this kind in other languages, and this is a small bit of evidence for the "psychological reality" of the Arabic pattern. 
It would be difficult, but not impossible, to develop similar tests for the reality of the possible agreement systems discussed in this paper. In the absence of behavioral evidence of this kind or some other kind, a properly cautious description would provide only the facts of the morphology and statements of co-occurrence restrictions, without assuming the existence of agreement systems as such. In that case, the historical changes could be listed as inflectional simplification, and the changes in agreement patterns could be regarded as epiphenomena. The decision was made, however, to use the concept of "agreement system" throughout the paper, for analytical and presentational convenience, leaving for others the challenging task of providing validating evidence for the reality of a system of agreement changing over time as part of the long-term drift.

Variation Structural variability in language is a universal characteristic: every speech community shows variation in the forms of the language(s) used in that community. The variation may be dialectal (i.e., correlate with ["reflect"] speakers' positions in the community); it may be registral (i.e., correlate with ["be appropriate for"] different occasions of use); or it may be individual, in the sense that it is not



conventionalized in the community but characterizes individual users. In many cases the variation, of whichever kind, is a sign of linguistic change taking place in the community. The occurrence of variation is not always a sign of change, since some kinds of variation are relatively stabilized and some kinds of change are gradual and not manifested in alternations, but variation in the form of shifting frequencies of alternants is probably the principal means of linguistic change. The changes in agreement systems discussed in this paper have all appeared (and some continue to appear) in dialect, register, and individual variation of this kind. Furthermore, the analysis of variation, whether in written texts or recordings of spontaneous spoken texts, offers the best pathway to understanding the nature of the changes. Analyses of agreement phenomena in written texts from the past (e.g., Moore 1928; Berndt 1956; Jones 1988; and Bailey et al. 1989) and in recorded spoken discourse (e.g., Schneider 1983; Myhill and Harris 1986; and Poplack and Tagliamonte 1989) offer invaluable clues to changes. For general discussions of this approach, compare Labov (1972a) and Romaine (1982). In discussing the role of variation in implementing a drift and the role of variation analysis in charting and understanding a drift, at least three main points can be made.
1. Subtrajectories: A multiplicity of apparently separate changes combine to constitute the drift. The gradual disappearance of most of the nominal and verbal inflections in English and Swedish is not a single innovation moving across the morphological system; instead it is a series of relatively independent changes, each following its own path. For example, the disappearance of the dative does not parallel the disappearance of the accusative or the genitive. It is as though the members of the speech community are opportunistic: when the language offers a feature that can be modified in a direction that fits the drift, an innovator may pick up on it and the community may go along with it.
2. Complementarity: The various subsidiary changes interact: they may feed one another, reinforce one another, merge with one another. Despite the relative independence of the subtrajectories, they do interact with one another. The merger of unstressed vowels in English probably aided the merger and eventual loss of certain inflections. Interactions of this kind are so tempting for the analyst to hypothesize that some morphology theorists have assumed that cross effects of the logics of separate modules are a principal causal factor in morphological change. Phonological change (e.g., assimilation, weakening) disturbs a morphological pattern and thus leads to morphological change. But other theorists assume that the morpho-syntactic change facilitates the phonological change. Because both of these opposing hypotheses are plausible, it is up to the variationist to study changes in progress and to discover whether one, both, or neither of the explanations seem to be operating in a particular instance or in a set of similar instances.
3. Duration: Drifts are by definition long-lasting phenomena, but the duration of the subsidiary changes may be quite variable. What is most impressive in the limited review (in this paper) of a drift is the incredibly long time it may take for an apparently simple change to work its way through the community and the language: the shift of -th to -s in English verb endings took a thousand years.



Moral The phenomenon of drift in human language, as Sapir realized, poses one of the most crucial problems for the linguistic theorist. It is at the center of linguistics' most troublesome antinomies: synchrony and diachrony, individual and community, structure and freedom. And we know at least one way to approach it: meticulous variationist studies of change in progress, undertaken from the perspective of what seem to be long-term structural shifts. We can be assured that every little thing we learn will be new, and will emend and amplify our favorite theory. Even this little exercise of reviewing an apparent drift in two Germanic languages, inconclusive though it is, is stimulating and challenging. I am grateful to Edward Sapir for having named the problem and having described it so lucidly, and I am grateful to William Labov for having provided a very valuable method of studying it.

Notes 1. Dual forms are not included here because the Old English 1du and 2du pronouns have no corresponding dual verb forms but take plural agreement. In Old Swedish even the dual pronouns were on their way out and are attested only in the earliest texts (Haugen 1976, 303). 2. The short and long forms of adjectives in Slavic languages bear some resemblance to the strong/weak forms of Germanic in semantic value, but the inflectional origins are different and the Germanic development is a unique and characteristic innovation. 3. Soderbergh 1990 gives a fascinating account of a three-year-old girl's encounter with the -e/-a variation over a period of a year, her gradual recognition of dialect, register, and language (Swedish/Danish) conditions of occurrence of the two variants, and her ability to express the systematic differences.

References Andersen, H. 1990. The structure of drift. In H. Andersen and K. Koerner, eds., Historical linguistics. (Current Trends in Linguistic Theory 66) Amsterdam/Philadelphia: John Benjamins. Bailey, G., N. Maynor, and P. Cukor-Avila. 1989. Variation in subject-verb concord in Early Modern English. Language Variation and Change 1.285-300. Bergman, G. 1968. Kortfattad svensk sprakhistoria. Stockholm: Prisma. Berndt, R. 1956. Form und Funktion des Verbums im nordlichen Spatenglischen. Halle: Max Niemeyer. Butt, M. 1993. Complex predicates in Urdu. Ph.D. diss., Stanford University. Bybee, J. L. 1985. Morphology: A study of the relation between form and meaning. Amsterdam/Philadelphia: John Benjamins. Cheshire, J. 1982. Variation in an English dialect: A sociolinguistic study. Cambridge: Cambridge University Press. Cooper, R. 1986. Swedish and the Head-Feature Convention. In L. Hellan and K. Koch Christensen, eds., Topics in Scandinavian syntax. Dordrecht: D. Reidel.

Devitt, A. J. 1989. Genre as textual variable: Some historical evidence. American Speech 64.291-303. Dressler, W. U., ed. 1987. Leitmotifs in natural morphology. (Studies in Language Companion Series 10). Amsterdam/Philadelphia: John Benjamins. Ellegard, A. 1953. The auxiliary do: The establishment and regulation of its use in English. Stockholm: Almqvist and Wiksell. Ferguson, C. A. 1989. Grammatical agreement in Classical Arabic and the modern dialects. Al-'Arabiyya 22.5-17. Hartung, W. 1987. Sprachnormen - ihr sozialer Charakter und die linguistische Begrifflichkeit. Zeitschrift fur Phonetik und Kommunikationswissenschaft 40.317-355. Haugen, E. 1976. The Scandinavian languages: An introduction to their history. Cambridge, Mass.: Harvard University Press. Hughes, A., and P. Trudgill. 1979. English accents and dialects: An introduction to social and regional varieties of British English. London: Edward Arnold. Itkonen, E. 1977. Short-term and long-term teleology in linguistic change. In J. P. Maher et al., eds., Papers from the 3rd International Conference on Historical Linguistics. Amsterdam/Philadelphia: John Benjamins. Jespersen, O. 1968. Growth and structure of the English language. 9th ed. New York: Free Press. Jones, C. 1988. Grammatical gender in English: 950 to 1250. London: Croom Helm. Kroch, A. S. 1989. Reflexes of grammar in patterns of language change. Language Variation and Change 1.199-244. Labov, W. 1972a. On the mechanism of linguistic change. In W. Labov, Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press. Labov, W. 1972b. The study of language in its social context. In W. Labov, Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press. Lapointe, S. G. 1985. Cooccurrence and agreement in Norwegian noun phrases. In G. Alvarez, B. Brodie, and T. McCoy, eds., ESCOL '84: Proceedings of the first Eastern States Conference on Linguistics. Columbus, Ohio: Ohio State University. Leith, D. 1947. A social history of English. 
London: Routledge and Kegan Paul. Malkiel, Y. 1981. Drift, slope, and slant. Language 57.535-70. Meyerthaler, W. 1987. System-independent morphological naturalness. In W. U. Dressler, ed., Leitmotifs in natural morphology. Amsterdam/Philadelphia: John Benjamins. Minkova, D. 1983. Middle English final -e from a phonemic point of view. In J. Untermann and B. Brogyanyi, eds., Das Germanische und die Rekonstruktion der indogermanischen Grundsprache. (Current Issues in Linguistic Theory 22.) Amsterdam/Philadelphia: John Benjamins. Moore, S. 1928. Earliest morphological changes in Middle English. Language 4.238-58. Mosse, F. 1938. Histoire de la forme periphrastique etre + participe present en Germanique. Paris: Klincksieck. Muhlhausler, P. 1974. Pidginization and simplification of language. (Pacific Linguistics Series B 26.) Canberra: Australian National University. Myhill, J., and W. A. Harris. 1986. The use of the verbal -s inflection in Black English Vernacular. In D. Sankoff, ed., Diversity and Diachrony. Amsterdam/Philadelphia: John Benjamins. Poplack, S., and S. Tagliamonte. 1989. There's no tense like the present: Verbal -s inflection in early Black English. Language Variation and Change 1.47-84. Pyles, T., and J. Algeo. 1982. The origins and development of the English language. 3d ed. New York: Harcourt Brace Jovanovich.



Romaine, S. 1982. Socio-historical linguistics: Its status and methodology. Cambridge: Cambridge University Press. Sapir, E. 1921. Language: An introduction to the study of speech. New York: Harcourt, Brace. Schneider, E. 1983. The origin of the verbal -s in Black English. American Speech 58.99-113. Soderbergh, R. 1990. The parent (linguist) and her child as metalinguistic collaborators. In B. Metuzale-Kangere and H. D. Rinholm, eds., Symposium Balticum: A Festschrift to honour Professor Velta Ruke-Dravina. Hamburg: Helmut Buske. Stein, D. 1990. Functional differentiation in the emerging English standard language: The evolution of a morphological discourse and style marker. In H. Andersen and K. Koerner, eds., Historical linguistics 1987. (Current Trends in Linguistic Theory, 66). Amsterdam: John Benjamins. Thomason, S. G., and T. Kaufman. 1988. Language contact, creolization, and genetic linguistics. Berkeley: University of California Press. Traugott, E. C. 1972. A history of English syntax. New York: Holt, Rinehart and Winston. Trudgill, P. 1974. The social differentiation of English in Norwich. Cambridge: Cambridge University Press. Venneman, T. 1975. An explanation of drift. In C. N. Li, ed., Word order and word order change. Austin: Texas University Press. Weinreich, U., W. Labov, and M. Herzog. 1968. Empirical foundations for a theory of language change. In W. Lehmann and Y. Malkiel, eds., Directions for historical linguistics. Austin: University of Texas Press. Werner, O. 1984. Morphologische Entwicklungen in den germanischen Sprachen. In J. Untermann and B. Brogyanyi, eds., Das Germanische und die Rekonstruktion der indogermanischen Grundsprache. (Current Issues in Linguistic Theory 22.) Amsterdam: John Benjamins. Wessen, E. 1970. Schwedische Sprachgeschichte. Bd. 1 Laut- und Flexionslehre. Berlin: Walter de Gruyter.


Although Ferguson's first professional experience with language planning (LP) as an explicitly named field did not come until much later (see below), his LP interests predate the term itself (Weinreich is credited with coining the term language planning in 1957; cf. Eastman 1983, 130; Cooper 1989, 29). Typical of much of Ferguson's work, his interest in LP aligned him with the minority position among his American colleagues in linguistics, most of whom believed that languages cannot be planned and are best left alone. As a graduate student in Oriental Studies at the University of Pennsylvania, Ferguson was supported with a fellowship from the Intensive Language Program of the American Council of Learned Societies. This involved him in the linguistic analysis, materials preparation, and teaching of Arabic, which in turn spurred his interest in a job offer from the newly formed Foreign Service Institute (FSI) of the U.S. Department of State. In 1946, Ferguson became one of FSI's "charter" linguists. Thus began a period of about twenty years in Ferguson's career in which he "was constantly operating with a professional tension between solving practical language problems and doing academic linguistics" (Ferguson, n.d.). A major practical problem for Ferguson during his ten-year tenure at the FSI was the limited competence in Arabic and Middle Eastern languages of students in the Foreign Service. He was instrumental in designing and implementing the Arabic language program in Washington, D.C., in training a corps of Arabists, and in establishing an Arabic language school as an FSI branch in Beirut. But the 1950s also marked a time in the United States during which there was a growing interest in the emerging nationhood of former colonies and in issues of national development (cf. Rostow 1952; Deutsch 1953). From his vantage point in the FSI, Ferguson saw that an important, albeit often-overlooked, dimension of national development was the language factor. 
The tension between academic linguistics and practical language problems continued during Ferguson's appointment to the faculty of the newly established Middle East Center at Harvard, beginning in 1955. There he taught courses in both linguistics and Arabic. His stay at Harvard (1955-59) and his contact there with students and other faculty (Dell Hymes, Roger Brown, John Carroll) strengthened and gave shape to his interests in social attitudes toward language, a recurring theme in his subsequent work on LP.


Language Planning

Much of the early work on LP among American linguists came from the Ford Foundation's desire to find solutions to language problems rather than from linguists interested in LP for theoretical reasons. An example is the five-nation Survey of Language Use and Language Teaching in Eastern Africa (cf. Bender et al. 1976; Ladefoged et al. 1971; Whiteley 1974; Ohannessian and Kashoki 1978). In 1955, at the LSA Linguistic Institute at the University of Michigan, the Ford Foundation sponsored an interdisciplinary meeting of linguists, English teachers, and social scientists to discuss language problems in developing nations where the foundation had been working. One eventual outcome was a recommendation to form a center that would function as a clearinghouse and informal coordinating body for the solution of practical language problems. This became the Center for Applied Linguistics, and Ferguson served as its first director, a position he held from 1959 until 1966. Through Ferguson's membership on the Social Science Research Council's (SSRC) Committee on Sociolinguistics (1963-72) and his chairmanship of the seminar on sociolinguistics at the Indiana University LSA Linguistic Institute (1964), his growing friendship with researchers such as Fishman and Haugen deepened and influenced his thinking on language in society and specifically on language planning. The first two papers in this section come from this period in Ferguson's career. The Survey of Second Language Learning in Asia, Africa, and Latin America, carried out by the Center for Applied Linguistics, generated "The Language Factor in National Development" (Ferguson 1961). And a conference organized by Ferguson and Fishman on "Language Problems of Developing Nations" in 1966 resulted in the book by that name (Fishman, Ferguson, and Das Gupta 1968) and in the second paper of this section. 
In both papers, Ferguson's rationale for using the nation as a geographical unit for the analysis of language situations is articulated in terms of its usefulness in solving language problems. His introduction of scales for measuring language development was offered not as a theoretical construct, but as working heuristics for analysis. His notions of major and minor languages, language dominance, and languages of wider communication presented here have all influenced other linguists, often in areas of linguistics distinct from LP. In some cases (e.g., languages of wider communication) the concepts are still used unchanged more than thirty years later. In other cases (e.g., standardization) they have spurred subsequent studies that have modified or replaced these terms. National sociolinguistic profile formulas of the kind proposed in the second paper were particularly influential in Indonesia (cf. Alisjahbana 1971) and Mexico (cf. Uribe Villegas 1969). Consistent with his approach to other areas of linguistics, whether on universals, acquisition, or any of the other topics he has dealt with over the years, is Ferguson's call for the collection of reliable data as a prerequisite for any useful theorizing and the need for thorough analyses of specific case studies leading eventually to more general conclusions. A milestone event in the development of LP as a field of inquiry, and Ferguson's first experience with it so explicitly named, was the Conference on Language Planning Processes held at the East-West Center, University of Hawaii, in April 1969. This conference was a key episode in a year-long venture initiated by Joshua Fishman. This venture, funded by the Ford Foundation, brought together

with Fishman three others intensely interested in achieving a better conceptualization of LP processes: Jyotindra Das Gupta, Joan Rubin, and Bjorn Jernudd. They spent the academic year 1968-69 together at the East-West Center discussing LP issues and clarifying their respective understandings of LP processes. At the April conference these three people were joined by Ferguson and a half dozen other conferees, who had impressive experience in LP and in the study of LP: S. Takdir Alisjahbana, chief architect of Bahasa Indonesia, the variety of Malay that became the national language of Indonesia; C. F. Gallagher, close student of LP in Ottoman and present-day Turkey, and in Japan; John Macnamara, psychologist and careful but somewhat skeptical observer of the Irish revival effort; Chaim Rabin, who had been involved in activities relating to the emergence of Modern Israeli Hebrew; Bonifacio Sibayan, grand old man of Philippine language and education policies; Wilf Whiteley, active contributor to the standardization of Swahili and keen observer of the various outcomes of Swahili LP. Einar Haugen, who had been invited to participate in the conference, but was unable to attend, offered a written paper on theoretical issues in LP. Two respected social scientists without experience in LP took part in the conference exploring the relevance of their disciplines for the study of LP: Helmut Kelman, known for his studies of the relation between individual and group efforts to a national system; and Thomas Thorburn, a Swedish economist familiar with the uses of cost-benefit analysis. The four participants of the East-West Center group contributed greatly to all the discussions and most of the papers presented orally were put in final written form and published in Rubin and Jernudd 1971. Ferguson's contribution to that conference was a paper presenting a hypothesis based on the history of standardization in English, Italian, and Bengali. 
In that paper, he suggested that change in the range of language uses typically precedes change in the community's evaluative attitudes toward the language. The well-documented English situation is described in two books of almost identical titles published independently and without either author knowing about the efforts of the other (Jones 1953; Cottle 1969). These two books studied the opinions of English writers about their language during a period when the functions of the language changed drastically and the attitudes of the people about their language changed equally drastically. At the time of the introduction of the printing press into England in the fifteenth century the literary figures of the time showed a very low opinion of their language, typified by the words of the poet John Skelton (1545?), quoted in Jones (1953, 11):
Our natural tonge is rude
and hard to be ennuede
With pollysshed tearmes lustye
Oure language is so rustye
So cankered and so ful
Of frowardes and so dul.
That if I wold apply
To write ornatly
I wot not where to finde
Termes to serue my mynde.



At that time, the uses of English were limited largely to informal conversation; Latin was the language of education and religion, French was the language of law courts and certain types of literature. Early in the seventeenth century, when English had replaced French in law courts and Acts of Parliament, and had replaced Latin in the translations of the Bible and services of the church as well as parts of the educational system, the views of literary scholars had also changed to a high evaluation of the English language, as typified by the words of John Beaumont (1629), quoted in Jones (1953, 240):
The relish of the Muse consists in rime,
One verse must meete another like a chime,
Our Saxon shortnesse hath peculiar grace
In choise of words, fit for the ending place,
Which leaue impression in the mind as well
As closing sounds, or some delightfull bell.

When Ferguson deepened his research into the cases of English, Italian, and Bengali, he finally came to the conclusion that his hypothesis was not fully confirmed by the historical evidence and needed further research and careful reformulation. Accordingly, he withdrew the text of his paper, "Uneloquent Vernaculars," which had been prepared for the conference. In any case, Rubin and Jernudd's book became an important contribution to the literature on LP, even serving as a textbook for LP courses. The conference and the book strengthened Ferguson's commitment to LP as a topic for research. "Sociolinguistic settings of language planning" was part of the large-scale project "International Study of Language Planning Processes" (cf. Rubin et al. 1977), one of the less successful projects Ferguson has been involved in (personal communication, 9/1/93). But despite problems with the project, this paper survives because in it Ferguson identifies features of the linguistic structure of specific languages that affect language planning outcomes. For example, word compounding in Hebrew is marginal. By contrast, Hindi borrows from Sanskrit for academic purposes, from Perso-Arabic as a vehicle for humor and other "expressive" purposes, and from English for spoken, but not written registers. An agency trying to persuade a population to use a specific word would have to take into consideration the language attitudes associated with these varieties and language functions. Ferguson's LP research commitment was readily misunderstood as taking a strong policy and action commitment. 
When he read his paper on "National Attitudes toward Language Planning" at the Georgetown Round Table in 1979 (the fourth paper in this section), in which he tried to demonstrate that linguists' attitudes toward LP are conditioned by the beliefs of their cultural environment, he was understood by many in the audience as advocating for America the Swedish attitude to LP, although no such advocacy was mentioned in the paper. Instead, it both mitigates the American traditions of formalisms and autonomous linguistics (managed decisions can effect language change) and tempers the Scandinavian and Baltic confidence that LP can be the sole determinant of language change (i.e., Tauli 1968). Ferguson's interests in LP are evident in his papers that deal with individuals who have had a substantive and sustained influence on language situations (i.e., Ferguson 1967, 1968, 1987). The last two papers of this section remind American linguists that the European view of caring for, as well as about, language is a legitimate view. Here Ferguson also demonstrates the value of individual case studies for illustrating how linguistic problems have been solved in language situations removed from each other and from us in time and space. Although working in different ways, each linguist was presented with a language problem and found a solution for it. In presenting these cases, Ferguson challenges the notion that national development, and consideration of language factors in that development, only goes back to sixteenth-century Europe. Although the tension Ferguson felt for two decades may have subsided somewhat when, in 1967, he accepted a professorship at Stanford, his belief in the contribution of linguistics to the solving of language problems continues to influence his work today and is reflected in each of the papers of this section.

References

Alisjahbana, S. T. 1971. Some planning processes in the development of the Indonesian/Malay language. In J. Rubin and B. H. Jernudd, eds., Can Language Be Planned? Sociolinguistic Theory and Practice for Developing Nations. The Hague: Mouton. Pp. 179-87.
Bender, M. L., J. D. Bowen, R. L. Cooper, and C. A. Ferguson, eds. 1976. Language in Ethiopia. London: Oxford University Press.
Cooper, R. L. 1989. Language Planning and Social Change. Cambridge: Cambridge University Press.
Cottle, B. 1969. The Triumph of English 1350-1400. London: Blandford Press.
Eastman, C. M. 1983. Language Planning: An Introduction. San Francisco: Chandler and Sharp.
Ferguson, C. A. 1967. St. Stefan of Perm and applied linguistics. In To Honor Roman Jakobson: Essays on the Occasion of his Seventieth Birthday, vol. 1. The Hague: Mouton. Pp. 643-53.
Ferguson, C. A. 1968. Language development. In J. A. Fishman et al., eds., Language Problems of Developing Nations. New York: John Wiley and Sons. Pp. 27-35.
Ferguson, C. A. 1987. Conventional conventionalization: Planned change in language. Paper presented at the Conference on Language Change and Social Context. Stanford University, summer 1987.
Ferguson, C. A. N.d. Long-term commitment and lucky events. In Konrad Koerner, ed., First Person Singular III. Amsterdam: John Benjamins. Forthcoming.
Fishman, J. A., C. A. Ferguson, and J. Das Gupta, eds. 1968. Language Problems of Developing Nations. New York: John Wiley and Sons.
Jones, R. F. 1953. The Triumph of the English Language. Stanford, Calif.: Stanford University Press.
Ladefoged, P., R. Glick, and C. Criper, eds. 1971. Language in Uganda. Nairobi: Oxford University Press.
Ohannessian, S., and M. Kashoki, eds. 1978. Language in Zambia. London: International African Institute.
Rubin, J., and B. H. Jernudd, eds. 1971. Can Language Be Planned? Sociolinguistic Theory and Practice for Developing Nations. Honolulu: East-West Center Press.
Rubin, J., B. H. Jernudd, J. Das Gupta, J. A. Fishman, and C. A. Ferguson, eds. 1977. Language Planning Processes. The Hague: Mouton.
Tauli, V. 1968. Introduction to a Theory of Language Planning. Acta Universitatis Upsaliensis, Studia Philologiae Scandinavicae Upsaliensia 6. Uppsala: Almqvist and Wiksell.
Uribe Villegas, O. 1969. La situación sociolingüística de México como marco de la condición indígena. Revista Mexicana de Sociología 31:109-28.
Whiteley, W. H., ed. 1974. Language in Kenya. Nairobi: Oxford University Press.

19 The Language Factor in National Development

This article originally appeared in Anthropological Linguistics 4.1:23-27, from which it is reprinted with permission of the editors.

Social scientists of various disciplines are concerned with the concept "national development", in particular, of course, economists and political scientists, but to a lesser extent scholars in other fields.1 Structural linguistics, however, in spite of its concern with diachronic matters, has been resolutely opposed to any developmental or evolutionary approach in linguistic analysis. The purpose of this paper is on the one hand to suggest the relevance of "national development" for linguistic analysis and on the other hand to point to linguistic aspects of national development as it is studied by social scientists in other fields. The approach followed here has resulted from the work of the Survey of Second Language Learning in Asia, Africa, and Latin America which was carried out by the Center for Applied Linguistics in collaboration with outside specialists from the United States and other countries.2

Language Development Scales

Of the many scales which could be developed for measuring language "development" in a way which might correlate usefully with non-linguistic measures of development, two seem particularly promising: the degree of use of written language and the nature and extent of standardization. Scales suggested here represent a modification of the viewpoint of Heinz Kloss.3

Use of Written Language

The 3000 or more languages currently spoken vary in the use of a written form of the language from cases in which the language has never been written to languages with an enormous and very varied use of written forms. It is difficult to arrange these cases in a simple, linear progression, partly because of the complex variation and partly because of the great range in the amount of use. As a first approximation to a useful scale, we suggest the scheme:

W0. not used for normal written purposes
W1. used for normal written purposes
W2. original research in physical sciences regularly published

A convenient set of criteria for establishing "normal" use of the written language is as follows: (a) The language is used for ordinary interpersonal epistolary purposes. People write letters in it. (b) The language is used in popular periodicals. Newspapers appear in it. (c) The language is used in books not translated from other languages. People write and publish books in it.

Languages with rating "0." include languages such as Modern Aramaic for which no orthography has been suggested and which has no representative writing by members of the speech community, as well as languages like Tuareg where use of the writing system is limited to special and marginal purposes, or Lugbara where an orthography has been suggested which has been used in some dictionaries, grammars, textbooks, Bible translations and the like, but which has not yet become widely used in the community. Most languages are still at this level although there is a steady stream of languages moving up to level "1."

Examples of languages at level "1." include Amharic, Thai, Slovenian. Many languages in category "1." have a substantial publication output. Languages with relatively small output often have a considerable amount of poetry, folkloric material and translations.

Only a few languages fall in category "2." These are often also languages widely used for intercommunication by other speech communities. The number of languages belonging in this category, however, is steadily increasing and this suggests the need for a possible additional level W3. languages in which translations and resumes of scientific work in other languages are regularly published.
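The written-use scale can be restated as a small decision rule. The following Python sketch is offered only as an illustration: the class and function names are hypothetical, and it assumes that all three criteria (a) to (c) must hold for "normal" written use and that level W2 presupposes W1, neither of which the text states outright.

```python
from dataclasses import dataclass


@dataclass
class WrittenUse:
    """Observed facts about written use of one language (hypothetical record)."""
    letters: bool                     # criterion (a): people write letters in it
    periodicals: bool                 # criterion (b): newspapers appear in it
    original_books: bool              # criterion (c): untranslated books are published in it
    science_research: bool = False    # original physical-science research regularly published


def written_use_level(use: WrittenUse) -> str:
    """Return "W0", "W1" or "W2" under the assumptions named above."""
    # Assumption: "normal" written use requires all three criteria together.
    normal = use.letters and use.periodicals and use.original_books
    if not normal:
        return "W0"
    # Assumption: W2 is only reachable from normal written use.
    return "W2" if use.science_research else "W1"
```

A language with a proposed orthography used only in grammars and Bible translations would fail criteria (a) to (c) and so be rated W0 by this rule, which matches the treatment of Lugbara in the text.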

Standardization

The establishment of a scale of standardization is much more difficult because there are at least two dimensions involved, one of which is itself quite complicated. One such dimension is the degree of difference between the standard form or forms of a language and all other varieties of it. This difference may be very small or very great independently of the other dimension, which is the nature of the standardization and the degree to which a standard form is accepted as such throughout the community. The simplest approach seems to be to set up the end points of the scale as St0. and St2.

Zero refers to a language in which there is no important amount of standardization. As an example, we may take Kurdish, where there is a considerable dialect variation but where no form or forms of the language has received wide acceptance as a norm among people who do not speak it themselves.4 At the other end of the scale we have what may be regarded as the "ideal" standardization. The term "ideal" is not inappropriate because individuals concerned with the development of their nation who make proposals for change in the language situation generally seem to make proposals aimed at achieving this "ideal" standardization even though they rarely state the desired goal explicitly.

Category "2." refers to a language which has a single, widely-accepted norm which is felt to be appropriate with only minor modifications or variations for all purposes for which the language is used. Differences between regional variants, social levels, speaking and writing, and so on, are quite small. An example of category "2." is Swedish, where the difference between the written and spoken standard is appreciable, but relatively minor and growing less, and where none of the original dialects are too far removed from the standard.

Category "1." requires considerable subclassification to be of any use. Whatever scheme of classification is developed, it will have to take account in the first instance of whether the standardization is unimodal or bi- or multimodal, and in the case of more than one norm, the nature of the norms must be treated. Armenian may serve as an example of a bimodal standardization where the standards are essentially regional, East Armenian and West Armenian, both being used for normal written purposes. Greek may be cited as an example of a bimodal standardization based on a "vertical" or role differentiation, one being used for ordinary conversation and the other for most written and formal spoken purposes. Serbo-Croatian has two norms based to a large extent on religio-cultural differences. Norwegian is an example of a bimodal standardization based on neither of these. As a single example of a more complicated case of standardization, we may mention the whole Hindi-Urdu complex with its regional standards in addition to the religio-cultural split.5

The Nation as a Linguistic Area

Linguists have generally operated with the concept of speech community, "a group of people who use the same system of speech-signals,"6 as the locus of linguistic behavior, although recently some attempts have been made to deal with multilingual communities.7 Only rarely has the concept nation been utilized by linguists for this purpose, and then generally for describing certain features of language used in Europe.8 From many points of view, however, it is desirable to use the nation as the basis for general sociolinguistic descriptions: communication networks, educational systems, and language "planning" are generally on a national basis and national boundaries play at least as important a role in the delimitation of linguistic areas as any other single social barrier. In the description of the language situation of a given nation two fundamental points must be treated, the number of languages and the relative dominance of languages.

The Number of Languages

In determining for taxonomic purposes the number of languages spoken in a country it seems advisable to distinguish between major and minor languages. A definition of major language which has proved useful in the work of the Survey is: a major language of a nation is a language spoken by at least ten million people or one-tenth of the population. The number of major languages in a nation may vary from one to a dozen or more. It seems likely, however, that the important categories are: one major language (e.g. Thailand, Costa Rica, Holland), two major languages (e.g. Canada, Belgium, Paraguay), and three or more major languages (e.g. Switzerland, Nigeria, India).
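The Survey's working definition of a major language amounts to a simple threshold test, which the Python sketch below restates. The function names are hypothetical, the "or" in the definition is read as inclusive, and the figures in the usage note are invented for illustration rather than taken from any census.

```python
from typing import Dict, List


def is_major(speakers: int, population: int, threshold: int = 10_000_000) -> bool:
    """At least ten million speakers, or at least one-tenth of the population."""
    # Integer arithmetic avoids rounding problems in the one-tenth comparison.
    return speakers >= threshold or speakers * 10 >= population


def major_languages(speakers_by_language: Dict[str, int], population: int) -> List[str]:
    """Names of the languages that pass the test, in alphabetical order."""
    return sorted(
        name for name, count in speakers_by_language.items()
        if is_major(count, population)
    )


def category(n_major: int) -> str:
    """Map a count of major languages to the three categories named in the text."""
    if n_major <= 0:
        return "no major language"
    if n_major == 1:
        return "one major language"
    if n_major == 2:
        return "two major languages"
    return "three or more major languages"
```

For an invented nation of 8,000,000 people in which languages A, B, and C have 6,000,000, 900,000, and 300,000 speakers, `major_languages` returns A and B, so the nation falls in the "two major languages" category.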

Dominance

In a nation with more than one major language, it is often true that one is clearly dominant over the others or, in some cases, several languages are dominant over the others. One indication of dominance is numerical superiority: one language is dominant over others if it is spoken by more than half the population of the country. Another important indicator of dominance is the extent to which a given language is learned by native speakers of other languages in the country. For example, Persian and Pashto are spoken by about the same number of people in Afghanistan, but Persian is often learned as a second language by speakers of Pashto and other languages in the country, while Pashto, in spite of official government support and formal classes, is rarely learned well by speakers of other languages in the country. A third indicator of language dominance is the use of one of the languages of the nation for such clearly national uses as publication of official texts of laws or decrees, medium of instruction in government schools, normal channel of military communication. Full agreement among these three indicators provides the "normal" form of national language dominance. Cases where these indicators are not in agreement seem generally to have serious social tensions connected with language problems.
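The three indicators and the test of their agreement can be set out schematically. In the Python sketch below, which is an illustration under stated assumptions rather than a procedure given in the text, only the numerical indicator is computed; the second-language and national-use indicators are assumed to be supplied by the investigator as language names (or None when no language qualifies). All names are hypothetical.

```python
from typing import Dict, Optional


def numerical_dominant(speakers: Dict[str, int], population: int) -> Optional[str]:
    """First indicator: the language spoken by more than half the population, if any."""
    for name, count in speakers.items():
        if 2 * count > population:
            return name
    return None


def dominance_pattern(numerical: Optional[str],
                      second_language: Optional[str],
                      national_use: Optional[str]) -> str:
    """Compare the language picked out by each of the three indicators."""
    picks = [numerical, second_language, national_use]
    # "Normal" dominance: every indicator names a language, and it is the same one.
    if None not in picks and len(set(picks)) == 1:
        return "normal"
    return "not in full agreement"
```

A case of the Afghan kind, where no language has a numerical majority but one is widely learned as a second language, would be reported as "not in full agreement" by this rule.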

Languages of Wider Communication

In many nations, especially in Asia and Africa, languages of wider communication (LWC), such as English and French, play an important role in the national language situation and this must be separately assessed. For one thing, the LWC may be the language used for the clearly national purposes listed under the third indicator of dominance. In addition it may be an LWC rather than one of the local languages which is used as the means of access to scientific and technological knowledge or to communicate with other nations in the expanding network of international communication.

National Sociolinguistic Profiles

With the fairly simple machinery outlined in the previous two sections, it is possible to draw up for a nation a profile which will be of value for comparison with other non-linguistic indices of development. The great drawback here is not the theoretical complexity or even the practical man-hours of work involved. It is the lack of reliable data for most nations. The preparation of a national sociolinguistic profile in the sense described here calls for putting down on paper about a given nation the following information: how many major languages are spoken; what is the pattern of language dominance; are there national uses of an LWC; for each major language spoken in the country, what is the extent of written uses of the language (W0.-W2.) and what is the extent of standardization (St0.-St2.) and its nature (multimodal? range of variation from the norm?).

Even with the preliminary profiles which can be put on paper on the basis of currently available data, it is evident first that there is a wide range of variability although certain types are quite common, especially the one nation, one major dominant language type (W1. St2.). It is also evident that certain types of profile occur only in underdeveloped countries, especially the ones with no dominant language and an LWC used for national purposes as well as for access to science and international communication. Before any useful theorizing can be done, it is necessary to collect the data and prepare reliable national profiles. The possibility of significant conclusions arising from such study seems very promising.
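The items of information listed for a national profile lend themselves to a simple record structure. The Python sketch below is one possible layout, not a format proposed in the paper; the class names, field names, and the one-line summary are all hypothetical, and the example nation in the usage note is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LanguageProfile:
    """Ratings for one major language."""
    name: str
    written_use: str        # "W0", "W1" or "W2"
    standardization: str    # "St0", "St1" or "St2"
    norms: int = 1          # number of competing norms (matters mainly for St1)


@dataclass
class NationalProfile:
    """The information the text asks to be put on paper for one nation."""
    nation: str
    major_languages: List[LanguageProfile]
    dominant: Optional[str] = None           # name of the dominant language, if any
    lwc_national_use: Optional[str] = None   # LWC used for national purposes, if any

    def summary(self) -> str:
        """One-line profile suitable for side-by-side comparison of nations."""
        ratings = "; ".join(
            f"{lang.name} {lang.written_use} {lang.standardization}"
            for lang in self.major_languages
        )
        dominant = self.dominant or "none"
        lwc = self.lwc_national_use or "none"
        return (f"{self.nation}: {len(self.major_languages)} major ({ratings}); "
                f"dominant: {dominant}; LWC: {lwc}")
```

An invented nation with a single major, dominant language A rated W1 and St2 would be summarized as "Exampleland: 1 major (A W1 St2); dominant: A; LWC: none", the common profile type the text singles out.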

Notes

1. As examples of recent works concerned with the theory of national development we may cite W. W. Rostow, The Process of Economic Growth (New York, 1952); G. A. Almond and J. S. Coleman (eds.), The Politics of the Developing Areas (Princeton, 1960). One work which emphasizes the role of communication in the concept of nationhood is K. W. Deutsch, Nationalism and Social Communication (New York, 1953).
2. For an account of the Survey and some of its results, see Second Language Learning . . . in Asia, Africa, and Latin America (Washington: Center for Applied Linguistics, 1961).
3. Cf. H. Kloss, Die Entwicklung neuer germanischer Kultursprachen (Munich, 1952), in particular pp. 24-31, Stufenfolge des Ausbaus eines Idioms zur Kultursprache.
4. Kurdish was the example used in the oral reading of this paper but it may be that it is not a fully satisfactory example since the dialect of Suleimaniya is beginning to be accepted as a norm by a considerable segment of the Kurdish speech community. Cf. E. N. McCarus, A Kurdish Grammar (New York, 1958) p. 1; D. N. Mackenzie, Kurdish Dialect Studies I (London, 1961) p. xviii.
5. Cf. J. Gumperz and C. M. Naim, Formal and Informal Standards in the Hindi Regional Language Area, in C. A. Ferguson and J. Gumperz (eds.), Linguistic Diversity in South Asia (Indiana University, RCPAFL 13, 1960).
6. L. Bloomfield, Language (New York, 1933) p. 29.
7. Cf. U. Weinreich, Languages in Contact (New York, 1953) pp. 83-110; J. Gumperz' paper in this Symposium.
8. See, for example: L. Dominian, The Frontiers of Language and Nationality in Europe (New York, 1917); A. Meillet, Les langues dans l'Europe nouvelle (Paris, 1928); S. Rundle, Language as a Social and Political Factor in Europe (London, 1946).

20 On Sociolinguistically Oriented Language Surveys

Reprinted from The Linguistic Reporter 8.4:1-3 by permission of the Center for Applied Linguistics.

Many countries in Asia, Africa, and Latin America, as a matter of national development or even of national existence, must answer a set of language questions. The policy decisions which these answers constitute then require implementation, often on a large scale and over long periods of time.

Some of these questions are of language choice: What language(s) shall be the official language(s) of the government, used in laws, administration, and the armed forces? What language(s) shall be used as medium of instruction at the various levels of the educational system? What language(s) will be accepted for use on the radio, in publishing, in telegrams, and as school subjects?

Other questions involve language "engineering." Once a language has been chosen for certain purposes in a country it may be necessary to take steps to assure its adequacy for these purposes. The questions to be answered generally refer to standardization and modernization: What variety of the language should be selected or created as the standard form for written and spoken purposes? What means shall be used to provide modern terminology and the needed literary and scientific forms of discourse?

Finding suitable answers to language questions like these in most of the developing countries is of crucial importance in their economic, political, and social development. Development of the educational system and development of communication networks in a country are increasingly recognized as critical elements in national development as a whole, and both of these are dependent on language policies. Decisions must be taken on language questions in terms of at least three important goals: national unity and national identity, access to modern science and technology, and international communication.

Language policies are rarely set quickly and decisively. Like many national policies, they often develop gradually, vacillate, and are modified again even after they are thought to be final. Occasionally, however, a single decision, e.g. the choice of Bahasa Indonesia in Indonesia, may have enormous consequences for the country. Whether the language policies of a country grow gradually or by jumps, it seems likely that the decisions involved will be better, i.e. will achieve the desired results more efficiently, the better the information is on which the decisions are based.

It must be recognized, of course, that language policies, again like many other national policies, are not determined simply on the lines of rational analysis. In fact, decisions on language questions are notoriously influenced by emotional issues such as tribal, regional and religious identification, national rivalries, preservation of elites, and so on. They may even go directly against all evidence of feasibility. The fact remains that the availability of accurate, reliable information on the language situation of a country can be influential in making policy decisions and is of tremendous value in planning and carrying out the implementation of the policies.

Strangely enough, very few countries or regions have attempted systematic surveys of the language situation. The most famous such survey was the monumental Linguistic Survey of India carried out by Sir George Grierson at the turn of the century, and even today when Indian officials need information on which to base decisions they have no better source to turn to. The existence of the LSI does not guarantee sensible decisions, and the LSI is now outdated in its methods and much of its information, but the availability of such information as is contained in it has been important.

One of the most important recent attempts to survey the language situation in a country or region is the West African Languages Survey carried out since 1960 under the direction of Professor Joseph Greenberg with the aid of grants from the Ford Foundation. 
This survey has concentrated on the more narrowly linguistic problems of language description, and most of the publications coming out of it are technical articles and monographs of more direct interest to professional linguists than to government officials or language teachers. As a by-product of this survey, however, the linguist-investigators have accumulated a considerable store of information on the language situation in West African countries, although there are as yet no definite plans for publication of the material. Since previous language surveys have generally been motivated chiefly by interest in the collection of linguistic data, especially on languages little known or not known at all to the world of scholarship, it may be useful to describe the purposes and procedures of a survey not characterized by this "anthropological purism," as it has been called, but by concern with the language problem of government and, in particular, education.

Basic Data on Major Languages

The first task of a country language survey is to determine which are the major languages of the country and to assemble the basic sociolinguistic information about them. Sometimes the determination of major languages is relatively simple, sometimes it is difficult; often the criteria must be worked out for the specific country. For example, Madagascar has two major languages: Malagasy, spoken by 90 per cent of the population; and French, the language of government and education. Bolivia has three: Spanish, Quechua, and Aymara, the native languages of roughly equal thirds of the population. Kenya probably has ten major languages: eight languages spoken by more than 200,000 each; Swahili, a widespread lingua franca; and English, the principal language of government and education.

It is presumably only from these major languages that candidates can be considered for a national language, official language(s) of government, and language(s) as mediums of instruction. In order to make decisions of this kind and, even more important, to undertake the necessary programs of language teaching, materials preparation, teacher training, publication, and so on, further information must be collected about each major language.

Who speaks the language as a first language, where, and under what circumstances? To take a simple example, if a given country chooses English as its national language and language of education, and finds it necessary or desirable to have special English teaching materials for speakers of different major languages, the ministry of education must know the geographical extent of each of these languages, the amount of its use in linguistically heterogeneous urban centers, and the social limitations on its use in order to plan distribution of materials and teacher training.

How much dialect variation is there in the language? For example, a given language may be spoken by a third of the population of a country and the government may wish to choose it as a language for literacy training, limited publication, and use as a medium of instruction at the primary level. If, however, the language in question has no standard form, but shows several major dialect areas with strong feelings of dialect identification by the speakers, the government policy may not be feasible.

To what extent is the language used as a second language or lingua franca by others, and to what extent do native speakers of the language use other languages? Two languages may have roughly equal numbers of native speakers, but there may be a long tradition of speakers of the one language learning the other in addition, while members of the second speech community do not reciprocate. In such a situation, the government can probably settle for the use of only one of the languages in education.

To what extent is the language used in education? It might be expected that this information would be easy to obtain since the use of a language as the medium of instruction is presumably set by government policy. It often happens, however, that a given language is in fact used in the first two grades of school or as a preliminary step in adult literacy training when government policy either has not required this or has even forbidden it.

Language Attitudes

In many ways the effectiveness of language policies in education is determined more by the attitudes of the people on language use than it is by the simple demographic facts of language distribution and use. Discovering language attitudes is more difficult than finding the basic data and also may raise political issues which threaten the successful carrying out of a language survey, but it is of fundamental importance. What do the speakers of a language believe or feel about its esthetic, religious, and "logical" values? About the appropriateness of its use for literature, education, and "national" purposes? What do the speakers of a language believe or feel about other languages in the country? Are they superior or inferior to their own language in general or for specific purposes? As an example, speakers of Berber languages generally feel that Arabic is superior to Berber for all purposes except intimate, domestic conversation. Speakers of Kurdish generally feel that Arabic is better than Kurdish for statements of religious truth and as a lingua franca with Arabs and Muslim speakers of other languages, but that Kurdish is more expressive and generally better than Arabic for other purposes. Obviously, educational policies in Arab countries with Berber or Kurdish minorities are related to this difference of attitude.

Survey Techniques

Linguistic research uses principally techniques of elicitation, recording, and analysis. Such techniques are, however, only marginally relevant to a sociolinguistically oriented survey. The four techniques most likely to prove effective are: the culling of information from published sources, consultation with experts and persons knowledgeable about specific areas or problems, the use of questionnaires, and field observation and interviews. There is almost no published guidance on these survey techniques: the best discussion is apparently William Reyburn's "Problems and Procedures in Ethnolinguistic Surveys," reproduced for the American Bible Society in 1956.

In many developing countries a considerable amount of sociolinguistic information can be found in articles, books, monographs, and reports on the area published in the languages of European scholarship, including former colonial languages. The material is generally scattered and difficult of access, and one element of a language survey would be the rather demanding library work of exploring this material for the relevant information.

The most fruitful source of sociolinguistic information in many countries will be consultation with language teachers, missionaries, archeologists, government officials, and other informants. Much can often be done in the capital of a nation, but some consultation must be in the provinces.

Questionnaires can be effective means of collecting sociolinguistic information from special subpopulations, in particular, school and university students. In the case of a country like Ethiopia there is a special resource for this kind of mass data collection: the university students in various parts of the country under a national service scheme.

The critical technique remains the personal on-the-spot investigation by a country survey worker. Collection of data by the other techniques will show gaps and inconsistencies which can only be corrected by observation of classrooms and local life and interviews of selected individuals and groups.

A sociolinguistically oriented language survey of a developing country should be closely associated with whatever linguistic research and teaching is taking place in the country. This usually would mean that the survey would be based at a university department of languages or linguistics, though in some cases the survey might be based at a research or language teaching institution other than a university, if the institution is clearly the center of linguistic research and training in the country. In either case the presence of survey personnel and activities can strengthen the existing work in linguistics and lead to further development of the university or other institution.

A language survey in a developing country can also serve as a means of bringing together people who are working on related problems but who are not normally in touch with one another. In many countries this means three kinds of people: scholars in traditional fields of linguistic and philological study of Classical and modern literary languages; anthropologically-minded linguists doing field work on local languages; and foreign language teachers, especially of English and French. In some cases a further group, literacy specialists, are to be included.

The most effective means of bringing these different kinds of people together on a regional basis is the holding of recurrent international conferences. The International Symposia held every eighteen months under the sponsorship of the Inter-American Program in Linguistics and Language Teaching, financed in large part by grants from the Ford Foundation, have been successful in this, as has the Annual Congress of the West African Languages Survey. 
In many developing countries there is very little contact between groups within the country itself, let alone throughout the region of which it is a part. International conferences for reading of papers and discussion of specific problems in linguistics and language teaching are not only valuable for the exchange of information, but also for the strong stimulating effect they have on language research and the development of teaching materials.

21 Sociolinguistic Settings of Language Planning

All language planning activities take place in particular sociolinguistic settings, and the nature and scope of the planning can only be fully understood in relation to the settings. This paper will offer a discussion of the settings of the three languages and nations in which the international study of language planning processes was carried out: the Indonesian language in Indonesia, the Hindi language in India and the Hebrew language in Israel. Also, the study of language planning processes rests on a number of assumptions about the structure and use of languages in human societies; before the sociolinguistic settings are discussed, two of these assumptions will be made explicit. First, all languages change in the course of time, and all speech communities change through time in respect to the functional allocations of the varieties of language used in them. Second, all users of language in all speech communities (speakers, hearers, readers, writers) evaluate the forms of the language(s) they use, in that they regard some forms as 'better' or 'more correct' or 'more appropriate' than others either in an absolute sense or for certain purposes or by particular people or in certain settings. Most of the change which takes place in languages and in the allocation of language functions in speech communities is apparently by unconscious processes, i.e., it takes place gradually and out of awareness of the language users themselves. Much of the change is, however, related to the users' evaluations, and in some instances conscious, deliberate attempts to affect the course of language change, either to foster innovation or to preserve the existing state, contribute to the processes of change, sometimes crucially. It is in this last realm of deliberate attempts to influence the course of change that the notion 'language planning' becomes a useful concept for the analysis and understanding of language change.
The two assumptions of change and evaluation are so basic to the study of language planning that they merit some further clarification and exemplification.

This paper originally appeared in Language Planning Processes, edited by J. Rubin, B. H. Jernudd, J. Das Gupta, J. A. Fishman, and C. A. Ferguson, pp. 9-29. The Hague: Mouton.



Language Planning

Change

The two most obvious kinds of language change are changes in orthography, i.e., the accepted means of written representation of language in a community, and changes in lexicon, i.e., the stock of words and their meanings in a particular language or language variety. Changes in orthography may be relatively trivial ones such as the gradual shift from spelling -ague to -og (e.g., dialogue > dialog) in American English or more systematic and pervasive changes such as the dropping of several letters and changing of spelling conventions carried out in Russian after the Revolution. Still more visible are changes in whole type styles or fonts such as the nineteenth and twentieth century replacement of the 'Gothic' or 'Fraktur' letter shapes by the roman shapes in German and Scandinavian languages. The most impressive of all is the creation or adoption of a totally different writing system, as when Turkey in the late 1920s exchanged Arabic script for the Latin alphabet and devised a totally new spelling system for Turkish. Changes in lexicon also range from isolated shifts of meaning in particular words all the way to massive replacement or additions in lexicon. For example, the word into in current American English of the last decade has had a new meaning added to it, something like 'interested in, concerned about, having some knowledge of or experience in' as in she's into ecology or he's into yoga. While this new meaning has spread rapidly and widely throughout the American English speech community, it seems to be linguistically isolated. On the other hand the great infusion of French- and Latin-based vocabulary into the English language which began in the eleventh century transformed the whole structure of the English lexicon.
Changes in orthography and lexicon, as the most obvious types of language change and the most accessible to awareness and explicit discussion, have often been the focus of language planning, and in all the nations reported on in this book (Indonesia, India, Israel, Sweden, China) both orthography and lexicon have been the object of deliberate (governmental and non-governmental) efforts to affect the course of language change. Languages also change in pronunciation and grammar. These changes are on the whole less obvious and less accessible to the consciousness of language users, but they have traditionally been of greater interest for linguistic researchers who want to understand the 'natural' processes of language change and the 'universal' characteristics of human language. Correspondingly they have less often been the object of language planning, although some of the best known efforts at national language planning have specifically included these aspects of language. American English shows many examples of ongoing changes in pronunciation (Labov 1972: 260-325). For example, the distinction between the vowel sounds of cot and caught is disappearing, i.e., more and more people are pronouncing the two words identically. It is not just a matter of these two words but dozens of similar pairs (hock: hawk; tot: taught, taut) and hundreds of words which contain one or the other of the two vowels even though there is no exact pair (e.g., hot,
cob, locker, bottle, Tommie as opposed to gawk, raucous, talker, McCawley). This change is taking place largely out of awareness and has relatively little evaluation associated with it. Two ongoing changes in pronunciation which have powerful evaluative associations are the dropping and adding of r after vowels (e.g., 'kahd' vs. card) and the use of d instead of the th sound in words such as the, this, then, etc. The dropping and reinserting of r are changes which occur in many parts of the English-speaking world, and almost always with social identification and evaluation, although not everywhere in the same direction (i.e., positive or negative). The change of th sounds to t or d has happened repeatedly in the Germanic languages (as well as in other languages in other parts of the world) and it is likely that English will eventually make this change; at the present time, however, the strong social evaluation against the change seems to be an important retarding factor. Such details of pronunciation may become the object of language planning, and among the most interesting questions of language planning research are under what conditions and to what extent such planning can be successful.
The most obvious kind of change in the functional allocation of language(s) in a speech community is that in which whole languages replace other languages for very general purposes, in the extreme case that of a monolingual community shifting its mother tongue. As a familiar example we can cite the changes in language allocations in England in the fourteenth to sixteenth centuries. The distribution at the middle of the fourteenth century was, in general terms, English as the home speech, French as the language of parliament and the courts and Latin as the language of church, education and science. By the middle of the sixteenth century English had taken over most of the functions of the other two except for an important residue of Latin use in education and science (Jones 1953).
In description of language change linguists are often able to base their work on descriptive grammars of a particular language at different periods of time. Also, linguists have accumulated enough information about processes of change that they may call on their theoretical principles for guidance in interpreting new data. In description of change in language allocation in speech communities the sociolinguist rarely has descriptions of the language situation of particular communities at particular periods of time (e.g., Clark 1956). Furthermore, so little systematic work has been done on the processes of change in language function that the investigator must draw chiefly on general social science principles or his own intuitions for guidance. Historical studies of change in language which focus on a community or an area rather than a language are rare (e.g., Pulgram 1958). At the present time the functional change which is most often the focus of political pressure and governmental policy making at the national level is probably the choice of medium of instruction in the educational system, and there can be little doubt that shifts in this allocation can have far-reaching consequences in the structure of the languages involved, in the patterns of communication in the nation, and in the broader political processes within which language policy decisions take place. In nations such as Indonesia and Tanzania where a minority language has become in many respects the dominant national language, an important factor
in the shift was the use of Indonesian and Swahili respectively as the medium of education in the schools in the period preceding formal independence and the new official language policy. The importance of the school context must not be overestimated, however, since major shifts in language use may take place without support from the schools or even in opposition to educational policy. The spread of Swahili in Kenya and Hausa as a lingua franca in large areas of West Africa came about largely without benefit of national policies to use them in the schools. On the other hand, the school context is probably of crucial importance in the spread and acceptance of new technical vocabulary, and the studies reported in this volume are directed to the use of approved vocabulary by students and teachers and their attitudes towards the words and the approving authorities.

Evaluation

The whole area of users' evaluations of language is of great importance for identifying language change but it has only rarely been treated in general terms, although some small pieces of the picture have been studied (Labov 1972: 308-17). A few of the main issues will be touched on here as background for discussion of language planning.1 First, there is the general consideration that evaluation, like the processes of language change to which it contributes, may be either conscious or unconscious. A listener may rate speakers unconsciously by details of pronunciation and choice of words which he could not specify, or he may consciously listen for or comment on a particular form, construction or pronunciation of which he strongly approves or disapproves. Further, the relation between evaluation and actual behavior is complex. For example, the language user may strongly favor one variant although in his own speech he normally uses a different one. In the discussion which follows, these differences between unconscious and conscious evaluation and between conscious evaluation and actual behavior must be borne in mind. Evaluation may reflect such different realities as idealizations, stereotypes, completely unconscious, shared values or individual attitudes. One common type of evaluation consists of a unidimensional scale of linguistic phenomena, one end of which reflects the most careful use of language and the highest status in social stratification while the other end represents the most casual, unthinking use of language and the lowest strata in the system.
The linguistic variation which is the object of the evaluation is typically also to be found in regional dialect variation, so that charting the path of change throughout the speech community will involve purely linguistic parameters (such as universal phonetic and semantic tendencies) as well as more sociological parameters such as group identification, social stratification, communication networks, and even the conditions of appropriateness for more and less careful speech. The attitudes toward the pronunciation of short a in American English provide an example of this kind of evaluation. The vowel of can't and similar words
ranges along a continuum from the pronunciation represented as 'cahnt' to that of 'caint'. Linguists customarily recognize at least five common variants2 and the variation correlates with geographical regions, social status, and degree of carefulness in speech. Pronunciations above what the listener regards as appropriate for the occasion are heard as affected, overcorrected, pedantic, or at best regionally marked; pronunciations below on the scale are heard as uneducated, substandard, backward, sloppy or at best as regionally marked. The details of the pronunciation are not explicitly understood by phonetically untrained speakers, but the reality of the variation and the evaluation attached to it, in part consciously, is a fact of American English, and scales of this kind occur in many speech communities. Parallel to this kind of evaluation of variation within a single language is a pattern of evaluation of the use of a particular language variety or distinct language in a community. To take a very simple example, of the four languages used in the Gede Settlement Scheme area on the coast of Kenya, the scale runs English, Swahili, Giriama, Waata. In given situations, any member of the speech community tends to rate unfavorably the use of a language 'higher' or 'lower' on the scale than he finds appropriate for the occasion (Sedlak 1974). Needless to say, patterns of evaluation are often much more complicated than this simple unidimensional one, and in fact the examples cited here are presented in an oversimplified way in order to make the pattern clear. Patterns of evaluation in a particular speech community tend to be reflected in the goals and activities of its language planners. Instead of attempting to examine other patterns in this brief discussion, however, it is of greater interest to note the existence of foregrounding patterns which give an explicit social value to a particular characteristic or set of characteristics. 
In all speech communities, it may safely be assumed, the language users sometimes explicitly call attention to particular features of language structure or use as signals of group identity, disapproved behavior, objects of correction or other social values. Such foregrounded social markers in language are only a small fraction of the total amount of evaluation which pervades the whole use of language, but they may have special importance as indicators of trends and values, and they constitute the primitive source from which institutional language planning activities ultimately are derived. The best known example of a foregrounded social marker in language is probably the story of shibboleth in the Bible. The people of Gilead and the people of Ephraim spoke the same language, but their pronunciation of certain sibilants differed, and the Gileadites were able to make explicit use of this difference by the diagnostic word 'ear of corn' which they asked the Ephraimites to pronounce. According to the story (Judges 12: 4-6) thousands of fugitive Ephraimites were identified by their pronunciation of shibboleth and were killed. In this case the community's view of the social marker presumably reflected the facts of language behavior, and the consequences for the group identified were catastrophic. Often, however, social markers are not used with such drastic intentions, and also the belief about the markers may correspond only partially with the facts. Nevertheless, the potency of such markers in cueing attitudes and actions is great, and they
not only contribute to the total picture of the processes of language change in the community, but they may become political issues or serve as symbols of deeper political issues. Some language evaluation is explained by members of the speech community in terms of particular reasons: e.g., such-and-such a form is better because it is consistent with other related forms, or is the original form, or just 'sounds better'. Such evaluation may be called rationalized evaluation. It is of importance here because most language planning involves such rationalized evaluation of language, and some theories of language planning are based entirely on it (e.g., Ray 1963, Tauli 1968; cf. Haugen 1966 for discussion). Three of the principal types of rationalized