UNIVERSITÀ DEGLI STUDI DI SASSARI
DIPARTIMENTO DI SCIENZE UMANISTICHE E SOCIALI
CORSO DI LAUREA MAGISTRALE IN LINGUE, CULTURE E
COLLOCATIONS AND REPRESENTATIONS OF MEN AND WOMEN IN THE BRITISH PRESS: A CORPUS STUDY
Supervisor: PROF. ANTONIO PINNA
Co-supervisor: DOTT.SSA STEFANIA GANDIN
Thesis by: ANTONIETTA DEMONTIS
ACADEMIC YEAR 2011/2012
Table of contents
Introduction
1. Corpus Linguistics
1.1 Issues relating to the definition and nature of corpus linguistics
1.2 Critical points and potentials of corpus linguistics
1.3 Corpora: background information
1.4 Corpora and lexicography
1.5 Corpus techniques of investigation
1.6 Lexico-grammatical profiles
1.7 Corpora and grammar studies
1.9 Corpora and language learning and teaching
1.10 Language and translation
1.11 Pre-electronic corpora
1.11.1 Biblical concordances
1.11.2 Early grammars
1.11.3 Early dictionaries
1.11.4 The Survey of English Usage (SEU) corpus
1.12 Electronic corpora
1.12.1 General Corpora
1.12.2 Specialized Corpora
1.12.3 Learner Corpora
1.12.4 Pedagogic Corpora
1.12.5 Monitor corpora
1.12.6 Parallel and comparable corpora
1.13 The web revolution
2. Language and Gender
2.1 Sex and gender: an essential differentiation
2.2 Sociolinguistics and gender
2.2.1 Different approaches to studying language and gender
2.3 The ascent of feminism: a brief review
2.3.1 Phases of feminism
2.4 Sexist speech
2.4.1 Gendered language
2.4.2 Women and men talking
2.4.3 Critical Discourse Analysis: the male as model
2.5 Gender identity and media
2.5.1 Advertising and gender
2.5.2 Media discourse
3. Man and Woman in the British press: a corpus study
3.1 The British daily press: tabloids vs. broadsheet
3.2 A brief survey of previous studies on the representation of gendered items in corpora
3.3 Methodology: description of the corpus investigated
3.4 Methodology: description of the British National Corpus
3.5 Methodology: description of WordSmith
3.6 Analysis: a mixed method approach
3.7 Objectives of the analysis
3.8 Discussion of the findings
3.8.1 Power and misconduct
3.8.2 Social categories
3.8.3 Character and mental ability
Other reference sources
Introduction
This dissertation investigates how the British press represents dominant social views of, and attitudes towards, men and women. Corpus investigation techniques are applied to a corpus that I compiled personally, containing 1,000 articles from the online archives of two British newspapers, The Guardian and The Daily Mail. Using the concordancing software WordSmith Tools, the analysis concentrates on key collocations of the lexical items man and woman within the corpus and on their collocational behaviour, revealing how different social categories (in this case men and women) are represented, and possibly perceived, especially in the British social context. In other words, by analyzing the concordance lines of these collocations we can infer which cultural beliefs about men and women are conveyed by characteristic, recurrent language use in the samples of the British press under consideration. The results show that there are similarities and differences in the collocates of man and woman both within the same period and across time. Furthermore, evidence from the corpus suggests that British society perceived men and women differently in 2011 than in 2001, especially in terms of social condition and sexual orientation. In this sense the analysis demonstrates that the perception of particular social groups and their relations can change over time, even within the space of a decade. This may depend on variation in particular social, economic and cultural conditions. These changes and differences affect the way in which people speak and write. In linguistics, changes in lexical choices can be examined by means of a corpus. Bussmann describes a corpus as “a finite set of concrete linguistic utterances that serves as an empirical basis for linguistic research.”1 On this definition, a corpus is a very important tool of linguistic study, including diachronic lexical investigations covering brief time spans.
The thesis is divided into three chapters. Chapter 1 illustrates what corpus linguistics is and how a corpus approach can be applied to several fields of study related to language, such as lexicography, grammar studies, language learning and teaching, and translation studies. I shall also focus on some positive outcomes that the use of corpora may bring in these different areas of language study. In Chapter 2 I shall summarize the results of some previous studies in the field of “Language and Gender” that are concerned with the ways in which different social groups are represented and perceived within a given speech community. Finally, in Chapter 3 I shall describe the process of data collection and corpus-building required for the present study, some methodological implications of that process, and the results of the analysis.
1 H. Bussmann, Dictionary of Language and Linguistics, London: Routledge, p. 260, 1996.
Chapter 1 Corpus Linguistics
1.1 Issues relating to the definition and nature of corpus linguistics
In the last decade corpus linguistics has firmly established itself as a central approach to the study of languages, with important applications in linguistics-related areas such as lexicography/terminography, grammar studies, language learning and translation, to name but its most central fields of study. Corpus linguistics concerns the study of language in use by means of corpora. The word ‘corpus’ comes straight from Latin and means ‘body’, and by metaphorical extension it has long been used to denote a body of texts. Tognini-Bonelli2 distinguishes between ‘corpus-based’ analysis, where existing hypotheses and theories are tested on corpus data, and ‘corpus-driven’ analysis, where the starting point is the data, and elements and classes are obtained directly from the corpus without being linked to pre-existing theories about language. The corpus-based approach is suitable for quantitative analysis, which aims at examining the distribution of certain linguistic features within a particular set of texts, while the corpus-driven approach is particularly useful in qualitative investigations, which aim at defining linguistic categories on the basis of their recurrent associations with their verbal co-text and their wider situational and cultural context.
Is corpus linguistics a methodology or a theory? A variety of definitions of corpus linguistics have been provided by leading corpus theorists, some of which have been openly welcomed and others rejected.
2 E. Tognini-Bonelli, Corpus Linguistics at Work, Amsterdam: John Benjamins, 2001, p. 177.
Some scholars consider corpus linguistics a theory. For example, Leech maintains that “Computer corpus linguistics defines not just a newly emerging methodology for studying language, but a new research enterprise, and in fact a new philosophical approach to the subject”.3 In the same way, Stubbs rejects the reductive designation of corpus linguistics as a methodology, and remarks that “[…] a corpus is not merely a tool of linguistic analysis but an important concept in linguistic theory”.4 Furthermore, Teubert underlines the theoretical aspect of corpus linguistics, which he defines as “[…] a theoretical approach to the study of language”.5 Like Leech, Stubbs and Teubert, Aarts6 also considers corpus linguistics a discipline. This, in turn, raises the further question of exactly what type of discipline it is. For instance, whilst Stubbs regards linguistics as an “applied social science”,7 Teubert affirms that “Linguistics is not a science like the natural sciences whose remit is the
3 G. Leech, “Corpora and theories of linguistic performance” in J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin: Mouton de Gruyter, 1992, p. 106.
4 M. Stubbs, “British traditions in text analysis: From Firth to Sinclair” in M. Baker, F. Francis and E. Tognini-Bonelli (eds.), Text and Technology: In honour of John Sinclair, Amsterdam: John Benjamins, 1993, pp. 23-24.
5 W. Teubert, “My version of corpus linguistics”, International Journal of Corpus Linguistics, 2005, 10 (1), p. 2.
6 J. Aarts, “Does corpus linguistics exist? Some old and new issues” in L.E. Breivik and A. Hasselgren (eds.), From the COLT’s mouth… and others’: Language corpora studies in honour of Anna-Brita Stenström, 1–19, Amsterdam: Rodopi, 2002.
7 M. Stubbs, “British traditions in text analysis: From Firth to Sinclair” in M. Baker, F. Francis and E. Tognini-Bonelli (eds.), Text and Technology: In honour of John Sinclair, 1-36, Amsterdam: John Benjamins, 1993, p. 3.
search for ‘truth’. It belongs to the humanities, and as such it is a part of the endeavour to make sense of the human condition.”8 Others, in describing and characterizing corpus linguistics, have pointed out its scientific credentials. For instance, McCarthy maintains that corpus linguistics represents “cutting edge change in terms of scientific techniques and methods”,9 while Stubbs compares corpus linguistics to science, noting that “[…] geologists are interested in processes which are not directly observable because they take place over vast periods of time […] Corpus linguists are interested in processes which are not directly observable because they are instantiated across the language use of many different speakers and writers”.10 Other linguists favour the notion of corpus linguistics as a methodology. For example, Gries affirms that “Over the past few decades, corpus linguistics has become a major methodological paradigm in applied and theoretical linguistics.”11 Tognini-Bonelli defines corpus linguistics as a “pre-application methodology” with a “theoretical status”.12 Likewise, Mahlberg portrays corpus linguistics as “an approach to the description of English with its own theoretical framework”.13 She considers the difference of views to be a result of the kind of corpus linguistics that the linguists
8 W. Teubert, “My version of corpus linguistics” in International Journal of Corpus Linguistics, 2005, 10 (1), 1–13, p. 7.
9 M. McCarthy, Issues in Applied Linguistics, Cambridge: Cambridge University Press, 2001, p. 125.
10 M. Stubbs, Words and Phrases: Corpus studies of lexical semantics, Oxford: Blackwell, 2001, p. 243.
11 S. T. Gries, “Some proposals towards more rigorous corpus linguistics”, Zeitschrift für Anglistik und Amerikanistik, 2006, 54(2): 191–202, p. 191.
12 E. Tognini-Bonelli, Corpus Linguistics at Work, Amsterdam: John Benjamins, 2001, p. 1.
13 M. Mahlberg, English general nouns: A corpus theoretical approach, Amsterdam: John Benjamins, 2005, p. 2.
practice: “there is still disagreement on whether corpus linguistics is mainly a methodology or needs its own theoretical framework. Advocates of corpus-driven approaches to the description of English claim that new descriptive tools are needed to account for the situation of real text, and ideas of theoretical frameworks to accommodate such tools have started to emerge.”14 Thompson and Hunston affirm that “at its most basic corpus linguistics is a methodology that can be aligned to any theoretical approach to language.”15 McEnery, Xiao and Tono note that, as “corpus linguistics is a whole system of methods and principles of how to apply corpora in language studies […] it certainly has a theoretical status. Yet theoretical status is not theory in itself”;16 thus they take corpus linguistics to be a methodology. McEnery & Wilson17 and Meyer18 also consider corpus linguistics a methodology, and similarly Bowker and Pearson describe it as “an approach or a methodology for studying language use.”19 Nevertheless, McEnery and Gabrielatos observe that “Corpus linguistics may be viewed as a methodology, but the methodological
14 M. Mahlberg, “Lexical cohesion: Corpus linguistic theory and its application in English language teaching”, International Journal of Corpus Linguistics, 2006, 11(3): 363–383, p. 370.
15 G. Thompson and S. Hunston (eds.), System and corpus: Exploring connections, London: Equinox, 2006, p. 8.
16 T. McEnery, R. Z. Xiao, and Y. Tono, Corpus-based language studies: An advanced resource book, London: Routledge, 2005, pp. 7-8.
17 T. McEnery and A. Wilson, Corpus linguistics, Edinburgh: Edinburgh University Press, 1996.
18 C. F. Meyer, English corpus linguistics: An introduction, Cambridge: Cambridge University Press, 2002.
19 L. Bowker and J. Pearson, Working with specialized language: A practical guide to using corpora, London: Routledge, 2002, p. 9.
practices adopted by corpus linguists are not uniform”,20 and they explain how such methodological divergences are produced by theoretical reflections. Teubert also tries to interpret the diversity of methods, and claims that “corpus linguistics is not in itself a method: many different methods are used in processing and analysing corpus data. It is rather an insistence on working only with real language data taken from the discourse in a principled way and compiled into a corpus.”21 Tognini-Bonelli22 classifies Corpus-Driven Linguistics (CDL) as a branch within corpus linguistics, a stance endorsed by Römer.23 Others have employed the term corpus-assisted analysis, highlighting the way in which corpus-analysis instruments and methods can be combined with, and enriched by, other approaches.24 These include reading and examining sample texts from the corpus, comparing results from different corpora, and drawing on other, non-corpus sources of information about the participants and practices in the discourse form.
20 T. McEnery, C. Gabrielatos, “English corpus linguistics,” in B. Aarts and A. McMahon (eds.), The handbook of English linguistics, Oxford: Blackwell, 2006, 33–71, p. 44.
21 W. Teubert, “My version of corpus linguistics” in International Journal of Corpus Linguistics, 2005, 10 (1), 1–13, p. 4.
22 E. Tognini-Bonelli, Corpus Linguistics at Work, Amsterdam: John Benjamins, 2001.
23 U. Römer, Progressives, patterns, pedagogy. A corpus-driven approach to English progressive forms, functions, contexts and didactics, Amsterdam: John Benjamins, 2005.
24 A. Partington, “Corpora and discourse: A most congruous beast”, in A. Partington, J. Morley and L. Haarman (eds.), Corpora and discourse, 1–20, Bern: Peter Lang, 2004; J. Morley and P. Bayley (eds.), Wordings of war. Corpus assisted discourse studies on the war in Iraq, London: Routledge, 2009.
Following Fillmore,25 they also emphasize the need to take advantage of the useful interchange of intuition, data-observation and introspection. The multiplicity of approaches in defining corpus linguistics may be considered a positive thing. Moreover, different interpretations are unavoidable, firstly because corpus linguistics is evolving, as Hoey26 remarks, and secondly because research is conducted in a broad range of ways and concerns a variety of issues. Also, corpus linguists’ work may concern corpus design, compilation or analysis – or sometimes all three simultaneously – and in each of these cases the linguist may have a different opinion on what the project involves. In his account of corpus linguistics, Teubert claims that “only if the discourse of corpus linguistics remains controversial and pluralist will there be progress.”27 However, acceptance of a variety of views at the level of the academic community should not be confused with tolerance of imprecision on the part of researchers.
25 C. Fillmore, “Corpus linguistics or computer-aided armchair linguistics”, in J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin: Mouton de Gruyter, 1992, pp. 35–60.
26 M. Hoey, Introduction to M. Hoey (ed.), Data, Description, Discourse: Papers on the English language in honour of John McH Sinclair, V-IX, London: HarperCollins, 1993.
27 W. Teubert, “My version of corpus linguistics” in International Journal of Corpus Linguistics, 2005, 10 (1), 1–13, p. 13.
1.2 Critical points and potentials of corpus linguistics
The main criticisms of corpus linguistics relate to the fact that it deals with too small a quantity of data, very often consisting only of text fragments. Consequently, investigations are de-contextualized and hard to generalize. For example, Widdowson analyzed some extracts from a work by Fairclough,28 and in his analysis he expressed his reservations about the use of text fragments. His hesitation is based on the fact that these fragments are de-contextualized and therefore lack the contextual information necessary to understand a text. In Widdowson’s words, “The first thing to notice is that what we have here is a text fragment. We do not know how it functions in relation to the rest of the booklet. Furthermore, the three extracts that we have are discontinuous: a whole subsection between the second and third has been omitted. No reason is given for this. We do not know either anything about what the motivation for the text was: if, or why, it was commissioned, and by whom…”29 Again with reference to data collection, another difficulty indicated by Stubbs30 is that text fragments are not representative of a specific language, topic or genre, even if the information has been arbitrarily selected.
28 N. Fairclough, Discourse and Social Change, Cambridge: Polity Press, 1992.
29 H.G. Widdowson, “Reply to Fairclough: discourse and interpretation: conjectures and refutations”, Language and Literature 5: 57 – 67, p. 62, 1996.
30 M. Stubbs, “Whorf’s children: critical comments on critical discourse analysis” in A. Ryan and A. Wray (eds.), Evolving Models of Language, Clevedon: Multilingual Matters/BAAL, 100 – 116, 1997.
Another critical point is that certain kinds of context may be difficult to retrieve (for example the layout of newspapers) or hard to preserve in a corpus. Nonetheless, it is possible to incorporate some contextual information by including titles and indexes in the corpus (for example, information about who created the text, for whom, and when). From a methodological point of view, the most challenging difficulty is the inference of a direct correlation between linguistic items and ideological meanings; put differently, the tendency to read ideological preconceptions directly off linguistic forms when the interpretative stance is not made explicit. This may produce artificial and biased analyses. With regard to this, Fowler maintains that “…any aspect of linguistic structure, whether phonological, syntactic, lexical, semantic, pragmatic or textual, can carry ideological significance...”.31 Stubbs’ research on the use of ergative verbs also showed that it is plausible to posit a relationship between the regular use of a linguistic structure and an ideological attitude.32 The investigation focused on the differences in the use of ergative verbs in two books. These differences were also examined from a statistical point of view and compared with a general reference corpus. Stubbs clarifies that these
31 R. Fowler, Language in the News: Discourse and ideology in the press, London: Routledge, p. 67, 1991.
32 M. Stubbs, Text and Corpus Analysis, Oxford: Blackwell, pp. 125-156, 1996.
grammatical dissimilarities do not engender the ideological differences, but they are correlated. With regard to frequency, asymmetries are particularly interesting. Hunston33 noticed that the frequency of different senses of ‘right’ varies when it occurs with man and woman in the journalism sub-corpora of the Bank of English: next to man, right is principally related to the professional sphere, as in the right man for the occupation, whereas most occurrences of right woman have a familial or marital sense, namely the right woman for a man to marry. In this case, the asymmetry may well reveal different ways of portraying men and women in the corpora. A great deal of work in corpus linguistics has been devoted to examining linguistic forms such as transitivity and nominalization in order to uncover the ideology embedded in a text, even though numerous scholars maintain that presupposing one-to-one connections between linguistic forms and ideology may be highly problematic.34 For example, Sharrock and Anderson explain that the agentless passive is very frequently interpreted as a concealment of responsibility, whereas a different explanation could be provided:
33 S. Hunston, Corpora in Applied Linguistics, Cambridge: Cambridge University Press, 2002.
34 See K. Durkin, “Review of Kress and Hodge, 1979”, Journal of Pragmatics 7: 101 – 105, 1983; A. D. Grimshaw, “Review of Kress and Hodge, 1979 and of Silverman and Torode, 1980”, Language 57: 759 – 765, 1981; K. Richardson, “Critical linguistics and textual diagnosis” Text 7: 145 – 163, Sealey, 1987.
“They make much of the fact that many sentences lack – in grammatical terms – a clearly identifiable ‘agent’ and argue from that, or imply, that concealment is going on. If, however, they saw those sentences in the context of other sentences in the same text, they would see that the ‘agent’ can be quite clearly and unproblematically identified”.35 Frawley has a similar opinion with reference to the use of the agentless passive: “Deletion of agentives, which is a much-recycled grammatical item for analysis, is not inscribed with the meaning of deferral of responsibility.”36 Subsequently, Fowler also acknowledges that there is no straightforward one-to-one relationship between one linguistic structure and any particular ideological stance: “The theory of critical linguistics acknowledges that there is a lack of invariance between linguistic structures and their significances. This premise should be affirmed more clearly and insistently than has been the case. Significance (ideology) cannot simply be read off the linguistic forms that description has identified in the text, because the same form (nominalisation, for example) has different significances in different contexts…”37
35 W. W. Sharrock and D. C. Anderson, “Language, thought and reality, again”, Sociology 15: 287 – 293, p. 290, 1981.
36 W. Frawley, “Review of Van Dijk T. A. (ed), 1985”, Language 63: 361 – 395, p. 390, 1987.
37 R. Fowler, “On critical linguistics”, in Caldas-Coulthard and Coulthard (eds.): 3 –14, p. 9, 1996.
To recapitulate, the endeavor to formulate direct relationships between linguistic structures and ideological conjectures may lead to artificial analyses. Some researchers maintain that the use of a particular linguistic structure (e.g. the agentless passive) suggests a certain ideological attitude (e.g. a way of blurring responsibility) even when it is reasonable to think of other reasons for using it, for example reasons related to the individual style of the writer. The most remarkable advantage of computer-held corpora is that they make quantitative study possible. Moreover, the corpus approach significantly expands the quantity of data that can be processed while reducing the time devoted to analysis. What corpus linguistics can provide is quantitative corroboration in terms of data and methodology, as it offers results that can be replicated and therefore attain a higher degree of objectivity. As stated by Stubbs, “findings can be replicated on publicly accessible data”,38 which means that whatever patterns one analyst notices in a corpus, others should be able to notice as well. What a concordance program reveals is objective in the sense that the program displays all the examples, whether or not they match the analyst’s expectations; however, it should be noted that the interpretation of the results is not completely neutral and may vary according to different perspectives. To sum up, one of the strong points of the corpus-based approach is that it makes it possible to derive frequency information and identify recurring patterns
38 M. Stubbs, “Text, corpora and problems of interpretation: a response to Widdowson”, Applied Linguistics 22: 149 – 172, p. 153, 2001.
– by means of concordancing tools – which are not generally obtainable through manual analysis. The concordancing program can rearrange concordance lines in alphabetical order, which helps to unveil grammatical or lexical patterns that human intuition usually overlooks. As previously mentioned, many scholars maintain that it is not acceptable to base analyses on linguistic data drawn from a small number of text fragments. However, recurrent patterns allow us to perceive distinctive ways in which individuals and experiences are represented, and this might convey some ideological stances. With reference to this, Stubbs affirms that “Examples of individual utterances cannot tackle claims about the ideological implications of textual patterns…However, if such descriptions are regularly used in a wide range of reports, then they might come to seem a natural way of talking about things, and it is plausible that they come to influence how we think about such events...”.39
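By way of illustration, the alphabetical sorting of concordance lines described above can be sketched in a few lines of Python. This is only a minimal sketch and not the procedure implemented in WordSmith Tools: the function name `concordance`, the `span` parameter and the three sample sentences are assumptions introduced here for the example, and the tokenization is deliberately crude.

```python
import re


def concordance(text, node, span=4):
    """Return (left, node, right) triples for every occurrence of `node`,
    sorted alphabetically by the words to the right of the node."""
    # Crude tokenization: runs of letters and apostrophes only.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node:
            # Context is a fixed number of tokens on each side; for
            # simplicity it may run across sentence boundaries.
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, token, right))
    # Sorting on the right-hand context groups similar continuations.
    lines.sort(key=lambda line: line[2].lower())
    return lines


# Invented sentences, used only to show the layout of the output.
sample = (
    "The right man for the job was hired. "
    "She met a man of great wealth. "
    "An old man crossed the road."
)

for left, node, right in concordance(sample, "man"):
    print(f"{left:>25}  {node}  {right}")
```

Sorting on the right-hand context places lines with similar continuations next to one another, which is what makes recurrent patterns visible to the analyst; sorting on the left-hand context would only require a different sort key.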
1.3 Corpora: background information
In the words of Sinclair, one of the pioneers of corpus linguistics, a corpus is a “collection of pieces of language” that are “selected and ordered according to explicit criteria, in order to be used as a sample of the language […] encoded in a standardized and homogeneous way […] Its constituent pieces of language are documented as to their origins and provenance.”40 Moreover, a corpus is also made up of authentic texts,
39 M. Stubbs, “Text, corpora and problems of interpretation: a response to Widdowson”, Applied Linguistics 22: 149 – 172, p. 157, 2001.
40 J. Sinclair, “Preliminary Recommendations on Corpus Typology”, EAGLES Guidelines, 1996, available: http://www.ilc.cnr.it/EAGLES96/corpustyp/node5.html.
“gathered from the genuine communications of people going about their normal business.”41 More recently Sinclair has underlined the representative dimension that a corpus should have: “A corpus is a collection of pieces of language text in electronic form, selected according to external criteria to represent, as far as possible, a language or language variety as a source of data for linguistic research.”42 According to Meyer, a corpus is “a collection of texts or parts of texts upon which some general linguistic analysis can be conducted.”43 In a more recent publication, O’Keeffe44 concludes that a linguistic corpus is a collection of naturally occurring examples of language that must be representative of a language or a variety of it. It is designed for some linguistic purpose (to investigate certain linguistic features and/or to describe language structure and use) and is available for both quantitative and qualitative analyses. The former examine how linguistic features are distributed in particular genres or registers, and so have to do with statistics and numbers; the latter investigate the way in which linguistic features and words are used and behave in a particular co-text and in natural contexts.
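The quantitative side of this distinction can be made concrete with a small sketch. When the distribution of a feature is compared across registers of different sizes, raw counts are usually converted into a relative frequency, for instance occurrences per million words. The Python fragment below assumes this common normalization; the function name `per_million` and all the counts are hypothetical and invented purely for illustration, and they do not come from the corpus analyzed in this thesis.

```python
def per_million(hits, corpus_size):
    """Convert a raw count into a frequency per million words."""
    return hits * 1_000_000 / corpus_size


# Hypothetical figures, invented for illustration only.
registers = {
    "fiction": {"hits": 120, "tokens": 400_000},
    "academic prose": {"hits": 45, "tokens": 300_000},
}

normalised = {
    name: per_million(data["hits"], data["tokens"])
    for name, data in registers.items()
}

for name, rate in normalised.items():
    print(f"{name}: {rate:.1f} per million words")
```

Normalization of this kind allows sub-corpora of unequal size to be compared directly; the interpretation of any difference remains a matter for qualitative analysis of the concordance lines.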
42 J. Sinclair, “Corpus and Text – Basic Principles”, 2004, pp. 1-16, in M. Wynne, Arts and Humanities Data Service, Developing Linguistic Corpora: a guide to good practice, Oxbow Books, 2005, p. 16.
43 C. F. Meyer, English Corpus Linguistics: An Introduction, Cambridge: Cambridge University Press, 2002, p. XI.
44 A. O'Keeffe, M. McCarthy, R. Carter, From Corpus to Classroom: Language Use and Language Teaching, Cambridge: Cambridge University Press, 2007.
The authenticity issue is central to corpus linguistics, which has the objective of studying language as it is spoken or written in authentic contexts, for authentic purposes, by as many different individuals as possible. This methodology, focusing on language ‘performance’, grew out of dissatisfaction with a tradition of linguistic research which was confined to patterns of language ‘competence’ rather than ‘performance’,45 often through de-contextualized sentences, using elicitation techniques (e.g., grammaticality judgements) and/or psycholinguistic experiments. In this respect, Chomsky’s revolution in linguistics, centred on the generative power of rules, is significant. Rules, he states, do not demonstrate what is there but what is possible. This interest in the generative aspect of language has changed the agenda of linguistics. The task of linguistics is no longer to interpret what we observe in existing texts, but to explain the language faculty or, in more abstract terms, the competence of a speaker to create new grammatical expressions. Whilst rules were once defined by language specialists in order to facilitate the comprehension of existing texts, or to help us to learn a foreign language, the purpose for a Chomskyan linguist is to find out the rules we follow as native speakers without even being conscious of them. In traditional linguistics, classes such as nouns, tenses or persons were functional concepts within the frame of a theory. According to this new agenda, we are born with the ability to follow the rules without ever having acquired them. Chomskyan linguistics therefore revolutionizes the status of linguistic rules. Rather than being instruments for language study, they now turn out to be the metaphysically
45 See N. Chomsky, Aspects of the Theory of Syntax, Cambridge: The MIT Press, 1965.
actual essence of language. Moreover, Chomsky criticized corpus linguistics, asserting that “My judgment, if you like, is that we learn more about language by following the standard method of the sciences. The standard method of the sciences is not to accumulate huge masses of unanalyzed data and to try to draw some generalization from them.”46 Chomskyan linguistics, in turn, is often criticized within corpus linguistics. For instance, Carter declares that it demonstrates “no interest in language beyond the level of the sentence, there is no recognition that authentic data is of any significance and there is no acceptance that studies of large corpora or real language in use play any part in descriptive theories of language. Most significantly, too, there is a clear sense that the analysis of meaning is not a primary purpose.”47 Sinclair also comments on introspective linguistics by making reference to science, remarking that “One does not study all of botany by making artificial flowers.”48 Thus Noam Chomsky has rejected the corpus as the main source of our linguistic knowledge. Language, he says, is productive. With limited instruments, a restricted vocabulary and a manageable collection of rules, our language ability allows us to generate a countless number of expressions. Corpus study, he maintains, will only inform us about what people have said until now. It will not inform us about what people are going to say in the
46 N. Chomsky (interviewed by Andor, Jozsef), “The master and his performance: An interview with Noam Chomsky”, Intercultural Pragmatics, 1(1): 93–111, p. 97.
47 R. Carter, Introduction to J. Sinclair and R. Carter (ed.), Trust the text: Language, corpus and discourse, 1–6, London: Routledge, 2004, p. 2.
48 J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, p. 6.
future. That is undoubtedly true. Corpus linguistics cannot forecast language variation. Generative linguists, however, are not much interested in semantic change; they are concerned with grammar. But obviously, grammar also varies over time. What they mean by the “generative force of grammar” is that by employing the very same grammar, we can generate an endless set of utterances. From this perspective, corpus linguistics is not particularly helpful in the study of the grammar of a language whose rules have previously been ‘established’ (but are these ‘established’ rules always adequate?). It can, however, give us more information on the meaning of words than traditional or Chomskyan linguistics: it derives from the discourse all that we can learn about meaning. Corpus linguistics focuses on the word as the core unit of language, and for that reason it is an essential tool in lexicography, but also in other fields of study related to language such as grammar studies, language learning and teaching, and translation studies. In the following pages we will focus on some positive outcomes which the use of corpora may bring in a number of different areas of language study. We will concentrate on why corpus data and techniques are important to these areas, and how they contribute to the advancement of knowledge in each. Considering the great amount of corpus-based linguistic studies, the examples are inevitably selective and mainly reflect the topic of this work.49
1.4 Corpora and lexicography
Corpora are nowadays the main instrument for lexicographers in the compilation of dictionaries and grammar books. O’Keeffe et al. note that dictionary entries are now based “on actual use to facilitate the monitoring of language trends and usage changes”,50 while, as far as grammar books are concerned, learner corpora provide useful information about typical lexical and grammatical mistakes made by learners.51 It is important to remark that, since lexicographic studies put emphasis on authentic data, examples from a corpus may be particularly helpful, as they have been used in real communication and can therefore convey nuances of meaning overlooked in invented examples. Through the exploitation of corpora, lexicographers also try to obtain information on the frequency of usage of the various word senses
For additional examples consult K. Aijmer, B. Altenberg, (eds), English Corpus Linguistics: Studies in Honour of Jan Svartvik, London: Longman, 1991; N. Oostdijk, P. de Haan, (eds), Corpus Based Research into Language, Amsterdam: Rodopi, 1994; M. Kytö, M. Rissanen, and S. Wright, (eds), Corpora across the Centuries, Amsterdam: Rodopi, 1994; M. Rissanen, M. Kytö, and M. Palander-Collin, (eds), Early English in the Computer Age, Berlin: Mouton de Gruyter, 1993.
A. O'Keeffe, M. McCarthy, R. Carter, From Corpus to Classroom: Language Use and Language Teaching, Cambridge: Cambridge University Press, 2007, p. 17. 51
The most innovative work in this field was the Collins Birmingham University International Language Database (COBUILD) project, which was set up at the University of Birmingham in 1980 under the supervision of John Sinclair.
describing a dictionary lemma, information on the lexical and grammatical co-text which recurrently surrounds a given word, and information on variation. Words and their senses may be used differently in different registers (fiction, academic prose, conversation, and so on), and depending on its specific context of use the same word can acquire different senses, for instance metaphorical or literal. The notion of variation is also linked to the different uses of words and their senses in different regional varieties of a language (diatopic variation). In this case a specific word may have different evaluative uses: in British English, for example, the adjective homely means ‘cozy, comfortable’ and so conveys a positive evaluation, while in American English the same adjective means ‘unattractive’ and is associated with negative evaluations.
1.5 Corpus techniques of investigation
There are two main techniques that may be used to retrieve data from a corpus: concordances and word frequency lists (or ‘word lists’). Concordancing is an important tool in corpus linguistics; it consists in finding the repeated occurrences of a given word or phrase in its surrounding co-text, using standard corpus software (for example WordSmith Tools or MonoConc Pro). The analysis of a word’s co-text gives us important information about the meaning of the word, the sense in which a polysemous word is used, its metaphorical or literal uses, its evaluative uses, its lexical and syntactic associations, and its morphological productivity, which
means its potential to derive new words. Concordance lines are usually scanned vertically or from the centre outwards in both directions, so that we can examine the lexical and grammatical features which occur before or after the node word, that is, the word at the centre of our analysis. The other common corpus technique is the computation of word frequency lists, or wordlists, in which all the words are set in order of frequency. We can have frequency lists of individual words, of sequences of words, i.e. chunks or cluster lists (associated words or combinations of words, e.g. I mean, the other, etc.), and of words resulting from the comparison of two corpora. This function allows us to identify the most frequent words in a text or a collection of texts and/or the unusually frequent words, the key words,52 a specialised vocabulary which allows us to characterise a text or a genre.
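The two techniques described in this section can be illustrated with a minimal sketch in Python. It is not the procedure implemented by WordSmith Tools or MonoConc Pro: the tokenizer is deliberately crude, the function names (tokenize, kwic, word_list, cluster_list) are hypothetical, and the sample text is invented purely for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens, node, span=4):
    """Return one concordance line (left co-text, node, right co-text) per occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, tok, right))
    return lines

def word_list(tokens):
    """Return (word, frequency) pairs in descending order of frequency."""
    return Counter(tokens).most_common()

def cluster_list(tokens, n=2):
    """Return (n-word sequence, frequency) pairs in descending order of frequency."""
    return Counter(zip(*(tokens[i:] for i in range(n)))).most_common()

# Invented toy text, for illustration only.
sample = ("The woman spoke to the press. The man spoke to the woman. "
          "The press praised the woman.")
tokens = tokenize(sample)

# Concordance lines aligned on the node word, as in a KWIC display.
for left, node, right in kwic(tokens, "woman", span=3):
    print(f"{left:>20}  [{node}]  {right}")

print(word_list(tokens)[:3])      # the three most frequent word-forms
print(cluster_list(tokens, 2)[:3])  # the three most frequent two-word clusters
```

Aligning the node word in a central column is what makes vertical scanning of the left and right co-text possible. The word list and the cluster list differ only in the unit being counted, a single word-form or a sequence of n word-forms.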
1.6 Lexico-grammatical profiles
An important method of corpus study consists in creating a ‘lexico-grammatical profile’ of a word and its context of use by scanning concordance lines. A lexico-grammatical profile describes the ways in which we use words in real communication and their typical co-texts. Moreover, it gives us information on patterns of form and content, knowledge of which is exploited in lexicographic and phraseological studies to provide information concerning the most typical senses of a word and its/their
M. Scott, WordSmith Tools version 3, Oxford: Oxford University Press, 1999.
associations with other lexical words (collocation), grammatical words (colligation), semantic sets (semantic preference), evaluative aspects of meaning (semantic prosody), chunks and idioms. The phenomenon in which a word is associated with other content words is called collocation, and concerns the relationship between the node word and specific word-forms that co-occur regularly with it.53 Firth maintains that the meaning of a word depends on how it combines with other words in actual use rather than on the meaning it possesses in itself.54 Moreover, since collocations result from recurrent combinations used by speakers, they are necessarily probabilistic events. According to a ‘textual definition’ provided by Sinclair, collocation is “the occurrence of two or more words within a short span of each other in a text”.55 Hoey suggested instead a ‘statistical definition’, according to which collocation is “the relationship a lexical item has with items that appear with greater than random probability in its (textual) context”.56 A further, psychological definition was developed by Leech, who introduced the notion of collocative meaning, which consists in “the associations a word acquires on account of the meanings of words which tend to occur in its environment”.57 Colligation denotes the relationship between the node word and
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, and Trust the Text: Language, Corpus and Discourse, London: Routledge, 2004. 54
J. R. Firth, 1935. "The Technique of Semantics", Transactions of the Philological Society, 36-72.
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, p. 170.
M. Hoey, Patterns of lexis in text, Oxford: Oxford University Press, 1991, p. 7.
G. N. Leech, Semantics, London: Penguin, 1974, p. 20.
grammatical categories which co-occur frequently with it.58 Colligation has to do with word-classes which are closed and contain a limited number of components (e.g. English quantifiers) and with syntactic patterns which restrict the uses of the word (e.g. prepositions, tenses, clause positions, etc.).59 The notion of semantic preference refers to the phenomenon in which there exists some kind of semantic association between a node word and a set of collocates sharing a common aspect of meaning. In other words, the collocates of a given node are semantically related in some respect. This makes reference to the concept of lexical field: a class of words which share some semantic feature. It is a matter of similarity of meaning. A semantic preference profile (obtained through the study of concordance lines) for a list of near-synonyms, for example, will reveal useful information about the context in which each near-synonym is typically used. Semantic prosody is a term first used by Louw60 to mean that words tend to occur in positive or negative environments; in other words, they have typical collocates that carry positive or negative meanings. Stubbs, for example, shows that more than 90 per cent of the words collocating with the word cause are negative, e.g. accident, cancer, commotion, crisis and
J. Sinclair, Trust the Text: Language, Corpus and Discourse, London: Routledge, 2004.
A. O'Keeffe, M. McCarthy, R. Carter, From Corpus to Classroom: Language Use and Language Teaching, Cambridge: Cambridge University Press, 2007, p. 14. 60
B. Louw, Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies, in Baker, M., Francis, G. & Tognini-Bonelli, E. (eds) "Text and Technology", Philadelphia/Amsterdam: John Benjamins, 1993.
delay.61 The association between a node word and a series of words can cause a certain evaluation or connotation. Partington62 distinguishes among three notions of ‘connotation’: social situational connotation, for which a word may be indicative of the social class of a speaker or the register of a text; cultural connotation, according to which a culture may add some attributes to the denotation of a word, and expressive connotation which is related to the fact that the choice of using a certain lexical item implies a favourable or an unfavourable evaluation by the speaker towards what is being described. Semantic prosody has a ‘pragmatic function’ as it describes the speaker’s evaluative attitude and expresses the speaker’s reason for making certain utterances (his/her communicative purpose), and it also has a ‘discourse or textual function’ because it allows us to identify the function of a lexical item in the discourse. The node word may also be part of idioms and/or chunks which are fixed and semi-fixed expressions, so lexico-grammatical profiles may be useful to provide information concerning these associations. Biber et al. use the expression ‘lexical bundles’, instead of ‘chunks’, to refer to recurrent strings of words; in conversation, “Do you want me to”
M. Stubbs, “Collocations and semantic profiles: on the cause of the trouble with quantitative methods”, Function of Language, 2/1: 1-33, 1995. 62
A. Partington, Patterns and Meanings, Amsterdam: John Benjamins, 1998, p. 65-66.
and “I don’t know what” are amongst the most frequent lexical bundles.63 It is essential to remark that lexical bundles are different from idioms. The meaning of an idiom cannot be derived from the meanings of its parts, whereas the meaning of a lexical bundle can. In addition, lexical bundles are not complete phrases. The use of corpora brings some positive outcomes for lexicographic studies: it allows a deeper understanding of word meanings and uses in real contexts and in relation to language sub-varieties, and it affords a more detailed description of the lexico-grammatical and semantic relationships at work in a language.
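The textual and statistical definitions of collocation discussed in this section can be made concrete with a short Python sketch. It counts the word-forms found within a fixed span of a node word (Sinclair's textual definition) and then compares the observed co-occurrence frequency with the frequency expected by chance (in the spirit of Hoey's statistical definition). The mutual information score used here is only one of several association measures and is an assumption of this sketch, not a measure named in the sources cited above. The function names and the sample text are likewise hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

def collocates(tokens, node, span=4):
    """Count the word-forms occurring within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

def mi_score(tokens, node, collocate, span=4):
    """Mutual information: log2 of observed over expected co-occurrence frequency.

    The expected frequency assumes that words are distributed at random,
    a simplification adopted for this sketch.
    """
    freq = Counter(tokens)
    observed = collocates(tokens, node, span)[collocate]
    if observed == 0:
        return float("-inf")
    expected = freq[node] * freq[collocate] * 2 * span / len(tokens)
    return math.log2(observed / expected)

# Invented toy text, for illustration only.
sample = ("the delay was caused by the accident and the crisis was caused "
          "by the delay while the man read the paper and the woman read the book")
tokens = re.findall(r"[a-z']+", sample.lower())

# List the collocates of the node word with their raw frequency and MI score.
for word, freq in collocates(tokens, "caused", span=3).most_common():
    print(word, freq, round(mi_score(tokens, "caused", word, span=3), 2))
```

In this toy text the article the is the most frequent collocate of caused in raw terms, but its score is lower than that of delay, because the is frequent everywhere. This is the practical difference between counting co-occurrences and asking whether they exceed random probability.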
1.7 Corpora and grammar studies
Corpus investigation provides an interface between lexis and grammar and a description of particular language phenomena. Susan Conrad64 explains that grammatical choices are associated with vocabulary, grammatical co-text, discourse-level factors and the context of the situation. With regard to vocabulary, she mentions three types of lexico-grammatical relationship. The first type concerns the lexical items which tend to occur with particular grammatical structures, for instance the verbs that are most common with that-clause objects, such as show and suggest. The second type of lexico-grammatical relationship has to do with words that occur as realisations of grammatical functions. A typical example may be found in the form of tenses: in the formation of the English simple present tense, the verb is inflected for the third person singular by
D. Biber, S. Johansson, G. Leech, S. Conrad, and E. Finegan, Longman Grammar of Spoken and Written English, Longman, 1999, p. 990-994. 64
S. Conrad, “What can a corpus tell us about grammar?”, in M. McCarthy & A. O’Keeffe (Eds.), Routledge Handbook of Corpus Linguistics, Abingdon: Routledge, 2010, pp. 227-240, p. 229.
adding the suffix -s, while the past tense is formed with -ed for regular verbs, etc. The third type of lexico-grammatical relationship involves words that occur in positive or negative circumstances, so it is related to the notion of semantic prosody. O’Keeffe et al.65 offer a comprehensive analysis of the get-passive (e.g. he got arrested). They demonstrate that the get-passive is often used to express unfortunate situations, in lexico-grammatical association with verbs such as killed, arrested, beaten, criticized, burgled, and so on. Grammatical choices are also associated with a certain grammatical co-text, which means that a particular grammatical feature tends to occur with specific other grammatical features. Grammatical descriptions in traditional textbooks always present would-clauses as adjacent to an if-clause. Interestingly, Frazier found that almost 80 per cent of would-clauses are not adjacent to an if-clause but often occur with infinitives and gerunds, among other features described in the study.66 The use of a particular grammatical feature throughout a text may construct and manipulate our understanding of the text on a discourse level. Burges67 conducted a study on the way in which writers address their audience in memos that are directed to groups of superiors, inferiors or those of equal hierarchical standing in institutions. She found that the choices
A. O'Keeffe, M. McCarthy, R. Carter, From Corpus to Classroom: Language Use and Language Teaching, Cambridge: Cambridge University Press, 2007, pp. 106-114. 66
S. Frazier, “A Corpus Analysis of Would-clauses Without Adjacent If-clauses”, TESOL Quarterly 37(3): 443-66, 2003. 67
J. Burges, “Hierarchical Influences on Language Use in Memos”, unpublished doctoral dissertation, Northern Arizona University, 1996.
that writers make between nouns and pronouns and their level of importance – in thematic or rhematic position – influence the reader’s perception of the writer’s authority. Similarly, Biber et al.68 examine the choice of verb tense and voice in science research articles, finding that the alternation of verbs between active and passive voice, or between past and present, corresponds to transitions that are of considerable rhetorical interest. Finally, grammatical choices are also connected to the context of the situation. The use and the frequency of a certain grammatical feature contribute to characterizing registers (also called genres in the corpus linguistics literature), which are varieties associated with specific circumstances of use and communicative purposes, and often identified by a particular name, such as academic prose, conversation, newspaper writing, fiction writing, etc. In particular, the grammatical features used in academic discourse have received considerable attention. For example, Louwerse et al.69 examined the use of conditional verbs in lectures at an American university, while Biber70 compared several grammatical features across ten spoken and written registers from four American universities in the Spoken and Written Academic Language corpus.
D. Biber, S. Johansson, G. Leech, S. Conrad, and E. Finegan, Longman Grammar of Spoken and Written English, Longman, 1999. 69
M. Louwerse, S. Crossley, P. Jeuniaux, “What if? Conditionals in Educational Registers”, Linguistics and Education 19(1): 56-69, 2008. 70
D. Biber, University Language: A Corpus-based Study of Spoken and Written Registers, Amsterdam: John Benjamins, 2006.
1.8 Lexicogrammar
Another field of study that corpus linguistics addresses is lexicogrammar. Lexicogrammar is Sinclair’s suggestion that there is no distinction between lexis and grammar, or that lexis and grammar are so closely intertwined that they cannot be efficiently studied independently.71 An example of the concept of lexicogrammar involves certain words (lexicon) connected to certain verb tenses (grammar): a study conducted by Biber et al. shows that certain verbs such as know, matter, and suppose occur more than 80 percent of the time in the present tense, whereas smile, reply, and pause occur more than 80 percent of the time in the past tense. 72 They also observe that some verbs are habitually used with certain clause types: know and think are related to that-complement clauses,73 whilst the verbs like, want, and seem are linked to to-complement clauses. 74 According to Sinclair75 the meaning of a text can be described by a model which reconciles two contrasting principles: the open-choice principle and the idiom principle. The open-choice principle (also known as the ‘slot-and-filler’ model) considers “language text as the result of a very large number of complex choices.”76 Grammars which suppose that the slots in a sentence are more or less arbitrarily filled by words (but making sure
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991.
D. Biber, S. Johansson, G. Leech, S. Conrad, and E. Finegan, Longman Grammar of Spoken and Written English, Longman, 1999, p. 459. 73
Ibidem, p. 661.
Ibidem, p. 699.
J. Sinclair, Lexical Grammar, Naujoji Metodologija, 24: 191-203, 2000.
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, p. 109 and (ed.) Looking Up: An Account of the COBUILD Project in Lexical Computing, London: Collins, 1987, p. 320.
that the effect is grammatical) are founded on this principle. The open-choice principle is related to the ‘terminological tendency’ of language, “which is the tendency for a word to have a fixed meaning in reference to the world.”77 Nonetheless, given that “words do not occur at random in a text, and […] the open-choice principle does not provide for substantial enough restraints for consecutive choices”,78 a second principle is needed that provides for further restraints: the idiom principle. According to this principle, “a language user has available to him or her a large number of semi pre-constructed phrases that constitute single choices, even though they might appear to be analyzable into segments.”79 The idiom principle is related to the ‘phraseological tendency’ of language, that is, to the fact that words do not appear isolated but tend to “make meanings by their combinations.”80 This may be easily demonstrated by the expression of course (one of the examples Sinclair uses), which actually operates as a single word (it behaves in the same way as one-word adverbials like sure, perhaps, or maybe), even if it can be perceived as the apparently simultaneous choice of two words, and the components of which (of and course) are “not the preposition of that is found in grammar books” and “not the countable noun that dictionaries mention”81 but take on meaning in the phrase.
J. Sinclair, Trust the Text: Language, Corpus and Discourse, London: Routledge, 2004, p. 29.
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, p. 110.
Ibidem, p. 110.
J. Sinclair, Trust the Text: Language, Corpus and Discourse, London: Routledge, 2004, p. 29.
J. Sinclair, Corpus, concordance, collocation, Oxford: Oxford University Press, 1991, p. 111.
On these assumptions, Sinclair also maintains that complete freedom of choice of a single word is rare, and that the initial choice of a word or a phrase is actually the fixed element of a selection that entails several words. Put differently, words acquire particular meanings or functions by virtue of being co-selected with other words. In view of that, he suggests the notion of extended units of meaning82, according to which words are not isolated items but are connected to their co-text at various levels (collocation, colligation, semantic preference, and semantic prosody, as mentioned previously).
1.9 Corpora and language learning and teaching
The growing use of corpora in language learning and teaching (LLT) is closely related to the widespread availability of fast personal computers and ever more sophisticated software and, in large part, to the fact that corpora may offer “the basis for more accurate and reliable descriptions of how languages are structured and used.”83 As Zhang observes, corpora have the benefit of “providing large databases of naturally-occurring discourse so that analyses can be based on real structures and patterns of use rather than perceptions and intuitions.”84 The most directly useful feature of corpus investigation is lexical analysis, which reveals to teachers which terms and semi pre-constructed clauses
J. Sinclair, Trust the Text: Language, Corpus and Discourse, London: Routledge, 2004, p. 24.
G. Kennedy, An Introduction to Corpus Linguistics, London: Longman, 1998, p. 88.
W. Zhang, Corpus Studies: Their Implications for ELT, IATEFL Issues (152), 2000, p. 9–10.
students are most likely to learn and therefore most important to teach.85 Mindt reflects on the advantage of using corpora in materials planning: “Corpus-based decisions on foreign language teaching syllabuses could help considerably to bring textbooks for teaching English as a foreign language into closer correspondence with actual English.”86 According to Higgins, “the most valuable contribution a computer can make to language learning is in supplying…masses of authentic data….The most powerful of these tools is the concordance.”87 Through the use of concordance software, words can be displayed in context. This display is commonly called the KWIC (keyword in context) display. With a sufficient number of hits (lines), the text observable in a KWIC display may enable us to derive the meaning of the key word or phrase88 and shows grammatical forms.89 These tools may be helpful for language learning, but in particular if students
are obtaining answers by means
F. Meunier, “Computer tools for the analysis of learner corpora”, in S. Granger, (Ed.), Learner English on Computer, pp. 19-37, London: Longman, 1998, p. 28. 86
D. Mindt, “English corpus linguistics and the foreign language teaching syllabus”, in J. Thomas & M. Short (Eds.), Using Corpora for Language Research, pp. 232-247, London: Longman, 1996, p. 247. 87
J. Higgins, “Fuel for learning: The neglected element of textbooks in CALL”, CAELL Journal, 2(2), 3-7, 1991, p. 6. 88
G. Aston, & L. Burnard, The BNC Handbook: Exploring the British National Corpus with SARA, Edinburgh: Edinburgh University Press, 1998, p.7. 89
J. Flowerdew, “Concordancing as a tool in course design”, System, 21, 321-344, 1993, p. 338.
L. Gavioli, “Exploring texts through the concordancer: Guiding the learner”, in A. Wichmann & S. Fligelstone & T. McEnery & G. Knowles (Eds.), Teaching and Language Corpora (pp. 83-99). London: Longman, 1997, p. 87.
Together with KWIC data, another useful application of concordances is obtaining collocational information. Given that “collocational knowledge is one of the things which contribute to the difference between native speakers and second language learners”91, it follows that this can be a focal point for teaching and learning. As Lewis observes: “Learners need to notice words with the words with which they naturally occur. They need guidance on what can be generalized, and…the all important gaps…wage war but not *wage conflict, *wage battle…”.92 Moreover, Lewis remarks that we can identify lexical collocations, associations of open-class words that are recurrently found together: fire escape, radio station, examine thoroughly, and the like. We may also identify grammatical collocations, which associate an open-class (lexical) word with a closed-class (grammatical) word, like alert to and keen on. Collocations are frequently idiomatic, like break the silence (rather than interrupt or explode the silence), their familiarity often concealing the fact that they are idiomatic.93 Undoubtedly, if students could learn the collocations connected to a specific topic, their speed in language processing and production would move toward more native-like fluency.94 In addition to these and many other contributions of corpus research
C. Shei, Collocation, Learner Corpus, Language Teaching. Available: http://www.dai.ed.ac.uk/homes/shei/collocation/top.html, 2000, November 21. 92
M. Lewis, (Ed.), Teaching collocation: Further developments in the lexical Approach, Hove, England: Language Teaching Publications, 2000, p. 133. 93
M. Lewis, (Ed.), Teaching collocation: Further developments in the lexical Approach, Hove, England: Language Teaching Publications, 2000, pp. 133-5. 94
G. Aston, “Corpora in language pedagogy: Matching theory and practice”, in G. Cook & B. Seidlhofer (Eds.), Principle and practice in applied linguistics, pp. 257-270, Oxford: Oxford University Press, p. 261.
to the comprehension of naturally occurring language, corpus research may play an important part in LLT evaluation as well. In fact, even though still in the early phases of development, corpora may be exploited for test design, test creation, and in the scoring process.95 Shei96 maintains that teachers ought to become familiar with findings from both large native-speaker corpora (reference corpora) and target learner corpora (to display frequent mistakes). Not unexpectedly, the greater part of research on computer learner corpora (CLC) deals with written corpora. But with the recent emphasis on oral communication as the main indicator of general language skill, and even intelligence, efforts to collect spoken learner data are increasing.97 In all probability the best known learner corpus is the International Corpus of Learner English (ICLE). 98 The aim of this corpus is to allow the systematic comparison of learners of English of 14 nationalities. It comprises academic compositions, mostly from students in their third year, that are roughly 500 words long and contribute to analyses of coherence and cohesion.99 In encouraging a substantial use of corpora in the classroom, Franca observes that working with corpora of spoken learner data allows students
J. C. Alderson, “Do corpora have a role in language assessment?” in J. Thomas & M. Short (Eds.), Using Corpora for Language Research, pp. 248-59, London: Longman, 1996, p. 253. 96
C. Shei, Collocation, Learner Corpus, Language Teaching. Available: http://www.dai.ed.ac.uk/homes/shei/collocation/top.html, 2000, November 21. 97
D. Nunan, Second Language Teaching and Learning, Boston: Heinle & Heinle, 1999.
ICLE, available: http://www.uclouvain.be/en-cecl-icle.html.
S. Granger, “The computer learner corpus: a versatile new source of data for SLA research”, in S. Granger (Ed.), Learner English on Computer, pp. 3-18, London: Longman, 1998, p. 10.
to: become analytically conscious of their own production; become aware of items present in conversational communication during problem-solving exercises; achieve greater competency in more genuine oral interactions and at the same time keep control over the elements which help support conversational interaction; have realistic objectives in terms of their own interlanguage production, contrasting their production with that of more competent non-native L2 speakers.100 Laying emphasis on error samples may increase students’ understanding of the difference between their own incorrect speech patterns and correct patterns, moving them closer to native-like speech. So, the use of corpora in the area of language learning and teaching brings some benefits: corpora provide a key to a better understanding of certain linguistic phenomena, and their use helps students improve their fluency.
1.10 Language and translation
Nowadays corpora are exploited both in the daily practice of translation and in translation research. The availability of corpora has not gone unnoticed by technically expert translators. The most useful corpus
Adapted from V. B. Franca, “Using student-produced corpora in the L2 classroom”, in P. Grundy (Ed.), IATEFL 1999 Edinburgh Conference Selections, pp. 116-117, Whitstable: IATEFL, 1999, p. 116.
type, often intended explicitly for translators, is certainly the parallel corpus, which makes each text segment available in two or more languages. By providing translations of segments rather than equivalents of words, a parallel corpus moves the translator’s interest from a lexical item to an item of meaning. It is impossible to analyze an expression without its context, and authentic examples of previous translation solutions may offer a broader insight into pragmatic equivalence than conventional dictionaries. However, some dictionaries do supply collocational context and thereby try to better fulfill the needs of translators. Aston101 observes that monolingual and comparable corpora are also important tools in translation practice, the first consisting of texts in a particular language, which might be either the source or the target language of a certain translation, the second comprising texts of the same genre in different languages or in different varieties of the same language. Furthermore, “where monolingual corpora of similar design are available for two or more languages, they may be treated as components of a single comparable corpus”.102 A critical aspect of corpora is the imperfection of language use. Any set of authentic texts may include typos (typographical errors), passages of imperfect style in the originals or translations, and obviously translation solutions that run the gamut from perfect to inaccurate or just incorrect. For that reason, a corpus user has to find a way to rationally evaluate the
G. Aston, Corpus use and http://home2.sslmit.unibo.it/~guy/textus.htm, 1999.
G. Aston, Corpus use and http://home2.sslmit.unibo.it/~guy/textus.htm, 1999.
solutions offered by a corpus and consider them according to its type and contents. Unlike the dictionary, a corpus allows the user to derive how an expression is used from the data. This usually calls for more detailed processing than consulting a dictionary does, thereby increasing the possibility of learning. In other words, by drawing attention to the diverse ways expressions are normally used and with what frequencies, corpora can make learners more aware of issues related to word choice, register, and frequency, which are scantily documented by other instruments.103 For a translator, a corpus is an important source of linguistic information, both on the lexical level, when looking for translation correspondences, and on other levels, when trying to produce an effective translation. As their prime source of lexical information, the majority of translators still rely on dictionaries. Bowker and Pearson104 mention five critical points concerning dictionaries, for which corpora may offer a solution. One of the major problems connected to dictionaries is their incompleteness. It takes a long time to compile and publish a dictionary, so in numerous cases dictionaries do not mirror the current state of knowledge or language. Another problem with dictionaries is related to their size, particularly as most dictionaries are still compiled for printed
L. Bowker, and J. Pearson, Working with specialized language: A practical guide to using corpora. London: Routledge, 2002, p. 15.
editions. Big, multi-volume dictionaries are able to cover a specific area in its totality, but not many people are able to afford such dictionaries. Moreover, lexicographers have to make a selection about which information to include or to omit, and their selections do not always satisfy the requests of language users or translators. Other common criticisms of dictionaries concern their lack of contextual or usage information together with the lack of frequency information. The choice of a lexical equivalent for a translator could be easier if they have information about the domain-particular usage patterns, comprising the frequency of lexical elements. Finally, even if dictionaries contain the appropriate information, users might have problems finding it, as for instance the entries of some terms can be listed as acronyms or expanded terms (e.g. Random Access Memory for RAM or Central Processing Unit for CPU).
1.11 Pre-electronic corpora
Corpus-based research is often supposed to have started in the early 1960s with the advent of electronic, machine-readable corpora. However, before then there was a significant tradition of corpus-based linguistic research in the major fields of study (biblical studies, grammar, lexicography, dialect, etc.).105 Pre-electronic corpora were developed before the computer era; they were made up of texts that were used as the basis of a specific project,
G. Kennedy, An Introduction to Corpus Linguistics, London: Longman, 1998, p. 13.
and had to be examined by way of time-consuming and boring manual analysis.
1.11.1 Biblical concordances
Kennedy observes that biblical concordances are “the first significant pieces of corpus-based research with linguistic associations …”.106 Of these concordances, Cruden’s is the most remarkable and comprehensive. In the 18th century, Alexander Cruden compiled a concordance of the King James Version of the Bible. It contains about 2,370,000 words and, although it was painstaking work done with pen and paper, Cruden completed it in an unexpectedly short time, only two years. Cruden’s concordance is longer than the Bible itself because he included entries not just for single words but for several collocations too. Moreover, Cruden did not lemmatize the entries; on the contrary, he inserted a distinct entry for every form of a word. For each entry, he recorded its position in the Bible together with a few words preceding and following it. Cruden’s work was not directly motivated by an interest in language phenomena; its purpose was to give people easy access to the biblical text.
106 Ibidem, p. 13.
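The layout of an entry in a concordance of this kind (one entry per word form, without lemmatization, each occurrence recorded with its position and a few neighbouring words) can be illustrated with a short Python sketch. It is not a reconstruction of Cruden's actual procedure: the function name build_concordance, the window parameter and the choice of sample verse are assumptions made only for the purpose of the example.

```python
import re
from collections import defaultdict


def build_concordance(text, window=3):
    """Map each word form to a list of (position, left context, right context)."""
    # Lower-case the text and split it into word tokens; forms are NOT lemmatized,
    # so "create" and "created" would receive separate entries.
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    concordance = defaultdict(list)
    for i, token in enumerate(tokens):
        left = " ".join(tokens[max(0, i - window):i])
        right = " ".join(tokens[i + 1:i + 1 + window])
        concordance[token].append((i, left, right))
    return concordance


# Sample text: the opening of Genesis in the King James Version.
sample = ("In the beginning God created the heaven and the earth. "
          "And the earth was without form, and void.")
concordance = build_concordance(sample, window=2)

# Print every occurrence of one entry with its position and context.
for position, left, right in concordance["earth"]:
    print(f"{position:>3}  {left:>15} [earth] {right}")
```

Modern concordancers produce the same kind of keyword-in-context display automatically, which is what makes the two years Cruden needed for the manual version so striking.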
1.11.2 Early grammars
Several of the first grammars were based on corpus examinations: Ungrammatical Words was compiled in the 1st century by the Greek scholar Aristonicus of Alexandria to examine irregular grammatical structures in the corpus of Homer.107 Early English grammars were influenced by the classical tradition. This is especially noticeable in many 18th-century English grammars, for instance in Robert Lowth’s 1762 A Short Introduction to English Grammar. Lowth had a precise aim in writing his grammar: “The principal design of a Grammar of any Language is to teach us to express ourselves with propriety in that Language, and to be able to judge of every phrase and form of construction, whether it be right or not. The plain way of doing this, is to lay down rules, and to illustrate them by examples. But besides shewing what is right, the matter may be further explained by pointing out what is wrong.”108 What is particularly significant here is that Lowth based his work on examples taken from well-known writers of English. However, while Lowth used corpus data to confirm his very personal theories about English usage and grammar, later linguists and grammarians used such data as the source of their linguistic explanations. This tendency is particularly evident in the descriptively oriented grammars of English compiled in the late 19th and early to mid 20th centuries by scholars such as Otto Jespersen and Charles Fries. However, during this period not all researchers based their reflections on examples obtained from a corpus. For example, in A New English Grammar, Henry Sweet109 used only made-up examples to illustrate the grammatical categories under discussion, while Otto Jespersen based his A Modern English Grammar on Historical Principles on examples drawn from a wide collection of written English.110 Like many other linguists of this era,111 Jespersen was convinced that linguistic description ought to rely on genuine rather than invented examples. Jespersen remarks: “With regard to my quotations, which I have collected during many years of both systematic and desultory reading, I think that they will be found in many ways more satisfactory than even the best made-up examples, for instance those in Sweet’s chapters on syntax. Whenever it was feasible, I selected sentences that gave a striking, and at the same time natural, expression to some characteristic thought; but it is evident that at times I was obliged to quote sentences that presented no special interest apart from their grammatical peculiarities”.112 Jespersen’s corpus is vast and embraces a variety of written genres: narrative, poetry, science, and politics.
107 A. Lüdeling, M. Kytö, Corpus Linguistics, Walter de Gruyter, 2009.
108 R. Lowth, A Short Introduction to English Grammar, 1762, p. XI, reprinted in: Alston, R. C. (ed.), English Linguistics 1500-1800 18, Menston: Scolar Press, 1967.
109 H. Sweet, A New English Grammar, Oxford: Oxford University Press, 1891-1898.
110 O. Jespersen, A Modern English Grammar on Historical Principles, London: George Allen and Unwin LTD, 1909-1949.
111 Including the neogrammarian Eduard Sievers (1850-1932), cf. An Old English Grammar, translated and edited by Albert S. Cook, Boston: Ginn & Company, 1885.
112 O. Jespersen, A Modern English Grammar on Historical Principles, London: George Allen and Unwin LTD, 1909-1949, p. VI.
In contrast to Lowth’s, Jespersen’s analysis does not convey rigid prejudices about how English must be spoken and written. Rather, the data in his corpus are an instrument for describing what the language is really like. This approach is fully realized in the grammatical descriptions provided by Charles Carpenter Fries in The Structure of English.113 While his predecessors based their investigations entirely on written texts, Fries was the first to use spoken texts as the sole source of information for his grammar, and to employ frequency information obtained from this corpus to distinguish usual from unusual patterns of language use. Fries collected a 250,000-word corpus of transcribed conversations among American speakers living in the northern and central United States. Contemporary corpus linguists’ criticisms of intuition-based accounts of language are rooted in Fries’ conviction that linguistic analysis must rely on naturally occurring samples.
113 C. C. Fries, The Structure of English, New York: Harcourt Brace, 1952.
1.11.3 Early dictionaries
Corpora have a long tradition in lexicography, principally because they offer the twofold benefit of helping lexicographers establish the sense of a word from the context in which it appears and then exemplifying that sense in the dictionary entry. Samuel Johnson is considered the first lexicographer to have made extensive use of illustrative quotations, in his A Dictionary of the English Language (first published in 1755).114 As for the choice of texts from which to draw illustrative quotations, Johnson initially intended to consult only texts published up to the Restoration, as he believed that texts from this period were “the pure sources of genuine diction”, whilst those from later periods would include too many borrowings, evidence that the English language had started “gradually departing from its original Teutonick character, and deviating towards a Gallick structure and phraseology, ….”115 Johnson also did not plan to use texts by living authors. However, as Reddick notes, he included many authors, for instance Pope and Swift, whose works were published after the Restoration, and he used quotations from “some living authors, including (though usually attributed anonymously) passages from his own works.”116 Furthermore, since he had to cover technical terms as well, he was obliged to include quotations from authors not regarded, in Johnson’s words, “as masters of elegance or models of style.”117 The final result is that Johnson’s citations span a variety of genres, from poetry to history to horticulture.118
114 S. Johnson, A Dictionary of the English Language, T. Ewing, 1775.
115 S. Johnson, ‘Preface’ to A Dictionary of the English Language, T. Ewing, 1775.
116 A. Reddick, The Making of Johnson’s Dictionary, 1746-1773, Cambridge: Cambridge University Press, 1990, p. 33.
117 S. Johnson, ‘Preface’ to A Dictionary of the English Language, T. Ewing, 1775.
118 See A. Reddick, The Making of Johnson’s Dictionary, 1746-1773, Cambridge: Cambridge University Press, 1990, p. 33.
Johnson’s methodology influenced many subsequent dictionaries, in particular the Oxford English Dictionary (OED), the biggest dictionary ever published. The dictionary incorporated English words found in works published from 1250 to 1858. The OED is grounded entirely on written texts and disregards speech completely. Furthermore, as Landau remarks, “the core of citation files tend to be those of the educated and upper classes,”119 so these quotations are not representative of the language as a whole. The first edition of the OED, published in 1928, was based on four million quotation slips supplied by roughly 2,000 readers, who collected them from sources they were asked to read.120 Readers were given specific instructions on how to collect words for citation slips: “Make a quotation for every word that strikes you as rare, obsolete, old-fashioned, new, peculiar, or used in a peculiar way. Take special note of passages which show or imply that a word is either new and tentative, or needing explanation as obsolete or archaic, and which thus help to fix the date of its introduction or disuse. Make as many quotations as you can for ordinary words, especially when they are used significantly, and tend by the context to explain or suggest their own meaning.”121
119 S. I. Landau, Dictionaries: The Art and Craft of Lexicography, 2nd ed., Cambridge: Cambridge University Press, 2001, p. 207.
120 W. N. Francis, “Language Corpora B.C.”, in: J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin: Mouton de Gruyter, 1992, p. 21.
Johnson’s dictionary and the OED are undoubtedly two of the most noteworthy dictionary projects carried out between the 18th and 20th centuries. However, they were not the only dictionaries of their time based on pre-electronic corpora. In the United States, the second edition of Webster’s New International Dictionary appeared in 1934 and included 1,665,000 quotations gathered from books and periodicals not by “volunteers … [but by] full-time professional lexicographers working together to make sure that all significant sources were searched.”122 Other comprehensive dictionaries based on pre-electronic corpora were also published. Joseph Wright published a dialect dictionary of British English between 1898 and 1905.123 This dictionary was based on a corpus of 3,000 dialect terms and volumes from which examples were derived. A team of volunteers worked on it, collecting 1.5 million quotation slips for the first volume. Landau124 describes several pre-electronic corpora compiled in the early 20th century for “lexical study”.125 He singles out one work as particularly worth mentioning, since it is based on corpora created expressly for lexical study rather than on a collection of books or periodicals examined by groups of readers: an 18-million-word corpus was the source of Edward L. Thorndike and Irving Lorge’s The Teacher’s Word Book of 30,000 Words.126 Thorndike and Lorge’s work was “enormously influential for the teaching of English in many parts of the world over the next 30 years.”127
121 J. A. H. Murray, et al., “Historical Introduction”, The Oxford English Dictionary, p. XV, quoted from R. A. Wells, Dictionaries and the Authoritarian Tradition: A Study in English Usage and Lexicography, Berlin: Walter de Gruyter, 1973, p. 29.
122 W. N. Francis, “Language Corpora B.C.”, in: J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin: Mouton de Gruyter, 1992, p. 22.
123 J. Wright, The English Dialect Dictionary, 6 vols., Oxford: Clarendon Press, 1898-1905.
124 S. I. Landau, Dictionaries: The Art and Craft of Lexicography, 2nd ed., Cambridge: Cambridge University Press, 2001, pp. 273-275.
125 A five-million-word corpus provided the basis of Ernest Horn’s A Basic Writing Vocabulary: 10,000 Words Most Commonly Used in Writing. A five-million-word corpus served as the source of Michael West’s A General Service List of English Words, which consists of 2,000 words listed according to their overall frequency and the frequency of the individual meanings that they convey.
126 E. L. Thorndike, I. Lorge, The Teacher’s Word Book of 30,000 Words, New York: Teachers College, Columbia University, 1944.
127 G. Kennedy, An Introduction to Corpus Linguistics, London: Longman, 1998, p. 16, cf. also pp. 93-97.
1.11.4 The Survey of English Usage (SEU) corpus
Another important pre-electronic corpus is the Survey of English Usage (SEU) Corpus, whose compilation started in 1959 at the Survey of English Usage (University College London) under the supervision of Randolph Quirk. Quirk created the SEU Corpus because he wanted to challenge traditional grammatical accounts based on writing and not on speech. The SEU Corpus comprises various genres, which reflects the Firthian influence on British linguistics at the time. According to Firth, language use varies according to the situation.128 Although the SEU Corpus included a variety of written and spoken genres, the range of writers and speakers represented in it was limited to “educated professional men and women.”129 Moreover, the corpus contains conversations among mixed-gender groups as well as among only men or only women. The SEU Corpus includes roughly one million words. As Quirk maintained, a corpus of this size “will not present a complete picture of English usage or anything like it”,130 but the principles behind its development are very significant and are closely followed by those creating modern corpora. Although the laborious manual analyses associated with pre-electronic corpora are now regarded as impractical and unnecessary, these corpora propelled the development of corpus linguistics as a field of study. The various software programs now readily available for creating concordances would never have existed if people like Alexander Cruden had not conceived the idea of the concordance.
128 J. R. Firth, Papers in Linguistics 1934-1951, London: Oxford University Press, 1957.
129 R. Quirk, The Linguist and the English Language, London: Edward Arnold, 1974, p. 167.
130 Ibid. p. 170.
1.12 Electronic corpora
Electronic corpora are the mainstay of contemporary corpus linguistics. They are the result of the computer revolution, which began with the first computer corpora of the 1960s, such as the Brown Corpus,131 and continues to the present day. Many corpora have been set up since the Brown Corpus, the one-million-word corpus of American English traditionally regarded as the starting point of the corpus era. The late eighties and early nineties in particular saw a flourishing of initiatives, culminating in the first release of the British National Corpus in 1995,132 a 100-million-word corpus of spoken and written British English, carefully designed and richly annotated to represent the whole of contemporary British English. Other projects attempted to build resources that would remain up to date, namely ‘monitor’ corpora such as the huge Bank of English. Still others set up resources that could yield evidence about languages other than English, for example the CORIS corpus for Italian, the CREA corpus for Spanish and the Mannheim corpora for German,133 as well as about different varieties of English and of other languages, for example corpora of learner language,134 corpora of translator production,135 and corpora of English spoken and written in different geographical settings.136 Monolingual reference corpora of the general language were thus complemented by specialized corpora covering just one domain and/or genre, comparable corpora including texts that are comparable in all respects except the variable under analysis, and parallel corpora made up of two ‘versions’ of the same texts, typically originals and their translations. There are currently many different types of corpora, and the type used depends on the purpose of the research. Only the types of corpora relevant to the topic of this work are discussed here.
131 H. Kučera, W. N. Francis, Computational analysis of present-day English, Providence (RI): Brown University Press, 1967.
132 L. Burnard (ed.), Users reference guide for the British National Corpus (SGML version), Oxford: Oxford University Computing Services, 1995.
133 CORIS: http://dslo.unibo.it/coris_ita.html; http://corpora.ids-mannheim.de/ccdb/.
1.12.1 General Corpora
The largest type of corpus is the general (or ‘generalized’) corpus. Generalized corpora are usually very large, more than 10 million words, and include a variety of language so that results drawn from them can be generalized with some confidence. Although no corpus will ever represent all possible language, generalized corpora try to give users as complete a representation of a language as possible. The British National Corpus (BNC), the American National Corpus (ANC) and the Corpus of Contemporary American English (COCA) are examples of generalized corpora. These large corpora incorporate written texts such as newspaper and magazine articles, works of fiction and non-fiction, and scripts, together with transcriptions of informal dialogues, government proceedings, and business meetings. If generalizations about language as a whole are to be made, a large, generalized corpus ought to be used.
1.12.2 Specialized Corpora
A specialized corpus contains texts of a specific type and aims to be representative of the language of that type. Specialized corpora may be large or small and are generally created to answer very specific questions. Examples of specialized corpora are the Michigan Corpus of Academic Spoken English (MICASE), which contains only spoken language from a university setting; the CHILDES Corpus,137 which comprises language used by children; and the Michigan Corpus of Upper-level Student Papers (MICUSP), a collection of papers from a variety of university subjects.
1.12.3 Learner Corpora
A learner corpus is a type of specialized corpus that contains written texts and/or spoken transcriptions of language produced by students who are learning a given language. Learner corpora are frequently used to examine the mistakes students commonly make. A well-known learner corpus is the International Corpus of Learner English (ICLE),138 which consists of essays written by learners of English from 14 different native-language backgrounds.
137 B. MacWhinney, The CHILDES database, Dublin, OH: Discovery Systems, 1992.
138 S. Granger, “The International Corpus of Learner English: A new resource for foreign learning and teaching and second language acquisition research”, TESOL Quarterly, 37(3), 538-546.
1.12.4 Pedagogic Corpora
A pedagogic corpus contains language used in classrooms. Pedagogic corpora may include textbooks, transcriptions of classroom conversations, or any other written texts or spoken transcriptions that learners encounter in an educational setting. Pedagogic corpora are often used to check that students are learning useful language, to analyze teacher-student interactions, or as a self-reflective tool in teacher training.
1.12.5 Monitor corpora
A monitor corpus is a corpus designed to track language change. It is open-ended and updated on a regular basis: new texts are added regularly to keep track of changes in a language. Monitor corpora are particularly useful for corpus linguistic research on lexical change, for instance changes in the frequency of words or other units of meaning (compounds, multi-word units, collocations, set phrases), variation in the uses or meanings of existing words, the appearance of neologisms, words falling out of use, variation in context profiles, and so on. The most important monitor corpora are the Bank of English (BoE) and the Global English Monitor Corpus.139 The Bank of English was created in 1991 as part of the COBUILD (Collins Birmingham University International Language Database) project.
The corpus was intended to represent standard English and to meet the needs of learners, teachers and other users, but it is also used by researchers working on present-day English. Written texts (75%) come from newspapers, magazines, fiction and non-fiction books, brochures, reports, and websites, whereas spoken data (25%) comprise transcriptions of television and radio broadcasts, meetings, interviews, discussions, and conversations. Most of the data in the corpus are British English (70%), whilst American English and other varieties account for 20% and 10% respectively. The BoE continues to grow through the regular addition of new material. Another monitor corpus is the Global English Monitor Corpus, which was designed in late 2001 as an electronic collection of the world’s major newspapers in English. Its purpose is to monitor language use and semantic change in English through the analysis of newspapers, so that researchers can discover whether English-language discourses in Britain, the United States, Australia, Pakistan and South Africa have changed in the same or in different ways. The Global English Monitor Corpus may thus be a useful instrument not only for lexicographers, but also for those involved in social and political studies all over the world.
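The kind of frequency comparison that a monitor corpus supports can be sketched as follows. Because successive slices of a monitor corpus differ in size, raw counts are normalized (here, to occurrences per million words) before being compared. The sketch is purely illustrative: the function names per_million and frequency_change and the two toy slices are assumptions, not data from the Bank of English or any other real corpus.

```python
import re
from collections import Counter


def per_million(text):
    """Return the frequency of every word form, normalized per million tokens."""
    tokens = re.findall(r"[a-z'-]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * 1_000_000 / total for word, count in counts.items()}


def frequency_change(earlier, later, word):
    """Return the normalized frequency of a word in an earlier and a later slice."""
    before = per_million(earlier).get(word, 0.0)
    after = per_million(later).get(word, 0.0)
    return before, after


# Two invented slices standing in for texts added at different times.
earlier_slice = "the fax arrived and the fax was read"
later_slice = "the email arrived and the fax was lost"

for word in ("fax", "email"):
    before, after = frequency_change(earlier_slice, later_slice, word)
    print(f"{word}: {before:.0f} -> {after:.0f} per million words")
```

In a real study the slices would contain millions of words each, and a word that rises from zero, like the second one here, would be a candidate neologism.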
1.12.6 Parallel and comparable corpora
There are two kinds of multilingual corpora: parallel and comparable corpora. Parallel corpora consist of a source text and its translation into one or more languages. Comparable corpora, in contrast, do not include translations but contain texts from different languages which are similar or comparable with respect to a number of factors such as text type, formality, subject matter, time span, etc. Parallel corpora make it possible to contrast more than two languages through the use of several translations of the same original text. Corpora of this kind are certainly more complicated to build, since translations into more than one language have to be obtained. However, they make it possible to carry out multilingual, cross-linguistic studies and extended language analysis, restoring the old paradigm of translation vs. comparison.140 Some criticisms have been made with regard to the objectivity of the data, since the decisions made by translators are often affected by the purpose of the translation, and translations may reflect the translator’s individual style.141 Ebeling summarizes these criticisms as follows: “1. translations distort the target language because of influence from the source language; 2. translated language is different from original language; 3. translators are unreliable and make mistakes; 4. translations differ depending upon the individual translator; 5. translations are unpredictable, motivated by reference to the text and its circumstances only”.142 Comparable corpora are probably easier to build, as one can make use of already existing corpora. They have been exploited particularly in the contrastive study of languages for specific purposes, e.g. in domains such as contract law and genetic engineering.143 However, since it is not easy to ensure the comparability of texts in different languages, comparable corpora provide a less reliable picture of the correspondences of a lexical item than parallel corpora do.
140 A. Viberg, Polysemy and Disambiguation Cues across Languages: The Case of Swedish få and English get, in Altenberg/Granger 2002, pp. 119-150, p. 121.
141 K. Lauridsen, Text Corpora and Contrastive Linguistics: Which Type of Corpus for which Type of Analysis? In Aijmer/Altenberg/Johansson 1996, pp. 63-71, p. 67.
142 J. Ebeling, Presentative Constructions in English and Norwegian. A Corpus-based Contrastive Study, (Acta Humaniora 68) Oslo: Unipub forlag, 2000, pp. 25-26.
143 K. Lauridsen, Text Corpora and Contrastive Linguistics: Which Type of Corpus for which Type of Analysis? In Aijmer/Altenberg/Johansson 1996, pp. 63-71.
1.13 The web revolution
The rich assortment of resources, together with tools for analysis, has made corpus-based research a very popular methodology in linguistics and related disciplines such as applied linguistics, translation studies, stylistics and so forth. However, there is no denying that corpus construction and management are typically very costly and time-consuming tasks that many researchers would gladly do without if alternative resources were available. By the end of the 1990s, a very promising alternative had become widely available, namely the World Wide Web. Several corpus linguists started to ponder the similarities and differences between these two resources.144 After all, the Web has some of the characteristics of a corpus: it makes available texts in electronic form which are authentic and produced for real purposes by a multiplicity of different authors, and it is searchable via engines that accept words and phrases as queries, like corpus query tools. It also has several added advantages: it is free and easily accessible; searching is intuitive and result retrieval is extremely fast; and it is larger than any corpus will ever be: 11.5 billion pages at the end of January 2005 according to the estimate of Gulli & Signorini.145 However, there are also significant differences between the Web and a corpus. First of all, a corpus is (or should be) documented, so that users know exactly what texts went into it, how they were chosen, whether they were shortened or corrected in any way and so forth, whereas nobody knows exactly what the Web contains. Second, a corpus allows results to be replicated, since it typically does not change over time, and when it does (as is the case with monitor corpora) this is documented and the older materials are usually not destroyed but stored in separate archives. Third, a corpus is a tangible object (as far as ‘tangible’ goes when one deals with electronic texts) that can be manipulated in different ways (moved to a different medium, annotated, indexed). None of these operations is possible with the Web as a whole. Finally, the Web cannot be searched like a corpus. Even though Web search engine queries superficially resemble corpus queries, there are several important differences. One worrying aspect of search engines in linguistic research is the unreliability of their frequency counts, which have been shown to undergo substantial changes in very short times and for no obvious reason.146 Different localized versions of the same engine also seem to provide diverging results for the same query.147
144 M. Bekke, “From the British National Corpus to the WWW cybercorpus: A quantum leap into chaos?”, in J. M. Kirk (ed. by), Corpora galore. Analyses and techniques in describing English, Amsterdam: Rodopi, 1999.
145 A. Gulli, A. Signorini, “The indexable web is more than 11.5 billion pages”, paper presented at the WWW2005 conference, Chiba, Japan, May 2005.
146 See J. Véronis, “Web: Google’s missing pages: mystery solved?”, 2005, available: http://blog.veronis.fr/2005/02/web-googles-missing-pages-mystery.html.
147 Cf. results obtained on 31 May 2012: www.google.it: Risultati 1 - 10 su circa 18.500.000 pagine in Inglese per “abridged”. www.google.com: Results 1 - 10 of about 23,100,000 English pages for “abridged” [definition].
Finally, and most importantly, several types of searches highly relevant to linguists’ concerns are simply not supported by Web search engines. For instance, Google will not:
- Show occurrences of all the different forms of a given lemma. Any Web search for a verb in Italian will raise the problem of what form(s) to use (infinitive? first person present indicative?). There is no ‘lemma search’ option that allows you to decide whether you want to see only a given word form or all the word forms belonging to the same ‘head’ form. Clearly, this is less of a problem for English than for languages with a richer morphology.
- Return a random sample of occurrences for a given query. Since results can only be ordered according to their ‘relevance’ as evaluated by the engine, and since results are often too numerous to be examined exhaustively, the doubt remains that any analysis of a word or expression using Web data is biased in unknown ways.
- Search for two or more words co-occurring within the same sentence, or within a span of a given size (e.g. min. 2 and max. 4 words); this facility is central to corpus analysis, e.g. to study variation in phraseological expressions.
- Compare the frequencies of occurrence of different ways of spelling the same word(s): for instance, if one wants to know how many occurrences of nitty gritty there are on the Web, as compared to nitty-gritty, a Google search will not do.
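Two of the queries listed above, co-occurrence within a span and the comparison of spelling variants, are straightforward once the texts are held locally as a corpus. The following Python sketch illustrates the idea under simplifying assumptions: the function names tokenize, cooccurrences and variant_counts, the gap parameters and the sample sentence are all invented for the example and do not correspond to the interface of any particular concordancer.

```python
import re


def tokenize(text):
    """Split text into lower-case tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def cooccurrences(tokens, node, collocate, min_gap=1, max_gap=4):
    """Return (node index, collocate index) pairs whose distance is within the span."""
    hits = []
    for i, token in enumerate(tokens):
        if token != node:
            continue
        for gap in range(min_gap, max_gap + 1):
            # Look the same distance to the left and to the right of the node.
            for j in (i - gap, i + gap):
                if 0 <= j < len(tokens) and tokens[j] == collocate:
                    hits.append((i, j))
    return hits


def variant_counts(text, variants):
    """Count exact occurrences of each spelling variant as a whole expression."""
    lowered = text.lower()
    counts = {}
    for variant in variants:
        # The lookarounds stop a variant matching inside a longer word or compound.
        pattern = r"(?<![\w-])" + re.escape(variant) + r"(?![\w-])"
        counts[variant] = len(re.findall(pattern, lowered))
    return counts


sample = ("Let us get down to the nitty-gritty. The nitty gritty of the plan "
          "is simple; the nitty-gritty matters.")
tokens = tokenize(sample)

print(variant_counts(sample, ["nitty gritty", "nitty-gritty"]))
print(cooccurrences(tokens, "plan", "gritty", min_gap=2, max_gap=4))
```

Because the counts are computed over a fixed, known set of texts, the same query always returns the same figures, which is precisely what search engine hit counts cannot guarantee.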
The area at the interface between corpora and the Web is currently the focus of active and enthusiastic research, which offers formidable challenges and great opportunities for performance-based linguistics and for the language industry.148 However, enthusiasm is not enough: in the long run, much more substantial investment and more structured research projects will be required to bridge the gap between corpora and the Web.
148 See S. Castagnoli, “Using the Web as a source of LSP corpora in the terminology classroom”, in M. Baroni & S. Bernardini (ed. by), Wacky! Working papers on the Web as a corpus, Bologna: Gedit, 2006, pp. 159-172; C. Fantinuoli, “Specialized corpora from the web and term extraction for simultaneous interpreters”, in M. Baroni & S. Bernardini (ed. by), Wacky! Working papers on the Web as a corpus, Bologna: Gedit, 2006, pp. 173-190; S. Sharoff, “Creating general-purpose corpora using automated search engine queries” in M. Baroni & S. Bernardini (ed. by), Wacky! Working papers on the Web as a corpus, Bologna: Gedit, 2006, pp. 63-98.
Chapter 2 Language and Gender
2. Language and gender
Do women and men talk differently? Gender distinctions of all kinds fascinate people, so it is not surprising that there is considerable interest in the way women and men talk and in possible linguistic differences between them. But what can we say about the question itself? By asking “do women and men talk in a different way”, we make a series of assumptions that may be challenged. First of all, the question takes for granted that we can separate speakers neatly into two categories: “women” and “men”. Secondly, the question presupposes that we are more interested in differences between men and women than in similarities between them.
During the last twenty years, there has been a remarkable increase in research studies concerning the question of the relationship between language and gender.149
2.1 Sex and gender: an essential differentiation
At the outset it is important to clarify how gender is conceptualized and how it differs from sex. Although laypeople regularly take the two words to convey the same concept, for the social scientist they have different meanings. Some scholars regard sex as defined by our being born male or female, and it has important consequences for us as individuals. Biological sex influences the way we experience our lives and how we act in the world. In this view, sex is connected to gender but is not the same thing. Gender refers to a social category of behavior. It is closely related to the social classifications made on the basis of sex, and language plays an important role in creating and sustaining these divisions. Although “gender” is also a grammatical category in several languages (for instance the “masculine” and “feminine” of French or Italian grammar), social approaches treat gender as a social category in which masculine and feminine are behavioral classes usually associated with individuals born with the corresponding sex. Therefore, those who are born male are expected to display behaviors that are recognized as masculine by our society, while those who are born female are expected to display behaviors that are recognized as feminine by their community and by the social order surrounding them.
149 In this section of the work I will focus on sociolinguistic studies carried out in Britain and other English-speaking countries. Afterwards I shall provide a brief description of the most important linguistic approaches to the issue of gender differences in language.
Some researchers have examined these differences, particularly in the light of the flourishing literature on sexual identities.150
The differentiation between sex and gender was first set out carefully by Ann Oakley,151 a British feminist, in the early 1970s. She described sex as biologically determined, something related to physiology, genes, hormones and anatomy. With the exception of particular situations, sex is fundamentally binary: one is either male or female.
150 Among various scholars, the topic of the relationship between language and sexuality is extensively dealt with in D. Cameron and D. Kulick, Language and Sexuality, Cambridge: Cambridge University Press, 2003 and in M. Bucholtz and K. Hall, “Theorizing Identity in Language and Sexuality Research”, Language in Society, 33:4, pp. 469-515, 2004.
151 See A. Oakley, Sex, Gender and Society, London: Temple Smith, 1972.
Gender, on the other hand, is socially constructed: it is something we assimilate. Social scientists believe that we develop social characteristics and behave in a certain way as a result of how we are accepted by those around us. Simone de Beauvoir152 maintained that we progressively evolve into masculine or feminine and we behave in gendered ways in a complex range of situations for a host of reasons.
Gender is not binary like sex; indeed we are a mixture of many features that could be conceived of as either/both feminine or/and masculine, and that are determined by the situation and our connections with those concerned. One could suggest that someone is ‘more feminine’ or ‘more masculine’, for instance, by saying that someone is “very manly” or “like a girl”, but hardly ever do we affirm one is ‘femaler’ or ‘maler’.
We are consequently gendered, and we are involved in the practice of our own gendering and the gendering of others throughout our lives. In the area of gender and language use, this view of gender is described as “doing gender”: gender is something we do, not something we are.153 Throughout our lives and especially in our early years, we are taught, encouraged and stimulated to act in suitable ways so that our gender, and our community’s perception of it, conforms with our assigned sex. At the present time, the practice of “doing gender” starts even before birth: many new parents know the sex of their child before the baby is born, and therefore begin to attach gender attributes to their baby. They buy pink or blue clothes according to the sex of their newborn, and they regard specific experiences as probable or inconceivable in view of their baby’s sex.
152 S. de Beauvoir, The Second Sex, New York: Vintage Books, 1952.
153 See V. L. Bergvall, “Toward a Comprehensive Theory of Language and Gender”, Language in Society 28, 273-93, 1999.
Throughout our lives we are expected to act on the basis of the social expectations that are associated with our sex. Similarly, we respond to others on the basis of similar assumptions about their sex. In both cases, we are gendered. If gender were only a matter of one’s biological sex, we would see the same representation of gender roles across all cultures, all time periods and all age groups, but we do not. Gender behaviors are not universal, though gender as a social construct is a universal phenomenon affecting the way people live and approach each other. The complicated part is the extent to which behavior is biologically established and/or learned through social experience. Are there actions that can be ascribed exclusively to our biological sex? Some scholars, such as Judith Butler,154 do not consider the sex/gender distinction so fundamental; instead, they regard both sex and gender as socially constructed. Mary Talbot’s research warns us against merely classifying all behavior presented by boys and men as masculine and all behavior presented by girls and women as feminine; Talbot points out that doing so derives from an assumption that “socially determined differences between women and men are natural and inevitable”.155 Treating sex and gender as the same thing often licenses, and can even encourage, behaviors that support conventional, traditional family roles and validate the male privileges and authority that align with these roles. Such a view is known as biological determinism. If the difference between sex and gender is blurred or erased altogether, so that sex and gender are seen as the same, then certain sentiments harden into beliefs and encourage a binary, dualistic conception of gender.
154 J. Butler, Gender Trouble: Feminism and the Subversion of Identity, London: Routledge, 1990. See also D. Cameron, “Language, gender and sexuality: Current issues and new directions”, Applied Linguistics 26 (4), 482-502.
Moreover, biological determinism is closely tied to essentialism. An essentialist vision of gender aims at establishing and confirming a genetic foundation for our behaviors and life choices. Views on race can fall victim to this sort of thinking: for instance, the idea that Black people are musical because it is “in their blood” is undoubtedly a stereotype; some Black people are musical, but surely not all. Nevertheless, some radical feminists, such as Andrea Dworkin,156 draw on biological determinism to explain gendered differences. To her, being male implies being violent: it is “in their blood” to carry out certain crimes such as rape. More men than women probably do commit rape, but this does not mean that all men would do so simply because they are men, or that women are incapable of similar violence. The problem with any essentialist viewpoint is that the effort to simplify suppresses what is significantly complex and unique to each person.
Some scholars, such as Deborah Cameron,157 point out that studies of differences between the sexes nowadays have a political dimension. Why do we wish to find differences? Whose objectives are being served by such investigation? One’s perception of gender often supports larger political or philosophical positions: those who consider gender as operating on a complex continuum do not constrain people on the basis of their sex, while those who perceive gender as rigid and biologically predetermined run the risk of confining human experience to narrow gender roles and demands for both women and men.
To sum up, “sex” refers to a biological distinction, while “gender” is the term used to denote socially constructed categories based on sex. Most societies operate in terms of two genders, feminine and masculine, and it is tempting to consider the category of gender as a simple binary opposition. Until recently, a great deal of the research dealing with language and gender did so. But more recent theories have challenged this binary thinking. Gender is instead conceived of as a variety of femininities and masculinities available to speakers at any point in time. The preoccupation with difference is typical of an essentialist view of gender, that is, of the idea that the male and female categories are indisputable essences legitimated by nature or metaphysics. However, gender is not a matter of two independent and homogeneous social categories connected with being male or female: male and female speakers may differ in many ways, with respect to class, sexual orientation, ethnicity and age. Furthermore, neither femininity nor masculinity can be understood on its own: the concepts are fundamentally relational. Put differently, masculinity is only meaningful when it is conceived in relation to femininity and to the whole of gender relations.158
155 M. Talbot, Language and Gender: An Introduction, Cambridge: Polity Press, 1998.
156 A. Dworkin, Pornography: Men Possessing Women, Toronto: Women’s Press, 1981.
157 D. Cameron, “Language, gender and sexuality: Current issues and new directions”, Applied Linguistics 26 (4), 482-502.
The 1990s saw seismic revolutions in academic understandings of gender. As Deborah Cameron says: “gender […] has turned out to be an extraordinarily intricate and multi-layered phenomenon – unstable, contested, intimately bound up with other social divisions”.159
158 R.W. Connell, Masculinities, Cambridge: Polity Press, 1995.
159 D. Cameron, “The language-gender interface: challenging co-optation”, in V. Bergvall, J. Bing and A. Freed (eds.) Rethinking Language and Gender Research: Theory and Practice, London: Longman, 1996, pp. 31-53.
2.2 Sociolinguistics and gender
It is only relatively recently that sociolinguists have turned their attention to gender. This is probably due to three main reasons: the first two derive from sociolinguistics’ antecedents in dialectology and linguistics; the third is related to the change in the status of women in modern-day society.
Firstly, in conventional dialectology, the preferred informants were usually older, non-mobile, rural and male.160 Sociolinguists noticed this bias in the choice of informants and rejected it, although at first the rejection consisted of selecting urban rather than rural informants, and younger as well as older ones. Although many studies included informants of both sexes, the majority continued to rely on male speakers.161 It was only in the late 1980s that studies began to focus on female speakers.162 Secondly, at the same time as sociolinguistics began to establish itself as a discipline, criticism of conventional linguistics led to a shift in emphasis from standard to non-standard varieties. All kinds of marginal groups came under examination, especially working-class groups, teenagers and ethnic minority groups. Nonetheless, women were not regarded as a minority group. Linguistic variation associated with social class, background or age was what seemed significant to early sociolinguists.
So why was gender not assumed to be relevant? The answer may be that, until recent times, men were regarded as being at the heart of society by default, while women were considered marginal or even ignored. This is hard to understand nowadays, when gender difference is big business,163 but if we look back at the period following the Second World War, all central positions in society were held by men. Thus, for instance, Britain was ruled by a king, George VI (the father of Queen Elizabeth II), the Prime Minister was male, the most significant people in the Law and the Church were male, and trade was managed by men.
160 J.K. Chambers and P. Trudgill, Dialectology, Cambridge: Cambridge University Press, 1980.
161 See W. Labov, Language in the Inner City, Oxford: Blackwell, 1972, which is a study of black adolescents in Harlem.
162 B. Bate and A. Taylor, Women Communicating: Studies of Women’s Talk, Norwood: Ablex, 1988 and J. Coates and D. Cameron, “Gossip revisited: language in all-female groups” in J. Coates and D. Cameron (eds.) Women in their Speech Communities, London: Longman, 1989, pp. 94-121.
The most important change that has happened since that period, due for the most part to the political activism of the Women’s Movement, is that women have acquired the legal right to be considered the equals of men.164 In 1975 the publication of Robin Lakoff’s book165 was an emblematic moment. While Lakoff’s work has been attacked for its sweeping assertions and lack of methodological rigor, its importance cannot be underestimated, as it spurred linguists all over the world to do research into the unexplored field of women’s talk.
Men, paradoxically, remained uninvestigated for a long time, precisely because “man” and “individual” were habitually interchangeable concepts, but in the last decade the topic of men and masculinity has come into focus. There has been a change in men’s view of themselves: a change from considering themselves as unmarked representatives of the human race to concentrating on themselves as men. A good illustration of this change can be seen in the titles of sociolinguistics books. Labov’s analysis of black male teenagers in Harlem, mentioned earlier in this section, was a significant sociolinguistic work in the 1970s. Its title, Language in the Inner City, does not acknowledge the fact that the language examined in the book is male language. The same could be said of Brandis and Henderson’s work titled Social Class, Language And Communication.166
163 See, for example, the success of books like J. Gray, Men are from Mars, Women are from Venus, New York and London: HarperCollins, 1993, or the more recent S. Baron-Cohen, The Essential Difference: Men, Women and the Extreme Male Brain, Harmondsworth: Penguin, 2003.
164 Both the Equal Pay Act and the Sex Discrimination Act came into effect in Britain in 1975.
165 R. Lakoff, Language and Woman's Place, New York: Harper & Row, 1975.
2.2.1 Different approaches to studying language and gender
Since Lakoff published her classic work, Language and Woman’s Place, in 1975, linguists have considered language and gender from a variety of viewpoints. These can be classified as the deficit approach, the dominance approach, the difference approach, and the dynamic or social constructionist approach.
They flourished in a historical order, but the advent of a new approach did not mean that previous approaches were abandoned. In fact, these distinct points of view could be explained as working in a state of tension with each other.
The deficit approach was characteristic of the earliest work in the field. The best-known example is Lakoff’s Language and Woman’s Place, which claims to describe something called “women’s language” (WL), characterized by linguistic forms such as ‘empty’ adjectives (e.g. charming, divine, nice) and ‘talking in italics’ (exaggerated intonation contours). WL is depicted as weak and unassertive, in other words, as deficient. Implicitly, WL is deficient compared with the norm of male language. This approach was challenged because of the implication that there was something intrinsically wrong with women’s language, and that women should learn to speak like men in order to be taken seriously.
166 W. Brandis, D. Henderson, Social Class, Language And Communication, Taylor & Francis, 1970.
In Language and Woman's Place, Robin Lakoff set out a collection of basic assumptions about the language of women. Among her claims are that women:
use backchannel support when listening, or use positive minimal responses: nodding, saying 'yeah' and 'mm hmm';
hedge: use phrases such as 'sort of', 'kind of', 'it seems like';
use (super)polite forms: 'Would you mind...', 'I'd appreciate it if...', '...if you don't mind';
use tag questions: 'You're going to dinner, aren't you?';
use ‘empty’ adjectives such as 'lovely', 'adorable', 'nice' and so on;
use hypercorrect grammar and pronunciation;
use direct quotation when quoting speech: 'She said, "You can't go."';
have a special lexicon: use more words for things such as colours (like mauve or fuchsia);
use question intonation in declarative statements: they turn declarative statements into questions by raising the pitch of their voice at the end of a statement, expressing uncertainty;
speak less frequently than men in public settings;
apologize more often than men: 'I'm sorry, but I think that...';
use modal constructions: 'can', 'would', 'should', 'ought' - 'Should we turn up the heat?';
avoid coarse language or expletives;
use indirect commands and requests: for example, 'My, isn't it cold in here?' as a request to turn the heat on or close a window;
use more intensifiers than men, especially 'so' and 'very': for example, 'I am so glad you came!';
lack a sense of humour: women do not tell jokes well and often do not understand the punch line of jokes;
interrupt less often than men.167
Lakoff maintained that these linguistic forms were representative of women’s talking and, more significantly, that they denoted lack of confidence on the part of many women. She assumed that: “Women's speech seems in general to contain more instances of 'well', 'you know', 'kind', and so forth: words that convey the sense that the speaker is uncertain about what [she] is saying”.168
Lakoff supposed that women soften their assertions in such ways both because of their insecurity and in order to defer to others. Other scholars, however, departed from this interpretation. Lakoff’s ideas generated many possible explanations of women’s language patterns or tendencies. Non-verbal signals too (for example eye contact, smiling or touching) were thought to reproduce power relationships, so that limiting the analyst’s attention to verbal communication alone was insufficient for a comprehensive explanation of gender differences in communication.
Dale Spender169 rejected Lakoff's vision of “women as deficient”. She suggested that men subjugated women as part of a patriarchal system, and she maintained that women were not so much deficient as subjugated. She also examined the role of women themselves in constructing their own domination: perhaps if women spoke with more force they would be less subordinated.
Pamela Fishman170 carried out research on the interactional strategies used by women and men and found that women may indeed ask more questions, make extensive use of feedback and backchannel support, and do most of the conversational work or, in her own words, the conversational “shitwork”. She saw these tendencies as attributable exclusively to women’s lower position in society rather than to any inborn inability with language or social patterns. Indeed, many women in Fishman’s study were skilled in their use of language, but they made particular linguistic choices that reproduced a lack of authority and even their own wish to remain in a powerless position in order to be in relationships with men. This, paradoxically, is also a form of power. Fishman also maintained that conversations between the sexes sometimes break down not because of the way women or men talk, but because of how men hold on to their position of authority. Men do not have to pay particular attention in their conversations with women because they have symbols of success beyond these relationships that they consider more important, such as money and status.
167 Adapted from R. Lakoff, Language and Woman’s Place, New York: Harper and Row, pp. 40-41, 1975.
168 Ibidem, p. 53.
169 D. Spender, Man Made Language, London: Pandora, 1980.
170 P. Fishman, “Conversational insecurity”, in H. Giles, W.P. Robinson and P. Smith (eds.) Language: Social Psychological Perspectives, Oxford: Pergamon Press, 1980, pp. 127-32.
The second approach, the dominance approach, considers women as a subordinate group and explains linguistic differences between women’s and men’s talk in terms of men’s domination and women’s subordination. Researchers using this paradigm want to demonstrate how male dominance is enacted through linguistic practice. “Doing power is often a manner of doing gender too”.171 Furthermore, all participants in communication, women and men, collude in supporting and maintaining male dominance and female oppression.
The third approach, the difference approach, puts emphasis on the idea that men and women belong to distinct subcultures. The “discovery” of separate female and male subcultures in the 1980s seems to have been a direct consequence of women’s increasing resistance to being treated as a subordinate group. The invisibility of women in history arose from the identification of culture with male culture. But women began to state that they had “a different tone of voice, a different mental attitude, and a different understanding of love, work and family from men”.172 The advantage of the difference model is that it allows women’s talk to be analyzed outside a framework of oppression or powerlessness. Instead, researchers have been able to demonstrate the strengths of linguistic strategies typical of women, and to celebrate women’s ways of talking. In her best-selling work about male-female ‘misunderstanding’, Deborah Tannen maintains that the reader should be aware that the difference approach is questionable when applied to mixed talk.173 Critics of Tannen’s book, for their part, maintain that research on mixed talk cannot disregard the question of power.174
171 See C. West and D. Zimmerman, “Small insults: a study of interruptions in cross-sex conversations between unacquainted persons”, in B. Thorne, C. Kramarae and N. Henley (eds.) Language, Gender and Society, Rowley: Newbury House, 1983.
A large amount of the study on men, women and language has focused on sex differences.
During the 1970s, the sex differences observed in language use were first classified as sex-exclusive language styles, identified as women’s or men’s specific way of speaking, and afterwards as sex-preferential speaking styles, in which women and men could use the same lexis, syntax and phonology but chose certain forms rather than others. The latter became the preferred way of classifying gendered language differences. Deborah Tannen175 considers the expression speech style more effective than the expressions women’s language or men’s language. According to Tannen, children take part in gender-specific subcultures with distinctive gender styles: the socialization process begins very early. As a result, the terms “women” and “men” are not very helpful in recognizing gendered language patterns, because even children are gendered in their language practice from an early age. Girls are required to “be nice” and boys to be competitive. It is hard to understand how such attitudes get passed on from one generation to another. It is probable that these attitudes are associated with distinctive interaction styles, behaviors and linguistic choices made by persons in specific groups and in particular circumstances. Maltz and Borker176 note that girls seem to be taught to be cooperative in their use of language by playing with dolls and in pairs, whilst boys are often trained to be more independent by means of sports. But both styles are functional and essential. Girls (and women) may typically use a conversational style based on solidarity, collaboration and equality, while boys (and men) may prefer and develop styles based on independence, status and competition. Both styles of speaking can be helpful in both groups, as well as damaging in certain situations. But the patterns and trends are so complex and fluid that in the end simplifications are useless.
Deborah Tannen (1991, 1995, 1998) has observed that women’s speaking style is essentially based on the fact that many women develop and maintain relationships through language, while men are predisposed to monologue, so that conversations are just an exchange of information: “rapport” as opposed to “report”. Tannen classifies masculine and feminine language as a series of contrasts: competition vs. collaboration; self-reliance vs. membership; information vs. interest; commands vs. suggestions; divergence vs. agreement. According to Tannen, both men and women are more at ease with these clearly gendered roles and carry on using their speech to sustain them. Jennifer Coates177 has highlighted that sometimes there is a particular purpose in the use of certain language structures, specifically to support the other speaker and to keep the dialogue alive: something related not to uncertainty but rather to cleverness and even wisdom.
172 M. Humm, The Dictionary of Feminist Theory, London: Harvester Wheatsheaf, 1989.
173 D. Tannen, You Just Don't Understand: Women and Men in Conversation, London: Virago, 1991.
174 See, for example, S. Troemel-Ploetz, “Review essay: selling the apolitical”, Discourse and Society, 1991, 489-502; D. Cameron, “Review of Tannen You Just Don’t Understand”, Feminism and Psychology, 1992, 465-8; A. Freed, “We understand perfectly: a critique of Tannen’s view of miscommunication”, pp. 144-52 in K. Hall, M. Bucholtz and B. Moonwomon (eds.) Locating Power: Proceedings of the Second Berkeley Women and Language Conference, Los Angeles, University of California: BWLG group, 1992.
175 D. Tannen, You Just Don’t Understand: Women and Men in Conversation, New York: William Morrow, 1991; Talking from 9 to 5, London: Virago, 1995; The Argument Culture, London: Virago, 1998.
176 D.N. Maltz and R.A. Borker, “A cultural approach to male-female miscommunication”, in J. Coates, Language and Gender: A Reader, Oxford: Blackwell, 1998, pp. 417-34.
Most researchers investigating gender and language nowadays would be inclined to accept that what is said in any conversation depends on many variables, including the participants’ age, knowledge, cultural background, character or personality, as well as the situation itself. Janet Holmes’ research178 on women and men in the workplace firmly links intent to language use: what is intended affects both what is said and the way in which it is understood. Her analysis also concerns politeness associated with gender, with women more frequently being more polite than men. Janet Holmes and Sara Mills179 suggest that power is the element at the center of our relations and, consequently, that understanding power is essential in order to understand language patterns and trends. Other researchers, such as Victoria Bergvall,180 have wondered whether this debate on masculine and feminine tendencies is useful, because the concentration on gender differences may itself sustain the belief that such differences are real. To be precise, the investigation takes for granted that women and men actually speak in a different way.
177 J. Coates, Women, Men and Language, New York: Longman, 1993; Women Talk: Conversation between Women Friends, Oxford: Blackwell, 1996.
178 J. Holmes, Power and Politeness in the Workplace, London: Longman, 2003; Gendered Talk at Work, Oxford: Blackwell, 2006.
179 S. Mills, Gender and Politeness, Cambridge: Cambridge University Press, 2003.
The fourth and most recent approach is sometimes called the dynamic approach because it lays emphasis on the dynamic aspects of communication. Researchers who adopt this approach take a social constructionist perspective. Gender identity is perceived as a social construct rather than a “given” social category. As West and Zimmerman explicitly declare, speakers should be regarded as “doing gender” rather than “being” a specific gender.181 This argument led Crawford to assert that gender should be conceived as a verb, not a noun.182
The social constructionist view is related to the particular social functions and status we obtain and uphold by means of language. In this perspective, we all can be considered as a “constellation of subject positions conferred by different discourses”.183
180 V.L. Bergvall, “Toward a comprehensive theory of language and gender”, Language in Society 28, 273-93, 1999.
181 C. West and D. Zimmerman, “Doing Gender”, Gender And Society, 1987, 1: 125-51.
182 M. Crawford, Talking Difference: On Gender and Language, London: Sage, 1995.
183 M. Talbot, Language and Gender: An Introduction, Cambridge: Polity Press, 1998, p. 156.
While the Difference theory maintains that people are principally characterized by their gender as males and females, Social constructionism acknowledges that there are a variety of cultural and power features at work which both strengthen and weaken our gender identities. Our individual positions evolve within the specific institutions where we play a part. Any single person is placed in a broad variety of positions in a series of situations; this placement is identified as subject positions or subjectivity. Subject positions vary during our lives. Even in the course of one day, one’s subjectivity varies many times.
Our subjectivities are various and perhaps even opposing, demonstrating our extraordinary capability to alter and regulate our language use when necessary. Sometimes we play the role of specialist on a subject; sometimes we are beginners.
Social constructionism is associated with the way in which our social positions are fluid, open to discussion and constantly shifting. According to the social constructionist theory, these positions are questioned and neutralized by the alternative ways in which people may be identified and categorized by means of power associations in a society, in relation to their age, rank, background, culture and sexuality (amongst other social variables). People have the power to perform numerous identities. Nonetheless, social constructionism also acknowledges that some prevailing discourses in society (for instance gender differentiation) have the potential to generate particular deep-rooted identities, which are difficult to controvert or refuse because of a well-established cultural support for them.
The four approaches do not have sharp boundaries: researchers may be influenced by more than one theoretical point of view. What has changed is linguists’ understanding that gender is not a fixed, add-on attribute of speakers, but something that is accomplished in conversation every time we speak.
Nowadays researchers regard the deficit approach as out of date. The other three approaches have all produced important insights into the nature of gender differences in language. While it is correct to say that social constructionism is now the prevailing paradigm, it is important to highlight the influence of the dominance and difference approaches during the 1980s and 1990s.
2.3 The ascent of feminism: a brief review
The study of gender and language received a great boost from feminist ideas. The Oxford English Dictionary defines feminism as “the policy, practice, or advocacy of the political, economic, and social equality for women”.184 This definition is informative enough, but there are important implications concerning the topic of feminism that a dictionary definition cannot satisfactorily explore. Today’s feminism is a heterogeneous phenomenon with a long and troubled history whose beginning may be difficult to locate. Contemporary feminism comprises three main strands, each with a different view of the connection between female oppression and patriarchy:
1. liberal feminism tries principally to observe and describe the society’s vision of women as indicative of society’s patriarchal behaviors and beliefs, for the most part concerning laws and human rights;
2. socialist feminism investigates patriarchy together with social class topics of domination and power;
3. radical feminism concentrates on patriarchy and male domination over women and calls on society to analyze, confront and subvert it.
Other varieties of feminism include existentialist feminism, psychoanalytic feminism, post-modern feminism, post-structural feminism, Islamic feminism, Jewish feminism, and Christian feminism, each of which gives attention to the female experience in society. The analysis of gendered language is frequently associated with liberal feminist thought and the shared concern for women as victims of a patriarchal order. Although there has been recent interest in masculinities and men, together with gay and lesbian groups, the field of gender and language has concentrated primarily on women and the way in which they “miss out” under current constructions of gender.
184 J. A. Simpson, Oxford English Dictionary, Oxford: Oxford University Press, 2009.
A large amount of feminists trusts that all cultures have been and still are patriarchal and that males and the male experience determine what it means to be human, as well as what it represents to be female. Women have been treated as ‘the other’ whose life is determined and construed by men. Feminists think it is essential to comprehend and study gender as a structure of cultural symbols, regularly transferred to two distinctive body forms: female and male. According to feminists, it is decisive that we discredit and sabotage the power-established relationships assigned to gender in order that both women and men may live more freely. Feminism is generally the essence in women’s studies or gender studies in the academic world, but it is also an area of expertise that plays an important role in various branches of learning including education, fine arts, the humanities, and in all the social sciences, including anthropology, linguistics, philosophy, communications and psychology. There are a large number of ways in which feminism interrelates with academic debates. Several feminist researchers examine gendered ways of talking, writing, thinking, studying, advising; whereas others investigate gender-detailed health and medical problems, familial and household realities or conflicts, 84
and civil rights or access to representation. Moreover, there are feminist researchers devoted to the “re-evaluation” of the past in order to include ignored elements of female experience and points of view. Some researchers analyze literary theory and the ways in which women and women’s lives are represented in literature and art, as well as the ways in which literature and verbal communication are shaped by gender. It is important to be aware that the term “feminist” is not as politically charged in academia as it is sometimes taken to be in society at large. As a matter of fact, a definite, universally accepted academic definition of feminism is difficult to find. ‘Feminisms’ (plural) might in fact be a better term, and it is frequently chosen by feminist researchers because it more adequately reflects the variety of their work. However, all varieties of feminism include critiques of patriarchy or of misogyny. It might also be correct to state that all feminisms share some form of disillusionment with a society in which the male experience is considered the main point of reference, the rule, whereas the female experience is distinct, separate and regarded as something other.
There are innumerable instances of how patriarchy is revealed in our language use. Consider, for example, the term “waiter” as the assumed norm, or unmarked word, in our daily language, whilst “waitress” is the linguistically marked variant. The use of the word “waitress” emphasizes that the server is female, whereas the term “waiter” is the starting point. This accentuation of otherness is what is meant by “marked”. You can surely think of equivalent examples (actor/actress; air
host/hostess) in addition to ways in which such gendered words have begun to change toward the gender-neutral (flight attendant). Feminist academics are interested in the ways these sorts of assumptions function in society and in how they can be uncovered, analyzed and prevented for reasons of accountability and justice. Feminists in every field pay attention to such problems and to their potentially complex solutions.
2.3.1 Phases of feminism
Feminism is generally recognized as having developed in three phases: first, second and third. These stages are carefully described by numerous feminist researchers, including Sara Mills185 and Judith Baxter.186 The first phase is feminism before women could vote; its purpose and focal point was the practical emancipation of women. The second phase is represented by the mid-20th-century struggle for “equal pay for equal work”, whilst the third phase began in the 1980s as “identity feminism” and was concerned with issues of individual choice.
Particularly in the second half of the 20th century, feminist issues came to be represented by important American organizations and campaigns, such as the National Organization for Women (NOW) and the campaign for the Equal Rights Amendment. These American expressions of feminism maintained that women should be free to pursue independent careers and economic self-reliance, and to be free of male-imposed restrictions in their personal relationships. Feminism in the 1960s was principally based on consciousness-raising groups in which women together started to challenge and reject the role of housewife that was deeply rooted in America at the time. Moreover, the American feminism of this period was one of many movements that contested authority at the time, alongside the antiwar movement and the civil rights movement. The sexual revolution of the 1960s condemned traditional patterns of male-female relations and sought to introduce new, more equal models of behavior.
185 S. Mills, Language and Gender: Interdisciplinary Perspectives, New York: Longman, 1995.
186 J. Baxter, Positioning Gender in Discourse: a Feminist Methodology, Basingstoke: Palgrave Macmillan, 2003.
Throughout the 1970s and 1980s, Western societies achieved many legal reforms, for instance more equal pay for equal work, more accessible divorce laws and legal abortion, and more decisive action in the workplace and in educational institutions. However, while the question of women’s exclusion from power entered Western social awareness in an especially powerful way in the 1960s, it is essential to know that feminism as a concept is much older than the 1960s and much more multifaceted. In fact, inquiry into women and women’s experiences dates far back and involves cultures all over the world. Sappho, an ancient Greek poet of the 7th century BCE, many medieval mystics such as Hildegard of Bingen (1098-1179 CE) and Julian of Norwich (1342-1416 CE), and the 17th-century writer Aphra Behn (1640-89 CE) opened up new perspectives on life and society, together with a feminist vision of reality and experience.
Important feminist intellectuals include the English writer Mary Wollstonecraft, who wrote A Vindication of the Rights of Woman in 1792.187 Wollstonecraft condemned the lack of a careful education for girls in 18th-century England and saw their weaker status in society as a direct consequence of that inadequate education. She argued that women were unable to hold positions of authority because they lacked the preparation required to do so.
An American campaigner, Sarah Moore Grimke, published Letters on the Equality of the Sexes188 in 1838. In 1851, Sojourner Truth gave her “Ain't I a Woman?” speech as a representative of the anti-slavery movement in the US.189 Elizabeth Cady Stanton wrote The Woman's Bible in 1895,190 a text presenting a woman’s viewpoint on the Bible. The British campaigner Emmeline Pankhurst191 and her suffrage organization helped win the vote for British women, with an influence that extended through the British Empire.
In 1949 the French writer Simone de Beauvoir published The Second Sex (translated into English in 1953).
187 M. Wollstonecraft, A Vindication of the Rights of Woman, New York: Norton, 1972.
188 S. Moore Grimke, Letters on the Equality of the Sexes, Boston: Isaac Knapp, 25, Cornhill, 1838.
189 “Ain't I a Woman?” is the title attributed to a speech delivered extemporaneously by Sojourner Truth (1797–1883), who was born into slavery in New York State, at the Women’s Conference in Akron, Ohio, on May 29, 1851. After obtaining her freedom in 1827, she became a famous anti-slavery speaker. A transcription of the speech appeared in the Anti-Slavery Bugle on June 21, 1851.
190 E. C. Stanton, The Woman's Bible, European Publishing Company, 1895.
191 Emmeline Pankhurst (born Emmeline Goulden; 15 July 1858 – 14 June 1928) was a British political campaigner and leader of the British suffragette movement, which helped women gain the right to vote.
All of these women, and many others, were recognized as major writers, intellectuals and politicians long before popular feminists such as Betty Friedan and Gloria Steinem in the US and Germaine Greer in the UK. Feminism has been, and remains, the struggle to make life just for everyone, to dismantle gender roles that demean both genders, and to end gender stereotyping.
To summarize, it is important to understand that the study of gender and language use builds on beliefs held by preceding generations. The question of sex and gender is more intricate and vast than any of us can fully grasp, because it is bound up with billions of personal lives and affects the female experience as well as the male experience.
organization in Western culture could be perceived in the sociolinguistic writings of the time, especially in the prominent work of Otto Jespersen.192 This Danish linguist maintained that women spoke differently from men because they were incapable of speaking in powerful, logical sentences or of using a wide vocabulary. He was convinced that the greatest orators of history were always and only men because of inborn skills that were rarely (if ever) present in women. Many at that time agreed with him. Nevertheless, there were others who held different views. For instance, Virginia Woolf193 maintained that women’s absence from positions of real-world authority was due to the lack of opportunity and of appropriate preparation for women, and not to the fact that those born female were intrinsically incapable of being inventive or advanced in thought and language. Later feminists started to condemn other sexist positions, starting with sexism in the language itself.
192 O. Jespersen, Language: Its Nature, Development and Origin, London: Allen & Unwin, 1922.
193 V. Woolf, A Room of One’s Own, London: Penguin, 1928.
2.4 Sexist speech
Sexism is a term that originated in the 1960s, by analogy with the term racism, to denote discrimination in society based on specific individual characteristics, such as being female or male or being black or white. Sexism is linked to the historical patriarchal hierarchy in which men are regarded as the norm and women are marked as something distinct from the norm. In this view, the other can be oppressed, subjugated, controlled or even isolated because of his or her difference from what is judged to be the common experience. In gender analysis, this phenomenon of othering is typically related to women, but it may also be based on other individual traits, such as ethnicity, religious convictions, sexual orientation or disability - anything that is perceived to be different from the hypothetical norm.
Suspicion and condemnation of sexist language have arisen from a concern that language is an influential medium through which the world is both represented and created. One instance of gender bias in language use is the case of pronouns, especially the generic use of “he” or “him” to refer to both men and women. Feminist linguists, such as Dale Spender,194 hold that language has traditionally been man-made, with male forms mirroring men’s status in society and female forms treated as atypical. Some have maintained that the use of generic words (like “mankind” to refer to both men and women) strengthens a dualism that treats the male and masculine as the standard and the female and feminine as the “nonstandard”. Lexical markings of this kind are also said to have prevented women from communicating and raising consciousness about their own experiences and from speaking with their own voice.195 Women’s invisibility in language, and the silence that goes with it, keeps gender attitudes alive in society, so that we come to consider what is male as the only point of reference and to perceive what is female as a variant of the mainstream human experience. It is this kind of discrimination that has inspired much feminist scholarship.
194 D. Spender, Women of Ideas: And What Men Have Done To Them, Toronto: Harper Collins Canada, 1990.
Sexist language also represents stereotypes of both females and males, occasionally portraying the difficulties of males, but more often focusing on the weaknesses of females.
Sexism can be seen in languages across the world. In English, Robin Lakoff196 cites the example of “master” as opposed to “mistress”. The two terms carry asymmetrical connotations: “master” has forceful and authoritative connotations, whereas “mistress” does not. Feminists have also condemned the use of generic terminology because it cannot be truly generic and, as a consequence, too often strengthens patriarchy. David Graddol and Joan Swann197 have investigated sexist language and how sexist terminology impedes our comprehension of the human experience by always viewing lived experience through a gendered lens. Expressions such as ‘woman doctor’ or ‘male nurse’ attach a gendered modifier to the noun, marking the person as an exception and supporting a tendency to associate certain occupations with implicit gender roles.
195 C. Gilligan, In a Different Voice: Psychological Theory and Women’s Development, Cambridge, MA: Harvard University Press, 1982.
196 R. Lakoff, Language and Woman’s Place, New York: Harper and Row, 1975.
197 D. Graddol and J. Swann, Gender Voices, London: Blackwell, 1989.
Sexist language also entails the representation of women as submissive objects rather than active subjects, and it foregrounds women’s appearance (“a blonde”) or domestic roles (“a mother of two”) where analogous descriptions would not be made of men in similar circumstances. Such portrayals belittle women’s lives and increase the level of personal judgment passed on them. Men can also be belittled and critically judged by sexist language (for instance, “what a stud”), but feminists claim that the implications in such examples are not as cruel or restrictive as they are for women. Indeed, regarding a man as an object of desire might be perceived by both men and women as flirtatious, positive and even gratifying.
Ultimately, feminist linguists expect that attention to language will challenge long-standing male advantages, together with the patriarchal order that protects them and fixes gender roles for both males and females. But sexist language is not located only in the denotation or connotation of particular terms or expressions. It may also be identified in discourse, in our conversations and in the meanings produced by our speech styles or patterns. Language varies from one situation to another, from one culture to another and from one period of time to another. Language varies as a consequence of political, social and economic events. It changes as a result of lifestyle transformations and encounters with migration, media and technology.
Language can be employed as an instrument of social change, but it also reveals current opinions and attitudes, which makes it an important matter for sociolinguists. Language use discloses our awareness, or lack of awareness, of human complexity. Although the most evident purpose of language is to convey information, language also serves two other essential functions. One is to define and generate social identity and the other is to create and preserve social relationships. These functions may often go unnoticed because gender is conveyed not so much through what we say as through the way in which we say it.
Gendered language was not regarded as a significant field of study until the 1960s, and it did not emerge as a subject in its own right until the publication of Robin Lakoff’s Language and Woman’s Place in 1975. Lakoff’s work proposed generalizations about the speech of white, heterosexual, middle-class American women, in the form that she named “woman’s language”. Lakoff maintained that women use specific language features because they are denied means of strong expression within a male-dominated culture. Many studies following Lakoff’s analysis also concentrated on white, heterosexual, middle-class women, and many of them accepted and adopted Lakoff’s interpretation and conclusions about gendered language use. Other accounts, however, also began to emerge. For instance, Deborah Tannen’s work in the 1990s198 claimed that gender differences are comparable to cross-cultural differences. She maintained that men and women rely on different sub-cultural rules when decoding the cultural messages encoded in language use. She argued, for example, that female-based subcultures regularly use language to build personal relationships rather than hierarchical ones. For Tannen, the differences in language use between men and women derive from intention or motivation and not from the dominant position of men in society.
During the 1990s, the analysis of the relationship between gender and language shifted toward considering language as performative of gender identity: people were seen as able to create gender through their own speech, and to do so in a multiplicity of ways. Penelope Eckert,199 for instance, claimed that the language used in a Detroit high school produced a limited range of social identities for female students (such as “girl” or “white”) because certain designations (for example being a star athlete) were socially unobtainable by girls. Gender and language researchers began to see gender as a way of negotiating established gender roles within particular groups of people. Mary Talbot200 argued that the use of language produces gender differences rather than merely reproducing them. She used the phrase “language-as-mirror” to explain how language reveals our feelings and opinions. Calling an adult woman “girl” reflects a social attitude toward women that perceives them as less threatening when childlike. In this sense, what we say and what is said to us are deeply influenced by our gender.
198 D. Tannen, You Just Don’t Understand: Women and Men in Conversation, New York: William Morrow, 1991; Talking from 9 to 5, London: Virago, 1995; The Argument Culture, London: Virago, 1998.
199 P. Eckert, “The whole woman: Sex and gender differences in variation”, Language Variation and Change I, 245-67.
200 M. Talbot, Language and Gender: An Introduction, Cambridge: Polity Press, 1998.
2.4.1 Gendered language
Two of the lasting questions in the study of gender and language are how language use mirrors our opinions about men and women and how language shapes our attitudes toward gender roles.
Consider how women in many workplaces are regularly subordinate to men in those same settings. Or think about the way in which women are often referred to in public talk as “girl” (for example, “there is a new girl in our office”), while men are rarely referred to as “boy” in similar circumstances.
In the first half of the 20th century, the American linguists and anthropologists Edward Sapir and Benjamin Whorf argued that “the confines of one’s language are the confines of one’s world”.201 This is well known as the Sapir-Whorf hypothesis, according to which there is a connection between a person’s language use and how she or he perceives the world. This hypothesis (known in its radical version as linguistic determinism) holds that language use shapes our experiences, which cannot lie outside it. Recent scholars, however, are increasingly critical of this position. Even if we do not have a term for a certain concept or sensation, this does not mean we cannot experience it. For instance, a woman in the 19th century might well have experienced “domestic abuse” before this term existed. Experiences like hers may indeed have contributed to the origin of the term. In this sense, it seems logical to think that language use is a continuously changing phenomenon. The way we speak suggests what we are, but the motivations behind our language choices may be complex.
201 E. Sapir, The status of linguistics as a science, 1929, in E. Sapir, Culture, Language and Personality (D.G. Mandelbaum, Ed.), Berkeley: University of California Press, 1958.
Viewing language only as a mirror implies that we can only convey what we experience and that we can only experience what we have words for. However, we can and do use language to modify our standpoints or our perception of reality, often and throughout our lives. Language use has a lot to do with this. The study of language and gender is closely tied to the feminist interest in preventing the reproduction of the accepted disparity between men and women. Language plays a complex role in reproducing and constructing gender. Therefore the study of language can play an important part not only in revealing gender divisions, but also in reducing them.
Numerous sociolinguistic studies have revealed differences in the way language is used by men and by women. According to these studies, women consistently use more linguistic features associated with standard language. These studies adopt what is known as the variationist perspective. Probably the best-known work of this type was carried out by the American William Labov,202 who investigated linguistic change within the kind of study known as variationist sociolinguistics. His work on language change examined different demographic groups, which eventually led him to maintain that men speak with more versatility, and in a more informal way, than women do. This observation suggests that men are more at ease in their social settings and that women are more anxious in social circumstances owing to their need to attain or preserve social status. Moreover, women use more standard language because in this way they may gain access to authority; men, by contrast, are free to be inventive with language because they already have authority. Others, of course, disagreed with this analysis, suggesting instead that gender-based linguistic differences arise because women more often prefer standard language forms, which are more widely understood than others. Peter Trudgill203 conducted a study in Norwich, England, inspired by Labov’s New York study, which sought to establish differences in formality along gender lines. Trudgill discovered that women claimed to use more standard language than they really did, while men claimed to use more colloquial speech than they really did. In response, Trudgill suggested that since we already consider women to be more status-conscious than men, we interpret their speech patterns accordingly. But why would we do this? One reason lies in our view of women’s social position: we believe that women have to obtain and maintain social status, and that they use language as a means of doing so. Another explanation comes from our tendency to think of men in terms of their profession or earning power: we do not see men as using standard or formal language to acquire their social position, because it is already fairly secure. As a general rule, we tend to think of men as being more self-sufficient, and of women as more supportive of others. This, then, is what we are likely to perceive when we examine their speech.
202 W. Labov, The Social Stratification of English in New York City, Washington DC: Center for Applied Linguistics, 1966.
203 P. Trudgill, Sociolinguistics: An Introduction, Harmondsworth: Penguin, 1974.
2.4.2 Women and men talking
According to Mark Peters,204 there is a lasting and persistent myth that women use more words per day than men. Peters reports the claim that women use approximately 7000 words per day compared with just 2000 for men. He then draws on the critiques of Deborah Cameron, who identifies sexism as lying behind our inclination to suppose that women talk more than men. In truth, no academic study has found that women talk more than men.
categorization of sex-preferential speech; moreover she claimed that sex-based trends were not static but rather variable. Years later, Deborah Tannen206 also applied sex-preferential speech theory in her conception of women and men as belonging to different gender cultures. In his work Men are from Mars, Women are from Venus,207 John Gray aligns sex-preferential speech styles with an essentialist view: “women are like that” (suggesting that women are more cooperative in their communication habits) and “men are like this” (suggesting that men are more antagonistic in their communication style).
Both Tannen and Gray continue to be central figures in the analysis and description of communication difficulties between the sexes. Their fame probably owes much to their skill in simplifying and in linking gender to sex-based characteristics. However, any person can use radically different speech styles in different situations, with different people and with different intentions.
204 M. Peters, “The Math on Miss Motor Mouth”, Psychology Today, March/April 2007.
205 A. Bodine, “Androcentrism in prescriptive grammar: Singular ‘they’, sex-indefinite ‘he’, and ‘he or she’”, Language in Society, 4: 129-46, 1975.
206 D. Tannen, You Just Don’t Understand: Women and Men in Conversation, New York: William Morrow, 1991.
207 J. Gray, Men are from Mars, Women are from Venus: A Practical Guide for Improving Communication and Getting What You Want, New York: Harper Collins, 1994.
Language use is always connected to the situation in which it occurs: the reason, the moment, the participants. One’s language use will also differ according to the requirements of the social context, for example the degree of formality and the topic of discussion, and according to the relationship between speaker and listener(s). For instance, the language we use in a job interview or at an academic conference is very different from the language we use with our friends when talking about a film. In this view, sex-preferential speech models are not very useful; no one is purely the result of gender tendencies. We are all more than simply our gender.
We are various versions of ourselves, in different circumstances and with a variety of intentions. Suffice it to say that our use of language mirrors our individual complexity and our ability to adapt to various situations.
2.4.3 Critical Discourse Analysis: the male as model
What seems conventional or perhaps even normal in our daily lives as women or men is often a product of culturally created roles. These roles or identities can be artificial and disempowering. Gender difference comes to be “normative”, so that we believe that what is feminine is appeasing, supportive and person- or relationship-oriented, whereas what is masculine is more antagonistic, hostile and task-oriented. Our sense of gender is reinforced throughout our lives by means of reciprocal exchanges and relations.
In every group of people, there are a variety of frames that “demarcate” conversations and define the reality of things. This practice of framing meaning and creating topics is what we identify as “discourse”. Critical Discourse Analysis (CDA) is one of the major methods used to examine the social construction of gender roles in language use. The purpose of CDA is to study the way in which a specific conversation may be the expression of social relations of power. CDA is very useful in gender and language studies owing to its focus on power. The French philosopher and social thinker Michel Foucault208 considered discourse as both opportunity and limitation, and as embedded in particular power relations. For example, people in the legal profession share understandings, key vocabulary and specific procedures, as well as social identities, in their discourse.
Legal discourse circumscribes what is lawful and unlawful; these attitudes are historically established and consolidated by the everyday use of particular words, terms and expressions in discussions of legal issues. Foucault held that discourse reveals who has power and, consequently, knowledge. In his view, those who lead any organization, group or society uphold their power and status through the use of particular discourse(s); they set the limits of knowledge, and they do this mainly through language use.
208 M. Foucault, The Archaeology of Knowledge and Discourse on Language, New York: Pantheon Books, 1972.
Though CDA tries to account for the multiplicity of subject positions and power relationships, it is inclined to assume a binary position in which males are dominant and females are subordinate. Judith Baxter209 proposed an alternative approach: Feminist Post-structuralist Discourse Analysis (FPDA). Her method contests the conventional feminist stance according to which women are often disempowered in mixed-sex situations. By means of an FPDA approach to studying speech, one can see that power relations are structured in considerably complex ways and that women are well aware of the specific roles they choose to play in these relations.
209 J. Baxter, Positioning Gender in Discourse: A Feminist Methodology, Basingstoke: Palgrave Macmillan, 2003.
2.5 Gender identity and media
It is easy to see that in industrial societies gender is associated with sexualization. Conventional types of feminine and masculine representation are produced by the mass media to create customers for particular products. Being feminine or masculine entails specific ways of behaving and consuming; advertising generates in us specific needs tied to a gendered identity: when we go shopping, we make certain choices to gender ourselves by buying the products promoted to us. Much media content and imagery reproduces the power relations in society and exploits our aspirations to draw our attention to particular products. Fashion and make-up corporations above all rely on gendered identities to promote their products. David Gauntlett210 identifies a number of issues relating to gender identity and the media, including the variability of our gender identities over time, the weakening of the representation of conventional gender roles and the growth of a new “girl power”. Twenty or thirty years ago, research on popular media regularly found that mainstream culture was based on old-fashioned opinions and methods and was opposed to social change. Today, researchers are more willing to regard mass media as a force for change. The usual image of women as housewives has been replaced by appealing “girl power” icons, and masculine models of toughness and self-sufficiency have been undermined by a new emphasis on men’s feelings. These unconventional views and conceptions have shaped different identities, but they also involve new demands and requirements.
210 D. Gauntlett, Media, Gender and Identity: an Introduction, New York: Routledge, 2002.
In the 1990s Mary Pipher211 noticed how adolescent girls absorb society’s messages about appearance and especially slimness. She criticizes advertisers who present impossible standards of female beauty in order to sell perfection. In this way young women come to believe that they are only admirable if their bodies resemble those ideals. According to David Gauntlett,212 the media spread innumerable messages about identity and appropriate forms of self-expression, sexuality, gender and lifestyle. However, we have our own set of different positions on these topics, which can vary as we move through different phases of life. The media’s suggestions are attractive, but they cannot simply overwhelm contrasting positions in the audience. Moreover, even if we agree that many media sources support conventional hierarchical ideas of femininity and profit from them, we cannot overlook the contribution and choice that women themselves make: they enjoy the products and choose to buy them.213 As a result, we can suppose that particular messages will be internalized by many, even if rejected by some readers or viewers. Stacy Smith conducted a study214 at the University of Southern California, Los Angeles, in which she examined G-rated (G for General family viewing) films and the representation of female and male characters in movies marketed to children. Smith and her group examined 101 top-grossing family-rated movies released from 1990 to 2004, analyzing a total of 4249 speaking characters in both animated and live-action films. The study highlighted that, in general, three out of four characters (75%) are male, whilst fewer than one in three (28%) speaking characters are female. Fewer than one in five (17%) characters in crowd scenes are female and more than four out of five (83%) film narrators are male.
211 M. Pipher, Reviving Ophelia: Saving the souls of Adolescent Girls, New York: Ballantine Books, 1994.
212 D. Gauntlett, Media, Gender and Identity: an Introduction, New York: Routledge, 2002.
213 See C. Caldas-Coulthard, Man in the news: The misrepresentation of women in the news-as-narrative discourse, in S. Mills, Language and Gender: Interdisciplinary Perspectives, Harlow: Longman, 1995, pp. 226-39.
214 See S. L. Smith, Where the Girls Aren’t: Gender Disparity Saturates G-Rated Films, Program Brief, See Jane Program at Dads and Daughters, February 2006, www.seejane.org and G Movies Give Boys a D: Portraying Males as Dominant, Disconnected and Dangerous, Program Brief, See Jane Program at Dads and Daughters, May 2006, www.seejane.org.
For all of us, but particularly for children, images and stories have a great effect on the perception of what it means to be human, whether male or female. In a 2003 national survey in the United States, the Kaiser Family Foundation215 found that half of all children aged zero to six watch at least one DVD movie per day. In view of this, G-rated movie DVDs may well influence children’s early social understanding of gender roles, all the more so because children tend to watch the same movie repeatedly.
Other studies have examined children’s television viewing habits and warn that gender expectations can be distorted.216 Since women and girls make up half of the human race, a wider range of female characters in children’s earliest experiences of the media is crucial for the development of both girls and boys. If both boys and girls saw more female characters in these widely distributed media, we might all gain a fuller awareness of the possible ways of being human.
215 The Kaiser Family Foundation (KFF) was founded in 1948 by Henry J. Kaiser. It was initially set up in Oakland and later moved to its present site in Menlo Park, California. KFF is a non-profit, private foundation concerned with the most important health care issues facing the U.S. and with the U.S. role in global health policy. For more on the 2003 survey, see http://www.kff.org/entmedia/3378.cfm.
216 J. Herret-Skjellum and M. Allen, “Television programming and sex-stereotyping: A meta-analysis”, Communication Yearbook 19, 157-85, 1996.
2.5.1 Advertising and gender
The media portrayal of female beauty is unachievable for most women. Jean Kilbourne217 reports that women’s magazines carry ten and a half times more advertisements and articles promoting weight loss than men’s magazines do, and that more than 75 percent of women’s magazine covers contain at least one message about how to change a woman’s physical appearance. The bombardment of messages about slimness, dieting and attractiveness tells ordinary women that they are constantly in need of improvement; women absorb these stereotypes and judge themselves harshly. Kilbourne’s study highlights the false promise of advertising, which leaves us never fulfilled: we can always get something better. The bombardment of advertising (some 3000 advertisements produced per day) influences young people in particular, generating a “habit-forming mentality” that often persists throughout life. Standards of beauty are imposed on women, most of whom are physically larger and older than any of the models, because by offering an archetype that is difficult to attain and maintain, the cosmetics and diet industries are guaranteed expansion and profits. Women who are insecure about their bodies are more likely to buy beauty products, new clothing and diet aids. Exposure to images of skinny, youthful, air-brushed female bodies is linked to depression, loss of self-confidence and a felt need to do whatever is required to conform.
217 J. Kilbourne, Can’t Buy My Love: How Advertising Changes the Way We Think and Feel, Washington, DC: Free Press, 2000.
In contemporary industrialized societies, gender identities are deeply shaped by the consumerist interests that drive the market. These interests influence our daily relations and how we see ourselves, and they are a major influence on our behavior at work and at home. “Consumer gender” is essentially a creation of the mass media in which we all participate. We spend our time and money fashioning ourselves into acceptable versions of ourselves, which entails much struggle and, more significantly, considerable cost. The creation of gender identities in advertising, or consumer gender, is both captivating and troubling, and the practice is extremely manipulative. Varieties of femininity and masculinity are set up and reinforced through the presence of, and emphasis on, a hypothetical audience. While the media makers create an ideal, we, the consumers, position ourselves in relation to that ideal.218 On the whole, the media are driven by the market and by our own attitudes and beliefs: they cannot generate what is not already there, but they can produce more extreme forms of it. Advertising intensifies our basic human beliefs and desires in order to create an even greater need for particular products.
218 N. Fairclough, Language and Power, London: Longman, 1989; Discourse and Social Change, Cambridge: Polity Press, 1992; Media Discourse, London: Arnold, 1995.
2.5.2 Media discourse
Although there are magazines and television programmes intended for men, the variety of media directed at women is astonishing: food channels, fashion magazines, celebrity magazines, women’s home-making magazines, daytime talk shows, daytime soap operas. Lia Litosseliti219 lists various UK women’s magazines, each of which addresses its ideal audience. In each case, the magazine uses a slogan to tag itself as related to that audience. Consider the use of language on the covers of these magazines:
• Wench: Where women are, where they are going, and where they should be already;
• Cosmopolitan: The world's n. 1 magazine for young women;
• She: For women who know what they want;
• B: Everything you want;
• Woman's Own: For the way you live your life and the way you'd like to;
• Company: For your freedom years;
• Minx: For girls with a lust for life;
• Executive woman: For women who really do mean business.
Litosseliti underlines the use of personal pronouns (you, your, we) as a frequent feature which assumes, and therefore creates, a relationship between the publisher and the customer. Consequently, a shared ideal is invented and developed. The use of personal expressions in magazine articles (for instance “most of us” or “we’ve all done it”) creates a manufactured bond with the reader that engenders a sort of solidarity or, following Mary Talbot’s definition of this kind of complicity, a synthetic sisterhood. This synthetic reproduction of a feminine-gender community is situated in an ideal location where economic or social disparities are not made explicit. Consequently, every woman can think of herself as a member of the sisterhood being promoted. Men’s products work in a comparable way: a community is presupposed, then constituted, and then a relationship develops with particular key products that enhance the gender identity of their consumers.
219 L. Litosseliti, Gender and Language: Theory and Practice, London: Arnold, 2006.
Talbot is critical of the construction of a consumer femininity, especially because it depicts women as powerless, susceptible consumers. In order to feel right, women must look a certain way (new each year) and share collective ideals and preferences if they are to be fully accepted.
By comparison, men are usually portrayed in situations of power, autonomy and self-determination, so that women are the ones who are harmed most by the media’s representation of them. The demands on men might be as severe as those on women, but their potential rewards are more empowering. Men can expect to be rich and powerful, whereas all that women can hope for is to be physically attractive and, ideally, slim. Although recent research on masculinity in the media has concentrated on violent behavior in television and movies, some studies have started to look at the representation of masculinity in men’s magazines such as Maxim, GQ and Esquire. These magazines also help to define what it means to be a contemporary man, through a similar use of an artificial community.
220 M. Talbot, A synthetic sisterhood: false friends in a teenage magazine, in K. Hall and M. Bucholtz, Gender Articulated: Language and the Socially Constructed Self, New York: Routledge, 1995, pp. 143-68.
The majority of our favorite magazines, TV programs and films presume a heterosexual audience and reinforce the assumption that women and men are naturally different and, as a direct consequence, quite divergent. Men and women, boys and girls, are set against each other in the media in terms of beliefs, styles and attitudes. This in turn shapes our perception of gender differences: we come to regard the conventional depiction of gender as normal. However, a new version of masculinity has emerged which is concerned with relationships, fashion, health, fitness and appearance, much like what is typically associated with women. This “new man” is in some ways a constructive development, but also a confusing one. On the one hand, he is associated with a conventional masculinity founded on male success, wealth, power and heterosexual desire; on the other hand, he is expected to be close to his friends and his family. There is a tension here between conventional and progressive ideas of masculinity, similar to the conflicts in the representations of women.
Some point out that the language of marketing more often positions women (the shoppers) whereas regarding men as self-sufficient and
Most advertisements are directed at a specific gendered customer, and we are “spoken to” in distinctly gendered ways. The pictures in magazines and advertisements create an imaginary world in which identity is achieved by means of clothing, makeup and other gender-targeted goods. The phraseology used to advertise products connects with our gendered worlds.
Nowadays, however, new questions about how to be a man are emerging, the argument apparently being that men are now used to serve women’s purposes (for instance to help around the house or to provide them with money, travel and so on). Michelle Lazar describes this tendency as a discourse of popular post-feminism, or a universal neo-liberal discourse of post-feminism,222 which maintains that, now that some important gains for women have been made, feminism has achieved its aim and should consequently be abandoned: the victim is now the man. However, these claims of supposed equality between the sexes, or even of a reversal of gender roles and power, disguise the real gender disparity that persists in women’s and men’s everyday experience.
It is doubtful that news producers and the press are indifferent to gender representations. Sara Mills223 explores how news media texts are authored and how several factors bear on what is conveyed, how, and by whom (a woman or a man). Both talk shows and news bulletins are interesting sites for gender research. Mills maintains that male television presenters are perceived as more convincing, whereas women more frequently present “soft” human-interest reports rather than “hard” news. It is captivating but also problematic to reflect on the ways in which gender is employed and promoted in the media. The media function as both a mirror and an instrument of gender stereotyping, so their power to persuade and influence deserves consideration. Gender imagery is used to promote products. If a wish or need can be created, then a product can be advertised to fulfill it. It is the advertising of an ideal (by means of imagery as well as verbal communication) that creates the need. It is the particular use of constructed gender ideals in both pictures and speech that makes the analysis of gender and language in the media exceptionally interesting. However, a post-structuralist view would claim that people are not imprisoned in such positions and are able to assume new positions and roles. By questioning, testing and overturning, we can create ever-new ways of being in the world.
221 See G. Myers, Words in Ads, London: Arnold, 1994 and Ad Worlds: Brands, Media, Audiences, London: Arnold, 1998.
222 M. Lazar, Feminist Critical Discourse Analysis: Studies in Gender, Power and Ideology, London: Palgrave Macmillan, 2005.
223 S. Mills, “Third wave feminist linguistics and the analysis of sexism”, Discourse Analysis Online, 2003, www.extra.shu.ac.uk.
Chapter 3 Man and woman in the British Press: a corpus study
3.1 The British daily press: tabloids vs. broadsheet
Before starting the analysis, it is necessary to summarize the most important differences between the two kinds of British newspapers consulted in this work. British newspapers are generally divided into two categories: tabloid and broadsheet, or popular and quality papers. While the first pair of terms relates simply to format, the second hints at more conceptual differences between the two groups. There are evident differences in content, for instance in the space dedicated to different news items, how much gossip is included and so on. But there are also differences in language use. Several studies have sought to establish whether popular and quality newspapers have typical semantic styles, and whether these styles can be associated with the values, beliefs and attitudes characteristic of specific social groups. Jucker, for instance, rejects the conventional dichotomy between quality and popular, since “what counts as ‘quality’ in one type of paper may not be desirable as an aim for the other types of papers”, and because “a look at the circulation figures (…) reveals that ‘popularity’ does not provide a reliable criterion to distinguish between newspaper categories”.224 Therefore, he prefers the term ‘up-market’ for the quality press and draws a further distinction between ‘down-market’ (e.g. The Sun, The Star and The Daily Mirror) and ‘mid-market’ (e.g. The Daily Express, The Daily Mail) popular newspapers, in line with a previous analysis by Henry.225 Jucker also remarks that “the up-market papers are exactly those that are published in broadsheet format”.226 In reality, this is no longer the case: The Times and The Independent are now available in tabloid format.
Another study of the differences between the quality and the popular press was carried out by Bednarek.227 She studied a corpus of 100 news stories taken from the most important British national daily newspapers: five quality newspapers (The Financial Times, The Guardian, The Independent, The Times, The Daily Telegraph) and five popular newspapers (The Sun, The Star, The Daily Mail, The Daily Mirror, The Daily Express). She found that popular and quality newspapers “exhibit a distinct evaluative style, characterised by mitigation and negation in the case of the broadsheets, and by EMOTIVITY, unexpectedness and references to emotion in the case of the tabloids”.228 Bednarek explains her findings in terms of the values of the newspapers’ audiences: broadsheet newsmakers use a less direct, evocative, moderate and critical style in order to attract the educated and cultured readers who make up their target audience, while the tabloid newspapers choose a more straightforward, intense, emotive and simpler style with the intention of attracting a broader, less erudite and less wealthy audience.
224 A. H. Jucker, Social Stylistics. Syntactic Variation in British Newspapers, New York: Mouton de Gruyter, 1992, p. 47.
225 H. Henry, “Are the National Newspapers Polarising?”, Admap 19.10: 484-491, 1983.
226 A. H. Jucker, Social Stylistics. Syntactic Variation in British Newspapers, New York: Mouton de Gruyter, 1992, p. 47.
227 M. Bednarek, Evaluation in Media Discourse. Analysis of a Newspaper Corpus. London: Continuum, 2006.
228 Ibidem, pp. 203-204.
3.2 A brief survey of previous studies on the representation of gendered items in corpora
Several studies have examined gendered items in corpora, especially with respect to disparity and sexism. Kjellmer229 looked at the frequency and distribution of masculine and feminine pronouns, along with the terms man/men and woman/women, in the 1961 Brown and London–Oslo–Bergen (LOB) corpora. He found that, in general, there were more ‘masculine’ items than ‘feminine’ ones in both corpora. Sigley and Holmes230 also analyzed the relative frequencies of man/men and woman/women in the Brown and LOB corpora, along with the Wellington Corpus of Written New Zealand English (1986–90), the Freiburg–Brown Corpus of American English (1991–2) and the Freiburg–LOB Corpus of British English (1990–1). They found that occurrences of women in writing doubled between the early 1960s and the early 1990s, while references to man/men decreased considerably. But references to women as individuals remained less frequent than references to individual men. Indeed, when they considered the proportion of singular to plural, some remarkable disparities emerged: man is over one and a half times more frequent than the plural men, but the plural women is almost twice as frequent as woman. It appears that adult males are more frequently mentioned in the singular, while adult females are more typically referred to as a group. Furthermore, they observed a sort of ‘masculine bias’, with man occurring over one and a half times more often than woman. Although helpful in indicating how frequently men and women are talked and written about, such approaches tend to be very general and lacking in detail, and are consequently limited.
Corpus research on language and gender, however, has not been restricted to the mere counting of lexical items. A primary interest of analysts has been collocation, the phenomenon of particular words regularly occurring in close proximity.231 Romaine232 (2000) illustrated how sexism in language can be demonstrated with collocational evidence. English has numerous pairs of gendered terms which display a semantic and discursive asymmetry. These include master and mistress, god and goddess, governor and governess, wizard and witch, and bachelor and spinster. Romaine investigated the collocates of bachelor and spinster in the BNC, paying particular attention to one grammatical relation: the adjectives modifying spinster. Citing examples such as gossipy, nervy, ineffective, jealous, eccentric, frustrated, repressed, lonely, prim, coldhearted and despised, Romaine observed that these are generally negative or pejorative. She maintains that such disparities extend even to the basic terms for male and female human beings. She examined the terms man/woman and boy/girl and found that words with negative connotations are used more often with woman/girl than with man/boy.
Similar claims, based on corpus evidence, have been made by others. For instance, Caldas-Coulthard and Moon233 looked at the adjectives modifying the terms man and woman in a corpus of British newspaper articles, and they found that only woman is modified to any considerable extent by adjectives denoting physical appearance (e.g., beautiful, pretty and lovely), while man is modified by adjectives suggesting influence, dominance, authority and power (e.g., key, big and main). Collocational patterns like these can reveal the associations and connotations of words and, consequently, the assumptions they embody.234 Moreover, Caldas-Coulthard235 examined the way in which women are cited in a corpus of 200 articles from three British broadsheet newspapers (the Independent, the Guardian, and the Times). She used the Times subcorpus of the Bank of English to support her examination. She investigated how often women are referred to as ‘a sayer’ in comparison with men. In the Times subcorpus, women are cited 62 times and men 497 times (she considered only verbs of saying which occur more than 100 times). This asymmetry is replicated in her own data: women speak in 76 cases and men in 451. More remarkably, women are represented differently from men: male speakers are generally represented in terms of their professional titles or their status in the administration or in various public organizations, while female speakers are usually portrayed in terms of age, marital status or family relationships.
Clark examined how the Sun describes women in accounts of male violence, especially sexual offences against women, by looking at how women victims are named. She states that “Naming is a powerful ideological tool. It is also an accurate pointer to the ideology of the namer. Different names for an object represent different ways of perceiving it.”236 She observed that women victims are classified according to their sexual availability: victims who are sexually unavailable are labeled as wife, bride, housewife, mother, young woman, girl, school girl, girl guide, daughter. In these cases, the male attacker is vilified as a ‘beast’, ‘fiend’ or ‘monster’ and held responsible for the offense. By contrast, victims who are sexually available are labeled as blonde, unmarried mum, Lolita, blonde divorcee/mum. The women in this class are not constrained by men, so the male offender is not treated as a villain and responsibility is transferred to the female victims, who are seen as having brought the violence upon themselves by being sexually available. One of the conclusions Clark draws from the study is that this way of naming victims mirrors a patriarchal perception of women not as self-sufficient individuals but rather as possessions of men.
229 G. Kjellmer, “‘The lesser man’: observations on the role of women in modern English writings”, in J. Aarts and W. Meijs (eds.) Corpus Linguistics II, pp. 163-176, Amsterdam: Rodopi, 1986.
230 R. Sigley and J. Holmes, “Girl-watching in corpora of English”, Journal of English Linguistics 30 (2), pp. 138–57, 2002.
231 P. Baker, Using Corpora in Discourse Analysis, London: Continuum, p. 96, 2006.
232 S. Romaine, Language in Society: An Introduction to Sociolinguistics (second edition). Oxford: Oxford University Press, 2000.
233 C.R. Caldas-Coulthard and R. Moon, ‘Curvy, Hunky, Kinky: Using Corpora as Tools in Critical Analysis’, paper presented at the Critical Discourse Analysis Symposium, University of Birmingham, April 1999.
234 M. Stubbs, Text and Corpus Analysis, London: Blackwell, p. 172, 1996.
235 C.R. Caldas-Coulthard, “From discourse analysis to critical discourse analysis: the differential re-presentation of women and men speaking in written news” in Baker et al. (eds): 196-208, 1993.
236 K. Clark, “The linguistics of blame: the representation of women in the Sun’s reporting of crimes of sexual violence”, in Toolan M. (ed.) Language, Text and Context: 208-224, London: Routledge, p. 209, 1992.
3.3 Methodology: description of the corpus investigated
This dissertation investigates how the British press represents dominant social views and attitudes concerning men and women. Corpus investigation techniques are applied to a corpus that I compiled personally, containing 1,000 articles from the online versions of two British newspapers, The Guardian and The Daily Mail, the first of which is a broadsheet and the second a tabloid. The corpus contains 102,119 tokens (see Tab. 2 below). It is a diachronic corpus which, in Hunston's words, is “a corpus of texts from different periods of time. It is used to trace the development of aspects of a language over time.”237 The articles were published in two periods: January/December 2001 and January/December 2011. For each period, 500 articles including the words man and woman were randomly chosen. For this reason I have decided to call it the “Man and Woman in the British Press” Corpus (henceforth MWBPC). The size of the corpus is very modest, but it is sufficient for the purpose and scale of the present study.
237 S. Hunston, Corpora in Applied Linguistics. Cambridge: Cambridge University Press, p. 16, 2002.
Articles containing the words man
Daily Mail 2001
Daily Mail 2011
Tab. 1 illustrates the number of articles per section in the MWBPC containing the words man and woman.
Tokens in the corpus man
Daily Mail 2001
Daily Mail 2011
Tab. 2 illustrates the tokens per section of the MWBPC divided by newspaper, period and topic.
WordSmith Tools has been used as the main analytical tool. The BNC has been used to test and extend the observations drawn from the MWBPC.
3.4 Methodology: description of the British National Corpus
The British National Corpus (BNC) is a 100-million-word general corpus designed to represent contemporary British English by including as many text types as possible.238 The written part of the BNC (which represents 90 percent of the corpus) consists of “extracts from regional and national newspapers, specialist periodicals and journal… academic books and popular fiction, published and unpublished letters and memoranda, school and university essays”.239 The spoken part (which makes up 10 percent of the corpus) comprises “unscripted informal conversation, recorded by volunteers selected from different age, region and social classes. . . together with spoken language collected in . . . different contexts, ranging from formal business or government meetings to radio shows and phone-ins”.240 Its ‘heterogeneric’ character,241 in terms of the varieties of texts and speech events it includes, makes the BNC “a repository of cultural information about [British] society as a whole”.242 However, a note of caution needs to be sounded: it is essential to remember that in the wide world of corpus linguistics, the BNC is nowadays somewhat outdated. The creation of the corpus began in 1991 and was completed in 1994. No new samples have been added since the completion of the project, although the BNC was partially revised before the release of the second edition, BNC World (2001), and the third edition, BNC XML Edition (2007).243 This means that the most recent data date from the early 1990s. Strange though it may seem, the BNC is becoming a historical corpus, which means that part of my results about man and woman is unavoidably ‘dated’. An examination of a more ‘modern’ corpus may well produce different results. To sum up, the BNC can be considered a monolingual corpus, as it contains samples of language use in British English only, and a synchronic corpus, since it represents only the language use of the late 20th century.244
238 McEnery et al., Corpus-Based Language Studies, London: Routledge, p. 15, 2006.
See http://www.natcorp.ox.ac.uk/corpus/index.xml?ID=intro BNC website.
A. Partington, The Linguistics of Political Argument, London: Routledge, p. 4, 2003.
S. Hunston, Corpora in Applied Linguistics, Cambridge: Cambridge University Press, p. 117, 2002.
3.5 Methodology: description of WordSmith
As stated above, WordSmith Tools is the software used to carry out the analysis, and the BNC is used as a reference corpus to test what has been observed in the British newspaper corpus. WordSmith Tools is a software package intended principally for linguists, and especially for work in the domain of corpus linguistics. It is a suite of modules for seeking out patterns in a language. The software is also available in several languages. The program package was conceived by the British linguist Mike Scott at the University of Liverpool and first released as version 1.0 in 1996.
WordSmith Tools is a comprehensive concordancing package with three main tools. KeyWords identifies words that occur unusually frequently in a corpus in comparison with a reference corpus, and is useful for characterizing a text or a genre. WordList produces lists of all the words or word forms in the selected corpus, making it possible to study the vocabulary used in a given text. Once a word list has been produced, it is displayed in both alphabetical and frequency order. It is also possible to obtain statistical information, for instance on how often each word occurs in the text files, on the type/token ratio, on the average sentence length and on how many words in the text(s) have a given number of letters. The program can also carry out a lexical comparison of two texts. Concord searches for a word or phrase throughout a collection of chosen files, returning a list of ‘hits’ for the search word in KWIC (key word in context) format, with one hit per line. The search results also provide information about the number of entries, where each entry occurs and in which file, the collocates of the node word, and cluster analyses showing frequent clusters of words (phrases).
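The three operations described above (a frequency-ordered word list, the type/token ratio and a KWIC concordance) can be illustrated with a minimal Python sketch. This is not WordSmith's implementation: the function names, the simple regular-expression tokenizer, the four-word context span and the sample sentences are all assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def word_list(tokens):
    """Return (word, frequency) pairs sorted by descending frequency, then alphabetically."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of running words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def kwic(tokens, node, span=4):
    """Return one line per hit: left context, the node word in brackets, right context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append(f"{left:>30}  [{tok}]  {right}")
    return lines

# Invented sample text, used only to exercise the functions above.
sample = ("She was a strong woman who coped well. "
          "The young woman met a young man. The man left.")
tokens = tokenize(sample)

for word, freq in word_list(tokens)[:5]:
    print(word, freq)
print(round(type_token_ratio(tokens), 3))
for line in kwic(tokens, "woman"):
    print(line)
```

Aligning the node word in a fixed column, as the `kwic` function does, is what makes recurrent left and right neighbours easy to spot by eye.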
3.6 Analysis: a mixed method approach
In this work, I use a mixed method approach. According to Dörnyei,245 a mixed method approach entails the analysis of both quantitative and qualitative information in a single study, with some attempt to combine the two approaches at one or more stages of the research process. Data collection likewise brings together numeric and textual information, so that the final work represents both quantitative and qualitative information.246 From a quantitative point of view, the collocations and their frequencies are computed automatically by a program; the numeric data it provides must then be interpreted, and this involves qualitative judgement in determining the most significant collocates. In this way the quantitative and the qualitative paradigms are integrated.
245 Z. Dörnyei, Research Methods in Applied Linguistics: Quantitative, Qualitative, and Mixed Methodologies, Oxford: Oxford University Press, p. 163, 2007.
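The quantitative step, counting which words occur near a node word, can be sketched as follows. This is an illustrative approximation rather than the procedure WordSmith applies: the `collocates` function, the window size and the sample sentences are assumptions, and the sketch ignores sentence boundaries and applies no statistical measure of association.

```python
import re
from collections import Counter

def collocates(tokens, node, window=3):
    """Count words occurring within `window` tokens to the left or right of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Invented sample text; punctuation is discarded by the tokenizer.
text = ("a young woman spoke to an old man . "
        "the young man thanked the young woman .")
tokens = re.findall(r"[a-z]+", text.lower())

woman_collocates = collocates(tokens, "woman", window=1)
man_collocates = collocates(tokens, "man", window=1)
print(woman_collocates.most_common(3))
print(man_collocates.most_common(3))
```

The raw counts produced in this way are only the starting point: deciding which collocates are meaningful, and reading them in context, is the qualitative part of the analysis.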
3.7 Objectives of the analysis
This analysis focuses on three main grammatical associations: man/woman as subject (e.g., The woman dies), man/woman as object (e.g., The detective arrested the man), and attributive adjectives related to man/woman (e.g., She is a young woman), with the aim of showing how men and women are characterized as agents, as victims and beneficiaries, and also how they are portrayed and categorized. Where noteworthy, other grammatical associations will also be considered. The results show that there are similarities in the collocates of man and woman both within the same period and across time. In particular, the MWBPC evidence shows that adjectives denoting age have been consistently associated with men and women over time, as the BNC data in Figures 1 and 2 confirm.
246 J. W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Method Approaches, Thousand Oaks: Sage Publications, p. 20, 2003.
Figure 1: Adjectival Collocates of man from the BNC.
Figure 2: Adjectival Collocates of woman from the BNC.
Figures 1 and 2 present the top seven adjectival collocates of man and woman, among which young and old occupy the first two positions in both cases. In the MWBPC, however, greater prominence is given to the age of women and to their youthfulness; moreover, this prominence seems to be increasing over time for women and decreasing for men, as illustrated in Table 3 below:
Adjectives denoting age Man
Occurrences Occurrences Occurrences Occurrences 2001
2 27 68
Tab. 3 illustrates the adjectives denoting age referred to man and woman in both 2001 and 2011
Above all, however, the data reveal collocational patterns that mirror gender disparities in the portrayal of men and women in the areas of power and misconduct, social classification, personality and mental ability, physical appearance, and sexual orientation or tendency.
3.8 Discussion of the findings
In this section the results of the analyses of the three main grammatical associations outlined above are explained and discussed. The data are organized into semantic areas dealing with issues connected with gender disparity.
3.8.1 Power and misconduct
Power is not distributed uniformly in society: some people hold and exercise more power than others. The distribution of power between genders is lop-sided: men are, as a general rule, physically stronger than women (or are at least expected to be so), and the bulk of financial assets and of economic and political power is held by men.247 Such disparities are widely reflected in the MWBPC, as reported in the data below.
Adjectives connected with physical size and force are associated predominantly with man: big (5)248, strong (4), strongest (2), tall (1), macho (1), burly (1), large (1), fat (1), portly (1). By contrast, when woman is qualified by the adjective strong (2), a figurative or more general sense is used, as shown in the following example:
247 J.P. Jutting, C. Morrisson, J. Dayton-Johnson and D. Drechsler, “Measuring gender (in)equality: introducing the gender, institutions and development data base (GID)”, OECD Development Centre Working Paper No. 247, 2006, available online at: http://www.oecd.org/dataoecd/17/49/36228820.pdf.
248 The number of occurrences of each word under examination will henceforth be indicated in brackets next to the word.
Ex. 1 She later managed to struggle free. Police said that though "obviously distraught", she was "a strong woman who has coped with it as well as anyone could". But one senior detective said she felt particularly betrayed by Marsh's involvement: "The impression she gives is that she feels more betrayed.
Modern and historical inequalities of wealth, authority and property are also evident with regard to attributive adjectives. Man is associated with great (9), richest (6), best (4), leading (4), powerful (2), rich (1), top (1), influential (1), affluent (1), dominant (1), outstanding (1), well-known (1). In this area the negative markers are homeless (1) and unemployed (1). There are divergences in the contexts in which these adjectives appear. For instance, when a woman is richest (4), the designation is often accompanied by the hint that this is an exceptional or unusual occurrence, as demonstrated in the following example:
Ex. 2 A multi-million pound art collection including works by Goya and Bruegel has been ransacked in a midnight raid on the home of Spain's richest woman.
Also, when a woman is successful (2), self-made (1), professional (1), high-earning professional (1), this may be presented as a drawback for her private life, since men might feel uncomfortable with and intimidated by a self-sufficient woman, as shown in the following example:
Ex. 3 For she knows only too well that there are still many men who feel threatened by a high-earning professional woman. Her last boyfriend was one of them, and ultimately left her because of it.
Moreover, a woman can be unemployed (1) and poor (1).
One way in which individuals exercise power over others is by participating in illegal and immoral activities. Men are more strongly associated than women with crime, violent behavior and the criminal justice system. This is perhaps to be expected, since crime is predominantly a male activity: in the UK, 85–95 percent of robberies, drugs offences, and violent behavior against individuals are committed by males.249 This fact is often implied in the MWBPC data. Man is often modified by adjectives related to misconduct, for instance evil (2), drunk (2), condemned (2), knife-wielding (2), aggressive (1), guilty (1), bad (1). However, a man may also be inoffensive (1), harmless (1), and defenseless (1). Man as subject is associated in particular with verbs such as rape (5), kill (5), attack (4), murder (3), shoot (2), strangle (1), batter (1), stalk (1), rob (1), steal (1), ambush (1), push (1), punch (1), bite (1), kidnap (1), stab (1), fight (1). Men also wield a range of objects, for the most part weapons: a knife (2), a home-made shot gun (1), a hand gun, two guns (1), a gun (1), a silver pistol (1), a rifle (1), a bomb (1), two knives (1), a machete (1), an AK-47 (1). In only one case is the man unarmed, as illustrated in the following example:
Ex. 4 The chief constable of a police force which mounted a bungled raid in which a naked and unarmed man was shot dead resigned today. Sussex Chief Constable Paul Whitehouse's move came just a day after Home Secretary David Blunkett called on his employers to consider sacking him.
249 M. Ayres and L. Murray, “Arrests for recorded crime (notifiable offences) and the operation of certain police powers under PACE England and Wales, 2004/05”, 2005, Home Office Statistical Bureau Issue 21/05, accessed 9 June 2012, at: www.statewatch.org/news/2005/dec/uk-hosb-s-and-s.pdf
Moreover, there is a disparity between men and women in the sort of authority which is exercised over them by others. Numerous verbs that frequently combine with man as object position men as subject to the coercive measures of the legal system, for example arrest (52), accuse (21), charge (12), convict (6), detain (5), suspect (2), question (1), hang (1), jail (1), release (1). Man is also qualified by adjectives such as condemned (2), arrested (2), wanted (1), armed (1).
Man is also represented as a victim of violent actions. Indeed, man as object is regularly associated with verbs such as shoot (14), kill (7), stab (7), attack (3), assault (2), murder (1), injure (1), hit (1), force (1), beat (1), gun down (1), strike (1), punch (1), rob (1), trap (1), blow up (1), gore (1). This mirrors modern society: since men are deemed responsible for more crimes than women, men, and especially young men, are also more likely than women to be the victims of criminal acts, and in particular of violent crime (although rape is an exception to this, as indicated below). A further social factor promoting this pattern is that more men than women are usually involved in armed conflicts, and are consequently more likely to be killed or wounded. Such patterns satisfy gender role expectations about male conduct, which is ‘expected’ to be dynamic, forceful, assertive and authoritarian.
Women, too, are recipients of violent actions. In particular, woman is represented as the object of rape (18). This is probably due to the fact that
rape is a crime of which females (rather than males) are the main victims. Other verbs frequently linked to woman as object represent females as the victims of negative actions (often as the victims of male sexual aggression), for example kill (13), attack (7), murder (6), shoot (6), punch (4), assault (4), abuse (3), kidnap (3), wound (2), decapitate (2), batter (1), strike (1), stab (1), knife (1), gag (1), bind (1), bludgeon (1), burn (1), behead (1), mug (1), rob (1), torture (1), execute (1), grope (1). Many of these verbs are commonly used in portrayals of sexual aggression. With regard to attributive adjectives, women are qualified by injured (3), murdered (2), dead (2), deceased (2), abducted (2), terrified (1), tortured (1). Moreover, woman collocates with adjectives denoting a lack of status and authority: vulnerable (3), defenseless (2), frail (2), disadvantaged (1). Woman also collocates with practices implying the exercise of power by others, represented for instance by verbs such as sentence (2), jail (2), hold (1), restrain (1), imprison (1), detain (1), fine (1), and by verbs positioning women as undergoing medical interventions and procedures, such as sedate (1) and sterilize (1). Women are beneficiaries of positive actions as well, which nonetheless imply some sort of weakness on their part, expressed by verbs such as help (2) and save (1).
The results outlined in this section suggest significant disparities in the way men and women are denoted with regard to power and misconduct. In the representation of men, prominence is given to physical strength and force: men are portrayed as physically stronger than women, as committing more criminal acts and as more aggressive than women. Women, on the contrary, are represented as the victims of negative actions, in particular in relation to sexual aggression, and as being subjected to the exercise of power by others.
3.8.2 Social categories
Especially important among the adjectives denoting women are those which characterize them as belonging to social categories. For instance, compared with men, women are more frequently qualified by adjectives indicating marital/reproductive status and sexual orientation. Hence, woman is often associated with married (8), single (6), divorced (2), infertile (1), childless (1), transgender (1), gay (1), straight (1), and only woman is qualified by pregnant (19). In this semantic area, man is modified only by gay (6) and heterosexual (1). On the topic of sexual orientation, the corpus evidence suggests that British people have become more open about it over time, as confirmed by the occurrence in the MWBP corpus of the adjectives gay and heterosexual for man and transgender and gay for woman during the 2011 period. This information is illustrated in Tables 4 and 5 below:
Male sexual orientation 2001
Table 4 illustrates the adjectival modifiers related to male sexual orientation during 2001 and 2011.
Female sexual orientation 2001
Table 5 illustrates the adjectival modifiers related to female sexual orientation during 2001 and 2011.
Some of these categories refer to physiological events: only women give birth (8), carry a child (2), conceive (2), bear a child (1), breastfeed (1), have an abortion (1), miscarry (1). It seems that these traits of a woman’s identity are treated as being of greater significance and concern than a man’s, and are given more weight in the corpus. The same could be said of attributes related to nationality, religion, ethnicity and class identity. For instance, woman is significantly or almost exclusively qualified by adjectives of nationality, such as British (17), Palestinian (7), American (6), Canadian (4), French (4), Italian (3), Japanese (3), Australian (3), Libyan (3), Indian (2), Saudi (2), Russian (2), Afghan (1), Algerian (1), Bahraini (1), Brazilian (1), Chechen (1), Chinese (1), Dutch (1), Egyptian (1), European (1), Filipino (1), Iranian (1), Kashmiri (1), Moldovan (1), Paki (1), Pakistani (1), Polish (1), Saudi Arabian (1), Syrian (1), Somali (1), Swedish (1), Tanzanian (1), Turkish (1), Ugandan (1), Uruguayan (1); religion: Jewish (4), Muslim (4), Christian (3), Islamic (1); ethnicity: Asian (2), Latino (2), South African (1), African-American (1), Dutch-Iranian (1); and class: middle-class (2), working-class (1). Fewer markers from these areas appear frequently or exclusively in connection with man: British (6), Asian (2), Chinese (3), Turkish (2), Spanish (2), American (1), Greek (1), Japanese (1), Lebanese (1), Palestinian (1), Polish (1), Swedish (1), Syrian (1), with reference to nationality, and Muslim (1) and Protestant (1), with regard to religion.
Two aspects seem to be at work here. Firstly, it looks as if adjectives referring to marital/reproductive condition and sexual orientation, nationality, ethnicity and class are relevant in discussions regarding women in a sociological context. Secondly, the importance of these adjectives may also be related to the fact that women are marked in this domain of meaning. Some nationality/religion/ethnicity adjectives might be used as nouns to denote people, but when the adjective is not marked by gender on the surface (as in American, British, French, Muslim and so on) the gender of the person referred to in this manner is almost always ‘understood’ to be male. The same could be said of women’s professions. In fact, women’s advancement in their careers seems to be perceived as something exceptional within society, so that numerous nouns referring to jobs are often marked when associated with women, sometimes through the use of the cluster the first woman, as indicated in Table 6 below:
Woman’s career 2001
Woman’s career 2011
The first woman to hold that post
The first woman to win a solo round-the-world yacht race
The first woman to reach the most senior of ranks of the police
The first woman secretary of state for Scotland
The first woman appointed finance minister in a G8 country
The first woman to win the coveted Royal Marines' green beret
The first woman to be re-elected as president
The first woman in the country to rise to the rank of assistant chief constable.
The first woman commissioner of Scotland Yard.
Woman police officer
The first woman to anchor a network TV news show in America
Woman train driver
The first woman to metropolitan police force
The first woman vice-presidential candidate on a major US party ticket
Woman police chief
The first American woman to been given such a senior combat role
The first woman chief executive
The first woman captain
Woman police officer
The first woman to head a Scottish university library
The first woman to fly solo around the world
The first woman ever to hold the position of top police officer in the country
The first woman director of public prosecutions
Woman soldier (3 instances)
The first woman to be admitted into the previously all-male club of American "dramatic literature"
The first woman anywhere to get the Uefa pro licence
The first woman jockey to win a prestigious Group One flat race
Woman teacher Woman player
Table 6 Markers of women’s career during 2001 and 2011 obtained from MWBP corpus.
The corpus evidence in Table 6 shows that women are generally marked in the professional area, especially in relation to typically male jobs (e.g., woman police officer, woman soldier, and so on). In particular, the table shows that the number of these markers decreases slightly over time, as if the improvement of women’s professional conditions had become more obvious and accepted in 2011 compared to 2001. Once more, asymmetries in the description of men and women are revealed in patterns of collocation.
3.8.3 Character and mental ability
The collocational evidence demonstrates that there are some differences in how men and women are classified and in which aspects of their social identity emerge more prominently in their representations, but it also highlights divergences in the psychological and behavioral traits attributed to men and women. For instance, it is important to notice that women are sometimes associated with verbs referring to verbal communication or other forms of vocal expression, such as say (5), claim (3), scream (2), tell (1), speak (1), yell (1). Talkativeness may be considered a marker of assertiveness, but also of friendliness and cooperation. On the other hand, man tends to be modified by adjectives signaling reserve, for instance lonely (1), quiet (1), quietest (1), humble (1), shy (1), modest (1), and awkward (1). There are only three markers of introversion associated with woman: lonely (1), absconding (1), kept (1).
‘Agreeableness’ is another important personality trait that emerges from the analysis of the data. It is related to qualities such as open-mindedness, gentleness and friendship. Man is more strongly modified by adjectives such as lovely (6), wonderful (5), amazing (3), special (3), generous (2), mild-mannered (2), perfect (2), classy (1), generous spirited (1), extraordinary (1), remarkable (1), loyal (1), loving (1), caring (1), gentle (1), warm (1), kind (1), kindly (1), kindest (1), compassionate (1), respectful (1), sweetest (1), outgoing (1), pleasant (1), admired (1), ideal
Man also patterns more strongly with adjectives associated with humour and happiness, such as happy (3), funny (2), jovial (2), funniest (1), witty (1). At the other end of the scale, disagreeableness is connected with cruelty, distrust, lack of cooperation and egoism. Man is often associated with several ‘disagreeable’ attributes, comprising evil (2), arrogant (2), selfish (1), aggressive (1), hard (1), cold (1). Moreover, men are frequently the subjects of verbs of violence and aggression, as discussed in section 3.8.1, which are indicative of general disagreeableness (e.g., attack, murder and shoot). In this area women are qualified by lovely (2), wonderful (1), happiest (1), friendly (1), perfect (1), gentle (1), kind (1) at the positive end of the scale, and by malevolent at the negative end.
‘Conscientiousness’ is another relevant aspect connected to men’s and women’s individuality. It is related to traits such as competency, wisdom and judiciousness. Man is frequently qualified by adjectives such as decent (4), good (3), intelligent (3), honest (2), clever (2), wise (2), creative (2), thoughtful (1), talented (1), sharpest (1), educated (1), bright (1), inventive (1), independent-minded (1), imaginative (1), prolific (1), reliable (1), stoic (1). With reference to this dimension of personality, woman associates with bright (2), intelligent (2), honest (2), self-respecting (1), wise (1), educated (1), mature (1), decent (1), talented (1), profound (1), accomplished (1).
The analysis also reveals a dimension entailing traits such as concern, lack of confidence, apprehension and depression, which are ascribable to negative emotional and mental states. In this case woman collocates in particular with mentally-ill (3), obsessed (2), distraught (2), disgruntled (2), schizophrenic (1), startled (1), upset (1), confused (1), troubled (1), agitated (1), and grieving (1). Man patterns with angry (3), suicidal (2), self-torturing (1), irritating (1), nervous (1), agitated (1), scowling (1), discontented (1), incoherent (1), negative (1), stressed (1), irritable (1). However, women are significantly qualified by some adjectives of mental fortitude and flexibility: confident (2), brave (2), headstrong (1), ambitious (1), fearless (1), comfortable (1), gutsy (1), courageous (1), determined (1). But it is interesting to note that they are also associated with adjectives of mental weakness and incapacity such as daft (1), incapacitated (1), silly (1), stupid (1), and dumb.
3.8.4 Appearance
In addition to gender differences in the representation of personality in the corpus, there are also differences in the description of the physical appearance of men and women. As noted in section 3.8.1, men tend to be associated with attributive adjectives related to physical power, such as strongest (2); it is therefore not unexpected that man also patterns more strongly or exclusively with adjectives referring to physical size and bulkiness, such as burly (1), large (1), fat (1), portly (1), and so on.
The adjectives qualifying women refer to a more restricted variety of physical types, shapes and aspects. Some make reference to weight and size, such as pear-shaped (2), fit (2), overweight (1), healthy (1), and skinny (1). Adjectives referring to hair colour and style are more frequently associated with woman, comprising blonde (6), dark-haired (4), brunette (2), grey-haired (1), well-coiffed (1), blonde-haired (1). In this area, man is modified by shaven-headed (1) and blonde (1). With regard to evaluative terms, woman frequently collocates with beautiful (18), attractive (10), stylish (2), graceful (1), nice (1), handsome (1), glamorous (1), and exclusively with pretty (5), whereas male appearance is rendered by adjectives such as handsomest (1) at the positive end of the scale, and scruffy (1) at the negative end. Specific cultural concerns are highlighted here: an emphasis on strength and body type for males, and on weight and physical appeal for females.
3.8.5 Sexuality
The final domain I will take into account concerns gender disparities in the representation of sexual/intimate behavior. As stated in section 3.8.1, man as subject is recurrently associated with rape. Correspondingly, woman as object patterns more strongly with verbs such as attack (7), assault (4), abuse (3), gag (1), grope (1), and so on. The main deduction which may be derived from the collocational evidence is that women are more likely than men to be represented as the victims of male sexual violence and, unlike men, they are also described as ‘recipients’ of sexual actions.
Sexuality and sexual appeal are also rendered by attributive adjectives. Both man and woman collocate with adjectives of sexual orientation: man collocates with gay (6) and heterosexual (1), while woman is associated with transgender (1), gay (1), straight (1), and with adjectives referring to general good looks, such as beautiful (19), attractive (11), handsome (1), sexy (2), sexiest (2). Adjectives with negative connotations occur more regularly with woman, including some related to sexual promiscuity, such as promiscuous (1) and scarlet (1), or to ‘deviance’, for instance prostitute (1). With regard to man, the only adjective with evidently negative connotations is macho (1). Adjectives related to reproduction are also exclusively associated with woman, comprising infertile (1) and childless (1).
Conclusion
The aim of this thesis was to investigate how the British press represents men and women, with the central assumption that the press reflects, or at least gives a prominent voice to, dominant social views and attitudes. The results have shown that there are similarities and differences in the collocates of man and woman both within the same period of time and across time. For example, evidence from the MWBPC has revealed that adjectives denoting age have always been associated with men and women over time. However, in the semantic areas taken into consideration significant disparities have emerged in how men and women are denoted.
For instance, with regard to power and misconduct, in the representation of men prominence is given to physical strength and force: men are portrayed as physically stronger than women, as committing more criminal acts and as more aggressive than women. Women, on the contrary, are represented as the victims of negative actions, in particular in relation to sexual aggression, and as being subjected to the exercise of power by others.
With reference to the characterization of women and men as belonging to social categories, women are more frequently qualified than men by adjectives indicating marital/reproductive status and sexual orientation. Also, women are significantly or almost exclusively qualified by adjectives of nationality, religion, ethnicity and class identity. The importance of these adjectives may be related to the fact that women are marked in this domain of meaning. In fact, when these adjectives are not marked by gender at ‘face value’, as in American, British, French, Muslim, the gender of the person referred to is almost always ‘understood’ to be male.
With regard to the psychological and behavioral traits of men and women, the collocational evidence has demonstrated that there are some differences in how men and women are classified in some aspects of their social identity: men tend to be represented as friendly but also reserved and aggressive, while women tend to be represented as cooperative and strong-minded and, at the same time, frail.
With reference to differences in the description of the physical appearance of men and women, men tend to be associated with attributive adjectives related to physical power, physical size, and bulkiness, whereas the adjectives qualifying women refer to a more restricted variety of physical types, shapes and aspects. Some make reference to weight and size, such as pear-shaped, fit, overweight, skinny, and so on. Specific cultural concerns are highlighted here: an emphasis on strength and body type for males, and on weight for females.
As far as the domain of sexuality is concerned, women are more likely than men to be represented as the victims of male sexual violence and, unlike men, they are also described as ‘recipients’ of sexual actions. Sexuality and sexual appeal are also rendered by attributive adjectives, such as beautiful, attractive, sexy, and so on.
Furthermore, the corpus evidence has shown that men and women in 2011, compared to men and women in 2001, are perceived differently by British society, especially in terms of social conditions and sexual orientation. On the subject of social conditions, the corpus evidence has revealed that women are generally marked in the professional area, especially in relation to typically male jobs (e.g., woman police officer, woman soldier, and so on). However, the number of these markers decreases slightly over time, as if the improvement of women’s professional conditions had become more obvious and accepted in 2011 compared to 2001. On the topic of sexual orientation, the corpus evidence suggests that British people have become more open about it over time, as confirmed by the occurrence in the MWBP corpus of the adjectives gay and heterosexual for man, and transgender and gay for woman, during the 2011 period.
The analysis has revealed that the perception of a particular social group can change over time according to different factors, which may be social, cultural or economic in nature, and that these changes and divergences might affect the lexical choices people make in speaking and writing. Corpus linguistics has been a functional instrument in the exploration of what is encoded or implied in texts and cultures, and it has also been useful in highlighting that language, culture and ideology are often intertwined. Furthermore, corpus linguistics has allowed us to observe the linguistic behavior of gender-related words such as man and woman in order to derive potential cultural beliefs or dominant views and attitudes encoded in a particular social context.
References
Aarts J. 2002. “Does corpus linguistics exist? Some old and new issues”, in L.E. Breivik and A. Hasselgren (eds.), From the COLT’s mouth… and others’: Language corpora studies in honour of Anna-Brita Stenström, 1-19, Amsterdam: Rodopi.
Aijmer K., Altenberg B. (eds.) 1991. English Corpus Linguistics: Studies in Honour of Jan Svartvik, London: Longman.
Alderson J. C. 1996. “Do corpora have a role in language assessment?”, in J. Thomas & M. Short (eds.), Using Corpora for Language Research, pp. 248-59, London: Longman.
Aston G. 1995. “Corpora in language pedagogy: Matching theory and practice”, in G. Cook & B. Seidlhofer (eds.), Principle and practice in applied linguistics, pp. 257-270, Oxford: Oxford University Press.
Aston G. 1999. Corpus use and learning to translate, Textus 12:189-314, URL: http://home2.sslmit.unibo.it/~guy/textus.htm, visited on February 2, 2012.
Aston G., & Burnard L. 1998. The BNC Handbook: Exploring the British National Corpus with SARA, Edinburgh: Edinburgh University Press.
Ayres M. and Murray L. 2005. “Arrests for recorded crime (notifiable offences) and the operation of certain police powers under PACE England and Wales, 2004/05”, www.statewatch.org/news/2005/dec/uk-hosb-s-and-s.pdf, visited on June 9, 2012.
Baker P. 2006. Using Corpora in Discourse Analysis, London: Continuum.
Baron-Cohen S. 2003. The Essential Difference: Men, Women and the Extreme Male Brain, Harmondsworth: Penguin.
Bate B. and Taylor A. 1988. Women Communicating: Studies of Women’s Talk, Norwood: Ablex.
Baxter J. 2003. Positioning Gender in Discourse: A Feminist Methodology, Basingstoke: Palgrave Macmillan.
Bednarek M. 2006. Evaluation in Media Discourse. Analysis of a Newspaper Corpus, London: Continuum.
Bekke M. 1999. “From the British National Corpus to the WWW cybercorpus: A quantum leap into chaos?”, in J. M. Kirk (ed.), Corpora galore. Analyses and techniques in describing English, Amsterdam: Rodopi.
Bergvall V. L. 1999. “Toward a Comprehensive Theory of Language and Gender”, Language in Society 28, 273-93.
Biber D. 2006. University Language: A Corpus-based Study of Spoken and Written Registers, Amsterdam: John Benjamins.
Biber D., Johansson S., Leech G., Conrad S., and Finegan E. 1999. Longman Grammar of Spoken and Written English, London: Longman.
Bodine A. 1975. “Androcentrism in prescriptive grammar: Singular ‘they’, sex-indefinite ‘he’, and ‘he or she’”, Language in Society, 4: 129-46.
Bowker L. and Pearson J. 2002. Working with specialized language: A practical guide to using corpora, London: Routledge.
Brandis W., Henderson D. 1970. Social Class, Language And Communication, London: Taylor & Francis.
Bucholtz M. and Hall K. 2004. “Theorizing Identity in Language and Sexuality Research”, Language in Society, 33:4, pp. 469-515.
Burges J. 1996. “Hierarchical Influences on Language Use in Memos”, unpublished doctoral dissertation, Northern Arizona University.
Burnard L. (ed.) 1995. Users reference guide for the British National Corpus (SGML version), Oxford: Oxford University Computing Services.
Bussmann H. 1996. Dictionary of Language and Linguistics, London: Routledge.
Butler J. 1990. Gender Trouble: Feminism and the Subversion of Identity, London: Routledge.
Caldas-Coulthard C. 1995. “Man in the news: The misrepresentation of women in the news-as-narrative discourse”, in S. Mills, Language and Gender: Interdisciplinary Perspectives, pp. 226-39, Harlow: Longman.
Caldas-Coulthard C. R. 1993. “From discourse analysis to critical discourse analysis: the differential re-presentation of women and men speaking in written news”, in Baker et al. (eds.): 196-208.
Caldas-Coulthard C. R. and Moon R. 1999. “Curvy, Hunky, Kinky: Using Corpora as Tools in Critical Analysis”, paper presented at the Critical Discourse Analysis Symposium, University of Birmingham.
Cameron D. 1992. “Review of Tannen You Just Don’t Understand”, Feminism and Psychology, 465-8.
Cameron D. 1996. “The language-gender interface: challenging co-optation”, in V. Bergvall, J. Bing and A. Freed (eds.), Rethinking Language and Gender Research: Theory and Practice, pp. 31-53, London: Longman.
Cameron D. 2005. “Language, gender and sexuality: Current issues and new directions”, Applied Linguistics 26 (4), 482-502.
Cameron D. and Kulick D. 2003. Language and Sexuality, Cambridge: Cambridge University Press.
Carter R. 2004. Introduction to J. Sinclair and R. Carter (ed.), Trust the text: Language, corpus and discourse, 1-6, London: Routledge.
Castagnoli S. 2006. “Using the Web as a source of LSP corpora in the terminology classroom”, in M. Baroni & S. Bernardini (eds.), Wacky! Working papers on the Web as a corpus, pp. 159-172, Bologna: Gedit.
Chambers J. K. and Trudgill P. 1980. Dialectology, Cambridge: Cambridge University Press.
Chomsky N. 1965. Aspects of the Theory of Syntax, Cambridge: The MIT Press.
Chomsky N. 2004. (Interviewed by Andor, Jozsef), “The master and his performance:
Pragmatics, 1(1): 93–111.
Clark K. 1992. “The linguistics of blame: the representation of women in the Sun’s reporting of crimes of sexual violence”, in M. Toolan (ed.), Language, Text and Context: 208-224, London: Routledge.
Coates J. 1993. Women, Men and Language, New York: Longman.
Coates J. 1996. Women Talk: Conversation between Women Friends, Oxford: Blackwell.
Coates J. and Cameron D. 1989. “Gossip revisited: language in all-female groups”, in J. Coates and D. Cameron (eds.), Women in their Speech Communities, pp. 94-121, London: Longman.
Connell R.W. 1995. Masculinities, Cambridge: Polity Press.
Conrad S. 2010. “What can a corpus tell us about grammar?”, in M. McCarthy & A. O’Keeffe (eds.), Routledge Handbook of Corpus Linguistics, pp. 227-240, Abingdon: Routledge.
Cook A. S. 1885. An Old English Grammar, Boston: Ginn & Company.
Crawford M. 1995. Talking Difference: On Gender and Language, London: Sage.
Creswell J. W. 2003. Research Design: Qualitative, Quantitative, and Mixed Method Approaches, Thousand Oaks: Sage Publications.
De Beauvoir S. 1952. The Second Sex, New York: Vintage Books.
Dörnyei Z. 2007. Research Methods in Applied Linguistics: Quantitative, Qualitative, and Mixed Methodologies, Oxford: Oxford University Press.
Durkin K. 1983. “Review of Kress and Hodge, 1979”, Journal of Pragmatics 7: 101-105.
Dworkin A. 1981. Pornography: Men Possessing Women, Toronto: Women’s Press.
Ebeling J. 2000. Presentative Constructions in English and Norwegian. A Corpus-based Contrastive Study (Acta Humaniora 68), pp. 25-26, Oslo: Unipub forlag.
Eckert P. 1989. “The whole woman: Sex and gender differences in variation”, Language Variation and Change I, 245-67.
Fairclough N. 1989. Language and Power, London: Longman.
Fairclough N. 1992. Discourse and Social Change, Cambridge: Polity Press.
Fairclough N. 1995. Media Discourse, London: Arnold.
Fantinuoli C. 2006. “Specialized corpora from the web and term extraction for simultaneous interpreters”, in M. Baroni & S. Bernardini (ed. by), Wacky! Working papers on the Web as a corpus, pp. 173-190, Bologna: Gedit.
Fillmore C. 1992. “Corpus linguistics or computer-aided armchair linguistics”, in J. Svartvik (ed.) Directions in Corpus Linguistics, pp. 35-60, Berlin: Mouton de Gruyter.
Firth J. R. 1935. “The Technique of Semantics”, Transactions of the Philological Society, 36-72.
Firth J. R. 1957. Papers in Linguistics 1934-1951, London: Oxford University Press.
Fishman P. 1980. “Conversational insecurity”, in H. Giles, W.P. Robinson and P. Smith (eds.), Language: Social Psychological Perspectives, pp. 127-32, Oxford: Pergamon Press.
Flowerdew J. 1993. “Concordancing as a tool in course design”, System, 21, 321-344.
Foucault M. 1972. The Archaeology of Knowledge and Discourse on Language, New York: Pantheon Books.
Fowler R. 1991. Language in the News: Discourse and ideology in the press, London: Routledge.
Fowler R. 1996. “On critical linguistics”, in Caldas-Coulthard and Coulthard (eds.): 3-14.
Franca V. B. 1999. “Using student-produced corpora in the L2 classroom”, in P. Grundy (Ed.), pp. 116-117, IATEFL 1999 Edinburgh Conference Selections, Whitstable: IATEFL.
Francis W. N. 1992. “Language Corpora B.C.”, in J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin: Mouton de Gruyter.
Frawley W. 1987. “Review of Van Dijk T. A. (ed), 1985”, Language 63: 361-395.
Frazier S. 2003. “A Corpus Analysis of Would-clauses Without Adjacent If-clauses”, TESOL Quarterly 37(3): 443-66.
Freed A. 1992. “We understand perfectly: a critique of Tannen’s view of miscommunication”, pp. 144-52 in Hall, Kira, Bucholtz, Mary and Moonwomon, Birch (eds.), Locating Power: Proceedings of the Second Berkeley Women and Language Conference, Los Angeles, University of California: BWLG group.
Fries C. C. 1952. The Structure of English, New York: Harcourt Brace.
Gauntlett D. 2002. Media, Gender and Identity: an Introduction, New York: Routledge.
Gavioli L. 1997. “Exploring texts through the concordancer: Guiding the learner”, in A. Wichmann & S. Fligelstone & T. McEnery & G. Knowles (Eds.), Teaching and Language Corpora, pp. 83-99, London: Longman.
Gilligan C. 1982. In a Different Voice: Psychological Theory and Women’s Development, Cambridge, MA: Harvard University Press.
Graddol D. and Swann J. 1989. Gender Voices, London: Blackwell.
Granger S. 1998. “The computer learner corpus: a versatile new source of data for SLA research”, in S. Granger (Ed.), Learner English on Computer, pp. 3-18, London: Longman.
Granger S. 2003. “The International Corpus of Learner English: A new resource for foreign learning and teaching and second language acquisition research”, TESOL Quarterly, 37(3), 538-546.
Gray J. 1993. Men are from Mars, Women are from Venus, New York and London: HarperCollins.
Gray J. 1994. Men are from Mars, Women are from Venus: A Practical Guide for Improving Communication and Getting What You Want, New York: HarperCollins.
Gries S. T. 2006. “Some proposals towards more rigorous corpus linguistics”, Zeitschrift für Anglistik und Amerikanistik, 54(2): 191–202.
Grimshaw A. D. 1981. “Review of Kress and Hodge, 1979 and of Silverman and Torode, 1980”, Language 57: 759-765.
Gulli A., Signorini A. 2005. “The indexable web is more than 11.5 billion pages”, paper presented at the WWW2005 conference, Chiba, Japan.
Henry H. 1983. “Are the National Newspapers Polarising?”, Admap 19.10: 484-491.
Herret-Skjellum J. and Allen M. 1996. “Television programming and sex-stereotyping: A meta-analysis”, Communication Yearbook 19, 157-85.
Higgins J. 1991. “Fuel for learning: The neglected element of textbooks in CALL”, CAELL Journal, 2(2), 3-7.
Hoey M. 1991. Patterns of lexis in text, Oxford: Oxford University Press.
Hoey M. 1993. Introduction to M. Hoey (ed.), Data, Description, Discourse: Papers on the English language in honour of John McH Sinclair, V-IX, London: HarperCollins.
Holmes J. 2003. Power and Politeness in the Workplace, London: Longman.
Holmes J. 2006. Gendered talk at Work, Oxford: Blackwell.
Humm M. 1989. The Dictionary of Feminist Theory, London: Harvester Wheatsheaf.
Hunston S. 2002. Corpora in Applied Linguistics, Cambridge: Cambridge University Press.
Jespersen O. 1909-1949. A Modern English Grammar on Historical Principles, London: George Allen and Unwin.
Jespersen O. 1922. Language. Its Nature, Development and Origin, London: George Allen & Unwin.
Johnson S. 1775. ‘Preface’ to A Dictionary of the English Language, T. Ewing.
Jucker A. H. 1992. Social Stylistics. Syntactic Variation in British Newspapers, New York: Mouton de Gruyter.
Jutting J. P., Morrisson C., Dayton-Johnson J. and Dreschsler D. 2006. “Measuring gender (in)equality: introducing the gender, institutions and development data base (GID)”, OECD Development Centre Working Paper No. 247, URL: http://www.oecd.org/dataoecd/17/49/36228820.pdf, visited on May 10, 2012.
Kennedy G. 1998. An Introduction to Corpus Linguistics, London: Longman.
Kilbourne J. 2000. Can’t Buy My Love: How Advertising Changes the Way We Think and Feel, Washington, DC: Free Press.
Kjellmer G. 1986. “‘The lesser man’: observations on the role of women in modern English writings”, in J. Aarts and W. Meijs (eds.) Corpus Linguistics II, pp. 163-176, Amsterdam: Rodopi.
Kučera H., Francis W. N. 1967. Computational analysis of present-day English, Providence (RI): Brown University Press.
Kytö M., Rissanen M., and Wright S. (eds) 1994. Corpora across the Centuries, Amsterdam: Rodopi.
Labov W. 1966. The Social Stratification of English in New York City, Washington DC: Center for Applied Linguistics.
Labov W. 1972. Language in the Inner City, Oxford: Blackwell.
Lakoff R. 1975. Language and Woman’s Place, New York: Harper and Row.
Landau S. I. 2001. Dictionaries: The Art and Craft of Lexicography, 2nd ed. Cambridge: Cambridge University Press.
Lauridsen K. 1996. “Text Corpora and Contrastive Linguistics: Which Type of Corpus for which Type of Analysis?”, in Aijmer/Altenberg/Johansson, pp. 63-71.
Lazar M. 2005. Feminist Critical Discourse Analysis: Studies in Gender, Power and Ideology, London: Palgrave Macmillan.
Leech G. 1974. Semantics, London: Penguin.
Leech G. 1992. “Corpora and theories of linguistic performance”, in J. Svartvik (ed.) Directions in Corpus Linguistics, Berlin: Mouton de Gruyter.
Lewis M. (Ed.) 2000. Teaching collocation: Further developments in the lexical Approach, Hove, England: Language Teaching Publications.
Litosseliti L. 2006. Gender and Language: Theory and Practice, London: Arnold.
Louw B. 1993. “Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies”, in M. Baker, G. Francis and E. Tognini-Bonelli (eds), Text and Technology, Philadelphia/Amsterdam: John Benjamins.
Louwerse M., Crossley S., Jeuniauxa P. 2008. “What if? Conditionals in Educational Registers”, Linguistics and Education 19(1): 56-69.
Lowth R. 1762. A Short Introduction to English Grammar, reprinted in: Alston, R. C. (ed.), 1967. English Linguistics 1500-1800 18, Menston: Scolar Press.
Ludeling A., Kyto M. 2009. Corpus Linguistics, Walter de Gruyter.
MacWhinney B. 1992. The CHILDES database, Dublin, OH: Discovery Systems.
Mahlberg M. 2005. English general nouns: A corpus theoretical approach, Amsterdam: John Benjamins.
Mahlberg M. 2006. “Lexical cohesion: Corpus linguistic theory and its application in English language teaching”, International Journal of Corpus Linguistics, 11(3): 363–383.
Maltz D. N. and Borker R. A. 1998. “A cultural approach to male-female miscommunication”, in J. Coates, Language and Gender: A Reader, pp. 417-434, Oxford: Blackwell.
McCarthy M. 2001. Issues in Applied Linguistics, Cambridge: Cambridge University Press.
McEnery A. et al. 2006. Corpus-Based Language Studies, London: Routledge.
McEnery T. and Wilson A. 1996. Corpus linguistics, Edinburgh: Edinburgh University Press.
McEnery T., Gabrielatos C. 2006. “English corpus linguistics”, in B. Aarts and A. McMahon (eds.), The handbook of English linguistics, 33–71, Oxford: Blackwell.
McEnery T., Xiao R. Z., and Tono Y. 2005. Corpus-based language studies: An advanced resource book, London: Routledge.
Meunier F. 1998. “Computer tools for the analysis of learner corpora”, in S. Granger (Ed.), Learner English on Computer, pp. 19-37, London: Longman.
Meyer C. F. 2002. English Corpus Linguistics: An Introduction, Cambridge: Cambridge University Press.
Mills S. 1995. Language and Gender: Interdisciplinary Perspectives, New York: Longman.
Mills S. 2003. “Third wave feminist linguistics and the analysis of sexism”, Discourse Analysis Online, URL: www.extra.shu.ac.uk, visited on July 15, 2012.
Mills S. 2003. Gender and Politeness, Cambridge: Cambridge University Press.
Mindt D. 1996. “English corpus linguistics and the foreign language teaching syllabus”, in J. Thomas & M. Short (Eds.), Using Corpora for Language Research, pp. 232-247, London: Longman.
Moore Grimke S. 1838. Letters on the Equality of the Sexes, Boston: Isaac Knapp, 25, Cornhill.
Morley J. and Bayley P. (eds.) 2009. Wordings of war. Corpus assisted discourse studies on the war in Iraq, London: Routledge.
Murray J. A. H. et al. 1973. “Historical Introduction”, The Oxford English Dictionary, p. XV, quoted from R. A. Wells, Dictionaries and the Authoritarian Tradition: A Study in English Usage and Lexicography, Berlin: Walter de Gruyter.
Myers G. 1994. Words in Ads, London: Arnold.
Myers G. 1998. Ad Worlds: Brands, Media, Audiences, London: Arnold.
Nunan D. 1999. Second Language Teaching and Learning, Boston: Heinle & Heinle.
Oakley A. 1972. Gender and Society, London: Temple Smith.
O'Keeffe A., McCarthy M., Carter R. 2007. From Corpus to Classroom: Language Use and Language Teaching, Cambridge: Cambridge University Press.
Oostdijk N., de Haan P. (eds) 1994. Corpus Based Research into Language, Amsterdam: Rodopi.
Partington A. 1998. Patterns and Meanings, Amsterdam: John Benjamins.
Partington A. 2003. The Linguistics of Political Argument, London: Routledge.
Partington A. 2004. “Corpora and discourse: A most congruous beast”, in A. Partington, J. Morley and L. Haarman (eds.), Corpora and discourse, 1-20, Bern: Peter Lang.
Peters M. 2007. “The Math on Miss Motor Mouth”, Psychology Today, March/April, URL: http://www.psychologytoday.com/articles/200703/themath-miss-motor-mouth, visited on June 12, 2012.
Pipher M. 1994. Reviving Ophelia: Saving the souls of Adolescent Girls, New York: Ballantine Books.
Quirk R. 1974. The Linguist and the English Language, London: Edward Arnold.
Reddick A. 1990. The Making of Johnson’s Dictionary, 1746-1773, Cambridge: Cambridge University Press.
Richardson K. 1987. “Critical linguistics and textual diagnosis”, Text 7: 145-163.
Rissanen M., Kytö M. and Palander-Collin M. (eds) 1993. Early English in the Computer Age, Berlin: Mouton de Gruyter.
Romaine S. 2000. Language in Society: An Introduction to Sociolinguistics (second edition), Oxford: Oxford University Press.
Römer U. 2005. Progressives, patterns, pedagogy. A corpus-driven approach to English progressive forms, functions, contexts and didactics, Amsterdam: John Benjamins.
Sapir E. 1958. “The status of linguistics as a science”, 1929, in E. Sapir, Culture, Language and Personality (D.G. Mandelbaum, Ed.), Berkeley: University of California Press.
Scott M. 1999. WordSmith Tools version 3, Oxford: Oxford University Press.
Sharoff S. 2006. “Creating general-purpose corpora using automated search engine queries”, in M. Baroni & S. Bernardini (ed. by), Wacky! Working papers on the Web as a corpus, pp. 63-98, Bologna: Gedit.
Sharrock W. W. and Anderson D. C. 1981. “Language, thought and reality, again”, Sociology 15: 287-293.
Shei
February 20, 2012.
Sigley R. and Holmes J. 2002. “Girl-watching in corpora of English”, Journal of English Linguistics 30 (2), pp. 138-57.
Simpson J. A. 2009. Oxford English Dictionary, Oxford: Oxford University Press.
Sinclair J. 1987. Looking Up: An Account of the COBUILD Project in Lexical Computing, London: Collins.
Sinclair J. 1991. Corpus, concordance, collocation, Oxford: Oxford University Press.
Sinclair J. 1996. “Preliminary Recommendations on Corpus Typology”, EAGLES Guidelines, URL: http://www.ilc.cnr.it/EAGLES96/corpustyp/node5.html, visited on February 10, 2012.
Sinclair J. 2000. “Lexical Grammar”, Naujoji Metodologija, 24: 191-203.
Sinclair J. 2004. Trust the Text: Language, Corpus and Discourse, London: Routledge.
Sinclair J. 2005. “Corpus and Text – Basic Principles”, pp. 1-16, in M. Wynne, Arts and Humanities Data Service, Developing Linguistic Corpora: a guide to good practice, Oxford: Oxbow Books.
Smith S. L. 2006. G Movies Give Boys a D: Portraying Males as Dominant, Disconnected and Dangerous, Program Brief, See Jane Program at Dads and Daughters, May, URL: www.seejane.org, visited on April 13, 2012.
Smith S. L. 2006. Where the Girls Aren’t: Gender Disparity Saturates G-Rated Films, Program Brief, See Jane Program at Dads and Daughters, February, URL: www.seejane.org, visited on April 13, 2012.
Spender D. 1980. Man Made Language, London: Pandora.
Spender D. 1990. Women of Ideas: And What Men Have Done To Them, Toronto: Harper Collins Canada.
Stanton E. C. 1895. The Woman's Bible, European Publishing Company.
Stubbs M. 1993. “British traditions in text analysis: From Firth to Sinclair”, in M. Baker, G. Francis and E. Tognini-Bonelli (eds.), Text and Technology: In honour of John Sinclair, pp. 1-36, Amsterdam: John Benjamins.
Stubbs M. 1995. “Collocations and semantic profiles: on the cause of the trouble with quantitative methods”, Functions of Language, 2/1: 1-33.
Stubbs M. 1996. Text and Corpus Analysis, Oxford: Blackwell.
Stubbs M. 1997. “Whorf’s children: critical comments on critical discourse analysis”, in A. Ryan and A. Wray (eds.) Evolving Models of Language, Clevedon: Multilingual Matters/BAAL, pp. 100-116.
Stubbs M. 2001. “Text, corpora and problems of interpretation: a response to Widdowson”, Applied Linguistics 22: 149-172.
Stubbs M. 2001. Words and Phrases: Corpus studies of lexical semantics, Oxford: Blackwell.
Sweet H. 1891-1898. A New English Grammar, Oxford: Oxford University Press.
Talbot M. 1995. “A synthetic sisterhood: false friends in a teenage magazine”, in K. Hall and M. Bucholtz, Gender Articulated: Language and the Socially Constructed Self, pp. 143-68, New York: Routledge.
Talbot M. 1998. Language and Gender: An Introduction, Cambridge: Polity Press.
Tannen D. 1991. You Just Don’t Understand: Women and men in conversation, New York: William Morrow.
Tannen D. 1995. Talking from 9 to 5, London: Virago.
Tannen D. 1998. The Argument Culture, London: Virago.
Teubert W. 2005. “My version of corpus linguistics”, International Journal of Corpus Linguistics, 10 (1), 1–13.
Thompson G. and Hunston S. (eds.) 2006. System and corpus: Exploring connections, London: Equinox.
Thorndike E. L., Lorge I. 1944. The Teacher’s Word Book of 30,000 Words, New York: Teachers College, Columbia University.
Tognini-Bonelli E. 2001. Corpus Linguistics at Work, Amsterdam: John Benjamins.
Troemel-Ploetz S. 1991. “Review essay: selling the apolitical”, Discourse and Society, 489-502.
Trudgill P. 1974. Sociolinguistics: An Introduction, Harmondsworth: Penguin.
Véronis J. 2005. “Web: Google’s missing pages: mystery solved?”, URL: http://blog.veronis.fr/2005/02/web-googles-missing-pages-mystery.html, visited on May 14, 2012.
Viberg A. 2002. “Polysemy and Disambiguation Cues across Languages: The Case of Swedish få and English get”, in B. Altenberg and S. Granger (eds), Lexis in Contrast, pp. 119-150, Amsterdam: John Benjamins.
West C. and Zimmerman D. 1983. “Small insults: a study of interruptions in cross-sex conversations between unacquainted persons”, in Thorne, Barrie, Kramarae, Cheris and Henley, Nancy (eds.), Language, Gender and Society, Rowley MA: Newbury House.
West C. and Zimmerman D. 1987. “Doing Gender”, Gender and Society, 1: 125-51.
Widdowson H.G. 1996. “Reply to Fairclough: discourse and interpretation: conjectures and refutations”, Language and Literature 5: 57-67.
Wollstonecraft M. 1972. A Vindication of the Rights of Woman, New York: Norton.
Woolf V. 1928. A Room of One’s Own, London: Penguin.
Wright J. 1898-1905. The English Dialect Dictionary, 6 vols., Oxford: Clarendon Press.
Zhang W. 2000. “Corpus Studies: Their Implications for ELT”, IATEFL Issues (152), pp. 9–10.
Other reference sources
A Routledge Companion Website, Corpus Based Language Studies, 2006, available at: http://cw.routledge.com/textbooks/0415286239/resources/corpa.htm#_Toc92298876, visited on March 13, 2012.
The Kaiser Family Foundation website, October 10, 2003, available at: http://www.kff.org/entmedia/3378.cfm, visited on March 10, 2012.
Università di Bologna website, CORIS corpus, available at: http://dslo.unibo.it/coris_ita.html, visited on March 15, 2012.
Real Academia Española website, CREA corpus, available at: http://corpus.rae.es/creanet.html, visited on March 15, 2012.
Mannheim corpus website, available at: http://corpora.ids-mannheim.de/ccdb/, visited on March 16, 2012.
ICE corpus website, available at: http://ice-corpora.net/ice/, visited on March 18, 2012.
ICLE corpus website, available at: http://www.uclouvain.be/en-ceclicle.html, visited on March 18, 2012.
http://en.wikipedia.org/wiki/British_National_Corpus, visited on April 14, 2012.
BNC website, available at: http://www.natcorp.ox.ac.uk/corpus/index.xml?ID=intro, visited on April 16, 2012.
TEC corpus website, available at: http://www.monabaker.com/tsresources/TranslationalEnglishCorpus.htm, visited on April 20, 2012.