The Undergraduate Journal


Volume 2

www.undergraduateawards.com

A collection of winning essays from 2010




Published in 2010 by the Undergraduate Awards of Ireland & Northern Ireland. Undergraduate Awards of Ireland & Northern Ireland c/o Google Ireland Ltd, Gasworks House, Barrow Street, Dublin 4 www.undergraduateawards.com info@undergraduateawards.com Copyright © 2010 The Undergraduate Awards of Ireland & Northern Ireland. All rights reserved. This book may not be reproduced, in whole or in part, including illustrations, in any form without the written permission of the publishers. Design and content management by Studio82, Dublin, Ireland. Printed by Colour World Print Ltd., Co. Kilkenny.



Balance We seek both sides of every story. After that it is up to you.

THE IRISH TIMES irishtimes.com


BOARD OF DIRECTORS
Jim Barry – Chair
Áine Maria Mizzoni – Vice-Chair
Oisin Hanrahan, Director
Paddy Cosgrave, Director

ACADEMIC ADVISORY BOARD
Jim Barry – Chair
Áine Maria Mizzoni
Oisin Hanrahan
Paddy Cosgrave
Louise Hodgson
Tom Boland, HEA
Patricia McVeigh, DEL
Prof. David Fegan, RIA
Frances Ruane, ESRI
Martin Curley, Intel
Bobbie Bergin, Ulster Bank
Helene Almeida, Google
Prof. Terri Scott, IT Sligo
Prof. Brigid Laffan, UCD
Prof. Patrick Prendergast, TCD
Prof. Brian MacCraith, DCU
Prof. Peter Kennedy, UCC
Prof. Jim Walsh, NUI Maynooth
Prof. Jim Browne, NUI Galway
Prof. Ellen Douglas-Cowie, QUB

PROGRAMME MANAGER
Louise Hodgson



Matheson Ormsby Prentice is pleased to support The Undergraduate Awards of Ireland and Northern Ireland.

Driven by excellence.

Contact: Liam Quirke, Managing Partner E: liam.quirke@mop.ie 70 Sir John Rogerson’s Quay, Dublin 2, Ireland. T: +353 1 232 2000 F: +353 1 232 3333 E: mop@mop.ie W: www.mop.ie/careers



Outstanding minds and extraordinary potential

This journal comprises the 26 winning entries of the Undergraduate Awards of Ireland & Northern Ireland in 2010. Each essay was chosen because it presents independent thinking, originality, and academic excellence, identifying the authors as knowledge creators with extraordinary potential.

But why does independent thinking matter? Why reward such a thing – especially at the undergraduate level? The answer lies in the quality of graduate that a successful society needs. One cannot expect a smart economy, for example, if smart thinking is not encouraged. Our graduates will go on to reach great success throughout their lives, be it in further education, employment, or enterprise. Their post-graduate research will save lives, extend scientific knowledge and contribute to the engineered environment; their entrepreneurship will employ others; their creativity will enhance our culture; and their civic contribution across many disciplines will ensure our society both functions efficiently and evolves sustainably.

Those seeking encouragement for tomorrow would be wise to look through the following essays. However, if the thought of such things as instantons, synovial angiogenesis, and neural networks induces only bafflement (as indeed it would for most) then we implore you simply to read the judges’ comments. For here is the most encouraging thought of all: we have in our education systems of both Northern Ireland and the Republic of Ireland outstanding minds, who are producing work beyond their level of education. If this is what they can do as undergraduates, think about their potential as graduates.

In her notes on the Business category’s winning essay, ‘Understanding overpopulation: how commercial marketing tactics can help us tackle “the greatest shortcoming of the human race”’, judging chair Dr. Sarah Ingle of Dublin City University commented: “[the winning paper] is illustrative of the fact that it is not just having ideas that is important; ideas without the means to convey them well such that they convince others all too often fail in their communication. This essay is an excellent example of how business-related ideas and opinions can be made clear and convincing, and ultimately brought to life.” At a time when ‘financial’ is considered by some to be a four-letter word, this undergraduate’s vision of business as a tool that can serve the interests of society is nothing less than inspiring.

Prof. Elmer Kennedy-Andrews of the University of Ulster was equally impressed with the winning paper in the English Language & Literature category, ‘To what extent do the annotations to the Geneva Bible (1560) present the Book of Revelation as “a conventional apocalypse, [with] the conventional apocalyptic purpose of providing comfort to the suffering faithful” (Gabel et al.)?’. He commented: “Most impressive is the way this essay is capable at once of demonstrating a thorough knowledge of the historical context, an understanding of the textual problems of annotation and redaction within biblical studies, and an ability to read texts both critically and creatively – and to do all of this with style, wit and grace.”

Of the Environment & Geosciences winning essay, entitled ‘Solar UV disinfection of drinking water for developing countries’, judging chair Prof. Paul Ryan said, “the report is of such a high standard that it is worthy of publication in the scientific literature and may well go some small way to fulfilling the requirement for access to sterile drinking water, which Kofi Annan, the United Nations Secretary General described as ‘a fundamental human need and, therefore, a basic human right’.” That an undergraduate would research such a vital and topical issue is understandable; that he could well be part of a solution to the global shortage of drinking water is phenomenal.

A most interesting title won the Life Sciences category: ‘How should we treat time in our investigation of coordinated movement?’ In her appraisal of the report, Prof. Cliona O’Farrelly of Trinity College, Dublin noted that the essay was “beautifully written… with great style and originality… Indeed, it reads like a piece of excellent journalism, which would not be out of place in, say, the New Yorker.”

Many of the judges’ comments touch on the value of communication. It is one thing to have an idea, but quite another to express that idea in a way that is credible. If a scientist spends all their time in the laboratory, no matter how significant their findings, it will be pointless unless they can communicate their importance effectively. This requires various skills, such as conducting thorough research, presenting evidence, making a clear argument, and having the courage to challenge conventional thinking. The best communicators can explain complex subject matters to anyone on the street. The undergraduate essay, for all its purposes, communicates a student’s knowledge of what they are studying, be it in the humanities or sciences.

We have in this journal 26 outstanding examples of undergraduate essays. Most, if not all of them, outline a vision for the future, be it in tackling an age-old problem with new research, or actually presenting a possible solution to current humanitarian issues of the greatest magnitude. And so to our 26 winners of the 2010 awards programme, and to those who were highly commended, we offer our warm congratulations. We hope that in some way this award will galvanise you to realise your full potential as a graduate, and more than that, will yield wide-reaching and exciting consequences.

Jim Barry (Chair) & Áine Maria Mizzoni (Vice-Chair)



THE COSGRAVE PARTY: A HISTORY OF CUMANN NA NGAEDHEAL

by Ciara Meehan • November 2010 • ISBN 978-1-904890-65-2 • HB/€30 Telling the unique story of a political party born into government amidst the bloodshed of civil war, this book reveals the true complexity of Cumann na nGaedheal and investigates the internal politics of the party as it struggled with ideological tensions and personality clashes.

DOCUMENTS ON IRISH FOREIGN POLICY VOLUME VII: 1941–1945

edited by Catriona Crowe, Ronan Fanning, Michael Kennedy, Dermot Keogh and Eunan O’Halpin • November 2010 • ISBN 978-1-904890-63-8 HB/€45 DIFP VII explores Ireland’s increasingly isolated position during 1942 and 1943, and provides a unique first-hand contemporary picture of wartime Europe.

POLICING THE NARROW GROUND: LESSONS FROM THE TRANSFORMATION OF POLICING IN NORTHERN IRELAND

edited by John Doyle • September 2010 • ISBN 978-1-904890-66-9 HB/€30

Ten years after the publication of the Patten Report, this book reflects on its role in the subsequent and ongoing transformation of policing in Northern Ireland and the lessons for security-sector reform internationally.

JOHN ROCQUE’S DUBLIN: A GUIDE TO THE GEORGIAN CITY

by Colm Lennon and John Montague • In association with Dublin City Council • November 2010 • ISBN 978-1-904890-69-0 • PB John Rocque’s Dublin reproduces forty extracts from the Exact survey of the city and suburbs of Dublin (1756). Each one is accompanied by a commentary which highlights some of the features of the city’s political, social, economic and cultural history.

THE LAW OF THE SEA

by Mahon Hayes • November 2010 • ISBN 978-1-904890-72-0 • HB/€35 This is the first book to present a narrative account, from the point of view of the Irish delegation, of the Third United Nations Conference on the Law of the Sea, which ran from 1973 to 1982.

www.ria.ie

Trade orders to sales@gillmacmillan.ie

ROYAL IRISH ACADEMY



Contents

Outstanding minds and extraordinary potential
Introduction from Jim Barry and Áine Maria Mizzoni

Acknowledgements

Highly Commended

Agriculture & Veterinary Sciences
The influence of farming practice on water quality in Ireland
Tara Griffin

Ancient & Classical Studies
Wyrd in linguistic and cultural context
Mark Anthony Cooke

Archaeological Studies
Potential amidst stagnancy: new directions for the study of archaeological ceramics
Robert Power

Astronomy & Space Sciences
X-ray imaging and spectroscopy of the impulsive phase of a solar flare
Aidan O’Flannagain

Business
Understanding overpopulation: how commercial marketing tactics can help us tackle “the greatest shortcoming of the human race”
Daniel Philbin Bowman

Celtic Studies & Irish
The poetry of Seán Ó Riordáin
Thaddeus Ó Buachalla



GRADUATE AND INTERN CAREERS Dublin and Belfast

RIGHT PLACE. RIGHT TIME.

Ulster Bank, part of the RBS group, has the largest share of personal and business customers in Northern Ireland and is the third largest bank in the Republic of Ireland. We already make things happen for more than 1.9 million customers. But our success depends on people. The people that bank with us and the people that work for us. We aim to be the most helpful bank on the island of Ireland by building long-term sustainable relationships and adding value to our customers. Our success is such that we are consistently ranked in the Best Companies to Work For in Ireland. Not only that, we’re recognised by GradIreland as being one of the country’s top graduate recruiters. As a business, we have everything ambitious graduates need to take them places. We have graduate opportunities in Dublin and Belfast across Business Services and Corporate Markets. The place is here. The time is now. Find out more at

www.makeitrbs.com



Chemistry
Synthesis of novel indolocarbazole derivatives
Hannah Winfield

Computer Sciences & Information Studies
Mobile robot localisation using neural networks
Haoming Xu

Economics
A critical analysis of the sterilisation of the Yuan
Barry O’Donovan

Engineering & Mechanical Sciences
Outsole design for the enhancement of support and performance in sports footwear
William Holland

English Language & Literature
To what extent do the annotations to the Geneva Bible (1560) present the Book of Revelation as “a conventional apocalypse, [with] the conventional apocalyptic purpose of providing comfort to the suffering faithful” (Gabel et al.)?
Fionnuala Barrett

Environment & Geosciences
Solar UV disinfection of drinking water for developing countries
John Murtagh

Historical Studies
Womanhood under Stalin: selfhood under threat? A critical exploration of the Soviet sexual counter-revolution of the 1930s
Joanne Davies



Do you need help paying for university? If so, we can help!

If you live in Northern Ireland, there are various Government schemes available to help you pay for your university education. If you wish to study in the UK or Republic of Ireland as a full-time or part-time student, there are a number of ways we can help:

- Student loans
- Maintenance grants
- Special support grants
- Supplementary grants
- Support funds
- Tuition fees
- Travel expenses
- Eligibility criteria and application procedures

Students can get more info on student support by contacting Student Finance Northern Ireland on 0845 600 0662 or by logging onto www.studentfinanceni.co.uk



International Relations & Politics
“The war against terrorism has been based on misconceptions about the nature of the enemy.” Discuss
Cormac Hayes

Languages & Linguistics
La langue bretonne: passage de la langue vernaculaire à la langue condamnée
Anne Molloy

Law
A comparative analysis of freedom of association, trade unions and labour rights: authoritarian and libertarian perspectives
Donncha Conway

Life Sciences
How should we treat time in our investigation of coordinated movement?
Rachel Carey

Mathematical Studies
Instantons and the Taub-NUT space
Chris Blair

Medical Sciences
The impact of evolutionary theory on the history of developmental psychology
Louise Bhandal

Modern Cultural Studies
The female presence in the development of electronic and experimental music
Claire Leonard



Nursing & Midwifery
Critically evaluate the legal, ethical and professional issues associated with defining mental illness
Carmel Penrose

Pharmacy
Notch plays a critical role in synovial angiogenesis in response to hypoxia
Catherine Sweeney

Philosophical Studies
Describe and evaluate the supervaluational theory of vagueness
Siobhan Moriarty

Physics
Vortex dipoles: ordered structures from chaotic flows
Eamonn Kennedy

Social Studies
Famine: a crime against humanity? Or genocide? Death by starvation – Holodomor in Ukraine 1932-1933
Renate Stark

Teacher Education
From one heart to another: using visual arts as a medium of self-expression will activate an individual’s self-discovery
Nuala Finnegan

Irish (winner 2009)
Cillíní Páistí: briseadh croí faoi rún
Philomena Ní Flatharhta


Higher Education Authority, Brooklawn House, Crampton Avenue, Dublin 4.

www.hea.ie

The Higher Education Authority is the independent statutory agency charged with allocating funding on behalf of the Government to our universities, institutes of technology and other higher education institutions. The HEA also provides advice to Government on higher education and research issues.

The Undergraduate Awards of Ireland.

Make the difference.


Acknowledgements

As the Undergraduate Awards grew and moved into its second year, the list of individuals and organisations who have helped its development increased greatly in length. First and foremost, many thanks go to our eight founding partners: NTR plc, Google, the Higher Education Authority, the Department for Employment & Learning, Ulster Bank, the Royal Irish Academy, the Irish Times, and the Communications Clinic. Without your continued support, these awards simply could not operate.

It is the judging process that ultimately chooses a winner in each category, and every year we rely on the goodwill of academics and other interested parties from around Ireland and Northern Ireland. The judging chair in each instance recruits and organises their own panel, making them a crucial element of the awards programme. We are extremely grateful for the considerable amount of time and effort each of our 27 judging chairs and co-chairs, along with every member of their panels, puts into choosing the Undergraduate Award winners.

As a small organisation, we rely on the expertise and advice of the many individuals who populate our boards. Those who comprise our Academic Advisory Board ensure that academic rigour, which is fundamental to our work, is upheld in all that we do. The Undergraduate Awards’ Management Board and Board of Directors both aid at the operational level, overseeing the day-to-day running of the programme and its development. The members of these boards have been vital in their knowledge and encouragement.

However, many other individuals around the island of Ireland have contributed their feedback, helped us improve processes, and, crucially, helped to promote the awards in their own institutions or industries. Both lecturers and students’ unions have greatly increased the awareness of the awards over the last year alone by directly encouraging students to submit their essays online. And we are incredibly lucky to count among our supporters many recognisable figures in industry who offer their assistance in countless ways.

In the end, however, if it wasn’t for the students who submit their projects and essays, the awards would not exist. The Undergraduate Awards has a simple mission: to inspire, support, and celebrate the ideas of future generations of graduates. We believe that original thinking can and does exist at undergraduate level in Ireland and Northern Ireland, and that it should be acknowledged and rewarded. But we do not generate the ideas. It is with sincere thanks to every student who participated in the awards in both our inaugural and second year that we move into the 2011 programme on the back of great success. Congratulations to everyone who was highly commended in 2010, and indeed, to our 26 overall winners. May this journal be a reminder of your considerable ability to create ideas and innovate.

Oisin Hanrahan & Paddy Cosgrave



Highly Commended

Agriculture & Veterinary Sciences Stephen Moore

Chemistry Mark Moore

Ancient & Classical Studies

Computer Sciences & Information Studies

John O’Rourke

Archaeological Studies

Brendan Kelly

Astronomy & Space Sciences Robert Dixon

Michelle Picardo Business

Mari Haughey Clare McCollum Suzanne Meegan Catriona Forsyth Jennifer Cowman Kieran Little Elizabeth O’Connor James O’Connor Brian O’Flynn Victoria Ramsey Patricia Sheehan Celtic Studies & Irish

Claire Dunne Ursula McCarthy

Sean Ryan

Lianlong Wu Ciaran McKenna Howard Shortt David Haddock Deirdre Dalton Economics

Saite Lu Jennifer Pratt Roisin Lavelle Jason Somerville Jennifer Cowman Engineering & Mechanical Sciences

Patrick Murphy David Kelleher John Murtagh

English Language & Literature

Deirdre Ni Annrachain Kirstin Southard Hannah Partis-Jennings Environment & Geosciences Ciara Rooney

Martina Thege Sara Ferguson Historical Studies

Felix Paterson Siobhan Murphy Aideen O’Connor Hugh Taylor Jeffrey Chambers Robyn Atcheson Selina Little Fintan Neylan Esther Burke Susan Storey Caitriona Dowd

International Relations & Politics

Fay Niker Tom Sheppard Eoin Williams Olwyn Fagan

Languages & Linguistics

Clara Lyons Huw Duffy Steven Corcoran Claire Dunne Cadhla McDonnell Clare Houlahan Anthony Brophy Rachel Hynes Ruth Talbot



Padraic Lamb Susan Potter Law

Stephen Brittain Alan Banbury Ulic Egan Kelly Hardman Robert Kearns Emma Lawrence Orla Mackle Laura McMurran Paul Montague Keara Powell Nan You Chris Gallagher Emma Roche-Cagney Life Sciences

Siobhan Leonard David Leonard Caoileann Murphy Keith O’Brien Catherine Keogh Orlath Collison Steven Fagan Mathematical Studies

Gavin Armstrong Andrew Steele

Medical Sciences

Eanna Kenny Petra Lexova Feena Cunnane

Rebekah Mahoney Clíona Murphy Valerie Power Olivia McDevitt Heidi Kerrigan Eline Appelmans Atish Chopra Meave Mullooly Eamon Nugent Joanna Tilley Paul Ryan Sarah Cadogan Eóin Rickard Modern Cultural Studies

Ruth Clinton Ciara Gilvarry Jack Logan Sean Treacy Helen Haswell Anna Olsson Madhu Kambamettu

Nursing & Midwifery

Esther Funmilayo Afolalu Oliver Allen Michelle Beaumont Elizabeth-Anne Buchanan Jennifer Carey Karen Calvey Richard Doyle Sally Hanford Fiona Henerty Maria Jarosinski

Laura Mac Kenna Damaris Noble Meave Murphy Clare O’Neill Suzanne Thornhill Jennifer Ramsey Baggs Hannah Twomey Pharmacy

Patrick O’Leary Philosophical Studies

Alex Court Eoin Gubbins Yvonne Gubbins Fintan Neylan Fay Niker Physics

James McGrane Scott McKenchie Social Studies

Anthony Brophy Mary Noonan Janin Eberhardt Tamryn Reinecke

Teacher Education

Katherine Kennedy Serena Gordon Ryan Dolan Laura Curtis Helen Graham



Agriculture & Veterinary Sciences panel

Prof. Dolores O’Riordan, UCD (chair)
Prof. Pat Lonergan, UCD
Prof. Shea Fanning, UCD
Dr. Gordon Purvis, UCD
Dr. Barbara Murphy, UCD
Dr. Aine Ni Dhubhain, UCD
Dr. Kenneth McKenzie, UCD

Judges’ comments

This paper reviews the changes required in farming practices in Ireland in order to ensure the protection of high-quality water and improve the status of all water bodies. This essay presents an important topic and suggests useful approaches that may be successful in controlling water quality. It is an exceptionally well-written essay. An extensive literature is presented and it is evident that the candidate has assimilated the important elements. A very thorough consideration of the literature is presented and the candidate shows remarkable maturity of thought. The material presented is factually correct and the discussion presented is of a very high quality. The review is well organised and clearly presented, and the candidate demonstrated a very high level of synthesis of the material and critical thought. The judges considered the essay to be of an exceptionally high standard for an undergraduate and were very pleased to select it as the winning submission.


Agriculture & Veterinary Sciences

The influence of farming practice on water quality in Ireland

Tara Griffin

ABSTRACT

Agricultural pollution, a cause of eutrophication, accounts for 75.3% and 33.4% of the total nitrogen and total phosphorus loading in Ireland, respectively. Two of the biggest threats from eutrophication of water bodies are (i) increased growth of phytoplankton populations, which reduces the amount of oxygen available for aquatic life, resulting in poor-quality waters and fish kills, and (ii) reduced drinking-water quality: large phytoplankton populations are difficult to deal with during water treatment and may make the water distasteful, discoloured and odorous, and if the nitrogen concentration of the water exceeds 50 mg/l the water is undrinkable, as this concentration may cause methemoglobinemia in adults with poor immunity (e.g. cancer patients), children or the elderly. The Water Framework Directive requires Ireland, by 2015, to protect water of high quality and improve all water bodies to good status. This review investigates what changes in farming practices in Ireland are required to reach this goal. Five farming practices were reviewed: riparian buffer zones and constructed wetlands; manure application processes; conservation tillage; catch cropping, cover cropping and intercropping; and beetle banks. Each of these practices has been shown to reduce diffuse agricultural pollution of water bodies, with varying success. The best results in Ireland can be achieved by locating them where they have been proven to work best (e.g. riparian buffer zones work best where runoff does not converge before the buffer) and by combining the most suitable practices for each farm, so as to reduce diffuse agricultural water pollution as far as possible.



Introduction

Agriculture in Ireland accounts for 75.3% and 33.4% of the total nitrogen load and the total phosphorus load, respectively, that enter our water bodies (EPA, 2005). With 28.6% of our waterways currently polluted (Clabby et al. 2008), there is a large movement to clean up our rivers and lakes, which includes using best practices on farms to reduce the nutrient loading into rivers. The European Union Water Framework Directive (WFD; European Parliament and Council 2000) aims to protect water of high quality status, to prevent the degradation of water bodies, to improve the quality of all waters to good status by December 2015 (Humphreys, 2007), to involve the public and to streamline legislation (WFD Ireland, 2009).

Agricultural pollution enters Irish water bodies in two main ways: through point sources and diffuse sources. Point source water pollution originates from a single identifiable, locatable source, such as a pipe or a leak in a tank, while diffuse pollution cannot be traced to a single point because it enters the river by processes such as overland flow or runoff, making it harder to prevent and control. Diffuse source pollution affects 45.2% of rivers while point source pollution affects 13.5% (EPA, 2005). Diffuse sources are therefore a larger issue than point sources in terms of water pollution from farm land, and it is this pollution source that requires the most control with regard to farming practices in order to meet the water quality standards required by the WFD by 2015.

There are two main types of agriculture in Ireland: intensive and organic. Intensive agriculture is the most widely practised, with high inputs and high outputs and an emphasis on obtaining the largest possible yield from the available area of land. Organic agriculture has low inputs and moderate outputs, with an emphasis on environmentally friendly farming practices that avoid artificial fertilisers and pesticides. Organically grown food has been marketed to the public as the healthier option, so the farmer can charge a higher price and is compensated in this way for the loss in yield that can occur from using these methods. Applying different farming practices to both types of agriculture can reduce pollution of water bodies. The literature demonstrates that some practices work better in some circumstances than in others; reasons include outside influences such as climate or soil type, or the farming practice not being suited to the type of farming.

Point source pollution in Irish farming is better managed today than it was 10–20 years ago. In 1987 pollution originating from agricultural practices resulted in an unprecedented number of fish kills. Because of this, a £1 billion investment was made to improve farmyards over a ten-year period (Beegle et al. 2000). Today farmyards are inspected by two authorities, the EPA and the Department of Agriculture. If the farmer has signed up for REPS (the Rural Environment Protection Scheme), its officials also visit. Point sources are easy to identify, as they usually take the form of leaks and pipe discharges, and relatively easy to control, as they originate from a single location. Owing to these factors, point source pollution is no longer the problem it once was, and diffuse pollution, being harder to identify and control, is the most important and largest contributor to agricultural pollution in Ireland.

This review aims to present and evaluate farming practices that have been shown to reduce the nutrient and chemical loading of rivers by diffuse sources, and to indicate which of these practices could best be applied in the Irish farming environment.

Diffuse Sources

Diffuse sources of agricultural pollution are the hardest to prevent and control and are also the primary source of agricultural pollution to water bodies (Magette et al. 2000; Herzog et al. 2008). The effects and severity of diffuse pollution depend on many factors, such as soil type, climate and organic matter content (Holden et al. 2004; Spargo et al. 2006; Leys et al. 2007). Because of these different influences, not all farming practices will reduce diffuse pollution on all farms (Peigné et al. 2007), and a combination of practices may need to be applied in order to achieve a satisfactory reduction (Holland 2004; Beaudoin et al. 2005; Verstraten et al. 2006).

Riparian Buffer Zones and Constructed Wetlands

Riparian buffer zones are areas of uncultivated, permanently vegetated land of varying width located between fields and waterways. They are constructed to slow Riparian buffer zones are areas of uncultivated, permanently vegetated land of varying width located between fields and waterways. They are constructed to slow down agricultural runoff by filtering sediments, nutrients and pesticides. As the water flows over and through the riparian buffer zone vegetation adsorbs nutrients and organic material and traps sediments (Uusi-Kämppä et al. 1998; Ducros and Joyce 2003). One very important effect of riparian buffer zones is their ability to trap sediment (UusiKämppä et al. 1998; Owens et al. 2007). This is because inorganic phosphorus chemically binds to agricultural sediment, approximately 860mg P /kg soil (Chambers and Garwood 2000). Much phosphorous can be prevented from reaching water bodies by riparian buffer zones by reducing the amount of sediment runoff. Usui-Kämppä et al (1998) has shown that careful management of riparian buffer zones can decrease total P losses up to 97%. Maintenance of vegetation was shown here to be important, as riparian buffer zones that were not cut in autumn and the grass removed added to the amount on P losses to water due to the decay of the vegetation. Cutting and harvesting of grass will also remove P from the buffer zone area. They have also shown that the type of vegetation grown on the riparian buffer zones do not have a significant influence on P retention. This was explained due to i) rapidly growing vegetation taking up P efficiently, and ii) dense vegetation slowing down the water resulting in more soil, and therefore P, deposition. Grass freezing in winter also results in increasing plant P leaching as found by Ulén (1984) (cited by Usui-Kämppä et al. 1998), by freezing and thawing grass in the lab. 
Owens et al (2007) also has shown that riparian buffer zones trap sediment and the P that binds to it and has found that land use, soil type, and slope influence the amount of sediment-associated P that reaches these buffers. This study also has shown that where erosion and overland flow are channelised, buffers may need to be re-enforced to prevent “break-points”. In these instances buffers may be required only in specific locations, i.e. where the channelisation accurs. In contrast to these results Verstraten et al (2006) has shown that the sediment trapping capability of riparian buffer zones in a catchment is reduced where overland flow converges and



bypasses the buffer through ditches, sewers and road surfaces. This study showed that, on a catchment scale, riparian buffer zones reduce sediment loading by only 17% and that they should be accompanied by other nutrient management techniques in order to reduce the pollution loading of the water body sufficiently. Like riparian buffer zones, strategically placed constructed wetlands slow down agricultural runoff and trap sediment (Uusi-Kämppä et al. 1998; Owens et al. 2007). The biological action of biota in the wetland, e.g. phytoplankton and macrophytes, can also break down organic pollution. A short-term, two-year study by Tanner et al. (2005) of a constructed wetland (~1% of the catchment area) draining grazed dairy pasture obtained positive results for nitrogen removal. In the first year 79% of total N was removed, and in the second year 21%; the decrease in the second year can be partly explained by a fall in drainage flow from 305 mm in the first year to 197 mm in the second. Nitrogen dissolves readily in water, and so larger amounts of rainfall result in larger amounts of nitrogen leaching from the soil. Unlike the results in Uusi-Kämppä et al. (1998), where the average phosphorus retention of constructed wetlands was 41%, Tanner et al. (2005) showed that total P rose by 101% in the first year but decreased to 12% in the second year. This can be attributed to phosphorus-bound earth being disturbed by the initial construction of the wetland and subsequently settling. This shows the benefit of longer-term studies, which prevent results from being skewed by the initial “settling in” periods of experiments. If N, P and sediment loads to water bodies are reduced by these two methods, it is likely that other sediment-associated pollutants, such as pesticides, metals and pathogens, can also be removed (Owens et al. 2007).
This may be another advantage attributable to riparian buffer zones and constructed wetlands and their ability to reduce the diffuse agricultural pollution load to water bodies.

Manure Application Practices

The evolution of agriculture in Ireland has resulted in farms (mostly intensive systems) that produce more organic fertiliser than can be spread on the available land. A large proportion of the nutrients consumed by livestock (usually >75%) is excreted by the animals instead of being ‘exported’ off the farm in the form of meat, eggs, milk etc. (Beegle et al. 1998; Shepherd and Chambers 2005). This applies especially to pig and poultry production, where there is insufficient land to cope with the amounts of manure produced (Lord et al. 1999). The worst case found during this review is one where 1000 kg N/ha per year was spread in England on land which was later designated a nitrate sensitive area (Lord et al. 1999). Nitrate sensitive areas have been defined by the Department for Environment, Food and Rural Affairs (U.K.) as areas where drinking water is vulnerable to nitrate pollution (DEFRA, 2005). Spreading of organic fertiliser (animal manure) should essentially meet three targets: it should meet crop requirements, avoid losses to the environment and maintain productive soil (Lewis et al. 2003; Shepherd and Chambers 2005). Losses to the environment can be reduced by spreading on a suitable day and by spreading only on fields that will produce no more than moderate leaching (Lewis et al. 2003). Spreading on a suitable day is one of the most important factors



(Lewis et al. 2003). Spreading in autumn/winter has shown increased N leaching compared with spring applications (Weslien et al. 1998; Lewis et al. 2003). This can be attributed to low N requirement and uptake by the crop and to increased rainfall. Rainfall on the day of spreading and on the preceding days should not have left the soil too wet to absorb the organic fertiliser, and little or no rain should fall on the spreading day itself. The forecast for the days after spreading should not include prolonged moderate rainfall or single heavy rainfall events, as these will, as described by Holden et al. (2004), have a “significant polluting impact”. In a model-based study of N leaching in fields in Ireland (Wexford) and Scotland, Lewis et al. (2003) found that sandy loam soils (found at the experimental sites in Scotland) generated the most leaching, while the poorly drained Irish sites produced the least. Surface runoff (generating a high pollution risk) was higher for the poorly drained Irish soils than for the well drained Scottish soils. This shows that soil type and runoff, along with other soil characteristics, can determine whether a field will suffer from moderate leaching. Injection of organic fertiliser is regarded as a new, ‘greener’ technology. This is true when it comes to reducing N2O and NH3 emissions and odour, but this method of application does not reduce, or increase, nitrogen losses into water bodies (Weslien et al. 1998; Pahl et al. 2001; Lewis et al. 2003). It is therefore not an applicable farming practice when attempting to reduce agricultural water pollution in Ireland. Increasing the fertiliser application rate increases the risk of water pollution only slightly until the crop becomes over-fertilised, at which point there is a disproportionate increase in the risk of pollution (Lord et al. 1999; Lewis et al. 2003; Shepherd and Chambers 2005).

Conservation Tillage

Farming in Europe relies heavily on soil conservation (Holland 2004), while other parts of the world have used conservation tillage for many years. The technique is used to reduce soil erosion and compaction, conserve moisture and reduce production costs (Holland 2004; Peigné et al. 2007). With 16% of farmland in Europe susceptible to soil erosion, the area cultivated using minimal tillage is increasing (Holland 2004). Conservation tillage has been shown to reduce soil erosion in Europe by 88% (Leys et al. 2007) and to improve the quality of stream ecosystems in the United States, where conservation tillage is most widely practised (Yates et al. 2006). A reduction in soil erosion will reduce the amount of soil-bound phosphorus and of soil-associated agrochemicals reaching water bodies (Holland 2004). Leys et al. (2007) also found reduced runoff with conservation tillage; some of this reduction can be attributed to an increase in topsoil organic matter content. Peigné et al. (2007) found an increase in aggregate stability, which improves the infiltration rate of the soil and thus reduces runoff. Conservation tillage may not lend itself as well to organic farming as it does to conventional farming. This is due to the heavy reliance of organic farmers on ploughing, mainly for weed and disease control (Peigné et al. 2007). Peigné et al. (2007) found that there are major limitations to conservation tillage in organic systems. They found increased weed pressure, which was also found by Johnson et al. (2002), as mechanical weed control is not yet suited to soil with crop



residues. Topsoil compaction during the transition year from conventional to conservation tillage was another issue, which can itself reduce crop establishment and drainage. A slowing of the nitrogen cycle was also found; this reduces nitrogen availability for the crop and can impede growth. Holland (2004), however, suggests that conservation tillage results in richer soil biota, which can improve nutrient recycling. Here we see a farming practice which has much success in reducing water pollution in some sectors but is not viable in others. Holland (2004) has stated that conservation tillage is more suited to relatively dry soil conditions; this is not the case for most soils in Ireland, due to heavy rainfall events throughout the year. Wet weather can also prevent drilling (Holland 2004). This suggests that conservation tillage would not suit Irish soils. In contrast, Leys et al. (2007) have shown the benefits of conservation tillage in Belgium on loamy soils that suffer from frequent flooding. Conservation tillage is also practised in some tropical areas of the world (Holland 2004); if it did not work in wet weather, it would not be used there. On balance, the literature suggests that conservation tillage is a farming practice that would be suited to Ireland in order to prevent diffuse agricultural pollution to water bodies.

Catch crop, Cover crop and Intercropping

The introduction of these farming techniques involves changing the field from a monoculture system to a system of multicropping (Whitmore and Schröder 2007). Catch crops are crops harvested for low value uses, e.g. animal fodder; cover crops are grown under or after the main crop and are ploughed back into the soil as green manure; and intercropping involves growing two or more crops simultaneously, both having high value end uses, e.g. sale. Catch crops are grown to reduce leaching and, when incorporated back into the soil, they remineralise nitrogen for the following crop. Catch crops can be under- or autumn-sown and incorporated back into the soil in late autumn or the following spring (Torstensson and Aronsson 2000). They can also be grown during the summer if the main crop is grown in winter; here they work best if the summer climate is usually wet. Where the main crop is not over-fertilised, catch crops such as ryegrass have been found to reduce leaching by 51-60% (Torstensson and Aronsson 2000; Beaudoin et al. 2005). Stenberg et al. (1999) observed no difference in nitrogen leaching, but acknowledge that peers carrying out catch cropping experiments in the same country have shown the ability of this farming practice to reduce nitrogen leaching. One hypothesised benefit of catch cropping is the remineralisation of N for the following crop. This needs further study, as Torstensson and Aronsson (2000) found a soil nitrogen deficit after catch cropping: less nitrogen was available to the following crop in the early growing season, and lower yields were observed when the catch crop was ploughed into the soil in spring instead of late autumn. They explained this by i) nitrogen uptake by the catch crop and ii) low nitrogen remineralisation during the winter. Recent studies in England on the ability of cover crops to reduce nitrogen leaching have shown reductions (Johnson et al. 2002) of 29-91% (MacDonald et al. 2005). Catt et al. (1998), in Oxfordshire, showed a reduction in nitrogen leaching due to cover crops in comparison to winter fallows, but also found that continuous winter cereal crops reduce nitrogen leaching



further still. MacDonald et al. (2005) showed cover crops to be more successful during wet winters, which suggests that this farming practice would suit Ireland, but they also commented that “they are less likely to be effective on poorer drained, medium-heavy textured soils”, which are common in Ireland (Humphreys 2008). The use of this farming practice in Ireland will therefore depend largely on soil type and drainage. Cover cropping has also shown the ability to suppress weed growth under western European conditions (Kruidhof et al. 2008). Cover crops may inhibit weeds in late summer and autumn by suppressing growth and seed production through competition; fodder radish, winter oilseed rape and winter rye were most successful at this. Cover crop residues can reduce or retard weed emergence in spring. Lucerne had the strongest effect on weed suppression in spring (Kruidhof et al. 2008). Winter oilseed rape had strong effects during both periods and is therefore the suggested cover crop for this purpose. Weed management using this farming practice reduces the need for herbicides, which can reduce agricultural chemical pollution of water bodies. Johnson et al. (2002) and Muñoz-Carpena et al. (2008) have shown that fertiliser application must be reduced when a cover crop is used, to ensure that remineralised nitrogen in the soil does not increase nitrogen leaching. Intercropping has also been shown to reduce nitrogen leaching both in the field (Li et al. 2005) and in model studies (Whitmore and Schröder 2007). Whitmore and Schröder (2007) found that a maize-grass intercrop reduces N leaching by 15 mg/l when compared to a conventional catch crop and by 20 mg/l when compared to fallow soil. Traditional intercrops in China, such as wheat-maize and maize-faba bean, reduce the accumulation of nitrate in the soil profile, thus reducing the nitrogen available for leaching (Li et al. 2005).
Reasons for increased efficiency in the use of mineral nitrogen include complementary root distribution and longer growing periods (Li et al. 2005).

Beetle Banks

Beetle banks were designed to control attack by cereal aphids (a pest of winter wheat) by providing an overwintering habitat for beneficials such as polyphagous arthropods and linyphiids, which prey on aphids (Collins et al. 2002; Collins et al. 2003; MacLeod et al. 2004). The provision of an overwintering habitat reduces the distance that beneficials have to travel in order to populate the whole field during aphid attack on cereal crops, and aims to maintain a population of beneficials in the middle of the field as well as near the boundaries (Collins et al. 2003; MacLeod et al. 2004). In two long-term English studies, of five and seven years, the population of polyphagous arthropods on the beetle banks was greater than or equal to the population at the field boundaries (Collins et al. 2003; MacLeod et al. 2004). Collins et al. (2002) have described the presence of a beetle bank in the middle of a cereal field as having a “significant impact on reducing aphid populations in the crop”. They have also shown that linyphiid populations were high on beetle banks, especially in spring and early summer, the time when aphid populations were at their highest, and that polyphagous predators were still influencing aphid populations 83 m from the beetle bank, although the impact decreased with distance from it. Nix (1999) found savings of between £3 and £12/ha for the



farmer for each application of insecticide that could be avoided. With less insecticide spraying there is a reduced risk of water pollution from agrochemicals. Beetle banks do not work on conventional farms because, in using pesticides to kill off the pest species, the farmer also kills the beneficial species. This practice is therefore limited to organic farms or conventional farms that do not use pesticides. Irish farmers will need to reduce or eliminate pesticide use in order for this practice to work. No such trend has been observed: farm spending on plant protection products (as defined by the EU) has increased from 60.6 €/ha in 2000 to 67.1 €/ha (Knaapi 2008). The effects of beetle banks on non-cereal pests need to be investigated in order to establish the full benefits, or limitations, of beetle banks.

Conclusions

Individually, each of these farming practices can work to reduce diffuse agricultural pollution on Irish farms. Their success will be determined by correct location (e.g. valleys vs. flat land, with the correct soil type and drainage) and by the right situation (e.g. intensive vs. organic farmland). Diffuse agricultural pollution is difficult to control, and a farming practice may reduce one type of diffuse pollution while increasing another; e.g. if the application of slurry to a field is calculated to meet nitrogen requirements, an excess of phosphorus may result (Beegle et al. 2000). These practices may also reduce yields (Torstensson and Aronsson 2000) or cause other problems such as increased weed pressure (Johnson et al. 2002). A combination of farming practices is required in order to reduce diffuse agricultural pollution significantly (Holland 2004; Beaudoin et al. 2005; Verstraten et al. 2006). This can be seen in the many countries where farming schemes have been introduced. These schemes combine many farming practices in order to reduce diffuse agricultural pollution as well as other types of agricultural pollution. Some examples are R.E.P.S. in Ireland (Humphreys 2008), Nitrate Sensitive Areas (NSAs) in the U.K. (Lord et al. 1999), Proof of Ecological Performance (PEP) in Switzerland (Herzog et al. 2008) and Good Agricultural Practices (GAP) in the E.U. (Beaudoin et al. 2005). Farmers may be compensated for costs incurred from the changes made to their farming practices (e.g. under R.E.P.S.) or may not (e.g. under the NSA Basic Scheme) (Lord et al. 1999). Lord et al. (1999) studied the effect of turning agricultural land into NSAs under two schemes. The Basic Scheme required restricted fertiliser use and the use of cover crops in winter; the Premium Scheme required these and also the conversion of some arable land into low-input grassland, which was compensated for.
The Premium Scheme had the largest influence, with an 80% reduction in N leaching when arable land was converted to low-input grassland. The Basic Scheme’s most successful practice was the use of cover crops. An overall 30% reduction in N leaching shows how a combination of farming practices makes the most difference, but also that the most difficult choice for the farmer (conversion of arable land to low-input grassland) is often the most successful. While Lord et al. (1999) found cover crops, conversion of arable land to low-input grassland and reduced fertiliser application to have the best results in reducing N leaching, Beaudoin et al. (2005) found that a combination of N-fertiliser optimisation, use of cover crops and straw incorporation yields the best results.



The R.E.P.S. scheme in Ireland aims to “i) establish farming practices and production methods that take into account conservation, landscape protection and wider environmental problems, ii) protect wildlife habitats and endangered species of flora and fauna; and iii) produce quality food in an extensive and environmentally friendly manner” (Humphreys 2008). In 2006, 45% of all farms were participating in R.E.P.S. (Humphreys 2008). Nutrient management is one of the most important aspects of R.E.P.S., and restricting and managing organic fertiliser (animal manure) has been proven in many studies to reduce diffuse agricultural pollution to water bodies (Weslien et al. 1998; Torstensson and Aronsson 2000; Lewis et al. 2003). This may explain the reduction in the amount of money spent by farmers on inorganic (mineral) fertilisers and other soil improvers, from 76.2 €/ha in 2000 to 61.8 €/ha in 2006 (Knaapi 2008). R.E.P.S. also encourages the creation of wetlands or the management of existing ones, which have also been shown to reduce pollution (Uusi-Kämppä et al. 1998; Owens et al. 2007). R.E.P.S. aims to combine farming practices in order to achieve the best results in reducing diffuse agricultural pollution to Irish water bodies. From the literature reviewed, it is clear that more catchment studies are needed in order to obtain a full picture of how successful a farming technique is. This was seen in the case of riparian buffer zones, where Uusi-Kämppä et al. (1998), in a review of Nordic experiments, found a 27-97% reduction in P loading to rivers. In these experiments the riparian buffer zones were at most 16 m long and were tested with both artificial and natural runoff. These experiments yielded positive results for riparian buffer zones but, in a catchment scale experiment, Verstraten et al. (2006) demonstrated that where drainage flow converges before the riparian buffer zones, a reduction of only 17% in P leaching is observed.
This also reinforces the need for careful siting of these farming practices. With careful location of these farming practices where they are required, and with the use of the right combinations, Ireland will have a better chance of reaching the standards set by the WFD by 2015.



Ancient & Classical Studies panel


Prof. Brian McGing, TCD (chair)
Dr. Aude Doody, UCD
Dr. Kieran McGroarty, NUIM
Dr. Amanda Kelly, NUIG
Prof. Anna Chahoud, TCD

Judges’ comments

This essay is an excellent example of how traditional philological analysis, when allied with a broad range of knowledge and the imaginative mind of a cultural historian, can yield impressive profits from a modest investment. Starting from the single Old English word wyrd, which means something like, but not exactly, fate, the winner traces its story in other Indo-European contexts, teasing out the possible influences on its meaning of societal differences among Anglo-Saxon and Germanic peoples. He moves with impressive ease from Old English to the Greek of Homer, the Latin of Vergil and the Hebrew of the Old Testament, ending with a fine image to convey his conclusion: “since wyrd expresses possibility in destiny, it might be seen in terms of the weft stretched on a loom, fixed and yet with the pattern of the cloth still undetermined.” This is a succinct piece of work, sharply intelligent, nuanced and written with the precision of a true scholar. As one member of the panel observed: “When reading the paper you had to remind yourself that it was written by an undergraduate.”


Ancient & Classical Studies

Wyrd in linguistic and cultural context
Mark Anthony Cooke

I open this essay with a slight corruption of the question posed by Juliet in Shakespeare’s Romeo and Juliet: What’s in a Word?1 In the case of the Old English wyrd, the answer would be: a whole cosmos, cosmos used here both in its ancient sense of an ordered system and in its more modern meaning of the people inhabiting that system. Loosely translated, wyrd means fate, but it encompasses a concept so broad that it cannot comfortably be rendered by any single term in modern English. First, this essay will trace the journey of wyrd from its distant Indo-European origins down to its use in Old English and will then examine a variety of its uses within the extant corpus of Old English texts. The focus here will be on where and how wyrd deviates from a modern understanding of fate in an attempt to bridge the inevitable gap between the mindset of modern English readers of the Old English texts and an etymologically reconstructed version of what speakers of the then nascent English language would have understood by it. The Old English noun wyrd can be traced back to the Indo-European verb *uert, meaning to turn. A trace of this pre-historical meaning (the twists and turns of fate) can still be found in the phrasal verbs to turn out and to turn into. It appears in Proto-Germanic as *wurdís which can be loosely translated as things that come about. This, in turn, enters Old Saxon as wurd, Old High German as wurt and Old Icelandic as urðr. All three terms mean fate. Indeed, in this context, it is

1 The correct quote is “What’s in a name?” Romeo and Juliet (Act II, scene ii, 1-2).



noteworthy that urðr was the name of one of the three most important Nornir,2 the fates in Norse mythology. Moving forward chronologically, the earliest written attestation of wyrd in its Old English manifestation is to be found in Beowulf. This singular (in every sense of the word) Old English noun is also related to the Old English verb weorðan.3 To translate wyrd into modern English simply as fate with no further comment would be tantamount to ignoring the very fabric of Anglo-Saxon society. The perception of the world around us and how we codify reality is inextricably tied up with the language we speak, whether we are modern Europeans or ancient Anglo-Saxons. However, although we are inextricably linguistic beings, it is always important to note diachronic shifts in cultural practices and expressions. The Germanic peoples, for example, practised an elective system of kingship. In short, in contrast to the hierarchical structure of most Mediterranean cultures, they practised a system of leadership that was more horizontal in its political structure. This unusual elective component would seem to suggest that within the ancient world, the individual in Germanic society had a somewhat stronger sense of control over his destiny than his Mediterranean counterparts. This is reflected at a linguistic level inasmuch as a core component of wyrd is that power of agency vested in the individual. An important question in this context would be whether this putative Germanic sense of self-agency survived in Anglo-Saxon England. As M.J. Swanton puts it, “Germanic society [as] described by Tacitus at the end of the first century must have appeared decidedly antique from the point of view of the fifth” (Swanton 15). He postulates, however, that “there is no good reason to doubt that it survived in its essentials into the age of migrations” (Swanton 15).
Similarly, it is quite possible that an Anglo-Saxon notion of agency in the face of destiny (a notion quite at variance with the ancient Mediterranean idea of the role played by fate) survived well into the age of Christianity. Fate as expressed by the Old English wyrd carries a different meaning from the Greek concept of fate as used by Homer in The Iliad. Hector, pleading with his wife not to grieve, says, “μοῖραν δ᾽ οὔ τινά φημι πεφυγμένον ἔμμεναι ἀνδρῶν, οὐ κακὸν οὐδὲ μὲν ἐσθλόν, ἐπὴν τὰ πρῶτα γένηται”4 (Homer). Here, μοῖρα5 conveys the idea of a fate that is predestined for everyone who enters this world and to which man must surrender. Wyrd, alternatively, suggests a degree of free will. This is demonstrated very clearly in Beowulf. When it is ordered that gold be paid for the warrior whom Grendel has murdered, we are presented with the outcome that would have come about had Beowulf not taken action: “he hyra ma wolde [acwellan], nefne him witig God wyrd forstode ond ðæs mannes mod” (ll. 1055-1057, Chickering 108).6 Beowulf’s courage, or indeed the courage of any man, is a viable instrument with which to counteract the force of wyrd. Hence wyrd is not inescapable and unchangeable, setting it apart from the Greek μοῖρα.

2 Norn, plural Nornir, the mythological women who rule the fates in the Poetic Edda.
3 To become.
4 I say, from fate, can no man escape, be he cowardly or courageous, once he has entered this world.
5 Moira: fate, feminine nominative singular.
6 “[H]e would have killed more had not wise God and Beowulf’s courage changed that fate.” (Chickering)



Neither does wyrd correspond to the finality conjured up by the Latin fatum, as used, for example, by Vergil in The Aeneid where Thymoetes is referred to as a “tool of fate of Troy’s predestined fall” (Virgil II, line 34).7 However, with the spread of Christianity, Latin came to have a significant influence on Old English. The Church not only established, or rather re-established, Latin as the dominant language of writing, but it had also established a system of hierarchy that was, in some salient points, at variance with the Germanic heritage of the Anglo-Saxons. To what extent, we need to ask, did and could older connotations of wyrd survive a cultural and therefore concomitant linguistic upheaval as vast as the Christianisation of England? There is, however, a line in the A section of Daniel, the Old English poem based on the Book of Daniel, which suggests precisely such a survival. In line 149 we find “wyrda gerynu” (The Complete Corpus) which can be translated as the mysteries of fate. To accept such a translation at face value with no correction of the parallax that historical distance brings, however, would be to ignore much of the spectrum of meaning hidden in wyrd. The full scope of wyrd in Daniel can only be understood if we look at the Old Testament versions most probably available to its anonymous author. Given the educational structures of the time, it is unlikely that he was anything other than a member of the clergy or, at least, somebody with access to a monastic education. The most common biblical source of the time was the Latin Vulgate Bible where we find no mention of fateful mysteries, but instead the more open-ended notion of “quæ ventura sunt in novissimis temporibus,”8 (Daniel, 2:28, Vulgate Bible) which allows room for possible human intervention in the interim before these events take place. It is also possible that our author had access to and knowledge of the Koine Greek Bible. Jane Stevenson, in her article on St. Ephraim, speaks of “a number of Syriac and Greek works” in the library of Canterbury in the 7th century, “the contents of which formed part of a syllabus taught to an unknown number of Anglo-Saxon students” (Stevenson). It is not unreasonable to imagine the Koine Bible to have been among these works. There, however, we find “ἃ δεῖ γενέσθαι,”9 a construction that does not suggest inescapable fate. Even in the Hebrew text, the sentence “מה די להוא באחרית יומיא”10 still allows for possible changes in the intervening period before events unfold. In none of the three versions, in short, is there any mention of fatum, μοῖρα11 or יַד הַגּוֹרָל12 and all that these imply in terms of the fixity of future outcomes. It seems unlikely that our author would break with such a tradition of literal translation and introduce the notion of inevitable mysteries of fate into his version. This analysis sees, however, our anonymous author’s “literal translation” not as a transliteration but rather as his remaining true to the sense of the original Hebrew, Greek and Latin. It should be remembered, moreover, that the Old English Daniel is first and foremost a poem rather than primarily a translation. The poet, in

7 primusque Thymoetes duci intra muros hortatur et arce locari, siue dolo seu iam Troiae sic fata ferebant.
8 Daniel, 2:28, Vulgate Bible: events that will come [to be] in latter days.
9 Daniel, 2:28, Septuagint Koine Greek Bible, Mod. Gr. τι μέλλει γενέσθαι: that which will happen.
10 Daniel, 2:28 Tanakh (Old Testament), mo di lehevey b’akharis yomayo: that which will be in latter days.
11 Moira: fate.
12 Yad hagôrāl: the hand of fate.



all probability cognisant of one or more of the biblical source texts, chose his words carefully (in the sense of with due care to meter and pregnancy with meaning) and yet did not break in any way with the previous tradition of literal renderings. His version, for all its poetry, is just as literal as the Greek or the Latin and, moreover, eloquently expresses (or carries over, in the older sense of metapherein) the notion of “that which shall be” as it stands not only in the Latin and Greek texts but also in the Hebrew. The absence of a damning, inevitable, inescapable fate is conspicuous in all three of these texts. Based on this fact, it is hard to imagine that in the Old English version of Daniel’s story it should be otherwise. A negation in The Wanderer also hints at the notion of a fate that can be deterred. In line 15 we read, “Ne mæg werig mod wyrde wiðstondan”13 (Treharne 56). Since the weary mind, it is claimed here, cannot withstand fate, then it logically follows that the alert mind would have the option to do so. Indeed, this very sentiment is expressed in the affirmative by Beowulf when he recounts his battle with the sea monster. He manages to survive the attack, declaring in line 572 “wyrd oft nereð unfægne eorl, þonne his ellen deah”14 (Chickering 82). The claim that wyrd never signifies inescapable fate would be a reduction just as limiting and blinkering as the claim that the concepts of fatum or μοῖρα are the only interpretations that the word allows. The modern English word fate lacks the spectrum of meaning of its Old English counterpart, and yet this is how it is translated. As this analysis has demonstrated, there are facets of Old English wyrd that have become obscure with the passage of time and indeed are probably not fully accessible to speakers of modern English. However, it is important to keep this gap in mind when reading the relevant texts in order to ensure that such gaps are not breached by solecistic assumptions. 
If one were to attempt to bridge this gap, one might do so using the analogy of weaving, appropriately enough. Since wyrd expresses possibility in destiny, it might be seen in terms of the weft stretched on a loom, fixed and yet with the pattern of the cloth still undetermined.

13 A weary mind cannot withstand fate.
14 “So fate often saves an undoomed man when his courage holds.” (Chickering)



Archaeological Studies panel


Dr. Colin Rynne, UCC (chair)
Dr. Thomás Ó Carragáin, UCC

Judges’ comments

The judging panel felt that this essay brought a very fresh, incisive and insightful approach to the directions pursued in more recent research on archaeological ceramics. This was very clear in its treatment of the traditional debate on whether or not ceramic assemblages can still realistically be used as firm evidence for ethnicity. A number of different theoretical positions on this problem are expertly contrasted here, in particular regarding the implications for ceramic studies of the new questions posed by post-processualists on diversity and social agency in artefact production. It also calls into question the future relevance of ethnoarchaeological studies to our understanding of the use and stylistic meanings of pottery in antiquity. Finally, this study is not afraid to advert to the apparent unwillingness of recent specialist reports on ceramic assemblages to embrace basic interpretation, let alone modern theoretical enquiry, despite the continuing importance of pottery in archaeological investigations of all periods.



Archaeological Studies

Potential amidst stagnancy: new directions for the study of archaeological ceramics
Robert Power

Maligned or cherished but never ignored, the study of pottery is synonymous with archaeology. Despite fluctuations in the popularity of artefact studies in theoretical quarters, pottery is still one of the archaeologist’s most favoured means of gathering information on past societies. It has long been assumed that pottery is the ideal artefact to serve as a tracer of vanished cultures. The remarkable permanence of potsherds allows pottery to survive in virtually all soils and conditions. In most cultures pottery is an everyday item and is not restricted to certain social ranks (Rice 1987). Pottery utensils often served a very ordinary domestic purpose, allowing their archaeological extent to give a valuable insight across the strata of society. However, in recent decades, a plethora of newer archaeological specialities has emerged and gone from strength to strength. Coupled with the increased scepticism about inferring cultural certainties that followed the development of post-processualism (Hodder 2003), this has challenged the hegemony of pottery studies. How representative of past cultural processes is old and usually broken pottery? Indeed, many commonly accepted principles of pottery studies have been challenged. For instance, Bell et al. (1996) demonstrated that the acclaimed universal durability of pottery did not hold in highly acidic peaty areas. It has also long been assumed that the degree of pottery production can indicate the level of sedentism reached by a society, as craft specialisation is a key characteristic of civilisation in itself (Arnold 1985). A high level of pottery production in particular is associated with an elevated level of craft specialisation and thus sedentism (Pratt 1999). However, a number of researchers, such as Zvelebil (2008), have drawn attention to the presence of ceramics in several pre-agricultural cultures such as the Forest Neolithic in Europe, disproving the



accepted notion that the presence of pottery is essentially evidence of agriculture. Is ceramic theory still valid in the face of increasingly sophisticated archaeological science? These questions are necessary if the discipline is to keep up with the greatly increased pottery assemblages that are now being recovered, such as in Ireland. In this essay I seek to examine these questions by reference to the past and current trends in the discipline, with a focus on Irish ceramic studies. The study of ancient pottery predates the development of modern archaeology. From the 15th century there was art-motivated interest in historical pottery from past cultures in Europe (Orton et al. 1993). As archaeology slowly emerged from antiquarianism in the 19th century, excavation methods became more meticulous. The increasing quantities of pottery that were recovered must have provided an impetus for pottery studies to develop. In this environment the methodology of seriation developed; it was a highly favoured, if not almost universal, methodology. This involved classification of pottery into chronological groups, often based on the premise of cultural evolution. Pitt Rivers’ late 19th century excavations in Wessex defined the earliest British pottery tradition by the coarseness of the pottery fabric material (Gibson & Woods 1997, 11-13). In time it was realised that if accurate catalogues of the changes in pottery styles could be compiled, pottery could be an excellent means of dating archaeological sites and other artefacts by way of a relative chronology. Changes in style, form and construction could be contrasted with other artefacts and the resulting information compared with data from sites. By collating information on several pottery types it was possible to create relative chronologies (Gibson & Woods 1997). For instance, Beaker pottery is a classic indicator of the appearance of metallurgy across much of Europe.
This long-recognised association could be deduced from the retrieval of Beaker pottery with the earliest copper and bronze implements (Gibson 2002, 17). Abercromby’s landmark 1912 publication “A study of the Bronze Age pottery of Great Britain and Ireland and its associated grave goods” epitomised the typological phase of this time. Pottery chronologies allowed archaeological sites to be more accurately interpreted; thus they were an indirect means of studying human behaviour. In the following decades this approach developed within its own parameters. Refinements in recording and interpreting stratified archaeological deposits during excavation allowed improved chronologies to be created. Despite these refinements, this approach has been succeeded by a more holistic perspective. Shepard’s (1956) groundbreaking work “Ceramics for the Archaeologist” drew together chronology, trade/distribution and technological development, paving the way for the contextual approach (Orton et al. 1993, 13). Around this time extremely rigorous typologies helped to create a backlash that revised the typological paradigm. Previously there had been little attempt to understand the technological or cultural role of pottery (Gibson & Woods 1997, 18). These topics became popular goals of the optimistic studies typical of the processual era. In the New Archaeology pots were interpreted within the premise of a long-term developing adaptive system (Hodder 2003, 8). Irwin (1978), for instance, used pottery to propose historic trade routes and their change. This approach continues to this day with refinements (for example McClutcheon 1996; Kelly 2008). As mentioned, from an early date style was a research focus, but only as a means to date vessels. Later, style began to be studied to trace the extents of past cultural groups. Stylistic boundaries in



material culture have often been interpreted as ethnic divisions (Stark 1999, 25). Barry Cunliffe (1991) used this premise to outline the extent of British Iron Age tribes. He compared regional pottery variability with tribal boundaries described by Roman writers, leading to the creation of cultural inventories of the different tribal entities. However, it has been well demonstrated that stylistic variability alone can be an unreliable marker of ethnic membership, as stylistic information can vary considerably within a group according to the level of group interaction, economics and the context of the item (Hodder 1979; Stark 1995 cited by Stark 1999). There has been a growing realisation that production is not a monolithic entity enforcing compliance but fundamentally an engagement of the individual, subject to highly variable factors. Hodder’s (2009) landmark ethnoarchaeological study of a central African Lozi village demonstrated that pottery similarities did not reflect collective learning networks and frequency of interaction; rather, agency used pottery style to create particular social differences and allegiances. Archaeologists advocating dual inheritance theory consider cultural transmission to be driven by individuals’ decisions to imitate the behaviours of other individuals based on an incomplete awareness of a trait’s usefulness (Stark et al. 2008, 6). This clouds certainties in interpretation, as contextual information on the production and dispersal of pottery is rarely available. The post-processualist emphasis on diversity leaves the discipline in a difficult position. Stark (1999) proposed that, rather than abandoning this approach, technological variation rather than stylistic change could be used for inferring social information. Technological variability represents the mundane and repetitive practices perpetuated over generations in craft production and artefact use.
This may be problematic, particularly in inferring ethnic boundaries, as style may be more closely aligned to utilitarian use than is immediately discernible (Ibid). Rigorous ethnoarchaeology on living peoples may well be key to deciphering further cultural difference. This perspective has been reinforced by a plethora of ethnoarchaeological ceramic studies since the early ‘90s (Stark 2003); fortunately, ethnoarchaeology seems to be progressively focusing on overcoming the processualist and post-processualist dichotomy in ceramic studies (Ibid). Despite this progress the approach is not entirely adequate, as the potential for ethnographic studies in living societies is rapidly diminishing with the fading use of locally produced pottery worldwide. This approach also fails to accommodate the very real probability that the past was unique (Hodder 2003, 35). Fortunately, one of the most marked trends in archaeological ceramics in the last forty years has been the increasing application of scientific methods, allowing technological variation to receive increasing study. This includes analysis of the properties of surviving vessel fabric and of residual traces of original contents (Heron & Evershed 1993). A number of methods have been developed to reveal information on pottery manufacture, provenance and use. Visual, petrographical and compositional analyses are each methods of examining fabric material, each requiring progressively greater levels of technology (Orton et al. 1993, 132). Straightforward visual examination of a vessel can yield useful insights into manufacturing, firing process and use. At its most rudimentary, vessel form can be informative. For instance, a crudely built vessel may betray a food preparation function. Typically, convention reserves the term coarse wares for these domestic unglazed vessels. Cleary (2008) argued that this



term belies the high technical standard sometimes observed in vessels of this category. The neglect of the mundane and ordinary pottery forms in favour of the decorated and the abstract (Hodder 2003; Hodder 2009) in the formation of archaeological theory is probably in itself a legacy of the disparaging language: coarse wares. Through ethnographic study of contemporary living potters, manufacturing processes and their products are well documented. The colour of a vessel’s section can reveal firing conditions: open firing or firing in an oxygen-reducing environment such as a kiln (Gibson 2002). Analysts should be wary of the possibility of colour variation resulting from the use of the vessel or from processes in the sherds’ post-depositional environment. Accidental burning of sherds, acid soils or even contact with rootlets can alter colour (Rice 1987). Petrography is one such scientific discipline, whose application to archaeology has been a major development since the 1950s (Orton et al. 1993, 140). This involves examining a thin slice of fabric material mounted on a slide under a petrographic microscope. This allows material added to the fabric during clay preparation, such as temper, to be examined. Temper particles of stone, bone or other organics added to the vessel can also be detected and examined. Tempers reduce plasticity and thus ease the fashioning of the pot. Most critical to manufacture is their tendency to act as an “opening agent”; they allow moisture to leave the fabric material, reducing the possibility of disintegration during drying and firing (Gibson 2002, 37-8). Despite the functionality of temper, the inclusions added, such as larger particles or gravel of rock, did not always perform these functions (Rice 1987). Selected use of inclusions such as added crushed quartz may well have had a symbolic function in prehistory (Gibson 2002) or alternatively was a result of imperfect knowledge of the inclusions’ properties.
The clay, sand and other natural materials from which the pottery was fashioned may have an origin which is geologically unique and thus may be diagnostic of the geographic source from which the clay was extracted. In its immediate form this information tells us about the manufacturing processes chosen by past cultures. The provenance of clay can indicate trade in vessels and indeed otherwise forgotten trading networks (Gibson 2002). Arnold (2002) has discussed how shared geological and geographic properties permit identification of a society’s raw materials extraction area: “community signature units”. As outlined by Arnold (2002), these encompass a limited range, probably no greater than 3 to 4 km in diameter. Even though agency and cultural structure may result in the selection of raw materials for non-obvious reasons such as symbolism and perceived properties, pottery communities can still be extrapolated. Many actions carried out in pottery vessels, notably the repeated heating of animal and plant products, will create chemical or microfossil evidence of this activity. In addition to the preparation, storage, and cooking of food, many other activities will also have involved the use of pottery vessels. These activities include brewing, tanning, dairying, dyeing, fulling, textile washing, and salt extraction (Heron & Evershed 1993, 256), while other uses, such as the dry storage of seeds, may yield no detectable residue in pots. Studies of trace food residues, using techniques borrowed from applied chemistry and environmental archaeology, have recovered pollen, silica plant fragments, starch and even fats from ancient meals. This is particularly associated with unglazed pottery, which readily absorbs food traces (Gibson 2002). Fats, often termed lipids, from both animal and plant foods have been found to be particularly robust in the



contents residue of pottery. By investigating large assemblages of vessels, informative data on pottery use can be obtained. Potential information is not limited to pot use and can, on occasion, have much broader implications. As cooking wares act as receptacles for foods, they hold relevance as a proxy for dietary and agricultural change. In a study of pottery recovered from a barrow in the Welsh Borderlands, Dudd et al. (1999) used residue analysis to document a clear shift in vessel use, possibly indicating a change in local animal husbandry strategy or diet, neither of which would otherwise have been observable, as archaeozoological material had not survived. As this remarkable evidence dates back to c.5000 BP, it is clear that in the correct circumstances there are few preservation constraints on residue analysis. This use of pottery as a tracer of the subsistence strategies that sustained its makers is not, however, a recent innovation. In Ireland, pottery studies allowed the initial investigation of prehistoric agriculture. Prior to firing, the impressionable pottery surface will be altered by the miscellaneous items the pot comes into contact with. In the 1940s these impressions were studied by the pioneering archaeobotanists Jessen and Helbaek, who produced the first survey of prehistoric plant husbandry in Ireland (McClatchie 2007). The recent rarity of this approach has been accentuated by the present dichotomy existing among archaeological specialists. These research questions are seldom posed in the Old World, where artefact analysts have maintained a distance from the environmental archaeological specialists who usually tackle these problems, with only one limited exception, ‘The Ballyhoura Hills project’ (Cleary 2008). Here a holistic approach revealed a vividly complete archaeology and, not insignificantly, the earliest evidence of dairy farming in Ireland.
Many methodologies favoured for residue analyses are technologically demanding and expensive, thus limiting their application. One comparatively low-tech and consequently low-cost novel method of inferring the use of vessels is phytolith analysis. Phytoliths are microscopic silica microfossils of plants that can be traced to plant species or genus. They readily survive in burnt food traces and show considerable promise for identifying plant-based food residues (Piperno 2006, 164). Residual phytoliths in pottery are often studied in association with complementary microfossils, namely starch granules and pollen. Frequently, when phytoliths of a specific genus do not permit identification, microscopic starch granules or pollen do (Piperno 2006). Phytoliths and starch granules in ceramic vessels could offer a novel approach to identifying the initial cultivation of several key cereal staples in Ireland such as rye and common oat. There is also rich potential to examine dietary breadth through wild plant consumption. Paleodiet has been shown to provide an excellent proxy for examining social change, health, demographics and social structures (Dudd et al. 1999; Jones 2007; Cleary 2008). In contrast, more conventional studies of pottery’s symbolism have been less successful; seldom have the abstract concepts imbued in pottery been confidently reconstructed without culturally specific information. For instance, Trigger (1989) noted that few successful symbolic studies exist outside of historical archaeology. Archaeological pottery has come a long way from being viewed simply as a chronological signpost. Pottery studies has benefited enormously as a discipline from broadening out from the typological fixation of its birth to vessel function, trade, and manufacture. This was only made possible by developments in scientific chronologies, namely radiocarbon dating. However, far from being redundant for a site’s excavator in the era of radiometric dating, pottery is still a useful find as



an immediately decipherable chronological phase marker without the expense or waiting of radiometric dating. Many (Stark 1993; Hodder 2004; Shanks 2005) now argue that culture, not least pottery traditions, can only be studied within an open-ended system seeking diversity, without any goal of scientific archaeological data. However, an optimistic approach is necessary to encourage promising research into extractable social information. Despite the pessimistic post-processualist position, pottery vessels hold the key to many fundamental archaeological questions. Their potential role in elucidating subsistence strategies and group economics through the study of raw materials is not easily overstated. The wane and growth of each theoretical perspective has allowed a potentially stronger position, as the achievements of each can now be gathered and utilised in ceramic studies. However, in Irish research the reality is stagnancy in comparison to the volume of new pottery finds, particularly in prehistoric pottery; outside of restricted university research groups, namely University College Cork’s, and occasional reports such as Kelly (2008), archaeological pottery publications are dominated by descriptive reports with only meagre attempts at interpretation. Where rigorous research does occur, stylistic chronological typologies are still being created but with only limited emphasis on the rich social information that could be extracted; see for instance Looney (2010) and Cleary (2008). A new emphasis on production information, related networks and pot use will allow specialists to maximise the ‘archaeology’ in pottery studies and help to capitalise on artefacts’ value. Unfortunately, it is foreseeable that only seldom will all means of scientific analysis be possible and be conducted on ideal sample sizes; recurring constraints include budgets and technical expertise, problems regrettably persistent in many areas of archaeology.



Astronomy & Space Sciences panel


Prof. Paul Callanan, UCC (chair)

Judges’ comments

The standard of entries in the Astronomy & Space Sciences category of the 2010 Undergraduate Awards was extremely high, to the great credit of the students and supervisors involved. To identify the overall winner, the major judging criteria were:
• How well did the student present his/her work in the context of current research in the area?
• How competently did they deal with the data reduction/modelling/interpretation aspects of the project?
• How much did they understand of the physics background to their project?
All of the projects excelled in some of these areas, but the one that succeeded above all is ‘X-ray imaging and spectroscopy of the impulsive phase of a solar flare’. This excellent work discussed in detail the observation of a solar flare in the X-rays. It involved a comprehensive presentation of the acquisition, modelling and interpretation of the data in terms of physical models. It is a worthy winner of the Astronomy & Space Sciences category for 2010, and its author is warmly congratulated.



Astronomy & Space Sciences

X-ray imaging and spectroscopy of the impulsive phase of a solar flare
Aidan O’Flannagain

1 Introduction

1.1 The Sun

The Sun is a star and can be described as a ball of rotating plasma, existing in equilibrium between outward pressure from nuclear reactions in the core, and the gravitational pull of the star’s mass. The zones of the Sun of particular importance to this project lie on the outer skin of the Sun: the chromosphere and the corona. The chromosphere is the lowest layer of the solar atmosphere, lying directly above the photosphere. A property of importance to X-ray analysis is its high density gradient: it is about 10,000 km thick, with density values varying from $10^{10}$ to $10^{16}$ cm$^{-3}$. Above the chromosphere is the corona, which is much less dense and decreases in density much more slowly with distance from the Sun.

1.1.1 The Solar Magnetic Field

Within the Sun, rotation of each layer occurs at different rates. This produces a shear stress within the conducting plasma, resulting in a current, which generates the solar magnetic field following Ampere’s law. If this magnetic field is stressed, gaining energy at a rate faster than that of the emission of energy, then we have an explanation for the short timescale, high energy release of radiation and matter observed during a solar flare. We start by considering a flux tube under the surface of the Sun. A flux tube is simply a cylindrical region of space where the magnetic field lines are parallel to the side surfaces of the cylinder. We begin with Newton’s second law of motion.



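$$\rho \mathbf{a} = \mathbf{F}$$

where $\rho$ is the plasma density, $\mathbf{a}$ is acceleration, and $\mathbf{F}$ represents some force (per unit volume) acting on the plasma. Now if we rephrase the acceleration and assume the dominant forces acting on the plasma are the Lorentz force and the force caused by a pressure gradient, we get:

$$\rho \frac{d\mathbf{v}}{dt} = -\nabla P + \frac{1}{c}\,\mathbf{J} \times \mathbf{B}$$

where $\mathbf{v}$ and $t$ are velocity and time, $P$ is the plasma pressure, $\mathbf{J}$ is current density, $\mathbf{B}$ is magnetic field and $c$ is the speed of light. Under hydrostatic equilibrium, the velocity is constant and so the left hand side of this equation is zero. From Maxwell’s equations we can say that $\mathbf{J} = \frac{c}{4\pi} \nabla \times \mathbf{B}$ and so:

$$\nabla P = \frac{1}{4\pi} (\nabla \times \mathbf{B}) \times \mathbf{B}$$

Finally, using the vector identity

$$(\nabla \times \mathbf{B}) \times \mathbf{B} = (\mathbf{B} \cdot \nabla)\mathbf{B} - \nabla\left(\frac{B^2}{2}\right)$$

we have:

$$\nabla P = -\nabla\left(\frac{B^2}{8\pi}\right) + \frac{1}{4\pi} (\mathbf{B} \cdot \nabla)\mathbf{B}$$

If the flux tube is straight and uniform, the second term vanishes. If in pressure balance with the surroundings, the gas pressure inside $P_i$ is less than that of the outside $P_o$ by an amount $B^2/8\pi$, which represents the magnetic pressure. This suggests a lower density inside the tube than outside, causing it to rise. This results in a loop structure passing through the chromosphere. These loops play a vital role in the theory explaining the origin of solar flares.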
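To give a feel for the size of the magnetic pressure term, the following is a minimal sketch that evaluates B²/8π in Gaussian (cgs) units for a few field strengths. The function names and the field values are illustrative assumptions, not figures taken from this project.

```python
import math


def magnetic_pressure_cgs(b_gauss: float) -> float:
    """Magnetic pressure B^2 / (8*pi) in dyn cm^-2 for a field given in gauss."""
    return b_gauss ** 2 / (8.0 * math.pi)


def internal_gas_pressure(p_outside: float, b_gauss: float) -> float:
    """Gas pressure inside a straight, uniform flux tube in pressure balance.

    Assumes the outside plasma is field-free, so P_i = P_o - B^2 / (8*pi).
    """
    return p_outside - magnetic_pressure_cgs(b_gauss)


if __name__ == "__main__":
    # Hypothetical field strengths, chosen only to show the B^2 scaling.
    for b in (10.0, 100.0, 1000.0):
        print(f"B = {b:7.1f} G -> B^2/8pi = {magnetic_pressure_cgs(b):.3e} dyn cm^-2")
```

Because the term scales with the square of the field, a tenfold stronger field implies a hundredfold larger pressure deficit inside the tube.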

1.2 Solar Flares

A solar flare is the impulsive release of energy (on the order of $10^{25}$ joules) from the Sun in the form of radiation and ejection of plasma. Light from a solar flare covers the full electromagnetic spectrum, each energy range exhibiting different behaviours which help to explain the processes that caused the event. In this project, the focus will be on the high-energy portion of the spectrum: the region containing X-rays. By convention, the energy of the photons emitted will be measured in kilo-electron volts (keV).

1.2.1 Magnetic Reconnection

It is observed that emission of radiation, especially in the hard (12-120 keV) X-rays, can rise and reach a peak very quickly. This strongly suggests that the cause of the emission is unlikely to be fully explained by heating, as the timescales associated with heating plasma to temperatures high enough to emit X-rays are generally too long. So, a magnetically generated electron beam is considered. The magnetic diffusion timescale is given below.



Fig. 1. A simple representation of the magnetic fields in a reconnection zone.

$$\tau_d \propto \frac{l^2}{\eta}$$

where $l$ is the characteristic length scale, and $\eta$ is the plasma resistivity. As such, in order to reduce the timescale of diffusion to that of energy release in solar flares, the aim is to make $l$ as low as possible. This can be done by having two oppositely aligned magnetic fields in close proximity, as shown in figure 1. With magnetic reconnection presented as a plausible mechanism for fast conversion of stored magnetic energy to motion of charged particles, it now needs to be defined what form this takes in solar activity.
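The benefit of a small length scale can be illustrated with a short sketch. It assumes only the usual scaling of diffusion time with the square of the length scale at fixed resistivity, so it returns a ratio rather than an absolute time. The function name and the two length scales are hypothetical.

```python
def diffusion_time_ratio(l_new: float, l_old: float) -> float:
    """Ratio tau_new / tau_old, assuming tau is proportional to l**2.

    The resistivity and any unit-dependent constants cancel in the ratio.
    """
    if l_new <= 0 or l_old <= 0:
        raise ValueError("length scales must be positive")
    return (l_new / l_old) ** 2


if __name__ == "__main__":
    # Illustrative values only: a 10,000 km loop versus a 1 km current sheet.
    ratio = diffusion_time_ratio(1.0, 1.0e4)
    print(f"timescale shrinks by a factor of {1.0 / ratio:.1e}")
```

Under that assumption, squeezing the field reversal into a region four orders of magnitude thinner shortens the diffusion time by eight orders of magnitude.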

Fig. 2. Artist’s representation of the standard flare model (Holman 2005).



After rising through the chromosphere, the coronal loop develops a ‘waist’, where the oppositely polarised magnetic field lines are allowed to interact, resulting in magnetic reconnection. This results in acceleration of electrons to high velocities towards the chromosphere.

1.3 X-Ray Emission

As the electrons travel downwards, they undergo collisions with the coronal and chromospheric plasma, resulting in two phenomena. First, the electrons lose energy as they are decelerated by attraction to protons, emitting radiation (bremsstrahlung emission, hereafter called nonthermal emission), resulting in the observed sudden peak in hard X-rays. Second, the Coulomb collisions between the electrons and the plasma cause heating, reaching temperatures high enough to emit thermal soft X-rays (~3-12 keV).

Fig. 3. An example of a spectral fit of X-ray emission. The crosses are data, the green line is a thermal fit, the red line a nonthermal fit, the black line a Gaussian line (used to fit peaks caused by line emission) and the blue line is the background emission.

Spectroscopy is used to differentiate between these two types of X-ray emission. The thermal green line in figure 3 is produced by the relation $I \propto e^{-E/kT}$, where $I$ is the radiation intensity, $E$ is the photon energy, $k$ is Boltzmann’s constant and $T$ is the plasma temperature. The nonthermal portion is fit by a power law function $I = \alpha E^{-\gamma}$, where $\alpha$ is a constant and $\gamma$ is the spectral index. A second method that will be used in this work is the height-energy relationship of X-ray emitting sources. As electrons propagate through a plasma, the higher energy electrons will travel farther before losing their energy to bremsstrahlung interactions. Therefore, it is expected that the highest energy nonthermal emission will be lower in the flare loop. Conversely, higher energy thermal



emission should be above lower energy thermal emission, as the hot plasma is replenished by magnetic reconnection from above.
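The way spectroscopy separates the two components can be sketched numerically. The example below assumes a simple exponential thermal shape, exp(-E/kT), and a power law, αE^(-γ); the normalisations, temperature and spectral index are hypothetical values chosen for illustration and are not fit results from this project.

```python
import math

# Boltzmann constant expressed in keV per megakelvin (8.617e-5 eV/K).
K_B_KEV_PER_MK = 0.08617


def thermal_intensity(energy_kev: float, temp_mk: float, norm: float = 1.0) -> float:
    """Simplified thermal spectrum: norm * exp(-E / kT)."""
    return norm * math.exp(-energy_kev / (K_B_KEV_PER_MK * temp_mk))


def power_law_intensity(energy_kev: float, alpha: float, gamma: float) -> float:
    """Nonthermal spectrum: alpha * E**(-gamma)."""
    return alpha * energy_kev ** (-gamma)


def dominant_component(energy_kev: float, temp_mk: float, norm: float,
                       alpha: float, gamma: float) -> str:
    """Report which of the two model components is larger at a given energy."""
    thermal = thermal_intensity(energy_kev, temp_mk, norm)
    nonthermal = power_law_intensity(energy_kev, alpha, gamma)
    return "thermal" if thermal >= nonthermal else "nonthermal"


if __name__ == "__main__":
    # Hypothetical parameters: 20 MK plasma, spectral index 4.
    for energy in (5.0, 10.0, 20.0, 50.0):
        print(energy, dominant_component(energy, 20.0, 1.0e6, 1.0e4, 4.0))
```

With these made-up parameters the exponential dominates at a few keV and the power law takes over at tens of keV, which is the qualitative behaviour that lets soft and hard X-rays be attributed to thermal and nonthermal emission respectively.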

Fig. 3. A plot of density versus altitude based on observations made by RHESSI (Brown, Aschwanden, Kontar 2002).

1.4 Chromospheric Density Model

From figure 3, the density increases rapidly with decreasing altitude. In the same paper, the relation between density, N, and energy of photon emission, E, in the nonthermal case is also presented, quantifying what has been established above:



Fig. 4. A schematic of the processes involved in chromospheric evaporation (Adapted from Dennis & Schwartz 1989).

1.5 Chromospheric Evaporation

Chromospheric evaporation is a process where an electron beam hits the dense plasma of the chromosphere, with the subsequent heating causing the surrounding plasma to expand up into the loop. This plasma is frequently hot enough to emit thermal X-rays, in most cases dominating over the residual nonthermal emission at the footpoints. The upflow velocity of this plasma can be predicted as a result of the following derivation (Fisher et al. 1984). We begin again with the equation of motion:

Given that we can rewrite:

and mass density is number density multiplied by particle mass, $\rho = n m$,

Now, for constant velocity the first term in parentheses vanishes, and given $P = 2 n k_B T$:



Next, we integrate between the front of our expanding coronal plasma and the chromosphere:

Finally, we assume that the velocity of the plasma in the chromosphere is negligible due to the vastly greater densities encountered, and rearranging, we have:

where $m$ is the mean mass per hydrogen nucleus, $T$ is plasma temperature, and the $n$ parameters correspond to the density of pre-flare chromosphere ($n_{ch}$) and corona ($n_{co}$). As shown, the evaporation velocity scales with the square root of the thermal energy of the plasma. This relation can be used to check if chromospheric evaporation is being observed.
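The steps described in words above can be sketched as follows. This is one possible reading, assuming one-dimensional, steady, isothermal flow along the loop coordinate $x$; the exact form and numerical factors in Fisher et al. (1984) may differ, and the subscripted velocities $v_{ch}$ and $v_{co}$ are labels introduced here for illustration.

```latex
\begin{align}
  \rho \frac{dv}{dt} &= -\frac{\partial P}{\partial x}
    && \text{(equation of motion, pressure gradient only)} \\
  n m \left( \frac{\partial v}{\partial t} + v \frac{\partial v}{\partial x} \right)
    &= -\frac{\partial P}{\partial x}
    && \text{(using } \tfrac{dv}{dt} = \tfrac{\partial v}{\partial t} + v \tfrac{\partial v}{\partial x},\ \rho = n m \text{)} \\
  n m v \frac{\partial v}{\partial x} &= -2 k_B T \frac{\partial n}{\partial x}
    && \text{(steady flow, } P = 2 n k_B T \text{, constant } T \text{)} \\
  \int_{v_{ch}}^{v_{co}} v \, dv &= -\frac{2 k_B T}{m} \int_{n_{ch}}^{n_{co}} \frac{dn}{n}
    && \text{(chromosphere to expanding front)} \\
  v_{co} &\approx \sqrt{\frac{4 k_B T}{m} \ln\!\left( \frac{n_{ch}}{n_{co}} \right)}
    && \text{(taking } v_{ch} \approx 0 \text{)}
\end{align}
```

In this form the velocity depends on the square root of $k_B T / m$ and only logarithmically on the density contrast, consistent with the scaling stated in the text.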

2 Instrumentation

Fig. 5. A schematic of RHESSI’s grid system (Hurford et al. 2002).

2.1 The Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI)

The Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI; Lin et al. 2002) was the main instrument used in this project. It is a satellite orbiting at an altitude of 600 kilometres, constantly pointed towards the Sun. Its purpose is to image X-ray and gamma ray emission from the Sun in an effort to gain a greater understanding of the physics of particle acceleration and energy release in solar flares.



2.1.1 Imaging X-Rays

Due to the high energies of X-ray photons, RHESSI does not use mirrors or lenses to focus light, instead using a rotating dual-grid system. RHESSI is made up of nine germanium detectors, with two grids in front of each detector. These grids use the rotation of the instrument to make images. As the instrument rotates, the grids pass and block X-rays, resulting in a wave-like intensity variation. The Fourier components of this wave are used to reconstruct an X-ray image of the Sun. Each of the nine grids has a different slit width and thus a different spatial resolution.
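The wave-like intensity variation can be illustrated with a toy model. The sketch below assumes a single point source and approximates the transmission through one grid pair as a sinusoid in the source offset projected perpendicular to the slits; real grids give a more triangular profile, and this is not RHESSI's analysis software. All names and numbers are hypothetical.

```python
import math


def modulation_profile(x_arcsec: float, y_arcsec: float,
                       pitch_arcsec: float, n_samples: int = 360) -> list:
    """Relative transmission over one rotation for a point source.

    The source sits at (x, y) relative to the spin axis. As the grids rotate
    by angle phi, the offset projected across the slits changes, and the
    transmitted fraction is approximated as 0.5 * (1 + cos(2*pi*offset/pitch)).
    """
    profile = []
    for i in range(n_samples):
        phi = 2.0 * math.pi * i / n_samples
        projected = x_arcsec * math.cos(phi) + y_arcsec * math.sin(phi)
        profile.append(0.5 * (1.0 + math.cos(2.0 * math.pi * projected / pitch_arcsec)))
    return profile


if __name__ == "__main__":
    # A source 30 arcsec off-axis seen through a hypothetical 10 arcsec pitch grid.
    profile = modulation_profile(30.0, 0.0, 10.0)
    print(f"min = {min(profile):.3f}, max = {max(profile):.3f}")
```

In this toy picture a source on the spin axis is not modulated at all, while an off-axis source produces a pattern whose frequency and phase encode its position. A finer pitch responds to smaller offsets, which is why grids with different slit widths give different spatial resolutions.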

2.1.2 RHESSI Spectroscopy

A spectrum is essentially a plot of flux versus energy; for RHESSI the energy range is about 3 keV to 17 MeV. RHESSI’s energy resolution is about 1 keV below 100 keV. RHESSI spectra can be fit with a mix of thermal and nonthermal models. The goal in spectroscopy is to reach a believable fit for which the difference between fit and data is both small and featureless.
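As a rough illustration of what "small and featureless" residuals mean, the sketch below adds a thermal and a nonthermal component and compares the sum with data. It is not the OSPEX implementation: the thermal shape is a simplified isothermal bremsstrahlung form with the Gaunt factor ignored, units are arbitrary, and all function names are assumptions.

```python
import math

def thermal_flux(energy_kev, emission_measure, kt_kev):
    """Simplified isothermal bremsstrahlung shape (Gaunt factor ignored)."""
    return emission_measure * math.exp(-energy_kev / kt_kev) / (
        energy_kev * math.sqrt(kt_kev))

def power_law_flux(energy_kev, norm, index, pivot_kev=50.0):
    """Nonthermal photon power law: norm * (E / pivot)^(-index)."""
    return norm * (energy_kev / pivot_kev) ** (-index)

def normalised_residuals(energies, observed, sigma, params):
    """(data - model) / sigma for the thermal + power-law sum."""
    em, kt, norm, index = params
    out = []
    for e, obs, s in zip(energies, observed, sigma):
        model = thermal_flux(e, em, kt) + power_law_flux(e, norm, index)
        out.append((obs - model) / s)
    return out

def reduced_chi_square(residuals, n_params):
    """Sum of squared normalised residuals per degree of freedom."""
    dof = len(residuals) - n_params
    return sum(r * r for r in residuals) / dof
```

A good fit in this sense has a reduced chi-square near one and residuals that scatter about zero with no trend in energy; a systematic bump in the residuals would suggest a missing component such as a line.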

2.2 Geostationary Operational Environmental Satellites (GOES)

The GOES spacecraft are fitted with soft X-ray detectors that record emission from the Sun in two channels, 0.5-4.0 and 1.0-8.0 Angstroms. GOES uses two ion chambers to detect X-rays. The ratio of these two emission profiles provides a temperature and emission measure for the flaring region or quiet Sun. The temperature and emission measure derived from RHESSI spectroscopy can then be checked against these values.

3 Method

3.1 Imaging

The overall goal of this section was to produce a plot of height versus time, in several energy ranges, for a source in the legs of the flare loop. The southern leg was chosen because it was longer, making relative height values more precise. An image made over a 1 keV band and a four-second interval would simply show noise, as there would not be enough counts to make a reliable image. Given that the flare lasted from roughly 04:35:30 to 04:37:00 UT, the difficulty was making many images without any of them containing a very low number of counts. Overlapping time intervals were therefore necessary, and an acceptable set of thirty-six images was reached for five energy bands: 3-6 keV, 6-8 keV, 8-10 keV, 10-12 keV and 12-20 keV. Detectors 1, 2 and 7 were not used due to unreliability at low energies. RHESSI’s data analysis software was used to output a text file containing the locations of peaks in each image. A program was then written that found the southernmost peak above a certain flux threshold and tracked its transverse distance along the flare loop.
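The peak-tracking step can be sketched as follows. The original program is not reproduced in the text, so this is only an illustration under stated assumptions: peaks are given as hypothetical (x, y, flux) tuples with y increasing northward, and the straight-line distance from a chosen reference point stands in for distance along the loop.

```python
import math

def southernmost_peak(peaks, min_flux):
    """Return the (x, y, flux) peak with the smallest y among those above min_flux.

    peaks: list of (x_arcsec, y_arcsec, flux) tuples for one image.
    Returns None when no peak passes the threshold.
    """
    bright = [p for p in peaks if p[2] >= min_flux]
    if not bright:
        return None
    return min(bright, key=lambda p: p[1])

def track_distance(images, min_flux, reference_xy):
    """Distance of the southernmost bright peak from a reference point, per image."""
    distances = []
    for peaks in images:
        peak = southernmost_peak(peaks, min_flux)
        if peak is None:
            distances.append(None)  # image too faint to track
        else:
            distances.append(math.hypot(peak[0] - reference_xy[0],
                                        peak[1] - reference_xy[1]))
    return distances
```

Returning None for images with no peak above threshold keeps noisy frames visible in the output rather than silently dropping them.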


3.2 Spectroscopy

3.2.1 Spatially Integrated Spectroscopy

Because the same limitations applied, the same time intervals were chosen for the spectroscopy. In this case, however, only detector four was used, as it is known to have the highest energy resolution of RHESSI’s nine detectors. To choose a background, a long series of spectra was made covering the flare as well as the rest of RHESSI’s ‘day’. Emission prior to the flare showed a long period of no activity and a very constant spectrum, so this time interval was considered an acceptable background. With the background removed, fitting the spectrum could begin. The fitting process began by choosing the time interval of peak emission and running the OSPEX spectral fitting program from that point, with each time interval starting from the previous interval’s final parameters. Here the parameters were temperature and emission measure for the thermal model, and spectral index and low-energy cutoff for the nonthermal model. In addition to checking for low errors and a believable spectral time evolution, the time evolution of temperature and emission measure was compared with the corresponding data from GOES. A satisfactory set of spectra was reached, and the resulting parameters were then available to explain the physics behind the source motions.
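The idea of seeding each fit with its neighbour's result can be sketched generically. OSPEX itself is an IDL package and is not called here; `fit_one` is a hypothetical caller-supplied function, and stepping both forward and backward from the peak interval is an assumption, since the text only says that fitting ran "from that point".

```python
def fit_sequence(spectra, peak_index, fit_one, initial_params):
    """Fit every interval, seeding each fit with its neighbour's result.

    fit_one(spectrum, start_params) -> fitted_params is supplied by the caller.
    Fitting starts at peak_index and proceeds outwards in both directions.
    """
    results = [None] * len(spectra)
    results[peak_index] = fit_one(spectra[peak_index], initial_params)
    # Forward in time from the peak interval.
    for i in range(peak_index + 1, len(spectra)):
        results[i] = fit_one(spectra[i], results[i - 1])
    # Backward in time from the peak interval.
    for i in range(peak_index - 1, -1, -1):
        results[i] = fit_one(spectra[i], results[i + 1])
    return results
```

Starting where the counts are highest gives the most constrained first fit, and neighbouring intervals then begin close to a plausible solution.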

3.2.2 Imaging Spectroscopy

It is possible to specify the area of the Sun one would like to analyse spectroscopically. This process provides more precise information, focusing separately on emission from different parts of the loop, but the number of counts is greatly reduced by the selective imaging. The duration of the flare was therefore split into only three sections: before the hard X-ray peak (‘prepeak’), after the hard X-ray peak (‘postpeak’), and the time when emission was looptop-only (‘endflare’). Each interval was further cut into twenty energy bands between 3 and 50 keV.
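The text does not say how the twenty bands were spaced. Logarithmic spacing is an assumption, but it reproduces the third (4-4.6 keV) and sixth (6.1-7 keV) bands quoted in the imaging spectroscopy results, so the sketch below may be close to what was used. The function name is hypothetical.

```python
import math

def log_spaced_bands(e_min_kev, e_max_kev, n_bands):
    """Return n_bands (low, high) pairs with a constant high/low ratio."""
    step = math.log(e_max_kev / e_min_kev) / n_bands
    edges = [e_min_kev * math.exp(step * k) for k in range(n_bands + 1)]
    return list(zip(edges[:-1], edges[1:]))

bands = log_spaced_bands(3.0, 50.0, 20)
for low, high in bands:
    print(f"{low:.1f} - {high:.1f} keV")
```

Bands of constant fractional width are a common choice when the flux falls steeply with energy, since they widen where counts are scarce.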


4 Results

Fig. 6. A lightcurve of the flare of 28 November 2002, showing flux versus time. The 3-6 keV and 12-25 keV curves have scaling factors of 1 and 5, respectively. Four time intervals are marked, corresponding to the four 6-8 keV images below the lightcurve.

This project includes the study of only one flare, which occurred at approximately 04:36:00 UT on 28 November 2002. The degradation of the germanium detectors onboard was minimal. Projection effects were negligible, so analysis could be done without taking into account motion towards the observer. As shown in figure 6, the evolution of the source structure with time is remarkably clear. A source appears above the limb, splits into two and descends down both legs of the loop. Upon reaching the footpoints, the emission remains briefly before rising back up. This process is rarely, if ever, seen so clearly elsewhere, and for this reason the flare has previously been covered in a paper by Sui, Holman and Dennis (2006).


Fig. 7. A time profile of the 28 November 2002 flare (top) and a height-time plot of the source motion in the leg (bottom). In the time profile, the solid lines represent RHESSI data; energy bands from top to bottom are 3-6, 6-12, 12-25, 25-50 and 50-100 keV, with scaling factors 5, 1, 4, 3 and 0.5 respectively. Overlaid is the corresponding GOES data. The bottom plot shows distance along the loop from the 25-50 keV footpoint. (Sui, Holman, Dennis 2006).

This work will build on the work done previously and will reach a complete explanation of each aspect of the flare. Specifically, it will explain why these sources of emission descend, why they subsequently ascend, and the nature of the third source apparent in the second and third parts of figure 6.


4.1 Height-time plot

Fig. 8. A recreation of figure 6, with some changes. The lightcurve is not scaled, and the source tracked in the bottom portion of the figure is the southern source only.

Fig. 9. Figure 7 with higher energy bands removed for clarity.


The height-time plot in figure 8 clearly shows the initial descent of peak emission up to the hard X-ray peak around 04:36:12. Note that during this descent, the 8-10 keV line is lower than the 6-8 keV line, which is in turn below the 3-6 keV line. This suggests that the emission is nonthermal. The most significant feature of this plot, however, is the reversal of this behaviour after the hard X-ray peak. As the sources ascend back up the leg of the loop, the high energy emission now lies higher than the low energy emission. According to our predictions, this means the emission has become thermal. It is important to note that, in the nonthermal case, the velocities apparent on the descent are not literal plasma motions; they are the motion of the location of peak emission for that energy. In the case of thermal emission, however, the velocities are real, as they represent motion of hot plasma emitting X-rays. Keeping this point in mind, the slope of the descending portion of the graph still contains information. It shows that the lower energy nonthermal emission descends faster than the higher energy emission. An explanation for this can be presented with the help of spectroscopic data.
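The height ordering argument above can be expressed as a small check. This is only a sketch of the reasoning, not part of the original analysis: the function name and labels are hypothetical, and a real classification would also need to account for position uncertainties.

```python
def height_energy_trend(heights_by_band):
    """Classify how source height varies with energy at one time step.

    heights_by_band: list of (mean_energy_kev, height) pairs.
    Returns "nonthermal-like" if height falls steadily with energy,
    "thermal-like" if it rises steadily, and "mixed" otherwise.
    """
    ordered = sorted(heights_by_band)
    heights = [h for _, h in ordered]
    steps = [b - a for a, b in zip(heights, heights[1:])]
    if steps and all(s < 0 for s in steps):
        return "nonthermal-like"
    if steps and all(s > 0 for s in steps):
        return "thermal-like"
    return "mixed"
```

Applied to each time step of a height-time plot, a change from the first label to the second would mark the reversal described in the text.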


4.2 Spatially Integrated Spectroscopy

Fig. 10. The height-time plot from figure 8 (top), with three marked time intervals. Below this plot are spectra and contour images for each marked time interval. In the spectral fits, the green line represents the thermal fit, the red the nonthermal fit, and the blue background emission.


Spectroscopy helps to decide whether our conclusions based on the height-time plot are accurate, and provides some physically significant parameters, namely temperature, spectral index and low-energy cutoff. The first step in analysing these results is to check whether the height-time plot and the spectroscopic results are in agreement. The three images in figure 10 show the spectra of the full Sun at the onset, hard X-ray peak and soft X-ray peak of the flare. At the onset, the spectral fits show that the nonthermal bremsstrahlung emission is enough to explain all emission above ~5 keV, and no Gaussian line emission is necessary. This agrees with the previous conclusion that the emission is nonthermal at the onset of the flare. The second image shows that, at the point in the height-time plot where the sources are reversing their distribution in height, the thermal component has risen significantly and in fact dominates at all energies below ~8 keV. The final image shows that during the part of the flare where emission is at the looptop only, all emission below 10 keV is thermal. This agreement in time with the reversal in energies strongly supports our interpretation of the height-time plot.

Fig. 11. Plots of emission measure and temperature versus time as recorded by RHESSI (solid line) and GOES (dotted line).

The spectroscopy could be further supported by comparison with the same parameters as determined by GOES. When the difference in calibration between the two instruments is taken into account, the data show strong agreement. The deviation in the earlier data points is explained by strong noise in the early RHESSI images, when emission in the X-ray spectrum was only starting to become visible.


An immediate observation can be made once this series of spectra is accepted. In order to achieve the best fits, the well-documented iron line emission at 6.7 keV needed to be separated from the thermal component, allowing it to contribute significantly to the fit even when the thermal component was at a minimum. Thus, while it has previously been assumed that this line can only be produced thermally, this spectroscopy provides evidence for a nonthermally generated iron line.

Fig. 12. A plot of spectral index versus time based on the nonthermal fit of the RHESSI spectra.

Finally, the nonthermal parameters aid in the physical interpretation of the electron acceleration mechanism. Figure 12 shows that the spectral index (the slope on a log-log plot, a dimensionless quantity) starts at roughly 6, descends to 4 at a time coinciding with the hard X-ray peak, and then rises to about 8. This behaviour is very commonly seen in flares and is called ‘soft-hard-soft’ or SHS behaviour. Its physical interpretation is that the acceleration mechanism powers up to a maximum, accelerating electrons to higher energies, and then runs out of energy, with electrons being precipitated from the flare loop.
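To make the "slope on a log-log plot" concrete, the sketch below computes a power-law index from two flux points and checks a sequence of indices for the soft-hard-soft pattern. It is an illustration only; the function names are hypothetical, and the sequence 6, 4, 8 used in the example is taken from the values quoted above.

```python
import math

def spectral_index(e1_kev, flux1, e2_kev, flux2):
    """Power-law index gamma for flux proportional to E^(-gamma), from two points."""
    return -math.log(flux2 / flux1) / math.log(e2_kev / e1_kev)

def is_soft_hard_soft(indices):
    """True if the index falls to a single interior minimum and then rises."""
    k = indices.index(min(indices))
    if k == 0 or k == len(indices) - 1:
        return False
    falling = all(a >= b for a, b in zip(indices[:k + 1], indices[1:k + 1]))
    rising = all(a <= b for a, b in zip(indices[k:], indices[k + 1:]))
    return falling and rising

print(is_soft_hard_soft([6.0, 4.0, 8.0]))
```

A smaller index means a flatter, harder spectrum, so the minimum of the index marks the hardest point of the flare.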

4.3 Imaging Spectroscopy

Fig. 13a. A series of images all from the timeframe before the hard X-ray peak, 04:35:00 – 04:36:08, with energy ranges as labelled.


Fig. 13b. As 13a, except the timeframe here is from the hard X-ray peak to the point where the emission is looptop-only, 04:36:00 – 04:36:24.

The corresponding image set for the ‘endflare’ time region is omitted here for neatness, as it simply shows one looptop source with little change in position with energy. Before spectroscopy is even begun on these image cubes, it can be seen that there is further agreement with our interpretation of the flare. In the prepeak images, a source that rises with energy is seen in the first two frames, before peak emission steadily moves to a source lower in the flare loop. Once this source dominates, it slowly decreases in altitude with energy before the images become noisy. This is exactly what our spectroscopy predicts: the emission from the early flare is thermal only in the very low energy bands, while the remainder is nonthermal.

Fig. 14. A table summarising the results of imaging spectroscopy on the looptop and loopleg sources of the flare before the hard X-ray peak, immediately afterwards, and at a later time where emission is looptop only.

In the postpeak images, precisely the same property is seen, the only difference being that the energy at which the emission ‘jumps’ to a lower altitude is higher. In the prepeak images it occurs in the third frame (4-4.6 keV), while in the postpeak images it occurs in the sixth (6.1-7 keV). This represents the heating of the flare loop with time. Before the hard X-ray peak, the plasma is only hot enough to emit X-rays of energy up to about 4.5 keV, but once the accelerator has reached maximum power, the heating is enough to emit X-rays of up to 6.5 keV (and eventually ~10 keV). As shown in figure 14, due to low counts, little information can be derived from the spectral parameters in this section. The values marked with an asterisk (*) are considered invalid as they correspond to times/locations where thermal emission dominates. However, it can be seen that the prepeak values strongly resemble those covered in another paper (Liu, Xu, Wang, 2009), which suggests that the looptop emission can be explained by a thick-thin scenario introduced therein.

5 Discussion

All data analysis done in this project supports the idea that the emission in the flare is mostly nonthermal up to the hard X-ray peak, during which thermal emission starts to dominate, eventually reaching energies of roughly 10 keV for the remainder of the flare. This section discusses the possible physical interpretations of the descent of the emission sources, as well as the subsequent ascent. The focus will, however, be on the former, as these readings are unique to this flare.

5.1 Descending Sources

The splitting of the initial nonthermal source, and its subsequent descent to the footpoints of the flare loop, may be explained by two physical processes. First, we know from the spectral data that the spectrum in this phase of the flare is hardening, meaning electrons are being accelerated to higher energies as time passes. This alone can account for some level of descent in the southern source, as a reasonably simple model shows (figure 15). Using the models presented by Brown, Aschwanden and Kontar, a column representing a loop leg can be drawn, colour-coded to mark the regions where 6-8, 8-10 and 10-12 keV emission dominate. Next, one needs to consider what occurs when the spectral index decreases (the spectrum hardens). Assuming for now that no other factors increase the flux, the higher the energy, the greater the increase in flux as the spectrum flattens. This means that within each energy band the higher energies will contribute more, and thus emission from the lower altitudes will increase, causing the centroid of emission of that energy band to descend. In figure 15, a source at the upper end of the lightest grey area would move down towards the centre of that area as the spectrum approaches a slope of 0.
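The effect of hardening within a single band can be checked numerically. The sketch below assumes a pure power-law photon spectrum across the band and computes the flux-weighted mean energy; it does not model the density structure of figure 15, and the function name is hypothetical.

```python
def mean_energy_in_band(e_low_kev, e_high_kev, index, n_steps=2000):
    """Flux-weighted mean photon energy in a band for flux proportional to E^(-index)."""
    width = (e_high_kev - e_low_kev) / n_steps
    weighted = 0.0
    total = 0.0
    for i in range(n_steps):
        e = e_low_kev + (i + 0.5) * width  # midpoint rule
        flux = e ** (-index)
        weighted += e * flux
        total += flux
    return weighted / total

for gamma in (6.0, 4.0, 0.0):
    print(gamma, mean_energy_in_band(6.0, 8.0, gamma))
```

As the index decreases, the mean energy in the 6-8 keV band moves from near the lower edge towards the band centre. Since higher energy emission arises lower in the leg in this picture, the band's centroid of emission moves downward.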

Fig. 15. A simple representation of a model flare loop leg. Density and height relation taken from Brown, Aschwanden and Kontar 2002.


However, this model offers no explanation for a source emitting in the 3-6 keV range descending and displacing the source of 6-8 keV emission, yet this behaviour can be seen in the data. In order for emission of one energy to take the place of that of another energy, the density at that location needs to change. This suggests that some form of ablation is occurring during the onset of the flare. Chromospheric evaporation readily describes this process, although until now it has not been seen this early in a flare. If plasma is heated and ejected up into the loop or out of the loop entirely, the remaining column will have an overall reduced density, and so the beam will be able to travel further before finding an area of the same density. This would also explain why the high-energy emission regions descend substantially more slowly than the lower energy regions.

Fig. 16. A plot of downward velocity of the southern source versus the square of the average energy of emission for that source. Associated velocities are shown in the table below the plot.

5.2 Ascending Sources

In order to determine whether the predicted scaling of velocity with the square root of energy is observed, the two quantities are plotted in figure 16 and fit with a linear trend line. As shown, the observations agree with the predicted velocities of chromospheric evaporation.
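A linear trend-line fit of this kind can be sketched as below. The measured velocities are not reproduced in the text, so no real data are used here; the function names are hypothetical, and the fit is ordinary least squares of velocity against the square root of energy, following the scaling derived in the introduction.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_velocity_vs_sqrt_energy(energies_kev, velocities):
    """Fit velocity against the square root of energy."""
    return linear_fit([math.sqrt(e) for e in energies_kev], velocities)
```

If the square-root scaling holds, the fitted intercept should be close to zero relative to the measurement errors and the points should scatter evenly about the line.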


6 Conclusion

In this work, a unique flare has been studied. Due to minimal plasma preheating, the flare of 28 November 2002 shows clear source motions before the hard X-ray peak. The flare was successfully observed by RHESSI and GOES, and so these instruments could be used in an attempt to explain a part of solar flares previously hidden behind thermal radiation.

Results using data from RHESSI included first a height-time plot, the most significant feature of which was a strong ‘reversal’ in the height-energy relationship at the hard X-ray peak, which suggested a switch from nonthermal to thermal emission. This finding was pursued using spatially integrated spectroscopy. The resulting spectra agreed with the suggestion made from the height-time plot.

The remaining results were found in an attempt to explain the source motions observed during the flare. First, the downward ‘velocities’ were determined from the height-time plot. These velocities were noted to be higher for lower energies. Second, the upward velocities were similarly determined, and were seen to show a strong dependence on energy, with the plasma velocity being proportional to the square root of the emitted photon energy. Finally, using imaging spectroscopy, the spectral indices of the looptop and loopleg emitting sources were determined separately.

This analysis has verified some established physical theory. The upflow of plasma after the hard X-ray peak shows strong agreement with chromospheric evaporation models (Fisher, Canfield & McClymont, 1984). The looptop source has also been studied before (Masuda et al., 1994), and behaves as expected from the thick-thin model (Liu, Xu, Wang, 2009). However, due to the unique nature of the flare, new theory is required. The apparent downward motion of emitting plasma regions has also shown a strong dependence on energy, in that the higher energy emitters are seen to descend more slowly. This process can be explained by the spectral hardening coming up to the X-ray peak, but not completely. A second factor in this behaviour, coronal or chromospheric ablation, is suggested to complete the explanation. Finally, there is a strong suggestion that the iron line in this flare has been nonthermally generated. This assumption is rarely made, as there has always been enough thermal emission to explain the presence of an iron line.

The work in this report can be built upon in future, either by advancement of spectroscopic techniques, discovery of another similar flare in RHESSI’s data archives, or the occurrence and recording of a new flare with similarly minimal preheating.


Business panel

Dr. Sarah Ingle, DCU (chair)
Dr. Paul Donnelly, DIT
Dr. Aoife McDermott, DCU

Judges’ comments

The winning submission in the Business category of the 2010 Awards brings into sharp focus the issue of global population, linking in with the complex area of man-made climate change. The author’s rationale for writing the essay was in “the hope of stimulating the reader to act in the cause of advancing both social marketing practices, and more presciently, peoples’ understanding of the dangers of overpopulation”. This rationale is also founded on what the author notes as “two major inconsistencies”: (1) the scale of the threat posed to humanity by global population growth is very much out of sync with the very limited attention it receives in mainstream politics and the media; (2) the reluctance of public-good activists to engage with commercial marketing tactics of proven success. The paper represents a call to action, with marketing presented as a means to making that call, to cast aside self-interest in favour of collective interest. In so doing, it presents the possibility of business to serve the interests of society, a worthy aim in the current climate where finance/business has been shown to be calamitously self-serving to the detriment of society. The paper is well researched, well written and well structured – vital and valuable qualities that are all too often overlooked by students of business. It is comprehensively referenced, which is illustrative of broad reading, and it succinctly captures the essence of the literature reviewed to present a focused and cogent argument. Finally, the paper is illustrative of the fact that it is not just having ideas that is important; ideas without the means to convey them well such that they convince others all too often fail in their communication. This essay is an excellent example of how business-related ideas and opinions can be made clear and convincing, and ultimately brought to life.


Business

Understanding overpopulation: how commercial marketing tactics can help us tackle “the greatest shortcoming of the human race”1

Daniel Philbin Bowman

Marketing the threat of overpopulation: How design and communication can help save the world.

The Challenges of the 21st Century

Since the beginning of time, humankind has been forced to come together and cooperate to create a better future collectively. This need has only increased in recent decades as exponential advances in technology have created an ever more globalised world with global challenges: nuclear proliferation, terrorism, climate change, poverty, genocide, disease2 and, perhaps most dangerously, overpopulation.3

1 Dr. Albert Bartlett on the human inability to understand the exponential function and its consequences of overpopulation in the 21st century, Arithmetic, Population and Energy. Accessed at: www.albartlett.org.
2 What could be termed the “Big Six” global challenges of the 21st century referred to by Barack Obama on numerous occasions during his campaign for the presidency including during his New Hampshire Primary (“Yes We Can”) speech, January 8th 2008. Of course, the ultimate difficulty with such contentious issues is that not everyone agrees that they are problems: one man’s terrorist is another’s freedom fighter. Such debates are, however, beyond the remit of this essay, which adopts the general global consensus that the aforementioned “Big Six” and, as will come to be argued, the challenge of overpopulation are real problems and must be dealt with.
3 “You cannot sustain population growth and/or growth in the rates of consumption of resources.” Dr. Albert Bartlett, Arithmetic, Population and Energy. Accessed at: www.albartlett.org.


These will require wider, deeper and smarter cooperation amongst the world’s citizens. As important as the challenges are and as necessary as such cooperation is, this will not be achieved easily. Each one of us is a rational, self-interested being with immediate needs and wants; it is rarely convenient or immediately desirable to lower these priorities in favour of a common goal.

How best then can humans come together to solve our problems? Evidently political and civic leadership is required alongside continued investment in technological innovation. However, even when the world agrees on a problem and has the capacity to solve it, the tragedy of the commons4 shows us how difficult it can be to resolve it successfully, as direct self-interest outweighs indirect collective interest. We need only look to the outcome of the recent COP-15 UN Climate Summit in Copenhagen to see the results of such conflict in action.5 This essay argues that commercial marketing tactics, particularly those that utilise the power of design and communication, have a significant influence on how we determine the world’s problems and can thus improve the way we save the world.

Commercial Marketing

Marketing can be broadly defined as “the processes for creating, communicating, delivering, and exchanging offerings that have value for customers, clients, partners, and society at large.”6 In other words, commercial marketing is an attempt to influence human behaviour towards the purchase of the marketer’s product or service. Consider the management of the consumer’s perceptual process that is at the heart of marketing.7 The marketer aims to position their product or service within the consumer’s mind, often using external stimuli with particular associations and preconceived ‘categories’. In this way the consumer’s beliefs and attitudes and even emotions are manipulated by the marketer so that they learn to equate the product with positive feelings or attributes.

Design

Careful design of the user’s experience with a product or service is the cornerstone of effective marketing. Thaler and Sunstein explore the power of design in Nudge, their behavioural analysis of how people make decisions. Through what they term “choice architecture”, Nudge remarks how businesses take advantage of basic human psychological traits such as the use of heuristics and a bias towards loss aversion and the status quo.8 For example, the status quo bias, which vastly increases the importance of the default option, has been exploited by a wide array of firms, from credit card providers to mail order specialists to fitness clubs, all of whom seek to lure in the consumer at low introductory rates on a contract that will automatically upscale to a higher rate after a certain period of time if no action is taken.9 Even in our local pubs, design is used to maximise how much we spend. In laying out the traditional carvery buffet process, food retail entrepreneur Bobby Kerr advises placing soft drinks and alcoholic beverages at the start of the queue but leaving jugs of free tap water beyond the cash register.10

4 Richard Thaler and Cass Sunstein, Nudge: Improving Decisions about Health, Wealth and Happiness. (Yale University Press, London: 2008) pg. 185 for explanation.
5 Sarah Bartlett and John Hickman, “Copenhagen as a monumental Tragedy of the Commons”, Australia’s E-Journal of Social and Political Debate, 17 December 2009. Accessed at http://www.onlineopinion.com.au/view.asp?article=9844.
6 An extract from the American Marketing Association’s definition of marketing.
7 Dr. Norah Campbell, Lecture on Consumer Perception, Trinity College Dublin, October 2009.
8 Thaler and Sunstein, Nudge, Part 1: Humans and Econs, Chapter 1: Biases and Blunders. Pp. 17-39.

Communication

Human beings buy from human beings, and marketing communication has long taken advantage of this fact through its application of social learning theory.11 This is a broad psychological theory which states that people become engaged in and learn from observing the actions of others and their associated outcomes.12 Advertisements are worth nothing unless consumers can relate to them, which is why the people who appear in adverts closely resemble the demographics of the target market. However, it is not as straightforward as that. People require a commonality with the message but they also dream. Social learning theory is therefore combined with aspirational advertising tactics in many mainstream firms.13 Celebrity endorsement is a classic application of aspirational advertising and social learning theory. A study by Till et al. found that the systematic pairing of a celebrity with a product led to favourable perceptions of that product, even if it was previously unheard of.14 The effectiveness of the strategy increased if there was a perceived commonality between the particular celebrity and the product endorsed15; the use of William Shatner by Priceline.com has been cited as one such example of a successful match.16 Marketing sells hope.17 Perhaps the best example of this in the last year has been the phenomenal success story of Susan Boyle. Through the emotional presentation of Susan’s individual narrative from social outcast to sublime talent and worldwide superstar, the world was captivated, with millions of people developing emotional ties to the success of Susan’s career, giving them huge incentives to watch, vote and buy as her story played out.

9 Thaler and Sunstein, Nudge, pg. 35.
10 Bobby Kerr, National Vintners of Ireland Conference, Kilkenny, 20th May 2009.
11 Gerard Hastings, Social Marketing: Why Does the Devil Have All The Best Tunes?, (Elsevier, Burlington: 2008) pg. 27.
12 Charles Bird, Review of “Miller, N. E., & Dollard, J. Social Learning and Imitation. (Yale University Press, New Haven: 1941)”, American Journal of Psychology, Vol. 55, No. 3, (July 1942) pp. 459-460.
13 Victoria Young, Campaign reveals just a bit of Coke’s secret formula, Financial Times, August 6th 2008.
14 Brian Till, Sarah Stanley, Randi Priluck, Classical Conditioning and Celebrity Endorsers: An examination of belongingness and resistance to extinction. Psychology and Marketing, Vol. 25 (2) February 2008. pp. 179-196.
15 Brian Till et al. Pg. 186.
16 Steve McKee, “The Trouble with Celebrity Endorsements”, BusinessWeek, November 14th 2008.
17 Most explicitly seen in Barack Obama’s Presidential Campaign winning him Advertiser of the Year in 2008. Explored in: Binoy Kampmark, The Marketing of Hope: Buying the President, CounterPunch, January 21st 2009. Accessed at http://www.counterpunch.org/kampmark01212009.html.

Corporate Social Responsibility and Social Marketing

As a capitalist society we generally accept such behaviour18: it is normal for profit-seeking businesses in competitive markets to openly campaign to maximise their share of the consumer’s wallet. These same tactics can of course be used to alter human behaviour in other ways, such as donating to particular charities or changing our lifestyle in certain ways. The growing trend of Corporate Social Responsibility (CSR) has seen many profit-seeking businesses engage in such campaigns alongside their mainstream commercial activities. While CSR often offers significant PR and long-term rewards for such firms19, others are genuinely attempting to make a difference in the world; to leave a legacy. Richard Reed, co-founder of Innocent Drinks, is one such business person and he wholeheartedly agrees with Michael Porter that business organisations are the most effective entity through which to make an impact towards social good, due to their strength of resources.20 However, in laying out Innocent’s five point plan to improve the world as they conduct their operations, Reed acknowledges that it is easier for an owner-managed, niche organisation like his to achieve real change, as he can make, and has made, commercially harmful decisions (such as sourcing Rainforest Alliance Pineapples at a 30% price premium to the market) in order to maintain his ethos. And while he and others point to cause related marketing successes where private interest and social interest can overlap (Ben and Jerry’s being one example)21, CSR campaigns will never (and perhaps should never22) be allowed to outweigh the commercial interests of the vast majority of organisations in a capitalist system. As such, the primary responsibility to effect social good rests with governments and NGOs.

18 Within certain bounds of reason and legality. See Jim Phillip’s interpretation of the difference between marketing and manipulation and the legal line in between in his discussion of a recent court case in Mississippi http://ezinearticles.com/?Marketing-Vs-Manipulation&id=2620282. See also the legality of subliminal advertising. For instance, in the United States, Section 5 of the Federal Trade Commission Act “prohibits unfair or deceptive acts or practices in interstate commerce”; the Federal Trade Commission holds responsibility for the regulation of advertising.
19 John Surdyk, CSR: More than PR, Pursuing Competitive Advantage in the Long Run, QFinance. Accessed at http://www.qfinance.com/contentFiles/QF02/g26fs3i7/11/0/csr-more-than-pr-pursuing-competitive-advantage-in-the-long-run.pdf.
20 Richard Reed, Can Brands Save The World?, The UK Marketing Society. Accessed at: http://www.marketing-society.org.uk/non-member/knowledge-zone/Pages/can-brands-save-the-world.aspx.
21 Ron Irwin, Can Branding Save The World?, Brandchannel, 21st April 2001. Accessed at: http://www.brandchannel.com/features_effect.asp?pf_id=87.
22 Milton Friedman, The Social Responsibility of Business is to Make Profits, New York Times, 13th September 1971.

In this endeavour informal marketing and solicitation are as old as human nature. However, it was only in the 1970s that social marketing was born as a discipline when Philip Kotler and Gerald Zaltman noted the growing implementation of marketing techniques to “sell” ideas, attitudes and behaviours.23 Kotler and Andreasen define social marketing as “differing from other areas of marketing only with respect to the objectives of the marketer and his or her organisation. Social marketing seeks to influence social behaviours not to benefit the marketer, but to benefit the target audience and the general society.”24 An additional clause needs to be added to this definition: many behaviourists and psychologists would argue that claiming the objectives are “not to benefit the marketer” fails to acknowledge that all human behaviour is conducted out of self-interest25, “the desire to be great”26 or, as Dale Carnegie defines it, in the words of philosopher John Dewey, “the desire to be important”27. Thus, the above definition may be better rewritten with the addition of one word so that it reads: “Social marketing seeks to influence social behaviours not to materially benefit the marketer...”. While this refinement may seem petty or unnecessary, it is anything but. If we accept the words of Freud, Kristof, Fellenz, Carnegie and Dewey, we have established a link of self-interest between the motives of profit-seeking organisations and nonprofit causes and charities. However, despite this link of motives, many NGOs and public-good causes fail to properly utilise the tools available to them for so-called ‘ethical’ reasons. Nicholas Kristof, social marketing guru, philanthropist and author of Half The Sky, summed up the consequences of this failure: “good people engaging in good causes sometimes feel too pure and sanctified to sink to something as manipulative as marketing but the result has been that women have been raped when it could have been avoided and children have died of pneumonia unnecessarily.”28

Marketing Global Challenges

What we have termed Obama’s “Big Six” challenges of the 21st century all have one thing in common: they are widely acknowledged and accepted as problems. And while that status is merited, it has not necessarily been achieved easily. To a greater or lesser extent it has required marketing in one form or another. For instance, the scientific evidence supporting climate change has not increased dramatically in the last two decades – but popular acknowledgement of it as a problem has. How so? The growing incidence of climate change-related natural disasters has undoubtedly played its part, but so too has a concerted drive by scientists and environmental activists to raise its profile on the agenda of world priorities – in other words, a climate change marketing campaign. Perhaps the most successful marketer in this effort has been Al Gore, who borrowed heavily from traditional marketing theory in mainstreaming the climate change movement through his documentary film, An Inconvenient Truth, and the worldwide Live Earth concerts.

23 Philip Kotler and Gerald Zaltman, Social Marketing: An Approach to Planned Social Change, The Journal of Marketing, Vol. 35, July 1971, pp. 3-12.
24 P. Kotler and A. Andreasen, Strategic Marketing for Non-Profit Organisations, 4th edition (Prentice-Hall, Englewood Cliffs, NJ: 1991).
25 Martin Fellenz, Lecture Series on Organisational Behaviour, October 2008.
26 Sigmund Freud as referenced in Dale Carnegie, How to Win Friends and Influence People (Vermilion, London: 2008), orig. 1936, pg. 18.
27 Dale Carnegie, How to Win Friends and Influence People (Vermilion, London: 2008), orig. 1936, pg. 18.
28 Nicholas Kristof, Advice for Saving The World, Outside Magazine, December 2009.

The Threat of Overpopulation

However, as important as each of the “Big Six” may be, Professor Albert Bartlett argues that there is one challenge, overpopulation, which significantly worsens nearly all of the others and as such is by far the greatest of the world’s problems in the 21st century. And yet it receives markedly less attention. Using hard maths to back up his claims, Bartlett points to the oxymoron of “sustainable growth” and delivers a stark prognosis for the damage it will cause if not stopped. If minimising climate change through the alteration of the quality and scope of our lifestyles is deemed by Al Gore to be “an inconvenient truth”, then the list of proposed solutions to overpopulation which Al Bartlett suggests, including “abortion”, “war”, “pollution”, “disease”, “accidents” and “stopping immigration”, might be deemed ‘an unconscionable truth’. This bleakness perhaps partly explains why overpopulation, unlike the other problems, is not acknowledged by mainstream society as a problem. Overpopulation is just not “a politically correct issue” to address at the current time29. One look at the statistics behind Bartlett’s claims shows that such an approach is self-delusion at best; self-destruction at worst. Nobel Laureate Dr. Henry W. Kendall remarks that “if we don’t halt population growth with justice and compassion, it will be done for us by nature, brutally and without pity – and will leave a ravaged world.”30 Anyone whose eyes have been opened to these facts has a responsibility to let the world know. After all, as vast as the problem is, it is not insurmountable: it can be halted with justice and compassion. Also on Bartlett’s list of potential solutions to overpopulation are achievable and non-violent methods such as “contraception” and “family planning.”31
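
The arithmetic behind the phrase “sustainable growth” can be sketched in two lines. This is an illustration only, not a reproduction of Bartlett’s own figures: it assumes a population that grows at a constant annual rate r, which is a simplification, and the 1% rate in the example is chosen purely for convenience.

```latex
N(t) = N_0 e^{rt}, \qquad N(T_2) = 2N_0 \;\Longrightarrow\; T_2 = \frac{\ln 2}{r} \approx \frac{70}{\text{growth rate in \% per year}}

r = 1\%\ \text{per year} \;\Longrightarrow\; T_2 \approx 70\ \text{years}
```

Under that assumption, any steady positive growth rate implies a fixed doubling time, and therefore doubling after doubling, which is why growth at a constant percentage cannot continue indefinitely on finite resources.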

Understanding Overpopulation: Where We Go From Here

Nicholas Kristof argues that these two solutions are best achieved through a comprehensive worldwide campaign to educate women32; this essay, along with the UN Population Fund and Overpopulation.org, agrees with him. However, to act on Kristof’s call and achieve Kendall’s aim, the world must first acknowledge the difficult reality that its population is growing at an unsustainable rate. This essay draws on the principles of design and communication which we have noted to propose several methods which would call mainstream attention to overpopulation as an urgent global problem.

29 Dr. Albert Bartlett, Arithmetic, Population and Energy. Accessed at: www.albartlett.org.
30 World Population Awareness, www.overpopulation.org.
31 Arithmetic, Population and Energy, a talk by Dr. Albert Bartlett. Accessed at: www.albartlett.org.
32 Nicholas Kristof and Sheryl WuDunn, Half The Sky: Turning Oppression into Opportunity for Women Worldwide (Knopf, New York: 2009).



Design

Thaler and Sunstein’s core philosophy in Nudge is that successfully implemented choice architecture can improve health, wealth and happiness. The underlying thesis – that more often than not human beings act irrationally – is an extremely valuable lesson for those seeking to alter the world’s behaviour for the better. For instance, what the authors term ‘the dynamic inconsistency of temptation’ (put simply, our inability to resist behaviours that we know to be against our own self-interest) suggests that merely showing every person in the world Al Bartlett’s lecture would not solve the problem. In the context of global challenges, Nudge allows us to consider overpopulation as the outcome of a global choice architecture system in which decisions are made by all kinds of actors, from consumers to NGOs to governments. The authors argue that the two most important factors affecting this market of decisions are incentives and feedback.33 Nudge contends that when incentives are badly aligned and lead to unfavourable outcomes, it is appropriate for government to try to realign them. In discussing the steps forward from Al Bartlett’s stark findings, Gail Tverberg calls for one such realignment of incentives with potentially momentous consequences. She states that all nations should limit foreign aid contributions to only those countries that make “continued demonstrated progress in reducing population growth rates and sizes.”34 The application of a cap and trade system, discussed by Thaler and Sunstein in the area of climate change35, would offer a variant of this, allowing countries more freedom to decide their policies while directly disincentivising population growth. Either proposal effectively equates to an indirect tax on children which, if population growth persists, may well be necessary.
Such arguments certainly highlight the problems of social welfare policies and no-strings-attached charity approaches, which can actually incentivise families to have more children. These issues are not easy. All the authors acknowledge that the proposals under discussion can seem politically unacceptable or even inhumane on first assessment. However, Thaler and Sunstein maintain that part of the reason for this is a lack of feedback on the decisions we make. While the gain from making a particular decision may be personal, clear and immediate, the costs may be diffuse, uncertain and delayed. This mismatch in salience between the benefits and costs of particular behaviour is at the root of many of the world’s problems, such as pollution36. It is also present in many decisions which contribute to overpopulation. In order to alter this dynamic, systems of feedback must be developed to increase the salience of the costs of unfavourable behaviour. Thaler and Sunstein point to the marked success of the Toxics Release Inventory initiative in causing major polluters to reduce their environmental impact in order to avoid appearing on a “named and shamed” list of the worst offenders, which would incur significant reputational and PR costs.37 This essay proposes the application of a similar initiative targeting large charities and NGOs, as well as national governments, in areas where population growth is at its most dangerous. By ranking these organisations in order of their efforts to educate women and publicising the results, such a campaign may refocus the work of charities and governments in need of public support.

33 Thaler and Sunstein, Nudge, pg. 185.
34 Gail Tverberg, “Dr. Albert Bartlett’s Laws of Sustainability”, The Oil Drum, 5th November 2009. Accessed at: http://www.energybulletin.net/node/50632.
35 Thaler and Sunstein, Nudge, pg. 186.
36 Thaler and Sunstein, Nudge, pg. 188.

Communication

However, this proposal will only work if the general public who donate to charities perceive overpopulation to be a problem of significant importance. Research by Professor Paul Slovic, a pioneer in this field, suggests that publishing a list of facts concerning the challenges of overpopulation would not be sufficient to raise the issue in the public consciousness.38 His data show that what he calls ‘psychic numbing’ occurs in most humans when they are confronted with information about large numbers of people suffering: “If I look at the mass I will never act”. Given the large numbers of people in need of aid, this is an extremely concerning finding for governments and NGOs reliant on public support to conduct their activities.39 Slovic set out to identify the point at which psychic numbing kicks in – the number of people that constitutes the “mass” referred to. The results were deeply depressing. Through comparative studies Slovic found that our empathy begins to fade when the number of victims reaches just two; starting out from the individual case, Slovic remarked that “the more who die, the less we care.”40 However, as depressing as such a finding may initially be, it also offers a path forward, summed up by the System 1 conclusion of Slovic’s study: “if I look at the one I will”41. Nicholas Kristof believes Slovic’s findings have extraordinarily important repercussions. Having determined the importance of female education for overpopulation in Half The Sky, Kristof is now on a mission to raise maximum attention and money for this cause. As a Pulitzer Prize-winning journalist, Kristof visits impoverished areas around the world to report back to the First World on the harsh realities of overpopulation. In order to maximise the impact of these reports, Kristof explains that he has developed two important rules.
Firstly, acknowledging that “people do good things in part because it feels good”, he holds that hopefulness must be present in every communication. He highlights the case of Mukhtar Mai, a Pakistani rape victim who used compensation money to start a school, believing education to be the route to overcoming the kind of attitudes that led to her rape. “The first time I wrote about her, I was inundated with letters and more than $100,000 in cheques for her,” he says. Kristof’s second rule is drawn directly from Slovic’s ‘psychic numbing’ research: his storytelling always focuses on an individual rather than a group. In support of this tactic, Kristof highlights the anti-apartheid movement in South Africa. Although the government had imprisoned many brave activists, global campaigns focusing on freeing these political prisoners failed to gain traction at the start. Until, that is, the organisers had the idea of refocusing the campaign on an individual and came up with the slogan “Free Nelson”. Once the movement had a face, it resonated far more widely and, ultimately, helped topple apartheid. Now, in his own stories, Kristof is ruthless in adopting the same careful messaging tactics used by commercial firms the world over. For instance, he remarks, regarding public-good campaigns: “I learned that readers cared above all about girls, so when I came across a young man with a compelling story, I would apologise and ask him if he knew any girls with similar problems.”42

37 Thaler and Sunstein, Nudge, pg. 190.
38 Paul Slovic, “If I Look at the Mass I Will Never Act: Psychic Numbing and Genocide”, Decision Research and University of Oregon, Judgment and Decision Making, Vol. 2, No. 2, April 2007, pp. 79-95.
39 Nicholas Kristof, Advice for Saving The World, Outside Magazine, December 2009.
40 Slovic, pg. 80.
41 Slovic, pg. 91.

Conclusion

This essay started by providing a brief overview of both the problems of collective action in the modern day and the commercial application of design and communication strategy. In examining the traditional hesitancy to apply these tactics to public-good problems, we evaluated the growing practice of social marketing and its potential benefits. Referring to Dr. Albert Bartlett’s lecture on overpopulation and Nicholas Kristof’s related campaign to educate women worldwide, we proposed definite steps, based on the lessons derived from design and communication, that could be taken by concerned individuals, as well as governments and NGOs, to highlight the threat of overpopulation. In the order in which they have been presented above, they are:

1. The linking of monetary and financial aid with measurable efforts to educate women.
2. The publication of a ranking of governments’ and NGOs’ efforts to educate women.
3. The linking of numerous hopeful, individual narratives to the problem of overpopulation.

Although the steps are listed in the order in which they would theoretically have the most immediate impact, in reality this essay proposes that they be implemented in reverse. Only when the general public have become convinced of the threat of overpopulation through a wide-scale campaign linking Bartlett’s macro-trends with individual stories of hope will feedback initiatives be salient and a political will exist to support tough realignment of incentives.

Author’s Note

This essay was written in the hope of stimulating the reader to act in the cause of advancing both social marketing practices and, more pressingly, people’s understanding of the dangers of overpopulation. The study was prompted by what the author perceives as two major inconsistencies. Firstly, the scale of the threat which global population growth poses to humanity is clearly out of sync with the minuscule attention it is receiving in mainstream politics and the media. Secondly, the not-uncommon reluctance of many public-good activists to engage in commercial marketing tactics which have been proven successful is not only naive but also irresponsible. It has not been within this essay’s remit to detail the statistical evidence which supports both the case for social marketing and the threat of overpopulation; thus, if the reader remains unconvinced as to the legitimacy of the assumptions made in this essay, it is hoped they will consider the more elegant works of Bartlett and Kristof as a next step. Let us start out on the long, collective journey to save the future of the human population.

42 Nicholas Kristof, Advice for Saving The World, Outside Magazine, December 2009.



Celtic Studies & Irish panel

Dr. Regina Uí Chollatáin, UCD (chair)
Dr. Pádraigín Riggs, UCC
Prof. Liam Mac Mathúna, UCD

Judges’ comments

The author of this essay demonstrates an excellent and unusual mastery of literary criticism in addressing the spirit and scope of some of the most important aspects of Seán Ó Ríordáin’s poetry. He presents a discerning, perceptive study which shows a clear understanding of the development of the poet’s thought in the period between the publication of his two main poetry collections, Eireaball Spideoige and Brosna. Although there are some grammatical errors, which are to be expected at undergraduate level, the author succeeds in implementing an academic writing style which is very apt for this literary discourse. It is the opinion of the adjudicating panel that this essay is not only deserving of the highest praise and acknowledgement, but that it is clearly the best essay in this category. The author also demonstrates significant promise as a future scholarly critic, adding further credence to his selection as the winner of the Celtic Studies & Irish category for the 2010 Undergraduate Awards.

Léiríonn údar na haiste seo máistreacht neamhchoitianta ar éirim dánta tábhachtacha de chuid Sheáin Uí Ríordáin. Déanann sé iniúchadh grinn géarchúiseach orthu agus aimsíonn tuiscintí glinne ar an bhforás a tháinig ar mhachnamh an fhile sa tréimhse idir foilsiú a dhá phríomhchnuasach filíochta, Eireaball Spideoige agus Brosna. Cé go bhfuil easnaimh áirithe le brath ar ghnéithe de chruinneas deilbhíochta na teanga san aiste, a mbeifí ag súil leo ag leibhéal fochéime, tá láimhsiú breá cumasach déanta ag an údar ar an stíl acadúil scríbhneoireachta a oireann don dioscúrsa a bhaineann leis an réimse ábhair. Is dóigh leis an bpainéal moltóireachta nach é amháin go bhfuil ardghradam agus ardaitheantas tuillte ag an aiste seo, ach gurb í is fearr ar fad sa chatagóir, an Léann Ceilteach agus Gaeilge, agus go bhfuil an-ghealladh faoin údar mar scoláire critice amach anseo.



Celtic Studies & Irish

The poetry of Seán Ó Ríordáin
Thaddeus Ó Buachalla

“Paradise is lost for good, the individual stands alone and faces the world – a stranger thrown into a limitless and threatening world. The new freedom is bound to create a deep feeling of insecurity, powerlessness, doubt, aloneness and anxiety.”

Escape from Freedom – Erich Fromm

In this passage, Erich Fromm speaks of the anguish of a person who gains freedom from some structure in his life. It is a fitting place to begin a discussion of the poetry of Seán Ó Ríordáin, because it captures the fear he felt when he lost his faith. In this essay, I will look at how Ó Ríordáin’s work expresses this anguish and the inner conflict that accompanied it.

Ó Ríordáin gives a very good account of his philosophy of poetry in the preface to his first book, Eireaball Spideoige. In his view, poetry is a spiritual, metaphysical bond with the essence of a thing. He believed that everything had an essence, or ‘prayer’ (paidir), and that poetry occurred when one came into contact with, or was startled by, this essence. He thought this essence could be grasped in the way a child’s mind engages with the world. This philosophy can be seen in Malairt. He speaks of Turnbull contemplating “chomh cruaidh air gur tomadh é fá dheoidh/ in aigne an chapaill” [so hard that he was at last immersed in the horse’s mind]. This philosophy was interwoven with his faith. He believed that this essence would be found in heaven, where “the complete form of truth will be before us” (ES 17), and that in hell there would be “no example of the sculpting of truth to be found” (ES 19). The loss of faith was like an assault on this philosophy, and his poetry of this period shows him as someone trying to grasp the essence of life as a defence against the meaningless freedom that lay before him. Some of his strongest poetry came out of this conflict, in my view.

Gerard Manley Hopkins held the same philosophy, and Ó Ríordáin included his poem As Kingfishers Catch Fire, Dragonflies Draw Flame in full in his preface. Hopkins gave the name ‘inscape’ to this essence, Ó Ríordáin’s paidir. Both believed that this essence was in everything. In their view, the poet had to rub his own paidir, or essence, against something outside himself, and from this came the result, the poem. This was closely bound up with religion, according to Eibhlín Nic Ghearailt, since God is believed to be in everything and “when a poet made contact with the paidir of some foreign thing, he would also make contact with God” (Nic Ghearailt 41). There are also similarities between their poetry. Hopkins used compound words in his poems: “Towery city and branchy between towers;/ Cuckoo-echoing, bell-swarméd, lark-charméd, rook-racked, river-rounded”. Ó Ríordáin did the same with “screadstracadh ar an nóinbhrat”. These similarities are especially significant because they show what Ó Ríordáin was seeking at the time he wrote the poems in Eireaball Spideoige. Hopkins was a devout Catholic poet, and the similarities, in both poetry and philosophy, show that Ó Ríordáin wanted to strengthen that devotion in his own life and poetry. That was his aim, but achieving it would not be easy for him.

In Adhlacadh Mo Mháthar, the speaker is trying to get a grip on the metaphysical world. He wants to go beyond the natural burial and to taste it fully (“bhlaiseadh go hiomlán”). This is the paidir of the burial, and he uses particular stylistic devices in the poem to reach this transcendence. The word ‘lámh’ (hand) is repeated in the third stanza, ‘gile’ (brightness) in the fifth and ‘ba mhaith liom’ (I would like) in the final stanza, and this style gives the poem a hypnotic effect. There are also biblical references in the poem, and these can be seen as another means of transcendence. There is symbolism linking the speaker to Christ in lines such as “pian bhinibeach ag dealgadh mo chléibhse” [a venomous pain pricking my breast]. In the same way, there is symbolism linking his mother to the mother of God.
The speaker’s mother is described, according to Pádraigín Riggs, “as mother and as virgin” (Riggs 2000: 329). In the fifth stanza she is referred to directly in the context of symbolism evoking virginity, “Gile gearrachaile lá a céad chomaoine” [the brightness of a young girl on her First Communion day], and motherhood, with the milk from the breasts. In the line between those two, we find the symbol of the communion host, and in my opinion this host can be seen as a symbol of the transcendence of the natural world (in the way that the physical piece of bread is made into the metaphysical body of Christ) and as a perfect image of the paidir (‘prayer’, Ó Ríordáin’s term for the essence of a thing) that he is seeking. Pádraigín Riggs also notes the reference in this poem to the traditional poem Caoineadh na dTrí Muire. The line “cuimhne na mná a d’iompair mé trí ráithe ina broinn” [the memory of the woman who carried me three seasons in her womb] echoes a line from that poem, but Ó Ríordáin changes the person of the verb. This creates a more direct link between the speaker’s mother and the mother of God. This devotional symbolism (Jesus, Mary and the host) expresses a desire to be in communion with the metaphysical world.

The communion fails, however, because of the perfect image of transcendence: the robin. The robin is a symbol of the paidir he discusses in the preface. She can transcend death with her ‘caidreamh neamhghnách’ [unusual intimacy] and is comparable to the air of heaven. In his preface he imagines the ‘geit’ (start), or inspiration, as a moment in which a person is outside his own mind “or above it, like a robin” (ES 22). The speaker, however, cannot take part in this metaphysical moment. There is no epiphany, and he is set apart as a ‘tuata’ [layman]. The poem ends with a bathos that falls outside the dúnadh, the formal close. It is weak bathos, according to Seán Ó Tuama, who cites The Hollow Men by T.S. Eliot as an example of strong bathos. I do not agree with him in this case, however, because neat bathos would not suit the poem. In my opinion, it is from the poetic untidiness of these lines that their power comes, and it conveys the powerlessness of the speaker, who is not “gan mhearbhall gan scáth” [without confusion, without fear]. In the context of the passage from Erich Fromm, this powerlessness emerges as a central theme of the poem. The poem is a meditation on the lack of communion with the metaphysical world and on the anguish of the person who cannot reach it. The religious symbolism expresses this longing of the speaker’s, but he fails because there is nothing before him but a void. It is in this void that the doubt, aloneness and anxiety of which Erich Fromm speaks reside.

If religion fails to give the speaker relief in Adhlacadh Mo Mháthar, it is in Cnoc Mellerí that we find the conflict between the speaker who is seeking religious relief and the one who is abandoning it. The poem has four sections, according to Seán Ó Tuama, and they show different aspects of the conflict. In the first section, the regular, religious life of the monks is accepted as something magical. We find, for example, a monk made of sunlight, reading. In the second section, however, doubt and gloom enter with “tiubhscamall de chlúimh liath” [a thick cloud of grey down]. The speaker sees the monks’ life as a harsh thing that would do an injustice to a boy. In the third section, venom has entered his voice and he looks back at the ‘fásach’ [wasteland] of his life. He questions the unnaturalness of this religious life. In the fourth section there is an attempt to have it both ways. In the morning he receives “mórfhuascailt na faoistine” [the great release of confession], but later the Church of God is a fetter on his mind. In the end, the chief effect on the spirit, according to Seán Ó Tuama, is “doubt deeper than ever” (Ó Tuama 1978: 45).
This poem, Cnoc Mellerí, is a conflict between the two states of mind, but on the side that rejects the religious life, the ‘limitless and threatening world’ of which Erich Fromm speaks is not to be seen. The only sign of it is the “greim fhir bháite” [drowning man’s grip] in the final line. The voice in the poem is rational, or too rational according to Seán Ó Tuama, and it appears that the speaker is trying to resolve the question logically. It can be said, however, that the fear of freedom has not yet fully pressed in on him. The rosary beads, in my opinion, are a metaphor for human life as Ó Ríordáin imagines it. What is at issue here is something linear, without turns or branches. The rosary is found clenched in his fist as a symbol of the doubt he feels about his destiny, but in the final stanza the metaphor returns in another form, and it is the days ahead of him that are “fá cheilt i ndorn Dé” [hidden in God’s fist]. In this way, his understanding of his own destiny can be seen as something linear, and it can be said that the undirected kind of freedom is not yet before him as a firm concept.

The poem Saoirse deals with the same theme, but from a very different perspective. It is here that we find the ‘limitless and threatening world’ of which Erich Fromm speaks. In the preface, Ó Ríordáin described hell as “a place where no example of the sculpting of truth is to be found” (ES 19). He cited Saoirse as an attempt to portray that damned state of mind, but admitted that he felt “the damned abstract freedom” threatening him as he wrote. By the time he wrote this poem, according to Seán Ó Tuama, he was fully convinced “that no one is free except the one who is bound” (50), and this conviction is found in the poem. It is there, however, with the venomous irony of the bondsman. The speaker abandons freedom for the bondage he will find among the people “nár samhlaíodh riamh leo/ ach macsmaointe” [with whom nothing was ever associated but secondhand thoughts]. It is a notable point that these are the same people as the “lucht glanta glún” [the knee-brushers] found in Adhlacadh Mo Mháthar. Although these people are like the child’s father in the preface, with no understanding of the paidir (‘prayer’, or essence) of anything, they are nonetheless “gan mhearbhall gan scáth”, that is, without the confusion and fear that are the problems of freedom. It is a terrible compromise, but the message of the poem is that it is better for him than the alternative.

It is clear from the form of the poem, however, that he cannot fully accept this compromise. There is a distinctive rhythm in the three-line stanzas. After the tenth stanza, there is a repetition of the word ‘don’ (to the) which gives the poem the appearance of going out of control, but this begins with the lines: “Is do thugas gean mo chroí go fíochmhar/ Don rud tá srianta” [And I gave my heart’s love fiercely / To the thing that is restrained]. It is the contrast between ‘fíochmhar’ (fierce) and ‘srianta’ (restrained) that reveals the inner conflict on which this poem is built. His fear of freedom is also clear from the strength of venom visible in lines such as “Is bheirim fuath anois is choíche/ Do imeachtaí na saoirse” [And I give hatred now and forever / To the goings-on of freedom]. The poem ends with a long stanza in free rhythm. The same whirl is not found here, but the anaphora it contains is negative. This is a true glimpse of the lost paradise of which Erich Fromm speaks.

He was still hoping to find this lost paradise twelve years later, when Brosna was published. The quest had a different perspective, however, because he had abandoned religion by this time. He still held the philosophy he had written about in the preface, but it had taken a different shape. As Pádraigín Riggs put it: “There was another paradise to which one could return, if only four hundred years of history were set aside” (Riggs 2008: 145). The lost paradise in this case is a person’s “cló ceart” [true form], as it would be found through the medium of Irish in Corca Dhuibhne, if “srathar shibhialtacht an Bhéarla” [the straddle of English-language civilisation] can be removed from his mind. It is the very same metaphysical ideal that he seeks in Adhlacadh Mo Mháthar. He mentions the paidir of the Irish language in the preface, but he is far more intense about it here because he has abandoned religion.
The Irish language and the true Gaelic form (cló ceart) now take the place of religion, and he speaks of them in religious language such as “dein d’fhaoistin” [make your confession] and “buail is osclófar” [knock and it shall be opened]. It is a romantic picture in which Dún Chaoin is heaven, like the place where, as he said in the preface, “the complete form of truth” would be “before us” (ES 17). What is found there now is “ag bun na spéire ag ráthaíocht ann/An Uimhir Dhé, is an Modh Foshuiteach” [shoaling there at the horizon / the Dual Number and the Subjunctive Mood]. It is a place where the present day and the time of our ancestors come together but, as Seán Ó Tuama asks, how does a person’s true form in this ideal place square with the different personae, at odds with one another, that are created between the two civilisations, the Gaeltacht and the Galltacht? “There is no Dún Chaoin, however purely Gaelic, that would cure his case,” he says (54).

The definite ‘mise’ [self, literally ‘I’] that Ó Ríordáin was seeking is a major question, and Seán Ó Tuama discusses it again in the context of Daoirse. According to him, the poet’s view was “that there is always one permanent, true Ó Ríordáin within” (Ó Tuama 1975: 26) and that he would have freedom if he could reach it. In Daoirse, the reader is urged to submit to bondage, and freedom will be granted to him: “Ná tabhair don daoirse diúltamh / Is tabharfar saoirse duit” [Do not refuse bondage / And freedom will be given to you]. It appears, however, that the kind of freedom on offer is very different from the freedom that terrified the poet in Saoirse. This submission closely resembles Christian teaching, and the freedom that results closely resembles the comfortable state of the believer, a state in which there is one true ‘mise’ and the ‘limitless and threatening world’ is far away. The true form, or true ‘mise’, is comparable to the image of the rosary in Cnoc Mellerí. Both refer to some kind of linear destiny.

In these ways, Ó Ríordáin’s poetry can be seen as the poet’s attempt to understand his identity and to create a definite place for himself in the world. His early poetry speaks of the fear he had of a world without the structure of faith, and it works as an attempt to fill this void. By the time Brosna came out, language and literature had taken the place of faith, but the same philosophy remained. He was still trying to create a structure of identity as shelter from a world without structure and, in some way, to find a lost paradise.



Chemistry panel

Prof. Kieran Hodnett, UL (chair)
Dr. Kevin M Ryan, UL
Dr. John Colleran, NUIM
Dr. Leigh Jones, NUIG
John Cassidy, DIT

Judges’ comments

This entry reports the synthesis of novel indolocarbazole derivatives, which have significant potential as anti-cancer drugs. Specifically, indolocarbazole compounds have been shown to act as protein kinase inhibitors that prevent the uncontrolled cell division in cancer cells. The report details complex organic chemistry pathways towards the successful synthesis of these compounds with extremely high purity. The desired product was achieved by a different route to that planned at the outset, showing capability for original thinking and dedication to the project. The review panel found the report to be excellently structured with a detailed introduction, thorough experimental section and complete bibliography. The accuracy in the structures of compounds presented and numerical labelling of same made for ease of reading, with the complete absence of typographical errors greatly commended. In summary, this was an exemplary report that succeeded from a very high quality field of entries and is a worthy winner of the Undergraduate Award in the Chemistry category.



Chemistry

Synthesis of novel indolocarbazole derivatives

Hannah Winfield

Abstract

Indolocarbazoles are a class of compounds that are currently under study due to their potential as anti-cancer drugs. A modification of these has led to the bisindolylmaleimide series of analogues. The overall objective of this project was to produce novel kinase inhibitors by modification of the bisindolylmaleimide structure, replacing one of the indoles with a pyridine and utilising novel heterocycles in place of the lactam/maleimide ring. Two key intermediates were successfully synthesised, which will form the basis for future elaboration to more novel kinase inhibitors. Two heterocycles were consequently formed on these intermediates, which are now ready for derivatisation. The novel potential kinase inhibitors were fully characterised.

Introduction

Biology

Cancer

Cancer is a class of diseases in which a group of cells displays uncontrolled growth, invasion and sometimes metastasis. Cancer causes about 13% of all human deaths and at least one fifth of all deaths in Europe and North America.1 There are three main approaches for the treatment of cancer: surgery, radiotherapy and chemotherapy. Immunotherapy, monoclonal antibody therapy or other methods can also be used. Treatment can often initially be effective if the cancerous tissue is entirely removed by surgical excision, but this is not possible if the cancer has metastasised to other sites in the body. In such instances, radiotherapy and chemotherapy usually must be used. Chemotherapy is increasingly being utilised and predominantly works by interfering with cell division of rapidly dividing cells. However, as chemotherapy also affects normal cells, it is not without side effects.2 Therefore, new and more selective remedies for cancer sufferers must be found.



Protein Kinases and Kinase Inhibitors

Kinases are a type of enzyme that transfer phosphate groups from a high-energy donor molecule, such as adenosine triphosphate (ATP), to a specific substrate. Protein kinases are a type of kinase enzyme which modify other proteins by adding phosphate groups to them. This phosphorylation results in a change in shape of a protein, which usually gives rise to a functional change of the protein such as a change in enzyme activity, association with other proteins or cellular location. In this way they regulate other proteins and, more indirectly, the activities of cells. The kinases play an important role in many intracellular signalling pathways, including those that control cell growth and cell division. Up to 518 different kinases have so far been identified in humans.3 Their enormous diversity, as well as their role in signalling, makes them an object of study for drug design.4 In normal cells, kinases, and another group of enzymes called phosphatases, work together to control cell growth and division. However, in cancer cells, these normal controls no longer function. One key feature of cancer cells is their ability to reproduce in the absence of external signals such as growth factors. This can be caused by a mutation in a kinase gene, which causes the pathway that the kinase controls to be, in effect, stuck in the ‘on’ position. A protein kinase inhibitor is an enzyme inhibitor that specifically blocks the action of one or more protein kinases. Staurosporine, which inhibits several different protein kinases such as Protein Kinase C (PKC), stops progression of normal nontransformed cells in the G1 phase of the cell cycle.

Fig. 1. The cell cycle.

Staurosporine and other indolocarbazoles have been shown to inhibit kinases by competing with ATP.5 Due to the similarity between ATP binding sites in different kinases, this approach can be limited in terms of selectivity. However, it has been possible to obtain some selectivity by exploiting differences in the mode of ligand interaction in the ATP-binding pocket.6 One of the key features of future kinase inhibitors will be the selective H-bonding in the ATP-pocket.
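The ATP-competitive mode of inhibition described above can be made concrete with the standard rate law for competitive inhibition. This is a textbook relation rather than anything derived in this report. The symbols are introduced here for illustration only: [S] is the ATP concentration, [I] the inhibitor concentration, K_m the Michaelis constant for ATP and K_i the dissociation constant of the inhibitor.

```latex
v = \frac{V_{\max}\,[S]}{K_m\left(1 + \dfrac{[I]}{K_i}\right) + [S]},
\qquad
\mathrm{IC}_{50} = K_i\left(1 + \frac{[S]}{K_m}\right)
```

Because a competitive inhibitor raises only the apparent K_m, the measured IC50 of an ATP-competitive compound increases with the ATP concentration (the Cheng-Prusoff relation on the right).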



Fig. 2. Interactions between staurosporine and amino acid residues of protein kinase binding site (note role of lactam moiety).

DNA Topoisomerases

In 1992, mammalian topoisomerases were shown to be targets for indolocarbazoles by Yamashita et al.7 This opened up new possibilities, in that indolocarbazole compounds could selectively interact with the ATP-binding sites not only of protein kinases but also of other proteins with slightly different ATP-binding sites. DNA topoisomerases are found in all living organisms. They are enzymes involved in many cellular processes that require the separation of DNA strands, such as replication, transcription, recombination and repair.8 There are two types of DNA topoisomerase: type I, which makes a nick in a single strand of DNA and passes a single strand through the nick, and type II, which cuts both strands of DNA and passes another double-stranded DNA segment through the break. Afterwards the topoisomerase rejoins the strand or strands without making any changes to the DNA sequence. DNA topoisomerases are targets for different classes of drugs with antitumor or antibacterial applications.9

Indolocarbazoles

Indolocarbazoles are a class of compounds that are currently under study due to their potential as anti-cancer drugs. These compounds of natural origin have been isolated from diverse groups of organisms, from prokaryotes (actinomycetes, cyanobacteria, β-proteobacteria) to eukaryotes (myxomycetes, basidiomycetes, marine invertebrates). There are five different isomeric ring systems available for indolocarbazoles: indolo[2,3-a]carbazole (1), indolo[2,3-b]carbazole (2), indolo[2,3-c]carbazole (3), indolo[3,2-a]carbazole (4), and indolo[3,2-b]carbazole (5). Almost all indolocarbazoles isolated from nature are indolo[2,3-a]carbazoles and interest has mostly focused on this isomer.



Fig. 3. Indolo[2,3-a]carbazole (1), indolo[2,3-b]carbazole (2), indolo[2,3-c]carbazole (3), indolo[3,2-a]carbazole (4), indolo[3,2-b]carbazole (5).

DNA topoisomerase I has been shown to be an important therapeutic target in cancer chemotherapy for indolocarbazole antibiotics.10 There are also at least two other modes of biological action which exist for indolocarbazoles: inhibition of protein kinases and intercalative binding to DNA. However, it has been shown that at least some indolocarbazoles act through more than one mechanism. For example, the antitumor activities of different Rebeccamycin (REB) (7) derivatives were often not correlated to their anti-Top1 or DNA-binding activities.11 In addition, some K252a derivatives were found to be active against Top1, as well as being nonselective protein kinase inhibitors.12 The replacement of one or both of the terminal aromatic rings by pyridine rings has been found to reinforce the DNA binding properties of indolocarbazoles.13 The first synthetic work carried out towards azaindolocarbazoles, i.e. indolocarbazoles containing either one or two azaindoles, was reported in 2002 by Routier et al.14 Indolocarbazoles also have displayed many other biological activities including hypotensive properties, inhibition of platelet aggregation, inhibition of smooth muscle contraction, activation of macrophages,15 blocking of the proliferative response of T lymphoblasts to mitogens,16 in vitro immunosuppression,17 inhibition of the osteoclast proton pump,18 insecticidal activity,19 reversal of multidrug resistance,20 and neuroprotection (promotion of neuronal survival).21 The indolocarbazoles isolated to date are a structurally diverse family of natural products.
The various types of aglycons can be divided into four major groups: 1) the parent indolo[2,3-a]carbazole nucleus, such as that found in tjipanazole (6); 2) an imide, as in rebeccamycin (7); 3) hydroxy lactams, as in the UCN compounds (8); and 4) simple lactams, such as those found in RK-1409B (9). In all of these aglycon types, substitution (i.e. halides, ethers, phenols) at various positions on the aromatic heterocycle has been observed.



Fig. 4. Tjipanazole (6), rebeccamycin (7), UCN-01 and UCN-02 (8), RK-1409B (9).

Synthesis of Indolocarbazoles

Most of the available routes for the synthesis of indolocarbazoles rely either on direct formation of the indolocarbazole skeleton by indole ring synthesis, elaboration of bisindole precursors by construction of the central carbocyclic ring, or a combination of these strategies. Modification of the existing indolo[2,3-a]carbazole scaffolds using well-established functional-group transformations enables synthesis of structurally more complex derivatives. Prior to their isolation from nature and the discovery of their biological action, little attention was paid to the synthesis of indolo[2,3-a]carbazole derivatives. The first synthesis of an indolocarbazole was achieved in 1956 by Tomlinson et al.22 They reported that the condensation of 8-amino-1,2,3,4-tetrahydro-9-methyl-9H-carbazole (10) with 2-hydroxycyclohexanone, followed by dehydrogenation, produced the N-methyl derivative, 9-methylindolo[2,3-a]carbazole (11), as shown in Scheme 1. However, attempts to synthesise the parent indolo[2,3-a]carbazole via an analogous approach failed.

Scheme 1

The following year Bhide et al.23 developed a double Fischer indolisation of 2-chlorocyclohexanone or 2-hydroxycyclohexanone which, upon oxidation, provided indolo[2,3-a]carbazole (1) as shown in Scheme 2.



Scheme 2

Staurosporine

Staurosporine (12), isolated from Streptomyces staurosporeus in 1977, was the first indolocarbazole to be obtained from a natural source.24 It was during the same era that protein kinase C was discovered and the oncogene v-src was shown to have protein kinase activity. Staurosporine was also the first compound to be isolated with this type of bis-indolyl structure. It has anticancer properties and inhibits several different protein kinases. It also enhances the cytotoxicity of other anti-cancer drugs.25 Many staurosporine analogues have been synthesised in order to obtain compounds that have a higher selectivity towards different kinases.26

Fig. 5. Structure of Staurosporine.

The structure was determined shortly after isolation, but this original assignment was found to be incorrect. In 1994 the correct relative stereochemistry was finally established,27 and it was later confirmed by total synthesis. The first total synthesis of staurosporine was reported by Link et al. in 1996.28 This was achieved by initially synthesising an aglycon from benzyloxymethyl dibromomaleimide (13) as shown in Scheme 3.

Scheme 3



The sugar moiety was then synthesised by treating the oxazolidinone glycal epoxide (16) with the sodium salt of the aglycon (15) to obtain the indole glycoside (17) as shown in Scheme 4.

Scheme 4

A Barton deoxygenation was then used to remove the C2’ hydroxyl group, and the C6’ PMB and indolic SEM groups were deprotected. The indolocarbazole moiety was then completed using photolytic oxidative cyclisation as shown in Scheme 5. Following another twelve steps and removal of the protecting groups this compound was converted to staurosporine (12).



Scheme 5



Rebeccamycin

Rebeccamycin (7) was isolated from cultures of Lechevalieria aerocolonigenes from a soil sample from Panama in 1985. It was found to act against leukemia and melanoma in mice, and also against human adenocarcinoma cells.29 Rebeccamycin works by inhibiting topoisomerase I by forming a DNA-topoisomerase I-drug complex that prevents the religation of the DNA strand after it has been cleaved by the enzyme. Analogues of rebeccamycin are currently in several clinical trials for cancer therapy.

Fig. 6. Structure of Rebeccamycin.

The structure and absolute configuration of rebeccamycin were determined by X-ray crystallography and total synthesis. The first total synthesis of rebeccamycin was reported in 1985 by Kaneko et al.30 This synthesis contains the first example in the literature of a coupling reaction between an indolocarbazole and a complex carbohydrate. In their total synthesis the Grignard reagent of 7-chloroindole (21) was coupled with benzyloxymethyl dibromomaleimide to give the bisindolylmaleimide (22). The indolocarbazole (23) was then formed by photocyclisation. Coupling of the aglycon (23) with bromo pyranose in the presence of Ag2O yielded the N-glycoside (24). Rebeccamycin was then formed by deprotection of the imide and carbohydrate.



Scheme 6

In 1993, Danishefsky improved the synthesis of rebeccamycin by extending glycal epoxide chemistry to the preparation of indole-N-glycosides.31 It was found that indoles were stronger glycosyl acceptors than indolocarbazoles. In this synthesis the maleimide was coupled with the glycoside furnished by the epoxide before cyclisation, in contrast to the earlier method developed by Kaneko et al.

Azaindolocarbazoles

In 2002, the first work carried out towards azaindolocarbazoles was reported by Routier et al.14 They described a preparation of new symmetrical and nonsymmetrical azaindolocarbazoles. This was achieved by first reacting 7-azaindole with 2,3-dibromo-N-methylmaleimide (13), as shown in Scheme 7. The indolic nitrogen was then protected using a benzenesulfonyl group, and anionic condensation of the compound with 7-azaindole using LiHMDS in toluene gave the bisindolylmaleimide (27).



Scheme 7

The second indolic nitrogen was then protected using a tert-butyloxycarbonyl group and the N-benzenesulphonyl group was cleaved. Following photocyclisation and final deprotection, the symmetrical azaindolocarbazole (29) was formed, as shown in Scheme 8.

Scheme 8

Bisindolylmaleimide Derivatives

Staurosporine has become the lead compound among protein kinase inhibitors. One modification of this compound has led to the bisindolylmaleimides, which are considered to be part of the family of indolocarbazole compounds due to their close biosynthetic relationship. Many bisindolylmaleimides, such as LY333531 (ruboxistaurin) (31), shown in Fig. 7, have also been shown to be highly selective inhibitors of various protein kinases.32 By using a benzofuran moiety instead of an indole unit (32), selectivity against other kinases has been reported, in particular glycogen synthase kinase 3β (GSK-3β) inhibition.33 Some of these compounds show picomolar inhibitory activity toward GSK-3β and an enhanced selectivity against cyclin-dependent kinase 2 (CDK-2). Vascular endothelial growth factor receptors (VEGF-Rs), a type of protein kinase, play important roles in mediating the effects of hypoxia-induced VEGFs on pathophysiological angiogenesis, such as in pancreatic carcinomas, gastric carcinomas, colorectal carcinomas, breast cancer, lung cancer, prostate cancer, and melanoma.34 However, since VEGF-Rs and other protein kinases use ATP as a cofactor for the phosphorylation of proteins, they share a highly conserved ATP binding pocket, which is the site where most inhibitors bind. This means that many compounds developed to selectively inhibit one VEGF-R will also show affinity for other VEGF-Rs and for other tyrosine kinase receptors. Therefore it is very important to find drugs that are both potent and selective. It has been shown that 3-(1H-indole-3-yl)-4-(3,4,5-trimethoxyphenyl)-1H-pyrrole-2,5-dione (33), shown in Fig. 7, is a highly active and selective inhibitor of VEGF-R2.35 This compound is a slight variation of the bisindolylmaleimide framework where one of the indole units is replaced by a trimethoxy-phenyl unit and the maleimide ring is replaced by a lactam ring.

Fig. 7. Structure of LY333531 (31), a benzofuran-3-yl(indole-3-yl)maleimide (32) and 3-(1H-indole-3-yl)-4-(3,4,5-trimethoxyphenyl)-1H-pyrrole-2,5-dione (33).

The synthesis of LY333531 was first carried out in 1996 by Jirousek et al.36 This was achieved by reacting the corresponding alcohol (34) with methanesulfonic anhydride to produce a mesylate that underwent displacement with dimethylamine.



The synthesis of compound 33 (Fig. 7) was first carried out by Peifer et al. in 2008.35 This was achieved by first protecting the indole nitrogen of a suitable precursor (36) with an SEM protecting group and then performing an aldol condensation, as shown in Scheme 9.

Scheme 9

Acid conditions were then used to generate the compound 3-(1H-indole-3-yl)-4-(3,4,5-trimethoxyphenyl)-1H-pyrrole-2,5-dione (33). The benzofuran-3-yl(indole-3-yl)maleimides (32), shown in Fig. 7, were first synthesised by Gaisina et al. via a condensation between the appropriately substituted 3-indolylglyoxylic acid esters (38) and benzofuranyl-3-acetamides (39).33 The indolyl-based glyoxylates were prepared by acylating the N-alkylated indole with ethyl oxalyl chloride. The substituted 3-benzofuranone was reacted with (carbethoxymethylene)triphenylphosphorane to give the ethyl (1-benzofuran-3-yl)acetates, which were subsequently converted into the corresponding acetamides. Condensation of the acetamides with the glyoxylic esters resulted in formation of the maleimide core (40), as shown in Scheme 10.



Scheme 10

These bisindolylmaleimide compounds can often be converted to the corresponding indolocarbazoles by a cyclisation between the two C-2 indolic carbon atoms. Several routes have been described to carry out this intramolecular ring closure, such as photolytic activation31 or the use of palladium-catalysed cross-coupling reactions.39 Compounds in the indolo[2,3-a]carbazole series are generally more active than similar compounds in the bisindolylmaleimide series. However, bisindolylmaleimide derivatives with potent antitumor and antiangiogenic properties have been synthesised.40

Results and Discussion

Synthesis

The overall objective of this project was to produce novel kinase inhibitors by modification of the bisindolylmaleimide structure, replacing one of the indoles with a pyridine and utilising novel heterocycles on ring E. The specific aims were to synthesise useful reactive intermediates (42), to convert these reactive intermediates to novel heterocyclic products (43), and to assign and characterise the novel potential kinase inhibitors fully by NMR, LCMS, etc.



Fig. 11.

The approach used to synthesise the target molecules was to couple an activated carboxylic acid (45) with a nucleophile (44). Nucleophiles included a Grignard reagent and the anion of an acetonitrile and ester to form a key intermediate (46). The connecting heterocycle was then formed by reaction of the intermediate with a bis-nucleophile. A number of different heterocycles could be formed on each intermediate.

Scheme 11



Synthesis of Ketone Intermediate

Initially the synthesis of a ketone was undertaken to probe its utility as an intermediate towards kinase inhibitor synthesis. The first step was to form the Weinreb amide (48), which was then converted to the corresponding ketone (49) using a Grignard reagent. MeMgCl was used as a trial reagent, to be replaced with more elaborate Grignard reagents once proof of concept was established.

Scheme 12

The next step required the protection of the indolic nitrogen. However, attempts to achieve this using a methyl protecting group and a p-toluenesulfonyl group both failed.

Scheme 13

Synthesis of β-Keto Ester Intermediate

A different route was then attempted in order to reach the target molecules. The first step was to take indole-3-acetic acid (47) as the precursor and dimethylate it in order to protect both the indole nitrogen and the carboxylic acid. This was achieved by reacting the indole-3-acetic acid with potassium carbonate and dimethyl carbonate in DMF. The next step was to couple the protected indole (52) with nicotinic acid to form the β-keto ester (53) using a method similar to that outlined by Brana et al.32

Scheme 14

The nicotinic acid (55) was activated using 1,1’-carbonyldiimidazole.

Scheme 15

The novel β-keto ester intermediate was then reacted with hydrazine and camphoric acid to form a pyrazolone (56).

Scheme 16



The intermediate was also reacted with guanidine carbonate to form a different connecting heterocycle. However, attempts to form the isocytosine product (57) failed. The mechanism by which the β-keto ester is converted to the pyrazolone is shown in Scheme 17.

Scheme 17

Initially, protonation of the carbonyl group oxygen by the camphoric acid causes the carbon to become more electrophilic. The lone pair of electrons on the hydrazine nitrogen can now attack the carbonyl carbon resulting in the delocalisation of electron density onto the oxygen. A proton is then transferred from the positively charged nitrogen to the oxygen and a delocalisation of electron density from the same nitrogen then results in the expulsion of a water molecule. Attack of the camphoric acid anion then results in the deprotonation of the positively charged nitrogen. Movement of the lone pair of electrons from the second hydrazine nitrogen onto the ester carbonyl carbon results in the ring-forming step. A delocalisation of electron density from the oxygen results in the expulsion of a methoxide ion which then takes a proton from the positively charged nitrogen to form a molecule of methanol. Tautomerisation of the molecule then results in the formation of the pyrazolone product (56).
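The net outcome of this mechanism, in which the β-keto ester and hydrazine give the pyrazolone with loss of one molecule of water and one of methanol, can be checked with a simple atom balance. The sketch below is illustrative only: the molecular formulae are worked out here from the compound names given in the experimental section (they are assumptions, not analytical data from the report), and the helper names `parse_formula` and `side_total` are hypothetical.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C18H16N2O3'.

    Only element symbols followed by optional integer counts are handled
    (no brackets, charges or hydrates).
    """
    atoms = Counter()
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms[symbol] += int(count) if count else 1
    return atoms


def side_total(formulas) -> Counter:
    """Sum the atom counts of every species on one side of an equation."""
    total = Counter()
    for formula in formulas:
        total += parse_formula(formula)
    return total


# Formulae derived from the compound names; assumptions of this sketch.
KETO_ESTER_53 = "C18H16N2O3"  # methyl 2-(1-methylindol-3-yl)-3-oxo-3-(pyridin-3-yl)propanoate
PYRAZOLONE_56 = "C17H14N4O"   # 4-(1-methylindol-3-yl)-5-(pyridin-3-yl)pyrazol-3-one
HYDRAZINE = "N2H4"
WATER = "H2O"
METHANOL = "CH4O"

reactants = side_total([KETO_ESTER_53, HYDRAZINE])
products = side_total([PYRAZOLONE_56, WATER, METHANOL])

is_balanced = reactants == products
print("reactants:", dict(reactants))
print("products: ", dict(products))
print("balanced: ", is_balanced)
```

A balance of this kind confirms only that the proposed by-products account for every atom; it says nothing about the order of the individual steps.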



Synthesis of β-Keto Nitrile Intermediate

The first step was to protect the indole nitrogen on indole-3-acetonitrile (58). This was achieved using a methyl protecting group introduced via standard NaH/MeI conditions. The next step was to couple the protected indole (59) with activated nicotinic acid (54) to form the novel β-keto nitrile intermediate (60).

Scheme 18

The final step was to form the aminopyrazole ring (61), employing hydrazine hydrate in a similar fashion to that seen previously in the synthesis of the pyrazolone moiety.

Scheme 19

The mechanism by which the β-keto nitrile is converted to the aminopyrazole is highlighted in Scheme 20.



Scheme 20

Initially, protonation of the carbonyl group oxygen by the camphoric acid causes the carbon to become more electrophilic. The lone pair of electrons on the hydrazine nitrogen can now attack the carbonyl carbon resulting in the delocalisation of electron density onto the oxygen. A proton is then transferred from the positively charged nitrogen to the oxygen and a delocalisation of electron density from the same nitrogen then results in the expulsion of a water molecule. Attack of the camphoric acid anion then results in the deprotonation of the positively charged nitrogen. Movement of the lone pair of electrons from the second hydrazine nitrogen onto the acetonitrile carbon results in the ring-forming step. Another proton transfer results in the movement of a hydrogen atom on the nitrogen from the hydrazine to the acetonitrile nitrogen. Tautomerisation of the molecule then results in the formation of the aminopyrazole product (61).



Conclusion

The overall objective of this project was to produce novel kinase inhibitors by modification of the bisindolylmaleimide structure, replacing one of the indoles with a pyridine and utilising novel heterocycles in place of the lactam/maleimide ring. This project constitutes the first synthesis of a new compound class with vast scope for future derivatisation. The initial Grignard coupling chemistry was found to be unviable in this instance and another route, through base-mediated coupling, was pursued. From this, two key intermediates were successfully synthesised, the β-keto nitrile and the β-keto ester containing indole and pyridine subunits, and were fully characterised. These versatile intermediates yielded a pyrazolone and an aminopyrazole by condensation with hydrazine and are now ripe for further derivatisation and substitution. The novel potential kinase inhibitors were fully characterised, as can be seen in the experimental section.
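Since LCMS is among the characterisation methods named in this report, the m/z expected for a protonated molecular ion can be estimated from a molecular formula. The following minimal sketch assumes positive-ion detection as [M+H]+ (the ionisation mode is not stated in the report) and uses atom counts worked out here from the compound names; both are assumptions, and the function names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of H+ (a hydrogen atom minus one electron)

# Atom counts derived from the compound names; assumptions of this sketch.
COMPOUNDS = {
    "beta-keto ester (53)": {"C": 18, "H": 16, "N": 2, "O": 3},
    "pyrazolone (56)": {"C": 17, "H": 14, "N": 4, "O": 1},
    "beta-keto nitrile (60)": {"C": 17, "H": 13, "N": 3, "O": 1},
}


def monoisotopic_mass(atoms: dict) -> float:
    """Sum the monoisotopic masses of all atoms in a neutral molecule."""
    return sum(MONOISOTOPIC[element] * count for element, count in atoms.items())


def mz_protonated(atoms: dict) -> float:
    """Return the m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(atoms) + PROTON


for name, atoms in COMPOUNDS.items():
    print(f"{name}: [M+H]+ = {mz_protonated(atoms):.4f}")
```

Comparing such calculated values with the observed LCMS peaks is a quick consistency check on a proposed structure, although it cannot distinguish isomers.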

Experimental

Synthetic Procedures

2-(1H-indol-3-yl)-N-methoxy-N-methylacetamide (48)



1-(1H-indol-3-yl)propan-2-one (49)

1-(1-Methyl-1H-indol-3-yl)propan-2-one (50)



1-(1-Tosyl-1H-indol-3-yl)propan-2-one (51)

N-Methyl methyl indol-3-yl acetate (52)

(1H-Imidazol-1-yl)(pyridin-3-yl)methanone (54)



Methyl 2-(1-methyl-1H-indol-3-yl)-3-oxo-3-(pyridin-3-yl)propanoate (53)

4-(1-Methyl-1H-indol-3-yl)-5-(pyridin-3-yl)-1H-pyrazol-3(2H)-one (56)



2-Amino-5-(1-methyl-1H-indol-3-yl)-6-(pyridin-3-yl)pyrimidin-4(3H)-one (57)

2-(1-Methyl-1H-indol-3-yl)acetonitrile (59)



2-(1-Methyl-1H-indol-3-yl)-3-oxo-3-(pyridin-3-yl)propanenitrile (60)

4-(1H-Indol-3-yl)-5-(pyridin-3-yl)-1H-pyrazol-3-amine (61)



1 Rang, H. P.; Dale, M. M.; Ritter, J. M. Pharmacology (1999) 4th Edition, Churchill Livingstone.
2 Love, R. R.; Leventhal, H.; Easterling, D. V.; Nerenz, D. R. Cancer. (2006) 63, 604-612.
3 Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. Science. (2002) 298, 1912.
4 Liao, J. J.-L. Med. Chem. (2007) 50, 409-424.
5 McGregor, M. J. J. Chem. Info. Model. (2007) 47, 2374-2382.
6 Noble, M. E. et al. Science. (2004) 303, 1800-1805.
7 Yoshinori Yamashita, Noboru Fujii, Chikara Murakata, Tadashi Ashizawa, Masami Okabe, and Hirofumi Nakano; Biochem. (1992) 31, 12069-12075.
8 Champoux, J.J. Annu. Rev. Biochem. (2001) 70, 369-413.
9 Mitscher, L.A. Chem. Rev. (2005) 105, 559-592.
10 Yoshinari, T.; Yamada, A.; Uemura, D.; Nomura, K.; Arakawa, H.; Kojiri, K.; Yoshida, E.; Suda, H.; Okura, A. Cancer Res. (1993) 53, 490-494.
11 P. Moreau; N. Gaillard; C. Marminon; F. Anizon; N. Dias; B. Baldeyrou; C. Bailly; A. Pierre; J. Hickman; B. Pfeiffer; P. Renard; M. Prudhomme. Bioorg. Med. Chem. (2003) 11, 4871–4879.
12 P. Schupp; C. Eder; P. Proksch; V. Wray; B. Schneider; M. Herderich; V. Paul; J. Nat. Prod. (1999) 62, 959–962.
13 Arimondo, P. B.; Baldeyrou, B.; Laine, W.; Bal, C.; Alphonse, F. A.; Routier, S.; Coudert, G.; Merour, J. Y.; Colson, P.; Houssier, C.; Bailly, C. Chem. Biol. Int. (2001) 138, 59-75.
14 Routier, S.; Coudert, G.; Merour, J. Y.; Caignard, D. H. Tet. Lett. (2002) 43, 2561-2564.
15 S. Tanida, M. Takizawa, T. Takahashi, S. Tsubotani and S. Harada, J. Antibiot. (1989) 42, 1619–1630.
16 M. Kubbies, B. Goller, E. Russmann, H. Stockinger and W. Scheuer, Eur. J. Immunol. (1989) 19, 1393–1398.
17 J. B. McAlpine; J. P. Karwowski; M. Jackson; M. M. Mullally; J. E. Hochlowski; U. Premachandran; N. S. Burres, J. Antibiot. (1994) 47, 281–288.
18 P. Hoehn-Thierry; O. Ghisalba; H. Peter; T. Moerker. WIPO Pat., WO9500520. (1995).
19 K. M. McCoy and C. J. Hatton, US Pat., US4735939. (1988).
20 G. Conseil; J. M. Perez-Victoria; J.M. Jault; F. Gamarro; A. Goffeau; J. Hofmann; A. Di Pietro. Biochemistry. (2001) 40, 2564–2571.
21 P. Lazarovici; D. Rasouly; L. Friedman; R. Tabekman; H. Ovadia; Y. Matsuda. Adv. Exp. Med. Biol. (1996) 391, 367–377.
22 Brunton, R. J.; Drayson, F. K.; Plant, S. G. P.; Tomlinson, M. L. J. Chem. Soc. (1956) 4783.
23 Bhide, G. V.; Tikotkar, N. L.; Tilak, B. D. Chem. Ind. (1957) 363.
24 Omura, S.; Iwai, Y.; Hirano, A.; Nakagawa, A.; Awaya, J.; Tsuchiya, H.; Takahashi, Y.; Masuma, R. J. Antibiot. (1977) 30, 275-282.
25 Akinaga, S.; Sugiyama, K.; and Akiyama, T. Anti-Cancer Drug Des. (2000) 15, 43-52.
26 Knolker, H. J.; Reddy, K. R. Chem. Rev. (2002) 102, 4003-4427.
27 N. Funato; H. Takayanagi; Y. Konda; Y. Toda; Y. Harigaya; Y. Iwai; S. Omura. Tetrahedron Lett. (1994) 35, 1251–1254.
28 J. T. Link; S. Raghavan; M. Gallant; S. J. Danishefsky; T. C. Chou; L. M. Ballas, J. Am. Chem. Soc. (1996) 118, 2825–2842.
29 Bush, J. A.; Long, B.H; Catino, J.J.; Bradner, W. T.; Tomita, K. J. Antibiot. (1987) 40, 668-678.
30 T. Kaneko; H. Wong; K. T. Okamoto; J. Clardy. Tetrahedron Lett. (1985) 26, 4015–4018.
31 Gallant, M.; Link, J. T.; Danishefsky, S. J. J. Org. Chem. (1993) 58, 343.
32 Brana M. F.; Gradillas A.; Ovalles, A. G.; López B.; Acero N.; Llinares, F.; Mingarro, D. M. Bioorg. Med. Chem. (2006) 14, 9-16.
33 Gaisina, I.N.; Gallier, F.; Ougolkov, A.V.; Kim, K.H.; Kurome, T.; Guo, S.; Holzle, D.; Luchini, D.N.; Blond, S.Y.; Billadeau, D.D.; Kozikowski, A.P. J. Med. Chem. (2009) 52, 1853-1863.
34 Kiselyov, A.; Balakin, K. V.; Tkachenko, Expert Opin. Invest. Drugs. (2007) 16, 83-107.
35 Peifer, C.; Selig, R.; Kinkel, K.; Ott, D.; Totzke, F.; Schachtele, C.; Heidenreich, R.; Rocken, M.; Schollmeyer, D.; Laufer, S. J. Med. Chem. (2008) 51, 3814-3824.
36 Jirousek, M.R.; Gillig, J.R.; Gonzalez, C.M.; Heath, W.F.; McDonald, J.H.; Neel, D.A.; Rito, C.J.; Singh, U.; Stramm, L.E.; Melikian-Badalian, A. J. Med. Chem. (1996) 39, 2664–2671.
37 E. Caballero, M. Adeva, S. Calderon, H. Sahagun, F. Tome, M. Medarde, J. L. Fernandez, M. Lopez-Lazaro and M. J. Ayuso, Bioorg. Med. Chem. (2003) 11, 3413–3421.
38 Ohkubo, M.; Nishimura, T.; Homma, T.; Morishima, H. Tetrahedron (1996) 52, 8099-8112.
39 M. Prudhomme, Curr. Med. Chem. (2000) 7, 1189–1212.



Computer Sciences & Information Studies panel


Prof. Jonathan Blackledge, DIT (chair)
Prof. Eugene Coyle, DIT
Dr. Marek Rebow, DIT
Prof. Timo Hamalainen, University of Jyvaskyla, Finland
Dr. Sekharjit Datta, Loughborough University

Judges’ comments

The submission clearly illustrates progression from an original concept to the engineering and implementation of a neural network for a mobile robot. The material is clear, coherent, very well presented and articulated, and is worthy of publication in the open literature. One of the principal reasons for selecting this paper is the mature nature of the work that has been undertaken together with its recognised importance in the field of robotics, namely, robot positioning systems that support calibration of odometry errors in both indoor and outdoor environments. The material submitted was considered by the selection panel to be more in keeping with the product of a research student undertaking a master's or doctoral research programme than a final-year undergraduate student, and the author is to be congratulated on a premier example of a final-year undergraduate project.




Computer Sciences & Information Studies

Mobile robot localisation using neural networks

Haoming Xu

Abstract

The robot positioning algorithm is very important to the consistency of map building, and odometry is one of the most widely used methods for robot positioning. Currently, few robot positioning systems support calibration of odometry errors in both indoor and outdoor environments. To achieve this, the mobile robot has to be able to find task solutions in unknown environments, learning from experience and recognising similar new environments. This paper proposes a method in which a Feed-Forward neural network, trained with the standard Back-Propagation technique, is used for calibrating the odometry of both synchronous-drive and differential-drive mobile robots. This method integrates non-linear problem solving, environment recognition and learning capabilities. The neural network is trained once in a general environment and then performs nonlinear control based on its nonlinear input-output mapping ability, so it does not need to learn again in a new environment. The paper also examines a calibration method based on optimisation and compares it with the neural network. Results for paths of different lengths and shapes show that the neural network approach reduces the accumulated odometry error efficiently and improves the robot's accuracy during map building. Experimental results demonstrate that neural networks trained by Bayesian Regularisation provide improved performance and are suitable for this purpose.

I. INTRODUCTION

Autonomous mobile robot localisation, map building, and Simultaneous Localisation and Mapping (SLAM) are regarded as important competences in the field of robotics [1, 2]. Map building means estimating the position of the robot relative to the map and generating a map from accurate coordinate sensory input and estimates of the robot's pose [3]. Localisation means estimating the position of a robot in its map frame, given a map of the environment and the sensor readings [4]. SLAM places the robot at an unknown location in an unknown environment and has it simultaneously estimate the positions of newly perceived landmarks and determine its own position while mapping [5]. This paper focuses on localisation only. Odometry is one of the most widely used methods for positioning mobile robots, and accumulated error in odometry is usually caused by unequal floor contact, unequal wheel diameters and limited measurement resolution [6]. Although most of today's mapping and localisation systems are able to deal with noise in the odometry when estimating a robot's pose relative to its environment, odometry error remains a key problem in mobile robotics [3, 4]. Odometry provides adequate accuracy for map building only over a short period of time. Odometry error is difficult to estimate because it is hard to capture, and odometry usually leads to an accumulation of errors. In addition, the accumulated position and orientation errors increase in proportion to the distance travelled by the robot. Substantial position error accumulates during robot exploration; as a result, sensor information is incorporated into the map at the wrong locations, and the magnitude of this error increases over time [7].

Prior Work

Robot odometry error calibration is not a novel idea; odometry error has been investigated extensively. From a theoretical perspective, Kleeman presented a simple statistical error model for estimating a robot's position and orientation based on computing the odometry covariance matrix; the method relies on incrementally updating the covariance matrix in small time steps. The approach assumes that the odometry errors are exclusively random, zero-mean white noise [8]. Borenstein and Feng introduced a calibration method called UMBmark, which is based on solving geometric relations. The method estimates and calibrates the systematic errors of a differential-drive mobile robot, which are observed on a pre-programmed 4 m × 4 m closed square trajectory [9]. Chong and Kleeman presented an error model for a low-cost odometry system capable of achieving high-accuracy dead reckoning. The solution is obtained for non-systematic error on constant-curvature paths by solving a recurrence formula. The experimental design is the same as in Borenstein and Feng's paper [10]. Kelly builds on Chong and Kleeman's earlier work and develops a general solution for the nonlinear propagation dynamics of both systematic and random error in vehicle odometry, for any trajectory and any error model [11]. Roy and Thrun proposed a statistical method for calibrating the odometry of mobile robots, based on an efficient, incremental maximum-likelihood algorithm. The method is used to increase the accuracy of position estimation online during robot navigation in an unknown environment [12]. Martinelli and Siegwart proposed an augmented Kalman Filter (AKF) that addresses systematic error for both synchronous-drive and differential-drive mobile robot systems during navigation [13].

The major element missing from all the above work is a technique that can calibrate odometry error both indoors and outdoors. Those systems also share some common drawbacks, such as heavy computation, difficult implementation and system complexity. We do not sufficiently justify the need for a new method given Thrun's success with a maximum-likelihood online approach.

Motivation

The major drawback of such research is that orientation estimation does not work in large open spaces or feature-rich environments, and that the odometry error relationship has to be chosen before analysis. This paper presents a new approach for mobile robots that can adapt to both systematic and non-systematic error during map building. This allows the robot to perform more reliable position estimation in unknown areas of the environment, which means that the robot builds a more accurate map of the environment. Furthermore, our approach does not require any specific hardware, such as that used in the technique of Borenstein and Feng [9]. We address the odometry error calibration problem with a neural network so that the method is easy to apply and more flexible. Our approach treats the odometry error of a mobile robot as a nonlinear dynamical system [14]. A neural network provides solutions to nonlinear analysis problems, and it can learn the odometry error relationship directly from the data being modelled. On the other hand, neural network solutions have a drawback: sometimes the network cannot find the best solution because it becomes trapped in a local minimum from which it is hard to escape. Concerning the systematic component, we introduce four different Feed-Forward neural network schemes to estimate the error of a mobile robot. All four schemes employ the Back-Propagation algorithm, but they use different transfer functions and training functions. The purpose of this design is to compare the four schemes with each other and to identify the optimal neural network scheme for estimating robot odometry error. In this paper we also compare the performance of the neural network developed to solve the linearised calibration problem with the performance of an optimisation-based calibration approach.

Paper Structure

The paper is organised as follows: Section II describes the odometry error model for mobile robots with both synchronous and differential drive. Section III introduces the robot position estimation system and its data processing algorithm. Section IV explains the neural network designed for robot odometry error. Section V describes the four neural network schemes and the three test environments: Environment I is a straight line, Environment II adds a 90° turn, and Environment III can be composed of Environments I and II without great loss of accuracy. Section VI analyses the estimated error results of the different schemes, and Section VII concludes the paper.

II. ODOMETRY ERROR MODELS

The odometry error of a mobile robot consists of systematic and non-systematic error. Systematic error is usually caused by imperfections in the design and mechanical implementation of a mobile robot. For a differential-drive mobile robot, the key sources of systematic error are, first, unequal wheel diameters and, second, uncertainty about the effective wheelbase. Non-systematic odometry error is caused by irregular features of the environment [9]; for instance, a ceramic tile floor surface will cause wheel skid. To account for unequal wheel diameters, we define the robot's position and orientation error as

(1)

where P0 is the actual robot position at the beginning and P1 is the actual position at the end. Next, vl and vr are the translational velocities of the left and right wheels, respectively, and b is the distance between the left wheel and the right wheel. Rewriting P1 gives Eq. (2),

(2)

We denote Ôtrans and Ôrot as the robot's actual translation and rotation, respectively.

(3)

Recall that Otrans and Orot are the displacements measured by the robot. If robot odometry were 100% accurate, there would be no odometry error problem; in this condition Otrans = Ôtrans and Orot = Ôrot. Substituting Eq. (3) into Eq. (2), we obtain

(4)

The accumulated odometry error is influenced by two factors: a systematic error and a random error, where the random error has an expected value of zero. More specifically, the true rotation and translation can be rewritten as [12]

(5)

Here εrot and εtrans are the respective zero-mean random error variables in orientation change and distance travelled, and d is the distance travelled by the mobile robot. δtrans and δrot are the systematic error parameters for distance travelled and orientation change, respectively. So the expanded kinematics model for the robot's actual final position can be rewritten as follows:

(6)

Section VI-A presents a solution for Eq. (6) that is used to benchmark the neural network approach.
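The equations of this section did not survive in this copy, so the following is only a minimal sketch of one reading of the error model, not the paper's own formulation. It assumes that the systematic error grows linearly with the distance travelled d, that the corrected rotation is applied before the corrected translation, and that the zero-mean random terms are left out. The function name `propagate_pose` and its argument names are hypothetical.

```python
import math

def propagate_pose(x, y, theta, o_trans, o_rot, delta_trans=0.0, delta_rot=0.0):
    """Advance a planar pose by one odometry step.

    o_trans, o_rot: translation (m) and rotation (rad) reported by odometry.
    delta_trans, delta_rot: assumed systematic error per metre travelled.
    The zero-mean random errors are omitted from this sketch.
    """
    d = abs(o_trans)                          # distance travelled in this step
    true_trans = o_trans + delta_trans * d    # translation with systematic error
    true_rot = o_rot + delta_rot * d          # rotation with systematic error
    theta_new = theta + true_rot              # assumption: rotate, then translate
    x_new = x + true_trans * math.cos(theta_new)
    y_new = y + true_trans * math.sin(theta_new)
    return x_new, y_new, theta_new

if __name__ == "__main__":
    # Drive 6.5 m in 0.1 m steps with a small assumed rotational drift.
    pose = (0.0, 0.0, 0.0)
    for _ in range(65):
        pose = propagate_pose(*pose, o_trans=0.1, o_rot=0.0, delta_rot=0.01)
    print(pose)
```

With both systematic parameters set to zero the function reduces to plain dead reckoning, which is the 100% accurate case described above.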

III. POSITION ESTIMATION SYSTEM

The robot's actual trajectory is computed from data captured by a laser scanner in a real-time trajectory measurement system, which consists of two hardware components: a SICK Laser Measurement Sensor (LMS) 200 and a data acquisition computer. The laser scanner measures a pole placed on top of the mobile robot and centred on its axes. The pole is a hand-made paper pillar, three centimetres (cm) in diameter and 30 cm high, painted white with a low-reflectance varnish. The measurement data are sent to the computer via serial port communication (RS-232). The SICK LMS-200 is a planar scanning laser that provides distance measurements over a 180° area up to 80 metres (m) away and scans from right to left. In this system, the LMS is set as follows: angular resolution 0.5°, angular range 0° to 180°, and measurements in millimetre (mm) mode. With this format, a complete scanning cycle yields one dataset of 361 values. To track the robot's actual position, we need to find the pole's position in each scan. The data received from the laser contain a lot of noise, which introduces error into the calculation, so a sliding window filter is adopted to reduce the noise.

(7)

where:
Np = current data value after filtering
M = size of the filter window
di = the ith data value
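Eq. (7) is missing from this copy. As a hedged illustration only, the sketch below assumes that the sliding window filter is a plain running mean over the M most recent samples; the function name `sliding_window_filter` is hypothetical and the paper's actual filter may differ, for example by using a centred window.

```python
from collections import deque

def sliding_window_filter(data, window_size):
    """Return the running mean of `data` over the last `window_size` samples."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window = deque(maxlen=window_size)
    running_sum = 0.0
    filtered = []
    for value in data:
        if len(window) == window.maxlen:
            running_sum -= window[0]   # the oldest sample is about to drop out
        window.append(value)
        running_sum += value
        filtered.append(running_sum / len(window))
    return filtered

if __name__ == "__main__":
    noisy_ranges_mm = [5000, 5020, 4980, 5010, 3000, 3010, 2990, 5000]
    print(sliding_window_filter(noisy_ranges_mm, 3))
```

A larger window suppresses more noise but also blurs the sharp range change at the pole, so the window size trades smoothness against edge sharpness.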

Fig. 1. Data after sliding window filter.

Figure 1(a) illustrates the data when the pole is captured by the laser. The dashed circle marks the data that represent the pole, although the data are full of noise. After processing by the sliding window filter in Eq. (7), the data look like figure 1(b), where the pole position is highlighted in the rectangular box. After filtering, the gradient is calculated; the resulting pole position is shown in figure 2.

Fig. 2. The pole localisation method finds the largest turning point in the gradient curve.

The trajectory of the mobile robot is obtained by calculating each protrusion with Eq. (9).

(8)

(9)

where
GP = gradient of the laser data
r = gradient parameter
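Eqs. (8) and (9) are not recoverable from this copy. The sketch below is therefore an assumption-laden illustration of gradient-based pole detection rather than the paper's formula: it takes the gradient to be a forward difference over a lag of r samples and locates the pole between the sharpest range drop and the sharpest range rise that follows it. The function names are hypothetical.

```python
def range_gradient(ranges, r=1):
    """Forward difference of a range scan over a lag of r samples."""
    return [ranges[i + r] - ranges[i] for i in range(len(ranges) - r)]

def find_pole_index(ranges, r=1):
    """Return the beam index at the centre of a narrow, near object.

    The object is assumed to lie between the sharpest range drop and the
    sharpest range rise that follows it in the scan.
    """
    grad = range_gradient(ranges, r)
    # Sharpest drop: the beam moves from the background onto the pole.
    falling = min(range(len(grad)), key=lambda i: grad[i])
    # Sharpest rise after the drop: the beam leaves the pole again.
    rising = max(range(falling, len(grad)), key=lambda i: grad[i])
    first_on_pole = falling + r
    last_on_pole = rising + r - 1
    return (first_on_pole + last_on_pole) // 2

if __name__ == "__main__":
    # Synthetic scan: background at 5 m with a pole at 3 m on beams 10 to 12.
    scan_mm = [5000] * 10 + [3000] * 3 + [5000] * 10
    print(find_pole_index(scan_mm))
```

In practice the scan would be smoothed first, since isolated noise spikes also produce large gradients.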

Fig. 3. The laser data representing the marker pole's position.

Figure 3 illustrates the path followed by the robot from the perspective of a stationary laser range finder, which differs from the actual path traversed by the robot. The usable detection distance of the laser is set to 5 m in this experiment, which appears as the semicircular arc in figure 3; each point where the arc sticks out (protrudes) is the marker pole's position captured by the laser at time t. The closer together the positions, the lower the speed, as seen at the beginning and end of the path.

Fig. 4. Robot actual path.

Figure 4 shows the robot's actual path, obtained by applying standard geometry to each peak in figure 3 at which the robot was detected. The starting point in figure 4 is approximately 25 cm from the y-axis. This offset arises because the trail of the robot is located at 0 on the y-axis scale while the marker is placed at the centre of the robot's pathway, and the distance from the robot's trail to the central position is 25 cm.
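The "standard geometry" step can be illustrated with a polar-to-Cartesian conversion. This is a minimal sketch under stated assumptions: the laser sits at the origin, beam 0 points along the positive x-axis (the right-hand side, since the scan runs from right to left), and ranges are in millimetres. The constant and function names are hypothetical; only the 0.5° resolution and the 0° to 180° range come from the text.

```python
import math

ANGULAR_RESOLUTION_DEG = 0.5   # LMS setting given in the text
SCAN_START_DEG = 0.0           # first beam of the 0 to 180 degree scan

def beam_count(span_deg=180.0, resolution_deg=ANGULAR_RESOLUTION_DEG):
    """Number of range values in one complete scan, both end beams included."""
    return int(round(span_deg / resolution_deg)) + 1

def beam_to_xy(index, range_mm):
    """Convert one beam (index, range) into Cartesian coordinates in mm."""
    angle = math.radians(SCAN_START_DEG + index * ANGULAR_RESOLUTION_DEG)
    return range_mm * math.cos(angle), range_mm * math.sin(angle)

if __name__ == "__main__":
    print(beam_count())            # size of one dataset
    print(beam_to_xy(180, 3000))   # a pole seen straight ahead at 3 m
```

The beam count works out as 180 / 0.5 + 1 = 361, which matches the dataset size quoted in Section III.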

Fig. 5. Robot path capture system. The Pioneer 3-DX robot and SICK LMS-200 laser used in our research.

Our approach was tested using the Pioneer 3-DX robot shown in figure 5. The robot is equipped with two wheels and one caster. The robot comes with the Advanced Robotics Interface for Applications (Aria) software of MobileRobots Inc. (ActivMedia Robotics); our approach uses the Aria control software to estimate the robot position. The experiment area measures 8 m × 8 m and has a tiled floor. Each tile is a 30 cm × 30 cm square, and there is a gap of 0.5 cm between tiles. The sub-figure in figure 5 zooms in on part of the tiled area. The robot ran straight for about 6.5 m, during which the laser captured 16 usable datasets from which the robot's actual position could be calculated. Each laser dataset is paired with the Aria data of the corresponding time frame by matching timestamps. We capture these data for supervised learning. The data capture scene is displayed in figure 5.

According to the time frame, the 16 matched Aria records are divided into two samples, each covering a time interval of approximately six minutes. The datasets are constructed this way to keep neural network training time short and to give the odometry error calibration an appropriate time span. Each sample consists of eight Aria-estimated x positions and y positions, a starting time and an ending time. The Aria data are robot positions that contain distance measurement error. Aria positions are adopted because Aria determines the robot's position from its previous position, so error accumulates over time as the robot moves, owing to uncertainty in the measurements [6]. Although it is hard to calculate the exact average velocity of the robot from Aria data, timing is also a relevant factor in the robot's average velocity.
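The time matching between laser and Aria data can be sketched as a nearest-timestamp lookup. The paper does not say how the matching was implemented, so this is an assumption: records are taken to be (time, x, y) tuples, the Aria records are sorted by time, and each laser record is paired with the Aria record closest in time. The function names are hypothetical.

```python
from bisect import bisect_left

def nearest_record(aria_times, t):
    """Index of the Aria timestamp closest to laser time t.

    aria_times must be sorted in ascending order.
    """
    if not aria_times:
        raise ValueError("aria_times is empty")
    pos = bisect_left(aria_times, t)
    if pos == 0:
        return 0
    if pos == len(aria_times):
        return len(aria_times) - 1
    before, after = aria_times[pos - 1], aria_times[pos]
    return pos if (after - t) < (t - before) else pos - 1

def pair_by_time(laser_records, aria_records):
    """Pair each (time, x, y) laser record with the nearest Aria record."""
    aria_times = [record[0] for record in aria_records]
    return [(laser, aria_records[nearest_record(aria_times, laser[0])])
            for laser in laser_records]

if __name__ == "__main__":
    aria = [(0.0, 0, 0), (1.0, 750, 0), (2.0, 1500, 0)]
    laser = [(0.9, 700, 5), (2.4, 1790, 12)]
    for laser_record, aria_record in pair_by_time(laser, aria):
        print(laser_record, aria_record)
```

Each pair then supplies one training input (the Aria estimate) and one target (the laser measurement) for supervised learning.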

IV. NEURAL NETWORK MODELLING

The odometry error model of a mobile robot is nonlinear, and neural networks have generated considerable interest as an alternative to linearised modelling tools [14]. In this paper, a two-layer feed-forward neural network has been applied to robot odometry error modelling. The Back-Propagation algorithm was used to train this two-layer perceptron. Back-propagation works by applying the gradient descent rule to a feed-forward network and is employed to optimise the parameters of the network. The structure of the odometry error calibration system designed with the neural network is shown in figure 6.

Fig. 6. Block diagram of odometry error calibrating system with neural network.

τa is the Aria estimate, which is also the network training set, and can be described as follows.

(10)

where x1...8 are the x positions and y1...8 are the y positions reported by Aria, and ts and te are the robot's starting time and ending time, respectively. τn is the robot position estimate generated by the neural network. The structure of the neural network is shown in figure 7.

Fig. 7. Structure of neural network.

The neural network is described mathematically as follows,

(11)

where ω(1) and ω(2) are the weight matrices from the input nodes to the hidden layer and from the hidden layer to the output layer, F(1) and F(2) are the transfer functions of the hidden layer and output layer, respectively, and H denotes the hidden nodes.
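Since Eq. (11) is missing here, the forward pass of a two-layer feed-forward network is sketched below as an illustration, not as the paper's exact formula. The assumptions are: each layer includes a bias term, the hidden transfer function is the hyperbolic tangent (to which MATLAB's tansig is mathematically equivalent), the output layer is linear (purelin), and the layer sizes of 18 inputs, 13 hidden nodes and 2 outputs are one reading of the training set, the reported hidden node count and the laser target [lx, ly]. All function names are hypothetical.

```python
import math
import random

def layer(inputs, weights, biases, transfer):
    """Apply one fully connected layer: transfer(W . inputs + b)."""
    return [transfer(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, w1, b1, w2, b2):
    """Two-layer pass: tanh hidden layer followed by a linear output layer."""
    hidden = layer(inputs, w1, b1, math.tanh)
    return layer(hidden, w2, b2, lambda z: z)

def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights, used here only to make the example runnable."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_hidden, n_out = 18, 13, 2      # assumed sizes, see the text above
    w1 = random_matrix(n_hidden, n_in, rng)
    w2 = random_matrix(n_out, n_hidden, rng)
    sample = [0.0] * n_in                  # one normalised input vector
    print(forward(sample, w1, [0.0] * n_hidden, w2, [0.0] * n_out))
```

Training would adjust the weights and biases by back-propagation; the sketch covers only the input-output mapping.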

124 gradbook final.indd 124

22/10/2010 12:47:55


The network error function to be minimised at the network output is defined by

(12)

where τn indicates the output node and γa = [lx, ly] is the target set of the neural network; lx and ly are the x and y positions measured by the laser, which are taken as the true position of the robot. The number of hidden-layer neurons is important for a neural network: a larger number of hidden-layer neurons is more difficult to implement, as each neural network weight requires an integrator [15]. The optimised number of hidden-layer nodes is given by the following equation,

(13)

where ξ indicates the number of hidden-layer nodes, α indicates the number of input nodes, and β indicates the number of output nodes. Empirically, ξ = 13 hidden nodes gives the best performance for this neural network.

V. EXPERIMENTAL DESIGN

In this section, four neural network schemes are implemented for mobile robot odometry error modelling. The system randomly assigns 80% of the sample data to the training set and the remaining 20% to the test set. All sample data are normalised to the range -1 to 1 before training. The Sum Squared Error (SSE) and Mean Squared Error (MSE) are computed as the error functions. The MSE is defined in Eq. (14),

(14)
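The data preparation and error measures can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: normalisation is taken to be a linear min-max mapping onto [-1, 1], the 80/20 split is a seeded random shuffle, and SSE and MSE use their usual textbook definitions because Eq. (14) is missing from this copy. The function names are hypothetical.

```python
import random

def normalise(values):
    """Map values linearly onto the range [-1, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant input carries no information
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def split_samples(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into a training part and a testing part."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def sse(targets, outputs):
    """Sum of squared errors between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))

def mse(targets, outputs):
    """Mean squared error between targets and network outputs."""
    return sse(targets, outputs) / len(targets)

if __name__ == "__main__":
    train, test = split_samples(range(10))
    print(len(train), len(test))
    print(normalise([0.0, 3250.0, 6500.0]))
    print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Normalising onto [-1, 1] matches the output range of the tan-sigmoid transfer function used in the schemes that follow.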

Each network scheme produces several sets of simulated network outputs. There are four different types of experiment for testing. The environment designed for the experiment is outdoors on ceramic tile flooring, and all tests are pre-programmed for the Pioneer 3-DX robot.

Scheme I: Tan-sigmoid (tansig) and linear (purelin) transfer functions for the hidden layer and output layer, respectively, with the Bayesian regularisation (trainbr) training function.
Scheme II: Tan-sigmoid (tansig) for both the hidden layer and the output layer, with the Bayesian regularisation (trainbr) training function.
Scheme III: Tan-sigmoid (tansig) and linear (purelin) transfer functions for the hidden layer and output layer, respectively, with the Levenberg-Marquardt (trainlm) training function.
Scheme IV: Tan-sigmoid (tansig) for both the hidden layer and the output layer, with the Levenberg-Marquardt (trainlm) training function.

Most commonly, the transfer function is tan-sigmoid, pure-linear or log-sigmoid [16]. The pure-linear and tan-sigmoid transfer functions were chosen because log-sigmoid generates outputs in the range (0, 1), whereas in this paper all the sample data are normalised to the range -1 to 1.

Environment I: The robot ran straight ahead for up to 6.5 m at 0.75 m/s on the indoor tile floor.
Environment II: The robot ran forward for up to 5.5 m at 0.75 m/s on the indoor tile floor, then slowed down gradually until it had covered 6 m, turned right by 90° and ran forward for 1.8 m at 0.75 m/s.
Environment III: The robot was programmed to follow a 6 m × 2.5 m rectangular path on the tile floor. It travelled at a velocity of 0.75 m/s along the straight legs of the rectangle. Near the end of each leg the robot slowed down and then turned 90°. The linear velocity of the robot during turning was approximately 0.5 m/s.

Environments I and II are designed for a straight line and a 90° turn. While the robot runs along a straight line, the error in odometry deviates randomly [8]. The purpose of the straight-line trajectory is to test whether the neural network can be used to reduce random error, and the purpose of the 90° turn is to test robot slippage during turning. The path in Environment III is more complex and is composed of Environments I and II.

VI. EXPERIMENTAL RESULTS

The basic result of our evaluation is that the approach presented here identifies the best-performing neural network scheme for calibrating the robot's odometry error. The Mean Absolute Percentage Error (MAPE) in Eq. (15) is the preferred error measure for the network estimates, where EetEst is the network's estimated output for each observation and n is the total number of observations [17]. A smaller MAPE indicates a better scheme.

(15)

The mean, standard deviation and standard error in Eq. (16) of the validation results have been computed for the different network schemes across the training sets.

(16)
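Eqs. (15) and (16) are missing from this copy, so the sketch below uses the usual textbook definitions as an assumption: MAPE as the mean of the absolute relative errors expressed in percent, the sample standard deviation with an n - 1 denominator, and the standard error as the standard deviation divided by the square root of n. The paper may have used different conventions, and the function names are hypothetical.

```python
import math

def mape(actual, estimated):
    """Mean absolute percentage error; actual values must be non-zero."""
    total = sum(abs((a - e) / a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation with an n - 1 denominator."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def standard_error(values):
    """Standard error of the mean."""
    return sample_std(values) / math.sqrt(len(values))

if __name__ == "__main__":
    laser_mm = [1000.0, 2000.0, 3000.0]       # hypothetical reference positions
    network_mm = [1010.0, 1980.0, 3030.0]     # hypothetical network estimates
    print(mape(laser_mm, network_mm))
    print(mean(network_mm), sample_std(network_mm), standard_error(network_mm))
```

Because MAPE divides by the actual value, it is undefined when a reference position is exactly zero, so positions near the origin need separate handling.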

126 gradbook final.indd 126

22/10/2010 12:47:56


Table I presents the ensemble statistics for different measures of variation for the four schemes in Environment I, with all other parameters held constant.

The average MAPE values for schemes I to III are 1.28, 2.04 and 20.77, respectively. These results show that scheme I performs best in Environment I, and the same holds true for Environments II and III.



Fig. 8. Mean squared error results for the four different schemes. The x-axis represents the epoch and the y-axis represents the MSE.

Figure 8 shows the MSE for the four schemes in Environment I in one run. Convergence of training is judged from the variation of the MSE: once the MSE settles to a constant value after a number of epochs, the network training is considered to have converged. Figures 8a, 8b, 8c and 8d represent scheme I, scheme II, scheme III and scheme IV, respectively.

A. Optimisation-Based Error Calibration

To evaluate the method proposed in this paper, we compare it with an optimisation method on the same task. An optimisation approach with a criterion that minimises the final position error yields the values of the calibration parameters [18]. The optimisation approach is based on the expanded kinematics model with two calibration parameters. The robot's actual final position is given by Eq. (6); the expression needs to be rewritten in a form suitable for optimisation. The distance travelled by the robot between time steps K and K + 1 can be denoted as follows:

(17)

(18)

Here Otrans and Orot are the robot's actual translation and rotation, respectively, and T is the sampling time (s). The robot's translational velocity vt(K) is related to the angular velocities of the wheels by [19],

(19)



We express the velocity of the mobile robot in terms of these parameters and of the angular velocities of its left and right wheels, ψL and ψR, respectively. The expanded kinematics model with two calibration parameters for optimisation can then be expressed by rewriting Eq. (17) as

(20)

(21)

where δtrans and δrot are the calibration parameters that compensate for systematic error in the measured distance travelled and the measured orientation change, and d is the distance travelled by the robot. The optimisation criteria in this method are the minimum position error and the minimum orientation error between the estimated and actual poses. The optimisation criteria are denoted as follows.

(22)

(23)

where Iposition and Iorientation are the position criterion and orientation criterion, respectively, Est is the estimated pose and Actual is the robot's actual pose. The fminsearch function in MATLAB Optimisation Toolbox version 4.2 is used to solve unconstrained nonlinear multi-variable optimisation problems. The function is based on the Nelder-Mead sequential simplex search algorithm [20]. It is a direct method and does not require any gradient or Hessian evaluation. fminsearch is applied to compute the calibration parameters for each dataset collected in the same experiments, and the final calibration parameter values are the averages of the parameter values obtained from all the collected calibration datasets.
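To illustrate the kind of derivative-free search described here, the sketch below implements a basic Nelder-Mead simplex in pure Python and uses it to fit two calibration parameters. It is a simplified stand-in, not MATLAB's fminsearch: it uses reflection, expansion, inside contraction and shrinking only. The kinematic model `final_pose` and the squared final position error in `position_cost` are assumptions, because Eqs. (20) to (23) are missing from this copy, and all names are hypothetical.

```python
import math

def nelder_mead(f, x0, step=0.1, tol=1e-9, max_iter=1000):
    """Minimise f over a list of floats with a basic Nelder-Mead simplex."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        size = max(abs(a - b) for v in simplex[1:] for a, b in zip(v, best))
        if size < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other vertex halfway towards the best one.
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, vertex)]
                                    for vertex in simplex[1:]]
    return min(simplex, key=f)

def final_pose(steps, delta_trans, delta_rot):
    """Dead-reckon (translation, rotation) steps with assumed linear errors."""
    x = y = theta = 0.0
    for o_trans, o_rot in steps:
        d = abs(o_trans)
        theta += o_rot + delta_rot * d
        x += (o_trans + delta_trans * d) * math.cos(theta)
        y += (o_trans + delta_trans * d) * math.sin(theta)
    return x, y, theta

def position_cost(params, steps, actual_xy):
    """Squared final position error for parameters [delta_trans, delta_rot]."""
    x, y, _ = final_pose(steps, params[0], params[1])
    return (x - actual_xy[0]) ** 2 + (y - actual_xy[1]) ** 2

if __name__ == "__main__":
    steps = [(0.25, 0.0)] * 20                    # 5 m straight run
    target = final_pose(steps, 0.05, 0.02)[:2]    # synthetic "actual" position
    fitted = nelder_mead(lambda p: position_cost(p, steps, target), [0.0, 0.0])
    print(fitted)
```

Like any local simplex search, this can settle in a local minimum, which is the limitation of fminsearch noted later in the results.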

Fig. 9. Simulation results for the different trajectories in Environment I. The coordinate system used is polar, and the sub-figure uses a Cartesian system.

Fig. 10. Simulation results for the different trajectories in Environment II. The coordinate system used is polar, and the sub-figure uses a Cartesian system.

Fig. 11. Simulation results for the different trajectories in Environment III. The coordinate system used is polar, and the sub-figure uses a Cartesian system.

Figure 9 illustrates the estimation results in Environment I for the straight-line path. As the diagram indicates, the robot's final position estimated by the neural network approach is almost the same as the robot's actual position. The optimisation-based estimate is much better than the Aria estimate. The offset between the Aria estimate and the robot's actual path is approximately 750 mm. The resulting position trajectories for the robot turning 90 degrees are presented in figure 10. The figure reveals that the odometry error increased significantly during the turn, while after the turn the odometry error changed only slightly. At the end, the neural network estimate nearly matches the actual position, and the optimisation-based estimate is 10 cm from the actual position. A more extensive experiment is shown in figure 11. As we can see from the diagram, the robot's wheel slipped during the first turn. At the end of the trajectory, the neural network estimate is better than the optimisation-based estimate. This means that optimisation-based methods are not well suited to estimating odometry error in complex environments and at high speeds. Although Environment III was more complex than the previous two environments, the neural network approach still achieved good accuracy, demonstrating its robustness.

Fig. 12. Error comparison among three different calibration approaches in Environment I.

Figures 12, 13 and 14 show the time evolution of the position error for the Aria, optimisation-based and neural network methods. It can be seen from these figures that the neural network method can be rated as the best way to estimate the error in odometry. According to figure 12, the error in odometry rises substantially during the first four seconds: the uncalibrated error in robot odometry increased to approximately 600 mm within five seconds, at which time the errors in the neural network estimate and the optimisation-based estimate were 25 mm and 80 mm, respectively. From then on, the odometry error continued to grow, peaking at approximately 700 mm at about 11 seconds.

Fig. 13. Error comparison among three different calibration approaches in Environment II.



The data in figure 13 indicate that after the robot has run for more than 10 seconds, the error in odometry exceeds 400 mm, and the odometry error peaks at 1200 mm. Figure 14 shows the error comparison among the three methods in Environment III. As can easily be seen, the optimisation-based estimation error accumulates as the robot's running time and distance increase. The neural network estimation error decreases after 0.5ms, and by the end the error is reduced nearly to 0. At this time, the error in odometry is 40 cm. Figures 12-14 demonstrate that the neural network approach has the best accuracy, and that the optimisation-based method has better accuracy than the Aria estimate in Environments I and II but very low accuracy in Environment III. There are several possible reasons for the low accuracy of the optimisation-based method.

Fig. 14. Error comparison among three different calibration approaches in Environment III.

The first reason is that, as Eq. (20) and (21) suggest, the odometry error model assumes that the robot's pose error increases linearly with distance travelled [12], and it is only suitable for robots that perform very little pure rotation. In practice, Environments I to III are feature-rich environments, and figures 12-14 show that the error in odometry grows nonlinearly with distance travelled. The second reason is that the robot's translational errors are linear only on a straight trajectory at constant velocity [11]. However, figure 15 shows that the robot's actual velocity is not constant, which means that the error model is not appropriate in those environments. The third reason is the limitation of the fminsearch algorithm, which may return only a local solution. These three reasons are the main causes of the inaccuracy of the optimisation-based estimation. In addition, optimisation-based estimation does not cope well with robots moving at high speed. On the other hand, the neural network approach is rated as the best performer because it is an effective way to perform nonlinear control based on its nonlinear input-output mapping ability. It is able to learn rapidly and efficiently to estimate the error in odometry without knowledge of the nature of the error or of geometrical modelling.



Fig. 15. Time and velocity relation for Environments I, II and III.

Table IV summarises the results of the three experiments using the two different calibration approaches. The percentage error defined in Eq. (24) [21] is used to evaluate the calibration accuracy. The percentage values are presented for both calibration approaches. As the results in Table V indicate, the odometry error in the three extensive runs was 10.92%, 4.27% and 2.2%. The neural network algorithm reduced this to 1.2%, 0.45% and 0.17%, respectively, whereas the optimisation algorithm gave 2.80%, 2.64% and 4.04%, respectively. Thus, the neural network approach reduces the odometry error by 88.97%, 89.39% and 92.0%. Clearly, the neural network solution is an excellent approximation for errors in odometry. Throughout the test, the difference between the neural network estimate and the exact position does not exceed 5 cm. The neural network generalised in all cases, with better precision than the optimisation-based method. However, in the real experiments, the figures relating time to error magnitude show that the difference in accuracy between the two methods is slight at the start of the experiment but significant at the end. Moreover, using neural networks is simpler, since the implementation does not require a complex mathematical model of the odometry error.
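The reduction figures can be checked with a short calculation. This is a sketch that assumes the reduction is the relative decrease from the uncalibrated percentage error to the calibrated one; Eq. (24) is missing from this copy and may be defined differently. The function name `relative_reduction` is hypothetical.

```python
def relative_reduction(before_pct, after_pct):
    """Relative decrease, in percent, from before_pct to after_pct."""
    return 100.0 * (before_pct - after_pct) / before_pct

if __name__ == "__main__":
    uncalibrated = [10.92, 4.27, 2.2]
    neural_network = [1.2, 0.45, 0.17]
    optimisation = [2.80, 2.64, 4.04]
    for before, nn, opt in zip(uncalibrated, neural_network, optimisation):
        print(round(relative_reduction(before, nn), 2),
              round(relative_reduction(before, opt), 2))
```

Recomputed from the rounded percentages, the neural network reductions come out at roughly 89.0%, 89.5% and 92.3%, close to but not identical with the quoted figures, which were presumably derived from unrounded data. For the third run the optimisation result is negative, that is, 4.04% is larger than the uncalibrated 2.2%.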



B. Discussion

In the UMBmark method proposed by Borenstein and Feng, eight experiments were carried out with the robot running on a 4x4 m square path. In order to avoid slippage, the robot ran at a slow speed. The translational velocity was set to 0.2m/s on the straight segments, and the robot stopped completely before turning 90 degrees (on-the-spot rotation); during the turn, the linear speeds of the robot were approximately 0.2m/s and -0.2m/s. In Experiment 4, the speed was set to 0.75m/s, nearly four times faster than in Borenstein’s experiment, and the robot then decelerated; the linear speeds of the robot during turning were approximately 0.7m/s and -0.7m/s. Table VI summarises the two experiments. With the UMBmark method, the mean improvement over the eight experiments was 15.1-fold for the 4x4 m square path. In our experiment, the mean improvement was similar, at 14.8-fold. A mobile robot running at higher speed has a higher odometry error than one moving slowly [22], which means our approach is better suited to localisation at higher speeds. The advantage of the UMBmark method is that it does not need extra sensors. Its drawback is that the robot’s start and end positions must coincide. Further drawbacks are that the experiment is not suitable for robot localisation and that the odometry error relationship has to be chosen before calibration. The neural network approach, by contrast, can be applied to any mobile robot, and its ability to work in a real environment is a testament to the method’s robustness.

VII. CONCLUSION

In this paper, three odometry error calibration techniques for map building are implemented and experimentally compared using a mobile robot. First, the map building accuracy of mobile robots employing odometry was considered. To address the map building accuracy problem, neural network, optimisation and UMBmark based mobile robot odometry calibration methods were used. Compared with the optimisation-based and UMBmark calibration methods, the proposed neural network approach has several notable advantages. The key advantage of the proposed method is that it can fit any odometry model and works at high velocity. A further advantage is that the odometry error relationship does not need to be chosen before analysis, which means the method can be implemented easily. Experimental results obtained in feature-rich environments illustrate that the method implemented in this paper can reduce a mobile robot’s odometry error significantly, to at most 1.2% of the distance travelled. In addition, because the neural network is trained before the robot builds the map, the error model of the environment can be learnt by the neural network rapidly without human interference. This method is useful for map building in general indoor environments and works well in large open spaces or feature-rich environments.

REFERENCES

[1] A. Bicchi, F. Lorussi, P. Murrieri and V. G. Scordio, On The Problem of Simultaneous Localisation, Map Building, and Servoing of Autonomous Vehicles, Advances in Control of Articulated and Mobile Robots, Springer Verlag, 2004, pp. 223-239.
[2] M. Montemerlo, S. Thrun, D. Koller and B. Wegbreit, FastSLAM 2.0: An Improved Particle Filtering Algorithm for Simultaneous Localisation and Mapping that Provably Converges, In Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), Acapulco, Mexico, 2003, pp. 1151-1156.
[3] D. Hähnel, R. Triebel, W. Burgard and S. Thrun, Map Building with Mobile Robots in Dynamic Environments, In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), Taipei, Taiwan, 2003.
[4] D. Fox, W. Burgard and S. Thrun, Markov localisation for mobile robots in dynamic environments, Journal of Artificial Intelligence Research (JAIR) 11:391-427, 1999.
[5] C.C. Wang and C. Thorpe, Simultaneous Localisation and Mapping with Detection and Tracking of Moving Objects, In IEEE International Conference on Robotics and Automation (ICRA), Washington, DC, USA, May 2002.
[6] R. Siegwart and I. Nourbakhsh, Introduction to Autonomous Mobile Robots, The MIT Press, Cambridge, Massachusetts, 2004.
[7] B. Yamauchi, A. Schultz and W. Adams, Mobile Robot Exploration and Map-Building with Continuous Localisation, Proceedings of the 1998 IEEE International Conference on Robotics and Automation, Leuven, Belgium, May 1998, pp. 3715-37.
[8] L. Kleeman, Odometry Error Covariance Estimation for Two Wheel Robot Vehicles, Technical Report MECSE-95-1, Department of Electrical and Computer Systems Engineering, Monash University, 1995.
[9] J. Borenstein and L. Feng, Measurement and correction of systematic odometry errors in mobile robot, IEEE Transactions on Robotics and Automation, 1996, 12(6): 869-880.
[10] K.S. Chong and L. Kleeman, Accurate Odometry and Error Modelling for a Mobile Robot, Proceedings of ICRA 1997, Albuquerque, New Mexico, April 1997.
[11] A. Kelly, General Solution for Linearised Systematic Error Propagation in Vehicle Odometry, International Conference on Intelligent Robot and Systems (IROS01), Maui, Hawaii, USA, Oct. 29-Nov. 3, pp. 1938-1945, 2001.
[12] N. Roy and S. Thrun, Online self-Calibration For Mobile Robot, In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1999.
[13] A. Martinelli and R. Siegwart, Estimating the Odometry Error of a Mobile Robot during Navigation, In Proceedings of European Conference on Mobile Robots, ECMR03, 2003.
[14] I. Rivals, D. Canas, L. Personnaz and G. Dreyfus, Modelling and control of mobile robots and intelligent vehicles by neural network, IEEE Conference on Intelligent Vehicles, Paris, France, 1994.
[15] F.L. Lewis, Neural Network Control of Robot Manipulators, IEEE Intelligent Systems and Their Applications Vol. 11, IEEE Educational Activities Department, 1996.
[16] R. Japikse, D. Japikse and R. Pelton, Evaluation of neural networks for meanline model development, 10th International Symposium on Transport Phenomena and Dynamics of Rotating Machinery (ISROMAC-10), Honolulu, Hawaii, USA, 7-11 Mar 2004.
[17] G.R. Finnie and G.E. Wittig, AI Tools for Software Development Effort Estimation, Proceedings of the Conference on Software Engineering: Education and Practice, University of Otago, pp. 113-120, 1996.
[18] E. Ivanjko, I. Komšić and I. Petrović, Simple off-line odometry calibration of differential drive mobile robots, 16th International Workshop on Robotics in Alpe-Adria-Danube Region RAAD 2007, Ljubljana, Slovenija, pp. 164-169, 2007.
[19] J.C. Lagarias, J. A. Reeds, M. H. Wright and P. E. Wright, Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions, SIAM Journal of Optimisation Vol. 9 Number 1, pp. 112-147, 1998.
[20] A. Gianluca and C. Stefano, A Deterministic Filter for Simultaneous Localisation and Odometry Calibration of Differential-Drive Mobile Robots, 3rd European Conference on Mobile Robots 2007, Freiburg, Germany, 2007.
[21] P. Goel, I.R. Stergios and S.S. Gaurav, Robust Localisation Using Relative and Absolute Position Estimates, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyongju, South Korea, 1999.
[22] C. SeungKeun, Y. TaeKyung, C. MunGyu and L. JangMyung, Localisation of a high-speed mobile robot using global features, 4th International Conference on Autonomous Robots and Agents (ICARA 2009), Wellington, New Zealand, 2009.



Economics panel


Prof. Vani Borooah, UU (chair) Prof. Donal Dineen, UL Dr. Kevin Lalor, DIT

Judges’ comments

The winning entry addresses arguably the single most pressing problem facing the world economy, and one that provided the fuel for the credit bonanza which characterised Western economies until 2008. The essay provides an invaluable introduction to movements over time in the value of the renminbi. The essay then provides a succinct description of the various methods used by the People’s Bank of China to sterilise the currency in the face of rising demand for Chinese goods and services. Using these methods has generated its own problems – some of which manifest in a less content labour force – and the author examines many of these. In the main, the author argues in this well written and comprehensive submission, that sterilisation operations, by providing surplus funds to Western banks, freed them from the constraint of retail deposits in making loans and, thereby, sowed the seeds for the financial disaster which occurred after 2008.



Economics

A critical analysis of the sterilisation of the Yuan

Barry O’Donovan

Introduction

“Sterilisation implies the deliberate postponement of the adjustment of an economy’s domestic price level or external exchange rate to changes in the international economy.” Greenwood (2008)

Sterilisation is a form of central bank intervention used to manage and influence the exchange rate, essentially through the use of open market operations, the reserve ratio and credit quotas. My essay will focus on the sterilisation techniques imposed by the People’s Bank of China (PBC)¹ and how they affected the real exchange rate in the period after 2002. I will also discuss the broad implications of the sterilisation operations both for China and for the global economy. From 1994 to 2005 the Yuan was pegged² to the dollar before being allowed to appreciate moderately until July 2008, when it was repegged to the dollar following the onset of the financial crisis. Since 2002, China has amassed significant current account surpluses (see figure 1) amid extraordinary productivity growth, with real GDP growth averaging 10% per annum between 2002 and 2008 (CIA World Factbook).

Fig. 1. Source: IMF International Financial Statistics.

¹ The People’s Bank of China is the Chinese Central Bank.
² The Yuan was set at a rate of 8.28Y/$.



The massive trade surpluses generated by China in recent years have resulted in huge sums of foreign currency flooding into the Chinese economy. These large sums of foreign currency (mainly US dollars³) are deposited at commercial banks in exchange for Yuan.⁴ The PBC begins its intervention process by purchasing US dollars from the commercial banks. It prints excess Yuan in order to do this. This intervention prevents large amounts of dollars being released into the financial markets by the commercial banks, which would cause a surge in the supply of dollars and lead to a depreciation of the dollar relative to the Yuan. Such a depreciation would negatively affect China’s exports and also result in a capital loss on China’s dollar holdings. However, printing excess Yuan and selling it to the commercial banks (in exchange for US dollar reserves) causes a surge in the supply of Yuan. This would ordinarily increase lending by the commercial banks, as an increase in the money supply causes a reduction in interest rates, thus boosting demand for money. This would have inflationary effects on the Chinese economy as money growth increases substantially. The inflation, if permitted, would at some point render China uncompetitive at the prevailing exchange rate as prices become excessive, leading to pressure for an appreciation of the exchange rate (Greenwood, 2008). By sterilising the excess money available, the Chinese monetary authorities directly undertake a policy to systematically offset the natural inclination for the money supply to grow in a high growth economy, which would cause inflation and exchange rate adjustment.
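The sequence described above can be traced on a stylised central bank balance sheet. The following is a minimal sketch, not a description of the PBC’s actual accounts: the class, its method names and the starting figures are hypothetical, and it assumes the textbook identity that the monetary base equals net foreign assets plus net domestic assets. Only the 8.28 exchange rate is taken from this essay.

```python
from dataclasses import dataclass


@dataclass
class CentralBank:
    """Toy balance sheet: monetary base = foreign assets + domestic assets."""
    net_foreign_assets: float   # foreign reserves, valued in Yuan
    net_domestic_assets: float  # domestic credit, net of bonds issued

    @property
    def monetary_base(self) -> float:
        return self.net_foreign_assets + self.net_domestic_assets

    def buy_foreign_currency(self, dollars: float, yuan_per_dollar: float) -> None:
        """Pay for dollars with newly created Yuan, which expands the base."""
        self.net_foreign_assets += dollars * yuan_per_dollar

    def issue_sterilisation_bonds(self, yuan: float) -> None:
        """Sell bonds to commercial banks, withdrawing Yuan from circulation."""
        self.net_domestic_assets -= yuan


# Hypothetical starting position, in billions of Yuan.
pbc = CentralBank(net_foreign_assets=500.0, net_domestic_assets=500.0)
base_before = pbc.monetary_base

# Step 1: an unsterilised purchase of $10bn at the 8.28 peg raises the base.
pbc.buy_foreign_currency(dollars=10.0, yuan_per_dollar=8.28)
base_unsterilised = pbc.monetary_base

# Step 2: issuing bonds of equal value returns the base to its starting level.
pbc.issue_sterilisation_bonds(yuan=10.0 * 8.28)
base_sterilised = pbc.monetary_base

print(base_before, base_unsterilised, base_sterilised)
```

In this toy setting the foreign reserves stay on the balance sheet after step 2, while the extra Yuan created to buy them no longer circulates.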

Exchange rate undervaluation

Figure 2 illustrates an IMF study of the real effective exchange rate of the Yuan between 1994 and 2007 (a similar study by Citigroup achieved similar results). The study estimated that a 2% upward tilt per annum would need to be imposed on the value of the Yuan in order to bring it into line with its long term equilibrium value. This tilt takes account of productivity growth and the growth in trade surpluses⁵ and their effect on increasing the fundamental value of the Yuan. This upward tilt is consistent with the Balassa-Samuelson effect: as productivity increases in rapidly developing economies like China, wages rise, causing the price level to rise and leading to an eventual nominal appreciation of the exchange rate due to excessive inflation (Xiao, 2008). The graph captures the restraints imposed on the structural appreciation of the exchange rate between 2002 and 2007. With 100 given as the base rate in 1994, the exchange rate fluctuated above and below the long run equilibrium level between 1994 and 2002, before dipping below the equilibrium level for a sustained period between 2002 and 2007, to almost 25% below its long run equilibrium level. In the next section, I will investigate how sterilisation operations caused the Yuan to remain significantly below its equilibrium level for such an extended period and blocked the long-term adjustment of the exchange rate from occurring.

³ I will focus on dollars for the purpose of this paper due to the fact that the Yuan is pegged to the dollar and economists estimate over 70% of China’s foreign reserves are held in liquid dollar assets.
⁴ Chinese citizens must exchange a significant amount of their foreign currency at the commercial banks, ensuring that it is not widely traded (Humpage & Shenk, 2008).
⁵ See Appendix for a more detailed look at China’s recent growth.



Fig. 2.

Sterilisation Mechanisms and Effects

Sterilisation blocks the monetary, price and interest rate mechanisms that come into play to equilibrate imbalances (Mussa, 2007). David Hume’s (1752) widely accepted price-specie flow mechanism theory predicts (when adjusted to current times) that countries with large current account surpluses experience an increase in the money supply. By the quantity theory of money,⁶ a rise in domestic inflation results. This creates pressure for an appreciation of the exchange rate and an adjustment of the balance of payments as exports decrease and imports increase (Scanlon, 2009). The sterilisation operations and strict management of lending imposed by the People’s Bank of China (PBC) have largely prevented this mechanism from adjusting the imbalances that exist, resulting in massive current account surpluses being generated by China year on year since 2002.

By looking at the monetary base and the significant changes that have occurred in its make-up post 2002, we can in large part identify the sterilisation operations undertaken by the PBC. The noticeably rapid rise in foreign exchange reserves can be seen from the graph in Fig. 3. The percentage of foreign reserves to the monetary base changed from 51% in 2002 to 110% in 2006⁷ (Mussa, 2007). Greenwood (2008) estimates the PBC’s foreign exchange purchases to have been $1.8 billion per day in 2007. Foreign exchange reserves have now surpassed the 2 trillion dollar mark, climbing $141 billion in the third quarter of 2009 to $2.273 trillion.⁸ These reserves are largely invested in US bonds.

To sterilise the resulting rise in the monetary base, the PBC reduced net domestic assets⁹ (NDAs) from 49% of the monetary base in 2002 to -10% in 2006 (Mussa, 2007). This offsets the inflationary effects the Chinese monetary authorities would otherwise create by releasing onto the market the excess Yuan created in order to buy up foreign reserves. The PBC has reduced NDAs largely by limiting the amount of credit made available to the private sector via two sterilisation mechanisms.

⁷ See Fig. 3.
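The shares quoted from Mussa (2007) can be read against a simple accounting identity. The sketch below assumes, as is standard, that the monetary base is the sum of foreign reserves and net domestic assets, so the two shares must add up to 100%; the variable names are illustrative.

```python
# Shares of the monetary base, in per cent, as quoted from Mussa (2007).
shares = {
    2002: {"foreign_reserves": 51, "net_domestic_assets": 49},
    2006: {"foreign_reserves": 110, "net_domestic_assets": -10},
}

for year, composition in shares.items():
    total = sum(composition.values())
    # Under the identity base = foreign reserves + NDAs, shares sum to 100.
    print(year, composition, "total:", total)

# Change in each share between 2002 and 2006, in percentage points.
change = {
    item: shares[2006][item] - shares[2002][item]
    for item in shares[2002]
}
print(change)
```

The 59-point rise in the foreign reserve share is matched by a 59-point fall in the NDA share, which is the sterilisation described in the text expressed in shares of the monetary base.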

Fig. 3. Source: Mussa (2007).

The first mechanism is the issuance of bonds (known as ‘sterilisation’ bonds) to commercial banks, reducing the supply of money available to lend and anchoring it to the PBC. The PBC, governed by the State,¹⁰ effectively forces the commercial banks, many of which are state owned, to hold these sterilisation bonds worth billions of Yuan at below market interest rates (2.5% in 2006). This absorbs the increase in the money supply caused by the purchase of foreign reserves with printed Yuan. This sterilised intervention creates losses for the commercial banks as they are forced to reduce their lending and efficiency.

⁸ http://en.mercopress.com/2009/10/14/chinas-reserves-reach-2.273-trillion-usd-including-800-billionus-bonds.
⁹ Net domestic assets (NDAs) are composed of net claims on the central government, credit given to the private sector and statutory bodies, and other domestic assets.
¹⁰ The State Council has the final decision on all matters relating to monetary policy.



Since 2007, however, the use of this sterilisation tactic has diminished, as the PBC began to make losses on the spread between the interest earned on US bonds and the interest paid to commercial banks on the ‘sterilisation’ bonds issued. The losses arose because the rise in inflation in China following the ‘de-pegging’ of its exchange rate¹¹ caused the yield on ‘sterilisation’ bonds to increase, while the low interest rates implemented by the Federal Reserve in the US in 2007 caused the yields on short-term bonds to decline (McKinnon & Schnabl, 2009). This caused the PBC to switch its main mechanism of sterilisation to the reserve ratio. The reserve ratio increased from 6% in 2003 to 17.5% in June 2008, forcing commercial banks to further reduce the money they supply and to immobilise large sums of reserves at the central bank. The reserve ratio was raised sharply in 2007 to combat the net interest losses made on ‘sterilisation’ bonds (McKinnon & Schnabl, 2009).
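The effect of the higher reserve ratio can be illustrated with a simple calculation. This is a sketch under textbook assumptions rather than a model of the Chinese banking system: the deposit figure is hypothetical, and the simple money multiplier (one divided by the reserve ratio) assumes banks hold no excess reserves and the public holds no cash.

```python
def required_reserves(deposits, reserve_ratio):
    """Reserves a bank must hold at the central bank for given deposits."""
    return deposits * reserve_ratio


def simple_money_multiplier(reserve_ratio):
    """Textbook upper bound on deposits supported per unit of reserves."""
    return 1.0 / reserve_ratio


deposits = 1000.0  # hypothetical deposits, in billions of Yuan

# Reserve ratios quoted in the text for 2003 and June 2008.
for label, ratio in [("2003", 0.06), ("June 2008", 0.175)]:
    locked = required_reserves(deposits, ratio)
    multiplier = simple_money_multiplier(ratio)
    print(f"{label}: reserves held {locked:.0f}, multiplier {multiplier:.2f}")
```

On these assumptions, nearly three times as much of each deposit is immobilised at the central bank in June 2008 as in 2003, and the scope for banks to create credit from a given stock of reserves shrinks accordingly.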

Fig. 4. Source: Greenwood (2008).

¹¹ China relaxed its peg to the dollar between July 2005 and July 2008.

143 gradbook final.indd 143

22/10/2010 12:48:00


Figure 4 shows the impact of both sterilisation bonds and the reserve ratio in combating inflationary pressures caused by the rise in foreign reserves. Money supply increases need not be fully sterilised because the quantity theory of money shows that strong output growth (as experienced by the Chinese economy) absorbs proportional money growth increases (Makin, 2009). This is why foreign reserves can exceed sterilisation efforts without causing inflationary pressures, as money growth isn’t inflationary as long as it doesn’t exceed productivity growth. The PBC also imposed credit quotas on the banking sector to restrict liquidity growth. This ultimately led to the politically better-connected state enterprises receiving credit while small- and medium-sized enterprises were left to feel the pinch, resulting in an inefficient allocation of resources. The sterilisation operations created an environment of frugal domestic investment, inhibiting money growth and, ultimately, domestic growth. Between 2002 and 2007 the sterilisation policies employed by the People’s Bank of China amounted to 4.5 trillion Yuan, reducing net domestic assets from plus 2.2 trillion Yuan to minus 2.3 trillion Yuan (Mussa, 2009). The Chinese monetary authorities thus held the growth of net domestic assets below the growth rate of the economy in order to offset the large increases in the monetary base caused by the purchase of foreign reserves. Figure 5 gives us a clearer picture of how sterilisation became rampant from 2002 onwards, with ‘bonds issued’ rising enormously along with ‘deposits’ at the PBC to offset the rise in foreign reserves.¹²
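The quantity theory argument can be made concrete. The sketch below uses the growth-rate form of the equation of exchange MV = PY, under which inflation is approximately money growth plus velocity growth minus real output growth. The 10% output growth is the average quoted in this essay; the money growth rates and the assumption of constant velocity are purely illustrative.

```python
def implied_inflation(money_growth, output_growth, velocity_growth=0.0):
    """Approximate inflation from the growth-rate form of MV = PY."""
    return money_growth + velocity_growth - output_growth


output_growth = 0.10  # average real GDP growth quoted for 2002-2008

# Hypothetical money growth rates, with velocity assumed constant.
for money_growth in (0.08, 0.10, 0.15):
    inflation = implied_inflation(money_growth, output_growth)
    print(f"money growth {money_growth:.0%} -> inflation of about {inflation:+.0%}")
```

Only the portion of money growth above output growth shows up as inflation in this approximation, which is why sterilisation need only cover that excess.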

Fig. 5.

¹² Bonds issued and deposits appear as negative figures in the graph as they are reducing the money supply by absorbing credit in the private sector.



Critical Analysis of Sterilisation Operations

While the sterilisation operations have been largely effective in keeping the Yuan undervalued, the process cannot last indefinitely, given the net losses being made on the interest paid on sterilisation bonds against the yields earned on foreign bonds, as explained previously. The only way to offset the rise in inflation that is causing the higher interest rates would be to increase the interest rate further. This, however, would not only result in hot money¹³ surging into the country but also in greater losses on sterilisation operations. Lowering interest rates to reduce the interest payments would only increase demand for money and cause further inflation and further sterilisation. A catch-22 situation therefore emerges. Furthermore, the reserve ratio cannot be raised indefinitely, which ultimately limits sterilisation operations in the long run.

Domestic Demand

The effective sterilisation of domestic demand has resulted in excessive reliance on net exports and foreign direct investment. The Chinese government has consistently advocated the need for more sustainable growth by growing consumption and domestic demand, yet the sterilisation operations of the PBC inhibit any sustainable development from coming about. China hoards too much and consumes too little. Domestic consumption only accounts for 40% of China’s GDP in comparison to 70% in the US (Roubini, 2007). While I am not advocating that China increase its consumption levels to the exorbitant levels of the US, its current spending is extremely low by international standards for an economy in the midst of high growth. I would suggest that the sterilisation operations be unwound. Unwinding them would enable the banking system to work more efficiently, boosting credit to the private sector, which could stimulate the domestic growth that the Government strives for (Giavazzi, 2009).

Global Imbalances

The global implication of the sterilisation operations employed by China is that they sustain global imbalances. Comparing the savings/investment balance of China to that of the US largely explains this problem. Makin (2009) points out that capital outflow from China matches the external deficit of the US. To explain this we must analyse the foreign exchange reserves held by the PBC. These reserves are purchased by printing excess Yuan. By sterilising the printed Yuan to the extent that the excess Yuan are anchored to the PBC and do not increase money growth, the PBC artificially creates ‘savings’ of foreign reserves. These reserves represent excess saving over investment in China and are used to fund the US deficit. This leads to a vicious cycle of imbalance, as China’s sterilisation operations create artificial demand for US assets, keeping interest rates low in the US and borrowing high. The graph that follows illustrates this phenomenon (McKinnon & Schnabl, 2009).

¹³ Hot money refers to the money that flows in financial markets chasing the highest short-term interest rates.




Source: IMF: WEO, IFS.

The global imbalance problem or the ‘global savings glut’ as Ben Bernanke, current chairman of the Federal Reserve, describes the phenomenon, was arguably a root cause of the current financial crisis. China’s huge current account surpluses were a substantial part of the savings ‘glut’. By keeping interest rates artificially low in the US, China created an environment in the US for asset bubbles to emerge, notably in the property sector, the collapse of which triggered the financial crisis.

Further Criticisms

Sterilisation systematically maintains an undervalued Yuan. Therefore, the PBC’s policy of sterilised intervention is a violation of China’s obligation under the IMF Articles of Agreement to “avoid manipulating exchange rates or the international monetary system in order to prevent effective balance of payments adjustment” (Mussa, 2008). Mussa (2008) hypothesised what would have happened without the sterilisation techniques employed by the PBC from 2002 to 2007. If the PBC had increased net domestic assets in line with the increases in foreign asset acquisitions during this period, the monetary base would have grown to just short of the 20 trillion yuan mark, as opposed to the current figure of a relatively meagre 10.2 trillion yuan. Thus, had the sterilisation policies not been imposed and the monetary base almost doubled during the period, this would surely have placed unremitting pressure on the exchange rate to appreciate and allowed David Hume’s price-specie-flow mechanism to take full effect. This would almost certainly have corresponded to a change in the real effective exchange rate, bringing it into line with its long run equilibrium level and bringing China’s domestic growth into line with its productivity levels.



Arguments for maintaining sterilisation

By using sterilisation operations to maintain its peg to the dollar amid financial uncertainty worldwide, China shields itself from the possibility of a speculative attack on its currency similar to the attacks that befell its Asian neighbours when the Asian financial crisis hit in 1997-1998. The PBC further protects China from speculative attacks by accumulating foreign reserves to offset the risks of inflation triggered by a domestic crisis (Aizenman, 2008).

Another aspect we must be mindful of when criticising the sterilisation operations of China is the history of exchange rate policy undertaken by other East Asian countries in the midst of periods of high growth. Both Japan and Korea accumulated large hoards of foreign reserves as a self-insurance policy to deal with the fragility of their respective banking systems. There is a view that China cannot afford a more flexible exchange rate regime and thus should continue sterilisation until its banking and financial system is reformed and strengthened, since its banks are weak, undercapitalised and poorly regulated (Roubini, 2007). On the point of developing China’s banking system, Goldstein & Lardy (2009) argue that the banking system is impaired and underdeveloped precisely because of the sterilisation operations, which should therefore be unwound to allow for increased efficiency and functionality of the banking sector. Firstly, credit quotas result in an inefficient allocation of resources. Secondly, banks must deposit large amounts of reserves at the central bank earning below market interest rates, further reducing the efficiency and market functionality of the banking sector.

Xiao (2008) argues in favour of continued sterilisation, holding that the economy first needs to reach a state of full employment before a more flexible exchange rate system can be introduced. He argues that unwinding sterilisation procedures and allowing the Yuan to rise to its equilibrium value would cause a slowdown in the growth of China’s exports, creating unemployment problems, especially in the rural sector where large-scale unemployment exists.

Current situation

With China’s economic growth for 2009 and 2010 estimated to be 8.5% and 9% respectively, huge pressure will further mount on the Chinese monetary authorities to unwind their sterilisation procedures and address the problem of global imbalances by trying to stimulate domestic growth rather than continuing with sterilisation operations that are heavily focused on maintaining large current account surpluses. As mentioned, Chinese leaders have pledged to change China’s growth model from export-led to consumption-led development. This would allow the value of the Yuan to come into line with its long run equilibrium value (Business and Leadership, 2009). Recently, China has taken measures to increase domestic growth via a fiscal stimulus programme in a bid to stimulate both domestic and global demand and reduce global imbalances. However, the spending has largely been targeted at new infrastructure projects rather than at efforts to boost consumption, such as reforming the banking culture in China to provide more credit to the private sector. The recent domestic stimulus programme cannot, in my view, lead to sustainable development while the sterilisation practices of the PBC continue, effectively operating against the goals of the stimulus programme. Increasing domestic spending in conjunction with suppressing the real value of the Yuan is a bit like burning the candle at both ends. In short, China can either continue its large-scale intervention and sterilisation operations or significantly reduce its large external surplus. It cannot do both. Harvard University Professor Martin Feldstein, who recently stated, “China’s policy of expanding domestic spending while depressing the Yuan will lead to its economy overheating”, emphasises this point.¹⁴ Sustainable growth can only be achieved by unwinding sterilisation operations.

The Future for Sterilisation

As productivity increases remain substantial in the Chinese economy, it is only a question of time before the Yuan appreciates (if it does not, there will be an eventual surge in inflation, wreaking havoc on the value of the Yuan), and when it does there will be large capital losses on holdings of foreign reserves (currently standing at $2.273 trillion). A 20% revaluation of the Yuan against the major reserve currencies would cause a capital loss equivalent to 8% of China’s GDP (Goldstein & Lardy, 2006). Delaying the appreciation of the Yuan through continued sterilisation practices will only create larger losses, as the stock of reserves will be greater and the required appreciation will therefore also be larger (Roubini, 2007).
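The scale of the potential capital loss can be approximated with a back-of-the-envelope calculation. The sketch below is an illustration, not the method of Goldstein & Lardy (2006): it assumes the loss, measured in Yuan, is roughly the revaluation percentage applied to the stock of reserves, and the reserves-to-GDP ratios are hypothetical values chosen so that the first case reproduces the 8% figure quoted above.

```python
def capital_loss_share_of_gdp(reserves_to_gdp, revaluation):
    """Approximate Yuan capital loss on reserves as a share of GDP."""
    return reserves_to_gdp * revaluation


revaluation = 0.20  # 20% revaluation, as in the text

# Hypothetical ratios of foreign reserves to GDP.
for reserves_to_gdp in (0.40, 0.50, 0.60):
    loss = capital_loss_share_of_gdp(reserves_to_gdp, revaluation)
    print(f"reserves at {reserves_to_gdp:.0%} of GDP -> loss of {loss:.0%} of GDP")
```

For a given revaluation, the loss in this approximation grows in proportion to the stock of reserves, which is the sense in which delay raises the eventual cost.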

Conclusion

Sterilisation is a useful ploy to keep China’s exchange rate overly competitive in the short term. In the long term, however, it is not a viable exchange rate strategy. China must allow its currency to reach its long-term equilibrium level and thus loosen the controls it implements over monetary policy. As can be seen from the evidence presented throughout this paper, the longer China maintains its sterilisation operations, the harder a landing it will have when it finally unwinds them, as the fundamental value of the exchange rate will continue to rise over time in line with productivity increases. By artificially buying up foreign reserves with printed Yuan and using them to fund the US deficit, China created a run of massive current account surpluses, despite its exorbitant GDP growth levels, benefiting from the United States’ continued appetite for consumption (which China largely funded) between 2002 and 2008. By suppressing the growth of its domestic economy, China has been able to keep inflationary pressures to a minimum, but this has come at the expense of domestic growth. China can be heralded for strategically using sterilisation operations to the great benefit of its own economy in the short run, but in the long run it will pay the penalty for its shortsighted monetary policy. The arguments provided in this paper on balance greatly favour an unwinding of the sterilisation policies China employs. If the unwinding of sterilisation is delayed, it will wreak havoc on the Chinese economy in the future, causing enormous problems from which it may take years or even decades to recover. China’s long-term sustainable growth is being undermined by shortsighted sterilisation practices. For China to truly become a world power, sterilisation practices must be put to an end.

¹⁴ http://www.bloomberg.com/apps/news?pid=newsarchive&sid=awHX2QPENKgQ.



Appendix

Chinese Growth Miracle

Looking at productivity growth in the years 2002-2008, there is also a remarkable surge in growth levels, corresponding to an average of 10% real GDP growth in the period. This compares favourably with a US real growth rate of a mere 3%.

Source: CIA World Factbook.

This suggests that price levels should be converging between the two countries, but with a pegged exchange rate and sterilisation policies to minimise inflation, the Yuan has been kept at an undervalued level. Mussa (2006) shows in table 5 that between 2002 and 2006 the appreciation of the Yuan against the dollar was a mere 6%. However, during this period prices rose by 13.9% in the US as opposed to 7.6% in China. This makes for a real appreciation of a measly 1.4% by the end of 2006. That the Yuan appreciated by only 1.4% in real terms against the dollar, despite China’s productivity significantly outperforming that of the US and its current account surplus rising from 2% to 9.4% of GDP, shows the significant effects of the PBC’s sterilisation policies.
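The arithmetic behind a real appreciation figure can be sketched as follows. This is a minimal illustration that assumes the textbook definition of a bilateral real exchange rate (the nominal change scaled by relative price changes); the function name is hypothetical. Applied to the three percentages quoted above it gives a value close to zero rather than exactly the figure cited, which presumably rests on the specific price indices and dates in Mussa's table.

```python
def real_appreciation(nominal_pct, home_inflation_pct, foreign_inflation_pct):
    """Percentage real appreciation of the home currency against the foreign one.

    Assumes the standard definition: nominal change times the ratio of
    home to foreign price-level changes over the same period.
    """
    nominal = 1 + nominal_pct / 100
    relative_prices = (1 + home_inflation_pct / 100) / (1 + foreign_inflation_pct / 100)
    return (nominal * relative_prices - 1) * 100


# Figures quoted in the text: 6% nominal appreciation of the Yuan,
# 7.6% price rise in China, 13.9% price rise in the US.
yuan_real = real_appreciation(6.0, 7.6, 13.9)
print(f"Real appreciation under this definition: {yuan_real:.2f}%")
```

Either way, the point of the paragraph stands: the real appreciation is very small relative to the productivity gap.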



Source: Mussa (2007).

Comparing China and Malaysia in dealing with sterilisation problems

A similar interest rate problem on sterilisation bonds was faced by Malaysia in the mid-1990s, when its interest rates exceeded US yields for an extended period. This culminated in massive and unsustainable losses on its sterilisation operations. Following the unwinding of Malaysia’s sterilisation operations, credit to the private sector increased by 20%, contributing to a massive increase in lending which led to an overheating of the economy. Thus, a step-by-step unwinding of sterilisation operations would seem to be the best course of action. Although China retains tight controls on capital flows, the danger remains that such a large increase in credit would put enormous upward pressure on the price level and suddenly result in the currency becoming overvalued. Comparing China to Malaysia, however, fails to acknowledge the far greater scale of China’s intervention. China, after all, has built up a large stockpile of over $2 trillion. This carries a higher risk, not only to China itself but also to its main trading partners, as they too have a vested interest in China’s exchange rate. China faces some difficult times ahead as it deals with the inevitable unwinding of the sterilisation process (Greenwood, 2008).



Engineering & Mechanical Sciences panel


Prof. Nick Quirke, UCD (chair)
Prof. Clive Williams, TCD

Judges’ comments

This essay describes a final year project on outsole design for the enhancement of support and performance in football boots and other footwear. The project is original, and the research analysis and results have led to a validated prototype with strong commercial potential, which is highlighted in the essay with a commercialisation strategy and market analysis. The panel agreed that this focused, clearly written essay described an academically excellent research study that has future potential both in research and in its application.




Engineering & Mechanical Sciences

Outsole design for the enhancement of support and performance in sports footwear

William Holland

Abstract

This project is concerned with the enhancement of support and performance in the outsole of football boots and certain other types of sports footwear. The unique patented design is intended to increase acceleration and reduce metatarsal fracture caused by excessive foot flexure. Metatarsal fracture is an injury that has affected many high-profile footballers in recent years. A concept design for an innovative castellated outsole has been developed to provide support to the metatarsals at the critical point of excessive flexure. When the ridges meet, the stiffness of the outsole increases rapidly, thus providing support to the metatarsal bones. Installation of a highly elastic material between the ridges provides an energy return from the outsole to the foot during the toe-off phase of the gait cycle. A prototype has been manufactured using a castable polyurethane elastomer. Mechanical experimentation, fatigue testing, gait analysis and finite element analysis are undertaken to validate the design. Commercialisation is investigated.

INTRODUCTION

This self-conceived student project, undertaken in response to the sharp rise in the number of metatarsal injuries experienced by high-profile athletes in recent years, concerns the patented design of a novel outsole for football boots and certain other types of sports footwear to achieve enhanced support and performance.



Fig. 1.1. 5th metatarsal fracture – Wayne Rooney; Fig. 1.2. Illustrative X-ray of foot and metatarsals.

The long and slender metatarsal foot bones, located between the cuneiform bones and phalanges (Figure 1.3), act akin to a rigid lever in the propulsion of the ankle and foot. The metatarsals also act as a flexible structure to aid with balance and support of the entire body.

Fig. 1.3. Metatarsal location and functionality.

Following an extensive review of the biomedical engineering literature, three major causes of fracture were identified: over-bending, fatigue and stamping. Excessive bending of the foot is a major cause of metatarsal fracture, a serious injury that has sidelined many well-known athletes in recent years. An innovative design for a castellated outsole (Figures 1.4 and 1.5) has been developed by the author to facilitate improved acceleration and achieve targeted reduction in foot flexure.



Fig. 1.4. Castellated outsole solid model design static and in-flexure – Author.

How it works:
1. When the outsole is at rest, the ridges are separated.
2. Flexure causes the ridges to meet and compress against each other.
3. Flexibility is enhanced at lower foot flexure levels and stiffness is increased at higher foot flexure levels.

The final stage of the innovative design for the outsole involves the installation, for performance enhancement, of an elastic inter-ridge material in the troughs of the outsole (Figure 1.7). At the instant when a football player’s foot leaves the ground, the foot flexes and exerts a downward force. At this moment, the elastic inter-ridge material becomes compressed in the troughs and deforms elastically, absorbing energy from the foot. During the toe-off phase of the gait cycle, the stored energy is released rapidly back into the foot, yielding enhanced energy return and acceleration.
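The two-stage behaviour set out in the "How it works" steps can be sketched as a piecewise-linear stiffness model. This is only an illustration under stated assumptions: the function name, both stiffness values and the contact angle are hypothetical, and a real outsole would not be perfectly linear in either stage.

```python
def outsole_moment(angle_deg, k_open, k_closed, contact_angle_deg):
    """Bending moment for a piecewise-linear (two-stage) stiffness model.

    Below the contact angle the ridges are apart and only the lower
    stiffness k_open resists bending. Beyond it the ridges bear on each
    other and the higher stiffness k_closed applies to the extra flexure.
    """
    if angle_deg <= contact_angle_deg:
        return k_open * angle_deg
    return k_open * contact_angle_deg + k_closed * (angle_deg - contact_angle_deg)


# Hypothetical values, chosen only to show the change in slope.
K_OPEN = 0.05     # N·m per degree while the ridges are separated
K_CLOSED = 0.50   # N·m per degree once the ridges are in contact
CONTACT = 30.0    # degrees of flexure at which the ridges meet

for angle in (10, 30, 40):
    moment = outsole_moment(angle, K_OPEN, K_CLOSED, CONTACT)
    print(f"{angle:>3} deg -> {moment:.2f} N·m")
```

With these illustrative numbers, the last ten degrees of flexure demand far more moment than the first thirty, which is the support effect the design aims for.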

Fig. 1.5. Ridges in flexure – Author.



Fig. 1.6. Anatomy of the foot; Fig. 1.7. Elastic inter-ridge material – Author.

MATERIAL SELECTION & PROTOTYPE DEVELOPMENT

The first step undertaken in the design of the outsole was to determine the most suitable material from which a prototype could be manufactured. Extensive research was undertaken into diverse materials and related manufacturing processes. This research, combined with consultations with experienced designers and experts including Mr. Bryan Appleby, International Sports Footwear Designer, Reebok, identified thermoplastic polyurethane (TPU) as the optimum boot material; it is indeed the preferred outsole material for most football boot manufacturers. TPU is a broad term, and thus further investigative material tests were carried out by the author on popular brands of boots utilising the Hounsfield tensile testing machine. These tests were designed to measure the Modulus of Elasticity of the outsole materials and also aided in the determination of the typical grade and stiffness of TPU to be used in the manufacture of outsoles for football boots.

Various methods of prototype manufacture were then explored. The viability of the following processes was investigated:

Injection Moulding:

Injection Moulding (IM) is the preferred method of mass-producing football boots for most football boot manufacturers. IM can produce an impressive ‘180 pairs of outsoles per hour, inject at a speed of 10 cm³/s and apply clamping and hydraulic pressures of 6 MPa.’ – www.alibaba.com/catalog/11511058/PVC_Tpr_Sole. IM is used almost exclusively for large-scale production (CES Edupack ‘08). Capital and tooling costs are too high for it to be considered as a means of prototyping.

Fig. 2.1. Injection moulding technology.



Machining:

Equipment for machining the outsole is readily available within the college. However, direct machining of the outsole was deemed unsuitable as the required dimensions of the ridges were too difficult to produce, especially in polymers, where warping and loss of geometry occur. Machining also leaves a rough surface in the base of each trough, resulting in significantly higher stress concentrations and hence a shorter fatigue life in the outsole. Castellation of the outsole requires a high level of accuracy. A cutting rig (Figure 2.2) was designed, constructed and commissioned by the author to produce the concept prototype Mark 1 (Figure 2.3). The developed rig produced a physical model prototype, but was not considered for manufacturing a functional prototype as it could not produce the requisite accuracy and finish.

Fig. 2.2. Developed cutting rig – Author; Fig. 2.3. Mark 1 concept prototype – tilted ridges.

The use of a castable polyurethane elastomer was next investigated. Castable PU elastomers come in two parts (isocyanate and polyol), which must be mixed together at room temperature in very specific ratios by weight. Once mixed thoroughly, the chemicals can be poured into a mould where they react chemically and harden to form the castellated outsole. The mechanical properties are “by no means inferior to TPUs used in existing football boot outsoles.” – B. Appleby, Reebok. A suitable castable polyurethane elastomer was sourced in the UK.

Fig. 2.4. 3D rapid prototyping printer.



As a casting process is involved, a mould was required to form the shape of the outsole. The college 3D Rapid Prototyping Printer (Figure 2.4) was employed to manufacture the ABS plastic mould as it was the most cost-effective and time-efficient means available in the college. The mould was produced from 3D Solid Model Design Drawings prepared by the author. However, as illustrated in Figures 2.5 and 2.6, although the 3D printer successfully produced the mould for the basic shape of the outsole, it could not form the fine detail of the trough moulds – the 3D printer resolution of 0.25 mm was inadequate for this purpose. Thus, it was decided that the mould produced in the 3D printer would be used purely to form the shape and depth of the outsole.

Fig. 2.5. Manufactured ABS mould of outsole; Fig. 2.6. Close-up of mould front section – Author.

As an alternative, the author explored the use of the Hurco CNC machine to manufacture the mould for the troughs. A 3D solid model drawing of the mould (Figure 2.7) was prepared. The trough moulds were 5 mm in height with a 0.2 mm diameter fillet on top to reduce stress concentration at the base of each trough in the outsole. The same mould may be used to produce different trough depths by varying the penetration of the mould into the ABS outsole mould immediately after the PU has been poured, while it is still in liquid form.

Fig. 2.7. 3D solid model design drawing of mould manufactured on the CNC machine – Author.



The drawing was exported into a CNC simulation software package to replicate the machining of the mould. The simulation predicted a very smooth and accurate finish. Mild steel was selected as the most suitable mould material as it is inexpensive, machinable and readily available in the college.

Fig. 2.8. Mild steel moulds manufactured on the CNC machine – Author.

Figure 2.8 depicts three failed mould production attempts on the CNC machine. A variety of drills were used to mill the pieces. However, the moulds were not produced to sufficient accuracy – it was thought that this was perhaps due to problems in the tool setup or that the coolant nozzle was disturbed during the very time-consuming thirteen-hour machining process.

Fig. 2.9. ABS plastic mould – Author.

As the production of each mould had proven to be extremely time-consuming, it was decided that an alternative methodology was required to produce the mould. A two-stage moulding process was thus employed. The first mould was manufactured successfully in the 3D rapid prototyping printer, as illustrated in Figure 2.9. Several moulding materials were considered for pouring into the ABS mould for the second stage of the process.



Fig. 2.10. Mould manufactured from candle wax – Author.

The first material considered was candle wax as it is cheap and readily available. Although there was a draft of 5° in the ABS mould, the wax proved difficult to remove and cracked in several places, as in Figure 2.10. While the process and material did produce the filleted tips of the trough moulds, the wax was discarded due to its inherent fragility.

The author then investigated the use of a low melting point alloy (lead-tin). A low melting point is required as ABS melts at approximately 90°C; any alloy poured into the ABS mould would therefore need to have a lower melting point. The lowest melting point of an alloy sourced by the author was 140°C, which would destroy the ABS mould. For this reason, the use of a low melting point alloy was discarded.

Silicone rubber (Figure 2.11) was finally chosen as the most appropriate material with which a mould could be manufactured. Silicone rubber was employed because ‘moulds made from silicone rubber give high quality reproduction and a long mould life with virtually no shrinkage and no risk of the plaster ever gripping to the mould surface. Silicone rubber has been specifically designed for ease of mixing, has low viscosity and can be employed without the use of a vacuum chamber.’ (GRS ‘09)

Fig. 2.11. Silicone rubber mould – Author; Fig. 2.12. Prototype cast utilising silicone rubber mould – Author.

Without much difficulty, the silicone rubber produced a mould (Figure 2.11) of very high accuracy, and so casting commenced shortly afterwards. A prototype was cast (Figure 2.12). Castings of test-pieces in ABS moulds (Figure 2.13) were also performed by the author for the purpose of three-point bending tests. The ABS test-piece moulds were required so that several test-pieces could be made in a single batch, thereby ensuring consistency in batch production and repeatable testing.



Fig. 2.13. Manufactured moulds for bending test-pieces – Author.

Once the test-pieces were fully cured, they were removed from their moulds. The results were promising. However, a significant number of air bubbles remained in the casting. A degassing chamber was subsequently employed to eliminate the air bubbles (Atlas Polymers ‘09). The chamber was assembled using a vacuum pump and an air-tight pressure cooker (Figure 2.14).

Fig. 2.14. Degassing process for developed test-pieces – Author.

However, prior to the degassing process, it was observed that the isocyanate had begun to crystallise in the container, which would prevent it from reacting properly with the polyol and result in an insufficiently cured material. A ‘melting out of the isocyanates’ procedure was carried out in a digitally controlled oven at 80°C for two hours to recover the isocyanate. A trial casting process was then conducted with the aid of the degassing chamber. Unfortunately, although no crystals were visible during the process and most of the air bubbles were successfully removed, the castings failed to cure and instead remained in their moulds as a slime-like substance. In order to resolve the problems in the casting, all aspects of the moulding process, including the release agent and mould temperature, were examined carefully and adjusted accordingly. Further castings produced the same result. At this point, it was concluded that the isocyanate could not be recovered as it had been contaminated by moisture. Therefore, no further casting was carried out.



A highly promising replacement polyurethane has been sourced and acquired, and has successfully met all casting requirements and is in the final stages of validation.

EXPERIMENTAL VALIDATION

Bryan Appleby, International Sports Footwear Designer, recommended that the stiffness (flexural modulus) of the developing castellated outsole mimic that of the Adidas World Cup boot (Figure 3.1) as it is the “best football boot” on the market. Following this recommendation, both three-point (Figure 3.2) and four-point bending tests were performed by the author on test-pieces cut from the outsole of an Adidas World Cup football boot.

Fig. 3.1. Adidas World Cup football boot.

Fig. 3.2. Sample three-point bending testing of Adidas boot test-piece – Author.



The Flexural Modulus (Effective Young’s Modulus – Ef) calculated by the author averaged 193 MPa – this stiffness is of critical importance in the selection of the most appropriate materials in the manufacture of the developing castellated outsole. Four-point bending tests were then undertaken by the author to demonstrate the presence of two stages of stiffness in a castellated outsole and to illustrate the trend between specimens of various ridge heights. Four-point bending is employed instead of three-point bending to avoid catching the central knife-edge in the notch of the specimens.
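For reference, a flexural modulus is commonly obtained from a three-point bending test as E_f = L³m / (4bd³), where L is the support span, b and d are the width and thickness of a rectangular test-piece, and m is the initial slope of the load-deflection curve. The sketch below assumes that standard small-deflection formula; the essay does not state which expression the author used, and the specimen dimensions and slope are hypothetical values chosen only so that the result lands near the reported 193 MPa.

```python
def flexural_modulus(span_mm, width_mm, thickness_mm, slope_n_per_mm):
    """Flexural modulus in MPa for a rectangular beam in three-point bending.

    Uses E_f = L^3 * m / (4 * b * d^3), valid for small deflections.
    With lengths in mm and slope in N/mm the result is in N/mm^2 (MPa).
    """
    return (span_mm ** 3 * slope_n_per_mm) / (4 * width_mm * thickness_mm ** 3)


# Hypothetical test-piece (not the author's measured geometry):
# 40 mm span, 10 mm wide, 4 mm thick, initial slope 7.72 N/mm.
e_f = flexural_modulus(span_mm=40.0, width_mm=10.0, thickness_mm=4.0, slope_n_per_mm=7.72)
print(f"Flexural modulus: {e_f:.0f} MPa")
```

Because thickness enters as a cube, a small error in measuring d has a large effect on the computed modulus.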

Fig. 3.3. Four-point bending test – Author.

The test-pieces were manufactured from ABS plastic in the 3D Rapid Prototyping Printer. Test-pieces of differing thickness (ranging from 1 mm up to 5 mm) were analysed. In the first series of four-point bending tests, the test-pieces were glued to a polyurethane base as in Fig. 3.3.

Fig. 3.4. Four-point bending test results – Author.



Figure 3.4 clearly indicates that all test-pieces demonstrated two stages of stiffness as a ramping occurred in all the results of load (N) versus deflection (mm). However, this ramping was not as pronounced as it should have been due to significant stretching in the polyurethane base.

Fig. 3.5. Four-point bending test on modified specimens – Author.

In order to address the issue of stretching in the base material, the ABS specimens were subsequently removed from the polyurethane and glued to a nylon fabric (Figure 3.5). Nylon fabric is employed as it provides the strength and flexibility required without excessive elongation. Sample results of the nylon fabric base four-point bending tests are presented hereafter:

Fig. 3.6. Sample four-point bending test results on modified specimens – Author.

As anticipated, the above results are significantly different to those produced with the polyurethane base. The ramping of the load (N) versus deflection (mm) lines is far more pronounced in all test-pieces other than the 1 mm test-piece. This increased level of ramping is due to the very low elongation in the nylon.
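One simple way to locate the "ramping" point in load versus deflection readings is to compare the stiffness of successive segments and find where it rises most. The sketch below is an illustration only: the helper names are hypothetical and the readings are made-up numbers, not the measured data of Fig. 3.6.

```python
def segment_slopes(deflection_mm, load_n):
    """Stiffness (N/mm) of each segment between consecutive readings."""
    return [
        (load_n[i + 1] - load_n[i]) / (deflection_mm[i + 1] - deflection_mm[i])
        for i in range(len(load_n) - 1)
    ]


def knee_index(deflection_mm, load_n):
    """Index of the reading at which segment stiffness rises the most."""
    slopes = segment_slopes(deflection_mm, load_n)
    jumps = [slopes[i + 1] - slopes[i] for i in range(len(slopes) - 1)]
    return jumps.index(max(jumps)) + 1


# Made-up readings for illustration: soft first stage, stiff second stage.
deflection = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
load = [0.0, 2.0, 4.0, 6.0, 16.0, 26.0]

knee = knee_index(deflection, load)
slopes = segment_slopes(deflection, load)
print(f"Ridges appear to engage near {deflection[knee]} mm deflection")
print(f"Stage 1 stiffness {slopes[0]} N/mm, stage 2 stiffness {slopes[-1]} N/mm")
```

Real test data are noisy, so in practice the slopes would normally be smoothed or fitted over several readings before the comparison is made.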



The undertaken experimental tests present strong evidence that the developing castellated outsole does indeed demonstrate the requisite and highly desirable two-stage stiffness mechanism. The experimentation also indicates that the test-piece stiffness is directly related to test piece thickness.

FINITE ELEMENT MODEL DEVELOPMENT AND ANALYSIS

The finite element method was employed by the author to conduct structural large displacement analyses of various two-dimensional plane strain models of the developing outsole. The models were developed to determine the stresses, stress distributions and displacements in the diverse configuration designs in order to optimise geometry and fatigue life characteristics. The author developed a 2D FE model made from polyurethane with an elastic inter-ridge material as shown in Figure 4.1. This analysis, coupled with the results of the previous four-point bending tests in which the polyurethane demonstrated considerable stretching, led to polyurethane being dismissed as a viable material, as the application of further bending forces would bring the stresses in the model dangerously close to the failure and fatigue limits of the polyurethane.
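As background to the plane strain assumption, the stress-strain relation used in a two-dimensional plane strain model of a linear, isotropic elastic material can be written as a 3×3 matrix. The sketch below covers only that textbook constitutive step, not the large-displacement formulation or the software the author used; the function name and the Poisson's ratio are assumptions made for illustration.

```python
def plane_strain_matrix(youngs_modulus, poisson_ratio):
    """3x3 matrix D with [sxx, syy, txy] = D * [exx, eyy, gxy] in plane strain.

    Assumes a linear, isotropic elastic material. The expression is
    singular at a Poisson's ratio of 0.5 (fully incompressible material).
    """
    e, nu = youngs_modulus, poisson_ratio
    c = e / ((1 + nu) * (1 - 2 * nu))
    return [
        [c * (1 - nu), c * nu, 0.0],
        [c * nu, c * (1 - nu), 0.0],
        [0.0, 0.0, c * (1 - 2 * nu) / 2],
    ]


# E = 193 MPa is the flexural modulus reported in the essay;
# the Poisson's ratio of 0.3 is a hypothetical value.
D = plane_strain_matrix(193.0, 0.3)
for row in D:
    print([round(value, 1) for value in row])
```

The shear entry reduces to the shear modulus E / (2(1 + ν)), which is a quick check on the matrix.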

Fig. 4.1. 2D finite element model of castellated outsole – polyurethane elastomer and elastic inter-ridge material – Author.

It was concluded that the physical demands on the ridges in the castellated region were quite different from those on the base. The ridges require a significant level of stiffness, whereas the base needs to be strong in tension and very flexible but have a low level of elongation. Thus, it was concluded that two separate materials were required. Carbon fibre and Kevlar were chosen as suitable materials for the ridges and base respectively. A 2D finite element model was prepared and analysed as in Figure 4.2. The purpose of the analysis was to determine the mechanical properties of the elastic inter-ridge material designed to provide the same previously experimentally determined stiffness as the Adidas World Cup Football Boot outsole.

Fig. 4.2. FE model of castellated outsole – carbon fibre ridges/Kevlar base – Author.

The 2D Large Displacement Finite Element Analyses also indicate that the developing device clearly exhibits enhanced outsole spring-back due to the presence of the elastic inter-ridge material, thereby providing the wearer with an enhanced energy return and acceleration benefit over standard outsole design during the critical toe-off phase of the gait cycle. The author is currently developing full three-dimensional finite element models of the Meta-Sol™ outsole design – see Figure 4.3. The 3D model will allow the quantification of any out-of-plane effects and facilitate further design optimisation.

Fig. 4.3. 3D FE model of Meta-Sol™ castellated outsole – Author.



GAIT ANALYSIS

‘Gait analysis is the quantitative measurement and assessment of human locomotion including both walking and running... In sports biomechanics, athletes and their coaches use movement analysis techniques to investigate performance improvement’ – Peterson I, Bronzino D, (2008), Biomechanics. The author is extremely fortunate in that the college is one of only two sites in Ireland with a state of the art Vicon Gait Analysis Facility. As all major sports equipment manufacturers employ gait analysis to quantify the perceived advantages of their products relative to a competitor’s offering, this method of analysis is being applied to demonstrate the benefits of elastic inter-ridge material to performance. High-speed camera-based motion measurement is used to measure relative ankle displacements in the classmate “volunteer” subject of figure 5.1.

Fig. 5.1. Classmate volunteer; Fig. 5.2. Gait analysis testing sensor application.

Gait events are quantified using:
• The high-speed camera-based motion measurement in the college. Reflective spheres are attached to the subject’s feet and legs. The cameras monitor motion by the displacement of the reflective markers.
• The forces and torque applied to the subject’s foot by the ground, or ground reaction loads.

To demonstrate the energy absorption and return provided by the shock absorbers (the elastic inter-ridge material), a comparison is made between acceleration when wearing prototypes with and without shock absorbers.
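The comparison of acceleration described above can be sketched from marker displacement data by numerical differentiation. This is a minimal illustration, not the processing performed by the gait analysis system: the function names, the 100 Hz sampling rate and both trajectories are hypothetical, and the trajectories are synthetic rather than measured.

```python
def central_difference(values, dt):
    """Central-difference derivative of evenly sampled values (interior points only)."""
    return [(values[i + 1] - values[i - 1]) / (2 * dt) for i in range(1, len(values) - 1)]


def peak_acceleration(positions_m, sample_rate_hz):
    """Largest acceleration (m/s^2) along one axis of a marker trajectory."""
    dt = 1.0 / sample_rate_hz
    velocity = central_difference(positions_m, dt)
    acceleration = central_difference(velocity, dt)
    return max(acceleration)


# Synthetic trajectories for illustration only: constant acceleration from
# rest (2.0 and 2.5 m/s^2), sampled at a hypothetical 100 Hz.
RATE_HZ = 100.0
without_insert = [0.5 * 2.0 * (i / RATE_HZ) ** 2 for i in range(20)]
with_insert = [0.5 * 2.5 * (i / RATE_HZ) ** 2 for i in range(20)]

gain = peak_acceleration(with_insert, RATE_HZ) - peak_acceleration(without_insert, RATE_HZ)
print(f"Difference in peak acceleration: {gain:.2f} m/s^2")
```

Differentiating twice amplifies measurement noise, so real marker data would normally be filtered before this step.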



Fig. 5.3. Gait analysis testing of subject – Author.

FATIGUE TESTING

Fig. 6.1. Instron fatigue tester.

Fatigue testing is critical in the determination of the fatigue life characteristics of the cyclically loaded innovative Meta-Sol™ design. As bending stresses applied to football boots do not occur unidirectionally, the Instron fatigue testing machine (Figure 6.1) cannot directly measure the fatigue life of the prototype.



A dedicated roller attachment and bending plate were designed, manufactured and commissioned by the author, thereby allowing the application of bending loads to the developed prototype. The resistance of the outsole material (PU elastomer) to cracking can thus be measured. The developed platform and roller attachment, as shown in Figures 6.2 and 6.3, are clamped by the lower and upper jaws of the fatigue testing machine respectively. The outsole is placed upside down on the platform in such a way that the castellated region is situated over the edge of the platform.

Fig. 6.2. 3D solid model design of roller attachment and plate – Author.

Fig. 6.3. Fatigue testing of developing prototype using developed roller attachment/plate – Author.

Once the test is complete, inspection of cut propagation is performed. A Co-ordinate Measuring Machine (CMM) is employed to measure the depth of the troughs in the outsole before and after the test. Figure 6.4 demonstrates a trial measurement of the trough depth in a test-piece of PU elastomer.



Fig. 6.4. CMM measurement of trough depth variation – Author.

If cuts have initiated at the base of each trough, then the depth of each cut will be measured either as a function of the actual final depths of the troughs over a specific number of cycles or as the number of cycles required to attain a specific percentage growth of the cut. The number of cycles is recorded by the digital counter in the fatigue testing machine.
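The second measure, the number of cycles needed to reach a given percentage growth, can be estimated from periodic depth readings by linear interpolation. The sketch below is one possible reading of that measure: it assumes growth is expressed relative to the initial trough depth, the function name is hypothetical, and the readings are made-up values rather than test results.

```python
def cycles_to_growth(cycles, depths_mm, growth_pct):
    """Cycle count at which trough depth has grown by growth_pct.

    Growth is taken relative to the first depth reading and the crossing
    point is found by linear interpolation between readings. Returns None
    if the target is not reached within the recorded data.
    """
    target = depths_mm[0] * (1 + growth_pct / 100)
    for i in range(1, len(depths_mm)):
        if depths_mm[i] >= target:
            d0, d1 = depths_mm[i - 1], depths_mm[i]
            n0, n1 = cycles[i - 1], cycles[i]
            if d1 == d0:
                return float(n1)
            return n0 + (target - d0) * (n1 - n0) / (d1 - d0)
    return None


# Made-up CMM readings for illustration: depth of a 5 mm trough
# measured every 10,000 cycles.
cycle_counts = [0, 10_000, 20_000, 30_000]
trough_depths = [5.0, 5.0, 5.2, 5.6]

n_10pct = cycles_to_growth(cycle_counts, trough_depths, 10.0)
print(f"Cycles to 10% growth: {n_10pct:.0f}")
```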

COMMERCIALISATION OF META-SOL™

The author has established a multi-disciplinary commercialisation team with three students of the business studies in information systems degree to facilitate the development of a comprehensive business plan for the developing solution. The brand Meta-Sol™ has been established. A short-term 12-month patent has been filed for Meta-Sol™ under the Patent Act (1992).

Target Market

To determine the enormous scale of the football boot market, reference is made to “The Big Count” published in 2006 in “FIFA magazine”: “265 million male and female players in addition to five million referees and officials make a grand total of 270 million people – or 4% of the world’s population – who are actively involved in the game of football.” From this statement, it is evident that there is great potential for Meta-Sol™ in the football boot market.



Fig. 7.1. FIFA World Census 2006.

Furthermore, when it is considered that a significant proportion of high-profile football players, including five members of the England team, have fallen victim to metatarsal injury in recent years, it can be concluded that the market for a metatarsal support mechanism in football boots is quite significant. It is also encouraging to know that, according to www.researchandmarkets.com: “The world market for total active sportswear and athletic footwear in 2003 amounted to US$145bn, meaning that US$23 was spent on behalf of every man, woman and child in the world.” Football boots are conservatively estimated to represent 15% of total athletic footwear sales.

Once Meta-Sol™’s technology has established itself in the football market, other field sports may be approached. The following list details the popularity of these sports:
• Rugby: 2,175,000 licensed members worldwide – www.24.com
• Baseball: 100,000,000 players worldwide – www.the2012londonolympics.com
• Australian Rules: 650,394 registered members in Australia – www.wikipedia.org
• GAA: 800,000 players in Ireland – www.gaelicplayers.ie

It is envisaged that the technology will be licensed to one of the major football boot manufacturers. The concept will be initially applied to high-end boots and eventually introduced to the lower end of the market once consumer confidence grows. This licence will be exclusive for five years. After it expires, other boot manufacturers will be approached. The audited accounts of the three largest boot manufacturers (Nike, Adidas and Puma) have been examined. Nike is the global leader with €5.6 billion sales in 2007. On this basis Nike has been identified as the preferred customer. However, Adidas and Puma will also be approached to gauge their appetite for this product.



COMPARISON OF SELECTED COMPANIES

A medium case scenario, in which Meta-Sol™ would be licensed to Nike with a margin of €3 per pair of boots, is proposed:
• High Case: €4 per pair of boots
• Medium Case: €3 per pair of boots
• Low Case: €2 per pair of boots
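The three licensing cases translate into income as margin per pair multiplied by pairs sold. The sketch below simply carries out that multiplication; the per-pair margins come from the cases above, while the sales volume and all names are hypothetical placeholders, since the essay gives no volume forecast.

```python
# Margin per pair of boots in each licensing case, in euro (from the text).
MARGIN_PER_PAIR_EUR = {"high": 4.0, "medium": 3.0, "low": 2.0}


def annual_licence_income(pairs_sold, case):
    """Licence income in euro for a given sales volume and case."""
    return pairs_sold * MARGIN_PER_PAIR_EUR[case]


# Hypothetical volume, used only to show the spread between the cases.
PAIRS_SOLD = 1_000_000

income = {case: annual_licence_income(PAIRS_SOLD, case) for case in MARGIN_PER_PAIR_EUR}
for case, euro in income.items():
    print(f"{case:>6} case: EUR {euro:,.0f}")
```

Under any volume assumption, the high case yields twice the income of the low case, so the negotiated margin matters as much as the sales forecast.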

FUTURE WORK

The design and testing of the Meta-Sol™ innovative sports footwear outsole is complex and will warrant much more development to fully exploit its potential. The ongoing liaisons with Atlas Polymers and Bryan Appleby of Reebok are of critical importance to the advancement of the project and the product. Medical validation of the benefits of the device poses a particular challenge and opens up a further stream of research and development. The involvement of the new Medical Engineering Design and Innovation Centre (MEDIC) will be crucial to this stage of the project development.

The project has been demonstrated at a recent major Industrial Engineering Exhibition, where it received much favourable comment from the distinguished audience. The author was approached by a Senior Manager from Enterprise Ireland, who strongly urged that the project be immediately submitted for Proof of Concept Funding.

Intellectual property protection also poses a major challenge – Cruickshank Intellectual Property Attorneys have been most helpful in this regard at the early stages of this project. However, the sourcing of investors to fund the major cost of international protection under the Patent Cooperation Treaty (PCT) and other major development costs is imperative to the progression of the project. The exposure at the recent Industrial Engineering Exhibition has already generated some tentative enquiries from potential investors, and further channels for investment and industrial cooperation are being pursued through the MEDIC and Rubicon Innovation centres.



To conclude, the Meta-Sol™ Innovative Sports Footwear outsole has achieved substantial progress in all phases of research, design, development, prototype production/testing and commercialisation investigation. The author is most grateful for the involvement of the following organisations, whose continuing involvement will be critical to Meta-Sol™’s ultimate success.

Fig. 7.2. Organisations supporting Meta-Sol™.

Appendix A

ENVIRONMENTAL AND HEALTH IMPACT ASSESSMENT OF THE USE OF POLYURETHANE IN OUTSOLES

Extraction: The technological road from oil field to finished polyurethane product has numerous side trips. The following route is taken in the petroleum-to-polyurethane (PU) process:



1. Petroleum is drilled and transported to a refinery.
2. The feedstock, crude oil, is fractionated to produce naphtha, the part of crude oil that is used in the petrochemical industry and thus for the production of polyurethane.
3. Once the “cat cracking”, fluffing, blending and extruding processes are completed, the end product – polyurethane pellets – is shipped to customers. – URL No. 1.

All of the above steps consume high levels of energy through the burning of fossil fuels. In particular, the refinement and “cat cracking” processes burn vast amounts of fuel as they require furnace temperatures as high as 850°C – URL No. 2. It is therefore evident that the extraction of PU is not an environmentally friendly process.

‘The development of elastic polyurethanes began as a program to find a replacement for rubber during the days of World War II. In 1940, the first polyurethane elastomers were produced’ – URL No. 3. Natural rubber was the first material to be used in the manufacture of outsoles. Natural rubber has a primary production embodied energy of ‘62-70 MJ/kg’ and a CO2 footprint of 1.5 kg/kg. This compares favourably with PU, which has respective values of ‘113 – 125 MJ/kg’ and 4.57 – 5.28 kg/kg. Clearly, the energy consumed in the production of PU is greater and thus results in much higher greenhouse gas emissions. However, unlike rubber, PU is recyclable and does not require the same volumes of water in its production process.

The durability of polyurethane contributes significantly to the long lifetimes of many products. The extension of product life cycle and resource conservation are important environmental considerations that often favour the selection of PU. Some products containing PU have life spans exceeding 30 years.

Recycling: In the late eighties, the Society of the Plastics Industry (SPI) developed a numbering system for all plastics to determine their recyclability. SPI classified polyurethane with the symbol in Figure A.
However, polyurethane cannot be recycled together with most other plastics. Therefore, when people discard all grades of coded plastics into the same recycling bins, ‘the recycling collector sees materials like PU as contamination, since there is often no infrastructure for recycling the material, let alone paying for it once it is cleaned and separated.’ Even though PU is deemed to be recyclable by CES, this description is misleading as the recovery process is so difficult to operate.

Fig. A. Recycling symbol for PU.

‘Since virgin engineering plastics have higher strength and thermal properties than virgin commodity plastics to begin with, higher values of these properties remain after recycling after their primary lifetime’ (Ehrig, page 154). The excellent mechanical properties of PU allow it to comfortably retain sufficient quality for the manufacture of goods such as flowerpots and pallets that would otherwise be made with virgin plastics such as polyethylene terephthalate (PET). It is therefore evident that the superior mechanical properties of PU make it a useful material for reuse.

Toxicity: According to the Material Safety Data Sheet for a thermoplastic PU elastomer sold by



Bayer Group, ‘toxic gases/fumes are given off during burning or thermal decomposition and may cause allergic skin and respiratory reaction’ … ‘If the material is subjected to temperatures above its decomposition temperature the diisocyanate may be liberated. Diisocyanate vapour or mist at concentrations above the TLV or PEL can irritate (burning sensation) the mucous membranes in the respiratory tract (nose, throat, lungs) causing runny nose, sore throat, coughing, chest discomfort, shortness of breath and reduced lung function (breathing obstruction).’ Polyurethanes are not considered ‘carcinogenic substances as defined by IARC, NTP and/or OSHA’ – Bayer Desmopan MSDS sheet, URL No. 7.

The above scenario is relevant only when the material is burned, which suggests that PU is not hazardous to the environment under normal working procedures. One exception may be its incineration by companies that do not control their stack emissions to meet the requirements imposed by their governing environmental protection agency.

Another good indication that the toxicity of PU is of no major concern is the omission of PU from REACH. Introduced by the European Parliament, ‘REACH is a regulation designed to manage and control the potential hazards and risks to human health and the environment from the manufacture, import and use of chemicals within the EU’ – URL No. 8. No toxicological information was provided on Bayer’s MSDS; however, risk phrases R50-53 (the most significant environmental risks defined by REACH) were not mentioned, which suggests that PU does not pose any major threat to the environment. From the above, it is reasonable to state that PU is not a very toxic material and so, in terms of toxicity, it can be seen as a relatively green material.

Dumping: According to R. Leppkes, Polyurethanes (1996): ‘All PU materials can be disposed of without problem in modern refuse incineration plants, allowing energy to be recovered at the same time. 
Acid combustion gases from the incineration of blowing agents and flame retardants are retained in flue gas scrubbers. Trials in practice have shown that fears of increased dioxin emissions are unwarranted.’ Although this book was written in conjunction with BASF GmbH – manufacturers of PU – the results of these trials, if true, suggest that PU is a relatively green material when disposed of by incineration.

INTERNET REFERENCING:
URL No. 1: http://www.reachoutmichigan.org/funexperiments/quick/plastic.html
URL No. 2: http://www.mindfully.org/Plastic/How-Plastics-Made.htm
URL No. 3: http://www.gale-edit.com/products/volumes/polyurethane.htm
URL No. 4: http://en.wikipedia.org/wiki/Plastic
URL No. 5: http://www.polyurethane.org/s_api/sec.asp?SID=3&VID=159&CID=868&DID=3521&RTID=0&CIDQS=&Taxonomy=False&specialSearch=False
URL No. 6: www.mindfully.org/Berkeley/Berkeley-Plastics
URL No. 7: http://www.bayermaterialsciencenafta.com/resources/d/document.cfm?Mode=view&f=C95F64A7-F852-C0C4-09EFD9C841119B8E&d=71130B8A-A470-D641CCFD3691E83C57FC
URL No. 8: http://www.hsa.ie/eng/Sectors/Chemicals/REACH/Overview



URL No. 9: http://reach.jrc.it/polymers_en.htm
URL No. 10: http://www.rooftherm.co.uk/index.php?option=com_content&task=view&id=29&Itemid=46
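The embodied energy and CO2 figures quoted for natural rubber and PU earlier in this appendix can be compared with a short calculation. The sketch below is illustrative only: it assumes the midpoint of each quoted range is a fair single value (the source gives ranges, not point values), and the variable and function names are hypothetical.

```python
# Figures quoted in the text, as (low, high) ranges per kg of material.
EMBODIED_ENERGY_MJ_PER_KG = {
    "natural rubber": (62.0, 70.0),
    "PU": (113.0, 125.0),
}
CO2_FOOTPRINT_KG_PER_KG = {
    "natural rubber": (1.5, 1.5),  # a single value is quoted for rubber
    "PU": (4.57, 5.28),
}


def midpoint(bounds):
    """Return the midpoint of a (low, high) range (an assumed summary value)."""
    low, high = bounds
    return (low + high) / 2.0


def ratio_pu_to_rubber(table):
    """Ratio of the PU midpoint to the natural rubber midpoint."""
    return midpoint(table["PU"]) / midpoint(table["natural rubber"])


energy_ratio = ratio_pu_to_rubber(EMBODIED_ENERGY_MJ_PER_KG)
co2_ratio = ratio_pu_to_rubber(CO2_FOOTPRINT_KG_PER_KG)

print(f"Embodied energy, PU vs natural rubber: {energy_ratio:.2f}x")
print(f"CO2 footprint, PU vs natural rubber: {co2_ratio:.2f}x")
```

On these midpoints, PU carries roughly 1.8 times the embodied energy of natural rubber and about 3.3 times its CO2 footprint, which is consistent with the conclusion drawn in the text.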

Appendix B

FEASIBILITY STUDY: ON THE USE OF CASTABLE POLYURETHANE ELASTOMER IN A CASTELLATED OUTSOLE

1. REQUIREMENTS OF EXISTING FOOTBALL BOOT OUTSOLE MATERIALS

1. To provide a base for the boot to which the studs and uppers are fixed.
2. To reduce the amount of wear endured by the boot’s midsole. As a result, the outsole increases the overall durability of the boot.
3. To provide stiffness to the boot, an aid in acceleration.
4. To provide shock absorption by dispersing the stresses exerted by the foot on the studs so that the studs do not harm the foot. This improves comfort.
5. To protect the player from sharp objects such as stones on the football pitch.
6. To have a fatigue strength high enough to endure at least 100,000 cycles (Ross flex at -10°C). – www.huntsman.com/pu/Media/HuntsmanFolderlowdef
7. To be aesthetically pleasing.
8. To be lightweight (to keep total boot mass to a maximum of 280-340 grams).
9. To insulate from heat and cold.

2. ADDITIONAL REQUIREMENTS OF A CASTELLATED OUTSOLE MATERIAL

1. To provide additional rigidity at the critical point when the foot is bending excessively so as to provide support to the metatarsals.
2. Due to the high concentration of stress in the base of each trough, a very low level of notch sensitivity is required to prevent tear initiation of the material.

3. HOW PROPERTIES OF PU ELASTOMER MEET FUNCTIONS

1. Low permanent deformation after extended periods of stress. This enhances durability and shape retention.
2. High impact absorbing ability helps to absorb high stresses around studs.
3. High abrasion resistance improves durability.
4. High tear initiation and tear propagation resistance is particularly relevant to a castellated outsole as it extends fatigue life greatly.
5. Cast PU elastomer is available in a wide variety of stiffnesses, an essential characteristic for a material to be considered in the manufacture of a castellated outsole.
6. Low temperature flexibility and excellent insulation allow for its use in countries with cold climates.
7. Stiffness improves acceleration.
8. Resistance to oils, fats, hydrocarbons, oxygen and ozone.

Properties were taken from: Leppkes R, (1996), Polyurethanes.



Star diagram of castable polyurethane elastomer (Bryce D.M (1997)).

‘A star diagram is a rating system with a long “ray” indicating that the material rates well on that point… These diagrams are designed to assist material selection personnel by pertinent material selection at a glance’ – Bryce D.M, (1997), Plastic Injection Molding. From the star diagram, it is clear that cast PU elastomer meets the main criteria for the manufacture of football boot outsoles: it exhibits high impact strength and high abrasion resistance, and it is relatively inexpensive.

Chemical composition of cast PU Elastomer – courtesy of John O’Connell, TCL Plastics.

The unique versatility of cast PU elastomer as a soling material stems from the almost limitless chemical formulation combinations which give designers and manufacturers the freedom to create innovative designs that are in step with fashion and technological change. The diverse range of hardness in cast PU elastomers can be explained simply by the combination in which the flexible and stiff segments are arranged. Cast PU elastomers can be made ‘light, tough, comfortable, flexible, insulating, waterproof, slip-resistant, hard wearing and shock absorbent as required, simply by varying the formulation. They can have an almost endless variety of shapes, surface textures and colours and incorporate



air bags, inserts or gels for extra comfort and support. Furthermore, cast PU elastomer bonds well to many types of boot upper (leather, textile) as well as a range of complementary soling materials (rubber, leather).’ – http://www.huntsman.com/pu/Media/HuntsmanFolderlowdef.pdf.

4. ALTERNATIVE MATERIALS

It has already been established that cast polyurethane is a suitable substitute for TPU, the preferred material for the manufacture of existing outsoles on the market. As a much stiffer material is required for a castellated outsole, further research was carried out using the internet to investigate whether any alternative materials were more suitable. One website of particular relevance was www.huntsman.com. Huntsman are a world leader in the manufacture of plastics and are the ‘only major global supplier that has developed a range of TPU dedicated to the footwear industry.’ Listed on their website is a table of the typical physical properties of the TPUs they supply and their applications. The sports cleats application corresponds to a Shore D hardness range of 44–65 and a flexural modulus range of 25–250 MPa. When these ranges were plotted on a graph in CES Edupack 2008, a broad range of alternative polymers to TPU, and hence to cast PU elastomer, was produced, as shown in the graph below.

The Shore D hardness property was chosen because plastic suppliers generally categorise their products in terms of Shore hardness, which would assist in sourcing any potential alternative plastics from them. Hardness is also a good indication of the compressibility of a substance, a vital mechanical property of the ridges in the outsole. The flexural modulus (effective Young’s modulus) was selected as it is the measure of the resistance of the outsole and its cross section to bending.

CES 2008: FLEXURAL MODULUS (GPa) – vs – HARDNESS SHORE D Alternative Materials to Cast PU Elastomer
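The screening step described above, keeping only materials whose Shore D hardness and flexural modulus both fall inside the ranges quoted for sports cleats, can be sketched in a few lines. This is a minimal illustration and not the CES Edupack procedure itself: the two ranges come from the text, while the candidate entries and their property values are hypothetical placeholders rather than measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    name: str
    shore_d: float               # Shore D hardness
    flexural_modulus_mpa: float  # flexural modulus in MPa


# Ranges quoted in the text for the sports cleats application.
SHORE_D_RANGE = (44.0, 65.0)
FLEXURAL_MODULUS_RANGE_MPA = (25.0, 250.0)


def in_range(value, bounds):
    """True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def screen(materials):
    """Keep only materials that satisfy both property ranges."""
    return [
        m for m in materials
        if in_range(m.shore_d, SHORE_D_RANGE)
        and in_range(m.flexural_modulus_mpa, FLEXURAL_MODULUS_RANGE_MPA)
    ]


# Hypothetical placeholder candidates; these values are NOT real material data.
CANDIDATES = [
    Material("hypothetical polymer A", shore_d=50.0, flexural_modulus_mpa=120.0),
    Material("hypothetical polymer B", shore_d=70.0, flexural_modulus_mpa=900.0),
    Material("hypothetical polymer C", shore_d=44.0, flexural_modulus_mpa=25.0),
]

passed = [m.name for m in screen(CANDIDATES)]
print(passed)
```

A range filter of this kind only produces a shortlist; as the discussion that follows shows, the surviving candidates still have to be compared on other properties such as fatigue strength and cost.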

On closer inspection of the materials in the above graph, it was found that other properties of these alternative materials do not compare as favourably with those of TPU. For example, SBS



(Shore 90A), as previously recommended by John O’Connell of Tedcastle Plastics in Dublin, has a fatigue strength at 10⁷ cycles of about 4 MPa, whereas TPU has a more favourable strength of approximately 18 MPa. PEBA is 50% more expensive, and TEEE is also more expensive and has a fatigue strength that is lower by 5 MPa. Sodium ionomer and all the polyether-based TPUs are more expensive and have lower fatigue strengths. The strongest candidate in the CES library was found to be the polyester-based ‘TPU (Ester, aromatic, Shore 60D)’.

Although it is not shown in the graph, polypropylene (PP) is sometimes used to make outsoles. According to Bryan Appleby, the sports footwear designer consulted at the beginning of the project, one major advantage of PP is that it does not produce burn marks during the injection moulding process. Thus, it does not tend to result in as many waste outsoles.

The success of TPU stems from its excellent mechanical properties and relatively low cost. This combination explains why it is the first choice of material for most football boot manufacturers, such as Nike, Adidas and Puma. As cast PU elastomers belong to the same family of polyurethane elastomers, they will be employed as a suitable substitute for TPU.

Typical physical properties of TPUs supplied by Huntsman Polymers for Sports Footwear.



English Language & Literature panel


Prof. Elmer Kennedy-Andrews, UU (chair)
Dr. Emer Nolan, NUIM
Prof. Nicholas Grene, TCD
Prof. Jan Jedrzejewski, UU
Prof. Peter Denman, NUIM
Dr. Tim Hancock, UU
Dr. Willam Murphy, UU
Dr. Frank Sewell, UU
Prof. Brian Carahar, QUB
Dr. Anne Jamison, UU

Judges’ comments

This was a most ambitious and original essay offering a rhetorical analysis of the Geneva Bible’s annotations to the Book of Revelation – a highly specialised topic, but one that ripples out to encompass issues of translation, interpretation, the making of the English Bible, the Reformation, and Protestant-Catholic relations. The essay rigorously deconstructs the editorial additions to the text, revealing moments of instability and anxiety in this seemingly most certain and grounded of texts. The argument is very well informed and shows, through close readings, the ways in which the editors have ‘written themselves’ into the holy book but, in doing so, have created the kind of cracks and gaps that call their project into question. Most impressive is the way this essay is capable at once of demonstrating a thorough knowledge of the historical context, an understanding of the textual problems of annotation and redaction within biblical studies, and an ability to read texts both critically and creatively – and to do all of this with style, wit and grace.



English Language & Literature

To what extent do the annotations to the Geneva Bible (1560) present the Book of Revelation as “a conventional apocalypse, [with] the conventional apocalyptic purpose of providing comfort to the suffering faithful” (Gabel et al.)?

Fionnuala Barrett

Introduction

The Book of Revelation offers particular succour to the underdog; in the words of the authors of The Bible as Literature, it is a biblical book “that give[s] solemn assurances of great days coming for the faithful – and those days have not yet come. In any troubled age… such writings have their appeal” (Gabel et al. 166). It is the contention of this essay that the consciousness of being part of “the suffering faithful” greatly informed the editors’ annotations to the Book of Revelation, producing a set of notes that consciously presents the text as a comfort to its beleaguered audience – though the actual effect of the notes may not always coincide so neatly with the intended effect. However, while the focus of this essay is the editorial apparatus of Revelation, it is important to note its place within the whole Geneva Bible project. In Michael Jensen’s opinion, the English Geneva Bible (1560) is a “Bible of the persecuted” (43). As the editors themselves acknowledge in the preface to the reader, they began their work when “the time… was most dangerous and the persecution sharp and furious”1 (“To the Reader”, iv, recto). Fleeing from Queen Mary’s virulently anti-Protestant rule in England, which, from 1555 to 1558, saw the burning of two hundred and eighty-eight Protestants (Gribben 57), it would be surprising if the editors of the Geneva Bible should have produced a work wholly divorced from the experience of “continual persecution”

1 The spelling of quotations from the Geneva Bible (1560) has been modernised throughout this essay.



(note l, Rev. 6:9).2 The 1560 Bible’s “entire ethos,” Crawford Gribben contends, “was bathed in the apocalyptic tenor of the exiled communities” (68), and his statement is borne out in the frontispiece to this edition. Gribben has drawn attention to its “loaded… symbolic imagery” in depicting the Israelites in the wilderness, just before the parting of the Red Sea (68). It is not difficult to understand why the Genevan editors, persecuted and exiled upholders of their faith, seized on this episode and gave it such prominence as the figurehead of their work, nor why they chose to illustrate this story’s moment of darkness before the dawn. As Gerald Hammond has remarked, “Its implications are obvious – that this is the word of Scripture aimed at comforting and strengthening a beleaguered and oppressed people” (136). The audience knows that this picture depicts the moment of crisis, and can mentally supply the satisfying conclusion for themselves: “the Egyptians pursued and went after them to the midst of the Sea… but the Lord overthrew the Egyptians in the midst of the Sea. So the water returned and covered the chariots and the horsemen, even all the host of Pharaoh that came into the Sea after them: there remained not one of them” (Exod. 14:23-28). The story of the Red Sea could be said to be the message of Revelation writ small – the correspondence between the Red Sea story and Revelation, after all, is irresistible: that though they are seemingly hedged about on all sides, “The faithful are exhorted to patience” (note p, Rev. 14:12), for God will see them through. Furthermore, the Red Sea story, in being placed in the past rather than the eagerly-awaited future, offers a proof that God has interceded – with the implicit message being that he will do so again – on behalf of his chosen people. 
The frontispiece illustration proves the extent to which Revelation-inspired apocalyptic thought pervades the Geneva Bible project as a whole; as Gribben has remarked, such thought does seem “to thrive when the godly [are] both persecuted and geographically estranged” (57). It is hard to disagree with the authors of The Bible as Literature when they assert that “Revelation is, simply put, an apocalypse… and was written for the same reason that other apocalypses were written – namely, the author believes his own days to be the worst possible days and thus surely the last days; therefore the faithful were to be encouraged to persevere during this bad time, for their deliverance was soon to come” (Gabel et al. 161). It conforms to four of the five criteria by which Gabel and his co-authors judge a work to be apocalyptic – there is a cosmic level of conflict; “two mighty opposites must meet in mortal combat” (156); it emphasises the “last things”; and it takes the form of “a report of the vision experienced by the speaker” (157) – as well as five of the six apocalyptic criteria listed by Dillard and Longman, with Revelation containing narrow eschatology, mediated revelation, unusual imagery and a deterministic view of history, along with having been composed at a time of political oppression (386-389). The Geneva Bible annotations amply reflect the fact that Revelation, as a conventional apocalypse, was written “to comfort those among the Christian faithful who were puzzled (to say the least) at their lot in life” (Gabel et al. 161). One is not required to read the annotations for long before being struck by the determination of the editorial apparatus, in tandem with the main text, to offer solace to him “that feeleth himself oppressed with afflictions, and desireth the heavenly graces and comfort” (note n, Rev. 22:17). But before continuing further, a definition of the terms of this essay ought to

2 All biblical references throughout this essay are taken from the Geneva Bible (1560).



be made. “Comfort” is a polyvalent word which, to modern ears, can sound gentle but it can also be taken to signify rather more activity; a comfort is a source of strength and galvanisation, as well as being a balm for hurt minds. The editors use a similar word in their preface to the reader, declaring that this Bible has been created “for the edifying of the brethren in faith and charity” (“To the Reader”, iv, recto). Like “comfort”, there is a passive quality to the verb to “edify”, but its Oxford English Dictionary definition emphasises the “strengthening” connotation which the word has carried since at least the fourteenth century (2c). To help others to take comfort from Revelation does not mean merely to sympathise with them in their oppression; far more frequently, the aim of the Geneva Bible’s annotations to Revelation is to invigorate and prepare the reader mentally for the ongoing battle against the forces of evil. This polyvalence gives rise to the variety of approaches the editors use to further their presentation of Revelation as a narrative of comfort. Sometimes the comfort which the annotations to Revelation offer the reader is a relatively passive one, such as when they highlight the rewards of obedient faith. The single verse which reads, “They shall hunger no more, neither thirst any more, neither shall the sun light on them, neither any heat” (Rev. 7:16) is generously developed in the notes, which explain that “all infirmity and misery shall be then taken away” (note q), the faithful “shall have no more grief and pain, but still joy and consolation” (note r), and that God “shall give them life and conserve them in eternal felicity” (note u). While such imagery offers consolation to the oppressed minority, such passages in effect offer little more than an injunction to keep calm and carry on: a salutary but hardly strident message. 
However, rather than dwelling on exhortations to virtue, the annotations give far greater emphasis to the other, more active meaning of comfort, and do all in their power to strengthen the reader. This is done, perhaps surprisingly, by lavishing space on scathing criticism of the Catholic Church. Its clergy, those purveyors of “false and deceivable doctrine, which is pleasant to the flesh” (note e, Rev. 9:3), is a particular source of ire. They are described as “effeminate, delicate, idle, trimming themselves to please their harlots” (note q, Rev. 9:8) and they are accused of partaking in “the oppression of the poor and cruelty against God’s children” (note r, Rev. 9:8). Verse 9:7 alone furnishes the editors with ammunition for a torrent of recrimination: it demonstrates that “the Pope’s clergy [are] proud, ambitious, bold, stout, rash, rebellious, stubborn, cruel, lecherous and authors of war and destruction of the simple children of God” (note n); that, “They pretend a certain title of honour, which indeed belongeth nothing unto them, as the Priests by their crowns and strange apparel declare” (note o); and that “they pretend great gentleness and love: they are wise, politic, subtle, eloquent and in worldly craftiness pass all in all their doings” (note p). Even worse than the clergy, though, is the Bishop of Rome himself, who has his “power out of hell and cometh thence” (note n, Rev. 11:7) and “gaineth the victory, not by God’s word, but by cruel war” (note o, Rev. 11:7). His name is Apollyon, “That is, destroyer: for Antichrist the son of perdition destroyeth men’s souls with false doctrine, and the whole world with fire and sword” (note y, Rev. 9:11). He “is compared to an harlot because he seduceth the world with vain words, doctrines of lies and outward appearance” (note b, Rev. 17:1), and his “beauty only standeth in outward pomp and [impudence] and craft like a strumpet” (note f, Rev. 17:4). 
The purpose of these extended tirades may elude a reader who is uninvolved in the sixteenth



century Protestant-Catholic conflict, but to those caught up in the fight, they offer an inverse form of comfort to the “suffering faithful” on the Protestant side of the war. In revealing and emphasising the moral decay at the heart of the Catholic Church, such annotations have the effect of reinforcing the persecuted Protestant cause, and arguably feed the fire of the Protestant reader’s determination to fight on against Catholics who “infect and kill with their venomous doctrine” (note u, Rev. 9:10). Such outspoken drawing of parallels is by no means uniform in the annotations to the Geneva Bible as a whole, contrary to Tom Furniss’ observation that “The most significant tendency of the Geneva Bible’s editorial apparatus is to encourage its readers to make direct connections between what they read about the Old Testament Jews or early Christians and their own contemporary situation in England” (8). When compared to the annotations of another biblical apocalyptic book, that of Daniel, the editorial apparatus of Revelation appears positively reckless in its frankness. While there are copious notes for the visions of Daniel which often encroach into the space of the main text, these notes diverge most clearly from those of Revelation in not drawing attention to contemporary parallels. The four beasts of chapter seven are glossed as symbolising the Assyrians and Chaldeans (note c, Dan. 7:4), the Persians (note d, Dan. 7:5), Alexander and the Macedonians (note h, Dan. 7:6) and the Roman Empire (note l, Dan. 7:7). The Daniel annotations refer explicitly only to long-dead threats and, in chapter seven, do not draw parallels between ancient Rome and the Catholic Church. This is in direct opposition to the note accompanying the Whore of Babylon, which emphasises that image’s significance in relation to both “the ancient Rome [and] the new Rome which is the Papistry” (note d, Rev. 17:3). 
The Daniel notes display a completely different attitude to the one governing the notes to Revelation, which never resist making an unflattering allusion to Catholicism, such as when Rome is likened to Babylon: “for as much as the vices which were in Babylon, are found in Rome in greater abundance, as persecution of the Church of God, oppression and slavery with destruction of the people of God, confusion, superstition, idolatry, impiety, and as Babylon the first monarchy was destroyed, so shall this wicked kingdom of Antichrist have a miserable ruin, though it be great and seemeth to extend throughout all Europe.” (note m, Rev. 14:8). The circumspection to be found elsewhere in the Geneva Bible only emphasises the provocative nature of the rhetoric of the notes to Revelation, a provocation which is surely intended to rouse the Protestant reader to action. This campaign of provocation explains the gloating tone which enters the annotations when the main text deals with the destruction of the enemies of the “true religion” (“Epistle”, ii, verso). The annotations remark that those of the disobedient who are killed by plagues “were justly destroyed” (note d, Rev. 9:20), and elsewhere gleefully point out that, “The infidels are tormented by hearing the truth preached” (note q, Rev. 11:10). God is praised in the main text with the statement that his judgements are “true and righteous” (Rev. 16:7); this is glossed by the editors, in something of a reach, with, “For as much as thou destroyest the rebels, and preservest thine” (note f). Revelation is, according to its Genevan editors, a vision “which containeth the doctrine of God’s judgements for the destruction of the wicked and comfort of the godly” (note a, Rev. 15:1). On the superficial



evidence of the emphasis found in the Genevan annotations, it would seem that “the destruction of the wicked” is more important than the “comfort of the godly”, but the descriptions of each, though doing so in different ways, fulfil the same purpose of giving comfort to its reader. This seeming tension in the annotations’ rhetoric is perhaps best encapsulated in the note to the phrase about the “glassy sea, mingled with fire” (Rev. 15:2), which is interpreted as “this brittle and inconstant world mixed with fire, that is, troubles and afflictions, but the Saints of God overcome them all, and sing divine songs unto God by whose power they get the victory” (note c, Rev. 15:2). At no point for the Genevan annotations can the reward of the faithful (the “overcoming” of “troubles and afflictions” by “the Saints of God”) be separated from the “victory” over and defeat of the reprobate. However, the annotations are not solely concerned with making the contemporary applications of Revelation clear to the “simple reader”, which Jensen has identified as the Geneva Bible’s target audience. As well as this “innocent” strategy of straightforwardly reinterpreting the message of the main text to apply it to the reader’s life, the annotations cannily and insistently shore up the authority of the text itself. Presumably the aim is to make Revelation’s narrative of comfort not simply a matter of faith but of absolute certainty; the effect is somewhat different. The figure of Christ, “the light that giveth light to everyone that cometh into this world” (note l, Rev. 22:16), is given all possible support in the editorial apparatus; the sceptic might say that such support is excessive. It emphasises his “wisdom, eternity and divinity” (note s, Rev. 1:14) and that “his judgements and ways are most perfect” (note t, Rev. 1:14). His divinity is keenly pointed out: he is “equal God with [his] father, and eternal” (note b, Rev. 
1:17); elsewhere, he is described as “a true and natural man and yet God equal with [his] Father” (note k, Rev. 22:16); and in another passage again, the reader is told that “The eternal Divinity of Jesus Christ is here most plainly declared with his manhood, and victory over death to assure his that they shall not be overcome by death” (note k, Rev. 2:8). The aim of such notes is clearly to affirm to the reader that they “ought nothing to doubt of the salvation of the faithful” (note a, Rev. 19:1). But this zeal has an oddly destabilising effect, at once reinforcing and, in so obsessively reinforcing, undermining the point the editors wish to make. Secondly, and interestingly for the literary project of the Geneva Bible, a great deal of importance is given to the book or text. When the text mentions that an angel (which is glossed as Christ) has in his hand “a little book open” (Rev. 10:2), the annotation eagerly develops its importance: “Meaning the Gospel of Christ, which Antichrist cannot hide, seeing Christ bringeth it open in his hand” (note e). A subsequent note explains that “the minister must receive [the scriptures] at the hand of God before he can preach them to others” (note m, Rev. 10:8). Stressing the authority of the book is understandable; after all, this is the period, as Gribben reminds us, in which “the Reformers replaced an infallible church with an infallible book” (73). Additionally, notwithstanding Jensen’s audience of “simple readers”, perhaps the editors were to some extent writing to themselves in these references to the importance of the text, and in doing so, reinforcing the importance of their own dedicated work in exile. The trumpeting of the authority of the book, however, is a great deal more precarious a position than building up the divinity of Christ; the making of the English Bible, even for as thoughtfully-produced an edition as the Geneva Bible, is a tortured affair, being a translation of occasionally contested or



partial originals, which is additionally far removed from the time of its composition. So much can fall into these gaps, from one language to another or one time to another; a mischievous critic could point out the inherent instability in relying too heavily on such a text. As Jensen has commented, “The translators go out of their way to unravel the difficulties of the ‘hard places’ rather than to shroud them in encoded mystery” (42). Perhaps it is this zeal on the part of the editors, their willingness to “go out of their way” to provide a comforting explanation for every potential bone of contention, that results in one of the most problematic, even fatal, elements of the annotations to the Book of Revelation. So endemic is the editors’ zeal to present Revelation as a narrative of comfort that they don’t seemingly know when to rein themselves in, even when doing so would be to the advantage of their aim. There is a highly problematic, pick-and-choose ethos in these notations whenever the text talks of definite numbers or spans of time which makes for a very destabilising adjunct to the main text. In Rev. 13:1, the significance of the seven heads belonging to the beast is glossed very specifically: it indicates “Rome, because it was first governed by seven Kings or Emperors after Nero, and also is compassed about with seven mountains” (note b). More often, though, a specific number is fudged in the notations: the “two and forty months” (Rev. 13:5) the main text allots to the reign of the beast becomes in the editorial apparatus the far more vague assurance that “Antichrist’s time and power is limited” (note i); similarly, “a thousand, two hundred and threescore days” (Rev. 11:3) becomes merely “a certain time” (note f). Rev. 2:10 matter-of-factly records that “ye may be tried, and ye shall have tribulation ten days”, which specific amount of time is lamely interpreted as “Signifying many times… although there shall be comfort and release” (note q). 
Occasionally these inaccuracies are understandable: the use in Rev. 7:4 of “an hundred and four and forty thousand”, a large number, to enumerate those saved by the living God is notated as “an infinite number” (note g), for example. But when Rev. 15:1 speaks of “seven angels having the seven last plagues”, and the corresponding note glosses this as “Meaning an infinite number of God’s ministers, which had infinite manners of sorts and punishments” (note b), there is a whiff of desperation. It is in Revelation’s use of specific numbers that the editors face their greatest hurdle in achieving their own narrative of comfort. Their inconsistent interpretations (where seven angels symbolise an infinity in one place, but in another, seven heads directly correspond to seven leaders and/or seven hills) severely destabilise the text they are attempting to shore up. Through fudging these numbers the editors attempt to present them as symbols for interpretation, yet the effect is only to draw attention to the specificity of the numbers used in the main text. Similarly, the annotations strive to play down any literal interpretation of Revelation as a prophecy which shows “the things which must shortly be fulfilled” (Rev. 22:6). The editorial apparatus combats the references to the imminence of the second coming which pervade the main text, to wholly self-defeating effect. Christ’s unequivocal assurance at the end of the main text of Revelation that, “Surely, I come quickly” (Rev. 22:20), is appended with a troublesome note: “Seeing the Lord is at hand, we ought to be constant and rejoice, but we must beware we esteem not the length nor shortness of the Lord’s coming by our own imagination” (note p, Rev. 22:20). While attempting to face down the literal and obvious interpretation of the main text, the note only points up how much of a stretch is needed to interpret this straightforward statement in any other way. It is this sort of annotation



which may give rise to the critical view that Jensen has observed of “the Geneva Bible as a text which squashes interpretation rather than inviting it” (37).

The introductory “Argument” to Revelation exhorts the reader to “Read diligently: judge soberly, and call earnestly to God for the true understanding hereof” (114, verso). This essay has attempted to outline the extent to which, for the Geneva Bible, a “true understanding” of the Book of Revelation relies on viewing it as a source of comfort for its “suffering faithful” readership – that is, of comfort in both the “soothing” and “strengthening” senses of the word. This is done through addressing the concerns of the “suffering” audience but also, and more extensively, through attacking the enemies of this audience and thus indirectly exhorting the oppressed reader to persevere. However, the effort to create a comforting narrative is so totalising that occasionally cracks appear in the editorial apparatus which undermine the primary objective to address “the earnest desire that the faithful have to be delivered out of these miseries, and to be joined with their head Christ Jesus” (note q, Rev. 22:20). These are most pronounced in the anxious emphasis placed on the authority of Christ and the text itself, as well as in the annotations’ engagement with the use of numbers in the main text. Such potential interpretative gaps undermine the editors’ attempt to present Revelation as comforting to its readership. Thus, while the aim to create a narrative of comfort entirely pervades the annotations that they produced, the success of the narrative put forth by the annotations to the Geneva Bible is not as complete as the editors might have hoped.

Works Cited

Berry, Lloyd E. Introduction. The Geneva Bible: A Facsimile of the 1560 Edition. Peabody, Massachusetts: Hendrickson, 2007. 1-28.
Betteridge, Maurice S. “The Bitter Notes: The Geneva Bible and its Annotations.” The Sixteenth-Century Journal 14.1 (Spring, 1983): 41-62. Accessed 04/01/10: http://www.jstor.org/stable/2540166.
“comfort, v.” The Oxford English Dictionary. 2nd ed. 1989. OED Online. Oxford University Press. Accessed 08/01/10: http://dictionary.oed.com.elib.tcd.ie/cgi/entry/50044725.
Cummings, Brian. The Literary Culture of the Reformation: Grammar and Grace. Oxford: Oxford University Press, 2002.
Danner, Dan G. “The Contribution of the Geneva Bible of 1560 to the English Protestant Tradition.” The Sixteenth-Century Journal 12.3 (Autumn, 1981): 5-18. Accessed 06/01/10: http://www.jstor.org/stable/2539783.
Dawson, Jane E. A. “The Apocalyptic Thinking of the Marian Exiles.” Prophecy and Eschatology. Ed. Michael Wilks. Oxford: Blackwell, 1994. 75-91.



Dillard, Raymond B. and Tremper Longman. “Daniel.” An Introduction to the Old Testament. Nottingham: Inter-Varsity Press, 2007. 371-396.
“edify, v.” The Oxford English Dictionary. 2nd ed. 1989. OED Online. Oxford University Press. Accessed 08/01/10: http://dictionary.oed.com.elib.tcd.ie/cgi/entry/50072154.
Furniss, Tom. “Reading the Geneva Bible: Notes Toward an English Revolution?” Prose Studies 31.1 (April 2009): 1-21.
Gabel, John B., Charles B. Wheeler, Anthony D. York and David Citino. The Bible as Literature: An Introduction. Oxford: Oxford University Press, 2006.
The Geneva Bible: A Facsimile of the 1560 Edition. Peabody, Massachusetts: Hendrickson, 2007.
Gribben, Crawford. The Puritan Millennium: Literature and Theology, 1550-1682. Dublin: Four Courts Press, 2000.
Hammond, Gerald. The Making of the English Bible. Manchester: Carcanet Press, 1982.
Jensen, Michael. “‘Simply’ Reading the Geneva Bible: The Geneva Bible and Its Readers.” Literature and Theology 9.1 (March 1995): 30-45.
Norton, David. A History of the English Bible as Literature. Cambridge: Cambridge University Press, 2000.
Prickett, Stephen. “The Bible in literature and art.” The Cambridge Companion to Biblical Interpretation. Ed. John Barton. Cambridge: Cambridge University Press, 1998. 160-178.
Vanderkam, James C. “Apocalyptic literature.” The Cambridge Companion to Biblical Interpretation. Ed. John Barton. Cambridge: Cambridge University Press, 1998. 305-322.
Zakai, Avihu. “Reformation, History, and Eschatology in English Protestantism.” History and Theory 26.3 (October 1987): 300-318. Accessed 04/01/10: http://www.jstor.org/stable/2505065.



Environment & Geosciences panel


Prof. Paul Ryan, NUIG (chair)
Prof. Anna Davies, TCD
Dr. David Jackson, The Marine Institute
Dr. Matthew Parkes, The National Museum of Ireland

Judges’ comments

The panel was greatly impressed by the evidence-based nature and the originality of this study, which employs simple technology to purify drinking water. The panel feels that this is an excellent example of undergraduate research, which will have a significant impact on society. The report is of such a high standard that it is worthy of publication in the scientific literature and may well go some small way to fulfilling the requirement for access to sterile drinking water, which Kofi Annan, the United Nations Secretary General, described as “a fundamental human need and, therefore, a basic human right”.



Environment & Geosciences

Solar UV disinfection of drinking water for developing countries

John Murtagh

“Access to safe water is a fundamental human need and, therefore a basic human right. Contaminated water jeopardises both the physical and social health of all people. It is an affront to human dignity.”

– Kofi Annan, UN Secretary General

ABSTRACT

Recently, the field of UV disinfection has gathered some momentum in the wastewater treatment industry. UV disinfection has a low cost and a short contact time, and is effective against parasites such as Cryptosporidium. Solar UV can also be used effectively, and at low cost, to treat drinking water in developing countries. The objective of this project was to compare illumination areas and regimes in order to optimise disinfection. It was found that, in the presence of a photocatalyst, disinfection was not directly proportional to dose, and this led to the proposal of a novel continuous-flow drinking water treatment system for developing countries.

INTRODUCTION

The use of solar UV as a disinfectant has long been known. It was first recorded over 2000 years ago in a Sanskrit text, “Oriscruta Sanhita” (Patwardhan, 1990). The first rigorous scientific study was not carried out until 1877 (Downes and Blunt). It showed that the development of bacteria in a nutrient broth and in urine could be stopped by exposure to the sun. The practical use of this discovery was not fully exploited until Aftim Acra proposed it as a method for the disinfection of drinking water in the 1980s. Acra proposed the use of a batch



reactor (SODIS) method in developing countries, where sunlight is abundant but there are few financial resources for long-term investment (Acra et al. 1980). An estimated 2.5 million people use the SODIS method (EAWAG/SANDEC, 2008). In recent years the focus has shifted towards developing a continuous-flow solar reactor and identifying the parameters that maximise the disinfection of water. Conditions such as turbidity, temperature, materials, exposure time and photocatalysts have all been studied. Researchers have reached a general consensus that solar disinfection requires 3-5 hours of sunlight above 500 W/m² (Oates et al. 2003). Some experiments have indicated that a mechanism other than classical UV kinetics may be acting on the system. Several experiments also found that intermittent flashes of light followed by dark periods enhanced the solar disinfection kinetics. It was proposed that “dark zones” induced a stress on the bacteria which increased the disinfection kinetics (McLoughlin et al. 2006) and that there is an optimum balance between frequency of flash, length of flash and subsequent dark period. This may be explained by the interaction of well-established light repair mechanisms and the recently published “Memory Anti-Bacterial Effect” described by Li et al. (2009). Similar “pulsed” UV light has previously been used in wastewater treatment, drinking water treatment and food preservation. However, this has mostly been done at much shorter wavelengths and much higher intensities. Sunlight reaching the earth is filtered by the ozone layer, which allows only longer-wavelength radiation to pass through. The term “strobe” is used in this project to distinguish the lower-intensity, longer-wavelength disinfection in natural sunlight from the more intense, shorter-wavelength “pulsed” disinfection.
The objectives of this project were to examine this “stroboscopic” effect on disinfection kinetics further by varying the illumination regime while keeping the illuminated area constant, in both the presence and absence of TiO2. A novel disinfection method was proposed that could increase disinfection efficiency through both the induced “stress” and increased heat (dark patches could potentially act as black body absorbers in actual sunlight conditions) and reduce the need for expensive reflector materials. The appearance of a stroboscopic mechanism was confirmed in the presence, but not in the absence, of titanium dioxide. The validity of comparing experiments with respect to UV dose was questioned, and the prominence of specific disinfection mechanisms was inferred.



METHOD

Materials

Fig. 1. Diagram of reactor used.

A small-scale compound parabolic reactor was used, which consisted of six parallel Pyrex tubes (9.6 mm dia.), each 250 mm long, connected by opaque plastic tubing. The half acceptance angle of the reflector was 90° and the concentration ratio was approximately 1. The closed-loop reactor was connected in series to a centrifugal pump (2.8 L/min, giving a fluid velocity of 0.64 m/s) pumping from a 1 L reservoir which was located in the dark. The Reynolds number at this flow rate was calculated to be 5450, meaning that the flow regime was in the smooth turbulent zone. The water undergoing treatment was continuously recirculated throughout the experiment (Misstear and Gill, 2009).
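As a rough check on the hydraulic figures quoted above, the sketch below recomputes the mean velocity and Reynolds number from the pump flow rate and the tube diameter. It assumes that 9.6 mm is the internal diameter of the tubes and that the kinematic viscosity of the water is about 1.14 × 10⁻⁶ m²/s (roughly 15°C). Neither value is stated in the text, and the function names are illustrative only.

```python
import math


def mean_velocity(flow_l_per_min, diameter_m):
    """Mean velocity (m/s) in a circular pipe for a given volumetric flow."""
    flow_m3_per_s = flow_l_per_min / 1000.0 / 60.0
    area_m2 = math.pi * diameter_m ** 2 / 4.0
    return flow_m3_per_s / area_m2


def reynolds_number(velocity_m_s, diameter_m, kinematic_viscosity_m2_s):
    """Re = v * D / nu for flow in a circular pipe."""
    return velocity_m_s * diameter_m / kinematic_viscosity_m2_s


# Values from the text: 2.8 L/min through 9.6 mm tubes.
DIAMETER_M = 9.6e-3   # assumed to be the internal diameter
NU_WATER = 1.14e-6    # assumed kinematic viscosity, m^2/s (about 15 degC)

v = mean_velocity(2.8, DIAMETER_M)
re = reynolds_number(v, DIAMETER_M, NU_WATER)
print(f"velocity = {v:.2f} m/s, Re = {re:.0f}")
```

Under these assumptions the velocity comes out close to the quoted 0.64 m/s and the Reynolds number lands within about one per cent of the quoted 5450. A warmer water temperature would lower the viscosity and raise the Reynolds number.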

Experimental Procedure

Fig. 2. The illumination regime was varied using cardboard. The effective area remained constant.



The reactor was placed in a non-fluorescent box, below a parallel UV-A light source (Philips HB 175) and subjected to a light intensity of 15 W/m² [measured using a PMA2100 radiometer (Solarlight, USA)]. The reactor was first disinfected by pumping 0.1 M ethanol around it for five minutes. This was followed by four flushes of sterile water to ensure that no ethanol remained in the system to cause unwanted bacterial kill-off during the experimental periods. 1 mL of E. coli K-12 was added to 1 L of sterile water, containing the appropriate concentration of TiO2, to give an initial E. coli concentration of approximately 1 × 10⁶ CFU/mL. After priming the pump with the sample, the tubes at either end of the reactor system were placed in the bottle, which acted as the reservoir for the experiment. The bottle was placed in a non-fluorescent chamber and the pump was turned on. From previous flow regime tracer studies on the same system using Rhodamine, it was determined that complete mixing of the E. coli occurred throughout the system after 140 seconds (McLoughlin et al. 2006). At this time, a 100 µL sample was taken from the bottle as the UV light source was switched on. Samples were then taken at the sample points described below. Appropriate dilutions of each sample were made in powers of ten, by adding 900 µL of sterile water to 100 µL of sample. The plate count method was then employed for bacterial enumeration. A vial containing the diluted samples from 0 minutes was kept in the dark until the end of the test and the samples were plated again at this time.

Two distinct types of experiment were carried out: experiments using suspended TiO2 photocatalyst at a concentration of 50 mg/L and experiments with no photocatalyst at all. The sample points were: 0, 15, 30, 45, 60, 120, 180, 240, 300, 360, and 420 minutes.
At each time step, four parallel samples were taken: one was plated immediately, while the other three samples were left in the dark at 35°C for 2 hours, 24 hours and 48 hours before being plated (Misstear and Gill, 2009). The illumination regime was varied by placing pieces of cardboard in different configurations over the reactor. The data were compared by plotting graphs of log10 bacterial decay vs. time where the dose received was the same. When comparing the inactivation achieved with different effective areas, bacterial decay was plotted against cumulative UV dose. The cumulative UV dose was calculated using the following formula:

$$Q_{UV,n} = Q_{UV,n-1} + \frac{\Delta t_n \, UV_{G,n} \, A}{V_T}$$

where $Q_{UV,n}$ and $Q_{UV,n-1}$ are the cumulative irradiated UV energy received per L of sample at times $n$ and $n-1$; $\Delta t_n$ is the time interval between sampling times; $UV_{G,n}$ is the average incident radiation on the irradiated area; $A$ is the irradiated area; and $V_T$ is the total circulating volume. In the case of the UV lamp, the intensity was a constant 15 W/m² throughout the experiment, so the equation became:

$$Q_{UV,n} = \frac{t_n \, UV_{G,n} \, A}{V_T}$$

where $t_n$ is the time at which the sample is taken and $UV_{G,n}$ is a constant value of 15 W/m².
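The dose calculation described above can be sketched in a few lines. This is a minimal illustration rather than the author's own script: the irradiated area of 0.0024 m² (the projected area of one 9.6 mm × 250 mm tube) and the circulating volume of 1 L (taken equal to the reservoir volume) are assumptions, and the function name is hypothetical.

```python
def cumulative_uv_dose(times_s, irradiance_w_m2, area_m2, volume_l):
    """Return the cumulative UV energy per litre (J/L) at each sampling time.

    Applies Q_n = Q_(n-1) + dt_n * UV_n * A / V_T, where irradiance_w_m2[i]
    is the average irradiance over the interval ending at times_s[i].
    """
    doses = []
    q = 0.0
    previous_t = 0.0
    for t, uv in zip(times_s, irradiance_w_m2):
        q += (t - previous_t) * uv * area_m2 / volume_l
        doses.append(q)
        previous_t = t
    return doses


# Sampling times from the text (minutes after time zero), in seconds.
sample_minutes = [15, 30, 45, 60, 120, 180, 240, 300, 360, 420]
times_s = [m * 60 for m in sample_minutes]

LAMP_W_M2 = 15.0   # constant lamp intensity quoted in the text
AREA_M2 = 0.0024   # hypothetical irradiated area (one tube, 9.6 mm x 250 mm)
VOLUME_L = 1.0     # assumed total circulating volume

doses = cumulative_uv_dose(times_s, [LAMP_W_M2] * len(times_s), AREA_M2, VOLUME_L)
for minutes, dose in zip(sample_minutes, doses):
    print(f"{minutes:4d} min: {dose:7.1f} J/L")
```

With a constant lamp intensity the running sum reduces to the simplified form, so the dose at any sampling time scales linearly with the irradiated area. Configurations with different areas therefore share a time axis but not a dose axis.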



Bacterial Preparation and Enumeration

E. coli K-12 (ATCC 10798) was used in all experiments and was stored in freeze-dried form at -80°C before being prepared to give an E. coli concentration of approximately 1 × 10⁹ CFU/mL. A single colony of E. coli was transferred from a streaked plate to 50 mL of Luria broth (Sigma-Aldrich, St. Louis, MO). This was incubated for 18 hours at 37°C and then centrifuged at 3000 rpm for 12 minutes. The broth was poured off and replaced with 50 mL of sterile water and the sample was centrifuged again. This water was poured off, again replaced with 50 mL of sterile water, and centrifuged. When this water was poured off, a pellet of E. coli was left at the bottom of the vial. 5 mL of sterile water was added to this to give an E. coli concentration of approximately 1 × 10⁹ CFU/mL. The bacteria were then refrigerated at 4°C in suspension and used within one week of preparation. This procedure was repeated at the beginning of every week.

The number of viable cells in the water samples taken during the experiment was determined by plating 20 µL of an appropriate dilution on Luria agar (Sigma-Aldrich, St. Louis, MO) and counting colonies after incubation at 37°C for 18 hours, using the spread plate method. Quality control and assurance guidelines described in the Standard Methods (APHA, 1999) were strictly followed, and all samples were plated in triplicate. The method detection limit (MDL) was 5 CFU/mL.
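The enumeration step can be illustrated with a short sketch that converts triplicate plate counts into CFU/mL and then into a log10 reduction, the quantity plotted in the results. The colony counts and dilution factors below are invented purely for illustration, and the function names are hypothetical. Only the 20 µL plated volume and the tenfold dilution steps come from the text.

```python
import math


def cfu_per_ml(colony_counts, dilution_factor, plated_volume_ml):
    """Estimate CFU/mL from replicate plate counts.

    dilution_factor is the total dilution of the plated sample
    (1 for undiluted, 10 for one tenfold step, 100 for two, and so on).
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    return mean_colonies * dilution_factor / plated_volume_ml


def log_reduction(initial_cfu_ml, current_cfu_ml):
    """log10(N0 / N): the number of decades of inactivation."""
    return math.log10(initial_cfu_ml / current_cfu_ml)


PLATED_VOLUME_ML = 0.020  # 20 uL per plate, as in the text

# Hypothetical triplicate counts, for illustration only.
n0 = cfu_per_ml([18, 22, 20], dilution_factor=1000,
                plated_volume_ml=PLATED_VOLUME_ML)
n60 = cfu_per_ml([9, 11, 10], dilution_factor=10,
                 plated_volume_ml=PLATED_VOLUME_ML)

print(f"N0 = {n0:.3g} CFU/mL, N60 = {n60:.3g} CFU/mL")
print(f"log10 reduction after 60 min = {log_reduction(n0, n60):.2f}")
```

Averaging the three plates before scaling keeps the estimate tied to the triplicate plating described above. Choosing the dilution so that plates carry a countable number of colonies is left to the operator.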

Photocatalyst Preparation

Titanium dioxide was used as the photocatalyst in all of the photocatalytic experiments. The TiO2 was supplied by Degussa and was of the type P25 Aeroxide (80:20 anatase:rutile). The appropriate mass of TiO2 was measured out using a four-figure microbalance (Cahn C-33) and mixed with 1 L of deionised water in an autoclave bottle. The contents of the bottle were then sterilised by autoclaving [HiClave HV-25 autoclave (HMC, Japan)] for three hours at 105°C. Preliminary experiments were conducted under the UV lamp to decide upon a suitable concentration of TiO2 to be used in the experiments. The reactor was dosed with an initial concentration of 1 × 10⁶ CFU/mL of E. coli K-12 and with 50 mg/L of suspended TiO2.



RESULTS AND DISCUSSION 1. Photocatalytic – Strobed vs. Continuous Illumination (1 Pipe Effective Area)

Initial Die-Off

Fig. 3. A comparison of continuous UV illumination (blue) against three strobed configurations. “Openings” refers to the number of times the flow passes in and out of the light. The horizontal part of the 18 Openings series indicates that the MDL has been reached.

Initial plating of samples at the time intervals shown indicates an increase in disinfection with a corresponding increase in openings. Disinfection is more effective for continuous illumination compared to 6 and 12 openings, but less effective than 18 openings. It appears as though slightly more than 6 openings could be equivalent to the disinfection observed in continuous illumination. All series are quite similar until approximately an hour into the experiment, when differences become noticeable.



Residual Disinfection

Fig. 4. Continuous illumination.

Fig. 5. 6 Openings.



Fig. 6. 12 Openings.

Fig. 7. 18 Openings.

Samples were plated at the time intervals shown, two hours later, 24 hours later and 48 hours later. In general, disinfection seemed to follow the trend established in the initial die-off section



with similar disinfection efficiencies. No regrowth of the bacteria was observed in any of the configurations, which was to be expected given the “residual disinfection effect” documented previously (e.g. Rincon and Pulgarin, 2007).

Discussion 1

From Figure 3 it appears that there is a number of flashes approximately equivalent to continuous irradiation. Below this number of flashes disinfection seems to be less effective, while above it disinfection appears to be more effective. While it is unclear what may be causing this, one proposed explanation is that there may be a trade-off between the direct and indirect disinfection mechanisms. Direct disinfection is a physical process, which takes time to occur. This form of disinfection may be favoured in continuous irradiation. Indirect disinfection is a photochemical process requiring much less time (McLoughlin, 2006). There may be a limiting concentration of reactive oxygen species (ROS) reached in the illuminated patches at any one time. Spreading out the dose into many flashes may allow the more efficient production, reaction and re-production of these ROS. This process would be favoured by strobed configurations and may explain the similarity of disinfection rates in Figure 4. There seems to be little or no difference in the disinfection rate until about an hour into the experiment (Figure 3). This may be related in some way to a threshold direct UV dose which is reached in all cases. After this point, indirect mechanisms may become more prominent, explaining the discrepancy in disinfection rates.

One thing is clear from Figure 3: a variation in the illumination regime can result in large differences in disinfection efficiency. Previously, researchers have related different experiments using cumulative UV dose. These results call into question the validity of this method for the purposes of comparison. Variations in the illumination regime can affect the disinfection efficiency at a constant dose. Hence, experiments should not be compared using cumulative dose unless equivalent illumination patterns are used.
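To give a feel for what “spreading out the dose into many flashes” means in this reactor, the sketch below estimates how long a fluid element spends in each illuminated opening. It rests on assumptions that the text does not state: the one-pipe effective area is treated as 250 mm of illuminated tube split into equal openings, and the water is treated as moving in plug flow at the quoted mean velocity of 0.64 m/s. The function name is hypothetical and the dark intervals are not modelled.

```python
def flash_duration_s(illuminated_length_m, openings, velocity_m_s):
    """Time a fluid element spends in one opening, assuming equal-length
    openings and plug flow at the mean velocity."""
    if openings < 1:
        raise ValueError("openings must be at least 1")
    return illuminated_length_m / openings / velocity_m_s


ILLUMINATED_LENGTH_M = 0.250  # one 250 mm tube ("1 pipe effective area")
VELOCITY_M_S = 0.64           # mean velocity quoted in the Materials section

for n in (1, 6, 12, 18):
    flash = flash_duration_s(ILLUMINATED_LENGTH_M, n, VELOCITY_M_S)
    per_pass = flash * n  # total illuminated time per pass is unchanged
    print(f"{n:2d} openings: {flash * 1000:.1f} ms per flash, "
          f"{per_pass * 1000:.1f} ms illuminated per pass")
```

Under these assumptions the illuminated time per pass is the same for every configuration, while the individual flashes shorten from a few hundred milliseconds to a few tens of milliseconds as the number of openings rises. This is the sense in which the dose is constant but the regime differs.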



RESULTS AND DISCUSSION 2. Comparison of Different Illumination Areas

Initial Die-Off

Fig. 8. Comparison of 1, 3 and 6 Pipe Effective Areas with respect to experimental time. The horizontal parts of the graphs relate to the two different MDLs that were used.

Fig. 9. Comparison of 1, 3 and 6 Pipe Effective Areas with respect to cumulative UV dose.



Residual Disinfection

Fig. 10. (Top) 1 Pipe Area with 18 Openings. (Middle) 6 Pipe Area. (Bottom) 3 Pipe Area.



Discussion 2

Figure 8 shows that there is no significant difference between the three disinfection rates with respect to experimental time. When compared with respect to cumulative dose received (Figure 9), efficiency increases with reducing area, which seems counterintuitive. This may be due to a minimum threshold dose being achieved in all instances, but this seems unlikely when considering the results obtained in the previous section. Toxic contamination, induced pump stress and/or temperature have been ruled out in previous experiments and are assumed to be discountable here, as the results repeat those obtained by McLoughlin (2006). Therefore, the explanation may lie in an induced “shock” mechanism caused by the strobed effect, or in the enhancement of the photochemical process by spreading the dose over a larger area.

The results of this experiment could be exploited in the design of a continuous flow reactor. As it seems to be just as effective to have a smaller strobed dose spread over the same area as a larger continuously irradiated dose, a dull black finish could be applied to pipes to create this stroboscopic effect. The dull black patches could potentially act as black body absorbers and heat up the water, inducing the synergistic UV and temperature effect. Until now, fixed photocatalysts have always proved less effective than suspended catalysts (Dunlop, 2002). Perhaps by combining the stroboscopic mechanism and the induced synergistic effect of black body absorbers with a fixed photocatalyst, adequate disinfection may be reached. This could potentially solve the problem of catalyst removal and re-growth of bacteria; however, further testing is necessary to determine the viability of such a configuration. No significant differences were noted in the replates and, as expected, no regrowth took place.

RESULTS AND DISCUSSION 3. Photolytic Disinfection

Initial Die-Off

Fig. 11. Photolytic disinfection using both varying illumination regimes and varying areas with respect to experimental time.



Fig. 12. Photolytic disinfection using varying illumination regimes with respect to cumulative UV dose.



Residual Disinfection

Fig. 13. (Top) 3 Pipe Effective Area: 18 Openings. (Middle) 3 Pipe Effective Area: 6 Openings. (Bottom) 3 Pipe Effective Area: 12 Openings.



Fig. 14. (Top) 6 Pipe Effective Area: Fully Open. (Bottom) 3 Pipe Effective Area: 42 Openings.

Discussion 3

It was found that the disinfection efficiency of a single pipe was almost negligible for photolytic treatment, and so the irradiated area was increased to a minimum of three pipes for testing. The stroboscopic mechanism had little, or even a somewhat detrimental, effect on the disinfection rates, suggesting that the strobed mechanism may be confined to photocatalytic treatment. The replates appear to agree with the initial die-off graphs, with slightly more efficient disinfection taking place in the more continuously irradiated configurations. This would seem to support the argument in the first discussion that the stroboscopic mechanism could be a result of a more uniform and spread-out production of ROS in the reactor, as ROS would be produced much faster in photocatalytic treatment.



If true, this would also suggest that direct UV damage may be the prominent disinfection mechanism in photolytic treatment, while indirect mechanisms account for the increased efficiency of the strobed photocatalytic treatment after a threshold UV dose has been reached (1 hour of illumination in this case). Figure 12 shows that photolytic disinfection efficiency is higher in a smaller area than in a larger one when compared with respect to cumulative UV dose. This would seem to support the theory that disinfection cannot be directly related to UV dose.

SOURCES OF ERROR

> In some of the early experiments, a higher MDL (50 CFU/mL) was used due to greater than necessary dilution. Once this mistake was spotted, it was rectified to the more acceptable value of 5 CFU/mL for subsequent tests.
> Due to its high variability, microbial testing generally requires three repeats of each configuration to ascertain representative values. Due to time constraints, not all experiments were repeated three times and hence some of the values obtained were less representative than others.
> The light intensity was measured and found to be slightly less than 15 W/m² at the sides of the reactor, which may give rise to some inaccuracies in the figures.

CONCLUSIONS

> Varying the illumination regime while keeping the irradiated area constant did have an effect on photocatalytic disinfection, with the more strobed configurations being favoured once a threshold number of flashes was exceeded. Possible explanations for this increase in disinfection efficiency are an induced stress caused by the transition from dark to light, or the spreading of the dose over a larger area.
> Varying the illumination regime while keeping the irradiated area constant in photolytic treatment had little, or a somewhat detrimental, effect on disinfection. This may indicate that the stroboscopic mechanism is related to the addition of a photocatalyst.
> These results call into question the validity of comparing photocatalytic experiments with respect to UV dose, as variations in the illumination regime can cause discrepancies in disinfection rates. The validity of these comparisons has previously been questioned by Rincon and Pulgarin (2007).
> Results obtained were in agreement with previous findings by McLoughlin (2006). In particular, for photolytic disinfection it was found that a smaller area produced more effective disinfection. This questions the assumed direct relationship between dose and disinfection.
> No significant regrowth was observed in any of the experiments. However, further testing is necessary to confirm this, as many of the photolytic configurations did not reach full measurable disinfection.
> The fact that there is a negligible difference between a larger continuously irradiated dose and a smaller strobed dose spread over the same area could be exploited in the design of a continuous flow reactor. Black body absorbers could be used to increase the synergistic UV and temperature



effect. This could also be combined with a fixed, as opposed to a suspended, photocatalyst.
> Experimental evidence suggests that direct UV may be the more prominent disinfection mechanism in photolytic treatment (neglecting temperature). Indirect mechanisms seem to account for the discrepancies in disinfection rates in photocatalytic treatment, which seem to occur after a threshold UV dose has been reached (1 hour into the experiment in this case). This appears contrary to Reed (2004), who claimed that indirect damage is the prominent disinfection mechanism in UV disinfection. Further testing is necessary to confirm this.

RECOMMENDATIONS FOR FUTURE WORK

> An examination of bacterial UV resistance with respect to a common illumination pattern. As work by Rincon and Pulgarin (2003) and many papers on commercial pulsed UV have shown, the nature of the organism has a large bearing on its resistance.
> Further exploration of the stroboscopic mechanism to ascertain optimum parameters of frequency of flash, length of flash and dark period.
> Modelling the increase in temperature due to the action of black body absorbers (dark patches on pipe) to estimate the increase in synergistic effect.
> Investigation into the use of extremely turbid water (>350 NTU) to improve disinfection.
> Further investigation into the viable but non-culturable (VNC) theory using non-selective agar media.
> The use of exogenous photosensitisers (such as Rose Bengal) to increase disinfection rates, and investigation of their use as a replacement for fixed photocatalysts.
> Use of non-photosensitive antioxidants to quantify to some degree what effect the formation of ROS has on the disinfection rates.
> Investigation into the potential harm of endotoxins and exotoxins formed after inactivation of bacteria.
> An investigation into the possible leaching of DEHP and DEHA into water from PVC bottles used in the SODIS method.
> An investigation into the presence of iron in water and its effect on UV transmission.
> Use of contaminated sewage water as opposed to suspensions of laboratory isolates of faecal bacteria to simulate real conditions (Acra et al. 1990).
> An investigation into the development of a reactor with a slightly concentrating reflector, sized to raise the temperature of water passing through the reactor and so promote the synergistic effect.
> The construction of a “greenhouse” type pipe to increase the temperature in the pipe and promote the synergistic effect.
> Field trials in developing countries.
> Investigation into the potential synergistic effect of strobed UV and elevated temperature.
> Exploration of the combination of a fixed photocatalyst, strobed mechanism and synergistic effect induced by black body absorbers.



Fig. 15. TOP: Diagram of proposed strobed treatment system with black patches to increase heat. BOTTOM: Illustration of the synergistic effect of temperature and UV radiation exploited in the above design. Natural die off (purple), Irradiation inactivation (cream) and synergistic effect (green) (Acra et al. 1989; Wegelin et al. 1994, Marino et al. 1995).



Historical Studies panel


Prof. Sean Connolly, QUB (chair)
Dr. Neal Garnham, UU
Prof. Robert Gerwarth, UCD
Prof. Edward James, UCD
Dr. Micheál Ó Siochrú, TCD
Prof. Alan Sharp, UU

Judges’ comments

This essay stands out as the most impressive in an excellent top cohort. Looking beyond simplistic assessments of the official attitude to women as either progressive or reactionary, it places policy realistically in the context both of broader problems within the Soviet state and of developments in interwar Europe as a whole. In doing so, it casts light, not just on the position of women, but on the much-debated question of continuity and discontinuity between the regimes of Lenin and Stalin. It is a fine piece of historical writing, drawing on a wide range of both primary sources and secondary literature, and showing a sympathetic insight into the reality of ordinary lives in a traumatic period.


Historical Studies

Womanhood under Stalin: selfhood under threat? A critical exploration of the Soviet sexual counter-revolution of the 1930s

Joanne Davies

Ostensibly, Nicholas Timasheff’s influential characterisation of early 1930s Soviet social policy as indicative of a ‘Great Retreat’ from the revolutionary values of the previous decade constitutes a compelling indictment of Stalinist Realpolitik.1 Certainly, the ‘Decree of the Prohibition of Abortions’ enacted in June 1936 severely diminished the formal sexual autonomy that had been afforded to Russian women in the previous decade. By recriminalising abortion, incentivising maternity and discouraging divorce, the new law aimed to combat the ‘light minded attitudes towards the family’ that had developed in the years after the Revolution.2 Yet, to suggest that the sexually conservative path forged in the Stalin era represented a reversion to traditional Tsarist values overestimates the ideological liberalism inherent in earlier Bolshevik reforms. In truth, the ‘sexual revolution’ of the 1920s, and the subsequent ‘sexual counter-revolution’ of the 1930s, were both expedient reactions to episodes of profound domestic turmoil. In both instances, these reactions were catalysed by an imperative need to reconfigure social relations. However, their potency was tempered by a strong element of residual sexual conservatism common to the Leninist, Stalinist and, indeed, the mass outlook. Thus, the impetuses behind the regressive measures of the 1930s are best conceived in terms of basic ideological continuity with the earlier reforms, mitigated by Stalin’s own imperative desire to consolidate the regime, and will be examined as such.

1 Nicholas Timasheff, The Great Retreat: the growth and decline of communism in Russia, (New York, 1946).
2 ‘Law on the prohibition of abortion’ in P. Boobbyer (ed), The Stalin era, (London, 2000), pg. 158.



As Richard Stites astutely comments, ‘the Thermidorian mood was born with the Revolution and was given voice as early as 1920 by the leader of the Revolution himself’.3 That the year 1920 may credibly be perceived as an acme of Bolshevik sexual liberalism therefore highlights the extent to which Lenin’s attitudes have been misrepresented in this respect. Undoubtedly, the Bolshevik government’s legalisation of abortion in November 1920 marked a seminal moment in its appreciation of women’s needs. Yet, the law was not designed to afford women supreme mastery over their bodies, nor to legitimise promiscuous behaviour. More a grudgingly acquiescent gesture than radical doctrine, the new law reluctantly accepted the ‘serious evil’ of abortion as a necessary interim ill.4 Couched in strikingly paternalistic terms, the decree simply aimed to protect vulnerable women from the ‘mercenary and often ignorant quacks’ who had traditionally performed terminations.5 Notably, the language employed in the 1936 decree is similarly protective but actually appears less moralising. Abortion was not depicted as an outright ‘evil’, but rather as an unwelcome source of ‘proven harm’ to the health of the much-valued Soviet woman.6 Moreover, continuity with Lenin’s long-term aims was explicitly cited as the primary reason for the retraction of the law, as: “The abolition of capitalist exploitation in the U.S.S.R., the growth of material wellbeing and the gigantic growth of the political and cultural level of the toilers make it possible to raise question of a revision of the decision of the People’s Commissariats of Health and Justice of November 18, 1920.”7 This refusal to declare counter-revolutionary intention may be interpreted as Stalinist whitewash, but cannot be assumed to be so. Contemporary propaganda illustrated Stalin’s fervent desire to be viewed as a viable successor to Lenin in the continuum of communist thought.
With respect to the abortion issue, this does not appear an entirely fanciful aim. Whilst Lenin had envisaged the ‘gradual disappearance of this evil’ rather than its forcible abolition, both regimes were motivated by a paternalistic pragmatism, albeit in markedly unequal measure.8 Such pragmatism thus played a pertinent role in Stalin’s decision to repeal the 1920 decree. By the early 1930s grave concerns had been voiced from influential quarters with regard to the declining Soviet birth-rate.9 Artificial contraception had been legalised in 1923 yet, partly due to an ongoing rubber shortage, abortion remained the preferred method of family planning for many

3 Richard Stites, The women’s liberation movement in Russia, (Princeton, 1978), pg. 376.
4 ‘The Soviet Decree on the Legalisation of Abortion (1920)’ in S. Groag Bell and K. Offen (eds), Women, the family and freedom: the debate in documents, (Stanford, 1983), pg. 302.
5 Ibid.
6 Boobbyer, pg. 157.
7 ‘Law on the prohibition of abortion’ in Rudolf Schlesinger (ed), The family in the U.S.S.R.: documents and readings, (London, 1949), pg. 270.
8 Bell and Offen, pg. 302.
9 R. Thurston, ‘The Soviet family during the Great Terror, 1935-1941’ in Soviet Studies, vol. 43 (1991), pg. 557.



Soviet women, particularly urban workers.10 As a result, the financially and socially cumbersome impact of free abortion was manifested in terminations exceeding live births in the major cities by the late 1920s.11 In Moscow, statistics for the year 1934 demonstrated that abortions outstripped live births by a factor of 2.7, a not atypical representation of the extent of the problem in urban areas.12 Coupled with a fall in the national birth-rate from 42.2 births per thousand people in 1928 to just 31.0 per thousand in 1932, these statistics alarmed the Kremlin.13 Indeed, so considerable was their fear that Stalin made a rare public statement on the subject, commenting that maternity was ‘not a private affair, but one of great social importance’.14 Thus, female fertility became constructed as a ‘state task’.15 Preoccupation with population size, and the attendant decline in female sexual emancipation, was ‘almost universal’ in inter-war Europe.16 Demographically devastated by the unprecedented death toll of World War I, many states grew increasingly insecure. As the threat of renewed hostilities intensified over the course of the 1930s, so too did popular concerns. In the Soviet Union, Stalin’s concurrent inception of a range of physically oppressive economic and political measures, including the Five Year Plans, collectivisation and the Terror, further mortified the population at large. Thus, having accurately predicted the resumption of global conflict from as early as February 1931, Stalin spent the ensuing decade intent on meeting the enormous industrial, political and social challenge required to maintain his position of power. Crucially, the intensity of Soviet internal weakness during the 1930s actually helped tone down Stalin’s modification of matters sexual, by ensuring that women remained in employment.
Although in many respects analogous to initiatives adopted by neighbouring states, Stalin’s unique brand of social policy remoulded Soviet attitudes to sex, marriage and family in the image of totalitarian service to the state. Stalin’s shrewd recognition of women’s dual instrumental value as a ‘huge army of workers... called upon to bring up our children’ ensured that the Soviet backlash against ‘the new woman’ of the 1920s was actually less intense than in many states.17 Throughout the 1930s women’s participation in the urban labour force increased significantly, rising from 29% in 1928 to 40% by 1940.18 Therefore, whilst ‘fear of war made the regime pro-natalist’ it did not

10 Wendy Goldman, ‘Women, Abortion and the State: 1917-1936’ in Barbara Evans Clements, Barbara Alpern Engel and C. Worobec (eds), Russia’s women: accommodation, resistance, transformation, (Berkeley, 1991), pg. 247.
11 Goldman, (1991), pg. 263.
12 Chris Ward, Stalin’s Russia, (London, 1999), pg. 236.
13 D.L. Hoffmann, Stalinist values: the cultural norms of Soviet modernity, 1917-1941 (New York, 2003), pg. 98.
14 Choi Chaterjee, Celebrating women: gender.
15 Hoffmann, pg. 101.
16 Richard Overy, The inter-war crisis in Europe, (London, 1994), pg. 43.
17 Joseph Stalin, ‘Report to 17th Party Congress’ in Lawrence & Wishart (eds), Women and Communism: selections from the writings of Marx, Engels, Lenin, and Stalin, (London, 1950), pg. 87.
18 E. Mawdsley, The Stalin years: the Soviet Union, 1929-1953, (Manchester, 1998), pg. 49.



incite a wholesale sexual counter-revolution.19 Women did not assume a solely domestic function, but ‘were to be citizens, workers, housekeepers, wives and mothers. They were to serve their families and their nation’.20 A further facet of the ‘sexual counter-revolution’, indeed one viewed as more integral to the ‘woman question’ than female reproductive autonomy, was that of marital relations within the Soviet family. Wilfully selective accounts of the 1920s have often misrepresented the era as a period of libidinous ‘sexual excesses’.21 Comparatively radical exponents of ‘free love’, such as Alexandra Kollontai, have regularly been cited as paradigmatic of Bolshevik attitudes to personal relationships.22 Yet, Lenin’s belief that ‘promiscuity in sex matters is bourgeois’ corroborates the view that ‘Victorian attitudes about morality and the importance of marriage’ prevailed in Party circles well beyond the Revolution.23 Similar assessments of Stalin’s sexual conservatism cannot, therefore, be construed as evidence of his anachronistic subversion, but rather as a relatively average stance for someone of his age and position. Yet, to casually equate Stalin’s attitude with that of Lenin fails to account for his repeal of the progressive Family Codes, introduced under the Bolsheviks in 1918 and 1926, that had secularised marriage, recognised de facto unions and drastically simplified divorce, making it widely accessible. Rather, the motivation for the more restrictive, authoritarian matrimonial measures enacted in 1936 lay in Stalin’s attempt to consolidate power in a state riven with social instability.
Whilst exacerbated by Stalin’s domestic policies, much of the discontent was caused by the myopia of the ‘intellectuals and jurists’ of the 1920s, who had reconceived ‘gender and generational relations’ with little appreciation of the practical outcome.24 In effect, their extension of formal sexual equality within relationships was often appropriated by opportunistic males as free sexual licence. Paraskeva Ivanova’s account of her sexual exploitation by a married Party superior is typical of the problems generated by the ambiguity that surrounded the Bolshevik stance on casual sexual relations, as she was admonished for ‘playing the bourgeois lady’ when she spurned her colleague’s advances.25 Sexual demands, couched in the rhetoric of socialism, often compelled vulnerable women to accept sexual compliance as another state task. Notably, this was able to occur as women, although legally equal to their male co-workers, largely occupied low-status subordinate roles. As the ‘sexual revolution’ of the 1920s had not done enough to change attitudes at grassroots level, ‘millions of women saw their lives ruined by Don Juans in communist garb’.26

19 M. Hutton, Russian and Western European women, 1860-1939, (Lanham, 2001), pg. 268.
20 B. Evans Clements, Bolshevik women, (Berkeley; Cambridge, 1997), pg. 275.
21 Stites, pg. 346.
22 A. Kollontai, Selected Works, (London, 1977), pg. 67.
23 Hoffmann, pg. 92.
24 R.G. Suny, (ed), The structure of Soviet history, (New York, 2003), pg. 130.
25 P. Ivanova, ‘Why I do not belong in the Party’ in S. Fitzpatrick and Y. Slezkine (eds) In the shadow of the Revolution: life stories of Russian women from 1917 to the Second World War (Princeton, 2000), pg. 214.
26 N. Timasheff, ‘The Family, the school, the church: the pillars of society shaken and re-enforced’ in C. Ward (ed), The Stalinist dictatorship (London, 1998), pg. 305.



Appreciably, by the 1930s, the tenets laid down a decade earlier were in need of some revision. Yet the 1936 decree, which strictly regulated divorce and enforced the payment of alimony, was not a magnanimous gesture born solely out of concern for women’s welfare. Indeed, Pravda’s gushing espousal of the new law as evidence of ‘Stalin’s concern for mothers and children’ only serves to further emphasise this.27 In truth, the issue of alimony and spousal desertion carried considerable political currency, particularly amongst women, as marital breakdown was endemic: 80% of all wedded unions registered in Moscow had ended in divorce by 1929.28 Explicit acknowledgment of the populist dimension of the legislation can be found enshrined within the decree itself. Its authors unabashedly state that, through the decree, ‘the Soviet Government responds to numerous statements made by toiling women’ in favour of regressive measures.29 Whilst public opinion never dictated policy, Stalin was willing to court it in order to consolidate his popularity with women. The voluminous number of letters from women in support of the legislation, published in Izvestiia and Pravda during the public debate on the issue, is testament to his success in this respect. Although during the heady days of Revolution some radical, albeit vociferous, elements within the Party had predicted that the family would begin ‘withering away’ under socialism, the majority of young people still aspired to companionate marriage, and thus provided a captive audience.30 Significantly, as the 1936 decree did not strip such women of the socio-economic status afforded by the Revolution, most did not deem the measures counter-revolutionary.
In fact, Barbara Evans Clements argues that, as many Russian women resented the Bolshevik insistence that they surrender family responsibilities in order to achieve emancipation, ‘the new Soviet woman was an upgrade’.31 Fundamental to women’s largely positive responses to the ‘sexual counter-revolution’ of the 1930s was the regime’s recognition of their enduring traditionalism, without any attempt to regress to the pre-revolutionary status quo. In this respect, the regime was motivated by a practical desire to strengthen family bonds in service of the state, rather than at its expense. Goikhbarg, the primary author of the 1918 Family Code, had attempted to make the legislation ‘almost completely free of male egotism’ due to his sympathy for female emancipation.32 In constructing the 1936 revision, Stalin adopted a relatively similar stance, albeit motivated by unadulterated expediency, which Stites summarised thus: “(Restoration of the family) implied self-discipline and the attachment to a single-job, responsibility, and the good ‘bourgeois’ lifestyle habits; and perhaps most crucial, it provided a setting wherein men’s and women’s energy could be directed to production.”33

27 J. Brooks, Thank you comrade Stalin! Soviet public culture from Revolution to Cold War (Princeton, 2000).
28 W. Goldman, Women, the state and Revolution, (Cambridge, 1993), pg. 297.
29 Schlesinger, pg. 272.
30 M. Buckley, Women and ideology in the Soviet Union, (New York; London, 1989), pg. 128.
31 B. Evans Clements, Daughters of Revolution: a history of women in the U.S.S.R. (Arlington Heights, 1994), pg. 75.
32 Stites, pg. 363.
33 Stites, pg. 385.



As a result, the apparent ‘sexual counter-revolution’ was also stimulated by the desire to rehabilitate male roles, albeit to a lesser degree than female function. Significantly, the regime’s approach was much less reactionary than the attitudes espoused by some exponents of a return to traditionalism. L.V. Kashkin’s 1936 ‘Proposals on Demographic Policy’ provides a useful point of contrast. Kashkin writes, ‘it is the woman’s duty to take care of the health of the new generations... As for the man, he has no duty to the new generation, since children are produced not by him but by the woman’.34 Such a simplistic analysis of the familial division of labour was by no means uncommon; Trotsky betrayed a relatively similar perspective when he wrote that even ‘the boldest revolution... cannot divide equally between [the sexes] the burden of pregnancy, birth, nursing and the rearing of children’.35 Yet, whilst women were expected to complete a ‘double shift’ by working both at home and in industry, the new law ‘did not enshrine the rights of the father’ within the domestic realm.36 Pravda’s lead editorial of 9th June 1936 underscored the Party’s attempts to emphasise male responsibility within the reformed approach to sexual and familial relations.37 Contrasting communist marriage with unions forged by capitalists, the writer concluded that ‘without deep and serious love, without the bliss of motherhood and fatherhood, the personality of both individual and society is incomplete. Communism makes for whole and happy men’.38 Fidelity and fatherhood were emphasised as natural attributes, inherent to the disposition of any good communist.
McCauley suggests that Stalin viewed familial loyalty as a litmus test for loyalty to the Party, noting that ‘he did not trust a comrade who divorced his wife’.39 Similarly, Stalin’s nephew, Budu Svanidze, recalled his uncle’s view that ‘the danger of loose morals is the gravest threat there is’.40 In his affectionate memoir, Svanidze also recorded Stalin’s opinion that, ‘the French Revolution collapsed because of the degeneration of the morals of its leaders, who surrounded themselves with loose women from the Palais Royal’.41 Whilst Svanidze dramatically overstated his uncle’s moral integrity, Stalin’s recognition of the benefits of an immovable and unchanging family framework remains credible. Furthermore, increased male visibility in the 1930s family propaganda is also of significance. Men’s presence in domestic contexts demonstrated that the sexual counter-revolution was not a backlash against women, but rather the opportunistic promulgation of new gender roles within a nuclear family set-up – a novel concept within Russian society.42 Pictures of fathers and children, infants in day-care and women at work dominated the pages of Pravda in 1936 at the expense

34 L.V. Kashkin, ‘Proposals on the demographic policy’ in L. Siegelbaum and A. Sokolov (eds) Stalinism as a way of life (New Haven, 2000), pg. 203.
35 L. Trotsky, The revolution betrayed: what is the Soviet Union and where is it going? (London, 1973), pg. 144.
36 Hoffmann, pg. 89.
37 Boobbyer, pg. 156.
38 Ibid.
39 M. McCauley, Stalin and Stalinism, (Harlow, 1983), pg. 46.
40 B. Svanidze, My Uncle Joe, (London, 1952), pg. 96.
41 Ibid.
42 Clements (1997), pg. 275.



of maternal imagery.43 Moreover, as ‘Stalinists rejected male leadership as an obscurantist principle’ they refused to restore to husbands the supreme authority that they had held over their wives under the Tsar, mitigating the counter-revolutionary effect of the revisions.44 Instead, the material and emotional rewards of familial devotion were conveyed to both sexes. Konstantin Zotov’s utopian depiction of rural family life in the propaganda poster captioned, ‘Every Collective Farm Peasant or Individual Farmer now has the Opportunity to Live Like a Human Being’ demonstrated this Soviet ideal.45 Whilst Stalin’s revisionist approach to sexual matters was not an entirely chauvinistic endeavour, he channelled popular approval for his reforms into the consolidation of his authoritarian role as national paterfamilias. In doing so, he strengthened his own political authority by interweaving his ‘cult of personality’ within the emergent ‘cult of family’. As Elizabeth Waters argues, ‘the Stalinist regime sought to bolster its legitimacy through a semblance of patriarchal stability’.46 As the ‘new family was cast in the image of the state’ fatherly domestic authority deferred to the Party, and thus to its increasingly paternalistic leader.47 Whilst many other aspects of the sexual reforms demonstrated at least tacit continuity with the tenets of the Revolution, Stalin’s emergence as the nation’s ‘beloved father and leader’ owed everything to the precedent set by the Tsarist regime.48 As paterfamilias, Stalin exerted ultimate authority within the system. An embittered and exiled Trotsky critiqued Stalin’s reversion to the sexual values of old, linking the policy change to Stalin’s incarnation as supreme authority figure.
In Thermidor in the Family, Trotsky wrote that ‘the most compelling motive for the present cult of the family is undoubtedly the need of the bureaucracy for a stable hierarchy of relations, and for the disciplining of youth by means of 40,000,000 points of authority and power’.49 Trotsky’s analysis of Stalin’s motivations is corroborated by both the sustained valorisation of the leader over the course of the decade, and the increasingly repressive measures adopted in order to shore up his image. Essentially, the seismic impact of the Revolution necessarily shook the conservative foundations of the old society at its most basic familial level. Bolshevik radicals believed that, by reducing the old family to rubble, personal relationships would be reconstructed according to a socialist blueprint. Whilst residual conservatism amongst Party moderates precluded the introduction of wholly radical initiatives, the still progressive sexual reforms of the 1920s caused remarkable reverberations into the subsequent decade. Due to the largely negative impact of

43 R.T. Manning, ‘Women in the Soviet countryside on the eve of World War II, 1935-1940’ in B. Farnsworth and L. Viola (eds), Russian peasant women (New York, 1992), pg. 211.
44 E. Van Ree, The political thought of Joseph Stalin: a study in twentieth century revolutionary thought, (London, 2002), pg. 171.
45 V. Bonnell, Iconography of power: Soviet political posters under Lenin and Stalin, (Berkeley, 1997), pg. 105.
46 E. Waters, ‘The modernisation of Russian motherhood, 1917-1937’ in Soviet Studies, xliv (1992), pg. 131.
47 J. Van Geldern, ‘The centre and the periphery; cultural and social geography in the mass culture of the 1930s’ in S. White (ed), New directions in Soviet history (Cambridge, 1992), pg. 74.
48 Brooks, pg. 186.
49 Trotsky, pg. 153.



said reforms on an ill-prepared, avowedly traditionalist population, Stalin pursued their repeal in order to consolidate his political position, and to meet the challenges posed by the imperatives of rapid industrialisation and impending warfare. Such modifications were ‘neither a conclusive demonstration of the family’s functional necessity nor a complete reversion to the status quo ante’ but rather a highly pragmatic course ploughed by a leader willing to eschew ideology in order to strengthen Soviet society, and definitively bolster his position at its centre.50

50 Lapidus, pg. 111.



International Relations & Politics panel


Dr. John O’Brennan, NUIM (chair)
Prof. Neil Collins, UCC
Dr. John Barry, QUB
Dr. Neil Robinson, UL

Judges’ comments

This is a theoretically rich and empirically challenging essay that analyses the so-called ‘war on terror’ of recent years. Utilising constructivist and post-structuralist approaches from the canon of international relations theory, it seeks to explain and understand the reasons for the West’s violent engagement with radical Islamic groups and how what might previously have evolved as a rather marginal conflict was ratcheted up to dominate statecraft in the contemporary era. The essay demonstrates an impressive commitment to applying abstract theory to real world empirical realities and in doing so, displays a maturity and confidence which is rarely encountered in undergraduate work. It draws on and employs an outstanding range of both conceptual and empirical sources, which allows it both to cleverly integrate theory and policy practice and establish a persuasive link between the complex issues of identity and the practices and justifications for the ‘war on terror’. In doing so, it elaborates an enlightening and robust account of the complex interlinks between contemporary notions of law, sovereignty, the state and violence. It does so by drawing on an impressive range of scholars including Edward Said, Fred Halliday, Walter Benjamin, and Jacques Derrida, and linking their work to more contemporary IR theorists such as Maja Zehfuss. We therefore wholly commend the essay for its originality of thought, bold arguments and mature engagement with a complex subject.


International Relations & Politics

“The war against terrorism has been based on misconceptions about the nature of the enemy.” Discuss

Cormac Hayes

ABSTRACT

In this essay we are asked to discuss whether the ‘war on terrorism has been based on misconceptions about the nature of the enemy’. This shall be done by identifying key discursive ‘codes’, specifically the binaries of West/Islam and state/terrorist, that are embedded in the discourse of the ‘war against terrorism’. We will then critically assess these ‘codes’ in a manner inspired by post-structural writers from both within International Relations academia and beyond. Ultimately, the aim will be to open up some space for a more critical and ethical analysis of the ‘war on terrorism’.

INTRODUCTION

9/11 represented a moment of deep trauma in the modern era to an American nation unaccustomed to such irruptions of violence on its own soil, and as such there was a palpable need for clear and purposeful explanation and reaction. However, from the very moment such desire took hold, the trauma and globally induced shock at the endlessly broadcast, spectacular image of the event meant that explanation became opportunistic and caricatured, and reaction in the form of vengeance, conflict and violence became inevitable. The explanation of events in the Manichean terms of us/them, west/east and West/Islam, as well as some deeper, underlying discursive dichotomies like state/terrorism, sovereign/non-sovereign and legitimate/illegitimate violence, delimited the possibility of response and reaction by a process where: “… policy debate and political action downshifted to a simple declarative with an impossible performative: to eradicate evil. Binary narratives displaced any complex or



critical analysis of what happened and why.” (Der Derian, 2002: pg. 265) In this essay we will highlight, analyse and deconstruct, in a Derridean fashion, two of the prevailing binaries of the ‘war against terrorism’ and in doing so show how these inevitably construct ‘misconceptions about the nature of the enemy’.

West/Islam

“The media framed the whole crisis within the context of Islam, of cultural conflicts, and of Western civilisation threatened by the Other.” (Abrahamian, 2005: pg. 531)

“... it goes much beyond Islam and America, on which one attempts to focus the conflict to give the illusion of a visible conflict and of an attainable solution (through force).” (Baudrillard, 2001)

From the outset, the political discourse, in its rhetoric, strategy and realisation, has been plagued by a dialectic particular to the ‘war against terrorism’, with the results cutting across the purported lines of ‘civilisational’ struggle and victimising an entire array of peoples of various ethnic and religious identities. What is in fact a patchwork of contingent conflicts, played out by a variety of actors with wildly diverse rationales, ideologies and motivations, is conflated into a grand narrative that resembles an incredibly crude world view and a heuristic device that sets in motion the subsequent series of invasions, incursions, bombings and techno-terror state-violence while also serving to justify these strategies. One such example that quickly emerged in the wake of 9/11 was the juxtaposition by the Israeli government of its struggle with the second Palestinian intifada and the US-led fight against Al-Qaeda: “Sharon insisted that the weekend’s events had made it clear that America and Israel were engaged in ‘the same war on terrorism’, and if America had been justified in its military retaliation against Al-Qaeda and the Taliban, then Israel was justified in launching its gunships against Hamas, Islamic Jihad and the Palestinian Authority in Gaza and the West Bank.” (Gregory, 2004: pg. 184) As Fred Halliday states, “Interconnections have to be recognised, but it is better to avoid simplistic reductions of one conflict to the other” (Halliday, 2002: pp. 39-40, emphasis added).
However, this rhetorical oeuvre of the strategists and speechwriters of the Bush Administration, and of the many governments worldwide who adopted the formal language of the ‘war on terror’, has gained a chapter of its own in the history of global political discourse. The essential rhetorical thrust of the ‘war against terror’ has served to conflate an array of long simmering political conflicts into a simple dichotomous strategy of understanding. The most immediate dichotomy that legitimates much of the response is the apparent disconnect between the values of Western liberal democracy and Islam: the West/Islam divide, or Samuel Huntington’s oft-quoted ‘Clash of Civilisations’ thesis that spoke of a “… centuries old military interaction between the West and Islam…[that]…is unlikely to decline. It could become more virulent.” (Huntington, 1992: pp. 31-32)



As stated in the National Security Strategy of the United States (2002): “In the war against global terrorism, we will never forget that we are ultimately fighting for our democratic values and way of life” (Bush, 2002: pg. 6). In statements like this, there is posited a value clash between us (the West) and them (the Islamic, terrorist other), where an absolutely stable identity is both implicit and needed for such a dichotomous opposition to maintain itself as mutually exclusive and as a rationale for conflict. It is this position that we will first seek to deconstruct “…by showing how they are at once mutually constitutive and yet always in the process of dissolving into each other.” (Walker, 1993: pg. 25) These categories of identity are wholly rejected by a variety of academics for a variety of reasons, with Halliday stating that: “…‘the west’ is not a valid aggregation of the modern world, and lends itself far too easily to monist, conspiratorial presentations of political and social interaction; but nor is the term ‘Islam’ a valid shorthand for summarising how a billion Muslims, divided into over fifty states and into myriad ethnicities and social groups, relate to the contemporary world, each other or the non-Muslim world.” (Halliday, 2002: pg. 122) If Huntington is correct about one thing, it is that there is certainly an old division of identity between the West and Islam, but this division is itself, first and foremost, discursive and an example of the idea that “… all reality is structured by differences, just as texts are, and that we have no way of referring to the ‘real’ except through representation and interpretations” (Zehfuss, 2009: pg. 143), with this specific dichotomous relationship of West/Islam echoing Derrida’s idea that “…Western thought is structured by dichotomies, that is, by pairs of concepts that appear to be opposites of each other”. (Ibid: pg. 139) This is not an entirely unfamiliar analysis of the relationship between West and East; it is set out in Edward Said’s book ‘Orientalism’, where he describes: “Orientalism as a Western style for dominating, restructuring, and having authority over the Orient…[and a]…systematic discipline by which European culture was able to manage – and even produce – the Orient politically, sociologically, militarily, ideologically, scientifically and imaginatively during the post-Enlightenment period…[and]…European culture gained in strength and identity by setting itself off against the Orient as a sort of surrogate and even underground self.” (Said, 1978: pp. 2-3) Huntington’s thesis and the subsequent West/Islam dialectic of the ‘war on terror’ are the 21st century update to a familiar discursive strategy of a relationship with the Asian/Islamic ‘Other’. Thus, from the outset, strategists, theorists, commentators and policymakers are beset by a tradition of thought that rests upon an identity assumed to be mutually exclusive but which, it can be argued, is instead mutually constitutive and far more ambiguous, leading to ‘misconceptions about the nature of the enemy’ from the very moment we perceive an Islamic/Asian identity as defined against a Western one. Our next step is to examine the very Western identity against which we define the threat of the ‘Other’ and, as noted above by Halliday, we struggle to find a fixed and stable definition of this term upon closer examination. Maja Zehfuss returns to the event itself and the attackers, analysing their relationship to the ‘West’ when she states: “Almost all the hijackers lived for months, some for years, in the USA before September



11. Three of the suspected pilots had lived and studied in Germany before that. Zacarias Moussaoui and others suspected of having aided the hijackers are citizens of Western countries such as France and Germany. All were in some way trained or socialised in the West. Given these links with the West, what does it mean to say that ‘we’ are defending ourselves against them?” (Zehfuss, 2003: pg. 519) The notion of a Western identity, though deeply ambiguous, comes to form the core of what is defended by the ‘war against terrorism’, but it is also haunted by ‘misconceptions’ about the nature of a friend/enemy opposition and identity. This raises the question: if we cannot unambiguously identify the identity we are defending, how can we clearly identify those from whom we are to defend it?

State/Terrorist

“Terrorism is a threat to all States [1st] and to all peoples [2nd]. It poses a serious threat to our security, to the values of our democratic societies and to the rights and freedoms of our citizens, especially through the indiscriminate targeting of innocent people. Terrorism is criminal and unjustifiable under any circumstances.” (EU Counter-Terrorism Strategy, 2005: pg. 6, emphasis added.)

“Terrorism ‘from below’ is a crime, so too is terrorism ‘from above’.” (Halliday, 2002: pg. 25)

Sitting alongside the West/Islam dichotomy is another binary code that, while more implicit, is intrinsic to the discourse of the ‘war against terrorism’: the legitimating dichotomy of state/terrorism. There is a whole series of unquestioned assumptions in the categorising of an event as terrorism or a group as terrorists, assumptions that serve both to legitimise state violence and repression and to inscribe the limits of political discourse and imagination. The opposition we have here is what Richard K. Ashley calls the ‘heroic practice’, which: “… is as simple as it is productive. It turns on a simple hierarchical opposition… [state/terrorist]… where the former term is privileged as a higher reality, a regulative ideal, and the latter term is understood only in a derivative and negative way, as a failure to live up to this ideal.” (Ashley, 1989: pg. 230) So there exists, in the pursuit of the ‘war against terrorism’, the presupposition that the state response to violent terrorist action, a response that is itself violent, is legitimated by the legal foundations of the sovereign state and its rights and duties under international law. The democratic Western states, whether responding to criminal acts of terrorism or acting under a UN mandate, utilise force or violence within a legal framework to achieve their stated ends. The terrorist, on the other hand, is seen to utilise violence, whether discriminate or indiscriminate, outside the boundaries of legitimacy and outside the bounded field of the monopoly on the use of force claimed by the sovereign nation-state. There exists in this apparent split between state and terrorist action a clearly demarcated area of legitimacy outside of which the terrorist functions, and it is this functioning boundary of legitimacy that serves to identify the enemy in the ‘war on terror’. For us, however, it is necessary to look again at these distinctions and uncover the



contradictions and deferrals that, as Derrida would note, haunt such apparently stable binaries, and investigate whether we can see the ‘misconceptions about the nature of the enemy’. The origins of the word ‘terrorism’ shed at least a little contextual light: “Originally a ‘terrorist’ was one who legitimated and practised Terror… It was an objective designation, defamatory only for political adversaries. Hence the great Jacobins of the committee for Public Safety during the French Revolution declared themselves to be terrorists pure and simple. They placed Terror officially ‘on the agenda’.” (Badiou, 2006: pg. 17) Halliday also states: “It should not be forgotten that the word ‘terrorism’ began life not as applied to the tactics of rebels, but as an arm of state policy, in the French and Russian revolutions.” (Halliday, 2002: pg. 48) So we are immediately struck by the fact that the term ‘terrorist’, and the methods of inflicting ‘terror’ upon the population, have their genesis in the birth of the modern nation-state. What needs to be added to this account of origins is that states have not in reality disavowed the use of methods of terror to achieve their strategic aims, so that the apparent distinction between the methods of the terrorist and the methods of the state can be called into question: “On the one hand, the perpetrators of 11 September and other acts of sudden violence against civilians hold to the view that extreme, indeed any violence is justified in pursuit of a political goal… On the other hand many states in the world, in the Middle East and elsewhere, such as the Russians in Chechnya, hold to the view that extreme violence is justified in defence of their state.” (Halliday, 2002: pg. 47)

The US stated its aims, beliefs and responsibilities in the ‘war against terror’ rather clearly: “America must stand firmly for the non-negotiable demands of human dignity: the rule of law; limits on the absolute power of the state” (Bush, 2002: pg. 3) and “The reasons for our actions will be clear, the force measured, and the cause just” (Ibid: pg. 16). Here we can clearly see a line drawn between the legitimacy of state action and the acts of terrorists. But again we must question this clear demarcation, this cut, that allows no ambiguity to enter the terms of the ‘war against terror’: “The denunciation of 11 September by George Bush opens up the discussion of other groups that he and his predecessors may have supported (the Afghan mujahidin, the Nicaraguan contras…), who certainly committed acts of terror and whom many also see as terrorists…[and]…states are the greatest perpetrators of violence and terror.” (Halliday, 2002: pp. 47-48) Already we can see that the line between the state and the terrorist can be called into question, as both can be declared guilty of using violence to achieve political ends. What we can clearly demonstrate is the frequency with which state action is both violent and completely disregards international law, as seen in the recent UN Report (Goldstone Report) into Operation ‘Cast Lead’



and the subsequent condemnation of state action. This argument can be countered, however, with regard to both the enforcement and the applicability of international law, which claims authority from, while simultaneously claiming authority over, sovereign states, and so exists in an apparently aporetic relationship with regard to function and legitimacy. So, tearing away the first layer of legitimacy, that of international law, we can still see the moral and legal authority of the sovereign state attempting to maintain the clear distinction with the terrorist. Again we come back to the methods of Derrida, who: “By redirecting our attention to the shifting ‘margins’ and limits which determine such logocentric procedures of exclusion and division… continues to dismantle our preconceived notions of identity and expose us to the challenge of hitherto suppressed or concealed ‘otherness’ – the other side of experience which has been ignored in order to preserve the illusion of truth as a properly self contained and self-sufficient presence.” (Kearney, 1984: pg. 106)

By examining the origins and juridical foundations of the idea of sovereignty on which the nation-state’s legitimacy rests, we can see how the relationship between the state and the terrorist blurs, particularly with regard to the relationship of each to the use of violence. The origins of sovereignty and its legitimacy, when explored, reveal its inseparable relationship to the use of violence as both means and end. Walter Benjamin’s essay ‘A Critique of Violence’ (1927) explores the violent nature of the founding and maintenance of the juridical system erected to effect the monopoly on the legitimate use of force: “... Law’s interest in a monopoly of violence vis-à-vis individuals is not explained by the intention of preserving legal ends but, rather, by that of preserving the law itself: that violence when not in the hands of the law, threatens it not by the ends that it may pursue but by its mere existence outside the law.” (Benjamin, 1927: pg. 281) If we understand Benjamin’s idea as relating to Thomas Hobbes’s and Carl Schmitt’s writings on sovereignty, the sovereign authority in a polity and its violence are indivisible, so that it alone can enforce the law and exists as a “... rule of law that knows no authority ultimately other than its own violent instantiation of itself” (Dillon, 2009: pg. 51), the idea being that only one sovereign can exist, for if there were two, each would effectively delegitimise and undermine the other’s authority in the juridical system. The state is threatened by a violence outside of its own that opposes its (violent) legitimacy with the possibility of a (violent) legitimacy of its own, and we can see from Benjamin that the opposition between the state and the terrorist is founded on their similarity rather than their difference: “One knows nothing about terrorism if one does not see that it is not a question of real violence, nor of opposing one violence to another… but to oppose to the full violence and to the full order a clearly superior model of extermination and virulence operating through emptiness.” (Baudrillard, 2001: pg. 132, emphasis in original) The apparent distinction between the state and the terrorist becomes more ambiguous when one examines the use of violence as a means/ends function. One would distinguish the terrorist from



the state by their capacity to use violence not only as a means towards their cause but also as an end in itself, aimed at causing terror and fear in a population, whereas the state is limited to the use of violence as a means to an end alone. Benjamin again questions this idea, highlighting the ambiguity in the distinctions we hold between law, life and violence: “For if violence, violence crowned by fate, is the origin of the law, then it may be readily supposed that where the highest violence, that over life and death, occurs in the legal system, the origins of law jut manifestly and fearsomely into existence… For in the exercise of violence over life and death, more than in any other legal act, law reaffirms itself.” (Benjamin, 1927: pg. 286) Through Benjamin and the responses he inspires from Derrida, Agamben, Vaughan-Williams and others, we see “… the violence of the foundation and reproduction of juridical–political order” (Vaughan-Williams, 2008: pg. 326) and the intrinsic link of the sovereign state to the use of violence to found and preserve its order, so that violence is deployed as an end in itself for the sovereign. In these “… inter-relationships between the law and justice, authority and violence, and authorisations of authority and mystery” (Vaughan-Williams, 2008: pg. 327) we see the dividing line between the state and the terrorist blur and the two concepts begin to fade into each other. Again we must begin to question this binary code that underwrites the ‘war against terrorism’ and see how, from the moment we begin to think in such terms, we have delimited the political landscape such that we are plagued by ‘misconceptions about the nature of the enemy’.

Conclusion

In this essay we have analysed two of the binary codes that inform the logic of the ‘war against terrorism’ in order to show how, through these binary operations, the discourse of the ‘war against terrorism’ is riddled with ‘misconceptions about the nature of the enemy’. This first becomes visible in the inevitable rush to apportion blame for such a traumatic event, with the (re)emergence of the West/Islam dichotomy, which has come to form one of the central antagonisms of the ‘war against terrorism’. The second code, that of state/terrorism, is far more subtle, and its legitimacy is rarely questioned in academia or policy circles; yet it is an issue that is starting to peer above the parapets in International Relations and beyond, with a number of scholars critically engaging with the ideas of sovereignty, law, life, violence and modernity. R.B.J. Walker, Giorgio Agamben and Nick Vaughan-Williams are but a few of those whose critical engagements investigate the many presuppositions that prop up modern political discourse, which has at its heart the sovereign nation-state. These engagements re-open the most important of ethical enquiries, expose the ‘misconceptions about the nature of the enemy’ and re-focus academic thought on identifying such unfortunate ‘misconceptions’, which have led, and can only lead, to conflict, violence and suffering.



Languages & Linguistics panel


Prof. Grace Neville, UCC (chair) Dr. Maeve Conrick, UCC Prof. Elisabeth Okasha, UCC Dr. Claire O’Reilly, UCC Dr. Rachel MagShamhrain, UCC Dr. Ciara O’Toole, UCC Dr. Daragh O’Connell, UCC Dr. Anne Gallagher, NUIM Dr. Paul Ryan, WIT Dr. Niamh Thornton, UU

Judges’ comments

This essay is a remarkable piece of work for an undergraduate student. It combines careful, in-depth, wide-ranging research with elegant analysis, and shows definite post-graduate potential. The focus throughout is clear. The writer displays empathy towards the subject under discussion while at the same time maintaining a critical distance throughout. Great credit is due to the writer of this fine critique of such a complex issue.



Languages & Linguistics

The Breton Language: From Vernacular Language to Condemned Language (La langue bretonne: passage de la langue vernaculaire à la langue condamnée)

Anne Molloy

“The official language of the Republic is French”.1 This first paragraph of Article 2 of the French Constitution is the cornerstone of France’s language policy. To foreign eyes, the impression given of the linguistic situation in France (that is, the one put forward by the French State) is one of monolingualism, unity and standardisation, rigidly controlled by authorities such as the Académie française. In reality, however, one is confronted with a hidden and surprising multilingualism: of all the countries of Western Europe, it is France that has the most varied linguistic profile.2 This multilingualism stems from the existence of several regional languages, whose enthusiasts have long lobbied to obtain official status for them in order to safeguard these cultural treasures. This largely fruitless struggle demonstrates how inseparable politics is from the linguistic situation: at times, interest in promoting the regional language merges with demands for political autonomy, so that the regional language becomes a symbol of cultural identity; moreover, the powerful position of the centralist State has reinforced its long-standing ambition of achieving monolingualism as a symbol of national unity, at the cost of eradicating minority languages.

In this essay I will give an overview of the situation of regional languages in France, and then sketch a picture of the current state of the Breton language. It is because of constraints of time and space that I have chosen to concentrate on the example of Breton, but also because of my personal experience of having spent my Erasmus year in Brest; nevertheless, I believe that much of this information will prove relevant to all the regional languages. I will address the current state of the language; the reasons for its decline and the efforts at linguistic revitalisation; and, finally, the future of this threatened minority language.

1 Trésor de la langue française au Québec, La politique des langues régionales et minoritaires en France, <http://tlfq.ulaval.ca/axl/europe/france-3politik_minorites.htm>, 8 October 2009, pg. 2.
2 Laroussi, F. & Marcellesi, J-B., The Other Languages of France: towards a multilingual policy, in Sanders, Carol, ed., French Today: Language in its Social Context, (Cambridge: Cambridge University Press, 1993), pg. 85.

Map of the regional languages of France.3

The struggle of the regional languages for official status

Because of the constitutional primacy of the French language, symbol of national unity and enshrined in Article 2 of the Constitution as the sole official language, the regional languages (Alsatian, Basque, Breton, Catalan, Corsican, Flemish and Occitan) face almost insurmountable obstacles in their struggle for official status, since every request reaches an impasse against this article. Constitutional recognition would make it possible to “define the linguistic rights of citizens and specify the obligations of the public authorities”4, and would facilitate the use of minority languages in state education, marriage ceremonies, court proceedings and so on. The National Assembly’s attempt in May 2008 to write the regional languages into Article 2 of the Constitution (“the regional languages belong to the heritage of the nation”5) was rejected by the Senate, which set off a fierce debate in France. The opponents’ fear was voiced in the Académie française’s criticism that recognition of the regional languages would “undermine national identity”6 by threatening the primacy of the French language. In the end, this controversial provision was added to Article 75-1 (on territorial communities) of the Constitution. It is clearly unlikely that this amendment will save the regional languages from extinction, since its legal scope is merely declaratory and symbolic.7 It is equally unlikely that France will ratify the European Charter for Regional or Minority Languages (as the supporters of these languages wish), on the grounds of its unconstitutionality, French being the sole official language of the Republic. Without such official status, many fear that these threatened languages will disappear in the near future.

3 http://www.tlfq.ulaval.ca/axl/Europe/images/france-langues.gif.
4 Observatoire de la langue bretonne, La langue bretonne à la croisée des chemins: Deuxième rapport général sur l’état de la langue bretonne, (Office de la Langue Bretonne, 2007), <http://www.ofis-bzh.org/index.php>, 21 November 2009, pg. 20.
5 Le Figaro, Langues régionales: le Sénat contredit l’Assemblée, <http://www.lefigaro.fr/politique/2008/06/18/01002-20080618ARTFIG00614-langues-regionales-le-senat-contredit-l-assemblee.php>, 9 October 2009.

Here are some cartoons dealing with this controversy:

8

6 Ibid.
7 Trésor de la langue française au Québec, pg. 2.
8 http://www.christian-antonelli.com/dessin-de-presse/images/senatrefuselangueregionale.jpg.



9

Current statistics on the Breton language

Bearing in mind the State’s aim of achieving a France “one and indivisible” founded on linguistic unity, one becomes aware of the difficulty, indeed the impossibility, of obtaining statistics on the use of regional languages, since population censuses refuse to include any question of a linguistic nature. Nevertheless, some reliable and recent data do exist, such as those presented in the Deuxième rapport sur l’état de la langue bretonne10, published in 2007 by the Observatoire de la Langue Bretonne. We learn from it that at the time there were 270,000 Breton speakers (brittophones)11 in Brittany, that is, 6.7% of the population of Brittany. These figures clearly show that Breton is a minority language in Brittany, even though it is the second most widely spoken language there. A typical profile of the Breton speaker emerges: a person most likely living in the western department of Finistère (where 20% of the adult population speaks Breton), generally a farmer, and on average over 50 years of age. However, despite the preponderance of farmers (nearly a third of whom are active speakers), Breton is spoken in every socio-professional category, including those described as “higher”. The ageing of the Breton-speaking population must be recognised: three Breton speakers in four were over 50 in 1999, and one in two was over 65, while fewer than one in ten was under 30. Nevertheless, the curve of Breton speakers is beginning to rise again among the youngest, thanks to the efforts of the pioneers of bilingual education: 13,077 children were enrolled in the bilingual streams at the start of the 2009 school year.12 It is almost exclusively through this schooling that Breton is passed on today, the family playing practically no part any longer in the transmission of the language: fewer than 3% of parents speak Breton to their young children.

Surveys of Bretons’ opinions on the Breton language show a marked growth in attachment to the language: in 1991, 76% of the inhabitants of Lower Brittany (Basse-Bretagne) felt that Breton should be preserved; this figure had risen to 88% by 1997 and 92% by 2001. Breton thus asserts itself, in the eyes of the great majority of the inhabitants of Brittany, as one of the fundamental elements of Breton identity, which shows how inseparable cultural and political ideas about the language are.

9 http://raforum.info/IMG/jpg/languesregionales.jpg.
10 Rapport de l’Observatoire de la Langue Bretonne. The Observatoire draws here on the results of the survey “Étude de l’histoire familiale” carried out by the Institut National de la Statistique et des Études Économiques (INSEE) in 1999.
11 Brittophone/active speaker: a person able to hold a conversation in Breton.

Ar Gwenn-ha-Du: the Breton flag13, symbol of Breton cultural identity.

“From vernacular language to condemned language”14: the decline of Breton

Like all the regional languages of France, Breton underwent a general decline over the course of the twentieth century. The proportion of speakers in the population fell to a third of its former level for Alsatian, to half for Basque and to a tenth for Breton, making Breton one of the languages of France that has receded the most.15 This decline is such that some commentators claim that Breton is a language on the road to extinction, despite the valiant efforts of passionate Bretons to save it. In 1930, Professor Roparz Hemon estimated the number of Breton speakers at around 1,200,000.16 Since that survey, numerous historical, cultural and political factors have contributed to a considerable fall in this figure. The decline really set in after the Second World War, when Breton-speaking parents decided not to pass on to their children a language they had been led to regard as a burden. As we have already seen, transmission within the family remains very weak today.

However, as Yann Fañch puts it, the subject is still a matter of controversy: “certainly, the Bretons abandoned their language, but it cannot be denied that France did (and continues to do?) everything in its power to eradicate its minority languages”.17 This last remark alludes to the deliberate policy of the centralist French State of seeking to create linguistic unity in France, for which it deemed it necessary to assimilate and integrate the Breton population completely.18 When Jules Ferry established compulsory schooling in 1882, the use of any language other than French was forbidden in education; at the time, the sub-prefect of Morlaix addressed local teachers thus: “above all, remember, gentlemen, that you are appointed only to kill the Breton language”.19 This ban built on the Ordinance of Villers-Cotterêts of 1539, which ordered the exclusive use of French in legal documents. In 1900-05, payment of the Concordat stipend to priests who continued to preach in Breton was abolished.20 As regards the media, in 1923 the State banned the use of any language other than French on the radio; more recently, in 1997 the Prime Minister restricted financial aid for the regional press to publications written solely in French. To cap it all, a new “Article 2” was added to the Constitution in 1992, stipulating that “the official language of the Republic is French”.21 This article, presented as an indispensable tool in the fight against the encroachment of English, has also proved very useful to opponents of France’s ratification of the European Charter for Regional or Minority Languages. Thus a policy bent on excluding the Breton language (and the other regional languages) at all costs from public life, education, the Church, the law and the media could not fail to have a devastating impact on the current state of Breton.

12 Office de la Langue Bretonne, <http://www.ofis-bzh.org/fr/langue_bretonne/chiffres_cles/index.php>, 26 November 2009.
13 http://ffaperitif.files.wordpress.com/2009/07/drapeau-bretagne-jpg1.jpg.
14 Le Besco, Patrick, Parlons Breton: langue et culture, (Paris: Editions L’Harmattan, 1997), pg. 145.
15 Rapport de l’Observatoire de la langue bretonne, pg. 19.

Efforts towards linguistic revitalisation

To combat this decline engineered by the French State, numerous initiatives, sustained by the Bretons’ growing attachment to their language and culture, are fighting for the survival and promotion of Breton. I will deal first with political activity, and then with cultural initiatives. Since the beginning of the twentieth century, several political organisations have fought, without much success, to obtain political and cultural autonomy for Brittany, for example the Union Démocratique Bretonne (U.D.B.) or Strollad ar Vro (“the Breton Party”). Frustrated by these fruitless efforts, the Front de libération de la Bretagne (F.L.B.) turned to violence to achieve its aim, with bomb attacks on military installations and administrative buildings. These attacks had enormous repercussions in Brittany, in France and abroad, especially at the time of the hunger strike by some sixty F.L.B. prisoners in the Santé prison in Paris in 1969. The vast movement of solidarity provoked by the “F.L.B. affair” contributed greatly to the shift in Breton opinion in favour of their cultural identity.

Today, the struggle for official status for the Breton language (and, indeed, for all the regional languages) continues in a less extreme manner. The refusal to write the regional languages into the Constitution, and the heated debate it provoked, disappointed the Breton Movement. Marianna Donnart, a passionate young Breton woman whom I interviewed, described the constitutional amendment as a “decoy”: many people believe that the “Jacobin” government proposed it only to appease regional activists, when in reality everyone knew that it would bring them nothing concrete.22 All the same, according to Marianna, recognition in Article 2 would have been welcome, since it would have advanced the aim of Brittany’s bilingual schools of being integrated into the state education system in order to receive State funding. The three bilingual streams, Diwan (“seed”), Div Yezh (“two languages”) and Dihun (“awakening”), constitute one of the most important elements in the cultural promotion of the Breton language.

16 Denez, Per, Brittany: A Language in Search of a Future, (Bruxelles: European Bureau for Lesser-Used Languages, 1998), pg. 32.
17 Fañch, Yann, writing for the Centre Généalogique des Côtes d’Armor, <http://www.genealogie22.org/09_documentation/breton/breton.html>, 20 November 2009.
18 Fouéré, Yann, Histoire résumée du Mouvement Breton du XIXe siècle à nos jours, (Quimper: Editions Nature et Bretagne, 1977), pg. 14.
19 O’Callaghan, M.J.C., Separatism in Brittany, (Cornwall: Dyllansow Truran, 1983), pg. 39.
20 Denez, pg. 12.
21 Trésor de la langue française au Québec, pg. 2.
It was thanks to the efforts of parents’ associations that the first bilingual Diwan nursery school was opened in Ploudalmézeau (in Finistère) in 1977. According to Marianna, “Breton is the working language and the everyday language of the Diwan schools, which operate through a system of immersion in order to acclimatise the children to a Breton-speaking environment. They hear French everywhere outside school, so to make them bilingual this immersion is necessary”.23 The linguist Claude Hagège24 maintains that early bilingualism developed from nursery school onwards (at the age when the capacity for language learning is at its strongest) brings many advantages to the child: according to scientific studies, bilingual children are more at ease with mathematical concepts, creative work and the mastery of several languages.25 In addition, they find it easier to accept other cultures and other ways of thinking, being more open-minded and respectful of cultural diversity, a theory warmly endorsed by Marianna. She observes that the Diwan schools, which she attended from nursery school to the baccalauréat, provide children with a very pleasant school environment: pupils and teachers address one another with the informal “tu”, and everyone knows everyone else. The enormous popularity of these schools, which grows stronger every year (there are now 38 Diwan schools in Brittany and one in Paris), stems on the one hand from a remarkable change in attitudes towards bilingual education, as parents have come to recognise its many benefits; and on the other hand from the determination to safeguard the Breton language, demonstrating the importance of education in the Breton cultural movement.

22 Interview with Marianna Donnart, 20 November 2009.
23 Interview with Marianna Donnart, 20 November 2009.
24 Diwan Breizh website, <http://diwanbreizh.org>, 23 November 2009.
25 Diwan Breizh website.

Photo from the Région de Bretagne’s press campaign to promote bilingual education.26

To support effective teaching with modern terminology, Diwan publishes specialised dictionaries and schoolbooks such as Ni a gomz brezhoneg (“Let’s speak Breton”). Publishing in Breton is active but largely voluntary, done “for love of the language and to keep the language alive”, as Marianna puts it. Two thirds of the books published between 2000 and 2005 (around 80 each year) were aimed at adults;27 this figure shows how limited the supply of books for teenagers and children is. Nevertheless, a few comic books have recently appeared in Breton versions, for example the very popular Titeuf, and writing for young people is flourishing thanks to the Prix ar yaouankiz (“Youth Prize”), created by FEA28 in 2002, which encourages budding writers to produce literary works (science fiction, for example) for young readers. As for the press, there are currently about ten titles in Breton, the most important of which is “Ya!”, the most widely read Breton-language periodical, with more than 1,000 subscribers. Despite this success, Breton-language newspaper publishing remains marginal because the readership is too small and still does not justify a daily newspaper written exclusively in Breton.29

Progress in the media, a key element in promoting the use of Breton in everyday life, leaves much to be desired. Fewer than four hours of Breton-language programming are broadcast per week, despite the creation of TV Breizh in 2000; its ambition to become a bilingual channel failed because of funding problems and a glaring lack of political will. Many Bretons express dissatisfaction with the radio stations that broadcast programmes in Breton, which account for only 12% of Breton radio stations30 and cannot be received throughout Brittany. By contrast, the range of leisure activities and entertainment available through Breton has diversified since the early 2000s, with new theatre companies such as The Pirate-Puppet Company (which puts on puppet shows in Breton), new singers who blend traditional Breton music with modern influences (such as Tri Bleiz Die or Pascal Lamour), a new Breton-language sitcom entitled “Leurenn BZH”, and the creation of holiday camps for young people such as An Oaled. These cultural initiatives, led for the most part by voluntary associations, help to give the Breton language a concrete place in the daily life of Bretons.

26 http://www.bretagne.fr/internet/jcms/preprod_40050/pourquoi-pas-une-ecole-bilingue-pour-votre-enfant.
27 Rapport de l’Observatoire de la langue bretonne, p. 100.
28 Formation-Education-Animation.
29 Rapport de l’Observatoire de la langue bretonne, p. 103.

Covers of Ni a gomz brezhoneg31 (a Breton schoolbook) and the Breton version of Titeuf.32

As for the place given to the Breton language in the linguistic landscape, it is minimal because of the timidity of French policy in favour of the regional languages: only a quarter of the budget of the DGLFLF33 is allocated to minority languages. As a result, Brittany’s local and regional authorities (the Regional Assembly, the départements and the communes) are beginning to take the Breton language into account in their policies. For example, the creation of the Office de la Langue Bretonne in 1999 forms part of the language policy of the Région Bretagne. This body works to promote the Breton language in every area of social and public life, collaborating with businesses to make Breton a real part of Bretons’ daily lives. For example, according to Marianna, because Crédit Mutuel provides bilingual chequebooks and cash machines, all Bretons with an interest in the Breton language bank with it, which shows the economic benefits that using Breton can bring to astute businesses. This example shows the importance of linguistic visibility for the vitality of a minority language, whether on public transport or on bilingual signage, because it gives the language an everyday usefulness as well as the chance to be seen by everyone. Bilingual signage is currently being installed on the road networks of the Breton départements: in 2007, one third of the Finistère network was equipped.34 These increasingly committed actions by Brittany’s authorities make it clear that the Breton language is no longer solely the concern of individuals and associations: it is gradually becoming the concern of Breton society as a whole.35

30 Rapport de l’Observatoire de la langue bretonne, p. 116.
31 http://www.culturebretagne.free.fr/images/lb_001.gif.
32 http://www.multimedia.fnac.com/.../5/9/0/9782913652095.jpg.
33 The Délégation Générale à la Langue Française et aux Langues de France, responsible since 2001 for coordinating language policy on behalf of the State.

An example of bilingual signage in Brittany: a photo I took during my stay in Brest, 1 May 2009.

At the crossroads: the future of the Breton language

The title of the report of the Observatoire de la Langue Bretonne, La langue bretonne à la croisée des chemins (“The Breton language at the crossroads”), shows that Breton is at a decisive moment as far as its future is concerned. The number of speakers is falling dramatically from year to year, as with the other regional languages, yet, paradoxically, the number of people learning Breton has never been so high. The bilingual school networks, according to Patrick Le Besco, are of “real symbolic significance. They show the younger generations’ interest in Breton, not in the language as such, but as something that symbolises and gives concrete form to the ‘right to be different’ and the right to preserve one’s culture, ideas that are highly prized nowadays”.36 This qualification is significant: although 88% of Bretons recognise it as their regional language, only 6.7% can hold a conversation in Breton. According to UNESCO, the number of young speakers is the main indicator of a language’s health. The proportion of young speakers must reach 30% for a language to have a chance of survival, yet across Brittany as a whole only between 1 and 2% of under-18s use the Breton language today.37 Despite the valiant efforts of associations and enthusiasts to combat the decline in the use of Breton, these figures point to a rather bleak future for this minority language.

To give this overview of the current state of the regional languages in France, concentrating on the example of Breton, I have dealt with several aspects in this essay. First, I discussed the current situation of the regional languages with reference to the State’s language policy; secondly, I presented some recent statistics that give an impression of the state of Breton today; thirdly, I set out the reasons for the decline of the Breton language over the twentieth century, notably the policy pursued by the French State, aptly summed up in the 1925 declaration by the Minister of Public Instruction that “for the linguistic unity of France, the Breton language must disappear!”.38 I then described the struggle of passionate Bretons to obtain official status for Breton; I also discussed the cultural and political initiatives that aim to increase the linguistic visibility of Breton in all areas and to promote the language as a real part of daily life.

Despite these valiant efforts, as I have said, the future of the Breton language, like that of all the regional languages of France, remains uncertain, with the number of speakers falling every year. If the French State does not reverse its decision to exclude the regional languages from the Constitution, and if it does not affirm its will to halt the decline of this minority language and so reverse the process,39 I fear that these treasures of France’s cultural heritage will be lost forever.

34 Rapport de l’Observatoire de la langue bretonne, p. 33.
35 Rapport de l’Observatoire de la langue bretonne, p. 48.
36 Le Besco, pp. 157-8.
37 Rapport de l’Observatoire de la langue bretonne, p. 15.
38 Le Besco, p. 146.
39 Rapport de l’Observatoire de la Langue Bretonne, p. 20.



Law panel


Prof. Brice Dickson, QUB (chair)
Prof. Dermot Walsh, UL
Suzanne Egan, UCD
Dr. Noel McGrath, UU

Judges’ comments

Besides covering a topic of immense social and economic consequence in the modern world, this essay was extremely well-structured, very carefully worded and appropriately referenced. Without descending into inordinate detail, it explained clearly and effectively how rights are conceived of differently in a libertarian state like Ireland and an authoritarian one like China. It manages to avoid glib generalisations while conveying a sense of the basic differences in approach between the two systems. Rightly, it focuses on the functions and effects of the systems rather than on nomenclature and institutions. It looks not just at constitutional provisions but also at ordinary legislative and case law developments. The coverage is accurate, interesting and concise. The writer contends that freedom of association in Ireland has pursued libertarian freedom at too great a cost, while in China individual interests have been deemed to be synonymous with state interests and there is no right to strike. Trade unions serve a different function in China, but also in Ireland trade unions have been considerably weakened by law. The essay transmits a clear message in a well-reasoned manner. It is authoritative and yet very readable. The judges think it is a worthy winner of the prize for the best essay in law written by an undergraduate student in Ireland during 2010.



Law

A comparative analysis of freedom of association, trade unions and labour rights: authoritarian and libertarian perspectives

Donncha Conway

INTRODUCTION

This essay intends to compare how freedom of association is embodied in authoritarian and libertarian regimes. Rights related to freedom of association, and the application of such rights in the context of trade unions, provide the essay’s main focus. Using Ireland and China as models of libertarianism and authoritarianism respectively, the essay will draw conclusions as to how the nature of a state affects its ability to guarantee rights, with reference to the two jurisdictions mentioned.

Freedom of association is a right which is recognised as fundamental in liberal democracies,1 as well as in international legal instruments and establishments.2 Upon it are predicated crucial rights in the area of collective labour law, such as the right to form and join independent unions, the converse right to refrain from joining a union and the right to partake in union procedures. The manner in which such a right is espoused, and the extent to which it is effective, depends greatly on the state’s approach to it. Particularly, the vitality of a freedom of association guarantee is affected by the state’s manifestation of its power as either libertarian or authoritarian. This essay intends to explore the effects of a state’s character on its capacity to protect freedom of association, and related labour rights, at the individual level.

1 E.g. Bunreacht na hÉireann, Article 40.6.1° (iii); First Amendment to the US Constitution, as interpreted by the Supreme Court in NAACP v Alabama, 357 US 449 (1958).
2 E.g. Preamble to the Constitution of the International Labour Organisation; Freedom of Association and Protection of the Right to Organise Convention (ILO No. 87), 68 UNTS 17; Articles 20 and 23 of the Universal Declaration of Human Rights.



Definitions and Assumptions

An exhaustive analysis of the criteria which classify a state as being libertarian or authoritarian is outside the scope of this work. However, certain definitional assumptions must be made as to the characteristics of both for the purpose of this essay.

First, “Libertarian”3 will refer to the accordance of sovereignty to the individual and protection of her personal liberty as against the state.4 Such protections typically take the form of a codified constitution which assumes an identity antecedent to the state, and so legitimately controls the actions of the latter.5 Essentially, this essay makes the, hopefully acceptable, generalisation that a libertarian state prioritises such rights of the individuals as are fundamental, and has meaningful institutional mechanisms, such as judicial review, for ensuring that the state does not impermissibly encroach on such individual freedoms.6 In this essay, for descriptive and comparative purposes, Ireland will be used as a model for the “libertarian” system.

Secondly, “Authoritarian” will refer to a system in which statist concerns may prevail over the interests of individuals, there being no determinative mechanism to vindicate whatever personal rights are extant.7 Such systems are typified by government opacity and a lack of individual input into state control.8 In this essay, China will be used as an example of a state which embodies the authoritarian approach to rights protection.9

3 As well as all cognate words.
4 As Axtmann put it, “liberal democracy is premised on the notion of popular sovereignty and its institutionalisation [sic] in citizenship rights”. Roland Axtmann, Liberal Democracy into the Twenty-First Century (Manchester University Press, 1996) at 10.
5 As Thomas Paine described it, the constitution “acted not only as an authority, but as a law of control to the government”. Thomas Paine, The Rights of Man (JM Dent London, 1993) at 187.
6 For more inclusive analyses, see generally Axtmann, note 4; Ronald Hamowy ed., The Encyclopaedia of Libertarianism (SAGE, 2008).
7 Though there may, as a matter of fact, be a confluence between states which are authoritarian and those which are tyrannies or despotisms, words such as those are herein avoided. As Friedrich put it, these terms “have a distinctly pejorative flavour”. Carl Friedrich, Totalitarian Dictatorship and Autocracy (Harvard University Press, 1965) at 15.
8 While the nature of an authoritarian state is undoubtedly indeterminate, Linz’s description of the characteristics of such states is widely credited. “Authoritarian regimes are political systems with limited, not responsible political pluralism; without elaborate and guiding ideology but with distinctive mentalities; without intensive or extensive political mobilisation, except some points in their development; and in which a leader or occasionally a small group exercises power within formally ill-defined limits but actually quite predictable ones.” Juan Linz, “An Authoritarian Regime: The Case of Spain” in Erik Allardt and Stein Rokkan eds., Mass Politics: Studies in Political Sociology (New York: Free Press, 1970).
9 For a detailed consideration of the political science of authoritarianism, see Juan Linz, Totalitarian and Authoritarian Regimes (Lynne Rienner, 2000).



Thirdly, though it may not be unanimously accepted as a matter of political science, this essay will treat the ideas of libertarianism and authoritarianism as being not only mutually exclusive, but also as being fundamentally opposed in their protection of rights. Libertarian systems, by definition, apotheosise the basic rights of individuals, and predicate their system of government upon them. Conversely, authoritarian systems treat the rights of individuals not as a defining basis for their power, but rather as a particular matter within their power. It is not that authoritarian systems have no conception of rights protection, or that they wilfully fail to enforce rights, but that rights are not accorded to individuals in a manner which limits the action of the state.

Comparative Caveats

This essay intends to examine freedom of association, and correlative labour rights, from authoritarian and libertarian perspectives, using China and Ireland respectively as examples. However, at the outset of a comparative analysis of these systems, some caveats should be noted. It would clearly be disingenuous to conduct a straightforward instrument-to-instrument comparison of the two legal systems. It is axiomatic in any comparative law analysis to compare that which is, in fact, comparable. To this end, it is important to compare functions and effects rather than formal terminology and institutions. A holistic approach is the only way to compare legal regimes in a manner evaluative of their true effect. As Blanpain states, the comparative lawyer must examine “reality” or “what is going on”.10 This approach is integrated, and necessitates the examination not just of legal instruments purporting to have similar effects, but also of judicial practice and interpretive traditions, as well as extra-legal factors such as soft laws, tacit understandings and societal norms or customs insofar as they affect the law.

For the above reason, this essay will not limit itself to examining constitutional freedom of association provisions in Ireland and China. Rather, it will inspect the legislative elaboration on such provisions, and their effect in courts and in practice. Customs and other social factors will also be examined so as to ascertain a more informed comparative view of the real-world ramifications of the freedom of association laws in either jurisdiction. Before comparing the status of freedom of association in Ireland and China, the relevant laws from both jurisdictions will be reviewed independently.

IRISH LABOUR LAW: FREEDOM AT A COST

Article 40.6.1° of Bunreacht na hÉireann is the sine qua non of Irish freedom of association law:11

“The State guarantees liberty for the exercise of the following rights, subject to public order and morality: [...] (iii) The right of the citizens to form associations and unions. Laws, however, may be enacted for the regulation and control in the public interest of the exercise of the foregoing right.”

This article has not only attributed a strong degree of protection to certain rights and freedoms in the context of trade unions, but has also been held to protect a wide variety of labour rights and freedoms, including rights which are not immediately apparent on first inspection of the text.

10 Roger Blanpain, “Comparativism In Labour Law and Industrial Relations” in Roger Blanpain ed., Comparative Labour Law and Industrial Relations in Industrialised Market Economies (Wolters Kluwer, 2007) at 13.
11 This provision is not necessarily as broadly applicable as, for example, the US Constitution’s first amendment insofar as the latter has been held to encompass associations other than trade unions, e.g. Boy Scouts of America v Dale 530 US 640 (2000); NAACP v Alabama 357 US 449 (1958); Roberts v United States Jaycees 468 US 609 (1984). For this essay’s purposes, “freedom of association” will refer to freedom of association as it applies in a labour law context.

Irish Case Law

The fundamental nature of freedom of association was demonstrated at an early stage in Supreme Court jurisprudence in National Union of Railwaymen v Sullivan.12 This case is valuable in its elaboration on the position of association rights against government regulation. Part III of the Trade Union Act 1941 purported to create a tribunal with the power to determine that certain trade unions should have a monopoly on organising workers of a particular class. The plaintiffs in Sullivan argued that this was an unconstitutional infringement on their freedom to associate and form unions as guaranteed by article 40.6.1° (iii), as they stood to be deprived of their alleged right to belong to a trade union of their choice. Central to determining the constitutionality of Part III was the scope of the legislative power of limitation under article 40.6.1° (iii): “laws... may be enacted for the regulation and control in the public interest of the exercise of the [right to form associations and unions]”.

The Supreme Court, in reversing the High Court decision of Gavan Duffy J, held that the Oireachtas power of regulation was restricted to limiting the ambit of freedom of association, and did not permit its deletion. Since workers in a particular field affected by the tribunal’s decision under Part III would have no choice as to what union to join, it was held that: “logically and practically, to deprive a person of the choice of the persons with whom he will associate, is not a control of the exercise of the right of association, but a denial of the right altogether.”13 Though the constitution patently seems to create a legislative power to regulate freedom of association as the Oireachtas sees fit, the court held that such a right was not capable of being subsumed entirely by way of regulation. There is, of course, great merit to the observations made by Gavan Duffy J in the High Court to the effect that regulation of free association in this respect was based on a policy of strengthening trade unions.14 Yet the Supreme Court was adamant that the constitutional protection of free association prohibited complete removal of rights:

“The Constitution states the right of the citizens to form associations or unions in an emphatic way, and it seems impossible to harmonise this language with a law which prohibits the forming of associations and unions, and allows the citizen only to join prescribed associations and unions.”15

While it is submitted that Kelly is correct in criticising the brevity of the Supreme Court’s analysis of “regulation” in Sullivan,16 this case may be read as an exhibition of the fundamentality of freedom of association in a libertarian system. Explicit constitutional powers of regulation and strong policy concerns notwithstanding, the individual’s right to associate freely remains paramount.

Aside from the right to associate freely, as evidenced in Sullivan, Article 40.6.1° (iii) also protects the right to dissociate. This aspect of Irish association law was first described by the Supreme Court in Educational Company of Ireland v Fitzpatrick.17 In Fitzpatrick, a group of workers withdrew their labour and engaged in picketing after the plaintiff employer refused to compel its employees to join the Irish Union of Distributive Workers and Clerks. The employer sought an injunction to restrain picketing. In analysing the application of Article 40.6.1° (iii) to the facts, Budd J18 spoke of the guarantee therein in terms of liberty:

“Taking the language of the Article quoted in its ordinary meaning it will be noted that what the State guarantees is “liberty” for the exercise of the right of the citizens to form associations and unions. ... I would myself construe the words of the Article as meaning by implication that a citizen has the correlative right not to form or join associations or unions if he does not wish to do so, and it seems to me to follow that in the case of associations or unions already formed he is free to associate or not as he pleases.”19

The way in which Budd J dealt with this liberty is notable for present purposes. A plain reading of the text of Article 40.6.1° (iii) does not immediately display the existence of any right to join, or freedom from joining, a trade union. The article is couched merely in terms of forming unions. It is submitted that there is absolutely nothing to suggest, on a bare textual reading of the constitution, that the right to form unions necessitates the right to join unions. Furthermore, the proposition that there exists a constitutional right to join unions does not, on even the most strained Hohfeldian analysis, result in the attribution of equal constitutional force to the right to wilfully refrain from joining unions. It is submitted that this reasoning, being somewhat unclear and not related to the constitutional text in any obvious way, in fact displays a fundamental awareness of libertarian concerns in the field of labour law. To have decided the case purely on the constitutional text would have led to the conclusion that employees may be compelled to join trade unions against their will. The guarantee of freedom of association, coupled with the views of Kingsmill-Moore J on the effect of Article 40.6.1° (i), was imbued with a libertarian vitality according primacy to the sovereignty of the individual over her own decisions. Though it took something of a hermeneutical jig on the superior courts’ behalf, it is submitted that the libertarian notion of freedom of choice in union membership is now nestled in Irish constitutional law by virtue of Fitzpatrick.

In Doyle v Croke20 it was held by the High Court that the free association elements of Article 40.6.1° protected the entitlements of union members to fair procedures within their union. Relating to a dispute over the handling of union funds, Costello J held that

“... if the constitutional right of citizens to form associations and unions is to be effective the Article in which it is to be found should not be construed restrictively as the right would be of limited value if it did not protect individual members against procedures which might be unfair to them. ... The plaintiffs and all the members of their union had a constitutional right by virtue of Article 40.6.1° (iii) to fair procedures.”

Again, the right of union members to fair procedures in relation to the union’s decisions does not seem to be protected by the text of Article 40.6.1° (iii). Doyle is another example of judicially adding flesh to the skeletal constitutional guarantee of freedom of association in order to protect individualist concerns within unions. The article was interpreted beyond its textual confines so as to give it as great an effect for individuals as possible.

12 [1947] IR 77.
13 Ibid at 102.
14 Ibid at 88.
15 Ibid at 102.
16 JM Kelly, The Irish Constitution (4th ed., Tottel, 2004) at 1819.
17 [1961] IR 345.
18 Budd J delivered the decision of the High Court, which was upheld on appeal to the Supreme Court.
19 Ibid at 362.

Irish Conclusions

The above cases demonstrate the libertarian underpinnings which have been attributed to the freedom of association provision in Bunreacht na hÉireann. On its face, Article 40.6.1° (iii) is a comparatively strong guarantee, yet still the cases mentioned demonstrate a willingness to depart from a literal reading of the article to expand its remit even further. In prioritising the rights of the individual in a trade union context to this extent, it is fair to say that the courts have, in many respects, weakened the overall power of trade unions acting collectively. In particular, the Supreme Court’s depletion of the Oireachtas power to regulate trade union membership in NUR v Sullivan was a major impediment in tackling the issues arising from union multiplicity, which could intuitively be seen as falling within “the public interest”. While Gavan Duffy J recognised this concern in NUR v Sullivan,21 the Supreme Court did not address the matter from this perspective.

To conclude this brief Irish overview, the superior courts have gone to great interpretive lengths to ensure that Article 40.6.1° (iii) remains infused with a decidedly libertarian outlook. While the attribution of sovereignty to the individual is doubtless a worthy pursuit in many respects, it is submitted that the Irish jurisprudence in the area has focused too much on the individual right at the expense of the collective right, which is crucial to effectively attributing power to organised workers, and which Article 40.6.1° (iii) was apparently designed to respect. To this extent, it could be said that Irish freedom of association has pursued libertarian freedom consistently, but at too great a cost.

20 (1988) 7 JISLL 150 at 159.
21 [1947] IR 77 at 88.



CHINESE LABOUR LAW: THE COSTS OF FREEDOM

At a time when China’s economic growth, and its accompanying international employment capacity, is of huge global significance,22 it is important for the western world to have a detailed insight into Chinese labour, its legal position and its place in society. Only through a comparative approach, which allows for a panoramic analysis of legal norms free from the constraints of the institutions to which they are formally attached, can the true nature of Chinese freedom of association be assessed.

Constitutional Disparities

Article 35 of the Chinese Constitution states that: “Citizens of the People’s Republic of China enjoy freedom of speech, of the press, of assembly, of association, of procession and of demonstration.”23 While ostensibly the above provision is not dissimilar to Article 40.6 of Bunreacht na hÉireann, it is not comparatively correct to equate these two provisions as sources of law. While the Chinese Constitution does declare that “it is the fundamental law of the state and has supreme legal authority”,24 in actual fact the constitution is, in typical authoritarian fashion, more or less non-justiciable. It acts as more of a mission statement and as a guideline for state practice than a binding legal instrument.25 Furthermore, whereas the personal rights provisions of Bunreacht na hÉireann presuppose an independent system of constitutional review to ensure their enforcement, no such system is evident in China. In fact, Article 67 of the Chinese Constitution states that the National People’s Congress has the sole authority to interpret laws, thus removing such power from the courts.26

In addition to this, it should be established that, while an analysis of relevant superior court jurisprudence is vital to understanding the position of Irish constitutional law, the same cannot be said of China, where court decisions do not shape the legal landscape in the same way. Not only does the Chinese constitution fail as a source of litigation as mentioned above, but courts do not operate a system of precedent in any meaningful respect,27 which clearly reduces the comparative quality of examining Chinese judicial decisions. The scarcity of published decisions further frustrates meaningful analysis in this regard.28

22 Baizhu Chen and Yi Feng, “Determinants of Economic Growth in China” (2000) 11 China Economic Review.
23 1982 Constitution of The People’s Republic of China: http://cin.sagepub.com/cgi/reprint/9/4/36.pdf (last visited 31st March 2010).
24 Ibid, preamble.
25 See, generally, Ulric Killion, “Post-WTO China and Independent Judicial Review” (2004) 26 Houston Journal of International Law 507.
26 Article 67: “The Standing Committee of the National People’s Congress exercises the following functions and powers: (1) to interpret the Constitution and supervise its enforcement ... (4) to interpret statutes”.
27 Nanping Liu, “Legal Precedents with Chinese Characteristics: Published Cases in the Gazette of the Supreme People’s Court” (1991) 5 Journal of Chinese Law 5.
28 Ibid.



The Chinese Vision of Freedom and of Trade Unions

The Chinese constitution envisages a very different approach towards freedoms than its Irish counterpart. This is particularly true in the labour and trade union contexts. Whereas it can be said that the embodiment of trade unions in Bunreacht na hÉireann assumes their independence from the state and their antithetical relationship with employers, China’s constitution is predicated upon several values which conflict with this version of trade unionism. Primarily, the Chinese constitution establishes a state which is avowedly authoritarian. The state is professed to be “under the guidance of Marxism-Leninism”,29 and is generally founded upon basic communist ideals.30 In such a communist state the workers are communal owners of all means of production, and are also nominally sovereign in exercising state power.31 As Che points out, this establishes a presumption of unity of interests between unions and the state.32 This sterilises the meaningful exercise of workers’ rights, as to entertain a conflict between the interests of workers and the interests of the state would be to undermine the entire premise of Marxism. Thus, it can be said that rights, such as freedom of association, fall at the first hurdle in authoritarian systems such as China. It is not even the case that state interests take precedence over the interests or rights of the individual, though this could often be said to be the outcome. Rather, it can be said that state interests effectively are individual interests. This supposed unity of interest can be used to justify non-enforcement of association and labour rights.

As against this, some writers have suggested that such rights arrangements are not a matter of positive law in China, but rather are an embodiment of typical “Asian values”.33 Whereas libertarian state theory can be traced back to Roman law, which embodied the principle of restraining arbitrary exercise of state power, Chinese law can be said to be founded, inter alia, on Confucianism. This philosophy demands the acceptance that humanity manifests itself in a finite social hierarchy, and that certain rights and duties arise from such tiered social relationships.34 The power structures embodied in the Chinese constitution, and in Chinese society, have been said to find their basis in this philosophy.35 However, it is submitted that whether the inferiority of individual rights as against state objectives arises from socio-historical factors or legal positivism is academic. Under either view, the fact remains that the Chinese legal system relegates freedom of association, and the union freedoms it protects, to a position of relative legal insignificance.

China has recently undergone a period of opening up to western investment36 and the development of a so-called “socialist market economy”.37 The associated social changes have demanded a re-assessment of Chinese labour law so as to enable its reaction to changing market forces. Though legal reforms have been instituted, there is still no meaningful right to freely associate, and there remain shortcomings in the Chinese protection of labour-related rights.

29 1982 Constitution of The People’s Republic of China, Preamble, as amended by the Fourth Amendment on March 14, 2004, by the 10th NPC at its 2nd Session.
30 Article 1 of the Constitution of The People’s Republic of China states that “The People’s Republic of China is a socialist state under the people’s democratic dictatorship led by the working class and based on the alliance of workers and peasants. The socialist system is the basic system of the People’s Republic of China.”
31 Karl Marx, Das Kapital: Translated by Ben Fowkes (Penguin, 1990) at 270-280. This philosophy also impugns the traditional concept of employment contracts, which are seen as exploitative bourgeois instruments of social control. Labour contracts have, however, gained a certain recognition in Chapter 3 of The Labour Law of the People’s Republic of China, 1994.
32 Ken Che, “People’s Republic of China” in Blanpain ed., International Encyclopaedia for Labour Law and Industrial Relations Vol 4 (Kluwer, 2004) paras 257-259.
33 Yash Ghai, “Asian Perspectives on Human Rights” (1993) 23 Hong Kong Law Journal 342.
34 Lloyd M Richardson, note (2007) 107 Policy Review, http://www.hoover.org/publications/policyreview/3477541.html (last visited April 2nd, 2010).

Chinese Freedom of Association and Labour Rights in Practice

The Trade Union Law of the People’s Republic of China (2001)38 was introduced with a view to establishing the role of trade unions in the new economic society,39 and to expand on the union provisions contained in the Labour Law of the People’s Republic of China.40 Article 10 of the 2001 Law establishes the All China Federation of Trade Unions (ACFTU) as “the unified national organisation”, meaning it is the only trade union protected at law. By virtue of this, it is immediately apparent that the ‘guarantee’ of freedom of association in Article 35 of the constitution does not establish the idea of trade union plurality, or a free choice as to which union to join. Though article 2 of the 2001 Law says that “[t]rade unions are mass organisations of the working class formed ... on a voluntary basis”,41 it is apparent that the only voluntary element involved is whether or not to join an ACFTU-affiliated union, and does not extend to the foundation of, or association with, independent trade unions. This is further confirmed by the unfettered subjection of union organisation to regulation in article 7 of the 1994 Law.42 It is perhaps not surprising, given this wilful denial of freedom of association, that China has heretofore declined to sign the International Labour Organisation’s Freedom of Association Convention.43

In terms of legal powers and their position within the bargaining process, Chinese trade unions are weak. The ACFTU is essentially a branch of the state, and so lacks the requisite independence to truly represent the interests of its members, particularly when their employer is the state.44 Trade unions are also afforded weak rights to input into the process of industrial relations. For example, in the case of unfair dismissal, a union is merely entitled to “advance its opinion”.45 In relation to working conditions, trade unions may merely put forward opinions.46 These non-binding representative functions seem to render the ability of workers to vindicate their rights only slightly more effective collectively than on an individual basis. Furthermore, there is no right to strike in Chinese law. Such a right was expressed in the 1978 constitution, but is conspicuous by its absence from the 1982 constitution. As Potter and Jianyong point out, the practical problems of poverty and a surplus of available labour mean that there is little impetus to resolve labour difficulties on an individual level.47 This lack of desire to seek enforcement on an individual level, along with the lack of power to seek it collectively, means that Chinese labour rights – to the limited extent they exist – often go unenforced in practice.

Though there is no right to coercive industrial action such as striking, and no right to join an independent union, it is not correct to say that collective bargaining is entirely absent from Chinese labour law. There exists a common practice of tripartite consultation involving discussions between government representatives, union representatives (ACFTU) and industry representatives (China Enterprise Commission).48 This form of dialogue is mandated by the ILO Tripartite Consultation Convention,49 to which China is a signatory. It is submitted that this form of consultation does not, however, do much to further the interests of trade union members in China. Tripartism presumes independence of the parties involved from each other, and thus has a distinctly western – even corporatist – flavour. Applied to a situation like China where all parties to the consultation process are effectively arms of the state, it cannot be said that meaningful social dialogue is engaged in. Though the tripartite commissions bear resemblance to the social partnership arrangement in Ireland, the Chinese process is necessarily lacking in the antagonism which promotes the dialogue and compromise which the process seeks to encourage.50

35 Killion, note 25, at 515.
36 Dong-One Kim, “Industrial Relations in Asia” in Michael Morley, Patrick Gunnigle and David Collings eds., Global Industrial Relations (Routledge, 2006) at 164.
37 1982 Constitution of The People’s Republic of China, Article 11 as revised on March 15, 1999, by the 9th NPC at its 2nd Session.
38 The Trade Union Law of the People’s Republic of China 2001 (Order of the President No.62), hereafter referred to as the 2001 Law.
39 Ibid, “Article 1: This Law is enacted in accordance with the Constitution of the People’s Republic of China with a view to ensuring the status of trade unions in the political, economic and social life of the State, defining their rights and obligations and bringing into play their role in the socialist modernisation drive.”
40 (Order of the President No.28), hereafter referred to as the 1994 Law.
41 Emphasis added.
42 “Labourers shall have the right to participate in and organise trade unions in accordance with the law”.
43 Convention 87 Concerning Freedom of Association and Protection of the Right to Organise.
44 Simon Clarke and Chang-Hee Lee, “The Significance of a Tripartite Consultation in China” (2002) 9 Asia Pacific Business Review 61 at 77.
45 The Trade Union Law of the People’s Republic of China 2001, Article 21.
46 Ibid, Article 23.
47 Pitman B. Potter and Li Jianyong, “Regulating Labour Relations in China: The Challenge of Adapting to the Socialist Market Economy” in Perry Keller ed., Chinese Law and Legal Theory (Aldershot, 2000) at 774.
48 Clarke and Lee, note 44.
49 Convention 144, Tripartite Consultation (International Labour Standards) Convention, 1976.
50 It is not, of course, suggested that the Irish Social Partners should be held up as a paragon of social dialogue.



In response to the outright denial of certain union freedoms, there has developed a practice of illegal union formations.51 Not being permitted by law, such workers’ organisations operate in a manner similar to secret societies, and frequently adopt militant tactics. While such activity undoubtedly harms the possibility of legal recognition for independent trade unions, there is merit to the claim that, in implementing reform in the union sector, the state-driven ACFTU is part of the problem rather than part of the solution.52

Chinese Conclusions

The absence of legal guarantees for freedom of association and industrial action in Chinese law, coupled with the dependence of unions on the state, means that there is little by way of effective action which a union can take to represent the collective interests of its members. It might be fair to say, however, that industrial agitation to achieve a goal has never been the aim of a Chinese trade union.

Trade unions, as embodied in Chinese Law, are scarcely comparable to those in Ireland. In fact, the similarities between them do not extend far beyond the name. Generally speaking, an Irish trade union, in common with trade unions in most developed libertarian states, is a non-state body, formed based on the consent of its members, which exercises certain legally-recognised – and legally-controlled – rights to independently represent its membership, and to peacefully take action in pursuit of protecting those rights. A Chinese trade union is unlike this in virtually every respect.

It is submitted that the difficulty in equating the societal role of trade unions in Ireland and China arises from the fact that the existence of both is premised on differing societal mores and outlooks. If an Irish trade union exists for any one thing alone, it exists to introduce greater parity to the employee-employer relationship by creating the threat of collective action adverse to the employer’s interests. The role of a Chinese trade union is not so simple. Whereas they have been ascribed a limited representative function in respect of their members’ interests, the 2001 Law assigns to them a multiplicity of other functions which are unknown to Irish trade unions. These include input into the planning process,53 the organisation of labourers’ recreational time,54 and a curious edification function.55 Indeed, in its constitution, the ACFTU professes itself to be “a bridge and link between the Party and workers”.56

Trade unions, as we understand them to be in Ireland, are non-existent in China. However, bodies called trade unions exercise a different social and normative role in society by performing functions such as those outlined. This demonstrates the harmony of state and union interests which underpins classic Marxism, and is consonant with authoritarianism. It is submitted that the monolithic all-encompassing party-union machine may have been appropriate in the erstwhile state-dominated command economy, but that labour laws, particularly those relating to freedom of association, should change to reflect the newfound diversity of the Chinese labour system. Whereas a state-owned union may have been adequately representative in a state-owned economy, the recent advent of private and foreign investment in China mandates a more flexible approach to labour laws, freedom of association and union plurality.

51 Gordon White, “Chinese Trade Unions in the Transition from Socialism” (1996) 34 British Journal of Industrial Relations 433 at 450.
52 Morley, Gunnigle and Collings, note 36, at 165.
53 Article 33: “When working out plans for national economic and social development, the people’s governments at or above the county level shall, where major questions related to the interests of workers and staff members are concerned, listen to the opinions of the trade unions at the corresponding levels.”
54 Article 31: “Trade unions shall, in conjunction with enterprises and institutions, conduct... spare-time cultural and technical studies and vocational training, and also recreational and sports activities.”
55 Article 7: “Trade unions shall educate workers and staff members constantly in the need to improve their ideological, ethical, technical, professional, scientific and cultural qualities, in order to build a contingent team of well-educated and self-disciplined workers and staff members with lofty ideals and moral integrity.”

CONCLUSION

Having cursorily reviewed the position of union rights and freedoms in Irish and Chinese law, it is proposed to draw on two of the essential comparative forces which distinguish libertarian and authoritarian regimes.

Regulation of Rights

While both authoritarian and libertarian philosophies typically recognise the necessity to regulate freedom of association in certain contexts, the extent to which this power is exercised is one of the essential differences between them. In Ireland, the power of the state to regulate the formation and activity of trade unions is constitutionally enumerated.57 However, as discussed, the open-ended wording of such restrictions does not confer an open-ended power of regulation.58 The fundamentality of individual rights prevented the entire abolition of freedom of association, even in a narrow context. Though state regulation of freedom of association is presumably permissible to an extent, this regulation may have, at most, partial impact on association rights. This approach evinces a libertarian view of regulating rights.59

In China, freedom of association is expressed in Article 35 of the constitution. Though no specific regulatory authority is conferred on the state in relation to the right, it is clear that the state has more or less unlimited regulatory powers in this respect. The 2001 Law’s designation of the ACFTU as the sole union is an egregious infringement of freedom of association. However, China’s authoritarian character permits such an infringement. Not only is there an absence of a procedure to review enacted laws against a basic law, but it is accepted practice in authoritarian regimes to equate state policy with law. Article 14 of the Chinese Constitution states that “the state continuously raises labour productivity, improves economic results and develops the productive forces by enhancing the enthusiasm of the working people”. This provision is not entirely aspirational, and can be used to justify state repression of free association.

56 Constitution of the Chinese Trade Unions (2008) <http://www.acftu.org.cn/template/10002/file.jsp?cid=48&aid=469> (visited 3rd April, 2010).
57 Articles 40.6.1 (ii) and (iii).
58 NUR v Sullivan (1947) IR 77.
59 Supra, pg. 2.

Advancement of the Union Ideal

Interestingly, the adherence of both jurisdictions to their respective regimes can ultimately be said to have impacted negatively on the ability of trade unions to effectively represent workers in trade disputes. In Ireland, the prioritisation of the individual in cases such as Sullivan and Fitzpatrick can be said to have weakened trade unionism generally, e.g. by encouraging a multiplicity of weaker unions rather than a smaller number of well-regulated, stronger unions. In China, the curtailment of freedom of association and the restriction of union representation to one state body demonstrably restricts the ability of workers to address their grievances collectively. Indeed, the consistent yearly increase in the number of illegal industrial disputes in China, despite the 2001 laws,60 could be said to be a demonstration of the ineffectiveness of these rules, and provide impetus for reform of Chinese freedom of association in conformity with international standards.61

The Future

Libertarian and authoritarian regimes can both involve a negative impact on freedom of association and on the efficacy of trade unionism generally. While this work does not purport to be prescriptive, it is submitted that true freedom of association is better housed within the libertarian system. Only in a libertarian regime, where the most fundamental rights and freedoms are pragmatically guarded, can freedom of association be protected, free from insidious regulatory erosion. The authoritarian regime necessarily rejects freedom of association as contrary to its purpose. Whereas there have undoubtedly been interpretive problems in Ireland in relation to the constitutional freedom of association guarantee, it is submitted that these problems are idiosyncratic, and relate to the wording of the particular constitutional law in question and the judicial application thereof. This should not be taken as authority for the proposition that these problems are endemic to the libertarian system itself, which is conducive to giving life to fundamental freedoms.

60 Morley, Gunnigle and Collings, note 36, at 160.
61 Note 43.



Life Sciences panel


Prof. Cliona O’Farrelly, TCD (chair)
Dr. Aoife O’Donovan, UCSF
Dr. Andrew Lloyd, TCD
Dr. Rod Ceredig, NUIG
Dr. Ken Mok, TCD

Judges’ comments

This is a beautifully written essay with great style and originality. The topic is wide ranging – touching on physiology, perception, evolution, music and sport – but the execution is well integrated and engaging as well as literate and free from jargon. Indeed, it reads like a piece of excellent journalism, which would not be out of place in, say, the New Yorker. It is, however, not merely an exercise in popularisation; rather it is a properly scientific evaluation of evidence and points out that there is much, much more to time than clock time. The author has a great future in synthesising complex science for her peers and also explaining science to the public.



Life Sciences

How should we treat time in our investigation of coordinated movement?

Rachel Carey

Consider the speed and accuracy with which a violinist plays, the skill with which a footballer handles the ball, the agility with which a gymnast balances on the beam. Consider the perfect timing and synchrony with which a group of dancers practise their steps, or an orchestra creates a sound. The human body is a complex system, capable of extraordinary things. It can be trained to perform intricately complex tasks with ease and fluency; ‘Any complex motor skill looks natural and smooth when performed by pros’ (Kelso, 1995, pg. 74). Each year, a world record is broken in athletics. In 2009, the world watched in awe as Usain Bolt ran the 100 metres sprint in a time of 9.58 seconds, representing the biggest improvement in the record since electronic timing was introduced in 1968. With sufficient practice and training, we can push the human body to incredible levels.

Van Gelder and Port (1995, pg. 1) asked the question ‘How do we do what we do?’ The coordination with which we perform movements is a topic that has garnered much research attention over the years. How do we coordinate the actions of systems as complex as the human body? When we move, we usually perceive our body as one unified entity, as a whole rather than as a system of individual parts. We perform a wide range of complex tasks on a daily basis, often at the same time, often with little awareness of our actions. Such an elaborate level of coordination requires a high degree of temporal specificity. An abundance of research has been devoted to examining the temporal properties of coordinated actions.

The current paper explores the role of time in various forms of coordinated movement. It illustrates the importance of temporal processes in areas such as music, inter-limb coordination and speech production, and discusses the usefulness of the dynamical systems approach to cognition in describing complex structures and processes.
Humans are capable of coordinating a complex system of multiple linkages, muscles and joints. A central and recurring question among researchers concerns how the various elements of the motor system are organised to generate an action. Nikolai Bernstein (1967) pointed out that this system can involve a large number of independent variables or degrees of freedom. If the brain is viewed as a kind of puppeteer which computes unique motor patterns for every movement, controlling individual muscles across the human body would be a hugely complex task. Bernstein viewed coordinated movement as the process of ‘mastering redundant degrees of freedom’ (1967, pg. 127). To illustrate, consider his classic description of a blacksmith striking with a hammer. One’s expectation may be that the greatest variability would be found in the trajectory of the hammer, rather than in the motion of the individual arm joints. In reality, the opposite was found to be the case. While there was large variability in the motion of the arm joints, the hammer’s movement was discovered to be relatively consistent. Since it seemed clear that the brain could not send direct signals to control the variability of the hammer, Bernstein proposed that the joints were working as part of a linked system, correcting each other’s movements. This contradicted the notion of the brain as a controller, and highlighted the interdependence with which individual joints act.

The components of the motor system never act independently of each other. Almost every movement we make, no matter how simple or complex, requires the coordination and integration of various elements of a system (Bizzi, Tresch, Saltiel & Avella, 2000). Instead of treating individual muscles and joints separately, movements may be viewed as the product of generalised motor programs (Schmidt & Lee, 2005). These programs allow greater control over actions by reducing degrees of freedom and joining the many individual components into one combined unit or controllable system (Sporns & Edelman, 1993).
Such functional groupings of muscles or functional synergies (Bernstein, 1967, in Kelso, Tuller, Bateson & Fowler, 1984) allow a group of muscles to act as a single unit in order to accomplish a task. Kelso and colleagues (1984) used the speech system to test the existence of such functional synergies by perturbing participants’ jaws during speech. They hypothesised that when one element of the speech system is disturbed, all other functionally connected elements would spontaneously readjust to compensate and preserve the individual’s goal. This is exactly what the researchers found, uncovering flexible, immediate and goal-specific compensation and coordination between functionally related parts (Kelso et al. 1984).

The temporal consistency and flexibility with which humans perform actions is a topic of considerable interest (Ivry & Richardson, 2002). The general question of time has a long history and is not a simple one to comprehend. It is a topic which has garnered much attention, inspiring numerous articles and books and generating interest across many disciplines. How should we begin to think about time? How should we measure it? How should we treat it in psychological research? There is a wide array of publications addressing timing issues in various areas of psychology. However, a difficulty in the examination and investigation of time lies in its relative subjectivity. Time can be a personal construct and can vary between individuals. Even if we assume, for a moment, that the passage of time can be measured by the hands of a clock, the perception of time can fluctuate from one person to the next. A week can seem to stretch on for a year and a year can appear to fly by like a day.

One fundamental aspect of time is the change it brings. We cannot observe time directly, but we can examine how things change over time. This concept is fundamental to the dynamical systems approach to cognition. The word dynamics refers to something which evolves or changes over time. A dynamical system, then, describes how the state of a system changes or ‘behaves’ as time elapses (Norton, 1995). As time passes, a thought may become a word, an idea may become a reality, an intention may become a coordinated action.

In a series of experiments, Kelso and colleagues examined the coordination of rhythmic movement within and between limbs. They first investigated bimanual movement by asking participants to oscillate their fingers back and forth to the beat of a metronome. When the metronome frequency increased, participants abruptly and spontaneously switched from anti-phase (anti-synchronous) to in-phase (synchronous) finger movements. This effect did not occur when participants began with the in-phase movement, suggesting that participants are capable of performing only synchronous movement at a high frequency level. The principles underlying these ‘finger wagging’ experiments have proven crucial in understanding movement in general. Each finger may be seen as modelled by an internal oscillator, and the coupling between the oscillators allows the movement of one finger to become intricately related to the movement of the other. This coupling of two autonomous dynamical systems results in a composite system which is simpler in structure and more constrained than the aggregate of its components (Cummins, 2009). We will return to the discussion of dynamical systems later in relation to the perception of rhythm in music.

Kelso went on to explore multi-limb coordination dynamics using a Multiarticulator Coordination Device (MCD). He investigated the pattern of coordination between the arms and legs and, as in the bimanual experiments, he uncovered two predominant patterns.
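The abrupt switch from anti-phase to in-phase movement in these experiments can be illustrated with a short numerical sketch. It assumes the Haken-Kelso-Bunz (HKB) equation for the relative phase φ between two coupled oscillators, dφ/dt = −a sin φ − 2b sin 2φ, a model that is not named in this essay and is brought in here purely as an illustration. In that model the ratio b/a is usually taken to fall as movement frequency rises. The function name, starting phase, step size and the two ratios below are hypothetical choices.

```python
import math

def settle_relative_phase(b_over_a, phi0=math.pi - 0.1, dt=0.01, steps=20000):
    """Euler-integrate the assumed HKB equation (with a = 1) and return the final phase.

    phi = 0 corresponds to in-phase movement; phi = pi to anti-phase movement.
    """
    phi = phi0
    for _ in range(steps):
        # Assumed dynamics: dphi/dt = -sin(phi) - 2*(b/a)*sin(2*phi)
        phi += dt * (-math.sin(phi) - 2.0 * b_over_a * math.sin(2.0 * phi))
    return phi

# Slow movement (large b/a): a start close to anti-phase settles at anti-phase.
slow = settle_relative_phase(1.0)
# Fast movement (small b/a): the same start drifts away and settles in-phase.
fast = settle_relative_phase(0.1)
print(f"slow: {slow:.3f} rad, fast: {fast:.3f} rad")
```

Under these assumptions, the anti-phase pattern is stable only while b/a stays above 1/4, whereas the in-phase pattern is stable for every ratio. This mirrors the observation that only synchronous movement survives at high frequencies, and that no switch occurs when participants start in-phase.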
When participants were asked to adopt the most comfortable pattern for themselves, a spontaneous transition occurred from ‘trot’ to ‘pace’ and from ‘bound’ to ‘jump’. In other words, participants reliably switched from anti-phase to in-phase cycling when the metronome frequency increased. Furthermore, in 1991, Kelso, Buchanan and Wallace conducted experiments to explore rhythmic multi-joint arm coordination. They examined rhythmic movements within the arm under four different conditions, using infrared light-emitting diodes (IREDs) which were placed on the subject’s joints. The researchers found, for the first time, phase transitions within a limb. Once again, what emerged was an abrupt transition from anti-phase to in-phase movement as cycling frequency increased. What is of note in these experiments is that, in order to perform the tasks, participants had to integrate temporal information into the mental representation of the intended movement (Ivry & Richardson, 2001). Adjusting the metronome frequency seemed to precipitate a spontaneous transition in participants’ movements. A preference for synchronous movement was evident, and this was the only form of coordinated action which remained stable at high frequencies. These experiments paved the way for future research in coordination dynamics. Finger tapping and other repetitive movement tasks have since been widely used in examining the timing of events in coordinated movements.

We can appreciate the significance of timing for coordinated action if we consider a movement as seemingly straightforward as walking. Humans and other animals are capable of walking over long distances with relative ease. However, even an action as simple as this one has behind it a complex system of neuronal activation. Walking requires the time-varying coordination of a wide array of muscles and connections (Donelan & Pearson, 2004). When an individual is walking, the different elements of the movement are far from independent of each other. We can describe one part of the movement in relation to another; as one heel hits the ground, the other leg is midway through its cycle. Further, there is rhythmic movement of the upper limbs to allow for smooth motion (Zehr & Haridas, 2003). There is evidently a strong temporal structure underlying this apparently simple task.

A core question in research into human timing systems concerns whether there is a psychological construct that acts as an internal clock, a mental space where time can be perceived and durations measured. Studies involving finger tapping tasks seem to provide support for the idea of an internal timekeeping system (Ivry & Keele, 1989). Research in the area has examined two forms of coordination: synchronisation (tapping on the metronome beat) and syncopation (tapping in between successive metronome beats). Previous studies have suggested that participants abruptly switch from syncopation to synchronisation when asked to tap at higher rates (Kelso et al. 1992). Mayville and colleagues used fMRI to investigate the neural structures underlying such coordination. They discovered that the distribution of certain cortical and subcortical areas involved in this type of coordination with an external stimulus depends on the pattern of timing. Both synchronised and syncopated coordination involve the contralateral sensorimotor and caudal supplementary motor cortices, as well as the cerebellum. However, additional networks involving structures such as the basal ganglia and dorsolateral premotor cortex seem to be required for syncopated coordination.
This suggests that performing the two different forms of coordination require different strategies, with syncopation involving more attention demanding processes and requiring the planning of each individual movement (Mayville, Jantzen, Fuchs, Steinberg and Kelso, 2002). In general, research implicates cerebellar systems, as well as the basal ganglia, in timekeeping operations (Harrington, Haaland and Knight, 1998; Ivry & Richardson, 2001). Interest in this subject partly stems from research into disorders affecting motor skills which are caused by basal ganglia impairments such as Parkinson’s Disease (PD). Until recently, little was known about bimanual coordination problems facing individuals with PD. A study by Van Den Berg, Beek, Wagenaar and Wieringen (2000) demonstrated that such problems do exist, further highlighting the role of the basal ganglia in inter-limb coordination. Music is an interesting example of the extent to which timing is necessary for coordinated movement. There are few activities as demanding on time-ordered motor skills as music (Dahl, 2005). When a musician plays, their movements are intricately designed to fit a specific temporal pattern. The tempo of a piece of music can entirely alter the mood it expresses or the feelings it evokes. To play as part of an orchestra, musicians must be acutely aware of the precise timing of the piece and of the movements of other members. Coordinated rhythmic movement has the ability to bring groups together, to create one single unit from multiple independent parts. When an individual is first learning to play an instrument, they devote a lot of effort, attention and concentration to their exact movements, often having to mentally count out beats in order to



maintain the correct rhythm. It takes years of practice to master an instrument, to become truly capable of such complex coordination while at the same time conveying a message to the listener. Ericsson, Krampe and Tesch-Römer (1993) estimated that the top musicians have spent over 10,000 hours practising by the age of 21. But then, there are also musicians who seem to have rhythm ingrained inside them. In This Is It, the film featuring footage from Michael Jackson’s upcoming concert which was released after his death, there is a particular scene which stands out as an example of the extent to which expert musicians can sometimes ‘feel’ the rhythm or timing of a piece. During rehearsal, the infamous performer informs his musical team that he feels the timing is wrong: ‘No, you gotta… you gotta let it simmer. It’s got a moment where it has to simmer’. Sensitivity to, and awareness of, precise timing events is in the very essence of music.

One form of coordinated movement which involves substantial timing adjustments is speech production. Speech is one of the most complex motor behaviours in a human’s repertoire and requires the coordination of a wide range of components. The temporal patterning of elements in speech is often referred to as speech rhythm (O’Dell & Nieminen, 1999). Research into rhythm in humans was initially approached from a physiological perspective and examined, for example, circadian rhythms, neural rhythms or cardiac rhythms. However, rhythm is a topic which can relate to many different contexts. Generally speaking, rhythm is the whole feeling of movement in time, including pulse, phrasing, harmony and meter (Apel, 1972). From listening to the flow of speech in everyday situations, many hold the strong belief that there is a rhythmic basis behind the production of speech, although such rhythm may vary from language to language.
Such claims, however, have been largely unsupported by empirical research (Dauer, 1983; Crystal & House, 1990). Studies have failed to find evidence of an underlying rhythmic structure to speech, and attempts to distinguish regular intervals of stresses or syllables in speech have been largely unsuccessful (Eriksson, 1991, cited in O’Dell & Nieminen, 1999).

In a paper by Port, Tajima, and Cummins (1996), the authors posited that the temporal structures involved in cyclic behaviours such as walking or finger tapping (as described above) could also be found in the production of speech. In other words, they suggested that speech contains timing properties similar to those which exist in other forms of coordinated movement. Port and colleagues point out one important feature of human behaviour, namely, self-entrainment. Entrainment refers to the influence that the timing of the repetitive motions of one oscillator has on that of another oscillator, such that they adjust and fall into a simple temporal relationship with each other (Port et al., 1996). This tendency for oscillations to become related can be observed in a wide variety of systems. For example, Kelso, Southard and Goodman (1979, cited in Port et al., 1996) asked participants to perform an ‘easy’ and a ‘hard’ reach task. Unsurprisingly, the easier task took considerably less time to complete than the hard task. However, the authors found that when participants were asked to perform the two tasks simultaneously, the easier reach was held back or constrained by the more difficult one. Participants found it easier to coordinate the two movements such that they took the same amount of time to complete; however, such coupling introduces a constraint on the movement of each arm. In this way, the coupling of the two systems creates a composite system with a more constrained, simpler structure (Cummins, 2009). To return to the area of music for a moment, the phenomenon of self-entrainment and



coupling can be observed in the difficulties facing individuals who are learning to play certain musical instruments, such as the piano or guitar. Playing such instruments involves the two arms completing very different tasks and working somewhat independently of each other. Individuals learning to play these instruments, therefore, are required to control self-entrainment by attempting a de-coupling of the two arms from each other.

Cummins and Port (1998) introduced a speech cycling task to examine rhythmic constraints. Speech cycling involves participants repeating the same phrase over and over in time with an auditory stimulus while the temporal distribution of prominent syllable onsets, or beats, is examined. Repeating the same phrase allows a stable cycle to be created which is not evident in everyday unconstrained speech. The paper by Cummins and Port presents two speech cycling experiments in which participants are asked to repeat phrases in the form of ‘X for a Y’ (e.g. ‘big for a duck’) in time to a two-tone metronome. Participants were instructed to repeat the phrase such that the first word of the phrase (X) was aligned with the first, higher tone and the last word (Y) was aligned with the second, lower tone. Speech thereby became entrained to an external stimulus, allowing performance to be stabilised in time and temporal constraints to be observed. The interval duration between the low and the next high tone was varied in order to examine the extent to which the intervals between stress beats (feet) were rhythmically independent. The results indicated that participants could produce only three stable patterns, corresponding to a hierarchical nesting of one unit (the internal timing of each phrase) within another, larger unit (the overall repetition cycle; Cummins, 2002). This task is artificial and bears little resemblance to natural conversation.
However, the findings provide evidence for the presence of rhythmic constraints on stress timing, such that stress beats are constrained to occur at specific phases of a cycle.

Many of us have had the curious experience of finding our speech slowing down or speeding up or our accents changing slightly depending on the individual to whom we are talking. This is not only a perceptual phenomenon; empirical evidence has demonstrated that adults move in synchrony with each other’s speech rhythms (Condon, 1976) and also converge in dialect (Giles, Coupland, & Coupland, 1991) and speaking rate (Street, 1984). Cummins (2009) views rhythm as an affordance for movement, permitting coordination within a stimulus. Affordance is a concept associated with psychologist J.J. Gibson and refers to a property of the environment which is relevant to the movement potential of the organism (Cummins, 2009). For example, one could speak of a chair’s ‘sit-ability’ or a stairs’ ‘climb-ability’. When viewed in this context, can we still conceive of rhythm as being involved in speech processes?

As mentioned, there has been an abundance of research examining rhythmic regularities underlying speech production. Saying prayers aloud in church or chanting a slogan in a protest demonstrates the capacity humans have for synchronous speech. Cummins (2009) examined synchronisation between two speakers who were reading a prepared text simultaneously. This is a task humans seem to find exceptionally easy, naturally locking together with the speech of another individual and maintaining a high level of synchrony without the need for practice. Indeed, practice does not usually improve the degree of synchrony one achieves, suggesting that performance depends on the strategies we use when speaking normally. This study suggests that the speech signal contains



a huge amount of information and that synchronisation depends on several kinds of information inherent in the speech signal. It is possible to interpret this task as creating a coupling relation between the two speakers, each of whom may be viewed as an autonomous dynamical system involved in mutual entrainment processes (Cummins, 2009).

Further evidence for a coupling relation between the perception of speech from a certain source and the system of motor speech production of the listener comes from the field of neuroscience. Transcranial Magnetic Stimulation (TMS) studies have found that simply listening to the speech of others elicits an increase in evoked muscle potentials of speech articulators (Fadiga, Craighero & Olivier, 2005). Evidence has emerged in recent years that there are mirror neurons (Rizzolatti, Fadiga, Gallese & Fogassi, 1996) located in the premotor cortex which recognise the meaning of other people’s actions. It seems that when we observe an action performed by another individual, neural activity is produced in our brain which is similar to that produced when we are performing the action ourselves. Such findings can be reconciled with Cummins’ treatment of rhythm as an entrainment for coordinated movement. By looking at rhythm through this perspective, the importance of temporal structure in speech processes becomes clear and is consistent with its unifying role in other forms of coordinated movement. Cummins (2009) demonstrates that, while speech is a uniquely human, complex activity, the basic underlying principles are applicable to other forms of coordinated movement.

The importance of rhythmic coordination in areas such as music, dance and even speech has generated an abundance of recent research. It is not unusual for groups of people who are working together to sing or hum a tune in order to keep to a particular rhythm.
Large groups of musicians in an orchestra, all playing different instruments, can be unified by the underlying rhythm. Rhythm allows for coordination and organisation of movement. It may take years to become a professional musician, but even those with no musical training whatsoever have little trouble recognising the rhythm and following the tempo of a musical piece. We appear to be capable of perceiving temporal regularity in musical sequences with very little effort. This is quite a remarkable feat, since music performances are typically full of temporal irregularities. Large and Palmer (2002) addressed this, proposing that the perception of temporal regularity is achieved by a system of internal oscillations which entrain to the rhythms in complex musical pieces. A dynamic attending framework, as described by Large and Jones (1999), proposes a coordinated relationship between external rhythms (created by external stimuli) and internal rhythms (oscillations) that comes about as a result of entrainment.

Dynamic systems theories view time as part of an intrinsic whole, which cannot be isolated within an organism. There is no sense of one system controlling another, but an emphasis on mutual dependence, naturally cutting across the boundaries of brain, body and the external environment. These theories suggest a deep and sophisticated level of interaction between the organism and its environment. Kelso (1995) examined the speed at which horses chose to move when in a natural environment. He noted that horses have a restricted range of speeds for every gait in their repertoire, and that the speed they chose when allowed to move about freely was optimal for oxygen consumption. He concluded that these are not arbitrary gaits; rather, animals have evolved in such a way that they each have a discrete set of movements, allowing them to act



in a selective, coordinated way. It seems that while we are autonomous beings, we are far from independent of our environment.

Questions of time (what it is, how we measure it, how we should think about it and treat it in psychological research) continue to generate much interest, discussion and debate among psychologists. As humans, we are remarkably aware of timing processes. We are capable of detecting temporal consistencies and inconsistencies, perceiving rhythms and measuring durations. Time relates to every aspect of our lives and is a core aspect of the relationship between an organism and its environment. However, the problem of time appears to be underestimated within cognitive science in general, and it is only recently that psychologists have begun to consider in detail how time is represented and utilised for coordinated actions. By interpreting rhythm as an affordance for the entrainment of movement, as suggested by Cummins, we move towards a more coherent understanding of the role of timing processes in coordination. Through this perspective, the role of time is one of organisation and unification. It is inherent in every action we produce, from speech processes and music perception, to simply throwing a ball or walking, and the role of time should therefore be considered central to all forms of coordinated movement.



Mathematical Studies panel


Prof. Daniel Heffernan, NUIM (chair)
Dr. Oliver Mason, NUIM
Prof. Stephen Kirkland, NUIM
Dr. Edward Cox, UCD
Prof. John Carroll, DCU
Dr. Paul Watts, NUIM
Dr. Anthony Small, NUIM
Dr. Wilhelm Huisinga, NUIM

Judges’ comments

This paper is quite impressive in several ways: first and foremost, in its clarity. The quality of writing and presentation of the material would be above average in a journal article, and so is all the more notable in work by an undergraduate. Secondly, the author demonstrates an admirable and deep understanding of the areas of mathematics he uses, such as Lie groups and algebras, complex analysis and especially differential geometry. This is further supported by his judicious use of citations, and it seems apparent that the author is well-versed in all of the references he uses. As regards the project topic, it is an interesting one – namely, the determination of the configurations (instantons) which minimise the action functional in a particular metric space – and the paper always keeps the goal in mind. Further, in some of the later chapters of the report, the author presents some original calculations, which, given the advanced nature of the material, is highly impressive. On the whole it is a superb and highly readable introduction to and review of the title.



Mathematical Studies

Instantons and the Taub-NUT space
Chris Blair

Abstract

A general construction of Yang-Mills instantons on the Taub-NUT space recently appeared in [15], including as an example the expression for an SU(2) instanton of unit instanton charge and vanishing monopole charges. We explain the techniques and ideas used in this construction, and investigate the properties of the instanton solution. In particular, by taking the limit in which Taub-NUT space reduces to $\mathbb{R}^4$ we show how the solution can be related to the familiar $k = 1$ SU(2) instanton on $\mathbb{R}^4$. We also investigate the behaviour of the solution at the origin and at infinity in Taub-NUT, and examine the instanton in the $\mathbb{R}^3 \times S^1$ limit of the Taub-NUT space, in which we expect to obtain a caloron. The ideas we need are based on the ADHM construction of instantons on $\mathbb{R}^4$ as well as the Nahm transform which produces monopoles on $\mathbb{R}^3$. We review these constructions, and explain how they are incorporated into the bow diagram formalism of [15]. We also demonstrate how bow diagrams may be used to obtain the $k$-centered Taub-NUT space and a natural connection over it via a hyperkähler quotient.

Introduction

Instantons are special self-dual solutions to classical Yang-Mills theory [1] defined on a four-dimensional space, which minimise the Yang-Mills action. Their study brings together many different areas of mathematics and physics. From the mathematical viewpoint the study of instantons involves areas such as fibre bundles, twistor theory and algebraic geometry. Physically, instantons can have important effects in quantum field theory, for example leading to tunnelling between different vacuum states in quantised Yang-Mills theory [3] [4]. The first instanton solution, for SU(2) Yang-Mills theory, was found by Belavin, Polyakov, Schwarz and Tyupkin [2], who also noted that instantons could be classified by an integer topological invariant $k$. Their solution was for the $k = 1$ instanton – later SU(2) instantons with $k > 1$ were found using various ansätze by Witten, ’t Hooft and others [5] [6], while Atiyah and Ward [7] [8] utilised twistor theory methods to produce instanton solutions.



An approach that gave all instanton solutions on $\mathbb{R}^4$ for any compact gauge group was found by Atiyah, Drinfel’d, Hitchin and Manin [9] [10]. This has come to be known as the ADHM construction, and reduces the problem of producing instantons to that of finding a certain set of algebraic data satisfying some particular constraints. The ADHM construction was modified by Nahm [11] [12] [13] to produce magnetic monopoles (originally introduced by Dirac [17] in 1931 and shown to arise in non-abelian gauge theory by ’t Hooft [19] and Polyakov [20] in 1974). Static monopole configurations on $\mathbb{R}^3$ can be viewed as instantons on $\mathbb{R}^4$ which are invariant in one direction. The algebraic data of the ADHM construction are replaced by a set of four matrices $(T_0, T_1, T_2, T_3)$ depending on an auxiliary variable $s$ and obeying a system of non-linear ordinary differential equations known as the Nahm equations. The Nahm transform can in turn be generalised to produce calorons [14] (instantons on $\mathbb{R}^3 \times S^1$), and has a number of other uses (reviewed in [26]).

Elements of both the ADHM construction and the Nahm transform will be needed in this report, which aims to construct and then analyse Yang-Mills instantons on the Taub-NUT space [32] [33]. This space is an example of a gravitational instanton. A gravitational instanton [27] [28] is a four-dimensional manifold which is a vacuum solution to Einstein’s equations of general relativity and which has self-dual Riemannian curvature tensor. As these spaces are four-dimensional, it makes sense to define a Yang-Mills theory on them and look for the instantons of this theory. Non-compact gravitational instantons are classified by their asymptotic behaviour. In the simplest case, the gravitational instanton behaves locally like $\mathbb{R}^4$ at infinity and is termed asymptotically locally Euclidean (ALE).
Such spaces were completely classified by Kronheimer [30], who showed that every ALE space is diffeomorphic to a minimal resolution of $\mathbb{C}^2/\Gamma$, where $\Gamma$ is a finite subgroup of SU(2). The construction of all Yang-Mills instantons on ALE spaces was accomplished by Kronheimer and Nakajima [31], using a generalisation of the ADHM construction.

Of interest to us in this report are asymptotically locally flat (ALF) spaces. An ALF space is a gravitational instanton that resembles $\mathbb{R}^3 \times S^1$ at infinity. In contrast with ALE spaces (of which $\mathbb{R}^4$ is the simplest example), ALF spaces are essentially curved. The construction of instantons on a curved space leads to difficulties both technical and conceptual. The construction of instantons on ALE spaces relied on the fact that these spaces are essentially deformations of flat space, so that it is possible to easily generalise the ADHM construction from $\mathbb{R}^4$. On a curved space, more work is needed.

The $k$-centered Taub-NUT space [27] is an ALF space with metric
\[ ds^2 = \frac{1}{4}\left[ V\, d\vec{x}^{\,2} + \frac{(d\tau + \omega)^2}{V} \right], \qquad V = l + \sum_{\sigma=1}^{k} \frac{1}{|\vec{x} - \vec{\nu}_\sigma|}, \tag{1.1} \]
where $\vec{x}$ is a three-component vector, $\tau$ is periodic with period $4\pi$, $l$ is a constant and the $\vec{\nu}_\sigma$ are constant vectors (called the centres of this multi-Taub-NUT space). The one-form $\omega$ is such that $d\omega = \star_3\, dV$. Note that there is a coordinate singularity at each centre, $\vec{x} = \vec{\nu}_\sigma$. If $k = 1$, then the space is known just as the Taub-NUT space. Setting $l = 0$, the Taub-NUT space reduces to $\mathbb{R}^4$. A number of instanton solutions on the Taub-NUT space have been found over the years, for instance in [40] [41] [42] [43].
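The statement that the Taub-NUT space reduces to $\mathbb{R}^4$ when $l = 0$ can be checked in a few lines for the single-centre case. The sketch below is illustrative only and rests on assumptions not fixed by the text above: it takes the metric in the commonly used normalisation $ds^2 = \frac{1}{4}\left[V\,d\vec{x}^{\,2} + V^{-1}(d\tau + \omega)^2\right]$ with $V = l + 1/r$, uses spherical coordinates $(r, \theta, \phi)$ on the three-dimensional part, and chooses $\omega = \cos\theta\, d\phi$ (one representative with $d\omega = \star_3\, dV$). The radial variable $\rho$ and the symbol $d\Omega_3^2$ are introduced here purely for the illustration.

```latex
% Single-centre Taub-NUT at l = 0, so that V = 1/r (assumed normalisation, see lead-in).
\begin{align*}
ds^2\big|_{l=0}
  &= \frac{1}{4}\left[\frac{dr^2}{r}
     + r\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)
     + r\left(d\tau + \cos\theta\, d\phi\right)^2\right] \\
  % change of radial variable: r = \rho^2, dr = 2\rho\, d\rho, hence dr^2 / r = 4\, d\rho^2
  &= d\rho^2 + \frac{\rho^2}{4}\left[d\theta^2 + \sin^2\theta\, d\phi^2
     + \left(d\tau + \cos\theta\, d\phi\right)^2\right] \\
  &= d\rho^2 + \rho^2\, d\Omega_3^2 ,
\end{align*}
% where d\Omega_3^2 = (1/4)[ d\theta^2 + \sin^2\theta\, d\phi^2 + (d\tau + \cos\theta\, d\phi)^2 ]
% is the round metric on the unit three-sphere written in Euler angles,
% which requires \tau to have period 4\pi.
```

The last line is the flat metric on $\mathbb{R}^4$ in polar form, and the period $4\pi$ of $\tau$ is exactly what is needed for the angular part to close up into a full three-sphere rather than a quotient of it.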



A general construction of instantons on the Taub-NUT space has recently been described by Cherkis [15] [16]. The aim of this report is to review this construction, and use it to obtain an expression for the SU(2) instanton with unit instanton number, as found in [15]. We then carry out an analysis of this solution, in particular showing how to relate it to the $k = 1$ SU(2) instanton on $\mathbb{R}^4$ in the $l = 0$ limit.

Our report is structured as follows: firstly, we review classical Yang-Mills theory and the ADHM construction of all instantons on $\mathbb{R}^4$. We exhibit the basic $k = 1$ instanton solution using the ADHM method as described in [10]. In chapter 3 we introduce the theory of monopoles. These are described by the Bogomolny equation, which can be obtained from the Yang-Mills self-duality equation by supposing invariance in one direction in $\mathbb{R}^4$. If we suppose invariance in three directions, we obtain a system of non-linear matrix ordinary differential equations known as the Nahm equations. We describe the Nahm transform which takes solutions of the Nahm equations to solutions of the Bogomolny equation. We outline what is meant by the notion of a magnetic monopole, in both abelian and non-abelian gauge theory, and use the Nahm transform to explicitly construct the most basic examples. We then introduce, in chapter 4, the idea of a moment map. This is an extension of conserved quantities in classical mechanics to the action of Lie groups on manifolds, and in fact can be thought of as underlying both the ADHM and Nahm transforms. We illustrate how to formulate these transforms in terms of moment maps. In doing so we also introduce the building blocks of the bow diagrams we will use for Taub-NUT space. Finally, in chapter 5 we write down the Taub-NUT metric, and explain how it relates to the flat space metric. We show how the Taub-NUT space can be derived from bow diagrams as a hyperkähler quotient.
We also extend this hyperkähler quotient to obtain the $k$-centered Taub-NUT space and a natural connection over it. In chapter 6 we come to the construction of SU(2) instantons on Taub-NUT. We describe how this is achieved using the bow diagram formalism and form the appropriate linear operator D which is essential for the construction. Chapter 7 contains the result of applying this construction to produce an SU(2) instanton on Taub-NUT, producing a general expression for this instanton, as originally found in [15]. The chief novel part of this project is contained in chapter 8, where we analyse this instanton. In particular we demonstrate how to write it in a gauge that reduces directly to a standard expression for the $k = 1$ instanton on $\mathbb{R}^4$ in the $l = 0$ limit. We also examine the behaviour of the instanton both at the origin of the Taub-NUT space and at infinity, as well as briefly examining one further limit in which we would expect to obtain a caloron solution. The appendix contains the explicit details of some calculations, as well as some other results.

I would like to thank my supervisor, Sergey Cherkis, for his help and encouragement, as well as David Leen, Sam Palmer, Ronan Shee and Eoin O’Byrne (Islands website co-workers) for many useful discussions on topics related to this report.



Chapter 2 Yang-Mills instantons

2.1 Yang-Mills theory

Yang-Mills theory was originally introduced [1] to model isotopic spin in nucleons, and can be thought of as a non-abelian generalisation of Maxwell electromagnetism. Although we will only be concerned with classical properties of instantons in this report, it is worth noting that Yang-Mills theory in its quantised form is crucial to the Standard Model of modern particle physics. An initial motivation in the study of instantons was the observation that they provide the absolute minima of the classical, Euclidean Yang-Mills action, as we will show below. In the path integral approach to quantum field theory we integrate the exponential of $i$ times the Minkowskian action over possible field configurations. The dominant contributions to this integral are expected to correspond to classical field configurations which minimise the Euclidean action obtained by Wick rotation, $t \to it$ (as then $e^{iS}$ becomes $e^{-S_{\mathrm{Euc}}}$, which is a maximum when the Euclidean action is minimised) – and hence correspond to Yang-Mills instantons.

2.1.1 Self-duality equations

We begin by reviewing classical Yang-Mills gauge theory on Euclidean space $\mathbb{R}^4$, loosely following Atiyah’s lectures [10]. Let $G$ be a compact Lie group with Lie algebra $\mathfrak{g}$. The objects of our study are vector bundles over $\mathbb{R}^4$ whose structure group is $G$ (i.e. under changes of local trivialisation, the fibres transform in a representation of $G$). The connection on our vector bundle is a Lie algebra valued 1-form $A$, with covariant derivative $D = d - iA$ and curvature $F = DA = dA - iA \wedge A$. Note that we consider both $A$ and $F$ to be hermitian. A particular choice of local trivialisation in which to represent $A$ and $F$ is known as choosing a gauge. A change of trivialisation is known as a gauge transformation and takes the form
\[ A \to g^{-1} A g + i g^{-1} dg, \qquad F \to g^{-1} F g \tag{2.1} \]
for $g : \mathbb{R}^4 \to G$.

In component form we write $A = A_\mu dx^\mu$ (where $A_\mu$ is a linear combination of hermitian generators of the Lie algebra $\mathfrak{g}$), and $F = \frac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu$, with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i[A_\mu, A_\nu]$. As we work over a Euclidean space we do not distinguish upper and lower indices. The Yang-Mills action in component form is
\[ S = \frac{1}{2} \int d^4x\ \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu} \tag{2.2} \]
or invariantly
\[ S = \int \mathrm{tr}\, F \wedge \star F. \tag{2.3} \]

Here $\star$ denotes the Hodge star operation on $\mathbb{R}^4$, which takes $r$-forms to $(4 - r)$-forms. In particular two-forms are mapped to two-forms, and the tensor $\star F$ is called the dual tensor of $F$. The action of $\star$ is defined in terms of a particular orientation of coordinates. We will take $(x^1, x^2, x^3, x^4)$ as being positively oriented. Then, for instance, we can set $\star(dx^1 \wedge dx^2) = dx^3 \wedge dx^4$,



with all other possibilities being determined by cyclic permutations of the indices. In terms of components we have $(\star F)_{\mu\nu} = \frac{1}{2} \epsilon_{\mu\nu\rho\sigma} F_{\rho\sigma}$. The equations of motion which follow from varying the action are
\[ D_\mu F_{\mu\nu} \equiv \partial_\mu F_{\mu\nu} - i[A_\mu, F_{\mu\nu}] = 0. \tag{2.4} \]

The key constraint we place on solutions to these equations is that they have finite action. This seemingly simple requirement has powerful topological consequences, which we will explore in the next section. The curvature form also satisfies the Bianchi identity:
\[ D_\mu F_{\nu\rho} + D_\nu F_{\rho\mu} + D_\rho F_{\mu\nu} = 0 \tag{2.5} \]
or more compactly
\[ D_\mu (\star F)_{\mu\nu} = 0. \tag{2.6} \]

Comparing (2.4) and (2.6) we see that the equations of motion will be automatically fulfilled if
\[ F_{\mu\nu} = \pm (\star F)_{\mu\nu}. \tag{2.7} \]
This equation is known as the self-duality equation. We say that $F$ is self-dual if it equals its dual $\star F$, and anti-self-dual if it equals $-\star F$. As the definition of $\star$ depended on an orientation of coordinates, we can interchange self-dual and anti-self-dual solutions of the self-duality equation by changing this orientation. Instead of trying to solve the full Yang-Mills equation of motion (2.4), which is a second-order non-linear partial differential equation in $A_\mu$, we can instead try to find connections satisfying the (somewhat simpler) self-duality equation. Solutions of this equation are known as Yang-Mills instantons.
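The action of the Hodge star on two-forms, and the splitting of a two-form into self-dual and anti-self-dual parts, can be checked numerically. The following is a minimal sketch rather than anything taken from the report: it assumes the component formula (dual of F)_{mu nu} = 1/2 eps_{mu nu rho sigma} F_{rho sigma} with the orientation (x1, x2, x3, x4), stores an ordinary real-valued (abelian) two-form at a single point as an antisymmetric 4x4 array with 0-based indices, and the helper names `levi_civita`, `hodge`, `two_form` and `combine` are hypothetical.

```python
def levi_civita(*idx):
    """Sign of idx as a permutation of distinct integers; 0 if an index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(
        1
        for i in range(len(idx))
        for j in range(i + 1, len(idx))
        if idx[i] > idx[j]
    )
    return -1 if inversions % 2 else 1


def hodge(F):
    """Return the dual, (*F)_{mu nu} = 1/2 eps_{mu nu rho sigma} F_{rho sigma}."""
    return [
        [
            0.5 * sum(levi_civita(m, n, r, s) * F[r][s]
                      for r in range(4) for s in range(4))
            for n in range(4)
        ]
        for m in range(4)
    ]


def two_form(components):
    """Antisymmetric 4x4 array from {(mu, nu): value} with 0-based mu < nu."""
    F = [[0.0] * 4 for _ in range(4)]
    for (m, n), value in components.items():
        F[m][n] = value
        F[n][m] = -value
    return F


def combine(a, F, b, G):
    """Entrywise linear combination a*F + b*G."""
    return [[a * F[m][n] + b * G[m][n] for n in range(4)] for m in range(4)]


# dx^1 ^ dx^2 (0-based indices 0 and 1) should be mapped to dx^3 ^ dx^4.
star_dx12 = hodge(two_form({(0, 1): 1.0}))

# A generic two-form, split into self-dual and anti-self-dual parts.
F = two_form({(0, 1): 1.0, (0, 2): -2.0, (0, 3): 0.5,
              (1, 2): 3.0, (1, 3): 4.0, (2, 3): -1.5})
star_F = hodge(F)
F_plus = combine(0.5, F, 0.5, star_F)    # expected to satisfy *F_plus = +F_plus
F_minus = combine(0.5, F, -0.5, star_F)  # expected to satisfy *F_minus = -F_minus
```

Because the star squares to the identity on two-forms in four Euclidean dimensions, every two-form decomposes uniquely as `F_plus + F_minus`; the self-duality equation simply demands that one of the two parts vanish.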

2.1.2 Topological constraints

We require that the action (2.2) be finite. This is equivalent to requiring that the curvature $F_{\mu\nu}$ decay to zero as we go to spatial infinity, which in turn says that the connection $A_\mu$ must go to within a gauge transformation of zero:
\[ A_\mu \to i g^{-1} \partial_\mu g \quad \text{as } |x| \to \infty. \tag{2.8} \]
If we consider some bounding sphere $S^3_\infty$ at infinity then the above condition gives a map $g : S^3_\infty \to G$ from this sphere into the gauge group $G$. From now on we restrict to the case $G = SU(2)$. As well as being the simplest and most studied case, it also in fact contains all the details we need to study other gauge groups. For instance, for another classical compact Lie group $G$ we can consider deforming $g$ so that it is a map from $S^3_\infty$ into an SU(2) subgroup of $G$, and all the results from this section will continue to hold. The crucial point to note about SU(2) is that it is isomorphic to a three-sphere, so in effect we have a map $g : S^3 \to S^3$. Such a map of spheres is classified by an integer invariant $k$, which gives the number of times the first sphere “wraps” around the second sphere (this is most easily visualised for the circle $S^1$ by imagining the first circle winding around the second). More



formally, $k$ is an element of the homotopy group $\pi_n(S^n) \cong \mathbb{Z}$, and in fact is a topological invariant characterising the connection $A$. It is important to note that the map $g(x)$ for $k \neq 0$ is only defined for large enough $|x|$. If we try to apply it everywhere in $\mathbb{R}^4$ we will find that there is some point where $g(x)$ is not properly defined. In the trivial case $k = 0$ we have that $g(x)$ is constant. We can then apply $g$ globally on $\mathbb{R}^4$ to find that the connection $A_\mu$ is everywhere gauge equivalent to zero, and has zero curvature. A connection with vanishing curvature is termed flat.

A related description of $k$ can be obtained in terms of a principal SU(2) bundle over $S^4$. For connections whose curvature goes to zero at infinity we may compactify $\mathbb{R}^4$ by adjoining to it the idealised point at infinity. The resulting space $\mathbb{R}^4 \cup \{\infty\}$ is homeomorphic to $S^4$. We then consider an SU(2) fibration over this $S^4$. To describe the bundle, we introduce two local patches on the base space. We suppose that $R$ is an arbitrarily large positive number, so that we have $A_\mu \sim i g^{-1}(x) \partial_\mu g(x)$ as $|x| \to R$ (with $x \in \mathbb{R}^4$). We use $R$ to define the “northern” and “southern” hemispheres $U_N$ and $U_S$ of $S^4$. Over each hemisphere we can trivialise the bundle, and on their intersection we have a transition function which is just a gauge transformation:
\[ A_N = g^{-1} A_S\, g + i g^{-1} dg. \tag{2.9} \]
Here $A_N$ denotes the expression for the connection over $U_N$, and $A_S$ that over $U_S$. All the topological information about the bundle is then contained in the function $g(x)$. This is defined on $U_N \cap U_S \cong S^3$, so again we have a map classified by an integer $k$. Here $k$ expresses the non-triviality of the SU(2) principal bundle over $S^4$. It remains to find a concrete expression for $k$ in terms of $A$ or $F$. We claim that
\[ k = \frac{1}{16\pi^2} \int d^4x\ \mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu}. \tag{2.10} \]
This can be written invariantly as
\[ k = \frac{1}{8\pi^2} \int \mathrm{tr}\, F \wedge F. \tag{2.11} \]
The four-form $\frac{1}{8\pi^2}\, \mathrm{tr}\, F \wedge F$ is an invariant of our original vector bundle known as the second Chern class. Let us now sketch the demonstration that the integral (2.10) does indeed give us the correct value of $k$. First, we observe that $\mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu}$ can be written as a total divergence:
\[ \mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu} = \partial_\mu K_\mu, \qquad K_\mu = 2 \epsilon_{\mu\nu\rho\sigma}\, \mathrm{tr}\left( A_\nu \partial_\rho A_\sigma - \frac{2i}{3} A_\nu A_\rho A_\sigma \right), \tag{2.12} \]
which follows from direct calculation. Hence, we can use Gauss’ theorem to rewrite the integral in (2.10) as an integral over a bounding sphere at infinity:
\[ k = \frac{1}{16\pi^2} \oint_{S^3_\infty} dS_\mu\, K_\mu. \tag{2.13} \]



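Now at infinity $F_{\mu\nu} = 0$, which can be expressed cleverly as $\epsilon_{\mu\nu\rho\sigma} \partial_\rho A_\sigma = i \epsilon_{\mu\nu\rho\sigma} A_\rho A_\sigma$. Inserting this and using the asymptotic form of $A_\mu$ we find
\[ k = \frac{1}{24\pi^2} \oint_{S^3_\infty} dS_\mu\, \epsilon_{\mu\nu\rho\sigma}\, \mathrm{tr}\left( g^{-1}\partial_\nu g\ g^{-1}\partial_\rho g\ g^{-1}\partial_\sigma g \right). \tag{2.14} \]

Finally, one can show (see for instance the original paper [2] by Belavin et al. for an approach involving the group measure, or [37] for a cohomological proof) that this integral gives the integer $k$ classifying the number of times the map $g(x)$ winds around the group SU(2), and is indeed a topological invariant.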
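The idea of an integer counting how many times one sphere wraps around another is easiest to see in the circle case mentioned earlier. The sketch below is an illustration only and does not evaluate the integral for SU(2): it assumes a map from the circle to the non-zero complex numbers, samples it at equally spaced angles, and adds up the small changes in phase. The function name `winding_number` and the sample maps are hypothetical.

```python
import cmath
import math


def winding_number(g, samples=2000):
    """Number of times the closed curve theta -> g(theta) winds around the origin.

    g must be 2*pi periodic and never zero; samples must be large enough that
    the phase changes by less than pi between neighbouring sample points.
    """
    total = 0.0
    previous = g(0.0)
    for i in range(1, samples + 1):
        current = g(2.0 * math.pi * i / samples)
        total += cmath.phase(current / previous)  # small signed phase increment
        previous = current
    return round(total / (2.0 * math.pi))


# theta -> exp(3 i theta) wraps the circle three times around itself.
k_three = winding_number(lambda t: cmath.exp(3j * t))

# Reversing the direction of travel flips the sign of the invariant.
k_minus_two = winding_number(lambda t: cmath.exp(-2j * t))

# A constant map does not wind at all (the analogue of the trivial case k = 0).
k_zero = winding_number(lambda t: 1 + 0j)

# A continuous deformation that never passes through the origin leaves k unchanged.
k_deformed = winding_number(lambda t: cmath.exp(1j * t) + 0.5 * cmath.exp(-3j * t))
```

The last example is the point of the exercise: the curve is visibly distorted, yet the count is the same as for the undeformed map, which is what it means for the count to be a topological invariant.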

2.1.3 A bound on the action

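Recall the expression for the action:
\[ S = \frac{1}{2} \int d^4x\ \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu}. \tag{2.15} \]
Now for any two matrices $B$ and $C$ the inner product $\mathrm{tr}\, B^\dagger C$ is positive definite. As both $F_{\mu\nu}$ and its dual are hermitian we have the obvious inequality
\[ \int d^4x\ \mathrm{tr} \left( F_{\mu\nu} \mp (\star F)_{\mu\nu} \right) \left( F_{\mu\nu} \mp (\star F)_{\mu\nu} \right) \geq 0. \tag{2.16} \]
Expanding out the left-hand side and using
\[ \mathrm{tr}\, (\star F)_{\mu\nu} (\star F)_{\mu\nu} = \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu} \tag{2.17} \]
we have
\[ 2 \int d^4x\ \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu} \mp 2 \int d^4x\ \mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu} \geq 0, \tag{2.18} \]
\[ \int d^4x\ \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu} \geq \pm \int d^4x\ \mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu}. \tag{2.19} \]
Now $\mathrm{tr}\, F_{\mu\nu} F_{\mu\nu}$ is non-negative, so we can take the norm on both sides to find
\[ \int d^4x\ \mathrm{tr}\, F_{\mu\nu} F_{\mu\nu} \geq \left| \int d^4x\ \mathrm{tr}\, F_{\mu\nu} (\star F)_{\mu\nu} \right| \tag{2.20} \]
or, referring to equation (2.10),
\[ S \geq 8\pi^2 |k| \tag{2.21} \]
with equality when
\[ F_{\mu\nu} = \pm (\star F)_{\mu\nu}. \tag{2.22} \]
This important result shows that instantons correspond to the minima of the Yang-Mills action. If $k > 0$ then equality implies the self-duality equation $F_{\mu\nu} = +(\star F)_{\mu\nu}$, while if $k < 0$ we instead have the anti-self-duality equation $F_{\mu\nu} = -(\star F)_{\mu\nu}$. If $k = 0$ then the curvature $F_{\mu\nu}$ is identically zero. We refer to $k$ as either the instanton number or the instanton charge.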
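The algebra behind this bound can be spot-checked pointwise. The sketch below is a simplified illustration under stated assumptions, not the report's calculation: it drops the trace and the integral by working with a real-valued (abelian) two-form at a single point, stored as its six independent components in the order (F12, F13, F14, F23, F24, F34), and uses the orientation (x1, x2, x3, x4) for the dual. The helper names `hodge` and `contract` are hypothetical.

```python
import random


def hodge(F):
    """Dual of a two-form on R^4 given as (F12, F13, F14, F23, F24, F34)."""
    f12, f13, f14, f23, f24, f34 = F
    return (f34, -f24, f23, f14, -f13, f12)


def contract(F, G):
    """Full contraction F_{mu nu} G_{mu nu}; each independent component counts twice."""
    return 2.0 * sum(a * b for a, b in zip(F, G))


random.seed(0)  # fixed seed so that repeated runs use the same sample
samples = [tuple(random.uniform(-1.0, 1.0) for _ in range(6)) for _ in range(1000)]

# The dual has the same norm as the original two-form.
norm_preserved = all(
    abs(contract(hodge(F), hodge(F)) - contract(F, F)) < 1e-12 for F in samples
)

# Pointwise version of the bound: F.F >= |F.(dual of F)|.
bound_holds = all(
    contract(F, F) >= abs(contract(F, hodge(F))) - 1e-12 for F in samples
)

# The bound is saturated by the self-dual part of any two-form.
F = samples[0]
F_self_dual = tuple(0.5 * (a + b) for a, b in zip(F, hodge(F)))
gap = contract(F_self_dual, F_self_dual) - abs(contract(F_self_dual, hodge(F_self_dual)))
```

In the non-abelian theory the same steps go through with a trace over the Lie algebra and an integral over space, which is where the integer $k$ enters.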



2.1.4 SU(2) connections and quaternions

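An important link in the study of instantons is the isomorphism between SU(2) and the group Sp(1) of quaternions of unit norm. We will denote the unit imaginary quaternions by $e_1, e_2, e_3$. These satisfy the identities
\[ e_1^2 = e_2^2 = e_3^2 = -1, \qquad e_1 e_2 = -e_2 e_1 = e_3, \quad e_2 e_3 = -e_3 e_2 = e_1, \quad e_3 e_1 = -e_1 e_3 = e_2. \tag{2.23} \]
We will frequently represent these in terms of the Pauli sigma matrices as
\[ e_j = -i \sigma_j. \tag{2.24} \]
This representation also makes clear the isomorphism between the Lie algebra of Sp(1), which is generated by the unit imaginary quaternions, and the Lie algebra of SU(2), which is generated by hermitian $T_1, T_2, T_3$ satisfying $[T_i, T_j] = i \epsilon_{ijk} T_k$. Clearly $T_j = \frac{1}{2} \sigma_j = \frac{i}{2} e_j$.

A point $x = (x^1, x^2, x^3, x^4) \in \mathbb{R}^4$ naturally corresponds to the quaternion $x = x^1 e_1 + x^2 e_2 + x^3 e_3 + x^4$. A hermitian SU(2) potential $A$ is of the form $A = i A_\mu(x)\, dx^\mu$ where $A_\mu(x)$ is a pure imaginary function of the quaternion $x$. A short calculation shows that
\[ dx \wedge d\bar{x} = -2 \left[ e_1 (dx^4 \wedge dx^1 + dx^2 \wedge dx^3) + e_2 (dx^4 \wedge dx^2 + dx^3 \wedge dx^1) + e_3 (dx^4 \wedge dx^3 + dx^1 \wedge dx^2) \right] \tag{2.25} \]
which is self-dual in the orientation $(x^4, x^1, x^2, x^3)$. Similarly, the two-form $d\bar{x} \wedge dx$ is anti-self-dual in this orientation.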
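The correspondence between quaternions and 2x2 matrices can be verified directly. The following sketch assumes the representation e_j = -i sigma_j of the unit imaginary quaternions by Pauli matrices and checks the multiplication rule e_i e_j = -delta_ij + eps_ijk e_k, the commutation relations of T_j = sigma_j / 2, and the fact that the matrix of a quaternion has determinant equal to its squared norm (so that unit quaternions land in SU(2)). All helper names are hypothetical and matrices are plain nested lists.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def linear(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b of two 2x2 matrices."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]


def epsilon(i, j, k):
    """Totally antisymmetric symbol for 0-based indices i, j, k in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
E = [linear(-1j, s, 0, s) for s in SIGMA]   # e_j = -i sigma_j
T = [linear(0.5, s, 0, s) for s in SIGMA]   # T_j = sigma_j / 2


def quaternion_rule_holds():
    """Check e_i e_j = -delta_ij + eps_ijk e_k for all i, j."""
    for i in range(3):
        for j in range(3):
            expected = linear(-1 if i == j else 0, IDENTITY, 0, IDENTITY)
            for k in range(3):
                expected = linear(1, expected, epsilon(i, j, k), E[k])
            if matmul(E[i], E[j]) != expected:
                return False
    return True


def su2_algebra_holds():
    """Check [T_i, T_j] = i eps_ijk T_k for all i, j."""
    for i in range(3):
        for j in range(3):
            commutator = linear(1, matmul(T[i], T[j]), -1, matmul(T[j], T[i]))
            expected = [[0, 0], [0, 0]]
            for k in range(3):
                expected = linear(1, expected, 1j * epsilon(i, j, k), T[k])
            if commutator != expected:
                return False
    return True


def quaternion_matrix(x1, x2, x3, x4):
    """Matrix of the quaternion x1 e1 + x2 e2 + x3 e3 + x4."""
    m = linear(x4, IDENTITY, 0, IDENTITY)
    for coefficient, e in zip((x1, x2, x3), E):
        m = linear(1, m, coefficient, e)
    return m


def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


norm_squared = det(quaternion_matrix(1, 2, 3, 4))    # 1 + 4 + 9 + 16
unit_det = det(quaternion_matrix(0.5, 0.5, 0.5, 0.5))  # a quaternion of unit norm
```

Since the determinant equals the squared norm, multiplication by a unit quaternion is represented by a unitary matrix of determinant one, which is the isomorphism between Sp(1) and SU(2) in concrete form.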

2.2 The ADHM construction

The method of Atiyah, Drinfeld, Hitchin and Manin [9] [10] constructs instantons by embedding vector bundles inside larger vector bundles. In fact it gives all instantons on R4 for any compact classical Lie group. We are interested in constructing a self-dual connection in a vector bundle E of (quaternionic) dimension n over some four-dimensional space X (which we will take to be R4 or its compactification S4). We consider this vector bundle as embedded inside a larger trivial bundle X × H^N, with N = n + k. Choosing a quaternionic vector bundle corresponds to seeking to produce Sp(n) instantons with instanton number k. The description for other gauge groups is similar (and in the case at hand we can easily use the isomorphism between SU(2) and Sp(1) to produce the SU(2) instantons we are interested in).

The key idea is that choosing a particular gauge for the bundle E gives us a way of projecting from the trivial bundle X × H^N onto E. This projection then induces a connection on E, and with appropriate constraints this connection can be shown to be self-dual. We represent points x ∈ X as quaternions, and define a quaternionic operator D: H^k → H^N which depends linearly on x and which satisfies

D†D is real and invertible, (2.26)

with the solutions of the equation

D†ψ = 0 (2.27)



giving an orthonormal basis for E (that is to say for each x the solutions ψ(x) give a basis for the fibre at x). We assume that this equation does indeed have n linearly independent solutions, i.e. that D† has maximal rank k, so that its kernel has dimension N − k = n. If ψ1(x), . . . , ψn(x) are n linearly independent solutions normalised so that ψi†ψj = δij then we form these into the matrix ψ = (ψ1, . . . , ψn) where each column is a solution ψi. We have ψ†ψ = In, and the projection operator from the large trivial bundle X × H^N onto the space of solutions of D†ψ = 0 is

$$ P = \psi\psi^\dagger = 1 - D (D^\dagger D)^{-1} D^\dagger. $$ (2.28)

Now, the connection D = d − iA on E (not to be confused with the ADHM operator, which is also written D here) is defined by

$$ D f = P\, df, $$ (2.29)

where d is the trivial covariant derivative on X × H^N and f is a section of E. As we must have D†f = 0 we can put f = ψg for some g : X → H^n, then

$$ P\, d(\psi g) = \psi\psi^\dagger (\psi\, dg + d\psi\, g) = \psi\, (dg + \psi^\dagger d\psi\, g), $$ (2.30)

which says that our induced connection is

$$ A = i\psi^\dagger d\psi. $$ (2.31)

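This expression might appear to suggest that A is pure gauge; however note that ψ is an N × n matrix and so does not represent a gauge transformation (except in the degenerate case N = n which corresponds to the k = 0 instanton). Note also that A is indeed hermitian: A† = −i dψ†ψ = −i[d(ψ†ψ) − ψ†dψ] = iψ†dψ = A, having used ψ†ψ = I. The curvature is

$$ F = dA - iA \wedge A = i\, d\psi^\dagger \wedge d\psi + i\, \psi^\dagger d\psi \wedge \psi^\dagger d\psi. $$ (2.32)

Using dψ†ψ = −ψ†dψ we can write this as

$$ F = i\, d\psi^\dagger (1 - \psi\psi^\dagger) \wedge d\psi. $$ (2.33)

From (2.28), which gives 1 − ψψ† = D(D†D)−1D†, this becomes

$$ F = i\, d\psi^\dagger D (D^\dagger D)^{-1} D^\dagger \wedge d\psi = i\, \psi^\dagger dD\, (D^\dagger D)^{-1} \wedge dD^\dagger \psi, $$ (2.34)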

where we have used that D†ψ = 0. Now at the very start we required that D†D be an invertible matrix with only real entries, so (D†D)−1 is of course real too and hence commutes with dD. This means we have (2.35) so that F is proportional to the two-form dD ∧ dD†. Recalling that D is linear in the (quaternionic) variable x, with say D = a + bx for some matrices a, b, we have

$$ dD \wedge dD^\dagger = b\, dx \wedge d\bar{x}\, b^\dagger. $$ (2.36)

The two-form dx ∧ dx̄ is self-dual, and so the curvature we have constructed is too. In the case of a charge k Sp(1) ≅ SU(2) instanton over R4 we can take [10]



(2.37) where B is a k × k matrix of quaternions and Λ is a 1 × k matrix of quaternions. Requiring that D†D is real gives us the constraint Λ†Λ + BB† = real and forces B to be symmetric.

2.3 Example: the k = 1 SU(2) instanton

We have (2.38) where B and Λ are quaternions. We will find that B can be interpreted as the position of the instanton, while Λ will become a scaling factor. Let us choose Λ = ρ to be real. Writing ψ in terms of components v and χ we have the equation (2.39) Notice that we have some freedom in choosing v or χ. This corresponds to a choice of gauge for the final instanton solution. For now we take χ = 1, so that our solution is (2.40) The condition ψ†ψ = 1 gives the normalisation factor N to be (2.41) Now, writing

and using the fact that A = iψ†dψ is hermitian, we have (2.42)

we calculate

and from this we find (2.43)



This expression for A displays singular behaviour as x → B. However, if we let

(2.44)

then

(2.45) so we see that A is in fact pure gauge near B, and the singular behaviour can be removed by applying the gauge transformation g−1(x): (2.46)

The gauge transformation g−1(x) = (x − B)/|x − B| can also be applied to the solution ψ. We had (2.47) and multiplying this on the right by g−1(x) gives instead (2.48) which corresponds to choosing χ = x − B rather than χ = 1 in equation (2.39). Calculating A from equation (2.48) gives the potential (2.46) immediately.

We observe that the connection A in its original, singular form (2.43) in fact decays to zero as |x| → ∞. The gauge transformed connection (2.46) becomes, as we would expect, a gauge transformation of zero: (2.49) As we noted in section 2.1.2, this is not a global gauge transformation, having a singularity at the origin.

One way of verifying that this is an instanton with k = 1 is to note that the map g(x) is clearly a map from the three-sphere at infinity to SU(2) which winds once around SU(2). Alternatively we could calculate the curvature and use the formula (2.50) In the gauge of equation (2.46) the curvature is (2.51) Filling this in to equation (2.50), computing the wedge product



and also changing coordinates to y = x − B we have (2.52)

Here the 2π2 factor is the volume of the unit sphere S3 in R4, which we obtain from changing to polar coordinates. The integral over |y| is easily evaluated in terms of z = |y|2 + ρ2, and we indeed find that k = +1. Notice how by choosing an inequivalent orientation of our coordinates we would obtain an additional minus sign from the initial wedge products, and so turn our instanton into an anti-instanton. As a final comment, note that the quantities B and ρ really give us a five-parameter family of (gauge inequivalent) k = 1 instanton solutions. If we take into account global SU(2) gauge transformations, we see that in fact the space of all k = 1 SU(2) instanton solutions is eight dimensional (such a space is known as a moduli space).
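The final integration can be illustrated numerically. The sketch below assumes the standard k = 1 charge density 6ρ⁴/(π²(|y|² + ρ²)⁴), which is not copied from equation (2.52) and may differ from it in how the factors are grouped. It also uses the substitution |y| = ρ tan θ rather than the variable z = |y|² + ρ² mentioned above, simply because it maps the radial integral onto a finite interval. The function name `instanton_number` is illustrative.

```python
import math


def instanton_number(rho, steps=2000):
    """Integrate an assumed k = 1 charge density over R^4.

    Assumption (standard BPST form, not taken from the text):
        density(y) = 6 rho^4 / (pi^2 (|y|^2 + rho^2)^4).
    The angular integral contributes the volume 2 pi^2 of the unit S^3.
    With |y| = rho tan(theta) the radial integral of |y|^3 / (|y|^2 + rho^2)^4
    becomes the integral of sin^3(theta) cos^3(theta) / rho^4 over [0, pi/2].
    """
    def integrand(theta):
        return math.sin(theta) ** 3 * math.cos(theta) ** 3

    # Composite Simpson rule on [0, pi/2]; steps must be even.
    h = (math.pi / 2) / steps
    total = integrand(0.0) + integrand(math.pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    radial = total * h / 3 / rho ** 4

    # (volume of S^3) * (constant prefactor of the density) * (radial integral)
    return 2 * math.pi ** 2 * (6 * rho ** 4 / math.pi ** 2) * radial
```

Under these assumptions the result is 1 to within the quadrature error for any positive ρ, reflecting the fact that the scale ρ drops out of the charge.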

Chapter 3 Monopoles and the Nahm transform 3.1 Dimensional reduction of the self-duality equations

Consider the self-duality equations in component form:

(3.1)

We will now consider a number of interesting dimensional reductions of these equations. Firstly, suppose that the potential Aμ does not in fact depend on the spatial coordinates xμ at all. Then we obtain the system of commutators (3.2) If we combine these into a suggestively-named quaternion as

(3.3)

then (3.2) is equivalent to the condition that D†D be real, which was one of the constraints on the ADHM operator D necessary to produce self-dual fields. This is our first hint of a duality transformation between solutions to different versions of the self-duality equations. Now let us suppose invariance in the x4 direction. We let Φ ≡ A4 so that we obtain (3.4)



or more compactly

(3.5)

where the covariant derivative is DiΦ = ∂iΦ − i[Ai, Φ] and ⋆3 is the Hodge star in three dimensions. These are the Bogomolny equations, and a pair (Φ, A) that solves them is called a magnetic monopole. The scalar field Φ is known as the Higgs field. If we instead suppose invariance in the x1, x2, x3 directions and relabel x4 → s and Aμ → Tμ we obtain the ordinary differential equations (3.6) known as the Nahm equations. It turns out that solutions of the Nahm equations are linked to solutions of the Bogomolny equations by a transformation due to Nahm. Naturally we could also reduce the anti-self-duality equations and obtain equivalent systems of equations differing by a minus sign on the right-hand side.

3.2 Monopoles 3.2.1 Dirac monopoles

A monopole is a hypothetical particle which carries magnetic charge. Classically, it would be described by a Coulomb-like magnetic field: (3.7) where g is the monopole charge. An early analysis due to Dirac [17] revealed that the existence of a single monopole in the universe would account for the quantisation of electric charge in multiples of the electron charge. Monopoles such as that described in (3.7) are known as Dirac monopoles.

Topologically, a Dirac monopole is described by a principal U(1) bundle over R3\{0} (the fibre bundle description of the Dirac monopole was first given by Wu and Yang [18]). This bundle is non-trivial, and we need two coordinate patches to describe it. On each patch the magnetic field can be written as the curl of a vector potential A without any difficulty. The vector potential is in fact a U(1) connection. The expressions for A on the overlap of the two patches are related by a U(1) gauge transformation (we will use the Nahm transform to derive two such Dirac monopole potentials below). Physically, this stems from the fact that if we could everywhere write the magnetic field as B = ∇ × A then the total magnetic flux through any closed surface (using Gauss’ theorem) would vanish, as the divergence of any curl is zero. By allowing A to have singularities in some coordinate system we can obtain non-zero magnetic flux through closed surfaces.



3.2.2 Non-abelian monopoles

In the 1970s ’t Hooft [19] and Polyakov [20] showed that monopole solutions arose naturally in certain non-abelian field theories. The basic example is the Georgi-Glashow model, originally proposed as a possible description for electroweak interactions. The Lagrangian describing this model is (3.8) Here Fμν is the field tensor built from Aμ, and Aμ and Φ are su(2)-valued vector and scalar fields respectively (Φ is known as the Higgs field). The covariant derivative coupling the Higgs field to the vector potential is DμΦ = ∂μΦ − i[Aμ, Φ] and V(Φ) is some scalar potential (see footnote 1). If we denote the generators of su(2) by Ta then we can write Aμ = AμaTa and Φ = ΦaTa, where a = 1, 2, 3. The index a labels the internal space of the theory. Note that these satisfy [Ta, Tb] = iεabcTc, and we choose to normalise them according to tr(TaTb) = ½δab. The Lagrangian can then be written as (3.9) while the covariant derivative becomes

(3.10)

We also take

(3.11)

The energy of our system is

(3.12)

We identify the electric and magnetic fields in the field tensor as

(3.13)

We will choose to study static, time-independent systems, corresponding to Ei = A0 = 0. Eventually, we will also take the BPS limit, λ → 0. For the moment with non-zero λ the energy is (3.14) and we see that the vacuum (zero energy) state of the system is defined by the conditions (3.15)

Footnote 1: To be consistent with our notation for instantons we are not writing a gauge coupling e (e would be identified with electric charge). Restoring it would amount to making the corresponding replacements in the formulae above.



The Higgs field does not vanish in the vacuum but rather is restricted to a sphere in internal space of radius v. This is the crucial condition that gives rise to topologically non-trivial monopole configurations. In the vacuum state we can always take the Higgs field to be given by (0, 0, v), using a global SU(2) gauge transformation if necessary. The resulting configuration is invariant under rotations about the a = 3 axis in internal space, but not under rotations about the a = 1 or a = 2 axes. Thus the full SU(2) symmetry of the original system has been spontaneously broken to a U(1) symmetry by the Higgs field. This U(1) subgroup is generated by Φ and gives rise to a theory of electromagnetism embedded in the larger SU(2) theory.

Let us suppose now we have let λ → 0 but kept the vacuum constraint ΦaΦa = v2 on Φ. We would like our system to have finite energy, thus for large r our fields should approach the vacuum configuration. Restricting to the sphere at infinity we have a map from this S2 to the sphere in internal space described by ΦaΦa = v2. A map from S2 into S2 is characterised by an integer n which tells us how many times the first S2 is “wrapped” around the second. Recall that for instantons we had a similar situation, with a map between two three-spheres characterised by the integer k which we called the instanton number or charge. Here we will find that the integer n is related to the magnetic charge of the monopole.

In the case n = 0 the map from the physical S2 into the internal S2 is trivial, and we can always find a global gauge transformation such that Φa = (0, 0, v). The case n = 1 corresponds to the basic ’t Hooft-Polyakov monopole. As a simple example of an n = 1 map we suppose the “hedgehog” configuration for large r (3.16) Let us note that for finite energy we must have the covariant derivative DiΦ go to zero faster than r−3/2 (from simple dimensional analysis of the energy integral). It is easiest to require that the ∂iΦa term cancel the gauge field term in the covariant derivative (3.10). Substituting in and computing tells us that our field should behave as (3.17) as r → ∞. We can in turn insert this into the definition (3.13) of the magnetic field to find that (3.18)

This magnetic field does not quite correspond to a physically observable field. As was argued by ’t Hooft we need to define a gauge invariant electromagnetic field tensor which reduces to the usual U(1) case of Maxwell electromagnetism when Φa = (0, 0, v). This is accomplished by the choice (3.19)

which yields a magnetic field

(3.20)



corresponding to a Coulomb magnetic field located at the origin, i.e. to a magnetic monopole. Let us now connect this with the dimensional reduction of the Euclidean self-duality equations. The energy for a static system in the BPS limit can be written (3.21)

Hence we obtain the following bound on the energy: (3.22) with equality when (3.23)

This equation is known as the Bogomolny equation, and the bound (3.22) is known as a Bogomolny bound. In particular we see that the dimensional reduction (3.4) corresponds exactly to (3.23). Finally, notice that the integral appearing in (3.22) can be rewritten as (3.24) Now, DiBi = 0 by the Bianchi identity, while tr Di(BiΦ) = ∂i tr(BiΦ) (the commutator term can be shown to vanish by computing [Ai, BiΦ] and then taking the trace). Hence we find (3.25) having used Gauss’ theorem to convert the integral over the whole space into an integral over a boundary sphere of large radius R. In the n = 1 case, using equation (3.20) we have that (3.26) for large r = R, with ni an outward pointing unit vector. Inserting this gives an integral over the surface of the sphere; hence we find (3.27)

Hence a solution of the Bogomolny equation Bi = DiΦ with n = 1 corresponds to a magnetic monopole of unit charge with energy equal to the total magnetic charge. Note that in the case n = −1 the sign of Φ is changed, so that the Bogomolny equation in the form Bi = −DiΦ is obeyed. We then decompose the energy as in (3.21) but with the minus and plus signs interchanged, to find that the energy is again bounded below by a positive quantity (as the magnetic flux will be negative). In fact, the conclusion for general n is that the energy satisfies E ≥ 4π|n| with equality when the Bogomolny equation in the form Bi = sign(n)DiΦ is satisfied.



3.3 The Nahm transform

The Nahm transform, which was in fact modelled on the ADHM construction, was introduced by Werner Nahm in the 1980s as a way of producing BPS monopole potentials [11]. It replaces the matrix linear operator D of the ADHM construction with a differential operator, and involves first solving a system of non-linear matrix ordinary differential equations (the Nahm equations).

Consider a quadruplet of k × k hermitian matrices Tμ(s) depending on a real parameter s ∈ I, with I some interval on the real line. Different choices of I give different monopole solutions, with the size of the Nahm data Tμ determining the charge of the monopole. Note that the interval I can be composed of subintervals, with Nahm data of different dimension on each subinterval. At the endpoints of each subinterval we have to define appropriate matching conditions for the Nahm data from the two consecutive subintervals.

The matrices T1, T2, T3 can be viewed as endomorphisms of a rank k vector bundle E over I (i.e. for each s the matrices give linear maps from the fibre over s to itself). The remaining matrix T0 acts as a connection on E, and gauge transformations g act on the Nahm data accordingly. We now introduce a space S ≅ C2 of two-component spinors which provides us with the representation ej = −iσj in terms of the Pauli matrices, and define the Weyl operator D which acts on L2 sections of the tensor product bundle E ⊗ S: (3.28) Requiring that D†D be real is equivalent to the Nahm equations:

(3.29)

Note that we can set T0 = 0 by choosing a gauge transformation g such that (3.30)

We denote an L2 section of E ⊗ S by ψ. The inner product on the space of such sections is (3.31)

The adjoint operator of D with respect to this inner product is (3.32) having integrated by parts and used hermiticity of the Tμ. To construct monopoles we first twist D† with respect to coordinates tj on R3 and a dummy coordinate t0. The twisted operator is (3.33)



We can view −itμ as a flat connection on a vector bundle e over I. The twisted operator then acts on L2 sections of E ⊗ e ⊗ S, which are again denoted by ψ. The twisted operator still satisfies the requirement that D†D be real and invertible. In general, we gauge the combination T0 − t0 to zero, and assume we have done so for the rest of this section. We then solve the Weyl equation (3.34) and normalise the solution ψ. This equation defines a vector bundle E’ over R3, with the solutions of the Weyl equation for each point of R3 providing a basis for the fibre over that point.

To define a connection on this new vector bundle, we consider E’ to be embedded inside a larger trivial bundle. Then, just as we did in the ADHM construction, we obtain a connection on E’ by projecting the trivial covariant derivative on the larger trivial bundle onto E’. The projection operator in this case is again the projection onto the kernel of D†. The trivial covariant derivative we need is provided by ∂j − is, with s ∈ I. (The replacement of the t0 derivative by −is is essentially a result of Fourier transforming the dummy coordinate t0. Equivalently, if we had kept t0 in D† then ψ would be proportional to e−ist0. Differentiation with respect to t0 is then equivalent to multiplication by −is.) The new monopole connection is then defined by this projection, and the potentials (Φ, A) which solve the Bogomolny equations are given by the formulae: (3.35) (3.36)

The proof that the Nahm transform indeed produces solutions of the Bogomolny equations relies, as did the demonstration that the ADHM construction produces self-dual fields, on the fact that the projection operator P(s, s’) onto the kernel of D† can also be written in terms of G = (D†D)−1, the Green’s function for D†D. We refer the reader to Nahm’s outline in [13], or the monopole review [36] for a more explicit demonstration along these lines.

We can also define an inverse transform, that takes a solution to the Bogomolny equation and uses it to form the matrices Tμ which solve the Nahm equations. It was shown by Hitchin [23] that, for the group SU(2), solutions to the Bogomolny and Nahm equations are in fact in one-to-one correspondence via the Nahm transform. This result was later extended to monopoles with G = SU(N), SO(N) and Sp(N) by Hurtubise and Murray in [25].

3.4 Examples of the Nahm transform

We will use the Nahm transform to generate two basic monopole solutions: the Dirac monopole, and the ’t Hooft-Polyakov monopole in the BPS limit. As well as illustrating the power of the transform, the computations involved will be needed again when we construct an instanton connection on the Taub-NUT space in section 7.2. We will construct the charge one solutions, so that our Nahm data are 1 × 1 matrices defined on an interval I. We gauge T0 to zero, and the trivial Nahm equations dTj/ds = 0, j = 1, 2, 3, are solved



by any constant triplet of real numbers (T1, T2, T3). The Weyl operator we need is (3.37) We take the unit quaternions to be given in terms of sigma matrices: ej = −iσj. Introducing the slash notation for a three-vector α = (α1, α2, α3), and defining we can write equation (3.34) as (3.38) which has solutions

(3.39)

with some constant (s-independent) spinor and N a constant to be chosen to ensure the solution is normalised. We note here the useful result that for any matrix B that squares to the identity and a scalar x,

$$ \exp(Bx) = \cosh x + B \sinh x, $$ (3.40)

which can be verified directly from the power series expansion of exp(Bx). In particular we have so that (3.41)
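The identity for the exponential of a matrix that squares to the identity can also be checked numerically from the power series. The sketch below is only an illustration: the matrix `N_SLASH` is a hypothetical example, built as n·σ for the unit vector n = (2/7, 3/7, 6/7), and the function names are not taken from the text.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def exp_series(b, x, terms=40):
    """exp(b x) from the truncated power series sum_n (b x)^n / n!."""
    n = len(b)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term, starting at the identity
    for order in range(1, terms):
        # term_n = term_(n-1) * b * x / n
        term = [[entry * x / order for entry in row] for row in matmul(term, b)]
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    return result


def cosh_sinh_form(b, x):
    """Right-hand side cosh(x) * identity + sinh(x) * b."""
    n = len(b)
    return [
        [(math.cosh(x) if i == j else 0.0) + math.sinh(x) * b[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def max_difference(b, x):
    """Largest entrywise difference between the series and the closed form."""
    lhs, rhs = exp_series(b, x), cosh_sinh_form(b, x)
    size = len(b)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(size) for j in range(size))


# Hypothetical example: n.sigma for the unit vector n = (2/7, 3/7, 6/7),
# which squares to the identity because |n| = 1.
n1, n2, n3 = 2 / 7, 3 / 7, 6 / 7
N_SLASH = [[n3, n1 - 1j * n2], [n1 + 1j * n2, -n3]]
```

For this example `max_difference(N_SLASH, 0.8)` is at the level of floating-point rounding. The same check fails for a matrix that does not square to the identity, which is the only property the identity relies on.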

3.4.1 The Dirac monopole

The Dirac monopole [17] is the simplest example of a monopole. The Nahm transform for this monopole was first given by Nahm in [12]. We take the interval I to be [λ, ∞) for some positive real constant λ. To normalise the solution (3.39) we have to compute the integral (3.42) This is only normalisable if is an eigenvector of with eigenvalue 1. This means we only have one normalisable solution, and if is such a vector then (3.43) One choice of eigenvector is (3.44) which satisfies hence N = e−λz. The Higgs field, which here corresponds exactly to a classical magnetic potential, is given by



(3.45) The vector potential is given by (3.46) Now, (3.47) as

is also a left eigenvector of

with eigenvalue −1. Hence A reduces to (3.48)

With

as in (3.44) we obtain

(3.49)

It is straightforward to check that this A and Φ satisfy the abelian form of the Bogomolny equation, which just expresses the equivalence of the two descriptions of the magnetic field B: as the curl of a vector potential (as it is in normal electromagnetism) and as the gradient of a scalar potential (which makes it into a magnetic monopole). The Higgs field is singular at the origin, corresponding to a point-like source of magnetic charge. There is a further singularity in our description of the magnetic field; this occurs along the negative z3 axis (this singular line is known as a Dirac string). The topological reason underlying this singularity is that the U(1) bundle describing the Dirac monopole is non-trivial, and so we cannot find a single coordinate system to describe it. Our choice of spinor corresponds to a choice of coordinate system that is good everywhere except along the negative z3 axis. Another choice of vector could be (3.50) which gives a vector potential with a singularity along the positive z3 axis: (3.51) (the subscript S denotes that this expression is valid in the southern hemisphere, away from the positive z3 axis). This is related to the previous potential by a U(1) gauge transformation.
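The abelian Bogomolny equation for a Dirac monopole can be checked with finite differences. The sketch below does not use the expressions (3.45) and (3.49), which are not reproduced here. Instead it assumes a Higgs field Φ = λ + 1/(2|z|) and a potential A = (z2 dz1 − z1 dz2)/(2|z|(|z| + z3)), chosen so that Φ is singular at the origin and A is singular along the negative z3 axis, as described above. Overall signs and normalisation are assumptions and may differ from the conventions of the text; the value of `LAMBDA` is arbitrary.

```python
import math

LAMBDA = 0.5  # hypothetical value for the endpoint lambda of the interval


def higgs(z1, z2, z3):
    """Assumed Higgs field: Phi = lambda + 1 / (2 |z|)."""
    return LAMBDA + 1.0 / (2.0 * math.sqrt(z1 * z1 + z2 * z2 + z3 * z3))


def potential(z1, z2, z3):
    """Assumed potential, singular on the negative z3 axis where |z| + z3 = 0."""
    z = math.sqrt(z1 * z1 + z2 * z2 + z3 * z3)
    f = 1.0 / (2.0 * z * (z + z3))
    return (z2 * f, -z1 * f, 0.0)


def partial(func, point, axis, h=1e-5):
    """Central finite difference of func along one coordinate axis."""
    up = list(point)
    down = list(point)
    up[axis] += h
    down[axis] -= h
    return (func(*up) - func(*down)) / (2 * h)


def grad_higgs(point):
    """Numerical gradient of the Higgs field."""
    return tuple(partial(higgs, point, axis) for axis in range(3))


def curl_potential(point):
    """Numerical curl of the vector potential."""
    def d(component, axis):
        return partial(lambda *p: potential(*p)[component], point, axis)

    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def bogomolny_residual(point):
    """Largest component of curl A - grad Phi at the given point."""
    return max(
        abs(c - g) for c, g in zip(curl_potential(point), grad_higgs(point))
    )
```

Away from the origin and the negative z3 axis, `bogomolny_residual` is of the order of the finite-difference error, so with these assumed forms the curl of A and the gradient of Φ describe the same magnetic field. The constant λ drops out of the check, since only the gradient of Φ enters.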



3.4.2 The ’t Hooft-Polyakov monopole

The prototype charge 1 SU(2) monopole is known as the ’t Hooft-Polyakov monopole. We will use the Nahm transform on the interval [−λ,λ] to obtain this monopole in the BPS limit, as described by Nahm in [11]. The monopole itself was first found by Prasad and Sommerfield [21] (using trial functions and a bit of “shimmying”) and independently by Bogomolny [22]. In this case the Weyl equation has two linearly independent normalisable solutions, where ψ1 and ψ2 are linearly independent constant vectors. We choose to take ψ1 = (1, 0)t and ψ2 = (0, 1)t – some other choice amounts to a gauge transformation of the resulting monopole fields. We therefore write the solution as the two-by-two matrix We fix the normalisation factor using From this we find that

We can now determine the Higgs field

(3.52) (3.53)

hence

(3.54)

For the vector potential we have

(3.55)

Now,

(3.56) There is only one term that will not vanish when we subtract off the Hermitian conjugate, which is the term, so we find (3.57)

(3.58) (3.59)



Chapter 4 Moment maps and hyperkähler quotients 4.1 Moment maps

Our discussion of moment maps and hyperkähler manifolds borrows from the lecture notes by Hitchin [24]. We also refer the reader to [37] or [38] for some of the elementary facts about manifolds and Lie derivatives we summarise below.

A symplectic form on a manifold M (of even dimension) is a non-degenerate two-form ω which is closed, dω = 0. Consider a manifold with such a form, and let G be some Lie group acting on M, with Lie algebra g. An element ξ of g defines a vector field Xξ on M given by differentiating the action of exp(itξ) ∈ G at the identity: (4.1) Note we take ξ to be hermitian, and the action of h ∈ G on a function f defined on M is denoted by h(f). For instance, G could act by conjugation: h(f) = h−1fh.

Any vector field X defines a flow on M given by paths x(t) through M along which X is always tangent. The flow through a point p ∈ M is defined by the equation

$$ \frac{dx^i}{dt} = X^i(x(t)), \qquad x(0) = p, $$ (4.2)

where we have local coordinates x = (x1,..., xn) on M. We define φt(p) to be the point obtained by moving for a parameter distance t along the flow through p. If T(x) is some multilinear object on M then the change in T(x) along the flow generated by X is given by the Lie derivative of T(x). The Lie derivative at a point p along the flow of X is defined by (4.3) where we have used the push-forward. For a function f, differentiating along the flow of X, we have

$$ \mathcal{L}_X f = X^i \partial_i f = X f. $$ (4.4)

For more complicated objects, we can use the fact that the Lie derivative satisfies a version of the product rule. As an example, if Y is some vector field and f is a function this tells us that ℒX(Yf) = (ℒXY)f + Y(ℒXf); hence ℒXY = [X, Y]. Proceeding in this way we can compute the Lie derivative of one- and two-forms. We will only need the result for a two-form ω, which is

$$ \mathcal{L}_X \omega = d(\iota_X \omega) + \iota_X d\omega. $$ (4.5)

Here ιX denotes contraction with respect to X, i.e. ιX ω is the one-form defined by (ιX ω)(Y) = ω(X, Y) for any vector field Y, and ιX dω(Y, Z) = dω(X, Y, Z), for any vector fields Y, Z.



Now consider the vector fields Xξ arising from the action of G, and suppose that this action leaves invariant the symplectic form ω. This means that ω is constant along the flow of Xξ, so that its Lie derivative along Xξ vanishes:

$$ \mathcal{L}_{X_\xi} \omega = d(\iota_{X_\xi} \omega) = 0, $$ (4.6)

using dω = 0. Hence we find that the one-form ιXξ ω is closed. If on our manifold all closed forms are exact, then

$$ \iota_{X_\xi} \omega = dH_\xi $$ (4.7)

for some function Hξ on M. We now define the moment map μ as a function from the manifold M to the dual g* of the Lie algebra, defined by

$$ (\mu(x), \xi) = H_\xi(x). $$ (4.8)

Here the bracket (, ) denotes the natural duality bracket between g and g*. Thus the moment map takes the element ξ and associates to it a function Hξ. We can view Hξ as a Hamiltonian function, with Xξ its associated Hamiltonian vector field. In particular it follows that Hξ is constant along the flow of Xξ. Intuitively, the moment map can be thought of as a generalisation of the notion of conserved quantities from classical mechanics to the action of Lie groups on manifolds.

The unavoidable first example when discussing moment maps is to consider a classical system with n coordinates qi and conjugate momenta pi. The symplectic two-form is the canonical one built from dqi and dpi, and we let G act by translations of the qi. This action induces a vector field defined by (4.9) (using the chain rule), which fixes its components. Then we have (4.10) This gives the function Hξ, so we see that the moment map gives the momentum: (4.11)

If we restrict to a constant moment map (i.e. pi = ci = constant) then we get rid of n coordinates in our 2n-dimensional space. The remaining n coordinates are the qi, but we can use the translational action of G, which we know leaves the momenta invariant, to set qi = bi = constant for all i. This eliminates the n qi coordinates and leaves us with a 0-dimensional manifold. This process is essentially the same as solving the problem of a free particle in n-dimensional space in classical mechanics, expressing the fact that once we know the values of the momenta components pi the motion of the particle is entirely determined. This notion of reduction, which can be formalised as the notion of a symplectic quotient, has an important generalisation to a particular class of 4n-dimensional manifolds known as hyperkähler manifolds, which we will now go on to discuss.



4.2 Hyperkähler reduction

A complex structure J on a 2n-dimensional (Riemannian) manifold M with metric g is an automorphism of the tangent bundle which squares to minus the identity, and satisfies a technical integrability condition involving the vanishing of a tensor known as the Nijenhuis tensor (this condition is necessary so that we can always find local holomorphic coordinates at any point of M). If the complex structure leaves the metric invariant, g(JX, JY) = g(X, Y) for all vector fields X, Y, and is also constant with respect to the Levi-Civita connection on the tangent bundle, then we say that the manifold is Kähler. A 4n-dimensional manifold M is said to be hyperkähler if it admits three such complex structures e1, e2, e3 which satisfy the usual quaternionic identities. Many of the manifolds we encounter in the study of instantons have this property; after all, instantons are intimately connected with four-dimensional space.

On such a manifold we can define three closed and non-degenerate two-forms by

$$ \omega_j(X, Y) = g(e_j X, Y), \qquad j = 1, 2, 3. $$ (4.12)

Each of these can be viewed as a symplectic form on the manifold. Thus, if we have an action of a Lie group G on M then we can use these symplectic forms to obtain three moment maps. These moment maps can be used to construct a new hyperkähler manifold, using a process known as hyperkähler reduction or a hyperkähler quotient. We first restrict to constant values of the moment maps: (4.13) This condition gives 3 dim G restrictions on the original 4n coordinates and restricts us to a level set of the moment maps. We then quotient by the action of the group G. The resulting space (4.14) is again hyperkähler and of dimension 4n − 4 dim G.

In practice, the process may be summarised as follows: given the manifold M, symplectic forms ωj, and action of the group G we work out the vector fields induced by this group action and contract them with the symplectic forms to obtain the three moment maps. Setting these moment maps equal to constant values (most frequently zero) we obtain a series of relationships between our original coordinates. We should also define new coordinates which are compatible with the moment maps and invariant under the action of G. Substituting the moment map relationships and the new coordinates into our metric and simplifying we should obtain a metric on the level set written solely in terms of invariants, and some left-over pieces on which G acts. Finally, quotienting by the group G means neglecting the latter.

A number of examples of hyperkähler quotients, relevant to the theory of monopoles, are worked out in the paper [34]. Of most interest to us, the metric on the Taub-NUT space is obtained as a hyperkähler quotient of H × H by the gauge group R. We will perform a similar quotient in detail in the next chapter.
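The simplest hyperkähler manifold is flat R4 regarded as the quaternions H, and the defining properties of the three complex structures can be verified directly in that case. The sketch below assumes the complex structures act by left multiplication by the unit imaginary quaternions and uses the ordering (real part, e1, e2, e3) for coordinates, which differs from the ordering (x1, x2, x3, x4) used earlier. All names are illustrative.

```python
def quat_mul(p, q):
    """Hamilton product of quaternions stored as (real, e1, e2, e3) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
UNITS = {1: (0, 1, 0, 0), 2: (0, 0, 1, 0), 3: (0, 0, 0, 1)}
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
NEG_IDENTITY = [[-entry for entry in row] for row in IDENTITY]


def left_mult_matrix(unit):
    """4x4 real matrix of x -> unit * x; column m is the image of BASIS[m]."""
    columns = [quat_mul(unit, b) for b in BASIS]
    return [[columns[col][row] for col in range(4)] for row in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(a):
    """Transpose of a 4x4 matrix."""
    return [[a[j][i] for j in range(4)] for i in range(4)]


# Assumed complex structures: left multiplication by e1, e2, e3.
J = {j: left_mult_matrix(UNITS[j]) for j in (1, 2, 3)}


def is_quaternionic_triple():
    """Check J_j^2 = -1, J_1 J_2 = J_3, and invariance of the flat metric."""
    squares = all(matmul(J[j], J[j]) == NEG_IDENTITY for j in (1, 2, 3))
    product = matmul(J[1], J[2]) == J[3]
    isometric = all(matmul(transpose(J[j]), J[j]) == IDENTITY for j in (1, 2, 3))
    return squares and product and isometric
```

Here `is_quaternionic_triple()` returns `True`. Because the matrices are constant, they are trivially covariantly constant for the flat metric, so flat R4 satisfies every condition in the definition above. The integer arithmetic makes the comparison exact.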



4.3 ADHM construction in terms of moment maps

We can reformulate the ADHM construction of instantons in terms of maps between two complex vector spaces. Our notation essentially follows that in [15]. Consider the diagram of figure 4.1 showing two complex vector spaces Ck and Cn, with maps I: Cn → Ck and J: Ck → Cn, as well as two maps B1, B2: Ck → Ck. This setup corresponds to an SU(n) instanton with instanton number k. This diagram displays a type of directed graph known as a quiver. In fact, because we are associating vector spaces and linear maps to the dots and arrows shown in the diagram, we more properly have a representation of a quiver.

Fig. 4.1. ADHM quiver diagram.

We consider first the space of pairs (I, J) of linear maps between Cn and Ck. The metric on this space is (4.15) and the three complex structures arise from forming (4.16) which is a map from a space of two-component spinors. We have an action of the complex structures ej = −iσj on Q. In terms of I and J these are (4.17) Using this we have

(4.18)

which implies

(4.19)

Similarly,

(4.20) (4.21)

Note that in the one-dimensional case when I = z1 and J = z2 are just complex numbers we recover the symplectic forms on C2 found in the appendix. If we combine these into



(4.22) and compare with (4.23) we see that

(4.24)

where we define Vec Note that the trace here does not denote the trace in the space S, but in the space Ck (i.e. the trace applies only to the entries of the matrix dQ ∧ dQ†, which are k × k matrices, and not to dQ ∧ dQ† as a two-by-two matrix over S itself). Now we have two gauge group actions: that of U(n) on Cn and that of U(k) on Ck. We will only need the U(k) action, however, and so concentrate entirely on it. This action is (4.25) where we understand Writing g = exp(itα) with α hermitian we find that the vector field induced by α is (4.26) and the contraction of this with

is

(4.27)

so

(4.28)

In terms of I and J, this works out to give

(4.29) It is convenient to view these as one complex and one real moment map: (4.30) We can similarly work out the moment map for the action of U(k) on the Bi: (4.31) In this case for g = exp(itα) we find that



(4.32) and using the symplectic forms

(4.33) (4.34) (4.35)

the moment maps are

(4.36)

We can write these more compactly by introducing (4.37)

then

(4.38) Now define the linear operator (4.39) where b1 and b2 are complex coordinates on R4. Strictly speaking when we write bi we mean biIk. The adjoint operator is (4.40) If we compute D†D we find (4.41)

Setting the combined moment maps for B1, B2, I and J equal to zero gives (4.42) (4.43) The latter implies that the off-diagonal elements of D†D vanish, and the former implies that the diagonal elements are equal, so that we find D†D is proportional to the two-by-two identity, and so quaternionic real. It is also invertible, as needed for the ADHM transform. Note that if we write (4.44) then

(4.45)

and the condition Vec D†D = 0 is equivalent to (4.46) We will meet all this again in section 6 when we discuss instantons on the Taub-NUT. If we specialise to the case of the k = 1 SU(2) instanton on R4 then we can combine the complex coordinates b−, b+ into a quaternion As k = 1 the maps B01 and B10 will also in this case just be complex numbers, and so we can define another quaternion As n = 2, I and J also combine to give another quaternion (4.47) The operator D† then has the form (4.48)

exactly as in section 2.3.
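The statement that a matrix proportional to the two-by-two identity is "quaternionic real" can be illustrated numerically. What follows is a minimal sketch in plain Python, assuming the representation ej = −iσj used in this section and assuming that Vec extracts the traceless part of a 2 × 2 matrix. The helper names (`quaternion`, `vec`, `close`) are hypothetical and are not taken from [15].

```python
# Pauli matrices as nested tuples of complex numbers.
SIGMA = {
    1: ((0, 1), (1, 0)),
    2: ((0, -1j), (1j, 0)),
    3: ((1, 0), (0, -1)),
}
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Multiply a 2x2 matrix by a scalar."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def add(*matrices):
    """Entrywise sum of 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][j] for m in matrices) for j in range(2)) for i in range(2)
    )


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


# Unit imaginary quaternions in the representation e_j = -i sigma_j.
E = {j: scale(-1j, SIGMA[j]) for j in (1, 2, 3)}


def quaternion(w, x, y, z):
    """2x2 matrix for w + x e1 + y e2 + z e3 (hypothetical helper)."""
    return add(scale(w, IDENTITY), scale(x, E[1]), scale(y, E[2]), scale(z, E[3]))


def vec(m):
    """Traceless part of a 2x2 matrix (assumed meaning of Vec)."""
    half_trace = (m[0][0] + m[1][1]) / 2
    return add(m, scale(-half_trace, IDENTITY))


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


q = quaternion(0.3, -1.2, 0.7, 2.0)
qdq = matmul(dagger(q), q)                      # analogue of D†D for one quaternion
norm_sq = 0.3**2 + 1.2**2 + 0.7**2 + 2.0**2     # |q|^2

print(close(matmul(E[1], E[2]), E[3]))          # quaternion relation e1 e2 = e3
print(close(qdq, scale(norm_sq, IDENTITY)))     # q†q is |q|^2 times the identity
print(close(vec(qdq), scale(0, IDENTITY)))      # so its Vec part vanishes
```

For a single quaternion the vanishing of the traceless part is automatic. For the operator D the same property has to be imposed, which is what the moment map conditions do.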

4.4 Nahm transform in terms of moment maps

We consider k × k Nahm data (T0(s), T1(s), T2(s), T3(s)) defined in a bundle over an interval We will later represent this diagrammatically by a wavy line. Such a quadruplet can be written as a quaternion T0 + T1e1 + T2e2 + T3e3 and so the space of all Nahm data over I has a natural hyperkähler structure. The metric on this space is given by (4.49) The action of the three unit imaginary quaternions (4.50) gives us the three symplectic forms (4.51)



We have a U(k) gauge action given by The infinitesimal form of this action is found as usual by writing g = exp(iαt) for α hermitian and differentiating at t = 0. We find that α induces the following vector field: (4.52) Contracting this with ω1 gives (4.53) We then make use of the cyclic property of the trace as well as integration by parts to find (4.54) while cyclic permutations give us (4.55) (4.56)

We are being a little sneaky with the boundary terms here. If our Nahm data are defined on an interval [a, b] we consider the derivative of the Nahm data to actually give us the values of the data at the endpoints, i.e. (4.57) (the choice of minus sign is dictated by the integration by parts). This form makes sense when we consider Nahm data with discontinuities at a point λ – then the “derivative” of T at λ gives T(λ+) − T(λ−) (where T(λ+) denotes the limit of T as s approaches λ from the right, and T(λ−) denotes the same from the left). We now combine the moment maps into a single entity (4.58) They can also be given in terms of the Weyl operator D

(4.59)



as

(4.60)

Chapter 5 Taub-NUT space

5.1 Properties of Taub-NUT

The Taub-NUT space [32] [33] is a four-dimensional, non-compact, asymptotically locally flat hyperkähler manifold, which is a vacuum solution to Einstein’s equations of general relativity. In terms of a three-vector coordinate and a circular coordinate τ with period 4π its metric can be written as (5.1) where l is a positive real constant, and

(5.2)
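For orientation, one form of the single-centre metric that is consistent with the description given here, and with the l = 0 flat-space limit worked out in section 5.2, is sketched below. The overall factor of 1/4, the symbol V for the harmonic function, and the sign convention in the relation between ω and V are assumptions of this sketch rather than a transcription of (5.1) and (5.2).

```latex
ds^2 = \frac{1}{4}\left[ V \, d\vec{x}^{\,2} + \frac{(d\tau + \omega)^2}{V} \right],
\qquad
V = l + \frac{1}{|\vec{x}|},
\qquad
d\omega = *_3 \, dV ,
\qquad
\tau \sim \tau + 4\pi .
```

Under these assumptions V tends to l far from the origin, so the circle parametrised by τ approaches a fixed circumference 2π/√l.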

Note that there is a coordinate singularity at the origin. If we exclude this point then we can view Taub-NUT as being a fibration of a circle S1 over R3\{0}. At large distances from the origin the metric becomes (5.3) describing the trivial product R3 × S1. We see that the constant l determines the asymptotic size of this S1.

The Taub-NUT space is the simplest example of an infinite family of four-dimensional spaces, known as the multi-Taub-NUT spaces. The metric for the k-centered Taub-NUT space is (5.4)

where

(5.5)

and the one-form ω satisfies

(5.6)

which is the invariant way of writing

5.2 Comparison with flat space

We take as coordinates on R4 a quadruplet (w, x, y, z) of real numbers. The standard flat metric is

ds² = dw² + dx² + dy² + dz². (5.7)

It is instructive to rewrite this metric in Taub-NUT-like coordinates. We first identify R4 with the space of quaternions H. For any point (w, x, y, z) ∈ R4 we have a quaternion q = w + xe1 + ye2 + ze3 and the metric can be written compactly as ds² = dq dq†. We decompose the quaternion q into the product of a purely imaginary quaternion a and a unit quaternion exp(e3τ/2), where τ is an angular coordinate with period 4π: (5.8) From this we have

and so (5.9)

Introduce a three-vector x whose components (x1, x2, x3) are defined by the relation x1e1 + x2e2 + x3e3 = qe3q† (5.10)

i.e.

corresponds to the imaginary quaternion

Clearly

Using this and eliminating da2 via (5.11) the metric becomes (5.12) The next step is to complete the square in dτ, giving (5.13)

and using equation (5.11) again we find

(5.14) Now by definition which gives expressions for the components of x in terms of those of a = a1e1 + a2e2 + a3e3: (5.15)

Using the additional relation

we can invert these to find (5.16)



We can then explicitly calculate da e3 a − a e3 da = 2(a2 da1 − a1 da2) in terms of the x coordinates; the result is that we find the metric to be (5.17)

where

(5.18) is such that . Comparing (5.17) with the Taub-NUT metric (5.1) we see that the metric on R4 we have found is given by the Taub-NUT metric with l = 0. Note that the expression for ω involves a choice of direction – here the negative x3 axis – along which ω is singular. If instead we had chosen the positive x3 axis, for instance, ω would be given by (5.19) and the expression for the metric is the same.
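The algebra behind this change of coordinates can be spot-checked numerically. The sketch below is illustrative only: it stores quaternions as 4-tuples (w, x, y, z), a representation chosen here for convenience, and the helper names (`qmul`, `qconj`, `three_vector`, `exp_e3`) are hypothetical. It checks that x = q e3 q† is purely imaginary with |x| = |q|², that x is unchanged when q is multiplied on the right by exp(e3 φ), which is the circle direction τ, and that the identity da e3 a − a e3 da = 2(a2 da1 − a1 da2) holds for an imaginary a, with da treated as an independent imaginary quaternion.

```python
import math
import random


def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def qconj(q):
    """Quaternionic conjugate, playing the role of the dagger."""
    return (q[0], -q[1], -q[2], -q[3])


def qnorm_sq(q):
    """Squared norm of a quaternion."""
    return sum(c * c for c in q)


E3 = (0.0, 0.0, 0.0, 1.0)


def exp_e3(angle):
    """exp(e3 * angle) = cos(angle) + e3 sin(angle)."""
    return (math.cos(angle), 0.0, 0.0, math.sin(angle))


def three_vector(q):
    """x = q e3 q†, returned as a quaternion whose real part should vanish."""
    return qmul(qmul(q, E3), qconj(q))


rng = random.Random(0)

# A purely imaginary quaternion a and an angle tau in [0, 4*pi).
a = (0.0,) + tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
tau = rng.uniform(0.0, 4.0 * math.pi)
q = qmul(a, exp_e3(tau / 2.0))

x = three_vector(q)
x_rotated = three_vector(qmul(q, exp_e3(0.37)))  # move along the circle fibre

# Check of the identity, with da an arbitrary imaginary quaternion.
da = (0.0,) + tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
lhs = tuple(u - v for u, v in zip(qmul(qmul(da, E3), a), qmul(qmul(a, E3), da)))
rhs = (2.0 * (a[2] * da[1] - a[1] * da[2]), 0.0, 0.0, 0.0)

print(abs(x[0]) < 1e-12)                                         # x is imaginary
print(abs(math.sqrt(qnorm_sq(x)) - qnorm_sq(q)) < 1e-12)         # |x| = |q|^2
print(all(abs(u - v) < 1e-12 for u, v in zip(x, x_rotated)))     # fibre invariance
print(all(abs(u - v) < 1e-12 for u, v in zip(lhs, rhs)))         # the identity
```

The third check is the reason x gives coordinates on the base R3 while τ parametrises the circle.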

5.3 Taub-NUT as a hyperkähler quotient

5.3.1 The data

Our basic setup is shown in figure 5.1, and is taken from [15]. The diagram of figure 5.1 is an example of a bow diagram – we refer to it as the small bow. In the construction of instantons on Taub-NUT this bow is used to generate the Taub-NUT coordinates we will use to describe the instanton. There is another, “large” bow, which determines the actual instanton data. We will deal with this in section 6.2.

Let us now describe the bow diagram in detail. The wavy line represents an interval [−l/2, l/2]. Over this interval we have a line bundle e. On this bundle we define abelian Nahm data (t0, t1, t2, t3), with t0 acting as a connection, and t1, t2, t3 giving endomorphisms of the fibre es at each point s of the interval. We also have two maps, represented by the curved lines in figure 5.1. These can be combined into the two spinors (5.20) which can in turn be combined into the quaternion (5.21) The setup we have just outlined gives a representation of the bow diagram in terms of vector bundles and maps between fibres. Bow diagrams can be thought of as generalisations of quivers, which we briefly met when discussing the ADHM construction in terms of moment maps. The data are not uniquely determined, but admit U(1) gauge transformations g(s) acting as



Fig. 5.1. Small bow.

(5.22) (5.23)

We will now show that if we quotient the space of all data corresponding to this bow diagram by the above gauge transformations, we obtain a copy of Taub-NUT space. Clearly gauge transformations which are constant on [−l/2, l/2] do not change our data, and so there is no point in quotienting by them. We can factor out such gauge transformations by choosing a particular distinguished point on our bow diagram, and only considering gauge transformations which are trivial at that point. The overall effect is to remove one complete U(1) factor from the total quotient. Here we have chosen the midpoint 0 of the interval to be this distinguished point.

We also take some other point s0 and refer to it as the marked point. Here for definiteness we suppose that s0 is positive. It splits the interval [−l/2, l/2] into two subintervals of lengths lR = l/2 − s0 and lL = l/2 + s0. We denote the Nahm data on the right subinterval as tR and on the left subinterval as tL. By introducing this marked point we will be able to identify a natural connection on a line bundle over the Taub-NUT.

The first step in the hyperkähler procedure is to make the Nahm data constant. We achieve this on each subinterval by considering gauge transformations which are trivial at the distinguished point 0, trivial at the marked point s0 and trivial at the endpoints of the original interval. Such gauge transformations will also allow us to identify tR0 with tR0 + 2π/lR and tL0 with tL0 + 2π/lL, so that the space of the Nahm data on each interval corresponds to a copy of R3 × S1. After this we quotient with respect to the U(1) factors at the remaining points (as always excluding the distinguished point). We really only have two such factors – the actions of U(1) at s = −l/2 and s = s0 are not independent as we need to keep the action at s = 0 trivial. Thus the last step is to quotient by gauge transformations which are non-trivial at s = l/2 but trivial at s = s0, and by gauge transformations acting on s = s0 and s = −l/2 but trivially at s = 0.
The result of this quotient will be to give us the Taub-NUT space. In fact before quotienting by the final gauge group action we will have the metric of a circle bundle over Taub-NUT, and we can use this to identify a natural connection on bundles over the Taub-NUT space. Consider first the right subinterval. Using the results of section 4, the first set of gauge transformations – trivial at s0 and l/2 – produce the one-dimensional Nahm equations as their moment maps:



(5.24) Setting these to zero we find that the Nahm data tR1, tR2, tR3 are constant on the right subinterval. The situation with tR0 is a bit more subtle. Let us write g(s) = exp(if(s)). Under this (5.25)

Ideally, we could set

and eliminate tR0 completely. However, we must have which requires f(s) be periodic, If we set (5.26)

then integration shows that this f(s) does satisfy f(l/2) = f(s0) + 2nπ. It also takes tR0(s) to a constant. Specialising to a gauge transformation of the form g(s) = exp(2πi(s − s0)/lR) we see that we can identify tR0 with tR0 + 2π/lR, so that tR0 is in fact periodic.

The quotient on the left subinterval is similar except we need to choose our gauge transformations to make sure that they remain trivial at s = 0. We can do this simply by using a combination of transformations which are trivial at the endpoints, and global (constant) gauge transformations which ensure nothing happens at s = 0. We find again that the vector part of the left Nahm data and tL0 are both constant. The metric on the data can now be written (5.27) or, as the Nahm data are constant, and using q = (b−, b+), (5.28) The symplectic forms for the Nahm data, arising from the action of the unit quaternions ej on ta0 + ta1e1 + ta2e2 + ta3e3, with a = L or R, are (5.29) while those for the bs, arising from the action of the unit quaternions in the form ej = −iσj on b± (see footnote 1), are (5.30)

Footnote 1: There’s a bit of a subtlety here in comparing the ej acting on the Nahm data and the representation of them acting on the spinors. The quaternion −iσ1 actually corresponds to e3, while −iσ3 corresponds to e1. For this reason the symplectic forms ω1 and ω3 shown actually arise from what we’re calling e3 and e1 for the spinors.



(5.31) These can be expressed in the compact form

(5.32) (5.33)
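The gauge-fixing step used earlier in this subsection to make tR0 constant can be illustrated with a small numerical sketch. It assumes the convention that g(s) = exp(i f(s)) shifts t0(s) to t0(s) − f'(s); the sign and normalisation in (5.25) may differ, and the function names `cumulative_trapezoid` and `gauge_fix`, as well as the sample profile for t0, are hypothetical. With f'(s) = t0(s) − c, the constant c is fixed by requiring f(l/2) − f(s0) = 2nπ, so that g is trivial at both ends of the right subinterval.

```python
import math


def cumulative_trapezoid(values, ds):
    """Running trapezoidal integral of equally spaced samples."""
    totals = [0.0]
    for left, right in zip(values, values[1:]):
        totals.append(totals[-1] + 0.5 * (left + right) * ds)
    return totals


def gauge_fix(t0_samples, ds, n=0):
    """Return (f, new_t0, constant) for winding number n.

    Assumed convention: g(s) = exp(i f(s)) shifts t0(s) to t0(s) - f'(s).
    """
    steps = len(t0_samples) - 1
    length = steps * ds
    integral = cumulative_trapezoid(t0_samples, ds)
    # Choose the constant so that f(end) - f(start) = 2 * pi * n.
    constant = (integral[-1] - 2.0 * math.pi * n) / length
    # f(s) is the integral of f'(s) = t0(s) - constant, starting from f = 0.
    f = [integral[i] - constant * (i * ds) for i in range(steps + 1)]
    # Transformed data t0 - f', which equals the constant at every sample.
    new_t0 = [t - (t - constant) for t in t0_samples]
    return f, new_t0, constant


# Right subinterval [s0, l/2] with an arbitrary, non-constant t0(s).
s0, l = 0.2, 2.0
l_right = l / 2.0 - s0
steps = 400
ds = l_right / steps
samples = [math.sin(3.0 * (s0 + i * ds)) + 0.5 for i in range(steps + 1)]

f1, fixed1, c1 = gauge_fix(samples, ds, n=1)
f0, fixed0, c0 = gauge_fix(samples, ds, n=0)

print(abs(f1[-1] - 2.0 * math.pi) < 1e-9)                  # g is trivial at both ends
print(max(fixed1) - min(fixed1) < 1e-12)                   # t0 has become constant
print(abs((c0 - c1) - 2.0 * math.pi / l_right) < 1e-9)     # constants differ by 2*pi/l_R
```

The last line shows why the constant value of tR0 is only defined modulo 2π/lR: changing the winding number n by one shifts it by exactly that amount, which is the periodicity that turns the space of Nahm data on the subinterval into R3 × S1.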

5.3.2 Gauge transformation at s = l/2

Consider the gauge transformation

(5.34)

such that g(s0) = 1 and g(l/2) = eiφR . This acts on tR0 as (5.35) and sends

which is equivalent to the transformation (5.36)

The vector field induced by this action is (5.37) which leads to the moment map (5.38) Now the combination qe3q† should look familiar – we previously used it to define a three-vector coordinate when we were writing the R4 metric in Taub-NUT-like coordinates in section 5.2. We can do the same here – define x = (x1, x2, x3) by x1e1 + x2e2 + x3e3 = qe3q†. The moment map is then If we also decompose q = a exp(e3ψ/2), with ψ ~ ψ + 4π then the metric (5.28) becomes (5.39) where (5.40) Note that ψ transforms under the gauge transformation we are considering as ψ → ψ + 2φR, so that an invariant angular coordinate is provided by σ = ψ + 2lRtR0. Using this coordinate and setting



the metric becomes (5.41) The last two terms equal (5.42)

In total we get (5.43) The first term is the Taub-NUT metric and the gauge transformation only acts on the second term. When we quotient by the group action we therefore eliminate this term.

5.3.3 Gauge transformation at s = s0

Now consider the gauge transformation (5.44) This satisfies

and acts by (5.45)

which leads to the moment map (5.46)

or

(5.47) and as The invariant coordinate for this transformation is τ = σ + 2lLtL0. The metric (5.48)

becomes

(5.49)



Combining the last two terms and completing the square in dtL0: (5.50)

so we find (5.51) The first two terms give the metric on Taub-NUT, while the last term gives the metric on μ−1(0). We see that μ−1(0) describes a circle bundle; if we change coordinates as we can write the second piece as (5.52) and from this extract the natural connection term (5.53) This is an abelian connection on a natural line bundle over Taub-NUT; taking the product of this bundle with R3 gives us a four dimensional vector bundle over Taub-NUT with covariant derivative . The differential d here denotes the full differential with respect to the coordinates (or equivalently on Taub-NUT. We will use this connection to obtain instanton connections on the Taub-NUT, by projecting it onto the kernel of solutions to the appropriate Weyl operator, which we describe in section 6. We can also check fairly easily that the curvature F = dA of this connection is in fact self-dual in the orientation (τ, x1, x2, x3) on the Taub-NUT.

5.3.4 A change of coordinates

From the above moment maps we have

where

and q = (b−, b+). Using e3 = −iσ3 we have



(5.54) Here

In terms of

we have (5.55)

with the equality on the right following from direct calculation, using the definitions of b± in terms of b01, b10. Possible explicit expressions for the spinors b± that satisfy this are: (5.56) (compare b+ with

from the Dirac monopole Nahm transform in equation 3.44). Note that in fact (5.57)

We will find it convenient to use (t0, t) and b± as coordinates on Taub-NUT.

5.4 Multi-centered Taub-NUT as a hyperkähler quotient

We can extend the above construction to generate the metric on k-centered Taub-NUT space. Although we will not need this for the construction of instantons in this report, it is a useful exercise that demonstrates the power of the bow diagram formalism, and would be necessary for constructing self-dual connections on the multi-Taub-NUT space.

5.4.1 The bow diagram

The bow diagram for k-centered Taub-NUT is shown in figure 5.2. We have k intervals, represented by wavy lines. Above these intervals we have line bundles, and abelian Nahm data The solid lines with arrows represent maps between the fibres at the endpoints of consecutive intervals. We refer to these lines as (directed) edges, with an arrow pointing from the tail to the head of the edge. The tail of the ath edge is the right end of interval a, and the head of the ath edge is the left end of interval a + 1 (see also figure 5.3). We let the length of interval a be la, and the total length of all the intervals is



Fig. 5.2. Bow diagram for k-centered Taub-NUT.

Let us denote by eaR the fibre at the rightmost point of the ath interval, and by eaL the fibre at the leftmost point of the ath interval. The maps represented by each edge are (5.58) The ordering of the superscripts is to make it easier to write down how these transform under gauge transformations (5.59) The notation g(aR) means of course the value of the gauge transformation g at the rightmost point R of the ath interval.

Fig. 5.3. A close-up view.

We construct the usual spinors (5.60) and form these into k quaternions As before we have actions of the complex structures giving rise to symplectic forms If under a gauge transformation then we have a moment map We define a vector coordinate for each qa using so that this moment map is



We also decompose qa = where fa is pure imaginary and ψa is periodic with period 4π; the same gauge transformation is then equivalent to ψa → ψa + 2φa. In terms of ( ψa) the (flat) metric for each edge a is (5.61)

5.4.2 The quotient

As before, we are going to consider quotients with respect to the gauge group U(1) acting on the bow diagram. Again, we do not want to consider everywhere constant U(1) transformations so we introduce a distinguished point 0 which, for simplicity, we take to be the left point of the 0th interval (this point is shown in figure 5.2). We then only ever consider gauge transformations which act trivially at the distinguished point. We also have a marked point s0 which we consider to be on some interval int(s0). We will only quotient by gauge transformations which are non-trivial at s0 at the very end of the hyperkähler quotient procedure. The interval int(s0) is split into two parts by the marked point; we will denote the Nahm data on the left part (of length lL) by and that on the right part (of length lR) by

We begin by using gauge transformations which are trivial at the endpoints of each interval, trivial at the distinguished point and trivial at s0. We can use these to make the Nahm data constant on each interval, and also identify ta0 with . Hence we obtain k+1 copies of R3 × S1, each with metric (where a = L or R instead of int(s0)). Next, we quotient by gauge transformations acting on the right endpoint of each interval (except at s0). Measuring everything from the distinguished point 0 means that for instance on the ath interval these can be realised by (5.62) The actions are (5.63) The moment map is (5.64) and an invariant angular coordinate is We choose to set for some constant vector ; then still and the quotient is carried out as in section 5.3.2; the result is that for each a the metrics for the maps of edge a and the Nahm data for interval a combine to give a copy of the Taub-NUT metric. In total, including the metric from the split interval containing s0, we have



(5.65) Next, we quotient by U(1) gauge transformations acting at the left endpoints of each interval, apart from the 0th (as the left endpoint of that interval is the distinguished point) and apart from at s0. Thus we have two chains of quotients. The first, which we denote by A, runs from the left endpoint of the 1st interval to the left endpoint of interval int(s0). The second, which we denote by B, runs from the left endpoint of interval int(s0)+1 to the left endpoint of interval k − 1. These chains will be joined by the final gauge transformation, which acts at s0. The action at the left end of each interval is (5.66) so that the moment map is (5.67) and an invariant coordinate is We set the moment map equal to it follows that From the moment maps at the right endpoints we also have so that Then for the metric terms we write (5.68) Completing the square in dσa and quotienting out that term we in fact find that we get another copy of the Taub-NUT metric (5.69) with We continue to combine terms in this way (the piece of the metric combines with that from the interval int(s0) − 1 to give another Taub-NUT) until we are left with only the gauge transformation at s0 to be performed. At this stage we have the metric (5.70) The indices A and B refer to the two chains of quotients we described above; i.e. is the overall invariant for the chain leading from the left endpoint of interval 1 to the left endpoint of interval int(s0). We also have a Because of the moment maps all the vector coordinates are related by shifts, so we can take for instance The final U(1) acts by



(5.71) Now will respectively contain terms, so that this gauge transformation sends Hence is invariant. The moment map is We set this equal to zero, notice that it implies for all a and b, and complete the square one last time to find (5.72) where ω = ωA + ωB, (any other choice is equally valid) and if we notice from the moment maps that we have (5.73)

where give the relative locations of the k Taub-NUT centres. If we had chosen to set all our moment maps equal to zero, rather than to constant values, then we would have obtained a k-fold degenerate multi-Taub-NUT space, with all the centres coinciding at the origin. Letting where φ has period 2π and so can be viewed as a coordinate on the U(1) fibre over s0, we see that we have a natural connection (5.74)

We can write this as

(5.75) Observe that a0 = 0 and if we go all the way around the bow to l (viewing this as a point on interval k 0) we get and we find that al is just given by the exact form so that the connection is essentially periodic modulo exact forms.
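The role of the centres in the multi-Taub-NUT metric can be made concrete with a short numerical sketch. It assumes that the harmonic function in the k-centered metric has the form V(x) = l + Σa 1/|x − νa|, where the νa are the centre positions. The unit normalisation of each pole, the function names `harmonic_function` and `laplacian`, and the sample centres are assumptions made here for illustration. The sketch checks that coincident centres reproduce the k-fold degenerate case l + k/|x|, and that V is harmonic away from the centres, which is the consistency condition for a one-form ω with dω given by the three-dimensional Hodge dual of dV to exist locally.

```python
import math


def harmonic_function(x, centres, l):
    """V(x) = l + sum over centres of 1/|x - nu| (assumed normalisation)."""
    total = l
    for nu in centres:
        total += 1.0 / math.dist(x, nu)
    return total


def laplacian(func, x, h=1e-3):
    """Central finite-difference Laplacian of func at the point x in R^3."""
    value = 0.0
    for axis in range(3):
        forward = list(x)
        backward = list(x)
        forward[axis] += h
        backward[axis] -= h
        value += func(forward) + func(backward) - 2.0 * func(x)
    return value / h**2


l = 1.0
origin = (0.0, 0.0, 0.0)
centres = [origin, (0.5, 0.0, 0.0), (0.0, -0.3, 0.2)]   # hypothetical centre positions
x = (1.0, 2.0, 0.5)


def V(point):
    """Harmonic function for the three sample centres."""
    return harmonic_function(point, centres, l)


# All centres at the origin: the k-fold degenerate case.
coincident = harmonic_function(x, [origin] * 3, l)

print(abs(coincident - (l + 3.0 / math.dist(x, origin))) < 1e-12)
print(abs(laplacian(V, x)) < 1e-4)   # harmonic away from the centres
```

Only the relative positions of the centres matter for the geometry, since an overall shift of all the νa can be absorbed into the vector coordinate.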

5.5 Instanton on R4 in Taub-NUT like coordinates

We would like to have an expression for the basic instanton on R4 in terms of the Taub-NUT coordinates This is especially simple if we consider the instanton to be positioned at the origin. From we have Now we have

while a relatively straightforward calculation gives

(5.76) (5.77)



Note that here

We thus find the nice expression (5.78)

which in fact holds whether we write ω as in equation (5.18) or (5.19). Now in the gauge which is singular at the instanton position – which is here the origin – we have (see equation (2.43)) (5.79)

This now becomes

(5.80) or using slashed coordinates (5.81) We now switch from

Suppose that ω is of the form (5.82)

Then substituting in

we find (5.83)

We note that this is in fact minus the correct expression for ω in the new coordinates (by comparing with (5.18)) – so the instanton connection becomes (5.84)

where d here denotes the derivative with respect to the new coordinates. We define a new scaling factor y = ρ2/2, so that (5.85) This expression will reappear, as a limit of an instanton on Taub-NUT, in section 8. We can also easily write down an expression for the instanton at the point B in terms of b± coordinates. Letting B = (B−, B+) for two spinors B± we obviously have (5.86)



Chapter 6 Instantons on Taub-NUT

6.1 Instantons on the Taub-NUT

To define instantons on the Taub-NUT space (again, following closely the notation in [15]) we begin just as we did on R4. We consider vector bundles over the Taub-NUT with gauge group G. We have (hermitian) connections A with curvature F and require that F has finite action (6.1) Then, if F satisfies the self-duality equation, we say that A is an instanton on the Taub-NUT space. Here denotes the Hodge operator over Taub-NUT. From the metric, (6.2) we note that our basis one-forms are (6.3) and in the orientation (τ, x1, x2, x3) we define the action of the Hodge star on two-forms by (6.4) and extend this by permutations.

An important difference between instantons on the Taub-NUT and instantons on essentially flat spaces arises when we consider the consequences of the finite action condition. At infinity the Taub-NUT space resembles R3 × S1 – thus, one of the spatial directions is compact. If τ is the Taub-NUT angular direction, then there is no need for the dτ component of the connection to vanish as we go to infinity. In particular we can consider the monodromy, or parallel transport, about the angular direction. The monodromy operator is defined by the matrix differential equation (6.5) and the monodromy about the Taub-NUT circle at the point As we go to infinity then our instanton will have non-trivial monodromy in the τ direction. We presume the eigenvalues of have the form exp and presume that In particular for an SU(2) connection the eigenvalues will be exp.

We also need to define the instanton number or charge. This situation is a bit more complicated, as it turns out that instantons on the Taub-NUT space possess two kinds of topological charges, which we call monopole and instanton charges. The former arise from the fact that, if we restrict to a sphere of large radius R on the Taub-NUT space, where is the Taub-NUT vector coordinate, we can find a gauge transformation such that the instanton connection A can be written in the form where and are independent of τ.
Then (as was noted by Kronheimer [29]) the self-duality equation for A on the Taub-NUT is equivalent to the Bogomolny equation for and on R3. Thus, we can reduce the Taub-NUT instanton to a monopole on R3, and its monopole charges are then determined in terms of the asymptotic behaviour of the eigenvalues of (for the precise details, see section 1.2 of [15]). In this report we will restrict ourselves to the case of vanishing monopole charges. Then the instanton number k is given simply by the integral of over the whole space, as for the R4 instanton.

6.2 Bow diagrams

We come now to the construction of SU(2) instantons on the Taub-NUT space. This construction, and our description of it, is taken from [15]. The data needed for instantons on the Taub-NUT space can be encoded in bow diagrams. The bow diagram for an SU(2) instanton with instanton number k0 is shown in figure 6.1. The wavy lines represent an interval [−l/2, l/2], where l is the same parameter that appears in the Taub-NUT metric. Over this interval we have a rank k0 vector bundle which we denote by E. The fibre at the point is denoted by Es.

Fig. 6.1. Big bow.

We have Nahm data defined in the bundle E. We do not presume this data is continuous across the whole interval – it may have discontinuities at the points −λ and λ. These points correspond to the eigenvalues of the monodromy at infinity. At ±λ we have maps Iα and Jα, α = R, L, between the fibres at λL = −λ and λR = +λ, and the external spaces Wα ≈ C. These maps are represented by the curved lines shown. We also have two maps between the fibres at the endpoints of the interval, and These are represented in the bow diagram by a curved line joining the two endpoints (and making it into a bow). Every part of this diagram can be understood in terms of the formulations of the ADHM and Nahm constructions in section 4. Along the interval the Nahm data are acted on by a gauge transformation g(s): (6.6) and we choose to express the resulting moment maps in terms of the operator



The maps (Iα, Jα) and B01, B10 can be thought of as a split version of the ADHM quiver diagram of figure 4.1. Note that our bow diagram becomes this quiver diagram if we send l to zero. This implies that the instanton constructed on Taub-NUT using this method should become an instanton on R4 when l = 0. Gauge transformations act on these data as

(6.7) The moment maps are similar to those in the ADHM case, but split between the points ±λ and ±l/2. The complex moment map μC = μ1 + iμ2 is (6.8) while the remaining moment map μ3 is (6.9) We introduce again (6.10) as well as (6.11) Then we can combine the moment maps for the entire bow, Nahm data and maps, into the single expression (6.12) and we will set

Notice how at the endpoints ±l/2 this implies that and remembering that we are to think of the derivative as giving the values of the Nahm data at the endpoints. Similarly, at ±λ we get and As we can check that it follows that the Nahm data on the leftmost and rightmost intervals of the bow diagram are equal: we denote the data on these intervals by and choose B± such that On the middle interval we denote the Nahm data by



6.2.1 On the origin of bow diagrams

A valid question at this point is why the bow diagram described above corresponds to an instanton on Taub-NUT space. Obviously we can show explicitly that, using the operator D† that we are going to define in the next subsection, this construction really does produce self-dual connections. But that does not make it clear what the original reasoning was. It turns out the answer lies in string theory. The ADHM transform, Nahm transform and indeed the construction of instantons on Taub-NUT can all be formulated in string theoretic terms using D-branes. These are extended objects with p spatial dimensions and one time dimension upon which open strings can end. Certain configurations of D-branes and strings between them can be shown to give rise to gauge theories. T-duality is a string theory duality in which a compact spatial dimension of radius R is interchanged with one of radius 1/R. This can be used in terms of instantons as follows: first, we realise an instanton configuration on the Taub-NUT space using D-branes. Next we perform T-duality to relate this configuration to a dual picture, which can be interpreted as a bow diagram. Obviously, this is an extremely hand-wavy sketch of the important ideas underpinning the bow diagram concept. For more on the relation of branes to instantons on the Taub-NUT, see the recent paper by Witten [35].

6.3 The Weyl operator

We now want to write down a linear operator D such that the condition (6.13) is equivalent to the vanishing of the moment maps in equation (6.12). This is the same condition we have met before for instantons on R4 and for monopoles, which guarantees self-dual connections. We define D as acting on continuous L2 sections of the vector bundle E which have L2 derivatives away from the points ±λ, tensored with a space S ≈ C2 of two-component spinors. We call this space H, and D maps H to the direct sum of H with the spaces WL, WR, El/2 and E−l/2, which we call Explicitly, (6.14)

where f is an appropriate element of Let ψ = ψ(s) be an L2 section of restricted to Denote a collection of such objects as

and let (6.15)



We have a natural inner product: (6.16) The adjoint operator D† is defined by (where the inner product on the left is in the space and that on the right is in ). Now, from the definition of Df in equation (6.14) we have (6.17)

We integrate the last term by parts, breaking up the integral into integrals over the subintervals [−l/2, −λ], [−λ, λ] and [λ, l/2], so that (6.18)

It is then easy to combine the various pieces together, and find that (6.19) It is straightforward to see that the moment maps of equation (6.12) are indeed equivalent to VecD†D = 0. The Weyl equation D†ψ = 0 will be satisfied by (ψ(s), χL, χR, v−, v+) which satisfy (6.20) on [−l/2, l/2] away from ±λ, and also (6.21) (6.22) (6.23) (6.24)



These four equations represent the matching conditions for our instanton data. The next step is to twist the operator D† with a point of the Taub-NUT. Such a point corresponds to coordinates obeying the moment maps for the “small” bow used in section 5.3. The twisted operator is where again

(6.25) (6.26)

Note that b± act in the opposite direction to B±; note also that they act on different spaces. Properly we must consider them, and the t and T operators, to be tensored with the appropriate identity matrices so that they act on the tensor product of the small bow with the large bow. As a consequence of the moment maps for the small bow, the twisted operator also satisfies At last we can now define our final instanton data, and give the formulae for the instanton connection. Using E to denote as before the vector bundle over the large bow, and e to denote that over the small bow, we consider ψ(s) to be a section of over [−l/2, l/2] excluding ±λ. We somewhat subtly let and and combine these into (6.27) Our data can be denoted by

(6.28)

with the inner product on the space of this data defined as in (6.16). We need an operator s which is essentially multiplication by s: (6.29) To construct an instanton on Taub-NUT we then solve

(6.30)

We form the orthonormalisable solutions into a matrix Ψ which has each basis vector as a column. We assume that (Ψ, Ψ) = N2 so that an orthonormal solution is The instanton connection is obtained by projecting the natural Taub-NUT connection from (5.53), onto the kernel of We do this using which implies that (6.31) We choose to rewrite this in the form (6.32)



with

(6.33) (6.34) (6.35)

with d here denoting the three-dimensional derivative,

. Equivalently in t coordinates

The last thing we would need to check is whether this construction genuinely produces self-dual solutions. This can be shown similarly to the proof for the ADHM transform, using the projection operator onto the kernel of and a number of other results. We refer the reader to section 7 of [15] rather than reproducing the demonstration here. This completes the description of the construction of SU(2) instantons on the Taub-NUT space.

Chapter 7 Example – an SU(2) instanton with unit instanton charge

We will now go on to construct an explicit example of an instanton connection on the Taub-NUT space, with unit instanton charge and vanishing monopole charges. Again this calculation is originally found in [15] (although we suspect there are some minor typos in the expression there, as our result differs slightly). As coordinates on the Taub-NUT space we have the small bow data, b± and (t0,t). These satisfy and we assume we have gauged away t0 so that all τ dependence is in the phase of the spinors b±. Recall from the moment map conditions for the big bow we had that the Nahm data equalled on the intervals [−l/2, −λ] and [λ, l/2] are equal and on [−λ, λ]. We also had We let and introduce two spinors Q± such that These must in fact be the same as our previous Qs: Q+ = QR and Q− = QL. We also define relative coordinates Finally, we set (T1)0 and (T2)0 equal to zero (equivalent to setting the phases of B± and Q± to zero). We are now ready to solve the Weyl equation .

7.1 Solving the Weyl equation

The Weyl equation on each interval is This is solved by

, with

, depending on the interval. (7.1)

where Π, ψL and ψR are constant (s-independent) two-by-two matrices. The other data v,χR,χL are determined from the matching conditions at ±λ and ±l/2 – these are the same as equations (6.21) to (6.24) but with the appropriate twisting:



(7.2) (7.3) (7.4) (7.5) We need some relations involving b± and B±. Start with – to evaluate this we write This is just a number, so we do nothing by taking the trace: For the last equality we used the general identity and the fact that We now define (7.6)

and if e±iτ/2 are the phases of b± we find that

(7.7)

We also have the results

(7.8)

At this point we are forced to cross a small Rubicon. We must commit to some particular choice of data, without knowing what sort of connection it is going to produce. From the structure of the bow diagram (Figure 6.1) we can draw some general conclusions. The middle interval [−λ,λ] will clearly produce an ’t Hooft-Polyakov monopole-like term. We can interpret Π as being a gauge transformation of this. From v and χL, χR we expect to find terms that, when l = 0, give an instanton on R4. A priori we do not know exactly how to choose our data to reflect this. But we have to start somewhere, and we might as well just take the data originally used in [15]. For v, we choose (7.9) so that we obtain the simple expressions Combining χL and χR into a single object which we call χ, we take with

(7.10) (7.11) (7.12)

so that

(7.13)

Here

(7.14)



and

(7.15)

which have been constructed so as to satisfy

(7.16)

and also obey μ+μ− = P. We can check whether Υ and Π are unitary, as we would like to interpret them as gauge transformations. In fact we find (see appendix B.1 for the details of the Υ case; the calculation for Π is similar) (7.17) where we have introduced the following functions

(7.18) (7.19) (7.20)

for the sake of brevity. This tells us that although Π and Υ are not themselves unitary, they can easily be made so by rescaling. If we let |Π|2 = Π†Π and |Υ|2 = Υ†Υ then clearly Π/|Π| and Υ/|Υ| are unitary.
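The rescaling argument can be checked numerically on a toy matrix. The sketch below does not use the actual Υ or Π of equations (7.10) to (7.15), which are not reproduced here; it assumes a hypothetical 2×2 matrix M = a·1 + i b·σ with real a and b, for which M†M is a positive multiple of the identity, and the helper names are illustrative only.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def quaternion_like(a, b):
    """Hypothetical test matrix M = a*1 + i*(b . sigma), with real a and real 3-vector b."""
    b = np.asarray(b, dtype=float)
    return a * np.eye(2, dtype=complex) + 1j * np.einsum("i,ijk->jk", b, SIGMA)


def rescale_to_unitary(m):
    """Divide m by sqrt(c), assuming m^dagger m = c * identity with c > 0."""
    gram = m.conj().T @ m
    c = gram[0, 0].real
    if c <= 0 or not np.allclose(gram, c * np.eye(2)):
        raise ValueError("m^dagger m is not a positive multiple of the identity")
    return m / np.sqrt(c)


M = quaternion_like(2.0, [0.5, -1.0, 3.0])  # not unitary: M^dagger M = 14.25 * identity
U = rescale_to_unitary(M)                   # unitary after rescaling
```

The check inside `rescale_to_unitary` matters: the rescaling only produces a unitary matrix when M†M is proportional to the identity, which is the property the text attributes to Υ and Π via equation (7.17).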

7.2 Constructing the instanton

The calculations in this section are not especially difficult but they are fairly long, so we omit most of the details. A certain amount of patience is useful, as well as the following integrals: (7.21) (7.22) (7.23) all of which we first encountered when performing the Nahm transform in section 3. The first step is to calculate (7.24)

We find



(7.25) The reader may be pleased to hear that at this stage we have no need to define any further functions. The instanton components now follow from equations (6.33), (6.34) and (6.35). The final result, eleven pages of careful calculations later, is (7.26)

(7.27)

and

(7.28) Here denote derivatives acting to the left, for example . Note that we have written the instanton in a form which explicitly shows how each of v, χ and ψ(s) contributes to the connection, and there are some small simplifications which can be made. For instance, in Φ we have a piece coming from v which cancels with the term which results from the integrals over [−l/2, −λ] and [λ, l/2]. We can also check the number of independent parameters appearing in our solution. These are given by and (alternatively This gives six parameters – remembering that we set the phases of and to zero, we see that in fact the space of all k = 1 SU(2) instantons on Taub-NUT is eight-dimensional (the same as in the R4 case). The metric on this moduli space was found in [16].

Chapter 8 Analysis of the solution

This chapter contains the results of our analysis of the instanton solution found in the previous section. We have relegated the details of a number of the calculations involved to the appendix so as to make the presentation clearer.

8.1 The l = 0 limit

The most obvious limit presenting itself to us is to set l = 0 (which also of course implies λ = 0). In this limit the Taub-NUT space reduces to R4 and we expect the instanton to become the usual R4 k = 1 SU(2) instanton, written in the coordinates and τ , and probably in some unusual gauge.

8.1.1 The limit

First of all, it is clear that the part of the solution involving the Nahm data on the interval [−l/2,l/2] will vanish, as this interval disappears when l = 0. Hence we will only get contributions from the v and χ parts of the connection. We also note that Φ will vanish directly from its definition – recall it was given in terms of the operator s of equation (6.29), and now s = 0 identically. We have the following immediate simplifications: (8.1) (8.2) (8.3)

so that

(8.4) and also (8.5)

8.1.2 Centering the instanton

To make life a little bit easier, we choose to locate the instanton at the origin. To do so we interpret as giving the position of the instanton centre, having noticed that appears throughout the solution but only appears as part of y. We interpret the latter as a scaling factor. Letting As there are terms in A(0) and A(3) inversely proportional to P we must be careful about how we set An intelligent approach is to note that we can choose to lie entirely along the t3 axis, i.e. This means that and setting T1 = 0 is now well-defined.



The dτ component becomes (8.6) and the surviving terms in A(3) are (8.7) We abuse notation a little here and understand derivatives acting to the left to only act on Υ†. Now, (8.8) This tidies up beautifully using

We get (8.9)

so that

(8.10) Now in each term we notice that Υ appears almost as if it were a gauge transformation. With we have (8.11) and

We define a unitary matrix

by (8.12)

We then rewrite (8.13) so that

now appears explicitly as a gauge transformation term. It is easy to evaluate (8.14)

while writing

leads to (8.15)



Inserting these back into our expressions for A(0) and A(3) and simplifying we get the following expression for the centred instanton: (8.16)

Finally, we note that

(8.17) Using

and some simple manipulations we find (8.18)

Hence we can write (8.19) Thus we have shown that our instanton on Taub-NUT, once positioned at the origin, can be reduced to a gauge transformation of the following SU(2) connection on R4: (8.20) Referring to equation (5.85) we see that this is in fact exactly the expression for the k = 1 SU(2) instanton, centered at the origin, written in coordinates in the gauge which is singular as Hence we now know, at least for l = 0 and the gauge transformation which brings the instanton into a form closest to the spirit of the ADHM construction – it is given by We would expect that this holds anyway with and in the next section we will go on to show this explicitly.

8.2 An R4-inspired gauge transformation

Inspired by the result of the l = 0 and limit, we return to our original data for the Taub-NUT instanton and explore the consequences of applying the gauge transformation In particular, we expect now to be able to find expressions for our data which immediately give the usual ADHM data for R4 in the limit l = 0. We could also reach this conclusion by noting that for the ADHM construction of the k = 1 SU(2) instanton, χ was chosen to be constant. In our case we have (8.21) When l = λ = 0, all the dependence of this term is in Υ. Thus, we again conclude that we should multiply on the right by as a first step towards arriving at ADHM-like data. Note that although properly we should only consider unitary transformations of our data, we are also free to rescale the data by a constant function as long as we remember that this will also change the normalisation factor.

8.2.1 Applying Υ† on the right

Bearing this in mind we begin just by applying Υ† on the right. Using we find

and (8.22) (8.23) (8.24)

For v we have (8.25) Now,

and as

and

we find (8.26)

8.2.2 Comparison with ADHM

Setting l = 0 we only get contributions to the instanton potential from

(8.27) (8.28) To make sense out of this, define

(8.29) (8.30)

Now,

(8.31)



and

(8.32)

We see that our data has the form

(8.33) and if we rescale these by

(8.34)

we obtain

(8.35) (8.36) Now, we could choose to orient and so that they point along the t3 axis – in this case so that in fact χ = 1. This is exactly the same as the data we used for the ADHM construction back in section 2.3!

8.2.3 Taub-NUT data in an ADHM-like form

Returning to the data for the Taub-NUT instanton, let us write v and χ as

(8.37) (8.38)

where

(8.39) (8.40)

In fact

This turns out to be the same as

so we get

(8.41) (8.42)

Also,



(8.43)

We evaluate

having used that we obtain

Subtracting this from the similar term with

(8.44) (8.45)

This implies that our data has the form

(8.46) (8.47)

and if we rescale this by

we obtain the ADHM-like quantities (8.48)

Applying the same rescaling to the rest of our data,

(8.49) (8.50) (8.51)

What have we achieved? Well, we have found the form of the data for an SU(2) k = 1 instanton on Taub-NUT which exactly corresponds to the familiar R4 solution in the limit l = 0, written in the gauge which is singular as we approach the instanton position. We had of course expected this result from the limiting l = 0 case of the bow diagram itself, but it is nice to be able to find an explicit form of the data which reflects this. We notice that on the Taub-NUT we have the following (8.52)



which seems to replace the “normal” R4 coordinates b± in the instanton expression. An interesting interpretation of these quantities is as follows: the matrices appearing in these expressions could be thought of as representing parallel transport in the bundle formed by tensoring the line bundle over [−l/2, l/2] with the spinor space S ≈ C2, using and as covariant derivatives. For instance, parallel transport from −l/2 to −λ would be given by exp We note that b+ represents a map from the fibre at l/2 to that at –l/2, and vice versa for b−. Then the shifted coordinates represent parallel transporting the results of these maps to the centre of the interval.

8.3 The monodromy at infinity

In constructing the instanton on Taub-NUT we used the fact that the eigenvalues of the monodromy at infinity were ±λ. We want to check whether our solution is written in a gauge which reflects this. We thus send while keeping everything else constant. The details of this calculation are contained in appendix B.2; the result is that the connection for large becomes (8.53)

Now recall the monodromy equation

(8.54)

In our case

(8.55)

For τ = 4π we get

(8.56)

The eigenvalues of this matrix are

(8.57)

We see that in fact the points ±λ have been shifted:

(8.58)

We note that it would be good to find an interpretation of the quantities (8.59) Diagrammatically, if we view the interval as a circle, with the points and identified, then this shift corresponds to rotating the interval [−λ, λ] from the “bottom” of the circle to the “top” (see figure 8.1).



Fig. 8.1. Shift in eigenvalues of monodromy at infinity.
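To make the notion of monodromy around the τ circle concrete, the following sketch approximates a path-ordered exponential by an ordered product of short steps and reads off its eigenvalues. It is an illustration under stated assumptions, not the connection of equation (8.53): the τ component is taken to be i·a·σ3 with a hypothetical constant a, the period of τ is 4π as in the text, and the ordering and sign conventions are chosen for convenience.

```python
import numpy as np

SIGMA3 = np.array([[1, 0], [0, -1]], dtype=complex)


def exp_i_hermitian(h, step):
    """Return exp(i * step * h) for a Hermitian matrix h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * step * w)) @ v.conj().T


def monodromy(h_of_tau, period=4 * np.pi, steps=2000):
    """Ordered product approximating P exp(i * integral of h(tau) dtau) over one period."""
    dtau = period / steps
    result = np.eye(2, dtype=complex)
    for n in range(steps):
        tau_mid = (n + 0.5) * dtau
        # Later values of tau multiply from the left (an assumed ordering convention).
        result = exp_i_hermitian(h_of_tau(tau_mid), dtau) @ result
    return result


a = 0.15  # hypothetical constant, standing in for the asymptotic value of the tau component
W = monodromy(lambda tau: a * SIGMA3)
eigenvalues = np.linalg.eigvals(W)
expected = np.exp(np.array([1j, -1j]) * 4 * np.pi * a)
```

For a constant, diagonal component the ordered product is exact and the eigenvalues are exp(±4πi·a). Conjugating the connection by a constant gauge transformation changes W by conjugation only, so the eigenvalues are gauge-independent data, which is why they are the quantity compared with ±λ in the text.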

8.4 Behaviour at the origin

We now seek to examine the behaviour of the instanton at the origin, In particular we want to see if it exhibits singular behaviour, and if the component of the connection vanishes. The non-vanishing of this component would in fact be singular behaviour – as clearly is not well-defined at For A(0) at = 0 we find (see appendix B.3) (8.60) Calculating Φ at the origin is much simpler – in fact we don’t have to, because Φ appears in the connection in the term (8.61) But for t = 0 this vanishes, hence we get no contribution from Φ at the origin (there is one term in Φ which is proportional to V – however it is also proportional to t and so also disappears at the origin). Owing to the non-zero A(0) factor though, we conclude that in its current gauge, the instanton does not behave well at the origin.

8.5 The caloron limit

We end by noting one further limit of the Taub-NUT instanton. Recall that at infinity the Taub-NUT space resembles R3 × S1. We would expect that if we move the centre of our Taub-NUT space far from our instanton solution, then our connection would represent a self-dual configuration on R3 × S1. Instantons on R3 × S1 are known as calorons. The precise limit we are considering is to send to infinity while keeping the relative vectors fixed. We choose to apply this limit to the transformed ADHM-like data from section 8.2. We find (8.62) so that (8.63)



(8.64) (8.65)

where

(8.66)

If we introduce (8.67) then BB† = I so B is unitary, and Now also χ is still given by and v decays to zero as it is proportional to Our resulting instanton data is (8.68)

with Π as above. The only problem with these expressions is that, as noted by Nahm [14], they should be periodic in s with period l. Clearly though In fact this can be noticed back in section 7.1 where in equation (7.10) we originally had and . In the limit of large and but fixed we have so clearly It is possible that a different choice of gauge would go over naturally into a caloron, or else that our approach is too naive. It is interesting to note that similar appearing expressions for caloron data were obtained in the paper [39].

Chapter 9 Conclusions and further work

We presented in this report an overview of the construction of Yang-Mills instantons on a curved space, the Taub-NUT space. We showed how this construction relies on the powerful techniques of the ADHM and Nahm transforms, and uses the machinery of moment maps, hyperkähler quotients and bow diagrams. In particular we gave a detailed presentation of the k = 1 SU(2) instanton on Taub-NUT. Following [15] we explicitly constructed this instanton. We showed how it could be related to the familiar k = 1 SU(2) instanton on R4, by choosing an appropriate gauge. We also exhibited the behaviour of this instanton at the origin of the Taub-NUT space and at infinity. Our results indicate that an additional gauge transformation is needed for the instanton to have the expected monodromy at infinity and to behave properly at the origin. Possible future work would be to find this gauge transform and apply it to our solution. It would also be interesting to observe the behaviour of the instanton in the ADHM-like gauge in these limits.



Additionally, a more in-depth study of the caloron limit seems to be needed. The bow diagram formalism we used has a number of other applications. It can be extended to produce SU(n) instantons for general n and instanton number k0, although for k0 > 1 we are limited by the difficulty of solving the Nahm equations explicitly. The hyperkähler quotient construction of multi-centered Taub-NUT space which we presented in section 5.4 could be used in constructing self-dual connections on the k-centered space.

Appendix A Some useful results

A.1 Symplectic forms on R4 ≅ C2 ≅ H

Let us first recall the definition of a symplectic form. Let M be some manifold with metric g given by (A.1) This acts on two tangent vectors as follows (A.2) Let I be a complex structure on the manifold, i.e. a mapping of the tangent bundle to itself satisfying I² = −1. In components we have . The symplectic form is defined by (A.3) Now, (A.4) which is the same as (A.5) Writing we have (A.6) so that (A.7) Now, the most basic example of a hyperkähler space is the space of quaternions H. This is isomorphic to real four-dimensional flat space R4, with the point (w, x, y, z) ∈ R4 corresponding to the quaternion w + ix + jy + kz. The metric is (A.8) The actions of the quaternions (A.9) (A.10) (A.11) can be represented by the matrices (A.12)



which automatically give the matrices defining the three symplectic forms (as the metric is We then have (A.13) (A.14) (A.15) We can also introduce complex coordinates z1 = w+ix and z2 = y+iz so that and the action of the unit quaternions is The symplectic forms in these coordinates are

(A.16) (A.17) (A.18) (A.19)
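The structure described in this section can be verified numerically. The sketch below builds the 4×4 real matrices for left multiplication by the unit quaternions i, j, k on (w, x, y, z) and checks the quaternion relations and the antisymmetry of the associated two-forms. The conventions are assumptions made for illustration: left rather than right multiplication, the flat metric g = identity, and ω(u, v) = g(Iu, v); the matrices of equation (A.12) may differ from these by signs or ordering.

```python
import numpy as np


def qmul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])


def left_mult_matrix(unit):
    """4x4 real matrix of the map q -> unit * q in the basis (1, i, j, k)."""
    return np.column_stack([qmul(unit, e) for e in np.eye(4)])


I = left_mult_matrix((0.0, 1.0, 0.0, 0.0))
J = left_mult_matrix((0.0, 0.0, 1.0, 0.0))
K = left_mult_matrix((0.0, 0.0, 0.0, 1.0))


def omega(structure, u, v):
    """Two-form omega(u, v) = g(structure u, v) for the flat metric g = identity."""
    return float((structure @ u) @ v)


u = np.array([1.0, 2.0, -1.0, 0.5])
v = np.array([0.0, -3.0, 4.0, 1.0])
```

Each of I, J, K squares to minus the identity and is antisymmetric, so each defines an antisymmetric, non-degenerate two-form; since the components are constant, these forms are automatically closed on flat R4.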

A.2 Products of slashed terms

(A.20)

and the following products of slashed terms

(A.21) (A.22) (A.23) (A.24)

Clearly

(A.25)
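Identities of this kind are easy to check numerically. The sketch below assumes that a slashed three-vector means its contraction with the Pauli matrices, a̸ = a·σ, which is an assumption since the definition in (A.20) is not reproduced here. It verifies the standard product rule a̸ b̸ = (a·b) 1 + i (a×b)·σ and the exponential formula exp(a̸) = cosh|a| 1 + sinh|a| a̸/|a| for arbitrary test vectors.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def slash(a):
    """Assumed convention: slash(a) = a . sigma for a real 3-vector a."""
    return np.einsum("i,ijk->jk", np.asarray(a, dtype=float), SIGMA)


def exp_hermitian(h):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(w)) @ v.conj().T


a = np.array([0.3, -1.2, 0.7])  # arbitrary test vectors
b = np.array([1.1, 0.4, -0.5])
identity = np.eye(2)

# Product of two slashed vectors
product_lhs = slash(a) @ slash(b)
product_rhs = np.dot(a, b) * identity + 1j * slash(np.cross(a, b))

# Exponential of a slashed vector
norm_a = np.linalg.norm(a)
exp_lhs = exp_hermitian(slash(a))
exp_rhs = np.cosh(norm_a) * identity + np.sinh(norm_a) * slash(a) / norm_a
```

A consequence used repeatedly in calculations of this type is that the square of a slashed vector is a scalar, |a|² times the identity, so any product of slashed terms splits into a part with an even number of slashes and a part with an odd number.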

Appendix B Various calculations

The identities for products of slashed terms, equations (A.21) to (A.24), will be used repeatedly in these calculations.

B.1 Calculation of Υ†Υ

Consider (B.1)

using . To evaluate this expression, we note that the third term is obtained from the second by sending all slashed objects to minus themselves. Thus, we only need to calculate the terms coming from which contain an even number of slashes. Doing



the exponentials first, we have

(B.2)

where we grouped the final result into terms containing an odd number of slashes: (B.3) and an even number: (B.4) where we used and

We then multiply out and only keep the terms with an even number of slashes:

(B.5)

Thus we get

(B.6) Now Using these and multiplying the above expression by two to account for the term we have (B.7)

Multiplying the c2 term we have while multiplying the term we have (B.8)

So that

(B.9) (B.10)

B.2 The t → ∞ limit

B.2.1 Preliminaries

For t → ∞ we have Noting that we see that we can neglect P in μ±, taking

and using (B.11)



For terms of the form

we use (B.12)

Hence

(B.13)

becomes

(B.14)

which tidies up using

(B.15)

Similarly

(B.16)

becomes

where Also

(B.17) (B.18) (B.19) (B.20) (B.21)

Finally, the normalisation factor

(B.22)

becomes

(B.23)

We have

(B.24) The first thing we note is that the 1/N2 term multiplies everything by a decaying exponential, so anything not containing exponential factors to cancel this out will disappear as t → ∞. We see then that (B.25)



Now, (B.26) We begin multiplying in (B.27)

so (B.28)

We also have (B.29)

so that

(B.30)

Putting everything together we find (B.31)

and this tidies up to give (B.32)

B.2.2 Φ

We had (B.33)

For large t this will become



(B.34)

where we dropped terms that will be killed by the exponential in N2. Now, (B.35) Carefully multiplying in the brackets, (B.36)

Using (A.2) and (A.24) we get

(B.37)

and filling in for g, (B.38) For μ we have (B.39)

We then have (B.40)



We then drop the second term as it is

so (B.41) For large t, V ≈ l so we have a term

In the connection Φ appears in the term

(B.42)

B.3 The t → 0 limit

B.3.1 Preliminaries

We want to study the behaviour of the dτ components of the connection at the origin, Setting we have

(B.43) (B.44) (B.45) (B.46) (B.47) (B.48)

B.3.2 A(0)

We have

(B.49)

with

(B.50)

using

and (B.51)

using

, so that



(B.52)

We can work this out in greater detail. Start with

(B.53) where c2, s2 have their usual meanings but we are using for convenience. We need to evaluate this and the similar term obtained by sending all the slashed objects to minus themselves, and then subtract the former from the latter. Multiplying in the leftmost and rightmost brackets we have (B.54)

In total, including the minus slashed terms and inserting the appropriate minus signs, we then need to evaluate (B.55) and (B.56) which can be worked out similarly to the many previous calculations of the same type. Thus, the terms in A(0) contributing from the Π factors are

In fact we have

(B.57)

so this is just

(B.58)

For the μ terms we need to evaluate

(B.59) and subtract from this the similar term obtained by sending the slashed objects to minus themselves. Now, this equals



(B.60)

and then we have

(B.61) Ignoring the scalar terms, which will vanish when we subtract off the minus slashed result, we have found (B.62) The subtraction of the minus slashed terms amounts to multiplying this by two, and so we find for the μ terms (B.63) Hence A(0) is given by (B.64) Making a single fraction on the right-hand side, (B.65)

hence we find

(B.66)



[1] R. L. Mills, C. N. Yang, “Conservation of Isotopic Spin and Isotopic Gauge Invariance,” Phys. Rev. 96, 191-195 (1954).
[2] A.A. Belavin, A.M. Polyakov, A.S. Schwartz, Yu.S. Tyupkin, “Pseudoparticle Solutions of the Yang-Mills Equations,” Phys. Lett. B. 59, 85-87 (1975).
[3] R. Jackiw, C. Rebbi, “Vacuum Periodicity in a Yang-Mills Quantum Theory,” Phys. Rev. Lett. 37.
[4] C.G. Callan, R.F. Dashen, D.J. Gross, “The Structure of the Gauge Theory Vacuum,” Phys. Lett. B. 63, 334 (1976).
[5] E. Witten, “Some Exact Multipseudoparticle Solutions of Classical Yang-Mills Theory,” Phys. Rev. Lett. 38, 121-34 (1977).
[6] G. ’t Hooft, unpublished; F. Wilczek, in: Quark confinement and field theory (eds. D. Stump, D. Weingarten), New York, John Wiley and Sons (1977); E. Corrigan, D.B. Fairlie, “Scalar field theory and exact solutions to a classical SU(2) gauge theory,” Phys. Lett. B. 67, 69-71 (1977); R. Jackiw, C. Nohl, C. Rebbi, “Conformal properties of pseudoparticle configurations,” Phys. Rev. D. 35, 1642-1646 (1977).
[7] R.S. Ward, “On Self-Dual Gauge Fields,” Phys. Lett. A. 61, 81-82 (1977).
[8] M.F. Atiyah, R.S. Ward, “Instantons and Algebraic Geometry,” Commun. Math. Phys. 55, 117-124 (1977).
[9] M.F. Atiyah, N.J. Hitchin, V.G. Drinfeld, Yu.I. Manin, “Construction of Instantons,” Phys. Lett. A. 65, 185-187 (1978).
[10] M.F. Atiyah, “Geometry of Yang-Mills Fields,” Scuola Normale Superiore Pisa, Pisa (1979).
[11] W. Nahm, “A Simple Formalism For The BPS Monopole,” Phys. Lett. B. 90, 413 (1980).
[12] W. Nahm, “On Abelian Selfdual Multi-Monopoles,” Phys. Lett. B. 93, 42 (1980).
[13] W. Nahm, “All Self-Dual Multimonopoles for Arbitrary Gauge Group,” CERN-TH. 31, 72 (1981).
[14] W. Nahm, “Self-Dual Monopoles And Calorons,” BONN-HE-83-16. Presented at 12th Colloq. on Group Theoretical Methods in Physics, Trieste, Italy, Sep 5-10, 1983.
[15] S. Cherkis, “Instantons on the Taub-NUT Space,” arXiv:0902.4724v1 [hep-th].
[16] S. Cherkis, “Moduli Spaces of Instantons on the Taub-NUT Space,” Commun. Math. Phys. 290, 719-736 (2009).
[17] P.A.M. Dirac, “Quantised Singularities in the Electromagnetic Field,” Proc. of the Royal Society of London, 133, No. 821, 60-72 (1931).
[18] T.T. Wu, C.N. Yang, “Concept of Nonintegrable Phase Factors and Global Formulation of Gauge Fields,” Phys. Rev. D. 12, 3845-3857 (1975).
[19] G. ’t Hooft, “Magnetic Monopoles in Unified Gauge Theories,” Nuclear Physics B. 79, 276-284 (1974).
[20] A.M. Polyakov, “Particle Spectrum in Quantum Field Theory,” JETP Lett. 20, 194 (1974).
[21] M.K. Prasad, C.M. Sommerfield, “Exact Classical Solution for the ’t Hooft Monopole and the Julia-Zee Dyon,” Phys. Rev. Lett. 35, 760-762 (1975).
[22] E.E. Bogomolny, “The Stability of Classical Solutions,” Sov. J. Nucl. Phys. 24, 449 (1976).
[23] N.J. Hitchin, “On the Construction of Monopoles,” Commun. Math. Phys. 89, 145-190 (1983).
[24] N.J. Hitchin, Monopoles, Minimal Surfaces and Algebraic Curves, Sem. de Math. Sup. 105, Les Presses de l’Univ. de Montreal (1987).
[25] J. Hurtubise, M.K. Murray, “On the Construction of Monopoles for the Classical Groups,” Commun. Math. Phys. 122, 35-89 (1989).
[26] M. Jardim, “A Survey on Nahm transform,” J. Geom. Phys. 52, 313-327 (2004).
[27] S.W. Hawking, “Gravitational Instantons,” Phys. Lett. A. 60, 81-83 (1977).
[28] G.W. Gibbons, Gravitational Instantons: A Survey, review talk given at Int. Congress of Mathematical Physics, Lausanne, Switzerland, Aug 20-25, 1979; Lecture Notes in Math. 116, ed. K. Osterwalder (Springer, 1980).
[29] P.B. Kronheimer, “Monopoles and Taub-NUT Metrics,” MSc. Thesis, Oxford (1985).
[30] P.B. Kronheimer, “The Construction of ALE Spaces as hyper-Kähler Quotients,” J. Differential Geom. 29, 665-683 (1989).
[31] P.B. Kronheimer, M. Nakajima, “Yang-Mills Instantons on ALE Gravitational Instantons,” Math. Ann. 288, 263-307 (1990).
[32] A.H. Taub, “Empty space-times admitting a three parameter group of motions,” Annals Math. 53, 472 (1951).
[33] E. Newman, L. Tamburino, T. Unti, “Empty space generalization of the Schwarzschild metric,” J. Math. Phys. 4, 915 (1963).
[34] G.W. Gibbons, P. Rychenkova, “HyperKaehler quotient construction of BPS monopole moduli spaces,” Commun. Math. Phys. 186, 585 (1997).
[35] E. Witten, “Branes, Instantons, And Taub-NUT Spaces,” JHEP 0906 (2009).
[36] E.J. Weinberg, P. Yi, “Magnetic monopole dynamics, supersymmetry, and duality,” Phys. Rept. 438, 65 (2007).
[37] C. Nash, S. Sen, Topology and Geometry for Physicists, Academic Press (1983).
[38] M. Nakahara, Geometry, Topology and Physics, Institute of Physics Publishing (2003).
[39] K. Lee, C. Lu, “SU(2) calorons and magnetic monopoles,” Phys. Rev. D. 58, 025011 (1998).
[40] C.N. Pope, A.L. Yuille, “A Yang-Mills Instanton In Taub-Nut Space,” Phys. Lett. B. 78, 424 (1978); A.L. Yuille, “Yang-Mills Instantons In Selfdual Space-Times,” Phys. Lett. B. 81, 321 (1979).
[41] H. Boutaleb-Joutei, A. Chakrabarti, A. Comtet, “Gauge Field Configurations In Curved Space Times. 4. Selfdual SU(2) Fields In Multicenter Spaces,” Phys. Rev. D. 21, 2280 (1980).
[42] H. Kim, Y. Yoon, “Instanton-meron hybrid in the background of gravitational instantons,” Phys. Rev. D. 63, 125002 (2001); H. Kim, Y. Yoon, “Effects of gravitational instantons on Yang-Mills instanton,” Phys. Lett. B. 495, 169 (2000).
[43] G. Etesi, “Classification of ’t Hooft instantons over multi-centered gravitational instantons,” Nucl. Phys. B. 662, 511 (2003); G. Etesi, T. Hausel, “On Yang-Mills instantons over multi-centered gravitational instantons,” Commun. Math. Phys. 235, 275 (2003); G. Etesi, T. Hausel, “Geometric construction of new Taub-NUT instantons,” Phys. Lett. B. 514, 189 (2001).



Medical Sciences panel


Dr. Declan Patton, UCD (co-chair)
Dr. Eric Igou, UL (co-chair)
Prof. Brian O’Connell, TCD
Prof. Helen Whelton, University Dental School and Hospital, Cork
Prof. Fidelma Dunne, NUIG
Prof. Colin P. Bradley, UCC
Dr. Juliette Hussey, TCD
Prof. Edward J. Johns, UCC
Frederieke van Dongen, UL
Wijnand van Tilburg, UL

Judges’ comments

This is a very well-written essay in style, structure and content. The author describes very clearly how themes and findings relating to evolutionary theory historically fed into research in the areas of ethology and psychology, and how the critical discussions of findings and theoretical approaches in evolutionary psychology prepared the emergence of evolutionary developmental psychology as a novel sub-discipline in research, in particular regarding topics such as ‘attachment’, ‘play’ and the ‘evolution of the brain’. The structure of the essay is very impressive as the topic is presented within a broad perspective without losing sight of the particular arguments. The argumentation itself is very convincing and related clearly to critical discussions in psychology on evolutionary principles on human development. Most importantly, the student clearly recognises the relationship between meta-theories, such as evolutionary theory, and disciplines in research, which speaks for exceptional insights of the undergraduate student into the development of research domains.



Medical Sciences

The impact of evolutionary theory on the history of developmental psychology

Louise Bhandal

“In the distant future... psychology will be based on a new foundation, that of the necessary acquirement of each mental power and capacity by gradation. Light will be thrown on the origin of man and his history.” (Darwin, 1859, pg. 488)

Introduction

From Charles Darwin’s pioneering work in On the Origin of Species to the recently emerging field of evolutionary developmental psychology, evolutionary theory has undoubtedly impacted upon the psychological perspective in a permanent way (Buss, 2009; Geary, 2006; Newman & Newman, 2007). Evolutionary theory compares human beings to the richly varied species that inhabit the animal kingdom, and accounts for lifespan development within the broader context of phylogeny (Miller, 2002; Newman & Newman, 2007). The influence of evolutionary theory on psychology from past to present will be briefly reviewed, examining phenomena such as attachment, mating efforts, play, and intelligence from an evolutionary perspective, and culminating in a broad overview of humans as a highly adaptive, flexible and constantly evolving species.

Ethology and Evolutionary Psychology: Products of Darwinian Theory

The groundwork of evolutionary theory

Darwin’s concept of adaptive evolution is described by Geary (2006) as a two-step process: first phenotypic variation occurs, then natural selection. A species will produce more offspring than are likely to reproduce, given limited resources, predators, etc., in order to produce within-species variation (Newman & Newman, 2007). Essentially, different characteristics thrive in different environments, so the production of a wide range of characteristics helps to ensure a species’ survival within the context of environmental change (Vander Zanden, 1997). A characteristic which promotes survival in a given environment is known as an adaptive mechanism, and it is suggested that such mechanisms evolved to face specific, recurrent problems in the history of the organism (Vander Zanden, 1997). Natural selection operates in such a way that those with adaptive mechanisms will survive and reproduce (‘survival of the fittest’); hence their adaptive mechanisms will be preserved across generations, thus gradually modifying the species as a whole (Buss, 2009).

Ethology

Following Darwin’s early ecological research into functionally significant behaviour as a feature of common ancestry, e.g. the origin of slave-making in a particular lineage of ants, the field of ethology grew (Brooks & McLennan, 2007). Early ethologists such as Charles Whitman and Oskar Heinroth began to experiment with various classes of birds and insects at the beginning of the 20th century, examining overt behaviour as a function of evolutionary history in the organism (Brooks & McLennan, 2007; Tinbergen, 1963). Comparative studies, relating findings from animal behaviour patterns to those of humans within an evolutionary framework, thrived under the influence of the Nobel Prize-winning ethologists Konrad Lorenz and Niko Tinbergen during the 1940s and 1950s (Newman & Newman, 2007; Vander Zanden, 1997). Lorenz and Tinbergen were particularly interested in innate behaviours such as reflexes and fixed action patterns elicited by sign stimuli in the environment (Tinbergen, 1963). In his review of experimental and observational data on a range of animals including fish, reptiles, birds, and insects, Tinbergen (1963) gave a thorough account of how such adaptively significant innate behaviours (e.g. mating efforts and territorial struggles) are triggered. The releasing stimulus in the environment can be visual, auditory, chemical, tactile etc. and is usually very specific; for example, seeing a pile of red breast feathers elicits aggressive behaviour in the red-breasted robin, whereas seeing another robin with a brown spotted breast does not elicit such behaviour (Tinbergen, 1963). One particular phenomenon which Lorenz examined in great detail was imprinting in precocial birds (Newman & Newman, 2007; Scott, 1967; Vander Zanden, 1997). The releasing factor in this case was the first moving object the birds saw, whether it was their mother, a ball, a buggy etc., and this would elicit a strong and irreversible attachment to that object (Scott, 1967).

When examining attachment behaviour in humans and animals, Lorenz (1943, in Newman & Newman, 2007) suggested that babies, puppies, kittens etc. all shared similar ‘cuteness’ features such as a large head relative to the body, large eyes, and pudgy round cheeks. According to Lorenz, these features evolved in order to elicit caregiving behavioural patterns in parents and promote parent-infant attachment. Indeed, Bowlby (1958, in Newman & Newman, 2007) accounted for a complex set of innate infant attachment behaviours such as cooing, grasping, crying etc. which functioned as sign stimuli for parenting behaviour.

Evolutionary Psychology

Evolutionary Psychology (EP) sought to track the origins of human cognition and behaviour as evolved adaptive mechanisms (Miller, 2002; Newman & Newman, 2007; Tooby & Cosmides, 1992). Buss (1995), one of the main proponents of EP, proposed that the field of psychology was a scramble of isolated theories, which could all be integrated under the umbrella of an evolutionary meta-theory. General evolutionary theory can be dissected into smaller theories, which in turn can produce specific, testable hypotheses (Buss, 1995). EP relates various human behaviours to their phylogenetic roots, centred around Darwin’s early concept of the great struggles of life: mating, competing for resources, etc.

Human mating was initially the prime focus of EP, as it was so closely linked to reproduction, the engine of natural selection and evolution (Buss, 2009). Darwin was perplexed by the adaptive function of the peacock’s tail; it was extremely costly to the animal, encouraging predation and therefore compromising survival (Buss, 2007). Darwin proposed the theory of sexual selection (1859, in Buss, 2007), i.e. the evolution of mechanisms which were advantageous to mating, sometimes at the risk of survival. Sexual selection operates via two routes: intrasexual competition (for resources, status, territory, etc.) and intersexual selection (i.e. mating preferences) (Buss, 2007). Trivers’ (1972, in Buss, 1993) theory of parental investment suggested that gender differences in mate selection occurred because the minimum parental investment for males (producing sperm) is lower than that for females (a minimum gestation period of nine months). Trivers proposed that the gender with the higher investment will be more choosy in mate selection, and that the gender with the lower investment will be required to engage in more intrasex competition for the choosy high-investment sex. Buss (1993) used the general theories of sexual selection and parental investment to empirically test 22 specific hypotheses about human mating, concluding that mate preferences differed consistently with gender. These findings were also supported in other studies (e.g. Landolt, Lalumière, & Quinsey, 1993).

Limitations of Evolutionary Theory in Psychology

Criticisms of evolutionary theory in psychology

Some controversy surrounds the validity of evolutionary theory to account for human behaviour. Firstly, it has been argued that large speculative leaps have been made about the lives of our ancestors based upon minimal evidence from very old remains such as fossils, artifacts etc. (Benton, 2000). The problem with such speculative assumptions, derogatively dubbed ‘just-so stories’ by critics, is that they appear to be unfalsifiable constructs of prehistoric life (Gould, 2000). Secondly, evolutionary theorists in psychology have been accused of ‘genetic determinism’, with social and cultural influences being largely ignored (Karmiloff-Smith, 2000). This has led to criticisms that evolutionary theory does not provide a good model for behaviour change (Newman & Newman, 2007). Furthermore, the assumption of a brain consisting of specialised mechanisms evolved to adapt to an ancient past and primed to respond to certain input from the environment disregards the plasticity of the human brain, which has developed advanced capacities for things like abstract thinking, symbolic play, and learning through trial and error (Geary, 2006; Grotuss, Bjorklund, & Csinady, 2007; Pelligrini, Dupuis, & Smith, 2007; Roth & Dicke, 2005; Newman & Newman, 2007). Gould (2000) accused evolutionary psychologists of hyperadaptionism, i.e. the view that every aspect of the organism must have an adaptive function; however, many evolutionary psychologists have acknowledged the existence of noise and by-products (e.g. Buss, 1995), so this criticism is ungrounded. Finally, Gould (2000) argued that evolutionary theory was primarily explanatory, not predictive; however, many evolutionary psychologists have produced testable hypotheses about human behaviour (e.g. Buss, 1993; Landolt et al. 1993).

A New Field Emerges

There are a number of key areas in which Evolutionary Developmental Psychology (EDP) and traditional EP differ. Whereas EP primarily focused on adult adaptive mechanisms, EDP considers the period of early development to be a focal point of its research, as natural selection exerts greater influence on an organism during immaturity and this influence declines rapidly post-reproduction (Grotuss et al. 2007). Prevalence rates of Huntington’s Disease (HD), a neurodegenerative disease which causes death in middle age, are 500 times higher than those of Progeria, a disease which causes death before the child’s second decade, and this provides evidence that natural selection exerts more influence during the pre-reproductive age (Austad, 1997, in Grotuss et al. 2007). Furthermore, a prolonged period of immaturity is seen as adaptive for higher-order organisms to learn the necessary skills to function within their extremely complex social environment (Pelligrini et al. 2007). Evolutionary developmental psychologists view the limited abilities of infants as adaptive mechanisms rather than as incomplete abilities; over-estimation of cognitive ability increases perseverance at difficult tasks, an infant’s poorly developed visual system prevents sensory information from overwhelming the developing brain, etc. (Bjorklund & Greene, 1992, in Pelligrini et al. 2007; Grotuss et al. 2007).

Some of the pitfalls of EP have been tackled by the field of EDP. EDP focuses on the effects of ontogeny (lifespan development) on phylogeny (evolutionary development) as much as the reverse (Grotuss et al. 2007; Pelligrini et al. 2007). This directs study away from the reductionistic view of evolutionary change through a genetic lens (Geary, 2006). Furthermore, the plasticity of human learning has been taken into account by EDP, with the recognition that although some behaviours are quite rigid and domain-specific (e.g. imprinting in birds), having evolved to deal with recurrent and invariant problems during phylogeny, there are also flexible and domain-general mechanisms (e.g. age of menarche) which have evolved to deal with variable aspects of the animal’s environment, and that these two modularities interact to produce unique adaptations for future generations (Grotuss et al. 2007). This is a view which allows room for behaviour change. A practical application of this is highlighted by Bjorklund and Bering (2002), who suggest that the education of children can be significantly improved by an understanding of the adaptive cognitive mechanisms which function at different stages of a child’s development.

Evolutionary Developmental Psychology: Theory and Empirical Research

Grotuss et al. (2007) outlined three distinct types of adaptation: deferred, ontogenetic and conditional. Deferred adaptations are evolved mechanisms which help to prepare the organism for later life, particularly the reproductive stage. Ontogenetic adaptations are characteristics with adaptive functions towards surviving specific aspects of infancy and childhood, and are discarded when no longer needed. Conditional adaptations were defined as mechanisms which evolved to detect features of the childhood environment and direct development in response to this.

Attachment as an Adaptive Mechanism

Attachment is a key area in EDP, functioning as both a deferred and ontogenetic adaptation; Bowlby (1988, in Newman & Newman, 2007) aptly described it as ‘a good insurance policy, whatever our age’ (pg. 27). With regard to the infant’s immediate environment, the benefits of attachment to the primary caregiver have been well-documented. A secure attachment to a parent promotes the infant’s learning and exploration of their world and ensures the protection of the infant during the heavily dependent stage of their lifespan (Ainsworth & Bowlby, 1991; Bowlby, 1988). The principal components of human mother-infant attachment (e.g. maintaining proximity to mother, distress in absence of mother) have been found in Old World monkeys, who are close phylogenetic cousins of humans, indicating a deep evolutionary history of attachment (Maestripieri & Roney, 2006, in Geary, 2006).

Attachment has also been implicated in the formation of adult social bonds in later life, providing the infant with a template for future relationships. The continuity of three different types of infant attachment (secure, avoidant, and anxious-ambivalent) has been demonstrated by a number of longitudinal studies (e.g. Morris, 1982, in Feeney & Noller, 1990). In their experiment, Feeney & Noller (1990) conducted a MANOVA in order to assess the effects of early attachment styles on different aspects of adult love relationships. Significant effects of attachment style were found on 15 of the 16 love and self-esteem scales, and results indicated that securely attached participants fared better in love relationships than did avoidant or anxious-ambivalent participants.

Fraley, Brumbaugh, & Marks (2005) conducted a comparative analysis of numerous animals and primates to determine the social, developmental and morphological correlates of monogamy. Considering that many animals do not develop emotional attachments to their mates in order to reproduce, the evolutionary function of adult love relationships was unclear (Fraley et al. 2005). Based on their data, Fraley and colleagues concluded that when an increase in paternal investment occurred in a species over evolutionary time, this was a prerequisite for the evolution of pair bonding (i.e. monogamous attachment) during mating for future generations. Thus the evolution of a secure father-infant attachment paradigm leads to the evolution of monogamy during reproductive stages of development for many species. Furthermore, as small body size was correlated with pair bonding, it was proposed that pair bonding may increase survival by providing greater protection against predation, to which small animals are more susceptible. Indeed, a longer lifespan was also correlated with pair bonding, indicating its adaptive function for the fitness of the species.

Play as an Adaptive Mechanism

Play is another significant behaviour in EDP which displays both deferred and ontogenetic benefits for the organism. Play can be defined as non-serious, typically exaggerated versions of functional behaviours (social, locomotor and object-oriented), with emphasis on the behaviour itself as opposed to the outcome of the behaviour (Burghardt, 2005, in Pelligrini et al. 2007). Piaget and Vygotsky emphasised the deferred value of play in preparing children for their adult niche in society (Pelligrini et al. 2007). Grotuss et al. (2007) proposed that play helped to prepare males and females for their likely future roles, which are gender specific because of the different evolutionary problems males and females faced in ancestral eras. For instance, boys were found to engage in more rough and tumble play (R & T) than girls to facilitate future intrasex competition (Pelligrini & Smith, 1998, in Grotuss et al. 2007), and this indicates that mating strategies and the processes of sexual selection as outlined by Buss (2007) have their roots early in childhood. Girls were also more likely to engage in care-giving fantasy play than boys (Pelligrini & Bjorklund, 2004, in Grotuss et al. 2007) as preparation for their maternal roles, whereas boys displayed higher participation in object-oriented play than girls (Gredlein & Bjorklund, 2005, in Grotuss et al. 2007), as preparation for use of tools in adulthood, based upon the activities of their hunter-gatherer forefathers.

Play has also been found to have immediate benefits during the juvenile period. Play allows an organism to sample its environment and develop behavioural responses with minimum risk involved, and this produces a wide repertoire of innovative strategies in the infant (Pelligrini et al. 2007). These innovative behaviours can consequently affect gene expression and evolution, according to Pelligrini et al. (2007). As formal schooling is a relatively new phenomenon in human evolution (Bjorklund & Bering, 2002), learning through play was essential for species developing in socially complex worlds and variable environments (Pelligrini et al. 2007). In pre-industrial times and in foraging societies which exist today, learning is smoothly integrated with a child’s daily life, e.g. Botswana girls learning to pound grain initially through play (Bock, 2005, in Pelligrini et al. 2007).

Finally, play acts as a conditional adaptation, directing development according to specific cues in the immediate environment. Play occurs when the organism has an abundance of resources, but levels of play can be reduced or terminated by conditions such as food shortage or extreme changes in environmental temperature (Baldwin & Baldwin, 1976, in Pelligrini et al. 2007). This ties in with Burghardt’s (2005, in Pelligrini et al. 2007) Surplus Resource Theory, whereby juveniles do not display fixed behaviour patterns upon coming into contact with an anticipated trigger in the environment, but use what resources are provided to them, indicating high flexibility. This directly contrasts with the somewhat rigid accounts of innate human behaviour patterns offered by ethology.

The Evolution of the Brain

The human brain is very expensive, making up 2% of the total body mass yet consuming 20% of total energy intake, and being 7-8 times larger than expected (as calculated by the encephalisation quotient), and evolutionary theorists have puzzled over the function of such a costly organ (Roth & Dicke, 2005). According to the Social Brain Hypothesis, a rapid increase in brain size along with an extended period of immaturity have been correlated with the development of intelligence, defined as mental and behavioural flexibility (Roth & Dicke, 2005), which is required to develop complex social skills and respond to variable environments (Geary, 2006; Grotuss et al. 2007). Aspects of imitation, deception, theory of mind, grammatical-syntactical language and consciousness have all been observed in large-brained species such as cetaceans and non-human primates, although not to the extent seen in humans (Roth & Dicke, 2005). It is believed that the unique combination of these characteristics has led to superior intelligence in humans (Roth & Dicke, 2005), perhaps in an autocatalytic fashion (Wilson, 1975, in Newman & Newman, 2007).

Conclusions

Each evolutionary approach towards studying human and animal behaviour has contributed significantly to research within developmental psychology. Darwin fathered the evolutionary meta-theory which continues to generate a plethora of psychological research today; ethology provided an excellent account of those behaviours which appear innate across or within species and spurred comparative research; EP, among other things, gave a thorough account of human mating strategies and reproduction; and EDP has seamlessly integrated evolutionary theory into developmental psychology, including the marriage of ontogeny and phylogeny in affecting evolutionary processes. While the ethological approach and EP emphasise universals among or within species, EDP gives a more idiosyncratic account of how flexible, novel and often unique behaviours are produced and evolve in complex species (Grotuss et al. 2007; Pelligrini et al. 2007). While EP informs us about the nature of natural selection during adulthood (Buss, 1995), EDP emphasises the forces of natural selection during the period of immaturity (Grotuss et al. 2007), and it is suggested that both approaches be combined in future research in order to understand the operations of evolutionary mechanisms across the whole lifespan. It seems that EDP is the most comprehensive approach towards an evolutionary theory of developmental psychology, as it encompasses cognitive and behavioural flexibility as well as predisposed behaviours (Geary, 2006); it has practical applications towards behaviour change (Bjorklund & Bering, 2002); it places emphasis on epigenetic influences on evolution (Geary, 2006); and it gives a richer and fuller account of phenomena such as attachment and play than EP, ethology, or developmental psychology alone (Grotuss et al. 2007; Pelligrini et al. 2007).

It should be noted that while a few key aspects of each approach were discussed, a rich body of literature exists regarding the evolutionary approach to psychology, including important behaviours such as the stress response (Geary, 2006), birdsong and language acquisition (Tinbergen, 1963), family conflict, sexual deception, infanticide etc. (Buss, 2009), as well as by-products and noise (Grotuss et al. 2007), all of which were beyond the scope of this review. The caveat still exists that evolutionary theorists speculate to some degree about the nature of life for our ancestors (Gould, 2000), and it is important not to become too wrapped up in the evolutionary approach. There is a general consensus that evolutionary theory has been highly influential on the history of developmental psychology; not by proffering a definitive or absolute account of human development, but by integrating a myriad of disciplines to provide a broad, overarching framework with which to better understand developmental processes (Geary, 2006; Grotuss et al. 2007; Pelligrini et al. 2007).



Modern Cultural Studies panel

Dr. Marian Fitzgibbon, AIT (chair)
Dr. Eibhlís Farrell, DKIT
Dr. Siun Hanrahan, NCAD
Charlie McCarthy, Award-winning director and author
Helen Doherty, IADT

Judges’ comments

The winning essay recommended itself as overall prize winner in the Modern Cultural Studies category first by virtue of its subject matter, itself innovative; second by its execution which comprised the transmission of new information, even to its expert reader; third, because of the competence and mastery evident in the design and elaboration of its theme; and finally, by the ability of the author to engage the interest of the non-specialist reader. The author took the trouble to research her topic well and the conclusions drawn from the research are intelligent and refined, showing a mature appreciation of the importance of context. Expression is clear and straightforward as are the thought-processes of the writer, resulting in an essay that is convincing and persuasive as well as displaying a stylishness that is rare at undergraduate level.



Modern Cultural Studies

The female presence in the development of electronic and experimental music

Claire Leonard

Introduction

‘The terms technology and music are often marked as male domains, and the trenchancy of associated gendered stereotypes seems to gain force when these fields converge in electronic music.’1 The purpose of this report is to discuss the validity of this ‘gendered stereotype’ by informing the reader of the female presence in the development and current state of electronic and experimental music – a presence that has not been documented to the same degree as the male influence in musical history. The ‘electronic era’ is the first musical period in recent history in which women have not been in some way excluded socially, scientifically, intellectually or academically. The issue of gender equality within musical creation is first voiced in this era. This report is not written from a consciously feminist standpoint. The evidence is that significant contributions to electronic music have occurred and continue to occur under female influence. Yet the degree of female involvement in this field is still relatively sparse in comparison to the female presence in other experimental art forms. Encompassing the span from the conception of the electronic era to contemporary developments, this report will briefly examine one notable female figure present at each of various defining moments in the history of electronic music and, in doing so, will highlight the variety of pathways that have led these significant figures to electronic music. The changing role that education has played in relation to female involvement in this field will be explored, with some thought given to the future.

1 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Duke University Press 43.



Female presence in late classical musical history

The ‘gendered stereotypes’ that exist in music were entrenched in our musical history long before our focus period. Mainstream social politics until the mid-20th century, in popular perception, dictated that a woman fulfil her ‘gendered stereotype’ as wife, mother and carer; it seems to have been relatively unheard of for a female to have enjoyed an independent career in a field as ‘radical’ as music. Our lack of knowledge concerning the few independent female spirits who chose to break the social cycle may be attributed to the lack of documentation of the female composer in the Western classical music tradition. However, there is evidence that such female figures did exist.2 On the somewhat rare occasions when the female presence was documented, this was usually due to an association with a respected male figure in society. The examples of Clara Schumann (wife of Robert Schumann) and Fanny Mendelssohn (sister of Felix Mendelssohn) serve to illustrate this point. These women were composers and performers in their own right but marriage and siblinghood respectively have, it seems, bonded their achievements to the history of noted male figures. That fact has arguably maintained their survival in musical history. In the case of Fanny Mendelssohn, her emerging role as a female composer could indeed be seen to have been actively discouraged. Her father wrote in 1820, “Music will perhaps become his (Felix’s) profession, while for you it can and must be only an ornament.”3 This patriarchal attitude instilled in the social consciousness the idea that the most acceptable role for the female in music was that of ‘ornament’; in other words, music was not to be pursued as a profession or career.

Female presence before 1945

Rockmore and Bigelow Rosen

The earliest experiments in the production of electronic sound were taking place before the end of the nineteenth century. Philip Reis had demonstrated his ‘Reis Telephone’ in 1861, Elisha Gray’s ‘Musical Telegraphs’ dated from 1874 and Thaddeus Cahill’s ambitious plans to build the first electronic music synthesiser came to fruition when he unveiled the ‘Telharmonium’ to the public in New York in 1906. These rudimentary experiments in electronic sound production were overtaken soon after their creation by the development of increasingly sophisticated models. There exists, however, an early electronic instrument that has continued to capture the imagination of artists and composers alike to the present day. In 1920, Lev Sergeyevich Termen (Leon Theremin) developed “one of the most familiar electronic musical instruments to gain widespread acceptance”4 – the Theremin. This instrument utilised a ‘beat frequency’ method to produce the most unusual of sonorities and was played by moving both hands in the vicinity of two antennae; one upright antenna, controlled by the right hand, manipulated pitch, whilst the amplitude of the sound was controlled by placing the left hand near a second, circular antenna.

The ethereal tones of the Theremin were introduced to America in 1927, where they were to be significantly developed by the attentions of a virtuoso violinist and one of the first influential female figures in electronic music, Clara Rockmore. Rockmore was introduced to the Theremin through her acquaintance with its inventor. We can presume that Leon Theremin realised the potential for promoting his unconventional electronic creation as a ‘playable’ instrument by using Rockmore’s skills and talents. Rockmore worked closely with Theremin in refining his instrument designs and in expanding the capabilities of the instrument from the perspective of the performer. The virtuoso’s playing style focused on adaptations of string parts from classical works such as Tchaikovsky’s “Valse Sentimentale”. Rockmore was familiar with such works from her classical musical training. This approach was criticised by composer John Cage, who felt the promise of the Theremin was trivialised by performing such conventional music, stating in 1937, “When Theremin provided an instrument with genuinely new possibilities… although the instrument is capable of a wide variety of sound qualities… Thereminists act as censors, giving the public those sounds they think the public will like.”5

Another female figure, Lucie Bigelow Rosen, can be credited with expanding the repertoire of the instrument from the classically rooted adaptations favoured by Rockmore into entirely new musical territory. Bigelow Rosen, the wife of prominent lawyer, banker and art patron Walter Rosen, made contributions that were paramount to the early prominence and indeed the eventual survival of the Theremin. Under Leon Theremin’s tutelage, Bigelow Rosen came to be one of the most skilled ‘Thereminists’, a position she shared with Rockmore. Demonstrating interest in furthering the new musical possibilities of the Theremin, Bigelow Rosen commissioned several prominent composers to write new works exploring the full capabilities of the instrument. However, Bigelow Rosen was also significantly important from a financial perspective, as she was “one of the first enthusiastic supporters of the art of electronic music.”6 She and her husband acted as Theremin’s chief benefactors whilst he lived in New York, providing him with low rent accommodation wherein Theremin had “several productive years… as he took on commissions to construct a variety of electronic musical instruments.”7

Although Rockmore and Bigelow Rosen played a significant part in earning this early electronic instrument a wider acceptance, we still see evidence of predetermined, gender-specific musical roles within their involvement. Musicologist Susan McClary states that “women have rarely been permitted agency in their art, but instead have been restricted to ‘enacting – upon and through their bodies – the theatrical, musical, cinematic, and dance scenarios concocted by male artists’”.8 Rockmore and Bigelow Rosen essentially continued the ‘performer’ role grounded on earlier patriarchal attitudes. In this new era of music, participation seemed entirely inaccessible to any female other than as a ‘performer’.

2 Sadie, J.A. and Samuel, R. (1995) The Norton/Grove Dictionary of Women Composers. W. W. Norton & Company.
3 Balson, D. (2010) http://www.femininemusique.com/main.php?page=fanny. Date accessed 24/05/10.
4 Holmes, T. (2008) Electronic and Experimental Music: Electronic Music Before 1945. Routledge Press. pp. 19-24.
5 Ibid.
6 Ibid.
7 Ibid.
8 McClary, S. (2002) Feminine Endings: Music, Gender and Sexuality. University of Minnesota Press. pg. 138.



Female presence: post 1945 Barron By the time of the emergence of the next influential female figure, there had been significant changes to the face of electronic and experimental music, but seemingly without clear evidence of female involvement. The development of the magnetic tape recorder had taken place primarily in Germany but with the advent of the approaching World War II, interest in this recording medium waned outside the country of its creation. At the end of the war, differing approaches to electronic music were apparent, somewhat dictated from the aesthetics of different studios in various geographical locations. ‘Nordwestdeutscher Rundfunk’ in Cologne and ‘Groupe de Recherches Mucicales’ in Paris had developed the disciplines of ‘elektronische musik’ and ‘musique concrète’ respectively, whereas, “Electronic music activity in the United States during the early 1950’s was neither organised nor institutional.”9 The musical result of this lack of co-ordination was a diverse range of musical endeavour that avoided anything approaching a ‘school of thought’ in respect of the aesthetics of the tape medium. In the USA, the marriage of Louis and Bebe Barron in 1947 was to lead to a pioneering musical partnership in the field of electronic music. It is reported that the couple acquired their first tape recording equipment through a family connection as a wedding gift10 allowing the musically inclined pair the opportunity to delve into experiments with ‘musique concrète’ whilst magnetic recording tape was unavailable elsewhere in America. The Barrons moved to New York in 1948 and set up a recording studio on their arrival. Thus, they became the owners of the first private electronic music studio in America. Their unique position with the music studio situated in New York, the centre of the post-war American Cultural Revolution, allowed the Barrons to collaborate with innovative composersnotably John Cage. 
Cage’s organisation of the ‘Project for Magnetic Tape’ (1951), in which he explored the medium with his fellow composers, would not have been possible without the use of the Barrons’ private studio and technical assistance. This led to compositions such as “Imaginary Landscape No. 5”. Cage depended on the combined talents of the Barrons when he undertook the ambitious “Williams Mix” project, commissioning the couple to record between 500 and 600 field recordings that had to be spliced together in an unusual fashion: “It was a tremendous editing job.”11 Bebe Barron recounts her role in the creation of the piece, in which this laborious technique was a major compositional element: “There were days’ worth of tape [recorded for the piece]… Louis, he wasn’t interested in going through the raw tape. So it had to be me or no one.”12

By 1954, the Barrons had earned a reputation for their own musical endeavours as “important providers of electronic music and sound effects for film”;13 however, they no longer held a ‘monopoly’ in private studios. There now existed competition from other independently owned New York studios that had appeared in the years following the increasing availability of electronic recording equipment. This encouraged the Barrons to focus on their own compositions. The soundtrack to Forbidden Planet (1956) is amongst the Barrons’ most celebrated works, recognised as the first entirely electronic score for film. This score was generated using sounds created from the homemade circuits the couple would construct in their New York studio. These sounds, often consisting of the chaotic noise of dying circuitry, found no equivalent in the commercial music studios of Hollywood. This couple serve as a model to demonstrate the supposed differing approaches to music as influenced by gender. Louis Barron utilised his technical skills in circuitry design whilst Bebe “did much of the composition and production”,14 though they appeared to be proficient in both areas. It seems it was the melding of Louis’s vital technical skills in circuit design with Bebe’s compositional flair that made for such a successful partnership. Bebe Barron’s position permitted unrestricted access to a private electronic music studio in which her learning was self-directed. She also benefited from her associations with respected male composers, a highly important factor in developing her recognition as a pioneering figure in electronic music. It seems extraordinarily unlikely that she would have gained acceptance in this field of early electronic music as a solo female composer without such technical and personal associations.

9 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge Press. pg. 80.
10 Stone, S. (2005) The Barrons: Forgotten Pioneers of Electronic Music. Text and Audio Broadcast. NPR, Morning Edition, February 7, 2005. http://www.npr.org/templates/story/story.php?storyId=4486840. Date accessed 12/05/10.
11 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge Press. pg. 85.

Female presence: tape and early analog synthesiser compositions

Oliveros

In 1966, the magnetic tape studio still represented the leading edge in electronic music technology, but this particular year also marked the beginning of the interest in analog music synthesisers.15 Pauline Oliveros was one of the pioneering figures in the post-war electronic and experimental music scene and it is notable that she appears to be one of the first female figures who demonstrated complete musical independence in early electronic music history – “I was very determined that what I wanted to do was to compose music, so I just did it.”16 After obtaining music degrees from the University of Houston and San Francisco State University, Oliveros became a founding member of the ‘San Francisco Tape Music Centre’ (SFTMC) in 1961, which served as a focus for electronic music interest on the West Coast of America. The SFTMC existed as a private studio until 1966; in 1967 it became affiliated with the ‘Mills Centre for Contemporary Music’, where Oliveros became the first director.17 Oliveros used tape echo as a structural process behind her pioneering works of electronic music. “I of IV” (1966) is an example of an early recording that makes extensive use of accumulative tape delay systems, resulting in the degeneration of the repeating signal in real-time performance. It would appear that Oliveros was greatly inspired by her two-month study of circuitry design under Hugh Le Caine at the University of Toronto in the summer of 1966. Speaking of the importance of this academic institution to the creation of this work, Oliveros stated, “The techniques that I had invented for myself were very well supported by the studio setup at the University of Toronto.”18 “I of IV” was created at the University of Toronto Electronic Music Studio using a tape loop system developed by Hugh Le Caine.19 “Beautiful Soop” (1967) made use of multiple tape echo signals and incorporated the sounds of the relatively new ‘Buchla’ synthesiser that was invented and demonstrated in 1965 at the SFTMC. These two early tape compositions were the result of Oliveros’ single-minded determination to establish a network of colleagues through her associates and form a studio and collective that would lend itself to her pursuit of electronic musical creation.

12 Stone, S. (2005) The Barrons: Forgotten Pioneers of Electronic Music. Text and Audio Broadcast. NPR, Morning Edition, February 7, 2005. http://www.npr.org/templates/story/story.php?storyId=4486840. Date accessed 12/05/10.
13 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge Press. pg. 86.
14 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge Press. pg. 87.
15 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge Press. pg. 120.
It was not until the advent of the first academic electronic studio that female participation elsewhere in the development of electronic music became possible through the accessibility of electronic musical equipment, not widely available at the time.

Female presence: analogue synthesiser compositions

Carlos

The opening of the Columbia-Princeton Electronic Music Centre (CPEMC) in 1958 had been the first collaborative academic effort focusing on the creation of electronic music. This development was key in the encouragement of the female figure in the electronic music field of North America, accounting for much of the pioneering efforts of women as composers.20 Until this point, commercial studios or broadcasting establishments were the centre of electronic musical endeavour. These environments appear to have been quite alienating to any female simply interested in exploring the compositional possibilities of the electronic medium. The CPEMC provided access to an array of electronic equipment in a nurturing environment in which it was possible to be educated in the art of electronic music, regardless of gender. Composers Alice Shields, Pril Smiley and Pauline Oliveros were the earliest female practitioners in this studio. Alice Shields credits founder Vladimir Ussachevsky with “encouraging women composers to work at the studio”.21 As the founder of this centre along with Otto Luening, Ussachevsky encouraged artists to experiment with electronic music,22 creating a harmonious learning environment for any female composer interested in finding a new means of expression. One such composer was Wendy Carlos. As a child, Carlos had exhibited a clear interest in science, winning a Science Fair scholarship for a home-built computer.23 Carlos was exposed to the field of electronic music through her M.A. studies in composition at Columbia University under the guidance of Luening and Ussachevsky. It was Ussachevsky who suggested to Carlos that she attend an Audio Engineering Society Conference in New York in 1964, as she was “one of his more technically curious graduate students.”24 There, Carlos became acquainted with Robert Moog and his early voltage-controlled modules. Leaving academia in 1966 with the intention of becoming a recording engineer, she became one of Moog’s first customers by ordering a voltage-controlled module system built to her specific requirements for her home studio.

16 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Pauline Oliveros. Duke University Press. pg. 32.
17 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Pauline Oliveros. Duke University Press. pg. 30.
18 Holmes, T. (2008) Electronic and Experimental Music: Early Synthesisers and Experimenters. Routledge Press. pg. 168.
19 Ibid.
On this system, working over a period of many months, Carlos composed “Switched-On Bach”, a work that was to change the mainstream perception of electronic music. Columbia Records released “Switched-On Bach” in 1968. This album was one of the “… first to attempt a truly musical treatment of classical music using synthesised sounds”25 and it was a major achievement in proving that electronic music did not dwell solely in the realm of the experimental. The work became the top-selling classical album at the time, selling more than one million copies. The interest generated by this record was “responsible for the burgeoning use of synthesisers in all music genres.”26 Carlos continued to refine her Moog synthesiser technique in albums such as the “Well-Tempered Synthesiser” (1969), “Switched-On Bach II” (1974), “By Request” (1975) and “Switched-On Brandenburgs Volumes 1 & 2” (1979), after which she converted to digital instruments. Carlos’s works would continue to cover a vast musical territory beyond her groundbreaking work in analog synthesiser composition, but it is these seminal works in particular which demonstrated her earliest technical capabilities and innovation.

20 Hinkle-Turner, E. (2006) Women Composers and Musical Technology in the United States. Ashgate Publishing Limited. pg. 16.
21 Holmes, T. (2008) Electronic and Experimental Music: Early Synthesisers and Experimenters. Routledge Press. pg. 155.
22 http://www.furious.com/perfect/ohm/columbiaprinceton.html. Date accessed 23/05/10.
23 Carlos, W. (2010) http://www.wendycarlos.com/biography. Date accessed 12/05/10.
24 Carlos, W. (2010) http://www.wendycarlos.com/moog. Date accessed 12/05/10.
25 Holmes, T. (2008) Electronic and Experimental Music: Early Synthesisers and Experimenters. Routledge.
26 Holmes, T. (2008) Electronic and Experimental Music: The Voltage Controlled Synthesiser. Routledge.

Female presence: early computer music and the microprocessor revolution

Spiegel

Laurie Spiegel is one of few females who appear to have been involved with the actual creation of musical programming languages in addition to their use in her own compositions. In tracing Spiegel’s past we can determine that, as with Oliveros and Carlos, it was Spiegel’s own determination and ‘technical mindedness’ that allowed her to succeed in the field of electronic music. As a child, Spiegel’s musical interest was largely self-directed: she learned to play instruments such as the mandolin, guitar and banjo by ear.27 By the age of twenty, Spiegel had taught herself Western musical notation, which allowed her to notate her own compositions. Spiegel received a degree in social sciences from Oxford University, after which she returned to renaissance musical studies in the Juilliard School and the private study of composition in London.28 In 1969 she was introduced to the Buchla synthesiser and began to work with the instrument. Looking back, Spiegel states, “Ironically, my need for greater control, complexity, replicability, subtlety and precision [rediscovered through the Buchla synth] led me within just a few years to an even less direct means of composition: the writing of computer code to describe my own musical decision-making, and by use of logic to attempt to enhance the musical power of any individual through new instrument creation.”29 Spiegel worked at Bell Labs as a software engineer from 1973 to 1979, alongside Max Mathews, who had successfully demonstrated his first musical programming language MUSIC 1 in 1957. Whilst at Bell Labs, Spiegel wrote computer programmes to operate GROOVE, a microcomputer-based real-time synthesis programme developed by Mathews.
Spiegel’s compositions “Appalachian Grove” (1974) and “The Expanding Universe” (1975) emerged from the GROOVE studio at Bell Labs, “near the end of the era of musical software applications for general purpose mainframe computers”,30 and GROOVE was eclipsed by new technology. Spiegel left Bell Labs to take a position as consultant on microcomputer-based products in the dual role of both engineer and composer. One of her most successful ventures was the music programme Music Mouse, released in 1985, which enabled music making with an ‘intelligent instrument’31 in an accessible user environment rather than a programming environment. It was these types of innovative software developments that became accessible to the public through the revolution of the microprocessor and with it the birth of the home computer. This enabled electronic music making outside of the studio environment, leading to a new era in the creation of electronic music and greatly affecting the future performance potential for our next female figure, who contrasts completely with Spiegel in electronic music background.

27 Spiegel, L. (2010) http://retiary.org/ls. Date accessed 17/05/10.
28 Ibid.
29 Ibid.
30 Holmes, T. (2008) Electronic and Experimental Music: Pioneering Works of Electronic Music. Routledge.
31 ‘Atari Midi World’ (2010) http://tamw.atari-users.net/mmouse.htm. Date Accessed 20/05/10.

Female presence: live electronic and ambient music

Mori

Ikue Mori is perhaps the embodiment of an increasingly common female figure in electronic music: one who has entered the field of music production without academic training or associations with an established academic institution. In contrast to Spiegel, Mori became involved in the New York scene of experimental music in 1977 after an impromptu holiday from her native Japan. Mori joined influential ‘No Wave’ band DNA as drummer with members Arto Lindsay and Tim Wright. With no previous musical training, Mori immersed herself immediately in the spirit of the experimental culture of the New York scene – “Within six months I went from never being a musician to playing [infamous New York City venue] CBGB’s!”32 After the disbandment of the group in 1982, Mori was introduced to drum machines: “Somebody gave me a small Roland TR-707 drum machine… I just fell in love with it. Then I got more sophisticated drum machines you can do a little more with… I moved from one drum machine to two drum machines… I used the laptop starting in 2000.”33 Autodidacticism allowed Mori to discover and develop her compositional voice, at her own pace, through the unorthodox possibilities of the drum machine, using drum presets to create signature broken beats and rhythms. The composer experimented with various live arrangements of drum machines before progressing to a laptop in live performances, using the music programme Max/MSP, which she describes as “… continuous from the set up with drum machines… a liberation really… But with the laptop you can do more processing and manipulation… expanding the vocabulary to more sounds.”34 This ‘vocabulary of sounds’ creates a unique voice in Mori’s live improvised work. These techniques are similarly demonstrated in her commercial releases, most recently “Class Insecta” (2009).
Fellow female performer Zeena Parkins believes it is this ‘personality of sound’ that creates such interest in Mori: “… when you hear Ikue on drum machines. Her sound is pretty much unmistakeably her sound… it’s a really great thing to have such personalised sounds to work with.”35

32 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 74.
33 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 76.
34 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 77.
35 Holmes, T. (2008) Electronic and Experimental Music: Live Electronic Music and Ambient Music. Routledge. pg. 382.



Female presence: in contemporary electronic music

Rosenfeld

Contemporary electronic music practices are illuminated through the stories of women artists of different generations and cultural backgrounds,36 but we might look to Marina Rosenfeld to best represent a new generation of female presence in the contemporary development of electronic and experimental music and to note how academic education as well as contemporary culture has shaped her music. Rosenfeld embodies the spirit of the modern schooled composer who has embraced the collaboration between contemporary ‘academic’ and ‘non-academic’ approaches to musical creation. Rosenfeld studied at the California Institute of the Arts under the tutelage of Mel Powell.37 Whilst a student, Rosenfeld created the Sheer Frost Orchestra, “a musical performance realised by seventeen women on floor-bound electric guitars, deploying nail-polish bottles as sensitive and magical sound-producing implements”.38 The strictly non-academic approach of ‘turntablism’ is another of the many outlets for Rosenfeld’s work. “Fragment Opera” (2001) is a work of experimental turntablism in which a set of acetate discs that the composer has created herself are used as a palette of sound for live performance following Rosenfeld’s instructions. This work also exhibits Rosenfeld’s fascination with the physicality of this type of performance – “I like the fact that the turntable is mechanical… like the way a piano is mechanical. I was a pianist first and still feel like my hands have to make the music on some level… from the point of view of performance… the idea is to expose the music and not conceal it, or your means of production.”39 Rosenfeld embraces both ‘low’ culture and the ‘high’ art appreciation she has received through academic education.
This can be illustrated by the audio installation “Teenage Lontano” (2008), a performance of Ligeti’s “Lontano” (1967) sung by teenagers listening to their iPods within a performance of Rosenfeld’s music. This melds the influence of the work of a European master of electronic music with fragments of disposable contemporary pop music. We can assume that Rosenfeld is aware of the lineage of appropriation through her academic education. Appropriation existed in both classical and electronic music long before the contemporary fascination with it. Bartók’s “Concerto for Orchestra” intermixes a theme parodying and ridiculing the ‘march tune’ in Shostakovich’s “Leningrad Symphony No. 7”, and the 3rd Movement of Luciano Berio’s “Sinfonia” (1968) appropriates a late Romantic period waltz from Mahler’s “2nd Symphony”. Rosenfeld’s appropriation, however, is interesting in its display of how the two worlds of ‘low’ and ‘high’ art may collide in contemporary music. For this particular female, classical training and an academic grounding remain central to her creative output.

36 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Duke University Press. pg. 9.
37 Holmes, T. (2008) Electronic and Experimental Music: Live Electronic Music and Ambient Music. Routledge. pg. 426.
38 Rosenfeld, M. (2010) http://www.marinarosenfeld.com/home.htm. Date accessed 20/05/10.
39 Holmes, T. (2008) Electronic and Experimental Music: Live Electronic Music and Ambient Music. Routledge. pg. 426.



Female presence: Is academia as vital to female presence in electronic music as it once was?

There are a multitude of factors as to why women were, and indeed still are, not as involved with electronic music as their male counterparts. These factors touch upon subjects as diverse as human psychology and the stereotypical gender roles that have been instilled in public consciousness since time immemorial. As these factors constitute topical reports in their own right, the focus of this report will remain specifically on the role of the education system in the encouragement of the female figure in electronic music. We have illustrated some of the backgrounds and early endeavours of various female participants who constitute key contributors to electronic music throughout its history. These persons come from a variety of academic and self-directed approaches and encompass varying degrees of technical proficiency and, in some cases, a self-confessed lack of technical interest. The importance of academic institutions in the early education of the female electronic music composer must not be underestimated. To refer back to the institutions mentioned earlier in this report, Mills College and the Columbia-Princeton Electronic Music Centre served a vital function as the earliest academically centred locations40 to enable and foster creativity in electronic music production. These nurturing environments attracted female composers, who were readily granted access to the generally inaccessible world of electronic music for the first time. Until this point the only means of access to electronic equipment was through the commercial or private electronic studio, largely inaccessible to women unless in the privileged position of Bebe Barron. There was a clear contrast between the welcoming environment of the academic world and the discouragement emanating from the patriarchal attitudes present in European studios such as the RTF.
Thus, it is indeed understandable that women felt inhibited from pursuing electronic music creation before the accessibility of suitable academic institutions. A little-known figure, Eliane Radigue studied electroacoustic musical techniques in the late 1950s at the Studio d’Essai at the RTF under the direction of Pierre Schaeffer and Pierre Henry. She describes how her compositional aesthetics “… were not exactly the interest of Pierre Schaeffer and Pierre Henry… This is why I had to quit at the time because… these two men were completely angry at me for what I was doing.”41 Radigue’s interest was in building slowly evolving sonic forms, seemingly as different from the discipline of musique concrète as could be imagined – “Pierre Henry, the first time he listened to one of my things, he got really mad, he was almost insulting me. Like he was expecting me [to be] like a follower of his own way.”42 Academic institutions were vital in allowing accessibility to rare and expensive electronic equipment and in educating composers as to how to bridge the technical gap that was required to actually create music on these innovative and complex electronic technologies. A ‘technical mindedness’ was indeed a prerequisite in order to create electronic music, as documented in the approaches of Pauline Oliveros and Wendy Carlos. These two female figures emerged as two of the most successful female composers from the era of the first academic institutions, adapting new technologies to create visionary work. With the development of computer programming languages, represented by ‘technically minded’ women such as Laurie Spiegel and Carla Scaletti, computer music brought the revolution of the microprocessor that eventually enabled home computing by 1977 and the development of seminal music making programmes. This could indeed be thought of as the second landmark in the evolution of female involvement in electronic music. For composers such as Ikue Mori, this technology would enable a freedom of creation that was not bound to technical ability nor linked to the world of academia. In contemporary music there exists a ‘cross-pollination’ of sorts between academic and popular forms of electronic music. Categorisation of musical genre is often a fruitless process as female composers move freely between musical roles and production and composition methods. Many contemporary female composers of electronic and experimental music, it seems, have been inspired by the legacy of the underground feminist punk aesthetic of the early 1990s – the ‘Riot Grrrl’ movement43 – as close to the polar opposite of academia as can be imagined. With the increasing availability of musical technologies and music making programmes to the music enthusiast outside of an academic context, there is no longer an emphasis on discipline within ‘schools of thought’ in musical creation. The question emerges as to why females are embracing non-academic approaches to electronic music creation over learning in an academic environment.

40 Holmes, T. (2008) Electronic and Experimental Music: Early Electronic Music in the United States. Routledge. pg. 98.
41 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Eliane Radigue. Duke University Press. pg. 59.
42 Ibid.

Female presence: encouragement in education

Pauline Oliveros is well-placed to judge the relative lack of female participation in electronic musical creation throughout her involvement in this field. She has written extensively on the issue of the status of women composers and roles of gender in musical expression. Her opinion is that a major cause of discouragement in women stems from formal education: “Especially in traditional establishment music, people are educated to the work of the European masters – who are all men. As long as they’re educated in that, that’s what they’re going to elevate… the canon is so entrenched in all the educational institutions. It means music has to be taught differently; it has to be inclusive. If that doesn’t happen then change is not going to take place.”44 Oliveros believes this ‘traditional establishment’ in education creates the illusion within the female musical community that there is no place in electronic music for them: “You see all male programmes, performances, you see all male faculties, music by men… you don’t see any place for yourself.”45 Concerning female encouragement towards working with the tools of technology, Oliveros states, “They [females] just haven’t been encouraged… they haven’t been supported. That just continues, that boys are more supported to do tech-y stuff than girls. And girls quickly learn to retrain themselves from being interested in things like that.”46 Rodgers is of similar opinion that women are “not encouraged to interact with certain kinds of technology the same way that men are.”47 Susan McClary further supports this view – “It is supposed to be Man who gives birth to and tames the Machine. Women in this culture are discouraged from even learning about technology… To the extent that Women and Machines both occupy positions opposite that of Man in standard dichotomies, women and machines are incompatible terms.”48 If this was and indeed continues to be the case, the counterargument might be that this attitude did not appear to have discouraged the compositional efforts of Oliveros, Carlos or Spiegel who, as previously mentioned, demonstrated a great degree of technological astuteness. In any case, this ‘lack of encouragement’ would have adversely affected budding female composers from the respective eras of these women to a greater degree than any contemporary female composer equipped with a ‘MacBook’ laptop and ‘Max/MSP’ software. The current direction of electronic music may indicate that a proficient technical knowledge is no longer a prerequisite to effective operation of modern musical technology.

43 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Eliane Radigue. Duke University Press. pg. 3.
44 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Pauline Oliveros. Duke University Press. pg. 31.
As noted by Kim Cascone, “Most of the tools being used today have a layer of abstraction that enables artists to explore without demanding excessive technical knowledge”.49 The depth of technical proficiency that was previously required to create tape systems in the manner of Oliveros, build additional components for a Moog synthesiser in the case of Carlos, or create a computing language as demonstrated by Laurie Spiegel, is no longer necessary in the creation of electronic music. To illustrate this point, contemporary female electronic musician Ikue Mori expresses how she is far from being technically proficient: “My interest in making songs is not in technical terms at all”.50 Yet the creation of user-generated content from modern musical software and indeed hardware is so accessible that Mori can not only create groundbreaking music but remain one of the most respected live electronic female performers, without any academic training in digital signal processing.

45 Ibid.
46 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Pauline Oliveros. Duke University Press. pg. 32.
47 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 78.
48 McClary, S. (2002) Feminine Endings: Music, Gender and Sexuality: This Is Not A Story My People Tell. University of Minnesota Press. pg. 138.
49 Cascone, K. (2002) THE AESTHETICS OF FAILURE: ‘Post-Digital’ Tendencies in Contemporary Computer Music, Computer Music Journal 24:4 Winter 2002 (MIT Press).
50 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 78.



Female presence: The importance of ‘community’

A study by Thom, Pickering, and Thompson found that “… young women desire a career with camaraderie, support… characteristics they perceive can’t be found in technical jobs.”51 It would appear then that a sense of ‘community’ amongst female composers is crucial in supporting further involvement. Ikue Mori notes that, despite a lack of women from a global perspective, these communities can be found in particular geographical locations: “New York is great… [it] probably has the most women musicians I can find. And they come here too, because they find it’s really comfortable for women working.”52 These ‘communities’ are the support mechanism for female composers of electronic music who have fostered a ‘non-academic’ approach to electronic music creation. However, such is the nature of collaboration between ‘academic’ and ‘non-academic’ approaches that schooled composers mingle with self-taught drum machine enthusiasts; Marina Rosenfeld collaborates with Ikue Mori. These ‘communities’, it could be said, also raise awareness of female contribution in electronic music, thus potentially influencing a new generation of female composers who previously, in the opinion of Oliveros, ‘saw no future’ in this particular discipline. This awareness is furthered by projects that focus on female contribution to electronic and experimental music. Several such efforts have been organised by the London-based contemporary art agency Electra. The ambition of the ‘Her Noise’ exhibition in the South London Gallery in 2005 was to investigate music histories in relation to gender and to bring together a wide network of women artists who use sound as a medium.53 The project featured a diverse representation of female composers, from ‘academic’ schooled artists such as Marina Rosenfeld and electronics improviser Kaffe Matthews to Kim Gordon of New York rock band Sonic Youth.
Curators Anne Hilde Neset and Lina Dzuverovic began the project after remarking upon a distinct lack of representation of female figures in electronic and experimental music from their positions at ‘The Wire’ magazine and the ‘Lux’ arts agency, respectively: “This is a history that is hidden in some ways, so we wanted to make it known.”54

Another ‘community’ that is an influential support mechanism for female electronic musicians is a network that is unrestricted by geographical location – “It is not academically-based, and for the most part the composers involved are self-taught”.55 Kim Cascone argues that “A non-academic composer can search the Internet for tutorials and papers on any given aspect of computer music to obtain a good, basic understanding of it.”56 Aside from the availability of online tutorials that encourage self-directed study of electronic music creation, the online community allows a network of female composers to share an interest and discuss their work and approaches. Tara Rodgers founded the website pinknoises.com to promote the work of women in the field of electronic and experimental music, as well as to provide resources on production that are “worded so as to be more accessible to women.”57 The argument may be made that in distinguishing between the creative output of male and female electronic composers, as Rodgers does, any gulf between the genders in electronic and experimental music is further widened. Such arguments concern the cultural politics of the time more than the music itself.

51 Kossuth, J and Leger-Hornby, T. (2004) Attracting Women to Technical Professions. Educause Quarterly Number 3, 2004.
52 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Ikue Mori. Duke University Press. pg. 78.
53 Dzuverovic, L. (2010) http://www.electraproductions.com/projects/2005/her_noise/overviews.html. Date accessed 22/05/10.
54 Hedditch, Emma and Electra. (Directors) (2005) Her Noise – The Making of (Documentary).
55 Cascone, K. (2002) THE AESTHETICS OF FAILURE: ‘Post-Digital’ Tendencies in Contemporary Computer Music, Computer Music Journal 24:4 Winter 2002 (MIT Press).

Future female presence in the development of electronic and experimental music

Academic institutions remain fundamentally important in the provision of an education in the sonic arts. No person, male or female, should be deterred from pursuing an academic education. It is clear that an academic environment provides opportunities to any student that would not be possible outside the academic institution. The Sonic Arts Research Centre at Queen’s University Belfast is an excellent example, with truly leading-edge equipment and academic facilities. However, the critical function that was previously served only by centres of academic research, in bringing together scarce resources, technicians and creative musicians, is no longer restricted to the academic environment, on account of social and technological evolution. If some women can see a preferred future for themselves in electronic music creation through the support of non-academic ‘communities’, they might be inclined to follow that course rather than enter a field in which they presume men have a “vastly superior technical knowledge”58, which still appears to be a perception.

Whilst some increased emphasis upon the role and function of women in music technology is to be welcomed, Pauline Oliveros was only too well aware of the difficulty: “Trying to change consciousness and trying to change things… it’s not easy… because you run up against the canon of Bach, Beethoven, Mozart, Brahms. Millions of people are educated to that. So there is a very strong force field.”59 Possibly the academic acceptance of a cultural exchange between non-academic thinkers and research centres, as proposed by Kim Cascone, would be of value – “In order to help better understand current trends in electronic music, the researchers in academic centres must keep abreast of these [non-academic] trends.”60 This would encourage and document the growing female participation in non-academically centred electronic music.

The endeavours and work of several generations of women in the field of electronic and experimental music are interesting and significant. It appears that the future of female involvement in electronic and experimental music will be built upon fostering a nurturing creative environment in which female creative figures may find a sense of identity within a musical ‘community’, where there is a freedom of approach and where technical excellence and creativity flourish in equal measure. That might occur both within and outside academic centres. However, with funding and resources available to promote suitable projects, it is more probable that the most notable technical innovation will come from academic centres of excellence.

56 Ibid.
57 Rodgers, T. (2010) http://www.pinknoises.com/pn_home.html. Date accessed 12/05/10.
58 Kossuth, J and Leger-Hornby, T. (2004) Attracting Women to Technical Professions. Educause Quarterly Number 3, 2004.
59 Rodgers, T. (2010) Pink Noises: Women on Electronic Music and Sound. Pauline Oliveros. Duke University Press. pg. 32.

60 Cascone, K. (2002) THE AESTHETICS OF FAILURE: ‘Post-Digital’ Tendencies in Contemporary Computer Music, Computer Music Journal 24:4 Winter 2002 (MIT Press).



Nursing & Midwifery panel


Prof. Kathleen Murphy, NUIG (chair)
Dr. Eileen Savage, UCC
Kathleen Frazer, UCD
Loretta Crawley, UCD
Mary Kirwan, DCU
Orla McAlinden, QUB

Judges’ comments

From the outset, the reader is challenged to consider the power of labelling regarding mental illness and the inherent complexities in resolving associated legal, professional and ethical issues. Legal issues are addressed from a historical perspective and assumptions underlying legal statements are challenged using a critical and questioning stance. The essay flows well into professional and ethical issues and the capacity to integrate arguments around these issues is evident. In presenting arguments on ethical issues, complex concepts around principles are handled well and cogently. Independent arguments are presented, the style is fluent overall, and relevant literature is used given the context and focus of the essay. Overall, this excellent essay is interesting to read, and the level of analysis and critical comment makes it an outstanding piece of undergraduate work.



Nursing & Midwifery

Critically evaluate the legal, ethical and professional issues associated with defining mental illness

Carmel Penrose

Introduction

The power to apply a label is something that should not be exercised lightly, particularly when that label is viewed negatively and pejoratively, as is the term ‘mental illness’. The application of such a diagnosis can have huge implications for the person at the receiving end of the process. However, as shall be explored in this paper, the application of a mental illness diagnosis is often fraught with subjectivity and is thus far from being a straightforward medical exercise. In fact, it can present a number of legal, ethical and professional issues that are not easily resolved. This paper begins with an exploration of the legal issues before moving on to the professional and then the ethical issues, whilst acknowledging that these divisions should not be taken to imply that the issues are separate and distinct. There is a huge degree of overlap between all three elements.

The Law Society’s Law Reform Committee report on mental health (1997) defines mental illness as: “A state of mind which affects a person’s thinking, perceiving, emotion or judgment to the extent that he or she requires care or medical treatment in his or her own interests or the interests of other persons.” (Law Society 1999 section 2.1.1). The report provides a succinct introduction to the role and purpose of legislation as it pertains to mental illness, stating that:



“The primary objective of mental health legislation must be to provide a framework within which decisions can lawfully be taken on behalf of those who are unable to take decisions for themselves or are unable to communicate their decisions.” (Law Society 1999 section 1). However, as is briefly outlined in the next section, the wellbeing of people diagnosed with a mental illness has not always been a primary concern.

With respect to a historical legal analysis, Walsh and Daly (2004 pg. 14) characterise the eighteenth century response in Ireland to those defined mentally ill as being ‘sporadic and uncoordinated’. Interestingly, it was not a piece of medical legislation but rather ‘The Prisons Act’ of 1787 that gave powers for the establishment of lunatic wards in the Houses of Industry (Walsh and Daly 2004). Subsequent acts in 1817 and 1821 provided for the establishment of specific ‘lunatic asylums’ with an approach that was certainly stark and foreboding, as this quote from Jones (1960, cited in Walsh and Daly 2004 pg. 16) indicates: “Insanity is quite different from physical illness, and quite unlike normal behaviour. It is generally caused by poor heredity, or by drink, or possibly by starvation. Insane people should be sent to asylums, and most of them will have to stay there for life.” The Special Report from the Inspectors of Lunatics to the Chief Secretary (1894) indicates that the number of ‘lunatics’ and ‘idiots’ (the terms used at that time) at large, in asylums, prisons and workhouses rose from 5,074 and 9,980 respectively in 1851 to 14,945 and 21,188 by 1891 (Walsh and Daly 2004 pg. 20). This raises the question: were there more ‘idiots and lunatics’ as the century wore on, or was there simply a greater level of social control and a greater application of laws and diagnosis?

The early part of the twentieth century brought little in the way of a more humane application of the legal system to mental health issues. Although the United Kingdom’s 1913 Mental Deficiency Act did not extend to Ireland, its classifications and categorisations of behaviour were to have an impact on the way that mental illness was conceived and defined. Of particular note was the category ‘Mentally Defective’, which was defined as people who, from an early age, displayed some permanent mental defect coupled with strong vicious or criminal propensities on which punishment had little or no effect (Dale 2003). From the perspective of the 21st century it may seem somewhat strange, but unmarried mothers also became absorbed into this category (Dale 2003). In terms of providing fairer access to hospital that did not rely on one’s economic resources, of particular note was the Mental Treatment Act of 1945, which allowed for the admission of patients to public hospitals as voluntary patients. Before this legislation there had been provision only for the admission of voluntary boarders to private hospitals (Walsh and Daly 2004).

The Mental Health Act 2001 (Government of Ireland 2001) replaced the Mental Treatment Acts 1945-61 (Government of Ireland 1945 and 1961), which had provided the statutory framework for the detention of people with mental illness and the administration of psychiatric services for over 50 years. The 2001 Act was introduced on a phased basis to allow for the necessary preparatory work to be undertaken. In March 2002 sections 1 to 5, 7, and 31 to 55 were commenced with effect from the 5th April 2002 (referred to as ‘Establishment Day’).



This phase allowed for the establishment of the Mental Health Commission and the Inspector of Mental Health Services, which was intended to replace the Inspector of Mental Hospitals. The remaining provisions of the Mental Health Act 2001 (Parts 2, 4, 5 and 6) commenced on the 1st November 2006. These parts provided for the involuntary admission of persons to approved centres, replacing the provisions of the Mental Treatment Acts 1945 to 1961, and provided for an independent review of detention, consent to treatment, registration of approved centres by the Mental Health Commission and other miscellaneous provisions (Department of Health and Children (DOHC) 2007). It is important to note that the name of the 2001 Act itself denotes a conceptual shift, choosing the far more inclusive term ‘health’ as opposed to the previous Acts (1945 and 1961), which were premised on ‘treatment’; a small point, it could be argued, but significant nonetheless.

The 2001 Act (Section 3.2.1) defines mental illness as: “A state of mind of a person which affects the person’s thinking, perceiving, emotion or judgment and which seriously impairs the mental function of the person to the extent that he or she requires care or medical treatment in his or her own interest or in the interest of other persons.” As can be seen from the above definition, the Act implicitly acknowledges the ethical debate relating to the rights of the person to treatment and the necessity to impose treatment. In terms of providing a greater level of redress and patient/client empowerment, the Mental Health Act (2001) represented a quantum leap in rebalancing the power differentials between those with the power to diagnose (the doctors), a power that Foucault (2006) characterises as ‘the gaze’, and those who were at the receiving end of the process of diagnosis. The implementation of the Mental Health Commission (MHC) and the concomitant mental health tribunals on foot of the Act was an acknowledgement that initial diagnoses are not absolute and are subject to subsequent challenge and revision.

The degree of overlap between ethical and professional practices in nursing is particularly evident in the case of mental illness diagnosis, since it challenges the nurse to consider a number of key questions relating to autonomy, harm and risk, ethical duty of care, voluntariness, codes of professional conduct and possible conflicts between law and ethics (Lesser 2002, Farsides 2002). Nurses have an ethical as well as a legal duty of care; however, ‘failure to meet this duty will concern the law only if some harm or damage results’ (Lesser 2002 pg. 90). McHale (2002) notes that obtaining the consent of a patient to treatment is a crucial part of health care practice; failure to obtain consent before undertaking treatment may leave the health care professional open to the risk of being sued for damages in a civil court and/or being prosecuted in a criminal court. However, while there may be no place for the law in a certain situation, professional and ethical issues may still apply and thus the nurse may be morally and professionally responsible (Lesser 2002).

An Bord Altranais (ABA), the Irish Nursing Board, has primary responsibility for the legal and professional regulation of nursing in the state. Deriving its legislative power from the Nurses Act, 1985, the board’s main functions include:



• to inquire into the conduct of a registered nurse on the grounds of alleged professional misconduct or alleged unfitness to engage in such practice;
• to give guidance to the profession; and
• to establish and maintain a register of nurses. (ABA 2009)

Arguably, in terms of providing guidance and an underpinning rationale to practice, the two most important publications that ABA produces are ‘The Code of Professional Conduct for each Nurse and Midwife’ (ABA 2000) and ‘The Scope of Nursing and Midwifery Practice Framework’ (ABA 2000a). The Code of Professional Conduct (ABA 2000) calls for nurses to act in a manner that provides the highest standard of care for patients and states that ‘any circumstance which could place patients/clients in jeopardy or which militate against safe standards of practice should be made known to appropriate persons or authorities’. The difficulty with mental health diagnosis and treatment is that the application of a diagnosis of mental illness, and any subsequent involuntary admission and treatment, could in future times be judged to have been against the best interests of the patient, regardless of how well intended the actions were. ABA argues that one of the fundamental principles of nursing practice is the therapeutic relationship between the nurse and the patient/client; a relationship that is ‘based on trust, understanding, compassion, support and serves to empower the patient/client to make life choices’ (ABA 2000a pg. 3). From a professional guidance perspective this may seem quite straightforward and unproblematic; however, as can be appreciated, balancing the patient/client’s rights against the rights of others can raise some ethical and legal difficulties.
In terms of the ethical issues regarding mental illness, Mason (2000) outlines a useful framework providing a succinct summary of the major ethical principles in relation to involuntary admission and coercive treatments. One of the most prominent themes is that of individualism, which explores the tension that can exist between the accepted norms of society and the individual’s own set of standards. If the individual’s behaviour becomes too extreme by societal standards, they run the risk of being stigmatised and ostracised. The concept of free will presupposes choice, whilst determinacy acknowledges that people may be constrained or propelled to act in a certain manner. Society and health care professionals often make judgments about choice and the level of culpability that people assume for their actions, such as in the case of mental illness, where this often ‘exonerates them from such responsibility’ (Mason 2000 pg. 270). The concept of autonomy incorporates three elements: autonomy of thought, of will, and of action. For people diagnosed with a mental illness these levels of autonomy can be restricted or curtailed. This curtailment can be ‘deemed reasonable and is based on a supposed treatment ethic and a safety principle’ (Mason 2000 pg. 270).

Mason (2000) argues that in discussing ethical issues with reference to mental illness there are two interlinked deontological concepts of particular note: beneficence and nonmaleficence. Beneficence concerns acting for the benefit of others; nonmaleficence is concerned with not doing harm to others (Lesser 2002). However, in dealing with mental illness, a number of ethical issues can certainly be raised. For example, people diagnosed with a mental illness may be deprived of their liberty and in extreme cases be forced to undergo treatments against their wishes, which may seem somewhat at odds with the ethical practices of beneficence and nonmaleficence. Homosexuality, once considered to be a mental illness, was only decriminalised in the Republic of Ireland in 1993 and is still illegal in many jurisdictions around the world (Irish Family Planning Association 2007). In fact, in Britain as recently as the 1950s and early 1960s, convicted homosexuals were administered oestrogen in order to ‘cure’ their illness (Davidson 2004).

At this point the debate often becomes one of balancing one need against another and weighing up the relative benefits, which leads to one of the best-known ethical theories: utilitarianism. The 18th century thinker Jeremy Bentham is credited with defining the concept of utilitarianism, which can be summarised as undertaking the action that is most likely to produce the greatest happiness for the greatest number (Dooley and McCarthy 2005). Thus, by adopting the utilitarian principle, the protection of society by incarcerating those diagnosed as having a mental illness and deemed to present a risk to themselves, or more importantly to others, is an ethically sound act. Herein lies one of the most important debates surrounding mental illness diagnosis: whose rights should take precedence, the absolute rights of the individual (deontological) or the rights of the greater number (utilitarianism)? This debate might be somewhat easier to resolve if there were a universally accepted objective way of testing for and diagnosing mental illness; the trouble is that an analysis of mental illness diagnosis demonstrates strong levels of subjectivity and social influence.

Mental asylums, as would be understood in a modern sense, could arguably be said to have begun in the late 17th and early 18th centuries with the opening of large asylums (many of these buildings had previously been used as leprosy hospitals until the eradication of leprosy in Europe in the late Middle Ages).
As society became increasingly urbanised and complex, there was a greater need for social control. The development of large mental asylums in tandem with debtors’ prisons and workhouses can be seen in the context of the ruling classes seeking to effect greater and greater social control (Foucault 2006). This was a time when to be poor was viewed as a symptom of moral failure, and the difference between a person’s chances of being thrown into a workhouse, a debtors’ prison or a mental asylum was often very marginal (Roberts 2005).

By the late 1960s and early 1970s a number of psychiatrists, such as Ronnie Laing and Thomas Szasz, unhappy with the seemingly ad-hoc manner in which diagnoses were arrived at and with the manner in which psychiatric patients were treated on the basis of these diagnoses, spearheaded the anti-psychiatry movement. They both claimed that psychiatry was nothing more than social control, providing the justification for the delineation between what was considered appropriate and deviant behaviour, and that the ability to make a diagnosis was a spurious concept simply dependent on social factors and the prevailing context. For critical theorists like Szasz, Laing and Michel Foucault, the association of psychiatry with bio-medical sciences and medicine gives the process of mental illness diagnosis a veneer that it does not deserve (Rissmiller and Rissmiller 2006). The assumption that deviant or problematic behaviour can be attributed to a disease process (similar to bio-medical diagnosis) creates a narrow and limiting perception of people which, in turn, has an over-deterministic effect on how people with psychiatric or medical diagnoses are regarded (Hall 1996, Videbeck 2006). As a consequence, people once diagnosed become viewed within the context of what Goffman (1986) calls a ‘Master Status’; the whole person can easily be viewed and then treated as a disease, as reflected in the phrases ‘she is a schizophrenic’ or ‘he is a manic depressive’ and so on.

Whilst the word limit precludes an extensive discussion, it is important to acknowledge that this paper has focused on a western ethnocentric understanding of the debates regarding the legal, professional and ethical issues of mental illness diagnosis. Other cultures do not necessarily comprehend acts of difference or deviance with the same moral barometer (Hyde et al. 2004). Consequently, some immigrants to Ireland may run the risk of being doubly demonised: first for being different and secondly for acting in a manner that is perceived and potentially diagnosed as different.

To conclude, the relationship between professional issues and ethical codes is quite close in terms of mutual influence, thus making it harder to disentangle where one begins and the other ends, and vice versa. An examination of the changing definitions and diagnosis of mental illness clearly demonstrates that diagnosis has often been culturally and temporally specific, representing contemporary assumptions about what constituted difference and deviance. What may once have been considered normal medical practice may now be treated with scepticism, disbelief or even outright revulsion (Britten 2008). In the same vein, professional, legal and ethical codes and practices also reflect the times and places in which they are framed and administered and, in turn, reflect cultural and economic power imbalances.



Pharmacy panel


Prof. David Jones, QUB (chair)
Dr. Chris Scott, QUB
Dr. Carole Parsons, QUB
Dr. Andrew Harkin, TCD
Dr. Helen Sheridan, TCD
Dr. Anne Moore, UCC
Dr. Brian Kirby, RCSI
Dr. Paul Gallagher, RCSI
Dr. Marc DeVocelle, RCSI

Judges’ comments

This study examined the effect of hypoxia on Notch signalling pathway components in Rheumatoid Arthritis synovium and on human microvascular endothelial cells. Furthermore, this project examined the effect of Notch inhibition on hypoxia-induced angiogenesis in vitro. The interaction between Notch receptors and ligands is known to regulate intercellular communication; an understanding of these mechanisms will therefore be of potential advantage in the design of novel strategies for the treatment of inflammatory disorders. The technical difficulty associated with this project was noted. Furthermore, both the judges and the chair of the judging panel noted the high quality of this project. This is an excellent study that is well written as a research paper.



Pharmacy

Notch plays a critical role in synovial angiogenesis in response to hypoxia

Catherine Sweeney

Introduction

Inflammatory arthritis (IA) is a chronic, progressive disorder that is characterised by synovial tissue (ST) proliferation and joint inflammation, leading to degradation of articular cartilage and subchondral bone. Angiogenesis, the formation of new capillaries from the pre-existing vasculature, is an early event in inflammation, and is closely linked with the initiation and progression of IA. Angiogenesis is dependent on the complex and highly conserved processes of endothelial cell (EC) activation, migration and survival.1 Formation of the invasive synovial “pannus” in IA is critically facilitated by angiogenesis. Several studies have demonstrated increased vascularity and elevated levels of pro-angiogenic molecules in the inflamed joint, including vascular endothelial growth factor (VEGF), angiopoietins, platelet-derived growth factor (PDGF), transforming growth factor-β-1 (TGFβ-1) and hypoxia-inducible factor-1 (HIF-1).2, 3

Notch receptor-ligand interaction is a highly conserved mechanism that regulates intercellular communication and directs individual cell fate decisions.4, 5 Notch receptors and ligands are transmembrane proteins; four Notch receptors (Notch 1-4) and five ligands (Jagged-1, -2, Delta-1, -3 and -4) have been identified in mammalian cells. Studies using constitutively active Notch receptors missing their extracellular domains (Notch IC) have shown that Notch signalling determines proliferation, differentiation and apoptosis in several mammalian cell types.5 Following cleavage by the γ-secretase, presenilin, Notch IC is translocated to the nucleus, where it interacts with the CSL family of transcription factors (CBF-1/RBP-Jk, Su(H) and LAG-1) to become a transcriptional activator that can then modulate the expression of Notch target genes that regulate cell fate decisions.6, 7 These include the “Hairy Enhancer of Split” (HES) and the HES-related transcription factor (hrt) genes.5-7

Hypoxia is established as being a key driving force for angiogenesis, and is recognised as an important event in the perpetuation of joint destruction in IA.8, 9 Previous studies have clearly demonstrated that oxygen levels in the synovium of IA patients are reduced compared to healthy controls.10-13 Furthermore, we have recently shown that in vivo oxygen levels in the inflamed joint are inversely related to macroscopic vascularity and synovial inflammation.3 In addition, studies have demonstrated HIF-1α expression in ST and have shown that, in synovial cell cultures, hypoxia induces key angiogenic growth factors (VEGF and angiopoietins), chemokines (MCP-1, IL-8, CCL-20) and MMPs -1, -2, -3 and -9, and downregulates IL-10.13, 14 More recently, evidence that members of the Notch signalling pathway are upregulated in response to hypoxia has been demonstrated in murine endothelial progenitor and HeLa cell lines.15, 16

This study aims to determine the effect of hypoxia on Notch signalling pathway components in RA synovium and human microvascular endothelial cells. Furthermore, we investigate the effect of Notch inhibition on hypoxia-induced angiogenesis in vitro.

Materials and Methods

Materials

All materials were of the highest purity commercially available and purchased from Sigma Aldrich Ltd. (Poole, Dorset, UK) unless otherwise stated. Polyclonal rabbit anti-Notch 1, Notch 3 and Notch 4 antibodies were obtained from Upstate Cell Signalling Solutions (Milton Keynes, UK). The anti-Notch 1 cytoplasmic domain antibody was raised using a GST fusion protein to residues 2272-2488 of rat Notch 1. The anti-Notch 3 and anti-Notch 4 antibodies were raised using His-tag fusion proteins containing the C terminus of murine Notch 3 and Notch 4, respectively. Polyclonal goat anti-Dll-4, HRT-1 and HRT-3 antibodies were obtained from Santa Cruz Biotechnology (Santa Cruz, CA). Secondary anti-rabbit IgG was obtained from Amersham Biosciences (Buckinghamshire, UK). The AlexaFluor secondary antibodies were purchased from Molecular Probes (Rijnsburgerweg, The Netherlands) and cyanine two or three conjugated secondary antibodies were purchased from Jackson ImmunoResearch (Cambridgeshire, UK).

Patient recruitment, arthroscopy, oxygen partial pressure measurements and sample collection

Patients with active inflammatory arthritis, either rheumatoid arthritis (RA) or psoriatic arthritis (PsA), were recruited from the Rheumatology clinics prior to commencing biologic therapy. RA was classified according to the American College of Rheumatology criteria, and the diagnosis of PsA was classified according to previously defined criteria.17, 18 All patients had clinically active disease that included at least one inflamed knee joint. Following approval by the Institutional Ethics committee, patients gave written informed consent before undergoing arthroscopy. Arthroscopy of the inflamed knee was performed under local anaesthetic using a Wolf 2.7 mm needle arthroscope (Storz, Tuttlingen, Germany) as previously described.19 Synovial tissue biopsies were obtained from representative areas of the inflamed synovial membrane with the use of grasping forceps under direct visualisation. The novel Licox pO2 probe (Integra Life Sciences Corporation, New Jersey, USA) was introduced via a 22G needle and positioned into the biopsy pocket following the first biopsy, under direct visualisation, allowing direct pO2 (mmHg) measurements in the synovial membrane sublining layer as previously described.20 These measurements were recorded for future classification of patients into low pO2 (<20 mmHg) and high pO2 (>20 mmHg) categories. For future immunohistochemical analysis, synovial biopsies were embedded in OCT (Tissue Tek, The Netherlands) and stored at -80°C, or paraffin embedded. Biopsies were also snap frozen in liquid nitrogen for future protein analysis.
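The low and high pO2 categories described above amount to a simple threshold rule. The following is a minimal Python sketch of that rule, not the authors' analysis code: the function names and the sample readings are hypothetical, and because the text does not say how a reading of exactly 20 mmHg is handled, the sketch leaves such a reading unclassified as an explicit assumption.

```python
from collections import defaultdict

# Threshold taken from the text: low pO2 is < 20 mmHg, high pO2 is > 20 mmHg.
PO2_THRESHOLD_MMHG = 20.0


def classify_po2(po2_mmhg):
    """Return 'low', 'high' or 'unclassified' for one synovial pO2 reading (mmHg)."""
    if po2_mmhg < 0:
        raise ValueError("pO2 cannot be negative")
    if po2_mmhg < PO2_THRESHOLD_MMHG:
        return "low"
    if po2_mmhg > PO2_THRESHOLD_MMHG:
        return "high"
    # Assumption: the text gives no rule for a reading of exactly 20 mmHg.
    return "unclassified"


def group_patients(readings):
    """Group patient identifiers by pO2 category.

    `readings` maps a (hypothetical) patient identifier to a pO2 value in mmHg.
    """
    groups = defaultdict(list)
    for patient_id, po2 in readings.items():
        groups[classify_po2(po2)].append(patient_id)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    example = {"P01": 12.5, "P02": 34.0, "P03": 20.0}
    print(group_patients(example))
```

Keeping the boundary case separate, rather than folding it silently into one group, makes the unstated part of the rule visible to anyone reusing the classification.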

Cell culture

Dermal-derived human microvascular endothelial cells (HMVEC) were purchased from Clonetics, San Diego, CA, USA. Cells were maintained in endothelial basal medium (EBM) supplemented with 5% FCS, 0.5 ml human epidermal growth factor, 0.5 ml hydrocortisone, 0.5 ml gentamicin, and 0.5 ml bovine brain extract (Clonetics) in a 37°C humidified atmosphere of 5% CO2/95% air. Cells were used between passages 3 and 8.

Notch inhibition and exposure of HMVEC to hypoxia

HMVEC were plated at 1 × 10^6 cells. Notch inhibition was achieved by incubating the cells with the γ-secretase inhibitor N-[N-(3,5-difluorophenacetyl)-L-alanyl]-S-phenylglycine t-butyl ester (DAPT) at 50 μM or 10 μM as appropriate.21, 22 A DMSO vehicle at the same concentration was included as a negative control. Control (normoxic) experiments were performed at 5% CO2, 20% O2 and 37°C in a humidified incubator. Experiments under hypoxic conditions were performed using a Coy O2 control glove box chamber, which provides a customised and stable environment through electronic control of CO2 (5%), O2 and temperature (37°C). HMVEC were cultured for 24 h under normoxia or 3% O2.

Preparation of protein lysates

Protein extracts were prepared by powdering 2-3 synovial tissue biopsies in liquid nitrogen using a Mikro-Dismembrator U (B. Braun Biotech International, Melsungen, Germany) and incubating the resulting powder in ice-cold radioimmunoprecipitation (RIPA) buffer (50 mM Tris, 150 mM NaCl, 1 mM EDTA, 0.1% Triton X-100 (v/v), 0.25% sodium deoxycholate, 2.5 mM sodium pyrophosphate, 1 mM NaF, 1 mM α-glycerophosphate, 1 mM sodium orthovanadate, 1 mM phenylmethylsulphonylfluoride (PMSF) and 10 μg/ml protease inhibitors (Pierce Chemical Co, Rockford, IL, USA), pH 7.5). Samples were divided into aliquots and stored at –80°C before use. Protein concentration was measured using a BSA assay (Pierce Chemical Co, Rockford, IL, USA).

Western blot analysis

Proteins from synovial tissue lysates (20 μg) were resolved by SDS-PAGE (10% resolving, 5% stacking) prior to transfer onto nitrocellulose membrane (Amersham Biosciences). Membranes were stained with Ponceau S to confirm equal protein loading and rinsed in wash buffer (PBS containing 0.05% Tween 20). Membranes were blocked for 2 h in wash buffer containing 5% nonfat dried milk at room temperature with gentle agitation. Following three 15-min washes in wash buffer, membranes were incubated in primary antibody (1:800 dilution in PBS containing 0.05% Tween 20 and 2.5% nonfat dried milk) at 4°C overnight with gentle agitation. Following three further 15-min washes, membranes were incubated in a 1:1000 dilution of horseradish peroxidase conjugated anti-rabbit IgG (in PBS containing 0.05% Tween 20 and 2.5% nonfat dried milk) for 2 h. Following three final 15-min washes, the ECL™ detection reagent (Amersham Biosciences) was placed on the membranes for 5 min before they were exposed to Hyperfilm ECL. The signal intensity of the appropriate bands on the autoradiogram was calculated using the EDAS 120 system (Kodak, Rochester, NY).

Immunohistology

Cryostat sections (7 μm thick) were cut, dried overnight at 37°C and stored at –20°C. Immunohistochemistry was performed on synovial tissue sections using the DAKO ChemMate Envision Kit (Dako, Glostrup, Denmark). Cryostat sections were defrosted at room temperature for 20 min, fixed in acetone for 10 min, and washed in PBS for 5 min. Non-specific binding was blocked by incubating the sections in 10% casein for 20 min. Synovial membrane (SM) sections were incubated with specific antibodies to Notch receptors, ligands and target genes for 2 h at room temperature in a humidified chamber. An IgG1 control antibody and omission of the primary antibody were used as negative controls. Endogenous peroxidase activity was blocked by incubating sections in 0.3% hydrogen peroxide for 7 min. Slides were washed in PBS and incubated in secondary antibody/HRP (Dako) for 30 min at room temperature. DAB was used to visualise staining, and Mayer’s haematoxylin (1:3; BDH Laboratories, Poole, Dorset) was applied as a DNA counterstain prior to mounting in Pertex mounting media. Images were captured using an Olympus DP50 light microscope and AnalySIS software (Soft Imaging System Corporation, Lakewood, CO).

To stain HMVEC, cells were seeded onto 6-well plates at 2 × 10^5 cells per well 2 days prior to staining. Cells were stained for Notch receptor, ligand and target gene expression at 80–90% confluency using the following protocol. Cells were washed three times in PBS, permeabilised and fixed in methanol (–20°C, 10 min), and subsequently rehydrated in PBS containing 3% BSA for 10 min. Cells were then incubated in the appropriate primary antibody at 4°C overnight with gentle agitation, washed, and incubated with secondary antibody (1:200 dilution in PBS containing 3% BSA, using FITC or anti-goat AlexaFluor) for 2–3 h at 37°C. Cells were counterstained with a DAPI nuclear stain (1:1000 for 10 min) prior to visualisation with an Olympus DP-50 fluorescent microscope and analysis using AnalySIS software. Images were taken at 10× and 40× magnification.

In-gel Zymography

The activity of pro-MMP-2 and -9 secreted by synovial biopsies and HMVEC into culture medium was determined by gelatin zymography. Zymogram gels consisted of 7.5% polyacrylamide polymerised together with gelatin (1 mg/ml). After electrophoresis, the gels were washed with 2.5% Triton X-100 and incubated with substrate buffer (50 mM Tris, 5 mM CaCl2, pH 7.5) at 37°C for 24 h. The gels were stained with Coomassie brilliant blue R 250 and destained with water. Gelatinolytic activity, which appears as a clear zone, was quantified by densitometric analysis. Bands were identified using gelatinase standards (Chemicon).
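The densitometric readings are later reported as fold change over the normoxia basal control (see the Figure 4 legend). A minimal sketch of that normalisation is given below, assuming one band density per condition per experiment; the function name, condition labels and readings are hypothetical and are not data from this study.

```python
import statistics


def fold_change(band_densities, control="normoxia_basal"):
    """Express each condition's band density relative to the control condition."""
    baseline = band_densities[control]
    return {cond: value / baseline for cond, value in band_densities.items()}


# Hypothetical densitometry readings (arbitrary units), one dict per experiment.
experiments = [
    {"normoxia_basal": 100.0, "hypoxia_basal": 250.0},
    {"normoxia_basal": 80.0, "hypoxia_basal": 240.0},
    {"normoxia_basal": 120.0, "hypoxia_basal": 240.0},
]

# Normalise within each experiment first, then average across experiments.
folds = [fold_change(e)["hypoxia_basal"] for e in experiments]
mean_fold = statistics.mean(folds)
print(folds, mean_fold)
```

Normalising within each experiment before averaging keeps gel-to-gel differences in staining intensity from dominating the mean.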

In vitro HMVEC tube formation assay

HMVEC tube formation was assessed using Matrigel basement membrane matrix. Matrigel (50 μl) was added to 96-well culture plates and allowed to polymerise at 37°C for 1 hour before plating the cells. HMVEC (1 × 10^4 cells) were then plated in 250 μl EBM/well onto the surface of the Matrigel and incubated under normoxia or 3% O2 for 24 h as described above. The EBM was supplemented with DAPT (50 μM), DAPT (10 μM), or DMSO vehicle control as indicated. HMVEC tubes were photographed using phase-contrast microscopy at 10× magnification. A connecting branch between two discrete EC was counted as one tube and required a consistent intensity and thickness. Tube formation was assessed by two blinded observers and was determined from five random fields per duplicate well, focusing on the surface of the Matrigel.

Cell Migration Assays

To perform scratch assays, HMVEC were plated into 24-well cell culture plates and maintained in EBM for 4 h, then washed with serum-free medium and starved of both serum and growth factors overnight. A 1-mm wide scratch was made across the cell layer using a pipette tip. After two washes with serum-free medium, the cells were incubated for 24 h in EBM supplemented with DAPT (50 μM), DAPT (10 μM), or DMSO vehicle control as indicated. Plates were photographed after 24 h and the extent of migration was assessed by two blinded observers. All experiments were performed at least three times in triplicate.

Statistical analysis

Results are expressed as mean ± SEM. Experimental points were performed in triplicate with a minimum of three independent experiments. An unpaired Student’s t test and a Wilcoxon signed rank test were used as appropriate for comparison of two groups. A P value of <0.05 was considered significant.
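As an illustration of the summary statistics named above, the following sketch computes mean ± SEM and the unpaired Student's t statistic with pooled variance using only the Python standard library. The function names and the sample values are hypothetical and are not data from this study; converting the t statistic into a P value would additionally require the t distribution, which is not included here.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one sample."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def unpaired_t(a, b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical tube counts per field from three independent experiments.
normoxia = [26.0, 29.0, 32.0]
hypoxia = [34.0, 37.0, 40.0]

print("normoxia mean, SEM:", mean_sem(normoxia))
print("hypoxia mean, SEM:", mean_sem(hypoxia))
print("t, df:", unpaired_t(hypoxia, normoxia))
```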

Results

Expression of Notch signalling pathway components within the inflammatory joint

Immunohistochemical analysis for components of the Notch signalling pathway was carried out in synovial tissue sections from patients with inflammatory arthritis (n=5). The Notch receptors (Notch 1 and 4) and target gene (Hrt-1) were detected in the perivascular and synovial sub-lining layers (Figure 1a). Immunofluorescent staining for Notch signalling pathway components was performed in HMVEC (n=3). The pattern of staining was both nuclear and cytoplasmic; however, Notch 1, Notch 4 and Hrt-1 staining appeared predominantly nuclear, in contrast to Notch 3, Dll-4 and Hrt-2 and -3 staining, which was mainly cytoplasmic (Figure 1b).

Increased Notch 1 IC levels are associated with low in vivo pO2 levels in ST

Notch 1 IC protein expression was determined in synovial tissue lysates. Patients were categorised according to in vivo joint oxygen levels into pO2 <20 mmHg or >20 mmHg (n=9). Notch 1 IC protein expression was higher in patients with pO2 <20 mmHg than in those with pO2 >20 mmHg. Figure 2a shows a representative western blot of increased Notch 1 IC levels associated with low in vivo pO2 (<20 mmHg); this is represented graphically in Figure 2b.

Hypoxia upregulates HIF-1α and Notch 1 IC protein expression in HMVEC

HMVEC were cultured under normoxia, or 1% or 3% hypoxia, for 24 h. HIF-1α and Notch 1 IC protein levels were subsequently determined by western blot analysis. HIF-1α is virtually undetectable under normoxic conditions, but is induced following exposure to 1% and 3% hypoxia (Figure 3a). Notch 1 IC protein expression is detectable in HMVEC under normoxic conditions, and is induced following exposure to 1% and 3% hypoxia (Figure 3b), with maximal levels observed at 3% oxygen, which reflects the in vivo median levels of hypoxia in the IA joint.3

Hypoxia-induced angiogenic response in vitro is Notch-dependent

Matrigel tube formation, EC migration and pro-MMP-2 and -9 activities were assessed as in vitro measurements of angiogenesis. Hypoxia significantly increased EC tube formation (hypoxia control = 36.94±3.33 tubes/high powered field vs. normoxia control = 28.8±2.45 tubes/high powered field, n=3, P<0.05), an effect that was significantly inhibited in the presence of 10 μM DAPT under both normoxic and hypoxic conditions (P<0.05) (Figure 4). No effect was observed for DMSO vehicle control (Figure 4a and b). EC migration was assessed using a scratch assay, in which HMVEC were wounded with a 1-mm wide scratch across the cell layer and exposed to normoxic and hypoxic conditions for 24 h in the presence or absence of 50 μM and 10 μM DAPT. Hypoxia increased EC migration across the wound, an effect that was inhibited by both 50 μM and 10 μM DAPT. No effect on cell migration was evident under normoxic conditions (Figure 4c). Similarly, hypoxia significantly increased both pro-MMP-2 and -9 activities (pro-MMP-2 and pro-MMP-9: 2.86±0.58- and 1.73±0.22-fold increase over normoxia basal respectively, n=3, P<0.05). The activity of pro-MMP-2 and -9 under hypoxic conditions was significantly inhibited by treatment with 50 μM and 10 μM DAPT. Conversely, treatment with DAPT under normoxic conditions resulted in a significant increase in both pro-MMP-2 and -9 activities. The inclusion of DMSO as a vehicle control had no significant effect (Figure 4d).

Discussion

This study demonstrates synovial expression of Notch signalling pathway components. Notch 1 IC expression was higher in patients with pO2 <20 mmHg than in those with pO2 >20 mmHg. We demonstrate that hypoxia induces HIF-1α and Notch signalling pathway components, with maximal levels observed at 3% oxygen. Furthermore, we demonstrate that 3% hypoxia induces angiogenic tube formation, EC migration and activity of pro-MMP-2 and -9. Finally, we demonstrate that hypoxia-induced EC function is inhibited by DAPT, a specific inhibitor of the CBF-1-dependent Notch signalling pathway. DAPT is a cell-permeable molecule that inhibits the cleavage of Notch IC by presenilin, thus preventing the release of the active form of the Notch receptor from the membrane.21, 22

Angiogenesis is an early and critical event associated with several pathologies, including IA, and its blockade is potentially a clinically important goal in the future treatment of IA. Several studies have established that hypoxia is a key regulator of angiogenesis in vitro in many cell types.13, 15, 16 We have recently demonstrated that at low synovial tissue pO2 levels there is an increase in macroscopic vascularity and a decrease in blood vessel stability. In vivo pO2 levels within the IA joint were measured using a novel Licox probe, and low levels of pO2 in the synovial membrane were found to be consistent with ambient oxygen tensions of 3.2%. This indicates that hypoxia within the joint increases angiogenesis and vessel instability.3 Consistent with other studies, we demonstrate that hypoxia at the levels measured in vivo in the IA joint significantly increases in vitro measurements of angiogenesis, namely EC tube formation, migration and pro-MMP-2 and -9 production.13, 26, 27 Hypoxia has previously been shown to upregulate the expression of components of the Notch signalling pathway in several cell lines.15, 16 We report, for the first time, that ST pO2 levels affect the expression of Notch signalling pathway components in vivo. Furthermore, we demonstrate that increased Notch 1 IC protein levels are associated with low in vivo pO2 levels (<20 mmHg) in the inflamed joint.

It is well established that HIF-1α is a key regulator of the cellular response to hypoxia. We, and others, have demonstrated an upregulation in HIF-1α protein expression in vitro in response to increased levels of hypoxia.15 Evidence for a functional link between HIF-1α and Notch is also being increasingly established. HIF-1α binding domains and a hypoxia response element (HRE) have been demonstrated in hrt-1, -2 and dll-4 promoters.15 Furthermore, Gustafsson et al. have recently reported a protein-protein interaction between HIF-1α and Notch IC, resulting in increased stability under hypoxic conditions in P19 and Cos7 cells.28 Finally, a negative feedback loop between Notch and HIF-1α has been proposed, as one report suggests that Hrt-1 and -2 are capable of repressing hif-1a gene induction.15 Taken together, these data indicate that Notch has a potentially important role in hypoxia-mediated induction of angiogenesis.
The role of the Notch signalling pathway in embryonic vasculogenesis and angiogenesis is well established.29 Targeted disruption of components of this pathway can result in embryonic lethality.28-31 In adult mice, for example, deletion of Notch 1 resulted in defects in angiogenic vascular remodelling.32 Furthermore, several studies have shown that Notch signalling plays a significant role in EC migration and angiogenesis.33-38 Interestingly, alteration in Notch signalling produces abnormalities in vessel structure and in the branching and patterning of the vasculature, suggesting that Notch regulates patterning of the vascular network, similar to that observed in the inflamed joint.39, 40 In this study, we have demonstrated that inhibition of the Notch signalling pathway using the γ-secretase inhibitor DAPT inhibited EC tube formation under both normoxic and hypoxic conditions. Similarly, we demonstrated that DAPT inhibited both hypoxia-induced EC migration and pro-MMP-2 and -9 production. This indicates, for the first time in adult microvascular EC, that angiogenesis is mediated, at least in part, by the Notch signalling pathway. DAPT does not completely reverse the effect of hypoxia on the in vitro measurements of angiogenesis. This is consistent with a recent study carried out in endothelial progenitor and HeLa cells, in which the authors demonstrated that hrt-2 and dll-4 induction by hypoxia was partially attenuated by DAPT.15 This may be due to a direct effect of HIF-1α on Notch target genes, to Notch signalling in a CBF-1-independent manner which does not require Notch receptor cleavage by presenilin, to input from other signalling pathways, or to a mixture of all three.15, 41

MMPs play a key role in angiogenesis and in other factors that contribute to the pathogenesis of IA, such as degradation of the extra-cellular matrix. The observed hypoxia-induced increases in pro-MMP-2 and -9 activity are consistent with other reports.13 Interestingly, we observed that inhibition of the Notch signalling pathway resulted in a significant increase in both pro-MMP-2 and -9 levels under normoxic conditions. This may be due to the differential regulation of MMPs by Notch under normal versus pathological conditions, as is evidenced in other disease states.

Conclusion

This study demonstrates synovial expression of Notch signalling pathway components and, for the first time, an inverse relationship between synovial tissue pO2 levels and Notch IC protein expression. Furthermore, we demonstrate that hypoxia increases Notch signalling pathway protein expression and measurements of angiogenesis in HMVEC in vitro. Inhibition of the Notch signalling pathway using a γ-secretase inhibitor significantly attenuates in vitro measurements of angiogenesis. Future studies are warranted to delineate this pathway fully, as it may have important therapeutic implications.

1. Koch AE. Review: angiogenesis: implications for rheumatoid arthritis. Arthritis Rheum. 1998:41(6): 951-62.
2. Schoettler N, Brahn E. Angiogenesis inhibitors for the treatment of chronic autoimmune inflammatory arthritis. Curr Opin Investig Drugs. 2009:10(5): 425-33.
3. Kennedy A, Ng CT, Biniecka M, Saber T, Taylor CT, O’Sullivan J, Veale DJ, Fearon U. Angiogenesis and blood vessel stability in Inflammatory Arthritis. Arthritis Rheum. 2010 in press.
4. Weinstein BM, Lawson ND. Arteries, veins, Notch, and VEGF. Cold Spring Harb Symp Quant Biol. 2002:67: 155-62.
5. Iso T, Hamamori Y, Kedes L. Notch signalling in vascular development. Arterioscler Thromb Vasc Biol. 2003:23(4): 543-53.
6. Lai EC. Keeping a good pathway down: transcriptional repression of Notch pathway target genes by CSL proteins. EMBO Rep. 2002:3(9): 840-5.
7. Iso T, Kedes L, Hamamori Y. HES and HERP families: multiple effectors of the Notch signalling pathway. J Cell Physiol. 2003:194(3): 237-55.
8. Muz B, Khan MN, Kiriakidis S, Paleolog EM. The role of hypoxia and HIF-dependent signalling events in rheumatoid arthritis. Arthritis Res Ther. 2009:11(1): 201.
9. Bodamyali T, Stevens CR, Billingham ME, Ohta S, Blake DR. Influence of hypoxia in inflammatory synovitis. Ann Rheum Dis. 1998:57(12): 703-10.
10. Hitchon CA, El-Gabalawy HS. Oxidation in rheumatoid arthritis. Arthritis Res Ther. 2004:6(6): 265-78.
11. Lund-Olesen K. Oxygen tension in synovial fluids. Arthritis Rheum. 1970:13(6): 769-76.
12. Sivakumar B, Akhavani MA, Winlove CP, Taylor PC, Paleolog EM, Kang N. Synovial hypoxia as a cause of tendon rupture in rheumatoid arthritis. J Hand Surg Am. 2008:33(1): 49-58.
13. Akhavani MA, Madden L, Buysschaert I, Sivakumar B, Kang N, Paleolog EM. Hypoxia upregulates angiogenesis and synovial cell migration in rheumatoid arthritis. Arthritis Res Ther. 2009:11(3): R64.
14. Sivakumar B, Akhavani MA, Winlove CP, Taylor PC, Paleolog EM, Kang N. Synovial hypoxia as a cause of tendon rupture in rheumatoid arthritis. J Hand Surg Am. 2008:33(1): 49-58.
15. Diez H, Fischer A, Winkler A, Hu CJ, Hatzopoulos AK, Breier G, Gessler M. Hypoxia-mediated activation of Dll4-Notch-Hey2 signalling in endothelial progenitor cells and adoption of arterial cell fate. Exp Cell Res. 2007:313(1): 1-9.
16. Lee JH, Suk J, Park J, Kim SB, Kwak SS, Kim JW, Lee CH, Byun B, Ahn JK, Joe CO. Notch signal activates hypoxia pathway through HES1-dependent SRC/signal transducers and activators of transcription 3 pathway. Mol Cancer Res. 2009:7(10): 1663-71.
17. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, Healey LA, Kaplan SR, Liang MH, Luthra HS, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1988:31(3): 315-24.
18. Veale D, Rogers S, Fitzgerald O. Classification of clinical subsets in psoriatic arthritis. Br J Rheumatol. 1994:33(2): 133-8.
19. Youssef PP, Kraan M, Breedveld F, Bresnihan B, Cassidy N, Cunnane G, Emery P, Fitzgerald O, Kane D, Lindblad S, Reece R, Veale D, Tak PP. Quantitative microscopic analysis of inflammation in rheumatoid arthritis synovial membrane samples selected at arthroscopy compared with samples obtained blindly by needle biopsy. Arthritis Rheum. 1998:41(4): 663-9.
20. Biniecka M, Kennedy A, Fearon U, Ng CT, Veale DJ, O’Sullivan JN. Oxidative damage in synovial tissue is associated with in vivo hypoxic status in the arthritic joint. Ann Rheum Dis. 2009:ard.2009111211.
21. Sassi N, Laadhar L, Mahjoub M, Driss M, Zitouni M, Benromdhane K, Makni S, Sellami S. Expression of Notch family members in cultured murine articular chondrocytes. Biotech Histochem. 2009:26: 1-8.
22. Takeshita K, Satoh M, Ii M, Silver M, Limbourg FP, Mukai Y, Rikitake Y, Radtke F, Gridley T, Losordo DW, Liao JK. Critical role of endothelial Notch1 signalling in postnatal angiogenesis. Circ Res. 2007:100(1): 70-8.
23. Koch AE. Angiogenesis as a target in rheumatoid arthritis. Ann Rheum Dis. 2003:62 Suppl 2: ii60-7.
24. Reece RJ, Canete JD, Parsons WJ, Emery P, Veale DJ. Distinct vascular patterns of early synovitis in psoriatic, reactive, and rheumatoid arthritis. Arthritis Rheum. 1999:42(7): 1481-4.
25. Fearon U, Griosios K, Fraser A, Reece R, Emery P, Jones PF, Veale DJ. Angiopoietins, growth factors, and vascular morphology in early arthritis. J Rheumatol. 2003:30(2): 260-8.
26. Ottino P, Finley J, Rojo E, Ottlecz A, Lambrou GN, Bazan HE, Bazan NG. Hypoxia activates matrix metalloproteinase expression and the VEGF system in monkey choroid-retinal endothelial cells: Involvement of cytosolic phospholipase A2 activity. Mol Vis. 2004:10: 341-50.
27. del Rey MJ, Izquierdo E, Caja S, Usategui A, Santiago B, Galindo M, Pablos JL. Human inflammatory synovial fibroblasts induce enhanced myeloid cell recruitment and angiogenesis through a hypoxia-inducible transcription factor 1 alpha/vascular endothelial growth factor-mediated pathway in immunodeficient mice. Arthritis Rheum. 2009:60(10): 2926-34.
28. Gustafsson MV, Zheng X, Pereira T, Gradin K, Jin S, Lundkvist J, Ruas JL, Poellinger L, Lendahl U, Bondesson M. Hypoxia requires notch signalling to maintain the undifferentiated cell state. Dev Cell. 2005:9(5): 617-28.
29. Park M, Yaich LE, Bodmer R. Mesodermal cell fate decisions in Drosophila are under the control of the lineage genes numb, Notch, and sanpodo. Mech Dev. 1998:75(1-2): 117-26.
30. Gridley T. Notch signalling during vascular development. Proc Natl Acad Sci U S A. 2001:98(10): 5377-8.
31. Tallquist MD, Soriano P, Klinghoffer RA. Growth factor signalling pathways in vascular development. Oncogene. 1999:18(55): 7917-32.
32. Kojika S, Griffin JD. Notch receptors and hematopoiesis. Exp Hematol. 2001:29(9): 1041-52.
33. Taylor KL, Henderson AM, Hughes CC. Notch activation during endothelial cell network formation in vitro targets the basic HLH transcription factor HESR-1 and downregulates VEGFR-2/KDR expression. Microvasc Res. 2002:64(3): 372-83.
34. Leong KG, Hu X, Li L, Noseda M, Larrivée B, Hull C, Hood L, Wong F, Karsan A. Activated Notch4 inhibits angiogenesis: role of beta 1-integrin activation. Mol Cell Biol. 2002:22(8): 2830-41.
35. Sullivan DC, Bicknell R. New molecular pathways in angiogenesis. Br J Cancer. 2003:89(2): 228-31.
36. Nakatsu MN, Sainson RC, Aoto JN, Taylor KL, Aitkenhead M, Pérez-del-Pulgar S, Carpenter PM, Hughes CC. Angiogenic sprouting and capillary lumen formation modeled by human umbilical vein endothelial cells (HUVEC) in fibrin gels: the role of fibroblasts and Angiopoietin-1. Microvasc Res. 2003:66(2): 102-12.
37. Mailhos C, Modlich U, Lewis J, Harris A, Bicknell R, Ish-Horowicz D. Delta4, an endothelial specific notch ligand expressed at sites of physiological and tumor angiogenesis. Differentiation. 2001:69(2-3): 135-44.
38. DiCorleto PE. Cellular mechanisms of atherogenesis. Am J Hypertens. 1993:6(11 Pt 2): 314S-318S.
39. Villa N, Walker L, Lindsell CE, Gasson J, Iruela-Arispe ML, Weinmaster G. Vascular expression of Notch pathway receptors and ligands is restricted to arterial vessels. Mech Dev. 2001:108(1-2): 161-4.
40. Veale DJ, Fearon U. Inhibition of angiogenic pathways in rheumatoid arthritis: potential for therapeutic targeting. Best Pract Res Clin Rheumatol. 2006:20(5): 941-7.
41. Martinez Arias A, Zecchini V, Brennan K. CSL-independent Notch signalling: a checkpoint in cell fate decisions during development? Curr Opin Genet Dev.

Fig. 1. Notch localisation in synovial tissue and HMVEC. A) Immunohistochemical staining of Notch receptors (Notch -1 and -4 IC) and Notch target gene, Hrt-1, in synovial tissue sections from patients with IA (n=5). B) Immunofluorescent staining of Notch receptors (Notch -1, -3 and -4 IC), Notch ligand (Dll-4) and Notch target genes (Hrt-1, -2 and -3) in HMVEC (n=3). Representative images and isotype-matched IgG controls are shown.

Fig. 2. Increased Notch 1 IC levels are associated with low in vivo pO2 levels in ST of patients with inflammatory arthritis. Patients were categorised into pO2 <20 mmHg (n=5) or >20 mmHg (n=4). A) Representative western blot of Notch 1 IC protein expression in IA ST with pO2 <20 mmHg or >20 mmHg. Data were normalised to β-actin protein levels. B) Cumulative data representing mean values from individual patients ± SEM.

Fig. 3. Hypoxia-induced angiogenic response in vitro is Notch-dependent. A) Representative western blot showing HIF-1α expression in HMVEC exposed to normoxia (N) and 1% and 3% hypoxia. B) Representative western blot showing Notch 1 IC expression in HMVEC exposed to normoxia (N) and 1% and 3% hypoxia. Data were normalised to β-actin protein levels.

Fig. 4. Hypoxia induces Notch signalling pathway-mediated in vitro measurements of angiogenesis. A) HMVEC were treated with 50 μM or 10 μM DAPT, or DMSO vehicle control as appropriate, and cultured under normoxic and 3% hypoxia conditions for 24 h. Angiogenic activity was then assessed by measuring network formation on Matrigel, as described in the Materials and Methods section. B) Graphical representation of EC tube formation, expressed as no. of tubes/high powered field, n=3, *P<0.05 vs. normoxia basal, #P<0.05 vs. hypoxia basal.

C) Confluent HMVEC were cultured in serum-free media overnight and wounded to assess cell migration, as described in the Materials and Methods section. HMVEC were subsequently treated with 50 μM or 10 μM DAPT, or DMSO vehicle control as appropriate, and cultured under normoxic and 3% hypoxia conditions for 24 h. Cell migration was assessed as a marker of in vitro angiogenesis. Representative photos of n=3 experiments are shown. D) HMVEC were treated with 50 μM or 10 μM DAPT, or DMSO vehicle control as appropriate, and cultured under normoxic and 3% hypoxia conditions for 24 h. Conditioned media was harvested and pro-MMP-2 and -9 activity was identified by in-gel zymography. Representative gels are shown (inverted for clarity) (Figure 4d insert). The cumulative data represent normalised band densitometry readings averaged from three independent experiments ± SEM, expressed as fold change over normoxia basal control. *P<0.05 vs. normoxia basal, #P<0.05 vs. hypoxia basal.

Philosophical Studies panel

Prof. Peter Simons, TCD (chair)
Dr. Tim Crowley, UCD
Dr. Michael Dunne, NUIM
Dr. Lilian O’Brien, UCC
Dr. Benjamin Jarvis, QUB
Dr. Zuleika Rodgers, TCD

Judges’ comments

Nearly all the descriptive words we use every day are vague, which means that there are instances where it is unclear whether they apply or do not apply; so-called borderline cases. While there is a good reason for this, namely that we have to be able to learn and apply such words in everyday situations, it has been known since antiquity that standard logical assumptions applied to vague expressions can lead to catastrophe. In the ancient Paradox of the Heap, we seem to be able to show that a single grain of sand constitutes a heap of sand, or that a heap a mile high does not. The importance of discovering how to reason coherently using vague terms has prompted much recent philosophical activity. One large publicly-financed research project in the UK recently resulted in a 586-page volume by 31 authors. Among the several contending theories for explaining and coping with vagueness, the most widely supported among experts is known as the supervaluational theory. First proposed in 1975 by the English philosopher, Kit Fine, it is technically challenging, and has both its supporters and its critics. It is very much at the cutting edge of contemporary professional discussion. The winning essay fluently elucidates the technical details of this theory, insightfully and succinctly lays out its principal advantages and drawbacks, and concludes that it cannot be the whole story. The essay is easily of postgraduate standard, and could serve as a chapter in a doctoral dissertation. It shows mastery of difficult literature, is critically ambitious, well structured, and clearly and accurately expressed. The panel congratulates the winner on this excellent piece of work.

Philosophical Studies

Describe and evaluate the supervaluational theory of vagueness

Siobhan Moriarty

Introduction

In this essay I shall assert and defend the claim that the supervaluational theory of vagueness is untenable. The essay shall be structured in four sections: 1) A general outline of the supervaluational theory of vagueness. 2) A detailed description of a specific instance of the general theory – that presented by Kit Fine in his 1975 paper ‘Vagueness, Truth and Logic.’1 3) An evaluation of Fine’s supervaluational theory of vagueness on the basis of an assessment of factors which support it and those which undermine it. 4) A conclusion applying the results detailed in the third section to the general outline of the supervaluational theory.

Section 1: A general outline of the supervaluational theory of vagueness

The supervaluational theory of vagueness is a proposed resolution of the vagueness of borderline cases, in which a given term can be said either to apply or not to apply. The theory claims that with regard to any given vague term ‘R’ there is a range of ‘admissible precisifications’. The truth of any proposition Ra should be assessed by substituting for the vague term R each admissible precisification of R and determining the truth of Ra under all such precisifications. If the proposition is true for all admissible precisifications then it is said to be supertrue; if it is false for all admissible precisifications then it is said to be superfalse; if it is true for some and false for others it is said to be ‘neither true nor false’. Supertruth is then claimed as the proper meaning of truth and superfalsity as the proper meaning of falsity. Therefore, supervaluation proposes to resolve the borderline cases by claiming that Ra is only true where it is supertrue. In spite of the indeterminate range it sanctions, supervaluation retains the law of excluded middle by defining it in terms of the operator introduced to indicate supertruth – ‘D’. The law of excluded middle is retained as (DRa v ~DRa). Supervaluation requires the sacrifice of truth-functionality in compound propositions, but it can explain how a proposition such as ~(Ra & ~Ra) is true even if the conjuncts are neither true nor false. Since any admissible precisification will make one true and the other false, it follows that all admissible precisifications make their conjunction false and thus make the negation of their conjunction supertrue. Such a move is also used by supervaluation to block the sorites paradox. For any admissible precisification of the vague predicate ‘heap’ there will be some ‘x’ which is a heap even though (x + 1) is not a heap; any admissible precisification will draw a line and for any line there will be two things, n and n+1, which that line divides. Hence, “the supervaluationist rejects the induction step of the sorites argument” (Sorensen, 2006, SEP) as all admissible precisifications make it false; it is thus superfalse and therefore false.

1 Since, as Varzi notes, “it is pretty clear that there is not just one supervaluational semantics out there – there are lots of such semantics” ((Varzi) Williams, 2009, 1), I have, in this essay, considered the general “barebones idea” (Williams, 2009, 1) in conjunction with a detailed analysis of one instance of it. Given the space restrictions, I think that this is preferable either to considering multiple systems very briefly or to considering the general idea of the system without reference to any specific instance.
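The evaluation procedure outlined in this section can be sketched in a few lines of code. The sketch below is an illustration only: it assumes a hypothetical sorites series of items numbered 0 to 100, and it models each admissible precisification of ‘heap’ as a cutoff c, so that item i counts as a heap exactly when i ≤ c. The series, the range of admissible cutoffs and all function names are assumptions made for the example and are not drawn from any of the authors discussed.

```python
def supervaluate(proposition, precisifications):
    """Evaluate `proposition` under every admissible precisification."""
    verdicts = [proposition(p) for p in precisifications]
    if all(verdicts):
        return "supertrue"
    if not any(verdicts):
        return "superfalse"
    return "neither true nor false"


# Hypothetical sorites series and hypothetical range of admissible cutoffs.
SERIES = range(0, 101)
ADMISSIBLE_CUTOFFS = range(40, 61)


def heap(i):
    """The proposition 'item i is a heap', relative to a cutoff."""
    return lambda cutoff: i <= cutoff


def non_contradiction(i):
    """~(Ra & ~Ra) for item i, relative to a cutoff."""
    return lambda cutoff: not (heap(i)(cutoff) and not heap(i)(cutoff))


def induction_step(cutoff):
    """For every item x in the series: if x is a heap then x + 1 is a heap."""
    return all((not heap(x)(cutoff)) or heap(x + 1)(cutoff) for x in SERIES)


print(supervaluate(heap(10), ADMISSIBLE_CUTOFFS))               # supertrue
print(supervaluate(heap(90), ADMISSIBLE_CUTOFFS))               # superfalse
print(supervaluate(heap(50), ADMISSIBLE_CUTOFFS))               # neither true nor false
print(supervaluate(non_contradiction(50), ADMISSIBLE_CUTOFFS))  # supertrue
print(supervaluate(induction_step, ADMISSIBLE_CUTOFFS))         # superfalse
```

Item 50 is a borderline case here, yet the compound ~(Ra & ~Ra) comes out supertrue for it, which illustrates the loss of truth-functionality; the induction step fails under every cutoff because each cutoff c separates item c from item c + 1.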

Section 2: Fine’s position

Fine defines vagueness as underdeterminacy. He avers that although a vague predicate is underdetermined, it has content which is evident in its penumbral connections, those connections which clearly hold between vague sentences. For instance, if P and Q are contraries, there is a penumbral connection such that even if Pa is indefinite and Qa is indefinite it is clear that ~(Pa & Qa). Fine proposes a general semantical system, that of the specification space, within which he resolves the vagueness of “an intuitively understood though possibly vague first order language L” (Fine, 1975, 267) through supervaluational methods. He defines a specification space as “a nonempty set of elements, the specification-points and a partial ordering ≥ (also read: extends) on the set” (Fine, 1975, 271). Specifications are assignments of the truth values true, false and indefinite to atomic sentences. Fine identifies the vague sentence with a partial specification point which forms the base point of the specification space. The partial specification point assigns definite truth values to some sentences but not to all. Further specification points may extend the partial specification point in different ways, so long as they continue to give the sentences which received definite truth values under the partial specification those same definite truth values. Fine refers to the condition whereby they must retain definite truth value assignments as the Stability condition. The process of extension continues along each chain of extension to a complete specification point which assigns only the definite truth values ‘true’ and ‘false’. Thus, each partial specification point has a set of complete specification points and, at this complete level, vagueness and the indefinite truth value have been eliminated. Fine asserts that truth at the complete level is classical and refers to this as the Fidelity condition.

The claim that these complete points can be reached depends upon two further conditions: the Resolution condition, which states that “an indefinite atomic sentence can be resolved in either way upon improvement in precision” (Fine, 1975, 279), and the Completeability condition, which states that “any point can be extended to a complete point within the same space” (Fine, 1975, 272). The Resolution condition amounts to the claim that no vagueness is fundamental: all vague sentences can be precisified, and this can be done in such a way that their possibilities for precisification are exhausted. Completeability is defended on the basis of the Resolution condition in conjunction with a set-theoretic result, Zorn’s Lemma, which states that “If s is any nonempty partially ordered set in which every chain has an upper bound, then s has a maximal element” (Weisstein).² Taken in conjunction with Resolution, it follows that the maximal element must be the complete specification point at which no sentence is assigned the indefinite truth value, because if the maximal point were otherwise the possibility of further precisification would remain, which would mean that it was not actually the maximal element.

Fine claims that the truth value of a vague sentence should be determined by reference to an appropriate specification space, where ‘appropriate specification space’ is defined in terms of the points it contains: “a space is appropriate if each point corresponds to a precisification, one point for each precisification” (Fine, 1975, 271). Admissibility is defined in terms of the specification point and its relation to the base point: “the only constraint on admission into the appropriate space is that a point can verify the original penumbral truths” (Fine, 1975, 277). Specification points are defined in terms of precisifications: “a specification is admissible if it is appropriate for some precisification. Officially, the notion of admissibility is primitive” (Fine, 1975, 273). Despite the claim in the latter part of the last quotation, one can see that what is really primitive in the theory is the collection of admissible precisifications. That such a collection exists and can be known is assumed, and everything else is defined in terms of these points. At a later point Fine addresses the question of how this complete group of admissible precisifications is known and suggests three different ways in which the whole could come to be known, but he never entertains the possibility that it is not known, or that it is not known as a whole.

Fine uses this structure as the semantical framework within which he proposes his supervaluational theory of vagueness. He claims that a vague sentence is true at a partial specification point if it is true at all the complete points of the appropriate specification space. This constitutes supertruth, and truth, he claims, is supertruth.
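A finite toy model may help to fix the structure just described. The sketch below is not Fine’s formal system: the atoms, the base point, the use of `None` for the indefinite value and all function names are assumptions made for illustration, and brute-force enumeration stands in for the Completeability condition, which is only trivial when the set of atoms is finite.

```python
from itertools import product


def extends(point, base):
    """Stability: `point` keeps every definite value that `base` assigns."""
    return all(point[s] == v for s, v in base.items() if v is not None)


def complete_points(base, penumbral):
    """Complete extensions of `base` that verify the penumbral truths."""
    atoms = sorted(base)
    candidates = (dict(zip(atoms, vals))
                  for vals in product([True, False], repeat=len(atoms)))
    return [p for p in candidates if extends(p, base) and penumbral(p)]


def truth_at(base, sentence, penumbral):
    """True if supertrue, False if superfalse, None if indefinite."""
    verdicts = {sentence(p) for p in complete_points(base, penumbral)}
    if verdicts == {True}:
        return True
    if verdicts == {False}:
        return False
    return None


# Hypothetical base point: Pa and Qa are indefinite, Pb is definitely true.
base = {"Pa": None, "Qa": None, "Pb": True}


def contraries(p):
    """Penumbral connection: P and Q are contraries, so never both true of a."""
    return not (p["Pa"] and p["Qa"])


print(truth_at(base, lambda p: not (p["Pa"] and p["Qa"]), contraries))  # True
print(truth_at(base, lambda p: p["Pa"], contraries))                    # None
print(truth_at(base, lambda p: p["Pb"], contraries))                    # True
```

Under these assumptions the base point has three complete extensions, and ~(Pa & Qa) is true at all of them even though Pa and Qa are each indefinite at the base point.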

² Weisstein, Eric W. “Zorn’s Lemma.” From MathWorld – A Wolfram Web Resource. http://mathworld.wolfram.com/ZornsLemma.html.


Section 3: An evaluation of Fine’s supervaluational theory of vagueness on the basis of an assessment of the factors which support it and those which undermine it

Factors in favour of Fine’s theory include the following:

1. The idea of penumbral connections providing core content to the vague predicate is persuasive. In general, even though we can’t precisely define a vague predicate such as ‘tall’, we can partially define all vague predicates by reference to the relations which we know must hold between instances of the predicate (e.g. between tall people) and between instances of that predicate and other predicates (e.g. tall vs. short). Further, the way in which the supervaluational method validates these penumbral connections also seems correct. It can be observed that, with respect to both the vague and the precise, some claims can be verified on the basis of knowledge about the relevant concepts and certain structural truths about the world. The supervaluational method verifies intuitively plausible penumbral connections in a manner which seems to capture a distinct feature of such claims.
2. The characterisation of the vague predicate as underdetermined is credible.³ Given this characterisation, the notion of precisifications is complementary and convincing, as is the idea of taking these precisifications in connection with each other.⁴
3. The supervaluational method itself has a parallel with the generally accepted definition of logical validity, whereby a formula is valid if it is true for all substitutions of its variables. There is also a parallel with necessity in modal logic, whereby something is necessary if it is true in all possible worlds. The latter parallel is pointed out by Fine, and both enhance the credibility of the method and its results.
4. The supervaluational theory takes precisifications into account but assiduously avoids false truth allocation, which it is imperative to avoid. A theory of vagueness which endorsed boundaries which gave rise to false truth allocation would, given the ubiquity of vague predicates in natural language and the bivalent truth conditions of logic, lead to flawed arguments.
5. Fine’s supervaluational theory redefines the vagueness of first order predicates in positive terms through the connection which he draws between the base point and the complete points, whereby the complete points illuminate aspects of the base point and are connected to it: “how an expression can be made more precise is already part of its meaning” (Fine, 1975, 277). Indeed, Fine makes vagueness a virtue of language whereby change is possible.⁵ Given the ubiquity of vague predicates in natural languages and the fact that much has been achieved (including philosophically) through thought and writing in natural languages, the fact that a theory understands the phenomenon of vagueness in other than purely negative terms and still accommodates the requirements of precision is a point which contributes to its defence.
6. Vague sentences are represented as vague but precisifiable in Fine’s supervaluational theory.⁶ This characterisation of vague terms renders them tractable and facilitates resolutions of many of the semantical paradoxes arising from vagueness.
7. Supervaluation creates, as Fine points out, a principle-based framework within which vagueness can be accommodated.

Factors against Fine’s theory include the following:

1. Supervaluation entails the loss of truth functionality with respect to compound propositions in which one or more of the parts are vague. Although there are strong intuitions to the effect that truth functionality should hold, one can see that, with respect to the propositions in question, the grounds on which we assert their truth or falsity are distinct from those on which we would assert the truth of each part. The supervaluational method and its results reflect this distinction. Thus, I think that the associated loss of truth functionality can be accepted.
2. Although Fine holds that “Truth is supertruth, truth from above” (Fine, 1975, 273), truth at the top is defined on the basis of truth from the bottom, and truth at the bottom is then redefined in view of what has been found to be the truth at the top. One could argue that since complete precisifications are perfectly precise, their application to that of which they are true at the complete specification level is supertrue (as it is true of all of their admissible precisifications which, since they are perfectly precise, amounts only to being classically true of them). However, the result is that admissible precisifications don’t seem admissible after all, as the predicate of which they are admissible precisifications is often indeterminate with regard to that of which they are true. When a predicate R is supertrue of a particular and this supertruth does not imply that the predicate P (of which R was said to be an admissible precisification) is also true of that particular, it is hard to see how one could maintain that R is an admissible precisification of P. Given this lack of connection, it is hard to see what being an admissible precisification means. Thus, I think that the connections drawn between truth, supertruth, vague predicates and their precisifications are problematic and unconvincing.
3. Supervaluation’s admission of the indeterminate into the metalanguage is problematic. As Williams points out, “Suppose one judges that p is indeterminate. How should that impact your first order opinion as to whether p or ~p?” (Williams, 2009, 8).⁷ Space does not permit me to address the question; suffice it to say that admission of the indeterminate into the metalanguage is a difficulty with the results of supervaluation, and it is hard to see how any answer could resolve the problematic metaphysical issue concerning the nature of particulars with regard to which it is indeterminate whether a predicate applies.
4. Supervaluation creates problems with regard to intuitively plausible argument forms. Having identified the true as the supertrue, the definition of logical validity is thereby changed for the supervaluationist from the preservation of truth to the preservation of supertruth (Williamson, 1994, 147). This creates problems with regard to logical consequence. Williamson demonstrates that it necessitates the rejection of the intuitively plausible argument forms “Contraposition… Conditional Proof… Argument by Cases… Reductio ad absurdum” (Williamson, 1994, 151-2). Williams argues that if one takes a degree of determinacy view instead of what he calls “a rejectionist view” (Williams, 2009, 1) then ‘Reductio ad absurdum’ remains valid. Nevertheless, supervaluation creates problems with regard to argument forms, and I think that the plausibility of their validity is greater than that of supervaluation; the fact that supervaluation requires their rejection thereby adds the strength of their plausibility to the argument against it.
5. Fine identifies the vague sentence with a partial specification point, a point at which the values true, false and indefinite are assigned to atomic sentences of a first order language. The specification space is built upon this point, the Stability condition ensuring that all of the truth values that it assigns are maintained. Ultimately, because the precisifications expand on this point in different ways,⁸ the sentences which are going to turn out to be true at all complete points are those true at the base point, and thus the supertrue, and hence the true, is effectively determined by the base point assignments. Fine claims that base point truth assignments would have a basis in penumbral connections, and there is substance to the idea that our understanding of vague predicates sanctions a partial specification of truth values. However, there are two problems, both of which concern the fact that this specification is a precise one.
i. Firstly, I don’t think that penumbral connections provide substance of the sort which could ground this; my understanding of the fact that a tall person is not also short doesn’t get me much closer to the knowledge required to assign ‘tall’, ‘not tall’ and ‘indefinite whether tall’ to a group of people.
ii. Secondly, although I do think that we have sufficient knowledge of the intension of a vague predicate to determinately assign the predicate sometimes, Fine’s system requires more than an example (e.g. a 7ft tall person is tall); it requires an exhaustive knowledge of the area of determinate application. If the vague predicate can be conceptualised (as Williamson suggests) as a blurred boundary between the application and non-application of a predicate, Fine is suggesting that we know precisely where that blurred boundary begins. This is simply implausible; even if someone thought that they could do this, a comparison would find that others placed the precise beginning of the blurred section at a different point, and the predicate itself would provide no substance with which to adjudicate between two such people. This is part of what it is for a predicate to be vague. Fine’s characterisation of the base point doesn’t allow for this aspect of vagueness; the whole system is built upon the base point, and the supertruth which the system produces is, as I showed earlier, fundamentally connected to the base point’s assignment of truth values. Thus, Fine’s characterisation of the base point is one of the reasons why I think the supertruth of his supervaluational system should be rejected.
6. As I pointed out in section two, precisifications are primitive to Fine’s system. The specification space is built around precisifications which maintain the truth assignments of the base point; we are taken to know the complete set of precisifications for any vague predicate, and specification points are then assigned to each precisification, where “a specification is admissible if it is appropriate for some precisification” (Fine, 1975, 272). There are two problems with this:
i. Firstly, it implies that there are specific discrete precisifications which can be assigned specification points and incorporated into a space, but a simple case can illustrate the falsity of this and serve as a counterexample. Say we knew (as I hold we cannot) that the blurred range of the vague predicate ‘tall’ was between 5ft and 6ft. What would the precisifications of this vague area be? Would we differentiate in inches? Centimetres? Both of these are artificial points, and there would be an infinite possibility of precisifications corresponding to the possibilities for precisification on the number line. Thus, the set would be infinite and it would not, therefore, be possible to accurately graph them in a specification space without taking certain sections as representative and running the risk of incurring further vagueness.⁹ Fine’s supertruth depends on his incorrect assumption of the discrete countable nature of precisifications, and this is another reason to reject supertruth.
ii. The second problem with the precisifications, which are primitive to Fine’s system, is that the process of deriving supertruth from the system depends on the set of admissible precisifications being known in their entirety. Fine holds that a sentence is true at a partial specification point if, and only if, it is true at all complete points. If these points can’t be known, one would be driven to an epistemic theory of vagueness. However, Fine thinks they can be known. He describes three ways this could occur, all of which involve our understanding, directly or indirectly, as a set or as a whole, the entire collection of admissible precisifications for a vague predicate. But this is implausible; a little introspection or experimentation will demonstrate this, as before with regard to the base point. This claim requires not just that we know an example of a case where the predicate clearly does not apply; it requires and asserts that we know exactly where it becomes true to say that the predicate determinately does not apply. This is the claim that, where through knowing the base point we know the precise location of the beginning of the blurred boundary, through knowing the set of admissible precisifications we know the precise location of the end of the blurred boundary. This is both demonstrably false and a mischaracterisation of vagueness and the nature of vague predicates. Since supertruth depends on this knowledge (because supertruth is truth at all complete specification points and we must, therefore, know these points to derive supertruth), supertruth is thereby undermined.

I think that the points against Fine’s supervaluational theory of vagueness outweigh the points which support it and ultimately undermine the plausibility of the theory.

³ It is similar to Frege’s characterisation of the vague predicate as “incompletely defined” (Williamson, 1994, 38), though what Frege concludes about vague predicates is in complete contrast to the supervaluational position.
⁴ Part of the problem of vagueness is usually apparent when one takes a reasonable precisification of a vague predicate as the definition of that predicate but finds that as a complete definition it is inadequate.
⁵ “In understanding a language one has thereby understood how it can be made more precise; one has understood… the possibilities for its growth” (Fine, 1975, 277).
⁶ Sorensen describes this aspect of the theory as characterising “vagueness as hyperambiguity” (Sorensen, 2006, 7). I think this neglects the distinction Fine draws between the fundamental connection that holds between a vague predicate and its precisifications, whereby the latter complete and illuminate the former, and the relatively superficial connection he describes between an ambiguous term and its disambiguations, whereby the ambiguous term is like “the super-imposition of several pictures” (Fine, 1975, 283), each of which needs to be disentangled.
⁷ Williams advocates a view whereby the indeterminate incorporates sentences with different degrees of determinacy. This position does provide an explanation of why some indeterminate sentences seem closer to truth than others. Intuitively, we feel that we should be able to express this, and degrees of determinacy provide both an explanation and a framework.
⁸ At each level precisification is achieved because indefinite sentences are determined to be either true or false. They can, by the Resolution condition, be either, but chains which assigned the same value would not be separate chains; they would become separate chains only when they assigned different truth values.
⁹ Fine admits the possibility of infinite sets of precisifications in the discussion of Completeability, but in the discussion of them supertruth is untenable and he suggests that “in place of supertruth we use an anticipatory account” (Fine, 1975, 280), though he observes that “for countable domains, anticipatory truth turns out to be a form of supertruth” (Fine, 1975, 280). There are two points to be made about this. Firstly, anticipatory truth depends on the base point even more clearly than supertruth and is hence untenable since, as I demonstrated earlier, the base point is inadequate with respect to vague predicates. Secondly, although the possibility of infinite sets of precisifications forces him to this account, he in fact advocates supervaluation and returns to asserting a supervaluational position; it is therefore implicit that he thinks most domains are countable. I think that this position is incorrect; most vague predicates seem to be defined over gradable domains which support the analogy of the number line and would, therefore, preclude the possibility of their being graphed accurately to a specification space.

Section 4: Conclusion

Although the points against Fine’s theory are argued earlier in a way that is specific to it, I think that the essence of the first, second, third, fourth and sixth points would apply to the general supervaluational framework. Crucially, supervaluation depends upon knowledge of a complete set of admissible precisifications, and I think that the nature of vagueness precludes this. Thus, although the theory is undoubtedly persuasive and produces convincing results which seem to ensure the avoidance of false truth allocation, block the sorites paradox and give a principled explanation of the clear truth of penumbral connections in the face of the unclear nature of the truth values of their components, I think that the supervaluational theory of vagueness is ultimately implausible.


Physics panel

Dr. Créidhe O’Sullivan, NUIM (chair)
Dr. Síle Nic Chormaic, UCC
Dr. Cormac O’Raifeartaigh, WIT
Dr. Mauro Paternostro, QUB
Dr. Eilish McLoughlin, DCU

Judges’ comments

This report discusses an experiment that was carried out by the student to observe and analyse vortex dipoles in 2D fluid flow. The vortex dipoles were first produced using both a dye and water injected into a layered fluid. The dye was imaged and the water studied using particle image velocimetry. The student’s own contribution to the project was clear. The judges felt that this essay was of a very high standard with a good, well thought-out explanation of the experimental work that was carried out. Plots were used very effectively throughout. There was a nice combination of both experiment and theory and the student demonstrated insight into both. The report was also well written and well presented.


Physics

Vortex dipoles: ordered structures from chaotic flows

Eamonn Kennedy

Abstract

This experiment involved the observation and analysis of unique structures known as vortex dipoles in a 2D fluid flow. The Reynolds number of the vortex dipoles was found from established theory as a function of the volume flux into a stratified fluid. This was compared with the Reynolds number found experimentally as a function of the vortex dipole aspect ratio, α. The results agreed with those presented by Y. Afanasyev et al.¹ A well-known imaging technique, Particle Image Velocimetry, was then employed to find the velocity field of the dipole, from which the vorticity at each point was found. The vorticity fields produced were consistent with the polar solution of vortex dipoles first produced by Sir H. Lamb in 1895.² Other tertiary experiments were performed, including observation of vortex dipole collisions and the curious attraction of vortices to walls and boundaries.

Introduction

Vortex dipoles are fascinating jet flows that have two propagating, counter-rotating vortices at their front. Although considerably rarer than the common single vortex, vortex dipoles still have application in an incredibly diverse range of scientific areas and have been observed in everything from cloud formations to Bose-Einstein condensates,³ an exotic state of matter where quantum effects are apparent on the macroscopic scale. In this report, the history and application of vortex dipoles (V-Ds) are discussed, with emphasis on evaluation of the various mathematical models used to explain their structure.


Fig. 1. A vortex dipole of dyed water at t = 50 s after formation.

Typically, turbulent fluid flows occupy the full three dimensions of space. However, in certain cases motion in one direction is suppressed and a two-dimensional fluid system remains. This assumption is common in explaining fluid flows and, for vortices in particular, usually means that the φ dependence of polar co-ordinates is neglected.⁴ Two-dimensional turbulent experiments and simulations are common as simple models of atmospheric and oceanic turbulence. Despite the simplicity of the models used, they accurately predict the properties of large scale dipoles, which regularly form in inlets and can even form on oceanic scales.⁵

The earliest successful attempt to model straight vortex pairs was produced by Sir H. Lamb in 1895. A modern revision of his original solution is provided in the theory section below. Although the model reproduced the basic shape and properties of the V-Ds, it had no time dependence and indicated no V-D evolution. In 1905, Chaplygin reworked the original equations but produced results with only limited experimental significance.

With the invention of jet propulsion aircraft in the 1940s, vortex dipoles became a topic of significant interest. This is because an accelerated flow past a wing produces a vortex pair across the wingspan. The mathematics is exacting, but my interpretation of the equations produced by G. Saffman⁷ is given here. Consider a rectangular elliptically loaded wing of span 2a, with a stream velocity W inclined at an angle δα (Figure 2). The flow that develops is a vortex sheet after time t = Z/W with U = W δα when the wing moves.

Fig. 2. Vortex field behind a wing.


This results because the vorticity, ω, is a maximum at the wing tips, since ω = ∇ × u. As the wing moves, the flow going past it continues to be curled on each side of the aircraft, and so a spiral of flow continually grows at the tips with the motion of the wing. As the flow curls away from the wing tip it loses velocity, which causes the flow to ‘roll up’ into an approximately circularly symmetric vortex. The distance between each consecutive turn tends to zero, as in figure 1, as the particle speed reduces along its path from the wingtips. At each side of the aircraft a counter-rotating vortex is produced. As the craft accelerates, the vortices interact in the plane perpendicular to the craft motion, forming a V-D.

Before the invention of computer processors, only having graphical solutions meant that the equations governing V-Ds had limited applicability. With the additional potential of numerical modelling, the equations could be heavily refined. The most comprehensive re-evaluation of the vortex dipole model was produced by Deem and Zabusky in 1977.⁸ The model was reworked to include translation of the dipoles, and explained why they are not dispersive and the conditions under which they form. A good example of the benefit of the numerical model equations was presented in 2001 by L. Cortelezzi and A. Karagozian,⁹ who were able to model the V-Ds in 3D and gained significant insight into the internal vortex pair topology. An example of the modelled V-Ds is shown in figure 3.

Some of the places where vortex dipoles have been observed are genuinely surprising. Consider that in this experiment, the V-Ds are produced by a delta function of impulse into a planar fluid. One simple example of this is rainwater hitting the surface of a body of water at a high speed. If the conditions are right, the impulse should sink the droplet quickly down into the water, so that horizontal spread is suppressed. This essentially provides a stratified layering of the water. This has been done using dyed water droplets, and dipoles can be readily observed. Attempts were made in this lab to replicate this phenomenon but insufficient velocity was achieved.
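Vorticity is the curl of the velocity field, and for a planar flow with components (u, v) it reduces to the single component ∂v/∂x − ∂u/∂y. The sketch below shows one common way of estimating it from velocities sampled on a regular grid, such as those produced by particle image velocimetry. It is an illustration only: the central-difference scheme, the grid layout, the function name and the solid-body test field are assumptions and are not taken from the analysis used in this experiment.

```python
def vorticity(u, v, dx, dy):
    """Central-difference estimate of dv/dx - du/dy at interior grid points.

    u[j][i] and v[j][i] are the x and y velocity components at x = i*dx,
    y = j*dy. Edge points are left as None because they lack two neighbours.
    """
    ny, nx = len(u), len(u[0])
    w = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dv_dx = (v[j][i + 1] - v[j][i - 1]) / (2 * dx)
            du_dy = (u[j + 1][i] - u[j - 1][i]) / (2 * dy)
            w[j][i] = dv_dx - du_dy
    return w


# Hypothetical test field: solid-body rotation u = -rate*y, v = rate*x,
# whose vorticity is 2*rate everywhere.
rate, dx, dy, n = 0.5, 0.1, 0.1, 5
u = [[-rate * (j * dy) for i in range(n)] for j in range(n)]
v = [[rate * (i * dx) for i in range(n)] for j in range(n)]
w = vorticity(u, v, dx, dy)
```

For a field that varies linearly in space, as this one does, the central difference is exact up to rounding, so every interior value of `w` should equal 2 × rate.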

Fig. 3. 3D vorticity contour produced by Cortelezzi et al.

It is easy to think that vortex dipoles are somehow confined to classical physics and are unlikely to be found outside of predefined experiments. However, they have recently been observed in Bose-Einstein condensates and in exotic superconducting materials near the critical temperature phase transition.¹⁰ Fleming et al. showed that the energy dissipated near the superconducting region of some materials is carried away by vortex dipoles which form in the material at a coupling rate πK which was highly temperature dependent and, in particular, large around Tc. In fact, the equations they used to model the dipoles were based on the original work by Deem and Zabusky,⁸ with a few variations to include magnetic effects.

Although not as common as a singly rotating vortex, which is just the manifestation of shear stresses in a fluid, vortex dipoles are still found throughout all fluid systems. The original potential solution of V-Ds has been updated and numerical models have been employed, and no doubt further work will be done, as a complete revision has recently been offered by Y. Afanasyev.¹¹ The key point is that, given how common they are in nature and the diversity of scientific fields in which they are found, the work on V-Ds is in no way complete, and is a fascinating example of how something simple can have a whole range of complex applications.

Theory

Modelling vortex dipoles

Unlike vortex arrays or systems containing large numbers of vortices, V-Ds do not require numerical methods to model and can be solved analytically. Straight vortex pairs with equal magnitude but opposite sign were first mathematically modelled by Sir Horace Lamb in 1895.² He showed that a vortex pair with no net angular momentum can be modelled assuming its overall shape is circular, its overall motion is uniform and that it has evenly distributed velocity. Although more accurate models have been proposed, they usually limit themselves to specific cases, the only one of which we will discuss is a delta function of impulse into a stratified fluid. A modernisation of Lamb’s original work was produced in 2006 by Jie-Zhi Wu et al.,⁴ which provides a simpler analysis, solving the potential over a cylinder in polar co-ordinates (r, θ) with

Here, a is the distance from the centre of the vortex dipole pair to its boundary at r = a, and θ is the angle around the dipole origin. For simplicity, we impose an external flow of −U so that the dipole remains stationary with its centre on the origin. Also, since this is a 2D structure, we neglect any φ dependence, which would have to consider bottom and gravitational effects. We note that in general, in order to linearise the potential:

and so the general equation for vorticity yields

The solution of this is involved, but we can see some aspects of it straight away. From the previous equation for ψ it is clear that ψ ∝ sin θ. Also, the equation takes the form of Bessel’s differential equation. Therefore, the solution is a Bessel function


The first zero of J1(ka) is at ka = 3.83, which gives a closed circular streamline ψ = 0 at r = a, the boundary of the dipole. More specifically, we can find C from velocity continuity, noting that the spatial derivative of ψ at the boundary is

From which the constant C is found to be

Substituting back in for the potential solution, ψ inside the vortex dipole yields

At this point the streamlines can be seen graphically, which makes it a good place to end the general solution: the vorticity is directly proportional to the potential ψ everywhere inside the dipole. Also, the vorticity is largest at θ = ±π/2, along the axis between the dipole centres, a prediction that my results disagree with by several degrees, most likely due to bottom forces. Figure 4 shows the streamlines in (r, θ). Sergey Chaplygin elaborated on this model [4] and determined that the vorticity has a maximum and a minimum (due to the symmetrical counter-rotation) at r0 = 0.48a. This result will be tested by producing a vorticity contour and locating the maximum and minimum vorticity along a.
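The two numbers quoted in this section, ka = 3.83 and r0 = 0.48a, can be checked numerically. The sketch below is in Python rather than the Matlab used in the experiment, and it assumes the interior solution has the form ψ ∝ J1(kr) sin θ, so that the vorticity extremum along θ = ±π/2 sits at the first maximum of J1. The helper names are illustrative only.

```python
import math

def bessel_j(order, x, terms=40):
    """Power-series evaluation of the Bessel function J_order(x), adequate for small x."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.factorial(m + order))
                  * (x / 2.0) ** (2 * m + order))
    return total

def j1_prime(x):
    """Derivative of J1, using the identity J1'(x) = J0(x) - J1(x)/x."""
    return bessel_j(0, x) - bessel_j(1, x) / x

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

ka = bisect(lambda x: bessel_j(1, x), 3.0, 4.5)  # first non-trivial zero of J1
kr0 = bisect(j1_prime, 1.0, 3.0)                 # first maximum of J1
r0_over_a = kr0 / ka

print(f"ka = {ka:.4f}, r0/a = {r0_over_a:.4f}")
```

Under these assumptions the ratio comes out at about 0.48, in line with the value attributed to Chaplygin above.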

Vortex dipole global properties

We now leave the general solution and look at some of the bulk properties of vortex dipoles. If we inject a short impulse of water into a large body of fluid, both the mass and the linear momentum of the system will be conserved. Y. D. Afanasyev showed [5] that the time dependence of a dipole’s length and breadth, L and D, is given by

Fig. 4. The graphical solution of the vorticity in (r, θ) showing the closed streamlines and the main features of Lamb’s vortex dipole model [4].



This implies L = αD, where α is a constant for a given dipole that depends on the incident velocity and therefore also on Re, the ratio of the inertial to viscous forces in the fluid. By plotting α against time we can find its mean value after formation. We can then find the Reynolds number it corresponds to using [1]. This gives an indication of the viscosity of the fluid and the level of turbulence of the system, since

Re = Ud/ν

where U is the fluid exit velocity, ν = μ/ρ is the kinematic viscosity of water and d is the diameter of the injection nozzle. More specifically, for the setup used, the velocity at the nozzle is the volume flux q divided by the nozzle area (πd²/4), so that Re = 4q/(πdν). Thus we see that Re = k’q, where k’ = 4/(πdν) is a constant of the setup that does not change for different dipoles, although it does have a small temperature dependence since ν = ν(T), as shown in figure 5.
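As a worked example of the relation between the volume flux and the Reynolds number, the sketch below (Python, not the original analysis code) evaluates Re = Ud/ν with U = q/(πd²/4). The nozzle diameter and flux used here are hypothetical values chosen for illustration, since the measured diameter is not quoted in the text; ν = 0.012 cm²/s is the value given for the lab.

```python
import math

def reynolds_from_flux(q_cm3_s, d_cm, nu_cm2_s=0.012):
    """Re = U d / nu with U = q / (pi d^2 / 4), i.e. Re = 4 q / (pi d nu)."""
    exit_velocity = q_cm3_s / (math.pi * d_cm ** 2 / 4.0)
    return exit_velocity * d_cm / nu_cm2_s

# Hypothetical values, for illustration only (not measurements from the text).
d = 0.06   # nozzle diameter in cm (assumed)
q = 0.02   # volume flux in cm^3/s (assumed)

k_prime = 4.0 / (math.pi * d * 0.012)  # setup constant in Re = k' q
re_direct = reynolds_from_flux(q, d)

print(round(re_direct, 1), round(k_prime * q, 1))
```

Both routes give the same number, which is the point of writing Re = k’q: for a fixed nozzle and temperature only q needs to be known.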

Volume flux calibration

We must now consider how the volume flux can be calculated from experimentally known quantities. In particular, given an initial fluid height h0 and an injection time interval ∆t, we would like to have q = q(h0, ∆t). In fact, as will be shown, once a calibration has been performed the time interval has no bearing on the volume flux, since q is a constant for a given initial height and a small net volume released. We start with the assumption that, for a column of liquid with an opening at its bottom, the rate of change of its height is proportional to the height, dh/dt = −kh, which immediately implies h(t) = h0 e^(−kt).



Fig. 5. Temperature dependence of ν, the kinematic viscosity. In general ν ≈ 0.012 cm²/s in the dark lab used.

The volume flux q is the change in volume divided by the change in time. We can now write it as

q = (h0 − hf)/∆t = h0 (1 − e^(−k∆t))/∆t

Let us now assume q is an experimentally found quantity, where we can set h0 − hf = 1 cm³, so that we can solve for the constant k, which we define as a magnitude only, with the sign outside the exponent:

k = (1/∆t) ln(h0/(h0 − 1))

k was found to be 0.00183; details are given in the results. Using the k value found by experiment, we return to the equation for q and expand the exponent to first order:

q ≈ h0 (1 − (1 − k∆t))/∆t = k h0

The higher the initial height, the larger the volume q released by the nozzle in one second. Furthermore, since Re = k’q, the Reynolds number of the flow is directly proportional to the height of the fluid (Figure 6), provided we assume that there is no volume flux at h0 = 0. This imposes an effective range of Re which can be altered by changing the height of the burette, but is typically 10 < Re < 80. The first part of the experiment concluded with the estimation of Re via the aspect ratio α and via the volume flux q(h0).
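The calibration argument can be illustrated numerically. The sketch below (Python) assumes the exponential drainage law h(t) = h0 e^(−kt) with the quoted k = 0.00183, and compares the flux averaged over a short injection with the first-order result q ≈ kh0. The burette reading and injection time are assumed values, and the burette height is taken to be read in volume units, as the text suggests.

```python
import math

K = 0.00183  # decay constant quoted in the text (per second)

def height(h0, t, k=K):
    """Liquid height after time t for dh/dt = -k h."""
    return h0 * math.exp(-k * t)

def flux_exact(h0, dt, k=K):
    """Mean flux over an injection of duration dt: (h0 - h(dt)) / dt."""
    return (h0 - height(h0, dt, k)) / dt

def flux_linear(h0, k=K):
    """First-order expansion of the exponential: q ~ k h0."""
    return k * h0

# Assumed burette reading (ml) and injection time (s), for illustration only.
h0, dt = 20.0, 0.5
q_exact = flux_exact(h0, dt)
q_linear = flux_linear(h0)

print(q_exact, q_linear)
```

For injection times of a second or less the two values differ by well under one percent, which is why the time interval drops out of the calibration.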



Fig. 6. The Reynolds number as a function of water height.

Particle Image Velocimetry (PIV)

In PIV, patterns of particles spanning a small sub-area are matched with a slightly shifted set of particles separated by a well-defined time interval. The resulting displacement gives an average velocity for that region [6]. The experimental considerations of PIV are detailed in the method, whereas here we provide an overview of the image processing entailed. The main principle of PIV algorithms is to take two images and cross-correlate them to determine the motion at each point in the image over the time interval between their capture. The most straightforward approach is to divide the images into sub-windows of pixel side length N = 2^n, where the n used was typically 4, 5 or 6. Two-dimensional cross-correlation of each sub-window pair is then performed. Let us define the sub-window (I, J) within the first image, F′, with indices i, j representing pixel locations within the sub-window. Similarly, we define the same sub-window in the second image, F″. We then have

R(s, t) = Σ_i Σ_j F′(i, j) F″(i + s, j + t)

where the sums run over the pixels of the sub-window and R is the cyclic cross-correlation between the sub-windows (I, J) of the image pair. The equation also has a concise Fourier form, which was not used in this experiment. This basic method cannot resolve displacements above N/2, and so more advanced algorithms are used. It is also necessary to normalise the background; this is not a complex procedure, but the equation is very large and essentially involves summing over the mean non-determinate sub-window cross-correlations and dividing each cross-correlation by this value of R(s, t). The actual program used was MatPIV 1.6.1, written by J. Kristian Sveen, which is available as freeware online. The initialisation commands required an approximation of the mean vector per sub-window and the time interval between the images.
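The cross-correlation step can be sketched in a few lines. The following is a plain Python illustration of a cyclic cross-correlation on a single sub-window pair, not MatPIV's implementation: the function names are hypothetical, there is no normalisation or sub-pixel peak fitting, and the test pattern is synthetic.

```python
import random

def cyclic_cross_correlation(win1, win2):
    """R[s][t] = sum over i, j of win1[i][j] * win2[(i+s) % N][(j+t) % N]."""
    n = len(win1)
    return [[sum(win1[i][j] * win2[(i + s) % n][(j + t) % n]
                 for i in range(n) for j in range(n))
             for t in range(n)]
            for s in range(n)]

def peak_displacement(win1, win2):
    """Shift (rows, cols) at the correlation peak, wrapped into [-N/2, N/2)."""
    n = len(win1)
    corr = cyclic_cross_correlation(win1, win2)
    _, s, t = max((corr[s][t], s, t) for s in range(n) for t in range(n))

    def wrap(d):
        return d - n if d >= n // 2 else d

    return wrap(s), wrap(t)

# Synthetic test: a random "particle" pattern shifted cyclically by (2, -3) pixels.
random.seed(0)
N = 16
frame1 = [[random.random() for _ in range(N)] for _ in range(N)]
shift = (2, -3)
frame2 = [[frame1[(i - shift[0]) % N][(j - shift[1]) % N] for j in range(N)]
          for i in range(N)]

print(peak_displacement(frame1, frame2))  # prints (2, -3)
```

Because the correlation is cyclic, a shift of more than N/2 pixels is indistinguishable from a smaller shift in the opposite direction, which is the N/2 limit mentioned above. Dividing the recovered displacement by the time between frames gives the mean velocity for the window.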



When the mean particle displacement has been obtained for each window over the whole image, we have effectively obtained the velocity field. Taking the curl of the velocity field, ∇ × U(u, v), gives the spatially resolved vorticity of the fluid. Using basic Matlab commands it can be shown that the total vorticity of the dipole is near zero and that the y-directional momentum has a maximum along the centre line of the dipole. Given a vorticity matrix ‘M’ of R × C sub-windows, we can say:

    Mpos = sum(M(M > 0));          % total positive vorticity
    Mneg = sum(M(M < 0));          % total negative vorticity
    netVorticity = Mpos + Mneg;    % close to zero for a symmetric dipole

and also, to find the net vorticity at each X and Y level:

    for n = 1:R, X(n) = sum(M(n,:)); end   % net vorticity of row n
    for m = 1:C, Y(m) = sum(M(:,m)); end   % net vorticity of column m
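The same bookkeeping can be sketched in Python for readers without Matlab. This is an illustrative equivalent, not the code used in the experiment, and the small matrix is a made-up antisymmetric pattern standing in for a dipole.

```python
def vorticity_budget(vort):
    """Positive, negative and net vorticity of a matrix, plus row and column sums."""
    positive = sum(w for row in vort for w in row if w > 0)
    negative = sum(w for row in vort for w in row if w < 0)
    row_sums = [sum(row) for row in vort]
    col_sums = [sum(col) for col in zip(*vort)]
    return positive, negative, positive + negative, row_sums, col_sums

# Toy antisymmetric "dipole": one positive and one negative patch.
toy = [[0.0, 1.0, -1.0, 0.0],
       [0.0, 3.0, -3.0, 0.0],
       [0.0, 1.0, -1.0, 0.0]]
pos, neg, net, rows, cols = vorticity_budget(toy)

print(pos, neg, net, rows, cols)
```

For this toy pattern the positive and negative totals cancel, while the column sums keep the two lobes apart.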

Method

The equipment was set up as shown. A web camera with a maximum resolution of 640 × 480 pixels was connected to a laptop using the Logitech drivers available online. The .avi videos recorded were viewed frame-by-frame using virtualdub.exe (www.virtualdub.org), a freeware .avi viewer which gives the frame number and time in ms for each image. The frame rate of the camera was set to 30 frames per second. This was confirmed by counting the number of frames over a set time interval. The diameter of the nozzle was found by inserting thin wires into the injection point and then measuring the wires with a μm gauge.

Fig. 7. The experimental setup for non PIV imaging. The water tank and burette were prepared separately.



The tank had a top surface area of 1250 cm². We required a salt-water bottom layer of 2-3 cm. Thus, in order to produce ≈ 2.5 cm of bottom layer, 3 L of salt water was used, requiring 350 g of salt per refresh of the layer to reach a reasonable (≈ 50%) level of salt saturation. The salt was added and stirred, left for ten minutes and then stirred again to maximise its distribution. When this was not done, the top layer was seen to be uneven on the surface of the salt-water layer. A sheet of A3 paper was placed on top of the salt layer and pure water was sprayed onto its surface. The paper given suggests a surface layer that is ‘thin as possible’. However, this was found to produce significant bottom effects, due to interference between the bottom and top layers, which affected the dipoles produced. A thicker top layer of 0.5 cm was used instead, requiring 600 ml of deionised water. The A3 sheet was then carefully removed and the layers were given time to settle. During this time the burette, which contained up to 25 ml of red-dyed pure water, was prepared. The burette was carefully inserted into the top surface layer. A retort stand was used to keep the height of the needle relative to the pure-water layer constant. An additional ‘control’ retort stand was used to rebalance the needle angle in the fluid, since the needle tended to point up slightly when placed in the water due to upthrust. The volume flux was calibrated using q = kh0; k was found experimentally using known volume fluxes at increments of 1 ml along the burette liquid height. Typical dipoles were 0.1 ml or less in volume, so h0 − hf was highly inaccurate and q was found indirectly using k. Generally, there was a trade-off between pixel resolution and the time during which the dipole was in the camera view. The scaling of cm to pixels was recorded with each run, since any movement of the camera caused it to vary. Surprisingly, the x and y directions had the same calibration factor. ∆t and q were varied.
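The layer depths quoted in the preparation can be checked from the tank area and the volumes of liquid added. A minimal Python sketch, assuming a flat-bottomed tank with vertical walls so that depth is simply volume divided by area:

```python
TANK_AREA_CM2 = 1250.0  # top surface area quoted in the text

def layer_depth_cm(volume_ml, area_cm2=TANK_AREA_CM2):
    """Depth of a layer of the given volume spread over the tank (1 ml = 1 cm^3)."""
    return volume_ml / area_cm2

salt_layer = layer_depth_cm(3000.0)  # 3 L of salt water
top_layer = layer_depth_cm(600.0)    # 600 ml of deionised water

print(salt_layer, top_layer)
```

Under that assumption the volumes correspond to depths of roughly 2.4 cm and 0.5 cm, consistent with the stated targets.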
During a run, the following procedure was used. More than 40 video runs are available.

Run procedure

• The initial height h0 was recorded and its corresponding Reynolds number was found (Figure 6).
• The light was turned on and the camera was set to view the nozzle and ≈ 12 cm in front of it.
• The video was started and a short injection of fluid was allowed through the burette.
• Several frames of the dipole’s development were isolated using Virtualdub.exe.
• The pixel-cm scale was calibrated in Paint, and L(t), D(t) and t were recorded for each frame.
• α was found for the dipole and the corresponding Reynolds number was found graphically.
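The frame-by-frame part of this procedure amounts to a unit conversion followed by an average. The Python sketch below shows one way it could be done; the pixel measurements, the pixel-cm scale and the cut-off time are hypothetical values for illustration, and only the 30 frames per second figure comes from the text.

```python
FRAME_RATE = 30.0  # frames per second, as set on the camera

def aspect_ratios(measurements, px_per_cm, first_frame):
    """Convert (frame, length_px, breadth_px) rows to (t_s, L_cm, D_cm, alpha)."""
    rows = []
    for frame, length_px, breadth_px in measurements:
        t = (frame - first_frame) / FRAME_RATE
        length = length_px / px_per_cm
        breadth = breadth_px / px_per_cm
        rows.append((t, length, breadth, length / breadth))
    return rows

def mean_alpha(rows, t_min):
    """Mean aspect ratio over frames later than t_min, skipping the formation stage."""
    late = [alpha for t, _, _, alpha in rows if t > t_min]
    return sum(late) / len(late)

# Hypothetical pixel measurements, for illustration only.
data = [(100, 90, 30), (160, 120, 60), (280, 150, 100), (400, 165, 110), (520, 180, 120)]
rows = aspect_ratios(data, px_per_cm=40.0, first_frame=100)
alpha_mean = mean_alpha(rows, t_min=5.0)

print(alpha_mean)
```

Excluding the early frames matters because the dipole is still long and narrow while it forms, which would otherwise bias the mean α upwards.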



Fig. 8. Time evolution of a vortex dipole at a) 0.5s, b) 6s, c) 8s, and d) 14s.

Generally, the dipoles were initially long and narrow, only spreading out once their velocity had decreased below a threshold that was usually about 50% of the injection velocity. In terms of the analysis, this meant that α varied with time; specifically, it showed asymptotic behaviour, quickly reducing to a near-constant value where it stayed for t > 1 min. This meant that a best fit was required which would omit the first few high α values. An example is included in the results. Runs were also performed where a continuous dye stream was injected behind the dipole. This gave larger, faster-moving dipoles. Also, when this was performed in a fluid without density stratification, the dipole was seen to take a more spherical shape. This concluded the main analysis of the dipole momentum. The other most significant vortex dipole property, vorticity, was explored using particle image velocimetry (PIV).

PIV

The setup was altered in three ways. Firstly, pure non-dyed water was used as the injection fluid. Secondly, a projector was positioned parallel to the water surface and a thin sheet of light was spread across the top layer. Thirdly, 50 μm balls were seeded into the water using a spray, where they floated. These had a tendency to clump together even at relatively low particle densities. To compensate for this, the balls were then ‘sprinkled’ over the surface with a spatula; this gave a much better distribution. The seeded particles typically took 10 minutes to settle. A video was started from a pre-chosen height with a known cm-pixel scale. The water was injected as before and frames were isolated with a known time between them using virtualdub.exe. The time interval was usually between 0.1 s and 1 s. The two images (im1, im2) were then saved in a directory containing MatPIV.m, the core file of MatPIV 1.6.1. The command structure for cross-correlation of the images was:

    function [x,y,u,v] = matpiv(im1, im2, winsize, Dt, overlap, method, wocofile, mask, ustart, vstart)

where ‘winsize’ was the sub-window size in pixels, which is a power of two (N = 2^n), and ‘overlap’ was the required overlap of the windows, which could be set to zero. The ‘method’ was cross-correlation and no ‘masking’ of areas was usually performed. Vorticity was then found from the velocity map [x,y,u,v] by taking its curl. The code is easy to follow and is based on the idea of least squares:

    DeltaX = x(1, 2) - x(1, 1);    % grid spacing between columns
    DeltaY = y(1, 1) - y(2, 1);    % grid spacing between rows
    for i = 3:1:size(x, 2)-2
        for j = 3:1:size(x, 1)-2
            % five-point least-squares estimate of each velocity gradient
            vor(j-2, i-2) = -(2*v(j, i+2) + v(j, i+1) - v(j, i-1) - 2*v(j, i-2))/(10*DeltaX) ...
                            + (2*u(j+2, i) + u(j+1, i) - u(j-1, i) - 2*u(j-2, i))/(10*DeltaY);
        end
    end
    outp = vor;
    xa = x(3:end-2, 3:end-2);      % coordinates of the interior points
    ya = y(3:end-2, 3:end-2);

We end with a matrix output ‘outp’, which can be plotted as a surf plot or as required. The main practical issue with this method was the web camera resolution. Inevitably, there was a trade-off between the resolution of individual particles and viewing of the entire dipole. Small sections of a dipole could be viewed, giving excellent velocity maps, but larger viewed areas lost resolution, requiring masking of ‘null’ velocity areas where the particles were not picked up. A compromise was reached by generally producing small dipoles (Dmax = 2 cm), although the larger ones were more impressive. Additionally, it was possible to perform PIV by turning the bottom projector on and taking the images as usual. The images were then gray-scaled, their colours inverted and their brightness reduced. The effect of this was to simplify the setup while still obtaining essentially similar pictures of white seeded particles against a dark background.
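The five-point least-squares stencil used in the vorticity code can be checked on a flow with a known answer. The Python sketch below is an independent illustration, not a port of MatPIV: it uses the conventional orientation ω = ∂v/∂x − ∂u/∂y with y increasing along the rows, which is an assumption here and may differ in sign from the image-coordinate convention of the Matlab listing. The test field is a solid-body rotation, for which the vorticity is 2Ω everywhere.

```python
def vorticity_leastsq(x, y, u, v):
    """Least-squares estimate of dv/dx - du/dy on the interior of a regular grid.

    x, y, u, v are lists of rows; x varies along columns and y along rows.
    """
    dx = x[0][1] - x[0][0]
    dy = y[1][0] - y[0][0]
    n_rows, n_cols = len(x), len(x[0])
    vor = []
    for j in range(2, n_rows - 2):
        row = []
        for i in range(2, n_cols - 2):
            dvdx = (2 * v[j][i + 2] + v[j][i + 1]
                    - v[j][i - 1] - 2 * v[j][i - 2]) / (10 * dx)
            dudy = (2 * u[j + 2][i] + u[j + 1][i]
                    - u[j - 1][i] - 2 * u[j - 2][i]) / (10 * dy)
            row.append(dvdx - dudy)
        vor.append(row)
    return vor

# Solid-body rotation u = -omega*y, v = omega*x has vorticity 2*omega everywhere.
omega, n = 0.5, 7
x = [[0.1 * i for i in range(n)] for j in range(n)]
y = [[0.1 * j for i in range(n)] for j in range(n)]
u = [[-omega * y[j][i] for i in range(n)] for j in range(n)]
v = [[omega * x[j][i] for i in range(n)] for j in range(n)]
vor = vorticity_leastsq(x, y, u, v)

print(vor[0][0])
```

As in the Matlab listing, two rows and columns are lost at each edge, so a 7 × 7 velocity grid yields a 3 × 3 vorticity map. The weights 2, 1, −1, −2 over a denominator of 10 are the least-squares slope through five equally spaced points, which is what makes the estimate less sensitive to noise in individual velocity vectors.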

Results

The flux calibration against height is shown below in figure 9 (a). Although the time intervals ∆t(h) were non-linear with height, the flux q(h) was seen to be proportional to the height of the liquid. From this, the constant ‘k’ was determined for each height; its variation is shown in figure 9 (b). In c.g.s. units, taking the burette height scale to correspond to cm³, k was 0.00182 cm³/s per cm of height.



Fig. 9a. The averaged flux as a function of height; b) the constant ‘k’ and its best fit.

The graph provided for fitting α against Reynolds number was logarithmic, squared and generally did not give a clear indication of the experimental error. A genfit was performed in MathCAD using data from the best-fit α(Re²) line so that a direct correlation between α and Re could be found graphically.

Fig. 10. The alpha and Reynolds number values found for 16 dipoles.

This avoided the tedious conversion from logarithmic to linearly scaled data. The line took the form of the exponential of a polynomial, exp{f(x1, x2, x3, x4...)}. A poly-Gaussian fit was not considered necessary as the line was well formed. The dipoles analysed are shown graphically with the theoretical fit in figure 10. Another way of interpreting this data is to find the Reynolds number by both methods and to plot their ratio against height (Figure 11). This proves to be useful, as it shows that the values were typically within 90% accuracy and that, since most of the points are below unity, the Re(h) values were larger than the Re(α) values. It is unlikely that the flux was overestimated, as the values found were surprisingly small, which suggests that the graph provided with the paper may have been off the true value by some constant. In fact, it is possible that the fluxes were not overestimated but underestimated, since q(0) = 0 was assumed.

Fig. 11. The alpha and Reynolds number values found for 16 dipoles.

The behaviour of α and of the dipole velocity with time was explored; an example is given below (Figure 12) for run 7. The velocity showed strong asymptotic behaviour, contradicting the suggestion by Y. Afanasyev [11] that the steady-state velocity is constant at Uconst = Ujet/2. This behaviour was found repeatedly. Typically, the initial velocity was above 5 cm/s for half a second; within 5 seconds the jet travelled at 0 to 1.5 cm/s, slowing to 1 to 2 mm/s in the steady state. Typically there was a 10-100 fold decrease in velocity.

Fig. 12. Velocity and α time dependence for one dipole.

Other runs gave very similar results, which are provided as an appendix. Dipoles were also collided to observe the effects (Figure 13). Normally, the slower dipole remained unchanged while the faster dipole smeared out over the front of the other, forming small, isolated vortices. This behaviour has been well documented in numerous papers by Y. Afanasyev.

Fig. 13. Vortex dipole collision.

PIV provided interesting results. The velocity fields were well formed for small image areas but had aberrations over large areas. These were then converted into vorticity fields using the code provided in the theory section. Figure 14 shows the vorticity field for a large dipole. Limited masking was employed at the corners of the image. The circulation is counter-rotational in each vortex, and so we see one side of the dipole as negative and the other as positive. Summing over the whole area of the dipole gave a near-zero value for vorticity, 41/(478 + 437) ≈ 4.5%. The y-axis momentum had a peak at the central sub-window row. Also, recalling the result of the theoretical model, rmax = 0.48a, in (b) we see maxima and minima at 0.487a and 0.52a, both within 10% of rmax.
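The two percentages in this paragraph follow directly from the quoted numbers, as the short Python check below shows. It uses only the values given in the text; the helper name is illustrative.

```python
def percent_difference(value, reference):
    """Unsigned percentage difference of value from reference."""
    return abs(value - reference) / reference * 100.0

net, positive, negative = 41.0, 478.0, 437.0    # magnitudes quoted in the text
residual = net / (positive + negative) * 100.0  # net vorticity as a share of the total

r_model = 0.48                                  # Chaplygin's r0 / a
offsets = [percent_difference(r, r_model) for r in (0.487, 0.52)]

print(round(residual, 1), [round(o, 1) for o in offsets])
```

The residual is about 4.5% of the summed magnitudes, and the two measured extrema lie roughly 1.5% and 8.3% from the model value, so both are indeed within 10%.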

Fig. 14. Vorticity field for a steady state V-D.



Fig. 15. Velocity field for a large V-D flowline.

Fig. 16. Velocity field detail of the off vortex velocity. The circulation is non-zero outside the dipole.



Conclusion

In this experiment vortex dipoles were produced using a dye injection into a layered fluid. The Reynolds number of each dipole was found by two methods: as a function of the incident volume flux, q, and as a function of the vortex dipole aspect ratio, α. The results were as expected in [1]. Particle Image Velocimetry was used to find the velocity field of the dipole, from which the vorticity at each point was found. The vorticity fields produced were consistent with the analytical models presented. The velocities found were consistent with the asymptotic time dependence observed using frame-by-frame analysis. Other, tertiary experiments were also performed, an example being dipole production off a main jet using a wind force above the water surface, shown in figure 17.

Fig. 17. A body of dyed water was blown with a 10cm/s wind gust, producing a dipole.



[1] Y. Afanasyev, ‘Investigating vortical dipolar flows using particle image velocimetry’, Am. J. Phys. 70 (1), January 2002.
[2] Sir Horace Lamb, ‘Mathematical Theory of the Motion of Fluids’, 2nd Edition (1932).
[3] T. W. Neely et al., ‘Observation of vortex dipoles in an oblate Bose-Einstein condensate’, Dec 2009, http://www.citebase.org/abstract?id=oai:arXiv.org:0912.3773.
[4] Jie-Zhi Wu, Hui-Yang Ma, Ming-De Zhou, Vorticity and vortex dynamics (2006), pp. 288-291.
[5] Y. Afanasyev, ‘Investigating vortical dipolar flows using particle image velocimetry’, Am. J. Phys. 70 (1), January 2002, section III, data analysis.
[6] J. Kristian Sveen et al., ‘Quantitative Imaging Techniques and their application to wavy flows’, Jan. 22nd 2004, Dep. of Mathematics, University of Oslo, Norway, jks@math.uio.no.
[7] P.G. Saffman, Vortex dynamics, Cambridge University Press, Ch5, Ch6.
[8] G. S. Deem, N.J. Zabusky, ‘Vortex Waves: Stationary ‘V states’, interaction recurrence and breaking’, Physical Review Letters, Vol. 40 N.13, Dec. 1977.
[9] Cortelezzi et al., ‘On the formation of the counter-rotating vortex pair in transverse jets’, J. Fluid Mech. (2001), vol. 446, pg. 347, Cambridge University Press.
[10] F. Fleming et al., ‘Vortex Pair Excitation near the Superconducting transition of Bi2Sr2CaCu2O8 Crystals’, Physical Review Letters, Vol. 62, No. 6, 18 Dec 2009.
[11] Y. Afanasyev, ‘Formation of vortex dipoles’, Physics of Fluids, 18 037103 (2006).



Social Studies panel


Dr. Cormac Forkan, NUIG (chair)
Dr. Anne Byrne, NUIG
Dr. Trevor Spratt, QUB
Dr. Katy Hayward, QUB
Prof. Mary Corcoran, NUIM
Dr. Ciaran McCullagh, UCC
Dr. Maire Nic Ghiolla Phadraig, UCD
Dr. Alice Feldman, UCD
Dr. Iarfhlaith Watson, UCD
Dr. Martin J. Power, UL
Dr. Brian Conway, NUIM
Aifric O’Grady, UCD

Judges’ comments

The central premise of this extremely thought-provoking essay is that famines in the main are man-made and not merely caused by the occurrence of food shortages due to natural disasters. In the first half of the essay, the author introduces the theories of famine and examines the way in which the entitlement and distribution of food, rather than food shortage, is often the underlying cause of famines. To copper-fasten this point, a case study from Bengal in 1943 is detailed, where two to three million people died despite there being an adequate availability of food in the area. The essay also suggests that famines are enmeshed in either direct or indirect political decisions. Consequently, political systems have often been intentional in creating famine conditions and using starvation as a mechanism of repression. This fact makes these government officials some of history’s worst criminals. In the second half of the essay, the author focuses on a case study of Holodomor in Ukraine in 1932-33. This illustrates that not only political economy and forced collectivisation, but also the intentional faminogenic behaviour of Stalin and a small group of his government officials, caused devastating starvation and the deaths of millions of people. This case moves the study of famine into the field of international law. Thus, the UN’s recognition of Holodomor not only as a crime against humanity, but indeed also as genocide, can be regarded as appropriate and justified.



Social Studies

Famine: a crime against humanity? Or genocide? Death by starvation – Holodomor in Ukraine 1932-1933

Renate Stark

Introduction

“Ethiopia plans ‘Live Aid’ appeal – Millions of people face starvation following drought, floods and policy failures – Drought hitting the Western Sahel region.” (BBC News 15/08/2007) “Millions in Africa face starvation because of failed harvests.” (The Guardian, 11/08/2008) “Starvation in Kenya due to drought.” (IHF 06/2009)

These quite recent headlines certainly indicate that the myth “Africa has drought, drought causes failed harvest, failed harvest equals famine” is still deeply engrained in today’s popular perception of how famines develop. However, it has been established by many scholars that famines are, in the majority of cases, man-made and not merely caused by the occurrence of food shortage due to natural disasters. In this paper, I will commence with a brief overview of certain theories of famine, focusing particularly on the entitlement theory established by Sen and on Devereux’s elaborations on famine and government policy. In particular, I will elaborate on the factor of collectivisation and famine, as it is widely documented and investigated that the collectivisation process carried out by communist countries had a counterproductive effect, causing widespread opposition, rebellion and resistance in the peasantry, and thus, in many cases, led to famines (Bernstein 1984, Devereux 1993, Lin 1990, Livi-Bacci 1993, Tauger 2005). A further focal point will be the concept of famine crime against the background of international law, a field in which the professor of law David Marcus (2003) has provided essential and valuable research and material with his classification of ‘faminogenic behaviour’ into different stages that, under certain conditions, can be regarded as a crime against humanity and indeed as genocide.

Based on this theoretical framework, I will present a specific example: the case of Holodomor, the famine that occurred in the Ukrainian Soviet Socialist Republic (SSR) and Kuban in the early 1930s. Holodomor has long been described as a result of adverse government policy during the first period of Stalin’s collectivisation programme, due to which millions of men, women and children lost their lives over a space of no more than 18 months. However, in recent years the discussion has entered a new level, as the question has been raised whether Holodomor should be treated as more than just a result of forced collectivisation: Ukrainian nationalists, led by Viktor Yushchenko, the President of Ukraine from 2005 to 2010, and supported by a number of scholars from a variety of disciplines, have initiated a broad debate over whether the Ukrainian famine should be placed in the realm of crime against humanity and indeed be recognised as genocide. This would be a significant move towards a case of famine crime in international law, and a call to trial for those who were responsible for the implementation and execution of faminogenic government policies.

THEORIES OF FAMINE

In the scholarly literature and research on famine and its causes, it has long been determined that, contrary to the popular perception of famine as a consequence of natural disaster, drought, flooding or sheer misfortune affecting only developing countries, other, more humanly contrived factors are implicated in the cause of most famines. Many different definitions and theories of famine have been established that can be applied to particular situations. However, all more recent theories of famine concur in the view that famines, rather than being caused by an unforeseeable natural calamity, are actually man-made disasters. This factor can be identified repeatedly, especially when analysing the big famines of the 20th century in Sub-Saharan Africa, Asia and the Soviet Union. One element of this popular myth that has been particularly and increasingly challenged is the comparison, and indeed equation, of famine with ‘shortage of food’. With his essay ‘Poverty and Famines: An Essay on Entitlement and Deprivation’ (1981), the scholar Amartya Sen pioneered the claim that famine is a problem of entitlement and distribution of food rather than of food shortage (see Devereux 1993, Osmani 1995). He developed his theory with the example of the famine that occurred in 1943 in Bengal, where two to three million people died despite an adequate availability of food in the area. The problem was the imbalance in food distribution, resulting in a failure of food entitlement. Sen’s entitlement approach has been analysed, further developed and classified by Siddiq Osmani (1995). The development economist Stephen Devereux, in his comprehensive book Theories of Famine (1993), provides further definitions of famine and the various approaches to famine analysis.



For the purpose of this essay, I am going to concentrate on Devereux’s representation of the political economy of famine in part 3 of the aforementioned book, with a specific focus on famine and government policy. Against this background I will furthermore discuss famine in the context of international law, where it is considered a crime against humanity and indeed genocide.

FROM GOVERNMENT POLICY TO FAMINE CRIME AND BACK

Most of the recent famines did not occur due to overpopulation and a subsequent lack of food, or as a result of natural disaster. Rather, Devereux argues, they ‘always include elements which are either directly political – a deliberate act of political will – or indirectly political – a failure to intervene to prevent famine, or famine as an unintended by-product of government policy’ (Devereux 1993:129). Depending on the type of government policy and contribution, he identifies four sub-categories in this area, namely: the implementation of inappropriate or deliberately harmful policies; the failure to intervene and prevent famine; famine as a by-product of war or civil unrest; and finally, the intentional creation of famine conditions, ‘using starvation as a mechanism of repression and subjugation’ (Devereux 1993:130).

David Marcus has classified these intentional governmental behaviours to create famine conditions as ‘faminogenic’ – a term coined by him – and argues that those government officials ‘should be considered some of history’s worst criminals’ (Marcus 2003:245). He grades faminogenic behaviour in four degrees, fourth-degree faminogenic behaviour being ‘the least deliberate’, marked by ‘incompetent or hopelessly corrupt governments’ (Ibid:246), while he defines first-degree faminogenic behaviour as an intentional act, where ‘Governments deliberately use hunger as a tool of extermination to annihilate troublesome populations’ (Ibid:247). With his research, Marcus aims to achieve formal criminalisation of first- and second-degree faminogenic behaviour, the latter being described as the reckless pursuit and continuation of government policies that engender famine ‘despite learning that they are causing mass starvation’ (Ibid:247).

He concurs with and builds on the argument of Abdullahi El-Tom who, more than a decade ago, already spoke of a necessary move towards the concept of ‘famine criminals’ in order to bring those responsible for famines to justice, in a similar way to what is already common practice with war criminals (El-Tom 1995).

The government policies introduced in the course of collectivisation in a number of countries of the Eastern Bloc have been held responsible for recent famines that emerged as a consequence of either internal war or state policy: in the Soviet Union, accounting for four famines ‘within thirty years of the 1917 Revolution’ (Devereux 1993:130); in China, where the most dreadful example is the enormous famine during the years of the Great Leap Forward from 1958-61; in Cambodia, due to civil war and the subsequent repression under the Khmer Rouge in the mid-1970s; in Ethiopia in 1984 after collectivisation; or, most recently, in North Korea, where a famine occurred in the 1990s under the regime of Kim Il-Sung and stretched over nearly a decade. This implies that the collectivisation process itself, and thus the governments that implement it, are largely to blame for the hardship that they, be it knowingly or unwittingly, place on their people. Certainly the most affected are the farmers or peasants themselves, despite being at the very frontline of food production.

In the process of clarifying the theory of government policy, in particular regarding collectivisation, Devereux concentrates on two cases of famine: in Russia in the early 1930s, and in China, where the devastating famine during the Great Leap Forward years from 1958-61 can be regarded as the ‘worst anywhere in human history’ (Devereux 1993:143). Russia and China were both traditionally described as famine-prone countries. This was mainly explained by their natural disasters, their harsh climatic conditions, poverty in rural areas and the isolation and difficult accessibility of the affected regions. However, on closer investigation, all of the famines that occurred in these countries over the course of the twentieth century can quite safely be regarded as man-made and indeed as a direct consequence of government policy and the failure or unwillingness of the leaders to prevent them or at least intervene (Devereux 1993:140-147). To underline his argument, Devereux quotes Clay and Holcomb:

Famine has resulted primarily from government policies that have been implemented in order to accomplish massive collectivisation of agricultural production and to secure central government control over productive regions of the country where indigenous peoples have developed strong anti-government resistance. (Devereux 1993:136)

While their main focus was the Ethiopian famine, this statement can certainly be applied to any other case of famine as a consequence of collectivisation. To demonstrate the processes described above, I am going to concentrate on one specific example of famine and government policy that has come to the fore especially over the last decade: Holodomor.

HOLODOMOR, THE UKRAINIAN FAMINE IN 1932-33

The devastating famine that raged in the Ukrainian SSR during collectivisation under Stalin has been named Holodomor by the Ukrainians, ‘from moryty holodom ‘to kill by means of starvation’’ (Makuch and Markus 2009). It began at the start of 1932 and lasted until the autumn and winter of 1933; in fact, some scholars suggest that it only really tapered off in the first half of 1934 (Dalrymple 1964, Devereux 1993, Serbyn 2006). The number of people believed to have died during this period varies greatly between academic sources, a fact which can be ascribed to insufficient data or improper use of statistics. However, estimates range between five and ten million deaths due to starvation or other famine-related illnesses, more than one third of them children (Serbyn 2006:181, also Conquest 1986, Dalrymple 1966:259, Livi-Bacci 1993, Marcus 2003, Zakharov 2008). As historical background, it is worth mentioning that Ukraine has often been hailed as the ‘bread basket’ of Russia due to its very rich soil, known as the Ukrainian ‘black earth’. This explains why the Ukrainian SSR was one of the most valuable economic components of the former USSR. It generated more than one fourth of Soviet agricultural output and provided, besides grain, substantial amounts of vegetables, meat and milk
to other Soviet regions. After the destruction of the Tsarist autocracy in 1917 and the creation of the Soviet Union, the Bolshevik government was well aware of the importance of Ukraine, where nationalists strongly aspired to the formation of an independent state. In order to pacify those nationalist movements, Lenin introduced certain reforms for the ‘Ukrainisation’ of this region and of the Kuban area in the North Caucasus, whose population was in the majority Ukrainian, which enabled a nationalist Ukrainian community spirit to grow. Thus, contrary to communist doctrine, the traditional agricultural concept of private farming and landownership persisted, as did the Ukrainian Orthodox church and other traditional religious orientations. Furthermore, the native Ukrainian language was introduced throughout the education system. However, in 1929 Stalin, besides trying to speed up the industrialisation process in the USSR, abolished these reforms in order to push towards pure communism. He implemented de-kulakisation¹ and collectivisation, particularly targeting the Ukrainian kulaks, who, in addition to being economically suspect to him, also posed a threat due to their expressed nationalism. The collectivisation process entailed the dissolution of privately run farms – consistent with Marxist ideology and the call for the abolition of private property – and their transformation into large state-run kolkhozes (collective farms), together with the imposition of exaggeratedly high grain procurement quotas (Marcus 2003: 253). To pre-empt resistance and any unrest arising from it, the Central Committee assigned twenty-five thousand young urban communists to the villages, each of them accompanied by armed troops, to compel the change-over.
¹ De-kulakisation: elimination or ‘liquidation’ of the kulaks, middle-class and private farmers who were labelled as class enemies in the Soviet Union, and introduction of collectivisation. It is believed that millions of peasants were arrested, deported and executed in 1930-32.

Everything that existed in the kolkhozes passed into government ownership and no private property was left to individual peasant workers, which disincentivised them and reduced their performance. Many peasants manifested their resistance by slaughtering their livestock, selling stock and machinery, and showing very low work motivation. At the same time the regime aimed to undermine the cultural bearers of the Ukrainian nation, targeting the Ukrainian intelligentsia, religious leaders and, at its base, the peasantry. According to Stalin’s ‘first commandment’, introduced during the first year of collectivisation, any kolkhoz ‘had first to settle with the State according to the quota issued from above’ (Zakharov 2008) and only after that would the workers be remunerated. However, as early as two years before the famine, the kolkhozes were unable to meet the Central Committee’s target and suffered acute grain shortages, and were thus unable to recompense their workers. Additionally, the quota for the following year was artificially increased to speed up production and motivate the workers to reach at least their prior quota. By June 1931, Ukraine had reached its quota for 1930, but only by emptying all its grain reserves and leaving the kolkhoz workers weakened and malnourished. Instead of taking this into account, the regime increased the quota for the 1931 harvest to an impossibly high level. At the end of 1931, not a single village in Ukraine had a chance of achieving the high grain quota set by the government. The situation for the villagers deteriorated
with a resolution in December that all supplies of goods to rural areas should be stopped. In addition, the Central Committee considered ‘any grain found in a peasant’s home [as] a priori […] squandered or stolen’ (Zakharov 2008:32), and thus it was mercilessly confiscated. By spring 1932 the scarcity of food was developing into real famine, not only in Ukraine but also in several other agricultural areas of the USSR. Only after the report of an agricultural commissioner, who predicted that without assistance the workforce would not be able to cope with the coming harvest, did the state begin to send provisions to the rural areas, also returning grain that had been designated for export to those in need (Zakharov 2008:33). But this support lasted only a very short time and was discontinued again at the end of June. What followed was another, even more serious famine, induced similarly by the inability to fulfil the disproportionately high grain procurement quotas imposed on a people who had hardly recovered from their previous hardship. In Ukraine and Kuban the situation was additionally exacerbated by the reintroduced confiscations and a complete cessation of any food deliveries from outside after a resolution from Stalin in January 1933.² The ‘Law on the Inviolability of Socialist Property’, also widely known as the ‘5 ears of wheat law’ (Zakharov 2008:41), condemned anyone who was even suspected of having purloined an ear of grain to death or, in milder circumstances, to a prison sentence of at least 10 years (see also Marcus 2003:253). The Party and leadership of the USSR increased state repression even further by prohibiting the Ukrainian populace from leaving their region in search of food, reintroducing an internal passport system and denying such a passport to any worker who was in arrears with quota fulfilments. Not surprisingly, this led to devastation and widespread mass starvation.
During the first half of 1933 it is believed that millions died in Ukraine, and hundreds of thousands in Kuban.

² Excerpt from Zakharov (2008) pg. 41: On 1 January 1933 the UkrSSR leadership received the following telegram signed by Stalin: “Be informed of the Central Committee Resolution from 1 January 1933: “Suggest that the CPU and the Council of People’s Commissars of the UkrSSR widely inform, via their village councils, kolkhozes, kolkhoz workers and working individual farms that: a) those of them [who] voluntarily hand over to the state grain previously stolen and hidden from inventory, shall not be repressed; b) with regard to kolkhoz workers, kolkhozes and individual farmers who stubbornly persist in hiding grain previously stolen and hidden from inventory, the most severe measures of punishment set out in the Resolution of the Central Executive Committee and Sovnarkom of the USSR from 7 August 1932 “On the protection of property of state enterprises, kolkhozes and cooperatives, and the consolidation of socialist property” will be applied.” The telegram notified the peasants that they must hand over all grain and if they don’t do this, they faced blanket searches aimed at rooting out “grain stolen and hidden from inventory”. If grain was found, punishment would be according to the “5 ears of wheat law” (death penalty or no less than 10 years deprivation of liberty), and if none was found, there would be a fine in kind, that is confiscation of meat, including “in live” weight, and potatoes.

A further devastating factor was the political repression carried out by the Central Committee of the Communist Party with the aim of suppressing any Ukrainian nationalist revival. Leading Ukrainian academics and teachers, as well as well-known writers and even leaders of the Ukrainian branch of the Communist Party, were falsely accused of conspiracy and executed or sentenced to 10 years in prison. In March 1933, thirty-five civil servants of the Commissariat for Agriculture were executed after less than a day on trial for the most ludicrous accusations, such as having ‘wilfully permitted noxious weeds to grow in the fields’, or ‘encouraging the spread of meningitis among horses’ (Zakharov 2008: 45). In fact, they were officially used as scapegoats, blamed for having deliberately caused the bad harvest, and thus starvation, bringing the USSR into international disrepute. Ukrainisation was completely abolished, the use of the Ukrainian language forbidden, and the Central Committee focused instead on forced Russification. To increase the number of working hands that were urgently needed in the kolkhozes for the harvest in 1933, peasants from other regions of Russia were strongly encouraged to settle in the Ukrainian villages left empty or half-empty by famine deaths, and in the North Caucasus, an area which was also inhabited by Ukrainians and equally devastated. The various Ukrainian associations and committees dealing with Holodomor agree that at the height of the famine villagers were dying at a rate of 25,000 per day. They also suggest that the Ukrainian population may have been reduced by up to 25%. They especially point out that up to 80% of Ukraine’s intellectuals, over 200 Ukrainian authors and 62 linguists perished, were liquidated or disappeared.
As mentioned earlier, between five and ten million Ukrainians are believed to have been starved to death.³ An exact overall death toll is nearly impossible to establish, although the numbers will certainly have become more accurate since the opening of the Soviet archives over the last decade. However, on this matter I ultimately agree with Massimo Livi-Bacci, who states in his article On the Human Cost of Collectivisation in the Soviet Union (1993: 743) that: ‘Numbers have often been improperly used to underline ideological points of view, as if imputing 5, 10, or 15 million additional deaths to the policies of forced industrialisation, collectivisation, and the liquidation of rich peasants would alter the nature of the political responsibility of Stalin’s regime.’

³ Robert Conquest, in the chapter ‘The Death Roll’ of his book The Harvest of Sorrow: Soviet Collectivisation and the Terror-Famine (1986), a thorough work on Holodomor, scrutinised the data available to him at the time and came to the following result:
Peasant dead, 1930-37: 11 million
Arrested in this period dying in camps later: 3.5 million
TOTAL: 14.5 million
Of these:
Dead as a result of dekulakisation: 6.5 million
Dead in the Kazakh catastrophe: 1 million
Dead in the 1932-3 famine: 7 million
  In the Ukraine: 5 million
  In the N. Caucasus: 1 million
  Elsewhere: 1 million

ANALYSIS APPROACH TO HOLODOMOR – GENOCIDE DISCUSSION

Devereux’s government policy theory is certainly one appropriate approach to analysing Holodomor. The number of policies, decrees and resolutions by the Central Committee of Stalin’s regime that created and advanced the famine, and indeed caused millions of deaths, has been described in the preceding section of this essay. Additionally, Holodomor could also be analysed within the frame of Sen’s entitlement theory. He argues that in areas of famine there always remain people who have plenty of food; it is never the whole population that is left to starve. In the case of Holodomor it has been described that there were emergency supplies of grain and potatoes in Ukraine, guarded by heavily armed Russian soldiers, while the villagers in the surrounding area starved to death (Marcus 2003:254). There is therefore evidence that there was no absolute lack of food supply, but that the farmers, who were the ones producing the food, were not entitled to avail of it. Holodomor has indeed raised scholarly interest in a variety of disciplines. It has been interpreted as a man-made famine due to adverse government policy by the economist Stephen Devereux (1993); as a result of the pressures of collectivisation under strict Stalinist rule by political historians such as Dana Dalrymple (1966) or the Professor of Demographics and Political Science Massimo Livi-Bacci (1993); and as a famine crime in consequence of Stalin’s ‘faminogenic behaviour’ by David Marcus, Professor of Law (2003). However, in more recent years, especially in the last decade, Holodomor has entered a new level of discussion with the debate around the endeavour of the Ukrainian nation to achieve international recognition of Holodomor as genocide before the UN. This call is led by the present President of Ukraine, Viktor Yushchenko, supported by historians like Roman Serbyn (2006) and Robert Conquest (1995), and by the many associations and committees of Ukraine and the Ukrainian diaspora.
It brought Holodomor into the domain of international criminal law and, if accepted, would make this famine a pioneering case for El-Tom’s call towards the concept of famine criminals. Those responsible for first- and second-degree faminogenic behaviour, as described by David Marcus, might finally be brought to trial. The concept of genocide in relation to the Ukrainian famine of 1932-33 was in fact first described by the Polish-Jewish scholar and lawyer Raphael Lemkin, who coined the term ‘genocide’ in 1944. He defined the Ukrainian famine as genocide as early as the 1950s, in the last chapter of his unpublished book ‘History of Genocide’. He also delivered an address at the commemoration of the 20th anniversary of Holodomor in New York in 1953, which was published as a book only very recently, on 25th November 2009: ‘Raphael Lemkin: Soviet Genocide in Ukraine’ (Williams 2009). In his introductory article to this new publication, Dr. Roman Serbyn acknowledges the importance for Ukrainians of Lemkin’s work as the ‘father of the Genocide Convention’ and admires his analysis of the Ukrainian famine and his convincing ‘demonstration that it was a genocide in accordance with the principles and criteria of the UN Convention’ (Ibid).

Because it implies intentional action against a culturally coherent group, the designation of genocide brings together, in the study of Holodomor, the discipline of international criminal law and the realms of history and anthropology. Not only did Stalin try to obliterate Ukrainian culture through intentional starvation, he also persecuted and killed bearers of Ukrainian cultural memory, thus attempting to destroy the very foundations of Ukraine’s cultural heritage. He dismantled education by forcibly aligning the school system with the Russian system, and attacked the intelligentsia, academics, linguists and writers, as well as the religious culture of the independent Ukrainian Orthodox Church and its clergy. Furthermore, the traditional agricultural concept in Ukraine was quite distinct from customs in other parts of the USSR. It consisted of private farming and landownership, and the farmers were major local providers of agricultural production with proficient economic and labour traditions. Therefore, the forced change to collectivisation hit the Ukrainian peasants and farmers exceptionally hard and was understandably met with a high level of resistance. In the legal frame of this discussion the argument has been raised that there is no document or decree as such, issued by the Soviet government and the Central Committee of the Communist Party, that explicitly stated ‘an order to kill with famine a certain number of Ukrainians or other peasants’ (Antonovych 2008: 2). Thus, in legal terms it could not be seen as proven that the famine of 1932-33 was indeed thoroughly planned. New research, however, has established that this intent was sufficiently expressed by many legal acts aimed specifically at the Ukrainian population. Antonovych, for example, mentions the system of black boards that was established in Ukrainian villages of Kuban and extended over the rest of Ukraine by 18th November 1932 (2008: 4).
The earlier mentioned ‘Law on the Inviolability of Socialist Property’, also widely known as the ‘5 ears of wheat law’, was implemented despite the Soviet government’s knowledge that it would lead to severe starvation and famine. Another act aimed solely at Ukrainian villagers was the order, issued by Stalin in January 1933, banning Ukrainian peasants from leaving the territory of the Ukrainian SSR and Kuban in order to search for food (Antonovych 2008: 4). Thus, over the last decade Holodomor has become a matter dealt with by scholars of international criminal law, who have officially stated that between ‘5 and 10 million Ukrainians were starved to death as a result of brutal enforcement of excessive grain-procurement quotas’ set by the Soviet government (Ibid:8). Since then, it has been convincingly established that the famine was deliberately caused by the instrumental use of government policies, and that those implementing them, Stalin and a small group of other officials around him, were therefore responsible for committing a crime against humanity and indeed intended genocide to cripple the Ukrainian nation. In particular, the new publication of Lemkin’s view of the Ukrainian famine gives the proponents of the genocide designation a new, strong basis, since Lemkin is the father of the concept of genocide and thus an authority on it. He described the attack against the Ukrainian intelligentsia and the near liquidation of the Ukrainian Orthodox Autocephalous Church as an offensive against the very soul of Ukrainian culture, and thus an act aimed at the annihilation of the Ukrainian ethnicity, and therefore genocide.

CONCLUSION

This paper has demonstrated that famines, contrary to the outdated popular belief that they are haphazardly caused by adverse climatic conditions and generally afflict developing countries, are more often the consequence of man-made factors. It has been established that adverse government policy and unequal food entitlement play a major role in the occurrence of famine, factors that could, after all, be prevented. Furthermore, it has been determined that in particular cases a certain intentionality has sadly been involved. This moves the study of famine into the field of international law, crimes against humanity and genocide. The Holodomor in Ukraine in 1932-33 is a sinister example of an artificial and undoubtedly intentional induction of famine through adverse government policies. It illustrates that not only political economy and Stalinist government policy in the form of collectivisation, but indeed the intentional faminogenic behaviour of Stalin and a small group of his government officials, caused devastating starvation and the deaths of millions of people. This now verifiably artificial creation of famine conditions can indeed be regarded as a purposeful and calculated attempt to eradicate a whole people, as in the case of Holodomor, through death by starvation. In this context the UN recognition of Holodomor not only as a crime against humanity, but also as genocide, can be regarded as appropriate and justified.

Teacher Education panel

Michael Cotter, DCU (chair)
Dr. Paul Conway, UCC
Prof. Gary Granville, NCAD
Keith Johnston, TCD
Prof. Carol McGuinness, QUB
Sinéad Ní Ghuidhir, NUIG
Pádraig Ó Murchú, Intel Ireland

Judges’ comments

This is a very well constructed essay with a natural and easy flow. It is a personal story written in language which is very accessible and reveals deep learning. It is hopeful and real; it reaches out to the reader and the insights and experiences of both writer and subjects are very interesting, lingering with the reader long after. The author has managed to chart a very difficult course between personal narrative and reflection on professional practice. The essay avoids the dangers of a merely anecdotal recounting of an education project. Instead, it makes imaginative use of personal experiences within a well-constructed narrative. The paper is also well constructed in its selective use of a small number of well-chosen references drawn from the fields of learning theory (Wenger), art education (McGonagle et al.) and philosophy (O’Donoghue). The perspective of the author displays a maturity of experience and a capacity to reflect upon experience in a coherent and critical manner. The concept of a journey underpins the paper. This concept is somewhat hackneyed in contemporary discourse, but the author uses the concept in a very simple and effective manner, displaying a confident ownership and interpretation of the essential idea without labouring it. The paper is a reflection on an experience that took place some years ago, earlier in the professional formation of the author. Yet the narrative is very contemporary and the presentation of the ‘journey’ is very effective, both in terms of visual documentation, which adds considerably to the effect, and in the incorporation of references from the literary and visual arts to illuminate the reflections. It is a highly impressive presentation of a reflective practitioner and a model of learning in a professional formation context. It describes a journey based on personal action, reflection and self-discovery.
It honestly addresses issues of inclusion, risk-taking, creativity and self-expression and is very candid in its appraisal of results.

Teacher Education

From one heart to another: using visual arts as a medium of self-expression will activate an individual’s self-discovery

Nuala Finnegan

Introduction

For the purposes of this assignment I intend to bring you on a retrospective journey; its beginnings spring from the creative expression of honesty and identity, culminating in the discovery of the self as part of a collective and the connections therein. I hope that I, the writer, will succeed in bringing you, the reader, along with me. ‘Each day we journey from one place to another… Being conscious of your journey changes that journey forever more.’ (Lewis, R. 1996). In 1991, I was employed by St Michael’s House as a Vocational Trainer. St Michael’s House was then, and still is, a large organisation providing services for people with intellectual disabilities such as Down syndrome. The main centre for vocational training was Ballymun Training Centre; it catered for young adults aged 18 or 19 who had completed their primary/secondary education in the special school located on the same site. The Foundation Skills course was a three-year course with the first two years completed in Ballymun and the final year in Prussia St. The Prussia St. unit was based in a Community Centre in Dublin 7 near Stoneybatter. It catered for students with mild to moderate intellectual disabilities who could travel independently. This is where I was employed. One of the programmes I was responsible for delivering was Personal Development. Not being held to a particular curriculum meant there was lots of flexibility when it came to programme content. My training background was in community arts and drama and I was enthusiastic about being able to put my arts skills to good use in my new job. As part of my tutoring brief I also conducted Arts and Crafts classes with the trainees. ‘Arts and Crafts’ was an enjoyable activity but sometimes a disappointing one. If a piece of work did
not turn out how it was expected to, it had an adverse effect on the group as a whole in terms of participation. This prompted me to search for an alternative, more open approach to arts work that did not require a right or a wrong outcome. A criteria-free approach would offer more chance of participation, satisfaction and inclusion. “Inclusive education and art make a good match because art doesn’t have to be intellectual or conceptual. Everyone can approach art on their own terms.” (Van Deursen, N. 2009). In 1992, I attended an exhibition in the Irish Museum of Modern Art (IMMA), Kilmainham, called Unspoken Truths. It was a very powerful exhibition about the lives and experiences of women from two areas of Dublin: Inchicore and Sean McDermott St. Encountering that exhibition was like being invited into those women’s lives. Their pieces were so emotive and revealing that I was left with a heightened awareness of their struggle and survival. The process involved in producing the art works had enabled this group to have a voice and the finished pieces were imbued with its expression. So, a theory presented itself to me: my groups could use visual arts as a form of self-expression within the personal development programme. If these women could work with and in the museum, why couldn’t my students?

Context

At the time I did not know that I had come up with a theory; to me it was an idea that had the potential to provide the trainees with an opportunity for self-discovery. If this happened it would surely have a positive effect on the other modules that trainees attended as part of the vocational training programme in Prussia St., such as Catering/Canteen Skills, Employability Skills, Literacy and Sports. The community-based setting presented a challenge to some trainees; the integrated nature of the centre meant that their social interaction had to become more appropriate. In other words, behaviour such as shouting and screaming may have been tolerated in the safety of the Ballymun Training Centre but was not acceptable in the Prussia St. setting. On many occasions I observed that these students rose to the challenge; in fact they enjoyed it. Although I was the tutor/trainer tasked with teaching them, in my experience a reciprocal relationship existed: I learned just as much (if not more) from them. Reflecting on the centre and its culture now, I can identify that it was a community of practice with an emphasis on ‘learning as social participation’, as stated by Wenger (1998). Wenger’s writing brings to mind the idea that as humans we need a sense of belonging; as John O’Donoghue puts it, ‘To be human is to belong. Belonging is a circle that embraces everything’ (1998 pg. 3). The stated ethos of Prussia St. was to encourage young adults with a mild to moderate level of intellectual disability to become integrated at a community level and, implicitly, to persuade that community not to bestow favours or allowances on them because of their disability. The prospect of forming a link with IMMA embodied that ethos in terms of the St Michael’s House group being treated on the same basis as any other community group.

Another positive aspect of a proposed link with IMMA was its setting in the Royal Hospital Kilmainham. Spending time in the pastoral grounds of IMMA would make a refreshing change from the busy, urban Prussia St.

Implementation of Theory

Coming up with a theory is one thing, but if it stays in your head it will never be tested; action must follow, as Joel A. Barker says clearly: ‘Vision without action is but a dream, action without vision is only passing time, vision with action can change the world’. My colleague and key worker were interested in the idea and were keen to be involved. As there were only three staff working closely together in Prussia St. it was vitally important that the theory was supported by all. My next action was to write a letter to Helen O’Donoghue, senior curator of the community and education department at the Irish Museum of Modern Art (IMMA). Her reply was positive and she was interested in finding out more about the training centre and its participants. These were the first steps on our journey of discovery. IMMA welcomed the idea of working with young people with intellectual disabilities; they had never catered for this profile of group before and had lots of questions that needed answering. One of their main concerns (and ours) was health and safety; the museum environment, the art room where we would work, access to toilet facilities, the most suitable materials to use and any medical conditions to be considered were all discussed at length. These discussions were always animated and filled with a sense of positivity and purpose. Over several meetings a way forward was agreed:
IMMA would provide:
• A space.
• Materials.
• A trained artist and assistant.
• Access to the museum’s exhibitions.
St Michael’s House Prussia St. would:
• Get the trainees there.
• Liaise with the artist.
• Document sessions.
• Produce an end-of-project report.
• Attend meetings with IMMA when necessary.
It was also agreed that this pilot project would be voluntarily attended. We introduced the idea to the trainees and they all seemed eager to take part. There was a certain sense of excitement and enthusiasm generated about starting something different and new. Enthusiasm can be like a wave carrying everyone along and is a wonderful thing.

“A wonderful thing is this quality which we call enthusiasm… If you would like to be a power among men, cultivate enthusiasm.” (Ogden, Armour, J. 1917)
It was decided to split the group of twenty trainees into two and to work with each for three two-hour workshops over a period of six weeks. The first group started in December 1993. Using ‘The Self’ as a theme, the group drew life-size body maps of each other onto which they stuck faces expressing different emotions that were also represented by colour, for example: Blue=Sad, Red=Angry. At the end of the sessions each group member had created a portrait of themselves complete with expressions and feelings held within. For the duration of all three workshops the trainees worked very hard and their concentration levels were high throughout. The second group started their three sessions in January 1994. The main exhibition in IMMA at the time was by the British artist Antony Gormley, who uses his own body as a starting point for his work. Thinking of the body as a starting point for creating ideas sparked an interesting discussion about what human beings are made of. This group surprised us during this debate by demonstrating a clear understanding of the physical, mental and spiritual aspects that make us human. The group drew figures of a man and a woman on two large sheets of soft board. These were filled in with physical detail such as lungs, heart, brain, veins etc. On these, each participant pinned a piece of folded paper with a wish or a dream written on it. They also pinned on personal items like photos and other items that were important to them. Each figure was then outlined with pins and encased by weaving thread around the pins and across the figures. (See Fig. 1)

Fig. 1. Body Maps – Mixed Media on Soft board (Intersections 1996. pg. 27).

These pieces were named collectively ‘Song of Myself’ after the poem by Walt Whitman.

“I celebrate myself, and sing myself,
And what I assume you shall assume,
For every atom belonging to me as good belongs to you.” (Whitman, W. 1819-1892)

The project didn’t go without difficulty. Firstly, travelling to the museum could be very tiring for the group, so times had to be adjusted to minimise this. Tiredness also came into play if the work became intense (and it often did); a short snack break worked well in this case.

The main project these two groups worked on was based on their physical journey from the Prussia St. centre to IMMA. This journey, a ten-minute drive, involved a two-bus trip that sometimes proved frustrating to everyone. They mapped their trek on paper and highlighted some landmarks such as the bus stop, the bridge, the river and the train station. The paper map was then transferred onto squares of red clay that had been rolled out and smoothed. The clay squares were then cast with plaster. When the finished squares were joined together they formed two figures reaching out to one another. This piece was named ‘From One Heart to Another’, symbolising the heart of Prussia St. reaching out to the heart of IMMA and vice versa. (See Fig. 2 and appendix one). All of the trainees’ work was displayed as part of the ‘Intersections’ exhibition at the museum alongside the work of other youth and community groups. The 1996 group were invited to Denmark to work and exhibit with other similar groups (See appendix two for details).

Fig. 2. From One Heart to Another – St Michael’s House/IMMA. (Photo Derek Speirs 1994)

Findings

Throughout the five-year collaboration with IMMA a great deal of learning occurred and accrued. I was privileged to have witnessed the trainees’ response to this way of working. At the time I noted in my records:



“There was a great atmosphere created in the moment of finishing a piece. The effort, thought and care that went into the whole process cannot be emphasised enough. It is hard to explain or quantify the benefits of this process to each person’s development.” (Finnegan, N. 1994)

I am not claiming that doing visual arts is like magic, because it is not. Nor am I claiming to have practised art therapy, because I am not a therapist. The pieces were not diagnosed but they were reflected on. Perhaps this purposeful reflection, acting like a mirror, was how the group developed a greater sense of themselves and their place in the world. They did not hesitate; they immersed themselves in the work. The finished product was not the main focal point; the process involved in getting there was. Natalie Rogers developed expressive arts therapy based on the Theory of Creativity of her father, Carl Rogers (1961 Ch. 19). What she says here about the value of the creative process could aptly be applied to the St Michael’s House trainees.

“The creative process is a life force energy. If offered in a safe, empathic, non-judgmental environment, it is a transformative process for constructive change.” (Rogers, N. 2002)

One might ask, ‘Why go to the museum to do art? Why not stay in Prussia St?’ My answer is firstly, ‘Why not?’ Secondly, the opportunity to use the museum as a resource and as a place where personal discovery and development through visual arts could take place was one that could not be missed.

‘While we speak, envious time will have {already} fled; seize the day, trusting as little as possible in the next’. (Horace 65-8BC)

The group took full ownership of their artwork; they made the space their own. I could not have foreseen the success of this project. The word ‘success’ is used guardedly because, after all, how does one measure it? For the museum it was successful because they gained experience and insights into working with people who have intellectual disabilities. Currently, they have strong links with St. John of God, Carmona Services and Sunbeam House, whose groups access the museum through the Education and Community Department. For the participants, success had different guises. Whether it was the chance to try a different activity, show their work to family and friends, travel to Denmark, or simply hear positive words about their work, whatever form success took for them was enough. The knock-on effects in terms of improved concentration levels and self-esteem were also commented on by tutors of other modules such as catering and employability skills.

Benefits for me came in various forms too; the most poignant could not really be described as success but rather as a reflective awareness. This group of individuals, who are classed as intellectually disabled, made me aware on a deep level of how complicated we ‘able’ people make things. Their ability outdid their disability in terms of applying themselves to the tasks, and the ‘straight to the point’ way in which they viewed and commented on artworks in the gallery left us gobsmacked. The phrase that comes to mind is an often used one: ‘For now we see through a glass, darkly’ (1 Corinthians 13:12). My collaboration with artist Lisa Weir meant I learnt how to use different types of media, for example clay, plaster, print and collage. I transformed from being a trainer into an arts facilitator.



My interest in arts and disability led me to be involved in a focus group connected with the Camphill Community in Kilkenny and Kilkenny Collective for Arts Talent (KCAT). This group have succeeded in opening an inclusive arts education centre in Callan, Co. Kilkenny, where artists of all abilities work and learn together.

Conclusion

Hindsight is always 20/20, and in reflecting on this initiative with the benefit of a considerable amount of time in between, I regard this project as a period of huge growth, both personally and professionally. Whenever a new theory is initiated, no matter what it is, a journey of sorts occurs. Something that can seem as simple and benign as turning left instead of right can present life-changing opportunities. If the initiation of an idea or theory involves people, especially vulnerable people, it is incumbent on the ‘theorist’ to take on the responsibility associated with such a task. It was my pleasure to accept this responsibility and lead the trainees on their journey of discovery.

Recommendations

A number of recommendations were made as a result of an evaluation meeting with IMMA at the end of the main project. It was agreed that the link with the museum was an extremely positive one for all concerned and it was hoped that St Michael’s House would continue to use the museum as a resource. Other recommendations included:

Working with smaller groups
It was agreed that smaller groups would benefit everyone, especially those who needed one-to-one assistance.

Awkwardness of journey
Although journeying to and from IMMA provided the impetus for ‘From One Heart to Another’, there are two things we could have given more thought to:
1. Fundraising to cover the cost of coach hire.
2. Using in-house transport if available.

Project ideas
Keep project ideas in sync with participants’ ability (this was practised but a more purposeful awareness would improve the experience for all).

Fast Friends
It was proposed that St Michael’s House would engage with a programme called ‘Fast Friends’ (a research project involving the integration of mainstream and special school pupils) with Michael Shevlin from Trinity College.

Most, if not all, of these recommendations were taken on board. In 1998 the Prussia St. Centre was closed due to diminishing student numbers and the IMMA-Prussia St. link ended.



Irish – Winner 2009

Cillíní Páistí: briseadh croí faoi rún
(Children’s Burial Grounds: Heartbreak in Secret)

Philomena Ní Flatharhta

‘Faoin gcré ag an gclaí teorainn tá mo naoináinín na luí,
a hadhlacadh go rúnmhaireach i ndorchadas na hoíche’
‘gan aon duine le thú a chaoineadh
ach an ghealach ‘s an ghaoth’...

[‘Under the clay at the boundary wall my little infant lies,
buried in secret in the darkness of the night’
‘with no one to keen you
but the moon and the wind’...]

This is a song composed by Luisne Ní Neachtain for the radio programme ‘An Claí Teorainn’ (‘The Boundary Wall’), broadcast on Raidió na Gaeltachta last November. To my mind, this piece conveys all the emotions associated with the traditions of child burial and with stillbirth: the tears, the loneliness, the heartbreak and especially the secrecy that surrounded it down to the second half of the 20th century.

Aim

The aim of this essay is to analyse the information gathered to date on the cillíní páistí (children’s burial grounds), on their historical origins and on the community’s customs and understandings concerning them. This will be done using sources compiled by various experts on folklore, the research of archaeologists, a sociologist’s view of the matter and testimony from the contemporary community on the subject.

Terms

Various terms are used around the country to describe the sites used for the burial of children: teampaillín, cillathán, ceallúnach, cahir, ráthanna, liosanna. The terms cill, cillín and seanchill were the most common, since the sites were often situated in the shadow of old churches abandoned in the Middle Ages (O’Connor 2005:73). There are 72 townlands in Ireland named ‘Cillín’, and a further 98 townlands with ‘Cillín-’ as the first element of their name, according to Ó Súilleabháin’s research (Ó Súilleabháin 1939:146). The word ‘cill’ comes from the Latin word ‘cella’, meaning a small church of the early Church (Crombie 1988:150).

What are the Cillíní?

Seán P. Ó Ríordáin (1979) held that the ‘cillíní’ were the places where unbaptised children were buried, and he gave evidence that they were of pre-Christian date; according to Dr T. Fanning (1981:6) likewise, ‘ceallúnach’ means the place where unbaptised children were buried. Many of them died, in his view, before there was an opportunity to baptise them, and so they did not merit Christian burial according to the Church and could not be buried in consecrated ground. According to the folklore associated with these cillíní, unbaptised children were not the only group who were ‘outside’ the mainstream of community life; this category extended to other groups who had strayed from convention in the community, and they too were therefore buried in these cillíní (Finlay 2000). The ‘painless darkness’ (dorchadas gan phian), as it is called in the west of the country, or ‘Limbo’, was what awaited those buried in these cillíní: the in-between place, belonging neither to this world nor to the next. This spiritual state of ‘Limbo’ was further emphasised physically in the sites chosen for the cillíní, for example the boundary walls between two holdings (Finlay 2000:413). In places, particularly in the west, it was common to use the strip of land between the wall of the lowest field and the sea. This ‘being out of joint’ is a universal theme in folklore (Crombie 1988:151). According to the Church, they will not see the light of Heaven until ‘Lá na Cinniúna’ (the Day of Judgement) comes (An Claí Teorainn, 2009; O’Connor 2005:67). There are many customs and rituals associated with child burial in Ireland, but they are not confined to Ireland alone. It is interesting that some of the roots of this tradition are found in other European cultures: according to Seán Ó Súilleabháin (1939:145), its counterpart is found in the ancient cultures of Rome and Greece, for example, and Pentikainen (1969) describes parallels to this tradition in the cultures of Norway and Finland as well.

Origins of the Cillíní Páistí

One of the earliest sources on this subject is Book VI of the Aeneid by Virgil (c. 70-19 BC), which refers to two groups who wander between the bank of the river Acheron and the centre of Hades: children who died very young, and those who for some reason did not attain the normal span of life. ‘La Divina Commedia: Inferno’ describes Dante hearing the lamentation of the people and children in the darkness, from which they could not escape because they had not been baptised (Crombie 1990:13). In pre-Christian times the Romans had a special burial place for infants who died under 40 days of age, called the ‘Suggrundarium’, but in this case it was a place of honour for the children. The earliest written source comes from Juvenal (c. 60-130 AD), who described the customs associated with it (Crombie 1990:12). The child’s body was placed under the eaves of the house and could not be seen, as the grave was level with the ground around it. The household, it seems, liked to bury the infant near where they themselves would be every day (Ó Súilleabháin 1939:145). According to Ó Súilleabháin, ‘a Christian skin grew on that Pagan custom of Rome in this country and, as a further twist, a separate burial was given to infants who died without baptism’ (Ibid). Ó Súilleabháin makes many references to Pagan times and notes that even then a special case was made of children who died. Adults who died, for instance, were cremated, but children were buried in the ordinary way. Ó Súilleabháin affirms that this Roman custom was recast in Christian form in this country, and so unbaptised children were given special burial in places that the local community held in particular veneration. The reason certain places were venerated was not always clear, but Ó Súilleabháin put forward Wakeman’s idea as an explanation: that they were originally Pagan graveyards, the burial places of the poor, in his view, at the time when the Gaelic chieftains were being buried in the great stone cairns that still survive in places (Ó Súilleabháin 1939:146). These graveyards were abandoned with the coming of the Christian gospel to Ireland, and thereafter were used only for unbaptised infants, people who took their own lives and strangers who died (Ibid). The chieftains granted raths and forts to the Church as sites for the first churches built in the country; these first cillíní were built of wood, according to archaeological evidence (Crombie 1990:19). They were abandoned later in the Middle Ages, when the care of parishes was given to the Church, and so no trace of them remains above ground today. This would explain why many of these sites were named ‘cillíní’ and why they were associated with places where graves could be seen (Ó Súilleabháin 1939:147). This knowledge was kept alive in people’s memory and these places were held in special regard; Ó Ríordáin (1979) also agrees that this was why these particular sites were chosen for children’s cillíní. Some of the earliest written sources concerning the Irish case come from Church writings and from Giraldus Cambrensis in the 12th century.
Those writings lay down that it was an offence to let a child die unbaptised. This is thought to have given rise to ‘baiste úrláir’ (lay baptism), but although the Church recognised it, in the eyes of ordinary people it was scarcely better than the status of the unbaptised child. In Cambrensis’s report to the Council in Dublin in 1186 he affirms the Church’s position on the importance of baptism and of adhering closely and exactly to the rites of the Church (Crombie 1990:13). Archaeological evidence indicates that these separate burial grounds for children were common in the landscape from at least the end of the Middle Ages (Crombie 1990:16-7; Ó Héalaí 2006:92). Given the control the Church had in Ireland in the Middle Ages, it is no surprise that they bear a strong Christian imprint, both in their naming as ‘cillíní’ and in the location of many of them at old church sites (Ó Héalaí 2006:93). It is reasonable, therefore, to say that there was a renewed and particular demand for the cillíní for those who were not eligible for burial in a consecrated graveyard under Church law, and also that unbaptised children were not the only ones in this category. Most of the cillíní continued in use until the period between the mid-19th century and the beginning of the 20th century, although some were still in use down to our own time (Aldridge 1969; Crombie 1990:55-6; Ó Súilleabháin 1939; Ó Héalaí 2006:93). Aldridge (1969) refers to evidence from Co. Mayo in the mid-sixties (1964). It is also extremely interesting that, according to research (Crombie 1988, 1990), cillíní are far more numerous in the west of the country: there are 458 children’s cillíní in Co. Galway as against 2 in Co. Wicklow. A couple of theories have been put forward regarding this imbalance; according to Cuppage (1986), the main reason is the strength of the culture, both language and folklore, in the west as compared with the east.

Sites of the Cillíní Páistí

The kinds of site used for the burial of infants are wide-ranging and varied. They were not buried in cillíní alone, according to manuscripts held by the Irish Folklore Commission. Ó Súilleabháin (1939:148) gives a list of the various kinds of place where children’s burial grounds were located: in the haggard beside the house, in liosanna (ring-forts), in a garden beside the house, in fields, in a boundary wall, at a crossroads and in the shelter of a thorn bush. This variety of sites in the Irish tradition depends on local custom. These people were often buried in a place associated with the Slua Sí (the fairy host), or ‘the good people’, such as ring-forts and raths (O’Connor 2005:37). Hundreds of accounts survive describing the understandings and customs associated with the burial of unbaptised children in such places throughout the country (Ibid). Stories of the spirits of unbaptised infants and of ‘the good people’ often cross over and are blended, for instance in the fairy tales from Donegal and in ‘Bean ghlún a chaill amharc na leathshúile’ (‘The midwife who lost the sight of one eye’). The bean ghlún (midwife) was the woman who assisted women in labour; a horseman came to her one night so that she would attend his wife. The woman’s child was stillborn; the man fails in his attempt to put the midwife’s child in place of the dead infant, and it emerges later in the story that he is one of ‘bunadh na gcnoc’ (the hill folk), that is, the fairy host. The oral traditions of Ireland and Europe show a close link between children and the fairy host, and in places it is even understood that ‘the good people’ were the souls of unbaptised children (O’Connor 2005:37). The fairies needed human help in order to have any influence in this world, and so they would attempt to abduct people; particularly in the case of a woman who died in childbed, or of a stillbirth, it was said to be a fairy abduction, and that sometimes a changeling (‘iarlais’) was left behind. “If a woman’s child died at birth they would say it was ‘the good people’ who took it.” (Ó Héalaí and Ó Tuairisg 2007:99). A strong feature of the children’s cillíní in Aran and in Connemara especially was the burial of unbaptised children under a boundary wall, as detailed by Robinson (1985). It was another common practice in Connemara to place children’s cillíní on the coast, between the wall of the lowest field and the shore (Crombie 1990:46). The reason for this was that this place was another boundary, between land and sea, and it was understood to be a boundary between this world and the next.

Customs and understandings associated with the Cillíní Páistí

Much of the information about the customs and understandings connected with the burial of infants in Ireland, especially the burial of unbaptised infants, has been collected by the Folklore Commission and by various writers such as Seán Ó Súilleabháin. The folklore traditions concerning unbaptised infants are not confined to Ireland alone; it was widely understood that a wandering darkness in Limbo awaited them, known in Germany as the ‘friedlosen Irrlichterdasein’.



Reference is made to this in the writings of the Brothers Grimm (Crombie 1990:57). They are on the ‘outside’ of society and without status; in the case of unbaptised infants, this is because they had no opportunity to earn any status. In the case of the other groups who were buried in the cillíní and who were ‘outside’, it is perhaps because of circumstances of morality or belief, for instance a person who took his own life or a person who committed murder. It is because of this status that they are often called ‘the dead without status’ in the oral tradition of Norway and Finland, according to Pentikainen (1969:92-102). There is a physical expression of this spiritual state in the places chosen for burying the bodies: they were marginal beings, and therefore marginal places were chosen for them (Finlay 2000:409). Sixty-five per cent of the cillíní in Co. Galway are situated within 200 m of a townland boundary, and this is strongly symbolic of the importance of these boundaries in Irish society down the ages (Crombie 1990:58). These boundaries created physical territories as well as mythological territories, and they are mentioned often, and emphatically, in the oral tradition going far back; these boundaries were understood to be the folds between this world and the next. In Irish folklore these infants live in a particular place in the otherworld, and it is also understood that they enter this world from time to time as spirits or ghosts. Often they are seen in the form of small children in whom the light of a soul is visible, or as infants carrying lighted candles, or even as lights in themselves. These images are part of a widespread tradition in Ireland, sometimes with local forms. Light is a central motif, seen again and again in the oral traditions and legends of Ireland (O’Connor 2005:37).

There is also a substantial body of material relating to children’s cillíní; the understandings and practices associated with the burial of these infants support the traditional outlook regarding this marginal, in-between place, or Limbo, in which the souls dwell. Of the supernatural occurrences at children’s cillíní found in the folklore, the most notable are ‘an féar gortach’ (the hungry grass), ‘an fóidín mearúil’ (the stray sod) and the ‘kind of rash’ (‘cineál de scroth’) that would come on your skin (O’Connor 2005:72-77). The story of ‘an féar gortach’ is quite common throughout the country: it is believed that an incredible hunger comes on a person who walks on the place where an unbaptised child is buried. A person would be in danger of death, so great was this hunger, and for fear of it nobody would go out without a bite to eat in their pockets, according to Séamas Ó Dúshláíne from Co. Longford in testimony he gave to the Folklore Commission in 1956 (Ibid). ‘An fóidín mearúil’ (‘an fóidín meara’ in Connemara) was the name given to the way people could go astray if they walked on a particular piece of ground. The ‘fóidín mearúil’ was often associated with a place on which the fairies had put an enchantment, but there is plenty of evidence around the country that these particular places were connected with the burial of unbaptised children. It is said that the person who walks on this spot enters the same darkness in which the child’s soul dwells, and that this is why they go astray (O’Connor 2005:75-6). This was called the ‘fóidín marbh’ (the dead sod) in Mayo, and people were unwilling to pass the graves after nightfall for fear of the dead (Ó Súilleabháin 1939:149). Only in Donegal does one hear of this ‘kind of rash’ that would come on a person’s skin if they walked across the grave of an unbaptised child. According to Seán Ó hEochaidh in 1940, the cure for it was “to get twelve hanks of linen thread and put them together. Take off all your clothes then and jump through the linen three times in the name of the Trinity. Do that three days in a row and you have the cure.” (O’Connor 2005:77).



What can be seen in the oral tradition here is the bad outcome that follows from breaking the traditional understandings of what is right, whether through neglect or on purpose. One of the greatest functions of folklore has always been to give people instruction and advice as to what is right and proper.

Burial practices for unbaptised infants

According to Van Gennep’s theory of death rites, the bonds between these unbaptised children (premature and stillborn infants especially) and their family or community were not strong enough for them to merit a wake or any of the complex rites of separation that were usual for adults and intended to protect them on the passage from one state to the next (Ó Héalaí 2006:96). A weak child, or one in danger of death, who had been baptised had attained a sinless status and was therefore entitled to full membership of the company of the dead; unbaptised infants, on the other hand, belonged to another category that could never attain full membership among the dead. They were trapped between this world and the next, with no relief to be had from the prayers of their people (O’Connor 2005:35-9, 65-98). This is the explanation given for the brevity and simplicity of the burial practices in the case of unbaptised infants. The practices mirrored the lowered spiritual status attached to them (Ó Héalaí 2006:87). They would never complete the passage from the state of this world to that of the next, and so there was no benefit for them in the rites designed to give protection on that passage and to ensure a successful change of status (Pentikainen 1969:98). If the child had not been given Church baptism, no special clothing was put on them after death; they were placed in the ‘cliabhán na gcorpán’ (the corpses’ cradle) or in the mother’s bed until they were taken for burial, usually in the darkness of night (Ó Súilleabháin 1939:149). They were often buried without wake or funeral, and no sheet was hung over them. In the case of a big child, no boards were put in the bottom of the coffin, so as to allow them to grow (Ibid). In Co. Kerry an ornamental mug was placed in the coffin or on top of the grave, because it was understood that there was an old woman in the otherworld who had a cow and would give milk to the children, but the child who had no cup would get no milk (Ó Súilleabháin 1939:150). On the Blaskets unbaptised infants were not keened, and apparently in some places children were never keened (Ó Héalaí 2006:89).

Testimony of the contemporary community

The accounts heard from various people convey the harshness and starkness of these customs. On the radio programme ‘An Claí Teorainn’, Máire Ní Mhaoileoin from Ros a’ Mhíl told Seán Leainde the following on the subject: ‘When I was 13 years old I was in a neighbour’s house where a child was stillborn, and the father took it away by night in a shoe box and buried it under the boundary wall, and I don’t think anyone in the village knew anything about it. My father did the same with a daughter who was stillborn to my mother; he took her away by night and buried her under the boundary wall. What was done was that the wall would be knocked down to the ground and the child put under it, and the wall built back up in the same place; the child was buried neither inside nor outside the boundary but right in the boundary. People would not make a story of it; people would not come to a wake or gathering or anything; there would be great sorrow but they would keep it to themselves, and the matter would be kept very secret. I never heard of a name being given to these stillborn children’ (Leainde, 2008). According to Máirtín Ó Dúiche from Corr na Móna, the reason unbaptised children were buried in the boundary walls was to split the bad luck between the two townlands: ‘it used to be said that it was a curse God had put on you if a child died’ (Leainde, 2008). ‘Dumhach na Leanaí’ is the name of the small graveyard made down at the shore in An Teach Mór, Indreabhán, and it was made illegally according to Maitiú Ó Diolúin. ‘It was children who were not baptised who were buried there, although there were also three adults in it’ (Ó Héalaí and Ó Tuairisg 2007:99). Many died at the time of the bad flu; the last body was buried in Dumhach na Leanaí in 1945. According to Michael Mheairt Ó Coisdealbha, who was a schoolboy at the time, it was a neighbour’s child, a boy of about 7-8 years, who was the last to be buried in this cillín. ‘This young boy had some physical disability and he never grew beyond the size of a three-year-old’ (Ó Coisdealbha, 2009). Scientific examinations were carried out on bones from a children’s cillín in Co. Antrim as part of archaeological research in 1999 (Donnelly et al. 1999). The results showed that at least one of the bodies had a physical deformity, which the archaeologists thought marked these children out too as another group who were “outside” the mainstream of life and who were therefore buried in the cillín. ‘Trá na bPáistí’ (the Children’s Strand) is at An Dóilín in An Cheathrú Rua; many unbaptised children are buried near this place and in An Tismeáin too, say Meaig Uí Dhomhnaill and Bríd Uí Bhriain, but if they had received lay baptism they would have been buried in the graveyard (Ó Héalaí and Ó Tuairisg 2007:100).
Peadar Tommy Mac Donnachadha represented the committee that erected the memorial stone at this place in 1997; he told Seán Leainde on the radio programme that this particular cillín served two townlands, An Roinn and An Bóthar Buí. He also says that children had been buried there for hundreds of years, and along with them wayfarers and strangers whose religion was not known to the local people. ‘During the Famine people were buried there when their families were too weak to carry them to the graveyard’ (Leainde, 2008).

Literature too refers to children buried in the cillíní and to the anguish suffered by their families, who mourned them quietly and secretly because of the ‘uncleanness’ attached to them according to the Church. Like the oral tradition, literature provided a platform for discussing subjects that would be difficult or sensitive for people. The following is an extract from ‘An Strainséara’ (‘The Stranger’) by Máirtín Ó Cadhain, a major writer in Irish-language literature, from An Cnocán Glas, east of An Spidéal in Connemara. It portrays the anguish of the mother who never had the chance to be a mother after giving birth to five stillborn children, and her mental torment at having to keep it secret, with no leave to release her grief.

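‘Garraí an Locháin. Leachtáinín cloch. Scáilí.’...
‘Cúigear acu a bhí ann...’ Marbh a rugadh iad. Marbh a rugadh an cúigear. An duine féin níor rugadh beo. Dhá bhfeiceadh sí beo iad! Mura mbeadh ann ach ar feadh sméideadh a súl! ‘a dheornadh ina théagar te beo-cholla – roimh a bhás’. ‘Bhí teir orthu abhus. Bhí teir orthu thall.’
‘A bheith róshalach le slánú; róghlan le damnú... Cuilínigh clamhacha na síoraíochta. Ar an stoc ronna nach raibh Dia ná an diabhal ag éiliú seilbhe air a bhíodar... ‘in áit dorcha gan aon phian’ (Ó Cadhain 2006:94-97).

[‘Garraí an Locháin. A little cairn of stones. Shadows.’...
‘There were five of them...’ They were born dead. All five were born dead. Not even one was born alive. If only she had seen them alive! Were it only for the blink of an eye! ‘to clasp him, a warm bundle of living flesh, before his death’. ‘There was a ban on them here. There was a ban on them beyond.’
‘To be too soiled for salvation; too clean for damnation... The mangy castoffs of eternity. They were on the common ground that neither God nor the devil claimed possession of... ‘in a dark place without any pain’.]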

Conclusion

De réir taighde Aldridge i gCo. Mhaigh Eo ba é an úsáid is déanaí atá luaite le húsáid cillín páistí i gcomhair adhlacadh ná 1964. Tar éis an méid eolais atá bailithe ag Ó Súlleabháin agus taighdeoirí eile ar an ábhar is soiléir go raibh tradisiúin an-fhada ag baint leo agus gur neartaigh sé arís le theacht na Críostaíochta de bharr an géarghá a chruthaigh sé, cuireadh cruth Críostúil ar nós págánach. Tá eolas mhaith ag pobal comhaimseartha na Gaillimhe go ginearálta ar na cillíní páistí, cé gur éiríodh as a n-úsáid idir na 40idí agus na 60idí den chéad seo sa taobh seo tíre. Ba chuid den saol acu é agus iad ag teacht suas, ba mhinic a thárlódh a leithéidí agus marbhghin agus go deimhin tarlaíonn sé inniu fós, fiú is go bhfuil an saol eolaíochta agus leighis i réim ó tar éis an dara chogadh domhanda i leith agus dul chun cinn mór déanta i dtaobh ghasúir nuabheirthe. In Éirinn inniu tá difríocht mhór idir chleachtais i dtaobh páistí a chailltear agus an traidisiúin seo de cillíní scoite a lean leis na glúnta. Tugtar aitheantas do na páistí seo ina gclann agus ina bpobal anois, tá tábhacht sóisialta ag baint le sochraide ghasúr agus tionchar aitheanta i gciall níos leithne na mar a bhí roimhe seo. Léirítear an tábhacht lárnach atá ag an bpáiste i saol an chlann chomhaimseartha sna nósanna inniu agus nochtann siad an nasc láidir clainne atá ann fiú le páistí nuabheirthe. Bheadh sé dochreidte inniu páiste a adhlacadh in aon áit eile ach lena gclann sa reilig (Ó’Héalaí 2006:99). Ó lar na 80idí sa tír seo tá ghluaiseacht sóisialta in aghaidh an chleachtas de chillíní scoite, tá feachtais éagsúla ar bun chun aitheantas a thabhairt do chillíní timpeall na tíre agus iad a choisriceann, is chor nua suntasach an aitheantas poiblí seo i gcuimhne na ngasúir bheaga. 
Considerable work has been done from the 1990s into the new millennium in small communities around the country, through local development schemes and the like, to restore children's burial grounds (cillíní), to build walls around them and to erect memorials on them. This is done in memory of the children and also to acknowledge the suffering and secret heartbreak of their families (Ibid:101). ‘The understanding according to the Church is that everyone comes into this world bearing original sin, and with their soul in danger as a result’ (Leainde 2008). It is through the sacrament of baptism and naming that we are saved and accepted into the people of God. According to Father Ó Fátharta, Limbo no longer exists in the teaching of the Church, and the way these children were treated was not Christian. Father Ó Fátharta also raises an interesting point: we now look back on what happened with the cillíní and say that an injustice was done to these children, but in the future it may equally be said that things happening today are un-Christian. In 2007 Pope Benedict XVI put an end to the idea that these children were cut off from God and trapped in darkness, a belief that had been held for centuries. He approved the report produced by the International Theological Commission (a panel of advisers to the Vatican), which stated that there were sound grounds for hope that unbaptised children would see the light of Heaven. God's blessing on the souls of the dead.
