Asia Research News 2015

Page 40

PEOPLE

Detecting indigenous linguistic knowledge

Researchers in Malaysia have identified an important source of linguistic knowledge for minority languages by converting an original printed dictionary into an electronic, machine-readable form.

Dictionaries for minority languages such as Penan – a language spoken by an indigenous group in Sarawak, Malaysia – often lack linguistic knowledge such as functional words (words that express grammatical relationships), collocations (sets of words that often appear together) and morphology (word structure).

Led by Dr Bali Ranaivo-Malançon, researchers at Universiti Malaysia Sarawak have discovered a way of extracting linguistic knowledge about the Penan language by converting a printed dictionary into an electronic “machine-readable” dictionary (MRD) and analysing it with language processing tools.

After creating a PDF version of a printed English-Penan dictionary, the researchers used an optical character recognition tool to convert the PDF file into an MRD. The team then applied “corpus processing” software to analyse the dictionary contents – creating the first list of functional words in the Penan language and identifying their collocations, some of which were previously unknown. The team also generated morphological information, including prefixes and suffixes, from the MRD using another software program called Linguistica.

In the future, the team hopes to convert all existing Sarawak printed dictionaries with the agreement of the authors, says Dr Ranaivo-Malançon.

For further information contact:
Associate Professor Bali Ranaivo-Malançon
Faculty of Computer Science and Information Technology
Universiti Malaysia Sarawak
E-mail: mbranaivo@fit.unimas.my

Universiti Teknologi MARA

The Iban alphabet was invented in 1947.

Reviving the Iban alphabet

A Malaysian indigenous group has revived its alphabet from the brink of extinction, thanks to specially designed computer fonts.

The Iban is the largest indigenous group in Malaysia with a population of more than one million, most of whom live in the state of Sarawak, Malaysia. The Iban language is fairly common. It is the only indigenous language that is officially taught in Sarawak schools and is spoken not only among the Iban but also between the Iban and other ethnic groups. However, it was not a written language until Dunging anak Gunggu invented the first Iban alphabet in 1947.

In 2010, extending Dunging’s work, Dr Bromeley Philip of Universiti Teknologi MARA (UiTM) Sarawak developed computer fonts for the Iban alphabet, called LaserIban. His aim is to help preserve the Iban alphabet in digital form in the modern world. The LaserIban is available for Windows and Macintosh computers and is completely cross-platform compatible.

Using the LaserIban, Dr Philip has launched a course called the “Training unto LaserIban System”, or TULIS, Programme. (TULIS means “writing” in Iban.) “The ultimate purpose of the course is to help revive the otherwise disappearing Iban alphabet,” he explains.

Dr Philip is now re-alphabetising three Iban folktales, which are currently written in the Latin alphabet, using the Iban alphabet as part of an effort to transcribe as many Iban language materials as possible. He is also building an Iban alphabet dictionary for use as a reference for the Iban spelling system.

“Most Iban, [whether] old or young, are by now aware that the Iban language has its own alphabet that can be used to accurately translate the Iban’s spoken language into a written language,” Dr Philip says.

The researchers aim to help preserve the Iban alphabet in digital form in the modern world.

For further information contact:
Dr Bromeley Philip
Associate Professor, Academy of Language Studies
Universiti Teknologi MARA (UiTM) Sarawak
E-mail: bromeley@sarawak.uitm.edu.my

