

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 03 Issue: 07 | July-2016

p-ISSN: 2395-0072

www.irjet.net

A Hybrid Translator: From Malayalam to English

Anisree P G¹, Radhika K T²

¹PG Student, Dept. of Computer Science and Engineering, MEA Engineering College, Kerala, India
²Asst. Professor, Dept. of Computer Science and Engineering, MEA Engineering College, Kerala, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - Machine translation is the process of translating text from one language to another. The language from which we translate is called the source language, and the language into which we translate is called the target language. Here the source language is Malayalam and the target language is English. The statistical approach to machine translation is commonly used because it is regarded as superior to other approaches. However, owing to the peculiar nature of Malayalam, it is not good practice to develop a translator for this language using the statistical method alone. Hence we integrate a set of rules into the translation process, which helps the system learn how to translate from Malayalam more easily. We propose a translator from Malayalam to English that combines a rule based and a statistical approach; hence it is a hybrid translator.

Key Words: Machine Translation, Statistical Based Machine Translator, Rule based Machine Translator, Hybrid Translator, Sandhi Splitter.

1. INTRODUCTION

Machine Translation (MT), perhaps the earliest NLP application, is the translation of text units from one language to another using computers. It is one of the most interesting and hardest problems in the field of NLP. The two challenges in machine translation are adequacy and fluency [1]. The former is to develop a system that adequately represents the ideas expressed in the source language in the target language. The latter is to represent those ideas grammatically. India is a multilingual country: many of the states have their own native language, and only 5% of the population knows English [2]. A translator capable of translating from these native languages to English is therefore required for efficient communication and knowledge sharing.

The common approaches to machine translation are the rule based approach and the corpus based approach. In the rule based approach, a large number of rules are necessary to capture the phenomena of natural language. These rules transfer the grammatical structure of the source language into the target language. As the number of rules increases, the system becomes very complicated. Formulating a large number of rules is a tedious process and requires years of effort and linguistic analysis. In the second approach, large parallel and monolingual corpora are used as the source of knowledge. This approach can be further divided into statistical approaches and the example based approach. Statistical machine translation (SMT) is superior to rule based and example based systems in that it does not require human intervention and can build a translation system in an unsupervised manner directly from the training data. Rule based systems are language dependent and require careful analysis of the source and target languages. With the rapid proliferation of the internet and the increasing availability of data, SMT is currently the most popular and prevalent paradigm.

For an SMT system, a parallel corpus consisting of source and target language sentences and a monolingual corpus consisting of target language sentences are required. The SMT system is trained on large quantities of this parallel and monolingual data. The statistical model learns the translation parameters from the corpus and performs the translation. SMT takes place in three phases, namely language modeling, translation modeling and decoding, as shown in Fig 1. The language model determines the probability of the target language T, which helps in achieving fluency in the target language and in choosing the right word in the translated language. It is generally denoted as P(T). The translation model, on the other hand, helps to compute the conditional probability of the target language T given the source language S, generally denoted as P(T|S). Finally, in the

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 129
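As a minimal illustration of the language model P(T) described in the Introduction, the sketch below estimates sentence probabilities from a small monolingual target-language corpus and shows that a fluent word order scores higher than a scrambled one. The bigram order, the add-one smoothing, the toy English corpus and the function names train_bigram_lm and log_prob are all assumptions made for illustration; the paper does not specify how its language model is built.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count bigram contexts and bigrams over sentences padded with <s> and </s>."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}  # tokens the model can predict
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, vocab

def log_prob(sentence, contexts, bigrams, vocab):
    """Log P(T) under an add-one smoothed bigram model (an illustrative assumption)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    size = len(vocab)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + size))
    return total

# Toy monolingual corpus in the target language (hypothetical data).
corpus = [
    "the boy reads a book",
    "the girl reads a book",
    "a boy sees the girl",
]
contexts, bigrams, vocab = train_bigram_lm(corpus)

fluent = log_prob("the boy reads a book", contexts, bigrams, vocab)
scrambled = log_prob("book a reads boy the", contexts, bigrams, vocab)
print(fluent > scrambled)  # the fluent word order receives the higher score
```

In a full SMT system this score would be combined with the translation model score during decoding, and the language model would be trained on a far larger corpus than the three sentences used here.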

