International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
1 Student, Department of Computer Applications, Madanapalle Institute of Technology and Science, India
2 Asst. Professor, Department of Computer Applications, Madanapalle Institute of Technology and Science, India
Abstract - Traditional translation is difficult because users must keep a separate manual for each language and look up every word to understand the meaning of a sentence, or else rely on another person's help. As translation technology steadily improves, Language Translation using OCR (Optical Character Recognition) is an application that recognizes and extracts text through a camera, identifies the language of the extracted text, and translates it into any language. These features are provided by the Firebase ML Kit (Machine Learning) and Google Play Services packages, which Google offers for adding ML capabilities to Android applications. The main purpose of Language Translation using OCR on Android is to lower the language barrier by letting people translate any text from one language to another simply by taking a picture of the text they want to translate.
Key Words: OCR, Android Application, Firebase ML Kit, Translation API
Our approach to language translation is especially useful for scanned text, such as images that contain text, where manual translation may be practically impossible. OCR, also referred to as text recognition, is a software technology that creates a digital version of a scanned or handwritten image by converting characters such as letters, numbers, and punctuation into a form an application can read, without the text having to be typed manually. Here, OCR is provided together with TensorFlow Lite through Firebase ML Kit, a software development kit that brings advanced machine learning capabilities to Android. In this application, OCR recognizes text from a camera or image so that the information can be translated from the source language to the target language.
Language Translation using OCR provides the flexibility to scan text images in different languages such as English, French, Hindi, and Spanish, and gives the user a customized translation of the scanned text or image, which can be limited to a particular part of the text when required. In this application, OCR combines a text-detection model and a text-recognition model into an OCR pipeline that recognizes text characters. The recognition step interprets the scanned images and turns images of printed characters into machine-readable characters. Together, these steps form the OCR pipeline used to recognize characters.
An image passes through several stages to complete the OCR pipeline: image pre-processing, text detection, detection post-processing, and text recognition. Images for the OCR system may be acquired by scanning or by photographing the source. Image pre-processing modifies the raw data to improve the quality of the image captured by the camera. Text detection locates and extracts text from the image in the form of blocks, lines, words, and characters, using the features extracted from the scanned image. Detection post-processing localizes the detected text regions for the recognition process, which recognizes text in Latin script. Text recognition is the part of OCR that finally recognizes individual characters from the images and outputs them in Latin script.
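The pre-processing stage described above can be illustrated with a minimal sketch. The paper does not state which operations the application applies, so grayscale conversion followed by a global threshold is an assumption here, chosen because it is a common way to separate dark text from a light background. The function names `to_grayscale` and `binarize` are hypothetical, and the image is modeled as plain nested lists rather than an Android bitmap.

```python
from typing import List, Optional, Tuple

RgbImage = List[List[Tuple[int, int, int]]]  # rows of (r, g, b) pixels, each 0..255
GrayImage = List[List[int]]                  # rows of luminance values, 0..255


def to_grayscale(image: RgbImage) -> GrayImage:
    """Convert RGB pixels to luminance using the ITU-R BT.601 weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def binarize(gray: GrayImage, threshold: Optional[float] = None) -> List[List[int]]:
    """Mark dark pixels as 1 (ink) and light pixels as 0 (background).

    If no threshold is given, the mean intensity of the image is used
    (an assumption; real systems often use adaptive thresholds).
    """
    if threshold is None:
        flat = [p for row in gray for p in row]
        threshold = sum(flat) / len(flat)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


if __name__ == "__main__":
    # A 2x2 checkerboard: white, black / black, white
    image = [
        [(255, 255, 255), (0, 0, 0)],
        [(0, 0, 0), (255, 255, 255)],
    ]
    print(binarize(to_grayscale(image)))
```

A binarized image of this kind is what a later detection stage could scan for connected regions of ink; in the application itself this work is delegated to the OCR library.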
1) A Detailed Study and Recent Research on OCR, by Devaraj Verma C and Proddutur Shruthi. This paper gives an overview of OCR and discusses its phases, such as image acquisition, noise removal, normalization, pre-processing, and feature extraction. It also discusses the generations of OCR, the history of OCR, and applications of OCR.
2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal |
2) Overview of Various OCR Techniques, by Aksha Srivastava, Nitin Ramesh, and K. Deeba. The authors discuss various OCR techniques covering the phases of recognition, such as pre-processing and post-processing, as applied to images acquired by camera. The paper notes that the field of Optical Character Recognition still faces challenges such as recognition of characters in various languages and real-time recognition. Finally, the use of OCR in real-world applications remains an active area of research.
3) A Mobile Application: OCR Assisted Translator, by Nikhil Chigali, Sai Rohith, M Suvarnaani. K, and Rajeswari S. The authors designed and developed an application that selects an image, extracts the text, and then translates the text. As a further enhancement, it addresses the problem of translating PDF and other documents from one language to another by using a Flutter PDF viewer. OCR Assisted Translator gives the user the flexibility to upload documents or images and provides customized translation based on the requirement. The documents can be in varying formats such as doc, pdf, or jpg.
4) Text Recognition Using Image Processing and Translation, by A.K Gaikwad and Mayur Pabalkar. They propose recognizing characters with an optical character recognition technique, reporting an accuracy exceeding the 90% mark and robustness to different kinds of text, including variations in color, font style, size, and background. The system is easily portable and scalable.
5) Form Scanner & Decoder: Conversion of Text from any Application Form and its Language Translation Using OCR, by Sharmila Sengupta, Anshal Prasad, Harish Kumar, Ninad Rane, and Nilay Tamane. The authors use CNNs (Convolutional Neural Networks) for detection and recognition of handwritten Hindi characters from printed forms, which are then translated into English. They also propose sentiment analysis using NLP (Natural Language Processing) to extract and analyze information about public reaction. The sentiment analysis is performed with the Random Forest algorithm and the NLTK (Natural Language Toolkit) libraries, giving an accuracy of 88%. Feature extraction is also part of the process. The system must be trained on object information.
6) Review on Handwritten Character Recognition, by Nikita Mehta and Jyotika Doshi. This paper discusses categories of OCR and various approaches to achieving a good recognition rate for Indian scripts, but only for individual characters; characters with different modifiers are quite complex to identify in the Devanagari script.
Understanding unfamiliar languages is a major problem for travelers. Failing to understand an unknown language makes it hard to grasp the exact meaning of a text, and manual translation brings problems of its own, since human translators cannot compete with the speed of the Google Translate API. To overcome this problem, Language Translation using OCR (Optical Character Recognition) is an Android application that recognizes and extracts text through a camera and translates the extracted text into a user-specified language. The application handles text recognition, language identification, and translation in one single activity screen. We use the CameraX dependency for OCR to capture, recognize, and extract text from the camera. In a typical scenario, the user scans a text area with the cell phone camera. OCR then works in two stages:
1) Text-detection model
2) Text-recognition model
First, the text-detection model detects the scanned text in the image and marks it with bounding boxes. Second, the text-recognition model processes the bounding boxes to recognize the specific characters of the text. Once the text is detected, the recognizer determines the actual text in each block and segments it into lines and words.
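To make the step from bounding boxes to lines and words concrete, the following is a small sketch of one possible grouping rule. It is not the method used by the OCR library in the application, which performs this internally; the `WordBox` type, the `group_into_lines` function, and the rule "a word joins a line when its vertical center falls inside the previous word's box" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordBox:
    """A recognized word and its bounding box in pixel coordinates."""
    text: str
    left: int
    top: int
    right: int
    bottom: int

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2


def group_into_lines(words: List[WordBox]) -> List[List[WordBox]]:
    """Group word boxes into text lines, top to bottom, left to right."""
    lines: List[List[WordBox]] = []
    for word in sorted(words, key=lambda w: w.center_y):
        previous = lines[-1][-1] if lines else None
        # Same line if the vertical center lies within the previous word's box.
        if previous is not None and previous.top <= word.center_y <= previous.bottom:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order the words by their left edge.
    return [sorted(line, key=lambda w: w.left) for line in lines]


def lines_to_text(lines: List[List[WordBox]]) -> str:
    """Join grouped words into plain text, one text line per row."""
    return "\n".join(" ".join(w.text for w in line) for line in lines)


if __name__ == "__main__":
    detected = [
        WordBox("world", 60, 10, 100, 30),
        WordBox("Hello", 0, 12, 50, 32),
        WordBox("Bonjour", 0, 50, 70, 70),
    ]
    print(lines_to_text(group_into_lines(detected)))
```

The sketch assumes roughly horizontal text; skewed or curved text would need a more tolerant rule.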
Text Recognition API - The Text Recognition API includes detectors that find text in images and represent its structure by detecting words, lines, and paragraphs. Once the scanned image is passed to OCR, the Recognition API detects text in Latin-based languages such as French, German, and English in real time, on the device.
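Because recognition is limited to Latin-based languages, an application may want to check whether a piece of text is mostly Latin script before treating a result as reliable. The sketch below shows one way to do that with Unicode character names. It is an illustrative assumption, not part of the Text Recognition API: the function names and the 0.9 threshold are hypothetical.

```python
import unicodedata


def latin_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are Latin letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    latin = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith("LATIN")
    )
    return latin / len(letters)


def is_latin_text(text: str, min_ratio: float = 0.9) -> bool:
    """Treat text as Latin script when most of its letters are Latin."""
    return latin_ratio(text) >= min_ratio


if __name__ == "__main__":
    for sample in ["Où est la gare ?", "नमस्ते"]:
        print(sample, is_latin_text(sample))
```

Accented letters such as "ù" count as Latin because their Unicode names begin with "LATIN", so French, German, and Spanish text passes the check while Devanagari text does not.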
Translation API - We have integrated an open-source Translation API with our OCR application to translate the extracted text into more than one hundred languages. The Translation API returns fast, dynamic results that are almost instant. Translation is fast and free, but a good internet connection is needed to use this application. Using this application, we can translate into multiple languages. The whole proposed system is implemented in the Android application, and the purpose of this project is to implement text extraction from an image and translate the text into the user-specified language.
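The flow from recognized text to a translation in the user-specified language can be sketched as follows. The paper does not name the Translation API's functions, so the translator is modeled as a plain callable, and `demo_translator` is a hypothetical stand-in with a two-entry phrasebook rather than a real service. The language codes are standard ISO 639-1 codes; the `translate_capture` function and the small language table are assumptions for illustration.

```python
from typing import Callable, Dict

# ISO 639-1 codes for a few languages a user might pick from the list.
LANGUAGE_CODES: Dict[str, str] = {
    "English": "en",
    "French": "fr",
    "Hindi": "hi",
    "Italian": "it",
    "Spanish": "es",
    "Telugu": "te",
}

# A translator takes (text, target language code) and returns translated text.
Translator = Callable[[str, str], str]


def translate_capture(recognized_text: str, language_name: str,
                      translator: Translator) -> str:
    """Clean up OCR output and translate it into the chosen language."""
    # Collapse line breaks and repeated spaces left over from OCR.
    text = " ".join(recognized_text.split())
    if not text:
        raise ValueError("no text was recognized")
    if language_name not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language: {language_name}")
    return translator(text, LANGUAGE_CODES[language_name])


def demo_translator(text: str, target_code: str) -> str:
    """Hypothetical stand-in for a real translation service."""
    phrasebook = {
        ("Good morning", "it"): "Buongiorno",
        ("Good morning", "fr"): "Bonjour",
    }
    # Unknown phrases are returned unchanged in this stub.
    return phrasebook.get((text, target_code), text)


if __name__ == "__main__":
    print(translate_capture("Good\n morning", "Italian", demo_translator))
```

Passing the translator in as a parameter keeps the capture and language-selection logic independent of any particular service, so a network-backed translator could replace the stub without changing `translate_capture`.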
The main activity and home screen of the application.
In the main activity, the text is captured with the camera, with the text centered in the box portion of the screen.
Once the text is captured, the user chooses a language from a list of languages to translate the captured text.
Fig 5: List of languages.
The captured text is then translated into the user-specified language.
Fig 6: The output of translation (English to Hindi).
Fig 7: The output of translation (English to Italian).
Fig 8: The output of translation (English to Telugu).
This paper has discussed optical character recognition technology used to recognize text from the camera and then translate it into the user-specified language without the need for manually typed text, which may prove beneficial for people as they overcome the language barrier. Language interpretation becomes easier, and the user gets faster access to translation of scanned text into multiple languages with an accuracy at the 90% mark. The main advantages of the system are scalability and reduced cost. Using this application, users will be able to communicate with local people and better understand an unfamiliar language.
1. Nikhil Chigali, Sai Rohith, Suvarnavani. K, and Rajeswari S, "OCR Assisted Translator," IEEE 7th International Conference on Smart Structures and Systems (ICSSS), September 2020.
2. Jamshed Memon, Maira Sami, Rizwan Ahmed Khan, and Mueen Uddin, "Handwritten Optical Character Recognition (OCR)," IEEE Access, Volume 8, July 2020.
3. Harish Kumar, Anshal Prasad, Ninad Rane, Nilay Tamane, and Dr. Sharmila Sengupta, "Form Scanner & Decoder: Conversion of Text from any Application Form and its Language Translation Using OCR," International Journal of Advanced Research in Computer and Communication Engineering, Impact Factor 11, Issue 2, February 2022.
4. Aditya Prabhu, Mihir Pitroda, Susmit Wadikar, and Prof. Suvarna Pansambal, "A Study of Translator Camera Using Optical Character Recognition & Natural Language Processing," IOSR Journal of Engineering (IOSRJEN), Volume 13, PP 43-45.
5. Proddutur Shruthi and Dr. Devaraj Verma C, "A Detailed Study and Recent Research on OCR," International Journal of Computer Science and Information Security (IJCSIS), Vol. 19, No. 2, February 2021.
6. Happy Jain, Akshata Choudhari, Mohan Sharma, Jagdeep Yadav, and Krunal J. Pimple, "OCR with Language Translator," International Journal of Technical Research and Applications, Volume 4, Issue 2 (March-April 2016), PP. 125-129.
7. M Swamy Das, CRK Reddy, K Rahul, and A Govardhan, "Multilingual Optical Character Recognition System for Printed English and Telugu Base Characters," International Journal of Science and Advanced Technology (ISSN 2221-8386), January 2016.
8. R. R. Ingle, Y. Fujii, T. Deselaers, J. Baccash, and A. C. Popat, "A Scalable Handwritten Text Recognition System," 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 17-24, 2019.
9. Prof. N. R. Ingale, Ashish Suman, Aniruddha Patil, and Suhasini Raina, "Text Fetching App by Image Processing," International Research Journal of Innovations in Engineering and Technology, Volume 4, Issue 5, pp. 51-54, May 2020.
10. Nitin Ramesh, Aksha Srivastava, and K. Deeba, "Improving Optical Character Recognition Techniques," International Journal of Engineering & Technology, 7 (2.24) (2018) 361-364.