International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 07 Issue: 12 | Dec 2020

p-ISSN: 2395-0072

www.irjet.net

Interactive Speech Recognition Agent System using AI

Akshata Tayade1, Mahima Thakur2, Laxmi Vathari3, Prof. Satish Kuchiwale4

1,2,3 Student, Dept. of Computer Engineering, Smt. Indira Gandhi College of Engineering, Navi Mumbai, Maharashtra, India

4 Professor, Dept. of Computer Engineering, Smt. Indira Gandhi College of Engineering, Navi Mumbai, Maharashtra, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - Speech-to-text translation, a speech recognition technology that turns spoken words into written words, can be used to assist people with disabilities. It can also identify and understand human speech in order to carry out a person's command on an Android phone. The system proposed in this paper aims to help users communicate with one another. It converts the audio signal heard on the phone into text, which can then be used for documentation or other work. For calls, the conversation can be saved in text form.

Key Words: ASR (Automatic Speech Recognition), Dictation (the user enters data by reading it aloud directly to the computer), STT (Speech to Text)

1. INTRODUCTION

Voice is the most basic, common, and efficient way for people to communicate with each other. Today, speech technologies are commonly available for a limited but interesting range of tasks. These technologies enable machines to respond accurately and reliably to human voices and to provide useful services. Communicating with a computer by voice is faster than using a keyboard, so many users are likely to prefer such a system. Because communication between human beings is dominated by spoken language, it is natural for people to expect voice interfaces to computers. This can be accomplished by developing a voice recognition system: a speech-to-text system that allows a computer to translate voice requests and dictation into text.

The proposed system is an application that converts speech into text with the help of emerging technologies such as neural networks, artificial intelligence, deep learning, and machine learning. In an increasingly digital world, such a system is needed to address certain problems in education, communication, and daily life. It can identify and understand human speech and convert it into text. Our app will use different models for different features and will include an Automatic Speech Recognition (ASR) system. The Speech to Text app will offer a number of features: generation of subtitles on the screen for audio playing on the device, language translation [2], call-to-text conversion [4], and video captions from an online source given as a URL link, i.e., subtitles for videos on the internet. With these features, a user can learn new things without obstacles in his or her studies. Communication and education barriers will be reduced, and day-to-day life will be easier for deaf and elderly people.

1.1. PROBLEM STATEMENT

To design an automatic speech recognition system that gives the best recognition results for both male and female speakers, such that it can identify and understand human speech and convert it into text. The proposed system will provide subtitles for any video on the internet when the file is supplied to the app, lessening the linguistic barrier in education. If the caller at the other end is speaking fast and the user cannot follow, the user can convert the audio of the call into text. The user gets text captions for audio playing on the screen, which may be a lecture for an online class or a video in another language. The proposed system will also help a person traveling alone in an unfamiliar country to communicate.

1.2. OBJECTIVES

1) Generation of subtitles on the screen for audio playing on the device.
2) Removal of the language barrier in communication.
3) Subtitles for videos on the internet.
4) Conversion of calls to text.
5) Language translation into English.

1.3. SCOPE

1) Speech-to-text conversion becomes a link between deaf and hearing people. It will help a person with a hearing disability by providing subtitles for any video available on the internet.
2) It will let a user with a hearing disability answer calls. The app may help them know what the caller at the other end is saying by reading subtitles generated from the caller's speech.

Impact Factor value: 7.529

ISO 9001:2008 Certified Journal

Page 1855
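
As a minimal sketch of the subtitle-generation objective listed under OBJECTIVES, recognized speech could be written out in the widely used SubRip (SRT) subtitle layout. This assumes the recognizer returns text segments with start and end times, which the paper does not specify. The `Segment` type and the function names `format_timestamp` and `to_srt` are hypothetical and introduced only for illustration. They are not part of the proposed system.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One recognized utterance (hypothetical structure, not from the paper)."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT subtitle text from recognized segments, ordered by start time."""
    blocks = []
    ordered = sorted(segments, key=lambda s: s.start)
    for index, seg in enumerate(ordered, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up segments standing in for recognizer output.
    demo = [
        Segment(0.0, 2.5, "Welcome to the lecture."),
        Segment(2.5, 6.0, "Today we discuss speech recognition."),
    ]
    print(to_srt(demo))
```

Keeping the formatting step separate from recognition means the same segments could feed on-screen captions, a saved call transcript, or a subtitle file, whichever recognizer produces them.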