CLEAR Journal December 2018 Edition




CLEAR Journal (Computational Linguistics in Engineering and Research)
M.Tech Computational Linguistics, Dept. of Computer Science and Engineering,
Govt. Engineering College, Sreekrishnapuram, Palakkad - 678633
www.simplegroups.in | simplequest.in@gmail.com

Chief Editor: Shine S, Assistant Professor, Dept. of Computer Science and Engineering, Govt. Engineering College, Sreekrishnapuram, Palakkad - 678633

Editors: Aiswarya K Surendran, Divya Visakh, Sreeja V

Cover page and Layout: Zeenath M T

Editorial
Invitation
Last word

Word Sense Disambiguation using WSD Specific WordNet of Polysemy Words - Aiswarya K Surendran
Sense Aware Neural Model: A Method to Identify Pun Location in Texts - Minimol M
Text Categorization Using Disconnected Recurrent Neural Networks - Sibila M
Intelligent Question Answering System Using Artificial Neural Network - Sreeja V



Dear Readers,

Here is the latest edition of CLEAR Journal, which comes with new articles on trending topics such as Word Sense Disambiguation, the Sense Aware Neural Model, Text Categorization using Disconnected Recurrent Neural Networks and Intelligent Question Answering Systems. The previous edition covered articles including Why Artificial Intelligence Failed in Predicting the 2018 FIFA World Cup Result, Artificial Intelligence to Predict Alien Life on Other Planets, Text Based Multi-Emotion Extraction Using Linguistic Analysis, Alias Links Identification from Narratives, and Face Detect-Track System: For Criminal Detection. We are very happy to have gained new readers, which gives us great motivation to keep improving. As always, we are working on your valuable feedback and look forward to more of it. On this hopeful note, I proudly present this edition of CLEAR Journal to our faithful readers and look forward to your opinions and criticisms.

BEST REGARDS, Shine S (Chief Editor)



INTERNSHIP

 Anisha T S, Bhavya K, Pradeep T and Sandeep Nithyanandan of M.Tech Computational Linguistics, 2017-19 batch, got selected for internships at Lymbyc Solutions Pvt. Ltd., Bangalore.
 Gayathri G Nair of M.Tech Computational Linguistics, 2017-19 batch, got selected for an internship at Suprath Technologies, Bangalore.
 Aswathy K S and Mohammed Shameem K of M.Tech Computational Linguistics, 2017-19 batch, got selected for internships at Zwayam, Bangalore.

RESULTS

 100% result for Second Semester M.Tech Computational Linguistics, 2017-19 batch.

Simple Groups Congratulates All for their Achievements...!!!



............................ WORKSHOP ON MACHINE LEARNING ..............................

A full-day workshop on Machine Learning was conducted by Dr. Arun Rajkumar. The session was very informative: he gave a clear idea of the basic concepts of Machine Learning and also introduced recent advancements in the field and its scope.

………...................... MACHINE LEARNING CLUB ..............................

A Machine Learning Club was started, including all the students from M.Tech and interested students from B.Tech. A class was organized by the second-year M.Tech batch for all the club members, introducing the basic methods to consider while taking up a Machine Learning project.



Word Sense Disambiguation using WSD Specific WordNet of Polysemy Words Aiswarya K. Surendran M.Tech Computational Linguistics Government Engineering College, Sreekrishnapuram aiswaryasurendrank@gmail.com

In Natural Language Processing (NLP), Word Sense Disambiguation (WSD) is the task of assigning a suitable sense to words in a given context. WSD plays an important role and is considered a core research problem in computational linguistics. This concept presents a new model of WordNet that is used to disambiguate the correct sense of a polysemy word based on clue words. The related words for each sense of a polysemy word, as well as single-sense words, are referred to as the clue words. The conventional WordNet organises nouns, verbs, adjectives and adverbs into sets of synonyms called synsets, each expressing a different concept. In contrast to the structure of WordNet, the new model organizes the different senses of polysemy words as well as single-sense words based on these clue words, which are then used to disambiguate the correct meaning of a polysemy word in a given context using knowledge-based WSD algorithms.


Here a clue word can be a noun, verb, adjective or adverb. Words that express two or more different meanings when used in different contexts are referred to as polysemy or multi-sense words. Every natural language contains such polysemy words in its vocabulary, and they create a big problem during translation from one natural language to another. To translate the correct meaning of a polysemy word, the machine must first know the context in which the word has been used. Only then can the machine find the correct meaning of the word in that particular context and translate it into the correct word in the other language. The process of finding the correct meaning of a polysemy word by machine, by analyzing the context in which it has been used, is referred to as Word Sense Disambiguation (WSD).


In the proposed model, polysemy words must be organised in such a way that the model contains only those words that are sufficient to disambiguate the different senses of a polysemy word, without reintroducing ambiguity as happens when using WordNet. Although WordNet is a very useful lexical resource for disambiguating the different meanings of a polysemy word, it still has some limitations. Therefore, to dramatically improve the accuracy of knowledge-based overlap-selection WSD algorithms, the authors developed a new logical model that organizes the words in such a way that it disambiguates the meanings of a polysemy word correctly and efficiently, resulting in higher accuracy than can be obtained with WordNet. Unlike in WordNet, they organized the different senses of polysemy words as well as single-sense words, and grouped each sense of a polysemy word based on the verbs, nouns, adverbs and adjectives with which that sense can be used in a sentence. Single-sense words are organised in the same way. This organisation of words results in a new model of WordNet which is focused on WSD. It contains all the necessary and sufficient words that can be used to disambiguate the senses of a polysemy word, and it excludes unnecessary words and therefore does not create ambiguity the way WordNet does.




Here it is believed that each sense of a polysemy word has some distinct words related only to that sense, which describe the sense. These are called the related words for the sense. For example, the words copy, write etc. are the related words for the sense 'writing implement with a point from which ink flows' of the polysemy word pen. Therefore, if the word pen is used with copy, that is sufficient evidence to understand the meaning of pen as a writing implement. This does not require the overlap counting that may introduce noisy information. The only task is to find such related words for each sense of the polysemy word and organize them in a new lexical database. PolyWordNet is this new lexical database. It organizes the multiple senses of a polysemy word in such a way that each sense is linked with its related words, divided into verbs, nouns, adverbs, adjectives and prepositions. In PolyWordNet, each related word is linked with only one sense of a polysemy word, except for bridging related words. If a word is equally semantically related to more than one sense of the same polysemy word, it is simply ignored and not included in the related words of either sense, because such a word would lead to all of those senses during the disambiguation process, reintroducing ambiguity. However, if two senses of a polysemy word have very similar meanings, a bridging related word common to both senses is needed.


Such a bridging related word, common to more than one sense of the polysemy word, is included in PolyWordNet since it facilitates sense disambiguation.

Fig: A sample of organizing words in PolyWordNet
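To make the organization concrete, here is a minimal sketch (not the authors' implementation) of a PolyWordNet-style structure in Python; all words, sense names and groupings below are illustrative assumptions.

```python
# Illustrative PolyWordNet-style structure: each sense of a polysemy word is
# linked to its related (clue) words, grouped by part of speech.
poly_word_net = {
    "pen": {
        "writing_implement": {
            "verb": ["write", "copy", "sign"],
            "noun": ["ink", "paper"],
        },
        "animal_enclosure": {
            "verb": ["fence", "herd"],
            "noun": ["pig", "cattle"],
        },
    },
}

def clue_words(word, sense):
    """Collect every related word linked to one sense of a polysemy word."""
    groups = poly_word_net[word][sense]
    return [w for pos_group in groups.values() for w in pos_group]

print(clue_words("pen", "writing_implement"))  # ['write', 'copy', 'sign', 'ink', 'paper']
```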

The method used in this concept does not count word overlaps between the context and the sense bags for sense disambiguation. Instead, it searches for paths or links from the context words to the senses of a target word, keeping track of each path or link that connects a context word and a sense of the target word. If the paths thus obtained connect only one sense of the target word, the algorithm outputs the linked sense as the correct sense of the target word for the given context. If there are paths that link more than one sense, the algorithm counts the number of paths or links for each linked sense.


Then the sense with the maximum number of connection paths is selected as the correct sense. If two or more senses have an equal number of paths, the first sense in the array maintained by the algorithm is selected as the correct sense. If no connection path is found, the algorithm displays a message indicating failure of disambiguation.
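The path-counting step described above can be sketched as follows (a toy reconstruction, reusing the illustrative poly_word_net structure from the earlier example; the real system works over a full lexical database):

```python
def disambiguate(target, context, net):
    """Pick the sense of `target` whose related words are linked by the context."""
    path_counts = {}
    for sense, groups in net[target].items():
        related = {w for pos_group in groups.values() for w in pos_group}
        # each context word found among the sense's related words counts as one path
        paths = sum(1 for word in context if word in related)
        if paths:
            path_counts[sense] = paths
    if not path_counts:
        return None  # no connection path found: disambiguation fails
    # the sense with the maximum number of paths wins; on a tie, max() keeps
    # the first sense encountered, mirroring the first-sense tie-break above
    return max(path_counts, key=path_counts.get)

print(disambiguate("pen", ["copy", "letter"], poly_word_net))  # writing_implement
```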

References
[1] Udaya Raj Dhungana, Subarna Shakya, Kabita Bara and Bharat Sharma, "Word Sense Disambiguation using WSD Specific WordNet of Polysemy Words", Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing.
[2] Udaya Raj Dhungana, Subarna Shakya, "Word Sense Disambiguation Using PolyWordNet", Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Text Processing.

Open source Deep Learning curriculum: A comprehensive deep learning curriculum starting with the mathematical, statistical and CS fundamentals and leading up to advanced material such as Reinforcement Learning and Convolutional Neural Networks for image recognition.


[3] Priti Saktel, Urmila Shrawankar, "Context Based Meaning Extraction for HCI Using WSD Algorithm: A Review", IEEE International Conference on Advances in Engineering, Science and Management (ICAESM-2012), March 30-31, 2012.

Computers successfully trained to identify animals in photos: A computer model developed at the University of Wyoming and elsewhere has demonstrated high accuracy and efficiency in identifying images of wild animals from camera-trap photographs. Researchers trained a deep neural network to classify wildlife species using 3.37 million camera-trap images of 27 species of animals obtained from five states across the United States. The model was then tested on nearly 375,000 animal images at a rate of about 2,000 images per minute on a laptop computer, achieving 97.6 percent accuracy, likely the highest accuracy to date in using machine learning for wildlife image classification.


Sense Aware Neural Model: A Method to Identify Pun Location in Texts
Minimol M
M.Tech Computational Linguistics
Government Engineering College, Sreekrishnapuram
minimolm13@gmail.com

There exists a class of language constructs known as puns in natural language utterances and texts, in which the speaker or writer intends a certain word or other lexical item to be interpreted as simultaneously carrying two or more separate meanings. Though puns are an important feature in many discourse types, they have attracted relatively little attention in the area of natural language processing. The pun, also called paronomasia, is a form of word play that exploits multiple meanings of a term, or of similar-sounding words, for an intended humorous or rhetorical effect. These ambiguities can arise from the intentional use of homophonic, homographic, metonymic, or figurative language. A pun differs from a malapropism in that a malapropism is an incorrect variation on a correct expression, while a pun involves expressions with multiple correct interpretations. Puns may be regarded as in-jokes or idiomatic constructions, as their usage and meaning are specific to a particular language and its culture. Puns have a long history in human writing. The Roman playwright Plautus was famous for his puns and word games, and puns were found in ancient Egypt, where they were heavily used in the development of myths and the interpretation of dreams. In literature, puns have been used by famous writers throughout history; in constructing puns, William Shakespeare was a master craftsman. Since pun words play the key role in forming a pun, it is important to identify the pun word in a given text.


This task of identifying pun words is known as pun location. Traditionally, many methods have been used for this purpose, but their results show that pun location is a challenging task. A better alternative approach is the Sense Aware Neural Model (SAM), an extension of the baseline neural model. To improve prediction performance, the sequence of word senses corresponding to each WSD result is modelled by a bidirectional LSTM network, and the outputs of the different LSTM networks are then concatenated for prediction. In this way the different senses of words can be captured. The baseline model works at the word level, but the sense-aware model works at the sense level. Extensive research has been done in the field of modelling and detecting puns. The context or the sense depends largely on the perspective and knowledge the reader has of a particular language. For example, the following punning joke exploits contrasting meanings of the word interest and is a homographic pun: "I used to be a banker but I lost interest." The word 'interest' conveys two different meanings or senses in the sentence, so it could be called a pun. A pun is the exploitation of the various meanings of a word, or of words with phonetic similarity but different meanings. Since the pun word plays the key role in forming a pun, it is very important and meaningful to identify the pun word in a given text.


The task of identifying the pun word is known as pun location. To address this task, various approaches have been attempted, including rule-based, knowledge-based and supervised approaches. However, these approaches do not achieve good results; the best F1 score for homographic pun location is just 0.6631, achieved by the Idiom Savant system with a knowledge-based approach. The results demonstrate that pun location is a very challenging task. Computational processing of puns involves two separate tasks. In pun detection, the object is to determine whether or not a given context contains a pun, or more precisely whether any given word in a context is a pun. In pun identification (or pun disambiguation), the object is to identify the two meanings of a term previously detected, or simply known a priori, to be a pun. A variety of methods have been proposed to locate pun words. For example, the UWaterloo system constructs a rule-based pun locator that scores candidate words according to eleven simple heuristics. The BuzzSaw system attempts to locate the pun in a sentence by selecting the polysemous word with the two most dissimilar senses. The Duluth system identifies the last word which changed senses between different word sense disambiguation results. The Fermi system uses a bidirectional RNN to learn a classification model. The Idiom Savant system uses n-gram features, and only content words including nouns, verbs, adverbs and adjectives are considered as candidate words. Pun interpretation is considered a subsequent step after pun location, and it aims to annotate the two meanings of a given pun by reference to WordNet sense keys. In that work, traditional WSD approaches are adapted to disambiguate puns, or rather to identify their double meanings.


Word Sense Disambiguation (WSD) is also related to this. Some prior works compute overlaps of glosses between the target word and its context. These approaches derive information from lexical thesauruses for WSD, including WordNet and BabelNet. Supervised models, including neural models, have been successfully applied to WSD. Pun location is a more challenging task than pun detection, because it aims to find the actual pun word in the given text. Previous works find some clues about puns in texts: for example, a pun is more likely to appear towards the end of a sentence, and many puns have particularly strong associations with other words in their contexts. The task of pun location needs to locate the exact pun word in each short text or sentence. Pun location is regarded as a word-level classification task, and the aim is to train a model that can predict whether a word in a sentence is a pun or not. A word is likely to be a pun word when it is a noun, verb, adjective or adverb; therefore, a prediction is made for a word only when it has one of these four kinds of part-of-speech tags. A sense-aware neural model is built on WSD results. Two or more WSD results are obtained by using different WSD algorithms or different configurations, and the WSD results may differ. The sequence of word senses corresponding to each WSD result is modelled by a bidirectional LSTM network and the outputs of the different LSTM networks are then concatenated for prediction. The architecture of the sense-aware model contains several bidirectional LSTM networks: for each WSD result, the sequence of sense embeddings is taken as input to one bidirectional LSTM network. The figure shows the sense-aware neural model with bidirectional LSTMs.


Assume we have K Word Sense Disambiguation results. The outputs h_i^(j) (j = 1, ..., K) of the K different bidirectional LSTM networks for the same i-th word (i.e., the i-th time step) are then concatenated into one vector.
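The concatenation step can be sketched in a few lines of PyTorch (an illustration under assumed dimensions, not the authors' code); the final two-layer feed-forward scorer with a sigmoid follows the prediction setup described below.

```python
import torch
import torch.nn as nn

K, batch, seq_len, emb_dim, hidden = 2, 4, 10, 100, 50

# one bidirectional LSTM per WSD result
lstms = nn.ModuleList(
    nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True) for _ in range(K)
)
# K sequences of sense embeddings, one per WSD result (random stand-ins here)
sense_seqs = [torch.randn(batch, seq_len, emb_dim) for _ in range(K)]

# run each sense sequence through its own bi-LSTM, then concatenate the
# outputs at every time step i into one vector [h_i^(1); ...; h_i^(K)]
outputs = [lstm(seq)[0] for lstm, seq in zip(lstms, sense_seqs)]
h = torch.cat(outputs, dim=-1)                     # (batch, seq_len, K * 2 * hidden)

# a two-layer feed-forward network with a sigmoid scores each word as pun / not pun
scorer = nn.Sequential(nn.Linear(K * 2 * hidden, hidden), nn.ReLU(),
                       nn.Linear(hidden, 1), nn.Sigmoid())
pun_prob = scorer(h).squeeze(-1)                   # (batch, seq_len)
```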

In this work, the Wikipedia corpus is used to train word embeddings (together with contextual embeddings of words) using word2vec, and the word embeddings are then used by SenseGram to induce the sense inventory and sense embeddings. The word similarity graph used by SenseGram is built from the similarity between word embeddings. WordNet is not used as the sense inventory because it is too fine-grained and many words are not included in it. Given a target word w and its context words C = c1, ..., ck in the sentence, we want to assign a sense vector to w from the set of its sense vectors S = s1, ..., sm. The system uses two simple WSD methods for this:

1. The first WSD strategy is based on the sigmoid function. Let c_c be the mean of the contextual embeddings of the words in C. The sense embedding of w is chosen as the s_i in S with the highest sigmoid score σ(s_i · c_c).

This vector is sent to a two-layer feed-forward neural network and a sigmoid function for prediction. The sense-aware model can be considered as applying the baseline model to different WSD results and then combining the outputs for prediction. In order to obtain the sense inventory and the sense embeddings for each word, SenseGram is chosen. The SenseGram toolkit is available online; it takes word embeddings as input and splits the different senses of the input words. For instance, the vector for the word table will be split into table (data) and table (furniture). SenseGram induces a sense inventory from existing word embeddings via clustering of ego-networks of related words.


2. Let c_w be the mean of the word embeddings of the words in C, which is different from c_c. The second disambiguation strategy is based on the cosine similarity function: the sense embedding of w is chosen as the s_i in S with the highest cosine similarity cos(s_i, c_w).

For each WSD strategy, different window sizes of 3 and 50 (the maximum sentence length in the corpus) can be set as different configurations. The methodology first obtains several WSD results for the text, and then leverages a bidirectional LSTM network to model each sequence of word senses.
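The cosine-based strategy, for instance, reduces to a few lines of numpy (a sketch under the definitions above; the random vectors stand in for embeddings that word2vec and SenseGram would actually provide):

```python
import numpy as np

rng = np.random.default_rng(0)
sense_vectors = rng.normal(size=(3, 100))    # S = s1, ..., sm for the target word w
context_vectors = rng.normal(size=(5, 100))  # word embeddings of C = c1, ..., ck

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

c_w = context_vectors.mean(axis=0)           # mean of the context word embeddings
best = max(range(len(sense_vectors)), key=lambda i: cosine(sense_vectors[i], c_w))
print("chosen sense index:", best)
```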


The outputs at each time step of the different LSTM networks are then concatenated for prediction. This work applies neural network models to the pun location task and proposes a novel sense-aware neural model that leverages multiple WSD results. The baseline model does not perform well here: it is built at the word level, so word senses can only be implicitly captured, and since a pun word usually has two senses in a sentence, the baseline neural model cannot disambiguate them. The sense-aware model, in contrast, is built at the sense level and can disambiguate the two senses. As future work, the model can be tested with more advanced WSD algorithms, and the pun interpretation task can be addressed.

References
[1] Y. Cai, Y. Li, and X. Wan, "Sense-aware neural models for pun location in texts", in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), vol. 2, pp. 546-551, 2018.

[2] T. Miller, C. Hempelmann, and I. Gurevych, "Detection and interpretation of English puns", in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 58-68, 2017.
[3] T. Pedersen, "Puns upon a midnight dreary, lexical semantics for the weak and weary", arXiv preprint arXiv:1704.08388, 2017.
[4] V. Indurthi and S. R. Oota, "Detection and interpretation of homographic puns in English language", in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 457-460, 2017.
[5] S. Doogan, A. Ghosh, H. Chen, and T. Veale, "Detection and interpretation of English puns", in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 103-108, 2017.
[6] T. Miller and I. Gurevych, "Automatic disambiguation of English puns", in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), vol. 1, pp. 719-729, 2015.
[7] Y. Xiu, M. Lan, and Y. Wu, "Using supervised and unsupervised methods to detect and locate English puns", in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 453-456, 2017.


Behind Google’s Kaggle Acquisition: Google is eager to point out that its recent acquisition of Kaggle is part of its mission to ‘democratize AI’.

Artificial Intelligence Mimics Navigation Cells in the Brain: An algorithm trained to move through a virtual environment spontaneously generated patterns of activity found in so-called grid neurons.




Text Categorization Using Disconnected Recurrent Neural Networks Sibila M M.Tech Computational Linguistics Government Engineering College, Sreekrishnapuram sibilamayyathodi@gmail.com

An artificial neural network (ANN) is a computational model based on the structure and functions of biological neural networks. ANNs are considered nonlinear statistical data modelling tools with which the complex relationships between inputs and outputs are modeled or patterns are found. Text categorization is a fundamental and traditional task in natural language processing (NLP). The most commonly used method to handle the task is to represent a text with a low-dimensional vector, then feed the vector into a softmax function to calculate the probability of each category. Recurrent neural networks (RNN) and convolutional neural networks (CNN) are two kinds of neural networks usually used to represent the text. RNN can model the whole sequence and capture long-term dependencies. However, modeling the entire sequence can sometimes be a burden, and it may neglect key parts for text categorization. CNN is able to extract local and position-invariant features well. The figure below shows an example of topic classification, where both sentences should be classified as Science and Technology. The key phrase that determines the category is 'unsolved mysteries of mathematics', which can be well extracted by CNN due to position-invariance.


Figure: Example of Topic Classification

RNN, however, does not address such issues well, because the representation of the key phrase relies on all the previous terms and changes as the key phrase moves. A novel model named Disconnected Recurrent Neural Network (DRNN) incorporates position-invariance into RNN. The idea is to disconnect the information transmission of RNN and limit the maximal transmission step length to a fixed value k, so that the representation at each step depends only on the previous k-1 words and the current word. In this way, DRNN can also alleviate the burden of modeling the entire document. To maintain position-invariance, max pooling is utilized to extract the important information. The model can also be regarded as a special 1D CNN where the convolution kernels are replaced with recurrent units; the maximal transmission step length can therefore be considered as the window size in CNN. Another difference from CNN is that DRNN can increase the window size k arbitrarily without increasing the number of parameters.


Recurrent Neural Networks

RNN is a class of neural networks which models a sequence by incorporating the notion of time steps. The figure shows the structure of RNN.

Figure: Structure of RNN [1]

The hidden state at each step depends on all the previous inputs, which can sometimes be a burden and can neglect the key information. Since the representation of state t depends on all the previous input vectors, the t-th step can be expressed as:

h_t = GRU(x_t, x_{t-1}, x_{t-2}, ..., x_1)

A common variant of the recurrent unit is the Gated Recurrent Unit (GRU). GRU is a special type of RNN unit, capable of learning potential long-term dependencies by using gates; the gating units control the flow of information.

Disconnected Recurrent Neural Networks

To reduce the burden of modeling the entire sentence, DRNN limits the distance of information flow in RNN. As in other RNN variants, the input sequence is fed into the model and an output vector is generated at each step. The important difference from RNN is that the state of the model at each step is related only to the previous k-1 words and the current word, not to all the previous words, where k is a hyperparameter called the window size:

h_t = RNN(x_t, x_{t-1}, x_{t-2}, ..., x_{t-k+1})

Figure: Structure of DRNN

Since the output at each step depends only on the previous k-1 words and the current word, it can also be regarded as a representation of a phrase of k words. Phrases with the same k words will always have the same representation no matter where they appear; that is, position-invariance is incorporated into RNN by disconnecting its information flow.
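A minimal PyTorch sketch of the DRNN idea (an illustration, not the paper's implementation): each state is produced by running a GRU over only the current word and at most k-1 previous words, here with a simple per-step loop. Dimensions and names are assumed.

```python
import torch
import torch.nn as nn

class DRNN(nn.Module):
    def __init__(self, input_size, hidden_size, k=5):
        super().__init__()
        self.k = k                            # window size: maximal transmission step length
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, x):                     # x: (batch, seq_len, input_size)
        states = []
        for t in range(x.size(1)):
            start = max(0, t - self.k + 1)
            window = x[:, start:t + 1, :]     # current word plus at most k-1 previous words
            _, h = self.gru(window)           # final hidden state after reading the window
            states.append(h.squeeze(0))       # (batch, hidden_size)
        return torch.stack(states, dim=1)     # (batch, seq_len, hidden_size)

drnn = DRNN(input_size=100, hidden_size=50, k=5)
text = torch.randn(2, 30, 100)                # a batch of 2 texts, 30 word vectors each
print(drnn(text).shape)                       # torch.Size([2, 30, 50])
```

Because each state reads only a k-word window, a phrase of the same k words yields the same state wherever it occurs, which is exactly the position-invariance described above.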


DRNN for Text Categorization

DRNN is a general model framework which can be used for a variety of tasks; here, only its application to text categorization is discussed.

Here GRU is utilized as the recurrent unit of DRNN to get the context representation of each step. Every context vector can be considered as a representation of a text fragment. The context vectors are then fed into a multi-layer perceptron (MLP) to extract high-level features.

To get the text representation vector, max pooling is applied after the MLP layer to extract the most important information and position-invariant features. Finally, the text representation vector is fed into an MLP with rectified linear unit (ReLU) activation, and the output of the MLP is sent to a softmax function to predict the probability of each category.
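Putting the pieces together, the classification head might look as follows (a sketch with assumed dimensions, not the paper's code; it also includes the Batch Normalization after DRNN that is noted below):

```python
import torch
import torch.nn as nn

hidden, feat, num_classes = 50, 64, 4

class DRNNClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.bn = nn.BatchNorm1d(hidden)               # alleviates internal covariate shift
        self.mlp = nn.Linear(hidden, feat)             # extracts high-level features per step
        self.head = nn.Sequential(nn.Linear(feat, feat), nn.ReLU(),
                                  nn.Linear(feat, num_classes))

    def forward(self, h_seq):                          # h_seq: (batch, seq_len, hidden) from DRNN
        h = self.bn(h_seq.transpose(1, 2)).transpose(1, 2)
        h = torch.relu(self.mlp(h))                    # MLP on every context vector
        text_vec = h.max(dim=1).values                 # max pooling -> text representation vector
        return torch.softmax(self.head(text_vec), dim=-1)  # probability of each category
```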

Figure: Architecture Diagram

Before feeding the vectors into the MLP, Batch Normalization is utilized after DRNN, so that the model can alleviate the internal covariate shift problem.

References
[1] Baoxin Wang, "Disconnected Recurrent Neural Networks for Text Categorization", in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018.
[2] Y. Wang and F. Tian, "Recurrent residual learning for sequence classification", in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 938-943, 2016.


[3] Y. Kim, "Convolutional neural networks for sentence classification", in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746-1751, 2014.

AI Lends Computing Power to Academic Search Engines: A tool that uses machine learning algorithms to comb and categorize scientific literature is making waves in neuroscience!


Intelligent Question Answering System Using Artificial Neural Network Sreeja V M.Tech Computational Linguistics Government Engineering College, Sreekrishnapuram sreejaverkot96@gmail.com

Modern information retrieval systems allow us to locate documents that might contain the associated information, but most of them leave it to the user to extract the useful information from an ordered list. Question Answering Systems (QAS) are used to retrieve correct responses to questions asked by humans in natural language. QAS plays an important role in man-machine interaction, and it can be used for fast information extraction by sparing the user from reading unnecessary information which might not lead to the answer. Many search engines are available today and their impact on daily life is very high. All these search engines are successful in retrieving relevant documents according to the user's needs, but their main problem is that, instead of giving a direct, accurate and precise answer to the user's query, they usually provide a list of documents and websites which might contain the answer to that question. Although the list of documents retrieved by the search engine has a lot of information about the search topic, sometimes it does not contain the relevant information the user is looking for. Search engines present a ranked list of relevant documents in response to a user's query based on various aspects such as popularity measures, keyword matching, frequencies of accessing documents, etc.


So the users have to examine each document one by one to get the desired information. Various NLP models are used for QAS. Earlier models are based on symbol matching, which makes use of linguistic annotation, structured world knowledge and semantic parsing. These methods are not very effective, as they are only able to answer simple questions. The other approach uses neural networks. The proposed system creates a deep neural network from the documents provided by the user and stores it for future use. It tries to imitate the human way of recalling information by processing the document first, i.e. understanding it, and then trying to find the answer to the questions asked. The system processes the question asked by the user, comprehends it to understand what answer is required, and then tries to find the answer from the deep neural network created previously from the documents provided. The system consists of two phases. In the initial phase, a document given by the user is processed to create the Artificial Neural Network. In the second phase, the user's question is processed and an answer is given to the user.


The initial ANN construction phase: Initially, the user provides a document to the system as input; it is processed, and based on this document's knowledge the QAS can answer. Sentences in the document are divided into words so that each word can be processed individually. POS tagging and entity recognition steps are performed, which are used to assign deep cases. Part of Speech (POS) tagging is used to tag words in sentences with parts of speech like noun, verb, adverb etc. Entity Recognition is used to label atomic elements in the sentence into categories like person, location, time, etc. Then deep cases are assigned to each word (entity) based on the results of POS tagging and entity recognition. The following table contains a list of deep cases with their descriptions.
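As an illustration of these tagging steps (the paper does not prescribe a particular toolkit), spaCy can produce both POS tags and entity labels; the model name below is the standard small English model.

```python
# Illustrative POS tagging and entity recognition with spaCy.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Gandhiji was born in Porbandar on 2 October 1869.")

for token in doc:
    print(token.text, token.pos_)     # e.g. born VERB, Porbandar PROPN, ...
for ent in doc.ents:
    print(ent.text, ent.label_)       # e.g. Porbandar GPE, 2 October 1869 DATE
```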

Let us discuss these steps with an example. Say the document uploaded by the user contains the following sentence: "Gandhiji was born in Porbandar on 2nd October 1869."

First, divide the input sentence into words and assign a unique ID to each word. Now extract the knowledge units from the input data:

K1: Gandhiji was born in Porbandar
K2: Gandhiji was born on 2-Oct-1869

Next, assign a word type to each word (assignment of word type to words) and assign deep cases to each word (assignment of deep cases to words). Then the connections between knowledge units and words are established. These are the steps included in the initial phase of the system.


Now we can extract the relationships between knowledge units and words, and create a neural network corresponding to these relationships.


The network diagram is shown above.

The next phase deals with processing the user's question. The system now has the neural network: it has some knowledge and can answer questions based on this knowledge only. So we ask the system a question: "Where was Gandhiji born?" The system analyses each word in the question, searches for the words in the network, and extracts the knowledge units that relate these words. Here both K1 and K2 relate the words 'Gandhiji' and 'born'. K1 answers "where", whereas K2 answers "when"; this is evaluated from the word type table created in the learning phase. The user has asked "where", so the output knowledge unit will be K1, and the corresponding knowledge unit is displayed to the user.
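A toy reconstruction of this lookup is sketched below; the paper builds an actual neural network, while this mimics the word-to-knowledge-unit links with plain dictionaries, so every structure here is an illustrative assumption.

```python
# Knowledge units are linked to the words they contain and to the
# question word they answer ("where" / "when").
knowledge_units = {
    "K1": {"text": "Gandhiji was born in Porbandar", "answers": "where"},
    "K2": {"text": "Gandhiji was born on 2-Oct-1869", "answers": "when"},
}
word_links = {
    "gandhiji": {"K1", "K2"},
    "born": {"K1", "K2"},
    "porbandar": {"K1"},
}

def answer(question):
    q_words = question.lower().strip("?").split()
    q_type = next((w for w in q_words if w in ("where", "when")), None)
    # intersect the knowledge units linked to each known question word
    linked = [word_links[w] for w in q_words if w in word_links]
    candidates = set.intersection(*linked) if linked else set()
    for unit in candidates:
        if knowledge_units[unit]["answers"] == q_type:
            return knowledge_units[unit]["text"]
    return "Disambiguation failed"

print(answer("Where was Gandhiji born?"))  # Gandhiji was born in Porbandar
```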

The QAS proposed here is able to answer complex questions by assigning deep cases to the words in complicated sentences. The proposed system can be helpful for faster information extraction by the user. This QAS can be advanced further by making it extract information from images, tables and other statistical documents, which would enable the user to process more information and get the maximum out of the QAS.

References
[1] C. Manning, H. Schütze, "Foundations of Statistical Natural Language Processing", MIT Press, Cambridge, 1999.

[2] Dipanjan Das, Desai Chen, André F. T. Martins, Nathan Schneider, and Noah A. Smith, "Frame-semantic parsing", Computational Linguistics, vol. 40, no. 1, pp. 9-56, 2014.
[3] H. Garis, C. Shuo, B. Goertzel, L. Ruiting, "A world survey of artificial brain projects, Part I: Large-scale brain simulations", Neurocomputing, vol. 74, no. 1-3, pp. 3-29, August 2010.

[4] B. Goertzel, R. Lian, I. Arel, H. Garis, S. Chen, "A world survey of artificial brain projects, Part II: Biologically inspired cognitive architectures", Neurocomputing, vol. 74, no. 1-3, pp. 30-49, August 2010.


SIMPLE Groups

M.Tech Computational Linguistics Dept. of Computer Science and Engg, Govt. Engg. College, Sreekrishnapuram Palakkad www.simplegroups.in simplequest.in@gmail.com

Students Innovations in Morphology Phonology and Language Engineering

Article Invitation for CLEAR - March 2019

We are inviting thought-provoking articles, interesting dialogues and healthy debates on the multifaceted aspects of Computational Linguistics for the forthcoming issue of CLEAR (Computational Linguistics in Engineering and Research) Journal, to be published in March 2019. The suggested areas of discussion are:

Articles may be sent to the Editor on or before 10th March, 2019 through the email simplequest.in@gmail.com. For more details visit: www.simplegroups.in

Editor, CLEAR Journal
Representative, SIMPLE Groups



Hello world,

This latest edition of the CLEAR journal comes with some trending topics: Word Sense Disambiguation, the Sense Aware Neural Model, Text Categorization and Intelligent Question Answering Systems. Word sense disambiguation deals with the task of assigning a suitable sense to words in a certain context, with the help of a WordNet used to disambiguate the correct sense of a polysemy word based on clue words. Next, a sense-aware neural model is built to find the pun locations present in a text. In NLP, text categorization is considered a fundamental task in which a text is represented by a low-dimensional vector that is fed into a softmax function to calculate the probability of each category, using a recurrent neural network (RNN). Question Answering Systems (QAS) are used to retrieve correct responses to questions asked by humans in natural language. It is clear that the field of computational linguistics is advancing day by day and capturing real attention in almost every other field. These articles are based on recent work and research in computational linguistics. CLEAR is thankful to all who have given their valuable time and effort to contribute their thoughts and ideas. Simple Groups invites more aspirants to this field. Wishing you all success in your future endeavours!!!

DIVYA VISAKH




