Journal of Engineering and Technology (IRJET)

Volume: 09 Issue: 04 | Apr 2022 www.irjet.net
Sentiment Analysis using Naïve Bayes, CNN, SVM
e ISSN: 2395 0056
ISSN: 2395 0072
Ankita Yekurke1 , Gayatri Sonawane2 , Harshada Chavan3, Pradhyumn Thote4, Anuja Phapale
1 5Department of Information Technology, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, India
***
Abstract Thousands and millions of customers express their opinions on a brand, product, or service in major businesses. In the big picture, manually evaluating every review is difficult, and as a result, Sentiment analysis has received a lot of attention lately. Sentiment analysis is the mining of text which identifies and extracts subjective information from the source material and helping businesses to understand the social sentiment of their brand, product, or service while monitoring online conversations. The proposed paper focuses on different approaches and algorithms that could potentially help in sentiment analysis.
Key Words: Sentiment Analysis, Naïve Bayes, CNN, SVM
1.INTRODUCTION
Human uses natural language to communicate with each otherandthesamethingcomputershouldreplicate(Copy) whilecommunicatingwiththeuserandforthispurposeonly we use NLP. we know that NLP has a wide domain like Speech Recognition, sentimental analysis, machine translations,ChatbotsaresomeapplicationsofNIP.Butin this paper, we are going to work on one of the specific applicationsof NLPnamedSentimentalanalysis.Sentiment analysismodelsfocusonpolarity positive,negative,neutral reviewbutalsofeelingsandemotionsandevenonintentions ofusers.
Sentimentanalysisisaprocedurethatanalysesareviewand evaluates if the writer's attitude towards a given topic or product is positive, negative, or neutral using natural languageprocessingandatextcategorizationtool.Analyzing many evaluations without a properly trained system is a difficult process. Sentiment analysis is a solution to this problemthatinvolvestrainingasystemonmillionsoftexts to determine if a message is good, negative, or neutral. It works by breaking down a review into separate sets of words,cleaningallnon essentialmaterial,andthendefining thetypeofreviewbasedontheevaluationscore(positive, negative,orneutral).
2. ALGORITHM
Afterreviewingtheliterature,wefoundthatforsentiment analysisthreemainalgorithmswereused.Inthissection,we arediscussingthosealgorithms.
1. Naïve Bayes Algorithm:
value:
Classificationmeansthegroupingofdatabasedoncommon characteristics. A Naive Bayes classifier is a probabilistic classifier that works by figuring out the probability of different attributes of data being associated with a certain class. The Naive Bayes classification algorithm utilizes the probability theory proposed by British scientist Thomas Bayes, which predicts future probabilities based on experience.
BayesTheorem:
ThisTheoremstatesthat“theprobabilityofAoncondition thatB is true equals to the probability of BprovidingA is true times the probability of A being true, divided by the probabilityofBbeingtrue.”
In Naïve Bayes, the classification processis oftendivided into two phases: learning and testing.a number ofthe informationthathasbeenknownfortheinformationclassis fedintotheeducationalphasetoconstructanapproximate model.Thenwithinthetestphase,themodelthathasbeen formedistestedwithanotherdatatoseetheaccuracyofthe model.
a) Working:
Letusconsider, x >thetextdataorsayabunchofreviews y >theclasses(let’ssay0fornegativeand1forpositive)


forpositivereviews fornegativereviews
Nowweareattemptingtoseewhetherareviewispositive ornegativedue to whichthedenominatorwillbeignoredfor theonce

One of the foremost important assumptions we'veconsidered in Naive Bayesis termedthe bag of words.Itimpliesthateveryonethealgorithmcaresaboutare thatthewordanditsfrequencyi.e.,whatnumbertimesthe wordhasappearedinourdata.Thepositionofthewordin our sentence(document)doesn'tmatterin the least. We
9001:2008
Research Journal of Engineering and Technology (IRJET)

Volume: 09 Issue: 04 | Apr 2022 www.irjet.net
onlyshouldkeep track ofthe quantityof timesa specificwordappearedinourdocument.That’sit.
b) Implementation:
Let’stakeanexample-“Fabricisverygood”.
Initially,awordcountisdone.
words count fabric 1 is 1 very 1 good 1
Thisisabag of wordsconcept.Itdoesnotmatterwherethe wordshavebeenusedinthesentences,allthatmattersisits frequency.
If you may recall, our main goal was to find the class (whetherpositiveornegativesentiment)givenaparticular sentence(document).So,wecanapproachthisproblemthis way:
1.SupposewehaveasetofpossibleclassesC.
2.Wefindtheprobabilityofadocumentbeinginaparticular class.Soessentiallyconditionalprobabilityofaclassgivena document.
3.Weiterateoveralltheclassesandfindwhichclasshasthe maximumconditionalprobability,givingusouranswer.
2. SVM (Support Vector Machine)

e ISSN: 2395 0056
p ISSN: 2395 0072
Fig1:BasicSystemArchitectureofNaïveBayesAlgorithm

SVM (support vector machine) is a Supervised learning method algorithm, which is used for classification worldwide. It separates new inputs with the help of its hyperplaneandgeneratesoutput.
a) Pre processing:
Letusplotthen dimensionalspace&takethevalueofeach newfeatureasthevalueofaspecificcoordinate.Then,by differentiating those two classes we can find the ideal hyperplane.[Fig2]
Fig2:SVMmodel.
b) Working:
SVM set aside the hyperplane that maximizes the margin betweentwoclasses.[Fig2]
( )+b= ( )+b=0
=n dimensionalinputvector, =itsoutputvalues

weightofvector(thenormalvector)
=Lagrangianmultiple.
If . +b≥0→positiveclass

Else→negativeclass
a) Implementation: StepsforimplementationofSVM
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page695
Research Journal of Engineering and Technology (IRJET)

Volume: 09 Issue: 04 | Apr 2022 www.irjet.net
Step1:Importtheimportantlibrarieswhicharerequiredfor theimplementationoftheproject.
Step2: DownloadthedatasetandapplySVMtothatdataset forfurtheranalysis.
Step 3: Dataset will get splitter into two parts, using one user definedfunctionSplit().
Step4:Nowwecanvisualizeandobservedatathatoneof theclassesislinearlyseparable.
Fig3:BasicSystemArchitectureofSVM
SVM has been proved one of the most efficient learning algorithmsfortextTokenization.ThisSVMalgorithmcanbe used to perform an evaluation measure on comments or feedbackreceivedfromcustomers.

3.CNN(ConvolutionalNeuralNetwork):


CNN stands for "Convolutional Neural Network" and is largelyutilizedinimageprocessing.Thenumberofhidden layers employed between the input and output layers determinestheCNN'sstrength.Asetoffeaturesisextracted byeachlayer.
Aseriesoffiltersareappliedtotheinputtocreatefeature maps.Eachfiltermultipliesitsweightsbytheinput values aftergoingoverthefullinput.TheresultissenttoaRectified LinearUnit(ReLU),sigmoid,ortheactivationfunction.The setofweightsisevaluatedusingalossfunction.Thefeature mapsthatthefiltersproducehighlightvariousaspectsofthe input.
e ISSN: 2395 0056
p ISSN: 2395 0072
Fig4:CNNModel
EventhoughCNNiscommonlyutilizedinimageandvideo processing, recent approaches to NLP utilize CNN. A pre processingstepinNLPconvertsthetextinputtoa matrix representation.Sentencecharactersareusedasrows,and alphabetlettersareusedascolumns,inthematrixform.In NLP,afilterisslidacrossthematrix'swords.Asaresult,the wordsaredetectedusingtheslidingwindowtechnique.
Fig5:BasicArchitectureofCNN
2022, IRJET
Factor value: 7.529 | ISO 9001:2008
09 Issue: 04
3. LITERATURE SURVEY
Sr. No. Title
1. Machine Learning Based Customer Sentiment Analysis for RecommendingShoppers, ShopsbasedonCustomers Review
2. Experimental Investigation of Automated System for Twitter Sentiment Analysis to Predict the PublicEmotionusingMLA
3. A Literature Review on Application of Sentiment Analysis using Machine LearningTechniques
4. The Impact of Preprocessingstepsonthe Accuracy of Machine learning Algorithms in SentimentAnalysis
5. Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baseline
6. A Survey of Sentiment Analysis based on TransferLearning
Apr 2022
Methodology Conclusion
Thispaperismotivatedtowards applying Machine Learning Algorithm Hybrid Recommendation.
This paper introduces an automated Twitter sentiment analysissystemusingMLAsuch as SVM, Naïve Bayes Classifier, MLP,Clustering
The performance of this Hybrid Recommendation System was evaluatedusingthreemetricsnamely MAE,MSE,andMAPE.
Limitation
ISSN: 2395 0056
ISSN: 2395 0072

Inthispaper,thesentimentanalysis of Twitter data by using different MLAisdiscussed.
The regression model used to implement the recommending system is complex and requires expertise.
Thetestcasealgorithmusedinthis paperisgivingacontrastresultwith theexistingalgorithm.
7. TheEssentialofSentiment Analysis and Opinion Mininginsocialmedia
8. Comparison of Naïve BayesandSVMAlgorithm based on Sentiment Analysis Using Review Dataset
9. Survey of Sentiment Analysis Using Deep LearningTechniques
10. A Survey of Sentiment Analysistechniques
Commontechniquesofanalyzing sentiments such as Supervised Learning, Unsupervised Learning, and hybrid based approachisused.
Inthispaper,differentMLAsand featureextractiontechniquesare used. These algorithms are NB, MaxE,andSVM.
For unimodal features extraction,theprocedurebybc LSTMisfollowedandfortextual features extraction, CNN is followed.
This paper focuses on the algorithms and application of transfer learning in sentiment analysis
Thispapersummariesthetechniques for machine learning used in the analysis of emotions in the latest period.
We studied the impact of different preprocessingstepsontheaccuracy ofthreeMLAforsentimentanalysis.
Doesnotdetermineaccurateandless timefeatureextraction.
accuracy of Maximum Entropy (MaxE)remainsthesameevenafter applyingpreprocessingsteps.
Theuseful baselinefor multi model sentiment analysis and multimodal emotionrecognition.
Thisreviewmakesacomprehensive investigation and discussion on sentiment analysis in a different direction.
Does not include contextual dependency learning within the model.
Out of three transfer models Parametertransfermodelisnoteasy to converge, within the Instance transfer model application scope is laid low with the similarity that existsbetweentwodomains.Within the Feature representation transfer model application scope is more extensive.
It makes use of the supervised learning method and unsupervisedlearningmethod
This research paper contains supervised learning which is under the machine learning approach.
Deep learning methods such as CNN, RNN, LSTM are used in sentimentanalysis.
Featuresselectionandsentiment classification is done using sentimentclassification
Different methods using different approaches and algorithms and has beentestedinavariousdataset.
In this paper, we use machine learning algorithms of Naïve Bayes and support vector machines for sentiment classification of airline reviews.
A detailed Survey of various deep learning architecture used in sentimentanalysisatbothsentence levelandaspectlevelispresented.
Thissurveyconcludedthatthereisa lotofscopeinsentimentanalysisand itisfoundthatSVMandNaïveBayes algorithms are used for sentiment analysis.
Itischallengingtoidentifynegation inthesentence,sarcasm,andothers.
In this paper, the SVM algorithm is found to be more accurate than NaïveBayes.Whereasinsomecases exactlyoppositeisobserved.
manydeeplearningalgorithmswere studiesbutnoalgorithmisfoundto capture the deep and implicit relationinformation.
deeper analysis is required whenit comestosocialnetworkingsitesand context consideration is very important.
International Research Journal of Engineering and Technology (IRJET)

Volume: 09 Issue: 04 | Apr 2022 www.irjet.net
3. CONCLUSIONS
Theimpactofperformingdatatransformationsondifferent classificationalgorithmsisdiscussedinthiswork,although thetypeoftransformationdependsonthedatasetandthe languageitcontains.
Determining which categorization methods will yield the greatest results is indeed tough. Different strategies have been evaluated in various datasets using various methodologiesandalgorithms.TheaccuracyofNaïveBayes and Support Vector Machines (SVM) was found to be comparable,whereasConvolutionalNeuralNetwork(CNN) wasshowntobelessaccuratefortextualsentimentanalysis.
REFERENCES
[1] ShanshanYi XiaofangLiu (2020) “Machine learning based customer sentiment analysis forrecommending shoppers,shopsbasedoncustomers’review”
[2] Priti Sharma, A.K. Sharm (2020) “Experimental Investigation of Automated System for Twitter Sentiment AnalysistoPredictthePublicEmotionusingMLA”
[3]AnvarShathikJ.,KrishnaPrasadK.(2020)“ALiterature ReviewonApplicationofSentimentAnalysisusingMachine LearningTechniques”
[4] Saqib Alam, Nianmin Yao (2018) “The Impact of Preprocessing steps on the Accuracy of Machine learning AlgorithmsinSentimentAnalysis”
[5]SoujanyaPoria,NavonilMujumdar,DevamanyuHazarika, Erik Cambria, Alexander Gelbukh, Amir Hussain (2018) “MultimodalSentimentAnalysis:AddressingKeyIssuesand SettinguptheBaseline”
[6]RuijunLiu,YuqianShi,ChangjiangJi,MingJia(2019)“A SurveyofSentimentAnalysisbasedonTransferLearning”
[7]Hong ChengSoong,NoraziraBintiAJalil,RameshKumar Ayyasamy(2019)“TheEssentialofSentimentAnalysisand OpinionMininginSocialMedia”
[8] Abdul Mohaimin Rahat, Abdul Kahir, Abu Kaisar MohammadMasum(2019)“ComparisonofNaïveBayesand SVMAlgorithmbasedonSentimentAnalysisUsingReview Dataset”
[9]IndhraomPrabhaM,G.UmaraniSrikanth(2020)“Survey ofSentimentAnalysisUsingDeepLearningTechniques”
[10]HarpreetKaur,VeenuMangat,Nidhi(2017)“ASurvey ofSentimentAnalysistechniques”
2022, IRJET
e ISSN: 2395 0056
p ISSN: 2395 0072
Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal
698