IRJET- Depression Detection with Sentiment Analysis of Tweets

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 06 Issue: 05 | May 2019

p-ISSN: 2395-0072

www.irjet.net

DEPRESSION DETECTION WITH SENTIMENT ANALYSIS OF TWEETS Hemanthkumar M1, Latha A2 1Student,

Department of Computer Science and Engineering, Sapthagiri College of Engineering, Bengaluru Department of Computer Science and Engineering, Sapthagiri College of Engineering, Bengaluru ---------------------------------------------------------------------***--------------------------------------------------------------------2Professor,

Abstract - Nowadays risk of early death is increasing due to

positive or negative, based on a curated word-list to detect depression tendencies. Numerous machine learning algorithms are available for the classification of the tweets, some of them are Naïve Bayes, Support vector machine algorithms, Random forest algorithms etc. But the documented experiments show better result for Naïve Bayes and support vector machine algorithms, the study is limited to these algorithms [1]. The performance of the algorithms can be measured over different metrics like accuracy, precision, recall and confusion matrix.

mental illness which is mostly caused due to depression. Depression creates suicidal thoughts causing serious impairments in daily life. Sentiment analysis is a hot topic that’s been on research for decades, which intends to find the nature of text and classifies into positive, negative and neutral. In today’s digital world lot of data is available for sentiment analysis of both image and text. The aim of this paper is to apply natural language processing on Twitter feeds for conducting emotion analysis focusing on depression. Based on a curated word list, the tweets are classified as positive or negative. Naive-Bayes and support vector machine algorithms are used for the classification of the text data obtained from twitter feed. The results are presented using the classiﬁcation metrics like accuracy, precision, recall and confusion matrix.

The Determination of the sentiment of an entire document is called as coarse level and ﬁne level deals with attribute level sentiment analysis [2]. Since Twitter provides text that is at the sentence level, this study will be between these two levels of determination. In this paper we mostly deal with Tweets which are short message which are bound by a 140character limit [3]. In this concise format, users express their emotions and feelings about ongoing happenings in their life and the world around them.

Key Words: Sentiment analysis, Naïve-Bayes, SVM, Lemmatization, confusion matrix.

1. INTRODUCTION

2. BACKGROUND

Modern society has made human life so busy, making it vulnerable to mental disorders like depression, anxiety etc. Depression can impair many facets of human life. The detection and prevention of depression is so difficult and has been the hottest topic of research since the past decade. The stresses, work schedules and many other things affects the normal life and can cause depression. Depression is regarded as the mental cancer by the researchers. Some of the feelings that can arise due to depression are loss of interest in activities, Suicidal thoughts, feeling of worthlessness or hopelessness and Worsened ability to think and concentrate.

A. Dataset

The dataset is comprised of 43,000 tweets, collected from kaggle website. A ratio of 70:30 has been adopted for splitting the data collected into training and test dataset. The dataset consists of 2 columns namely sentiment and text. The text part contains the tweets and the sentiment part contains 0 or 1 where 0 represents negative meaning the tweet is linked to depression and 1 represents positive.

B. Naive Bayes Classiﬁer

Naive Bayes Classifier implements Bayes theorem with a solid independence assumption [4], particularly, independent feature model. Bayes Theorem works on conditional probability which finds out the probability of an event given that some other event has already occurred. It predicts the conditional probability of a class given the set of evidences and finds the most likely class based on the highest one. A naive Bayes classifier is a famous and popular technique because it is very fast approach and gives a high accuracy [5]. The equation of the conditional probability is defined as follows

Sentiment analysis is a hot topic that’s been on research for decades, which intends to find the nature of text and classifies into positive, negative and neutral. Today’s advanced technologies provide lot of tools for efficient retrieval, analysis and classification of text data into different classes. In today’s digital world lot of data is available for sentiment analysis of both image and text through social media networks like Facebook, twitter etc., but using data from some social media networks raises concern of privacy, thus, twitter is the most suitable social media network for getting enough data and also to protect us from privacy laws. This proposition aims to apply natural language processing on Twitter feeds for conducting emotion analysis focusing on depression. Individual tweets are classiﬁed as

|

Impact Factor value: 7.211

Where,

|

ISO 9001:2008 Certified Journal

|

Page 1197

Turn static files into dynamic content formats.

Create a flipbook