International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 07 Issue: 09 | Sep 2020
p-ISSN: 2395-0072
www.irjet.net
Event Classification and Retrieving User's Geographical Location based on Live Tweets on Twitter and Prioritizing them to Alert the Concern Authority Sarthak Vage1, Sarvesh Wanode2, Kunal Sorte3, Prof. Dipak Gaikar4 1-3BE,
Department of Computer Engineering, Rajiv Gandhi Institute of Technology, Mumbai, Maharashtra, India of Computer Engineering, Rajiv Gandhi Institute of Technology, Mumbai, Maharashtra, India ---------------------------------------------------------------------***---------------------------------------------------------------------4Department
Abstract - Social media has become a place where people
send advice during disasters. Twitter API and Tweepy library were used to get the tweets. But most of the data generated on Twitter consists of conversational tweets and tweets used for advertisements often termed as spam-bots, which usually hold no value for helping in emergency situation. Users also post hilarious posts for entertaining purposes. For an official to check for informational content among all these tweets would be a great waste of time. To save their time we categorized tweets into relevant and nonrelevant tweets where relevant ones hold key information and non-relevant containing useless conversational tweets. To further ease the process the tweets are classified according to their priority levels for ease of user to get key details.
meet, greet and share vast information and their personal views. It is a very useful source for procuring vital information in emergency situations. One such platform is Twitter which is a micro-blogging and social networking service where a large number of people from celebrities to common people express their views and opinions and share information. We could use this information to mitigate and extract useful information from messages in Twitter called as tweets. We developed a system takes input as tweets from hashtag provided by the user and categories them into relevant and non-relevant using Naïve Bayes algorithm. These relevant tweets contain data that is crucial for any emergency situation management. These classified tweets are further prioritized by XGBoost algorithm, which determines the tweets that are most important during such situations. In the final stage the user can click on the tweet by which they can see all the key details of the tweet such as user name, user location and other such information to take suitable action on the situation.
This system uses Naïve Bayes algorithm which is a stochastic model and belong to a family of simple probabilistic classifiers based on applying Bayes theorem. To classify according to the priority various models are tested among which XGBoost gave the most accuracy which is a decision tree-based Machine Learning algorithm that uses gradient boosting framework. An important aspect is geo -location of the tweet as it is important to know the location of the user in some situations. After categorizing the developed system all such vital information related of the user is provided to the concerned authority.
Key Words: Twitter, Hashtag, Naïve Bayes, XGBoost, NLTK, API, Geolocation, etc.
1. INTRODUCTION Social media provides a platform to discuss real-world events and express one’s opinions. Micro blogging systems such as Twitter provide such platform where people come together. By better understanding them we can more effectively filter information as the event unfolds. By using Machine Learning various techniques we can unearth key details which can help in better management of an emergency situation. This information can be used alert the concerned authorities to take suitable actions during distress situations.
2. RELATED WORK The Geographic Situational Awareness: Mining Tweets for Disaster Preparedness, Emergency Response, Impact, and Recovery [1] paper has classified tweets and location from the tweets where found. In the second paper tweets are classified into informational tweets and conversational tweets and then informational tweets are used for the disaster response. In this paper they worked on a coding schema for separating social media messages into different themes within different disaster stages such as mitigation, preparedness, emergency response and recovery. They have manually examined 10,000 tweets generated during natural disaster phases. They used the Hurricane sandy which struck in northwest US as their training dataset and only used geotagged tweets, from these tweets they filtered out the tweets related to storm, hurricane and where categorized into different themes as above. They manually annotated subset of
The rise of mobile internet in past two years has increased the social activity of people to a large context due to which Twitter has seen a large number of new users from urban as well as rural area. People with political importance also have accounts and post to increase popularity and dominance by expressing their views. Almost all the important officials have their account on Twitter to help people and reach to them socially. Twitter act not only as an awareness platform, but also a place where people can ask for help and
© 2020, IRJET
|
Impact Factor value: 7.529
|
ISO 9001:2008 Certified Journal
|
Page 1051