
International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol. 3, Issue 2, Jun 2013, 27-36 © TJPRC Pvt. Ltd.

ANALYSIS OF DIFFERENT CLUSTERING TECHNIQUES FOR DETECTING HUMAN EMOTIONS VARIATION THROUGH DATA MINING

JASKARAN KAUR1 & SHEVETA VASHISH2

1 Lovely Professional University, Punjab, India
2 Assistant Professor, Lovely Professional University, Punjab, India

ABSTRACT

This paper explores the detection of human emotion variation through data mining and analyzes different clustering techniques that can be chosen as a first step toward the classification of human emotion. Human beings experience different mood swings. People can become moody and irritable at any time, depending on the situation. This moodiness is commonly attributed to sudden and fluctuating hormonal levels, environmental factors, family disturbance, and cognitive immaturity. In the proposed work, different clustering techniques are applied to the same dataset of human emotion, so that different clusters are produced from that same dataset. Clustering is often one of the first steps in data mining analysis. It identifies groups of related records that can be used as a starting point for exploring further relationships. In this way we can compare the clustering techniques applied to the EEG dataset and see which one gives better results for classification.

KEYWORDS: Clustering Algorithm, Human Emotion, Classification, Electroencephalography (EEG)

INTRODUCTION

Emotions

Emotion is often the driving force behind motivation, positive or negative. Terms such as affect, sentiment, feeling, mood, expressiveness and emotion are sometimes used interchangeably and at other times used to denote a specific affective state [5]. Emotion is a feeling that is private and subjective. Humans can report an extraordinary range of states which they can feel or experience. We can also say that emotion is energy in motion. Human emotions can be complicated, confusing, and sometimes even downright ugly. Most of us at times experience emotions that we would rather not have, emotions we might not even want to admit to having. That is okay. As human beings, we all experience a wide range of emotions [13]. We often do not choose the emotions we encounter, but we can choose what to do with them. As well as learning to handle less pleasant emotions like sadness, it is vitally important to understand and nurture our positive emotions, such as joy and authentic happiness. There are various ways to nurture the emotions that we want to have more of in our lives.

Data Mining

Data mining is a collection of techniques for the efficient, automated discovery of previously unknown, valid, novel, useful and understandable patterns in large databases. The patterns must be actionable so that they may be used in an enterprise's decision making. Data mining techniques may be used on smaller amounts of data, but the larger the data, the better the chance of finding something novel and interesting. In other words, we want to ensure that what has been discovered is indeed something interesting about the process underlying the data being analyzed, and not simply a result of random fluctuations [4]. Data mining is often a complex process and may require a variety of steps before useful results are obtained.



A typical data mining process is likely to include the following steps:

Requirements analysis: One cannot use data mining without a good idea of what kind of outcomes the enterprise is looking for, since the technique to be used and the data that is required are likely to be different for different goals. Furthermore, if the objectives have been clearly defined, it is easier to evaluate the result.

Data selection and collection: This step may include finding the best source database for the data that is required.

Cleaning and preparing data: By carrying out a survey, find out which emotions have a high probability.

Data mining exploration and validation: Once appropriate data has been collected and cleaned, it is possible to start data mining exploration. Data mining tools or models are constructed based on our needs.

Implementing, evaluating, and monitoring: Once a model has been selected and validated, the model can be implemented for use by the decision makers.

Clustering

Clustering is the process of examining a collection of "points" and grouping the points into "clusters" according to some distance measure. The goal is that points in the same cluster have a small distance from one another, while points in different clusters are at a large distance from one another. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. Cluster analysis means grouping a set of data objects into clusters. The aim of cluster analysis is to classify the objects into clusters in such a way that two objects of the same cluster are more similar to each other than to the objects of other clusters [4]. The objects can have various characteristics: it is possible to cluster animals, plants, text documents, economic data, etc. Two major approaches to clustering are hierarchical and partitioning methods. Clustering is unsupervised classification: there are no predefined classes.
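The goal stated above (small distances inside a cluster, large distances between clusters) can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the distance measure; the two-feature points and the group names `calm` and `excited` are invented for illustration and are not taken from the EEG dataset.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_intra_distance(cluster):
    """Average distance over all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)


def mean_inter_distance(cluster_a, cluster_b):
    """Average distance over all pairs drawn from two different clusters."""
    total = sum(euclidean(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


# Hypothetical two-feature points, invented for illustration only.
calm = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
excited = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

intra = mean_intra_distance(calm)
inter = mean_inter_distance(calm, excited)
print("mean distance inside 'calm':", intra)
print("mean distance between groups:", inter)
```

A grouping is a reasonable clustering under this criterion when the first value is clearly smaller than the second.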

LITERATURE REVIEW

Shenghua Bao, Shengliang Xu (September 2012) mine social emotions for document categorization, so that online users can select documents based on their emotional preferences. For this they propose a joint emotion-topic model that extends latent Dirichlet allocation with an intermediate layer for emotion modeling. The model provides a connection between online documents and user-generated social emotions. Through text mining, they mine the affective words and connect them with the relevant emotions. This model can uncover hidden topics that exhibit strong emotions. A problem may arise, however, when the same word has different meanings that convey different emotions. The methodology can be applied to songs and to emotion-aware recommendation of advertisements. We are further studying new techniques to detect emotion and their application areas [7].

Sivaraman Sriram, Xiaobu Yuan (2012) present an enhanced approach for classifying emotions using a customized decision tree algorithm. There are different ways to recognize emotion, such as textual conversation, facial recognition, and dynamic gesture recognition that captures human body movements. Emotion detection can also be done with the help of a decision tree or nearest neighbor algorithm, in which emotion-generated rules are used. An artificial neural network is also used for emotion detection. The mean and root mean square are computed for all values in the dataset, which contains all seven emotions. The approach can also be used in real-time situations such as data mining or a gene prediction system, but the paper implements it in an application that classifies videos according to their emotion [8].

Minho Kim et al. (2011) perform lyrics-based emotion classification. Songs feel emotionally different to listeners depending on their lyrical content, even when their melodies are similar. The proposed method for lyrics-based emotion classification is text-based and uses feature selection by partial syntactic analysis [11]. Classification of emotion requires the choice of an emotion model; existing research on music emotion uses the Thayer model and the Tellegen-Watson-Clark model. The Thayer model is efficient, with two pillars representing stress and energy, for classifying emotion polarity. The study examined emotions extracted by applying syntactic analysis rules and classified them on the basis of the lyrics.

Oscal T.-C. Chen et al. (2012) develop emotion-inspired age and gender recognition systems. The proposed systems are based on the arousal intensities of a speaker's emotions. First, the speech frames of a speaker's utterance are classified into two groups, those above and those below the mean of the arousal intensities of the speech frames. The speech frames with higher and lower arousal intensities are then used for age and gender recognition. Only three age groups, young, adult and senior, are identified. Mel-scale Frequency Cepstral Coefficients (MFCC) are often used to differentiate the age and gender of a human. Additionally, the classifier can be realized by k nearest neighbors. The result is emotion-inspired recognition systems that identify the age and gender of a speaker [12]. Four emotions, angry, happy, calm and sad, are analyzed first. The emotions of angry and happy usually have higher arousal than those of calm and sad. In the authors' simulations, gender recognition prefers calm, whereas age recognition prefers angry and happy. The authors therefore conclude that the proposed recognition systems can be widely applied to various voice interface applications.

PROPOSED WORK

The proposed paper is based on the classification of human emotions through data mining. Our brain stores different types of emotions, and by analyzing emotional behavior with different classification techniques it can be determined how emotions vary and how this variation occurs. Classification is the process of finding a model that describes and distinguishes data classes. The model is derived from the analysis of a set of training data. The classification rules we use are if-then rules obtained from decision trees, since a decision tree can easily be converted into classification rules. Before classification, however, we have to do clustering. In this proposed work, different clustering techniques are therefore applied to the same EEG (electroencephalography) dataset related to human emotion, and different clusters are obtained. From these it can be identified which clustering technique shows better results for identifying different human emotions and their variation over different parameters.

Figure 1: Overview of Classification [10]



As shown in this diagram, classification is performed on the dataset. Classification is the process of finding a model that describes and distinguishes data classes or concepts. We can also say that classification is the separation or ordering of objects into classes. After classification, decisions are made using if-then rules.
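The conversion of a decision tree into if-then rules can be sketched as follows. This is an illustrative sketch only: the tree, the feature names `arousal` and `valence`, and the 0.5 thresholds are hypothetical and are not derived from the EEG dataset. Every path from the root to a leaf becomes one rule.

```python
def tree_to_rules(node, conditions=()):
    """Turn every root-to-leaf path of a binary decision tree into one rule.

    A leaf is a class label (str). An internal node is a tuple
    (feature, threshold, low_branch, high_branch).
    """
    if isinstance(node, str):
        return [(list(conditions), node)]
    feature, threshold, low_branch, high_branch = node
    rules = []
    rules += tree_to_rules(low_branch, conditions + (f"{feature} < {threshold}",))
    rules += tree_to_rules(high_branch, conditions + (f"{feature} >= {threshold}",))
    return rules


# Hypothetical tree over two invented features; not learned from real data.
tree = ("arousal", 0.5,
        ("valence", 0.5, "sad", "calm"),
        ("valence", 0.5, "angry", "happy"))

rules = tree_to_rules(tree)
for conditions, emotion in rules:
    print("IF " + " AND ".join(conditions) + " THEN " + emotion)
```

Each printed line is one classification rule, so a tree with four leaves yields four rules.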

PROPOSED METHODOLOGY

Dataset

The dataset of human emotion is taken from EEG recordings, and different machine learning techniques are used. A range of approaches has been applied to detecting emotion automatically, including recording and analyzing a person's physiological responses, such as the electroencephalography (EEG) signal. An electroencephalogram is a test that measures and records the electrical activity of the brain. Special sensors are attached to the head and connected to a computer [15]. The computer records the brain's electrical activity and displays it on the screen. The dataset has different parameters for emotion, arising from family disturbance, environmental factors, exciting news and many more. By applying different clustering techniques to the same dataset, different types of clusters will be formed, and we can then identify which technique is better for identifying human emotion.

Figure 2: EEG Dataset of Human Emotion

The major clustering approaches are:

• Partitioning Algorithms: Construct various partitions and then evaluate them by some criterion. There are two types: k-means, in which each cluster is represented by the center of the cluster, and k-medoids or PAM (Partitioning Around Medoids), in which each cluster is represented by one of the objects in the cluster. K-means has some limitations: it is unable to handle noisy data and outliers, the number of clusters has to be specified in advance, and it cannot discover clusters of non-convex shapes. PAM works effectively for small data sets but does not scale well to large data sets.
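The k-means procedure described above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and initial centers supplied by the caller; the sample points are invented and do not come from the EEG dataset. The number of clusters is fixed in advance by the number of initial centers, which is the limitation noted above.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    centers = list(initial_centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(mean_point(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical two-feature points, invented for illustration only.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels, centers = k_means(points, initial_centers=[points[0], points[3]])
print(labels)
print(centers)
```

Because each center is a mean, a single far-away point can pull it off target, which is why k-medoids uses an actual object of the cluster as the representative instead.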



Figure 3: How Clusters Are Made Using K-Means

• Hierarchy Algorithms: Create a hierarchical decomposition of the set of data (or objects) using some criterion. Hierarchical clustering is of two types, agglomerative and divisive. Agglomerative algorithms take a bottom-up approach and build up clusters from single objects, whereas divisive algorithms take a top-down approach and break up a cluster containing all objects into smaller clusters.
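The bottom-up (agglomerative) approach can be sketched as follows. The sketch assumes single linkage (the distance between two clusters is the distance between their closest members) and stops at a requested number of clusters; both choices, and the sample points, are assumptions made for illustration rather than details given in this paper.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of members of two clusters."""
    return min(euclidean(p, q) for p in cluster_a for q in cluster_b)


def agglomerative(points, target_clusters):
    """Start with one cluster per point and merge the closest pair repeatedly."""
    clusters = [[p] for p in points]
    while len(clusters) > target_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical two-feature points, invented for illustration only.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
clusters = agglomerative(points, target_clusters=2)
print(clusters)
```

Recording the sequence of merges instead of stopping early would give the full hierarchy; a divisive algorithm would run in the opposite direction, starting from one cluster that holds every point.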

Figure 4: Hierarchy Algorithm Samples

• Density-Based: Based on connectivity and density functions. Most distance-based algorithms find only spherical clusters. Density-based methods were proposed in order to find clusters of arbitrary shape. The idea is to keep growing a cluster as long as the density around a point in the cluster is above a certain threshold. These methods find not only arbitrarily shaped clusters but also outliers and noise points. DBSCAN and OPTICS are examples of density-based clustering algorithms.
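The idea of growing a cluster while the local density stays above a threshold can be sketched in the style of DBSCAN. This is a simplified illustration, not a reference implementation: the parameter names `eps` (neighbourhood radius) and `min_pts` (density threshold), the label `-1` for noise, and the sample points are assumptions made for this sketch.

```python
import math

NOISE = -1  # label used in this sketch for points that belong to no cluster


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx], including itself."""
    return [j for j, q in enumerate(points) if euclidean(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it is in no dense region."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # dense enough: keep growing
    return labels


# Hypothetical points: two dense groups and one isolated outlier.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0),
          (5.0, 20.0)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike k-means, the number of clusters is not given in advance, and the isolated point is reported as noise rather than being forced into a cluster.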



Figure 5: DBSCAN Algorithm

• Grid-Based: Based on a multiple-level granularity structure. Grid-based algorithms divide the given data space into a finite number of cells that form a grid structure. This grid structure is used to perform all the clustering operations. The advantage is that the clustering can be processed in parallel. The quality of the clusters depends on the granularity of the grid. Such methods deal with non-numeric data more easily and are not affected by data ordering.
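A grid-based scheme can be sketched as follows. This is a deliberately simplified, single-level illustration for two-dimensional numeric points, not one of the published grid algorithms: the parameters `cell_size` (grid granularity) and `min_points` (how many points make a cell dense), the rule of merging adjacent dense cells, and the sample data are all assumptions of this sketch.

```python
from collections import defaultdict


def grid_cluster(points, cell_size, min_points):
    """Bin 2-D points into square cells and merge adjacent dense cells."""
    # Map each point to the integer coordinates of the cell that contains it.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(p)

    # A cell is dense when it holds at least min_points points.
    dense = {k for k, members in cells.items() if len(members) >= min_points}

    clusters = []
    seen = set()
    for start in sorted(dense):
        if start in seen:
            continue
        # Flood fill over dense cells that touch each other (8-neighbourhood).
        stack, group = [start], []
        seen.add(start)
        while stack:
            cx, cy = stack.pop()
            group.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        clusters.append(group)
    return clusters


# Hypothetical points, invented for illustration only.
points = [(0.5, 0.5), (0.6, 0.7), (1.2, 0.4), (1.4, 0.8),
          (7.1, 7.2), (7.5, 7.9), (4.0, 4.0)]
clusters = grid_cluster(points, cell_size=1.0, min_points=2)
print([len(c) for c in clusters])
```

All work after the binning step is done on cells rather than on individual points, and changing `cell_size` changes the result, which reflects the dependence on grid granularity noted above.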

Figure 6: Clusters Showing Different Emotion Variation of Human Beings in the EEG Dataset

• Model-Based: A model is hypothesized for each of the clusters, and the idea is to find the best fit of the data to that model. The method is based on a probability distribution [6]. The algorithm tries to build clusters with a high level of similarity within them and a low level of similarity between them. Model-based methods follow two approaches: a statistical approach and a neural network based approach [4].

Kohonen Neural Network

A neural network with so-called unsupervised learning can be used for cluster analysis. It is based on evaluating the difference (distance) between the weight vector w of each neuron and the vector of the input pattern x, and on searching for the neuron whose weight vector w has the minimum distance from x. This neuron, the winner among the neurons of the network, has the right to adjust its weights and the weights of the neurons in its surroundings, and thus to bring its response to the submitted learning pattern to a better value.

The weight vector of the j-th neuron is

$w_j = (w_{j1}, w_{j2}, w_{j3}, \ldots, w_{jn})^T$

The input vector of the p-th input pattern is

$x_p = (x_{1p}, x_{2p}, x_{3p}, \ldots, x_{np})^T$
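One learning step of a Kohonen network can be sketched as follows. The sketch assumes a one-dimensional line of neurons, Euclidean distance between w_j and x_p, a fixed learning rate, and a neighbourhood defined by index distance from the winner; these choices and the numbers used are illustrative assumptions, not the configuration used in this paper.

```python
import math


def euclidean(w, x):
    """Distance between a weight vector w_j and an input vector x_p."""
    return math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))


def find_winner(weights, x):
    """Index j of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: euclidean(weights[j], x))


def train_step(weights, x, learning_rate, radius):
    """Move the winner and its neighbours (within radius) toward x, in place."""
    winner = find_winner(weights, x)
    for j in range(len(weights)):
        if abs(j - winner) <= radius:
            weights[j] = [w + learning_rate * (xi - w)
                          for w, xi in zip(weights[j], x)]
    return winner


# Three neurons with two weights each; values are invented for illustration.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
x = [0.9, 0.8]

distance_before = euclidean(weights[2], x)
winner = train_step(weights, x, learning_rate=0.5, radius=1)
distance_after = euclidean(weights[winner], x)
print(winner, distance_before, distance_after)
```

After training over many input patterns, each pattern is assigned to the cluster of its winning neuron. In practice the learning rate and the neighbourhood radius are usually reduced as training proceeds.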

Figure 7: Sample Schema for Kohonen Neural Network

Figure 8: Cluster Analysis with the Use of Kohonen Neural Network

The Fuzzy Clustering and Data Analysis Toolbox is a collection of MATLAB functions. The toolbox provides five categories of functions, one of which is clustering algorithms [1]. These functions group the given data set into clusters by different approaches: the functions Kmeans and Kmedoid are hard partitioning methods, while FCMclust, GKclust and GGclust are fuzzy partitioning methods with different distance norms.
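The difference between hard and fuzzy partitioning can be illustrated with a small fuzzy c-means sketch. It is written in Python for illustration and is not the toolbox's FCMclust function or its interface; the fuzzifier m = 2, the Euclidean norm, the fixed number of iterations, the initial centers, and the sample points are assumptions of this sketch.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Return (memberships, centers); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive a higher degree.
        memberships = []
        for p in points:
            dists = [max(euclidean(p, c), 1e-12) for c in centers]
            row = []
            for d_j in dists:
                denom = sum((d_j / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                row.append(1.0 / denom)
            memberships.append(row)
        # Center update: mean of all points weighted by membership ** m.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            centers[j] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ]
    return memberships, centers


# Hypothetical two-feature points, invented for illustration only.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
memberships, centers = fuzzy_c_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
for p, row in zip(points, memberships):
    print(p, [round(u, 3) for u in row])
```

A hard partitioning method would report only the cluster with the largest membership for each point, whereas the fuzzy result keeps the degree of belonging to every cluster.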



Classification

Data mining has generated renewed interest in classification. Since the datasets in data mining are often large, new classification techniques have been developed to deal with millions of objects having perhaps dozens or even hundreds of attributes. Classification is the process of finding a model that describes and distinguishes data classes or concepts. The model is derived from the analysis of a set of training data. The model is used to predict the class label of objects for which the class label is unknown [14]. We can also say that classification is the separation or ordering of objects into classes. If the classes are created without looking at the data, the classification is called a priori classification. If, however, the classes are created empirically, the classification is called a posteriori classification. Most of the literature on classification assumes that the classes have been defined a priori; classification then consists of training the system so that when a new object is presented to the trained system, it is able to assign the object to one of the existing classes. This approach is also called supervised learning. Classification has many applications, for example the prediction of customer behavior, such as predicting direct mail responses, identifying telecom customers who might switch companies, and identifying fraud. The number of cases classified correctly provides an estimate of the accuracy of the model. Our aim is to find highly accurate models that are easy to understand and efficient when dealing with large datasets. There are a number of classification methods, such as decision tree and naïve Bayes techniques.
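The accuracy estimate mentioned above (the share of cases classified correctly) can be computed as in the following sketch. The true and predicted labels are invented for illustration and are not results from this study; in practice the predictions would come from a trained classifier applied to objects that were not used for training.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of cases whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true class, predicted class) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


# Hypothetical labels, invented for illustration only.
true_labels = ["happy", "sad", "calm", "angry", "happy", "sad", "calm", "angry"]
predicted_labels = ["happy", "sad", "calm", "sad", "happy", "sad", "angry", "angry"]

acc = accuracy(true_labels, predicted_labels)
errors = {pair: n
          for pair, n in confusion_counts(true_labels, predicted_labels).items()
          if pair[0] != pair[1]}
print("accuracy:", acc)
print("misclassified (true, predicted):", errors)
```

The pair counts show which classes are confused with each other, which a single accuracy figure does not reveal.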

CONCLUSIONS

Clustering lies at the heart of data analysis and data mining applications. The ability to discover highly correlated regions of objects when their number becomes very large is highly desirable, as data sets grow and their properties and data interrelationships change. At the same time, it is notable that any clustering "is a division of the objects into groups based on a set of rules", and that it is neither true nor false. Emotional regulation and mood may play a pivotal role in academic success. Emotional change causes relational conflicts, and detecting emotional variation can help to control negative emotions. Classification of human emotions using different machine learning techniques is one of the notable research areas today. This paper explored the emotional variation of human beings by applying different clustering techniques to the same EEG dataset, obtaining different clusters, analyzing the clustering results and choosing the best technique. Outlier analysis is used to identify emotion variation in humans having any kind of disability.

REFERENCES

1. http://www.mathworks.in/matlabcentral/fileexchange/7486

2. http://www.stat.columbia.edu/~madigan/W2025/notes/clustering.pdf

3. http://www.ima.umn.edu/~iwen/REU/REU_cluster.html

4. G.K. Gupta. Introduction to Data Mining with Case Studies.

5. http://library.thinkquest.org/26618/en-1.4.1=What%20are%20emotions.htm

6. http://dsp.vscht.cz/konference_matlab/MATLAB08/prispevky/025_dostal.pdf

7. Shenghua Bao, Shengliang Xu (September 2012). Mining Social Emotions from Affective Text, 2012 IEEE.

8. Sivaraman Sriram, Xiaobu Yuan (2012). An Enhanced Approach for Classifying Emotions Using Customized Decision Tree Algorithm, 2012 IEEE.

9. Minho Kim, Hyuk-Chul Kwon (2011). Lyrics-based Emotion Classification using Feature Selection by Partial Syntactic Analysis, 2011 IEEE.

10. Jaskaran Kaur, Sheveta Vashisht (2012). Analysis and Identifying Variation in Human Emotion Through Data Mining, 2012 IJCTA.

11. Minho Kim, Hyuk-Chul Kwon (2011). Lyrics-based Emotion Classification using Feature Selection by Partial Syntactic Analysis, 2011 IEEE.

12. Oscal T.-C. Chen, Jhen Jhan Gu, Ping-Tsung Lu and Jia-You Ke (2012). Emotion-Inspired Age and Gender Recognition Systems, 2012 IEEE.

13. http://www.choosing-life-my-way.com/human-emotions.html

14. Jiawei Han, Micheline Kamber, Jian Pei. Data Mining: Concepts and Techniques, third edition.

15. http://www.webmd.com/epilepsy/electroencephalogram-eeg-21508


