Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC

Introducing New Hybrid Rough Fuzzy Association Rule Mining Algorithm

Aritra Roy1 and Rajdeep Chatterjee2

1 KIIT University, Bhubaneswar, India. Email: royaritra1990@gmail.com
2 KIIT University, Bhubaneswar, India. Email: cse.rajdeep@gmail.com

Abstract: Association rules reveal interesting associations among data items: an association rule states how one data item is related to, or associated with, another. The procedure by which these rules are extracted and managed is known as association rule mining. Classical association rule mining has many limitations, which led to fuzzy association rule mining (Fuzzy ARM). Fuzzy ARM has limitations of its own, such as redundant rule generation and inefficiency on large mining tasks. Rough association rule mining (Rough ARM) followed and appeared to be a good alternative to Fuzzy ARM in terms of performance. Mining tasks keep growing, however, so performing them efficiently and accurately over a large dataset remains a major challenge. In this paper we present a new hybrid mining method that incorporates concepts from both rough set theory and fuzzy set theory for association rule generation.

Index Terms: Association rule mining, Fuzzy ARM, Fuzzy C means clustering, Rough set theory, Rough ARM, Attribute reduction, Apriori algorithm

I. INTRODUCTION

Knowledge discovery in databases (KDD) is a large procedure consisting of several sub-processes, and data mining is one of them [1]. Data mining [2] discovers useful information and interesting patterns through the logical analysis of a database. This derived information is very useful in intelligent systems such as expert systems [3]. An expert system uses heuristic mechanisms and knowledge to produce expert suggestions for decision making. This need gave rise to association rule mining, which reveals interesting association relationships among data items [2].

A. Data Mining and Association Rule Mining

A large amount of data is present in the physical world, and useful knowledge is hidden within it. Extracting knowledge from these datasets is crucial because knowledge supports decision making. Data mining is an important tool for discovering this knowledge, which can also be described as interesting patterns [2]. The whole process of producing knowledge from raw data is called knowledge discovery in databases (KDD), and data mining is just one part of it. Association rules provide interesting correlations among data items, from which useful knowledge can be derived. Association rule mining is thus one aspect of data mining.

DOI: 02.ITC.2014.5.101 © Association of Computer Electronics and Electrical Engineers, 2014

B. Fuzzy ARM Concepts

Classical, or crisp, association rule mining (Crisp ARM) has many problems because it uses sharp partitioning of the dataset to convert numerical attributes into boolean attributes, which results in loss of information. In Crisp ARM the user also has to define the minimum support value, and a wrong setting of this value produces wrong association rules. Fuzzy association rule mining was introduced to address these issues. The Fuzzy ARM process incorporates the concepts of fuzzy set theory [4] for association rule generation. A large number of Fuzzy ARM algorithms already exist in the literature, and some of them are interesting in terms of their mining strategy. A variety of fuzzy ARM techniques is surveyed in [20]. In [5], the fuzzy c-means clustering technique [6], [7] is used to preprocess the dataset. In [8], the proposed algorithm generates the minimum support value of each data item automatically. In [9], a pruning mechanism deletes redundant rules using the concept of "Certainty Factor" [10], [11]. In [12], another pruning mechanism removes redundant rules using the "Equivalence concept". However, the performance of these algorithms did not meet expectations. Research efforts showed that Fuzzy ARM algorithms are not efficient on huge datasets and that redundant rules may still be generated. Rough association rule mining emerged as an alternative to Fuzzy ARM.

C. Rough ARM Concepts

Rough association rule mining incorporates the concepts of rough set theory [13]. A significant amount of research has already been done in the area of Rough ARM, and some of it is interesting in terms of the mining strategy. In [14], the rough set approach to association rule mining is shown to be a much easier technique than the maximal association method [15].
Here, the rules generated by both methods are similar. In [16], the rough set attribute reduction technique is used to reduce the size of a large dataset. In [17], the equivalence class concept is used for the mining task. Rough ARM appeared to be better than Fuzzy ARM, but the challenge of performing the mining task efficiently on a large dataset still remains. This paper presents a hybrid mining method that uses the concepts of both rough set theory [13] and fuzzy set theory [4].

II. THEORETICAL ASPECTS

A. Rough Set Concepts

Most of the information available in the physical world is uncertain, imprecise, and incomplete. Performing a mining task on such data can produce incomplete knowledge, so a way of handling imperfect knowledge is needed. Rough sets [18] provide one.

Let $U$ be a finite set of objects and let a binary relation $R \subseteq U \times U$ be given. The set $U$ is called the universe and $R$ is an indiscernibility relation, which describes our lack of knowledge about $U$. $R$ can also be represented as an equivalence relation. The pair $(U, R)$ is termed an approximation space. Let $X$ be a subset of $U$ ($X \subseteq U$). The main objective is to represent the set $X$ with respect to $R$. $R(x)$ denotes the equivalence class of $R$ determined by the element $x$. The equivalence classes of the indiscernibility relation $R$ are called granules. These granules are the fundamental parts of knowledge, and they are what $R$ lets us observe; individual objects of $U$ cannot be observed through the indiscernibility relation. A simplified view of the rough set approximations is given in [18].

The set of all objects which can be classified with certainty as members of $X$ with respect to $R$ is called the $R$-lower approximation of the set $X$ with respect to $R$, denoted by $R_*(X)$:

$$R_*(X) = \{x : R(x) \subseteq X\} \tag{1}$$

The set of all objects which can only be classified as possible members of $X$ with respect to $R$ is called the $R$-upper approximation of the set $X$ with respect to $R$, denoted by $R^*(X)$:

$$R^*(X) = \{x : R(x) \cap X \neq \emptyset\} \tag{2}$$

The set of all objects which can be definitively classified neither as members of $X$ nor as members of $-X$ with respect to $R$ is called the boundary region of the set $X$ with respect to $R$, denoted by $RN_R(X)$:

$$RN_R(X) = R^*(X) - R_*(X) \tag{3}$$

From these relations it follows that the set $X$ is a crisp set with respect to $R$ if and only if the boundary region of $X$ is empty.


The set $X$ is a rough set with respect to $R$ if and only if the boundary region of $X$ is not empty. The accuracy of approximation of a rough set $X$ can be expressed numerically as

$$\alpha_R(X) = \frac{|R_*(X)|}{|R^*(X)|} \tag{4}$$

Here $|X|$ denotes the cardinality of $X \neq \emptyset$. The value of $\alpha_R(X)$ lies in the closed interval $[0, 1]$, that is, $0 \leq \alpha_R(X) \leq 1$. If $\alpha_R(X) = 1$, then $X$ is a crisp set with respect to $R$, meaning that $X$ is definable in $U$; if $\alpha_R(X) < 1$, then $X$ is a rough set with respect to $R$, meaning that $X$ is indefinable in $U$. Fig. 1 shows the approximations of rough set theory.

Fig 1: Graphical illustration of the Rough set approximations
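The approximations in equations (1) to (4) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the universe of six objects, the `colour` attribute that induces the indiscernibility relation, and the function names are all hypothetical and do not come from the paper. Returning an accuracy of 1.0 for an empty upper approximation is likewise an assumption, since the text only considers non-empty sets.

```python
from collections import defaultdict


def equivalence_classes(universe, key):
    """Partition `universe` into granules R(x): objects sharing the same key(x)."""
    groups = defaultdict(set)
    for x in universe:
        groups[key(x)].add(x)
    return list(groups.values())


def approximations(universe, key, target):
    """Return (lower, upper, boundary, accuracy) of `target` with respect to R."""
    lower, upper = set(), set()
    for granule in equivalence_classes(universe, key):
        if granule <= target:      # R(x) is a subset of X: certain members
            lower |= granule
        if granule & target:       # R(x) intersects X: possible members
            upper |= granule
    boundary = upper - lower
    # Assumption: an empty upper approximation is reported as accuracy 1.0.
    accuracy = len(lower) / len(upper) if upper else 1.0
    return lower, upper, boundary, accuracy


# Hypothetical universe: objects 1..6 are indiscernible when they share a colour.
colour = {1: "red", 2: "red", 3: "blue", 4: "blue", 5: "green", 6: "green"}
U = set(colour)
X = {1, 2, 3}

lower, upper, boundary, accuracy = approximations(U, colour.get, X)
print(lower, upper, boundary, accuracy)
```

In this toy case object 3 belongs to $X$ but its granule {3, 4} does not lie inside $X$, so the granule falls into the boundary region and $X$ is rough with respect to the colour relation.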

Instead of set approximations, rough sets can also be described by a rough membership function. The rough membership function represents the degree of relative overlap between the set $X$ and the equivalence class $R(x)$ to which $x$ belongs. It is defined as $\mu_X^R : U \to \langle 0, 1 \rangle$, where

$$\mu_X^R(x) = \frac{|X \cap R(x)|}{|R(x)|} \tag{5}$$

Here $|X|$ denotes the cardinality of $X$.

B. Information System

An introduction to information systems is also given in [18]. A dataset is represented as a table in which each row is an object and each column is an attribute that can be measured for each object or is simply provided by the user. Such a table is called an information system. Let $U$ be the universe, a finite set of objects, and let $A$ be a non-empty finite set of attributes. A dataset $S$ is then represented as an information system $S = (U, A)$.

C. Decision System

A decision system is similar to an information system, with one difference. In an information system $A$ is a non-empty finite set of attributes, whereas a decision system also contains decision attributes. A decision attribute is a distinguished attribute through which knowledge can be expressed, and its values help in evaluating an object. Information systems that contain decision attributes are therefore called decision systems. A good example is given in [16]: a decision system $S = \langle U, A, V, f \rangle$, where $U = \{x_1, x_2, \ldots, x_n\}$ is a finite collection of objects or samples, $A = C \cup D$ is a finite set of attributes, $C = \{c_1, c_2, \ldots, c_m\}$ is the condition attribute set, and $D = \{d_1, d_2, \ldots, d_l\}$ is the decision attribute set, with $C \cap D = \emptyset$. $f(x, c)$ is the value of $x$ on attribute $c$, and $V$ is the value range of the attribute set $A$.

D. Reduct

Dimensionality reduction is very important for large datasets. Here, dimensionality reduction means attribute reduction, which is done by generating reducts [18], a rough set concept. The key idea behind a reduct is to keep only those attributes that preserve the indiscernibility relation and, consequently, the set approximations. Reducts can therefore be described as minimal subsets of attributes. A subset $B'$ of $B$ is a reduct of $B$ if $B'$ is independent and $IND(B') = IND(B)$ [18].

E. Fuzzy c-means clustering

Clustering is very useful for finding patterns in a dataset. The fuzzy c-means (FCM) clustering method [6] is an extension of the k-means algorithm. In k-means a data item can be a member of only one cluster, whereas in fuzzy c-means a data item can be a member of multiple clusters: each data item has a degree of membership in each cluster. The algorithm iteratively minimizes the objective function

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \|x_i - c_j\|^2 \tag{6}$$

where $N$ is the number of data items, $C$ is the number of clusters, $u_{ij}$ is the degree of membership of $x_i$ in cluster $j$, $x_i$ is the $i$-th $d$-dimensional measured data item, $c_j$ is the $d$-dimensional center of cluster $j$, and $\|\cdot\|$ is any norm expressing the similarity between a measured data item and a center. The fuzziness parameter $m$ is an arbitrary real number with $1 < m < \infty$. Clusters in fuzzy c-means can overlap. The user has to define the number of clusters and the minimum support value. The fuzzy c-means clustering algorithm is also not deterministic.

III. PROPOSED ALGORITHM

In the real world most datasets are large, and in many cases they are overlapping in nature. Datasets often contain superfluous data; in other words, some attributes may not play a significant role in association rule mining. Rough set theory can be used to analyse the significance of attributes in the mining process, and less significant attributes can be dropped so that the dimensionality is reduced. To speed up the rule generation process further, we divide the dataset into multiple clusters based on their commonality. Fuzzy c-means allows partial membership of patterns in clusters; in simple words, the FCM algorithm is used to represent overlapping clusters.
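The fuzzy c-means iteration of Section II.E can be sketched in a few lines of plain Python. This is a minimal sketch, not the authors' implementation: the paper states only the objective function, so the alternating centre and membership updates used here are the standard ones from the FCM literature, and the function name `fcm`, its parameters, the seeded random initialisation, and the one-dimensional sample points are all assumptions made for illustration.

```python
import random


def fcm(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means on a list of equal-length tuples; returns (centers, u)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        centers = []
        for j in range(n_clusters):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        new_u = []
        for p in points:
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_u.append([
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for dj in dists
            ])
        change = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical one-dimensional data with two well-separated groups.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, memberships = fcm(points, n_clusters=2)
print(sorted(c[0] for c in centers))
print([[round(v, 3) for v in row] for row in memberships])
```

Each membership row sums to 1, which is what lets one data item belong partially to several clusters. The result depends on the initial memberships, so a different `seed` may label the clusters in a different order.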
The proposed algorithm is a hybrid algorithm that incorporates the concepts of both rough set theory and fuzzy set theory at different stages to produce association rules.

A. Algorithm Description

The proposed algorithm works as follows. First, the algorithm reduces the attributes using rough sets [13] by calculating the reducts [18]. Attributes that do not belong to a reduct are unnecessary and can therefore be dropped. As the number of attributes is reduced, the dataset is reduced as well. Next, the algorithm converts the crisp dataset into a fuzzy dataset and applies the fuzzy c-means clustering technique [6] to it to create fuzzy clusters. The algorithm then applies the classical apriori algorithm [19] to each cluster to generate a set of association rules per cluster. Finally, the algorithm aggregates the generated association rules. The algorithm works in four basic steps:

Step 1: Given a dataset $S = (U, A)$, a decision system where $U$ is the finite set of objects and $A$ is the attribute set, calculate the $B$-indiscernibility relation

$$IND_S(B) = \{(x, x') \in U^2 \mid \forall a \in B,\ a(x) = a(x')\}$$

where $B \subseteq A$. Calculate the $B$-lower approximation

$$\underline{B}(X) = \{x \mid [x]_B \subseteq X\}$$

where $B \subseteq A$ and $X \subseteq U$. Calculate the positive region of the partition $U/D$ with respect to $B$,

$$POS_B(D) = \bigcup_{X \in U/D} \underline{B}(X)$$

where $D$ and $B$ are subsets of $A$. For all possible subsets $B$ of $A$, calculate

$$\gamma(B, D) = \frac{|POS_B(D)|}{|U|}$$

$D$ depends on $B$ with degree $k = \gamma(B, D)$. Obtain $RED(B)$, the set of all reducts of $B$, where $RED(B) \subseteq A$. The dataset after attribute reduction is $S' = (U, A')$ with $A' = RED(B) \cup D$, where $A'$ is the reduced attribute set.

Step 2: Convert the crisp dataset into a fuzzy dataset $F$ using a fuzzy membership function (MF). Apply the FCM clustering algorithm to $F$. Obtain the set of clusters $C = \{c_1, c_2, \ldots, c_j\}$, where $j$ is the number of clusters.

Step 3: Apply the classical apriori algorithm to each cluster $c_i \in C$, where $i = 1, 2, 3, \ldots, j$. Obtain the set of association rules $AR_i$ from $c_i$, where rule $r_{iq} \in AR_i$. Here $i = 1, 2, \ldots, j$ and $q = 1, 2, \ldots, n$, where $n$ is the number of rules in each cluster and $0 < q \leq n$.

Step 4: Initially $E = \emptyset$, where $E$ is the aggregated set of rules. If and only if $E \cap \{r_{iq}\} = \emptyset$, then $E = E \cup \{r_{iq}\}$, where $r_{iq} \in AR_i$. Finally obtain $E$, the set of association rules from $S$.

B. Pseudo Code

// Generate reducts from the given dataset
// Calculate for all possible subsets B of A
$POS_B(D) = \bigcup_{X \in U/D} \underline{B}(X)$
Obtain $A' = RED(B) \cup D$
Dataset after attribute reduction, $S' = (U, A')$
// Convert crisp dataset into fuzzy dataset F
// Apply FCM clustering algorithm on F
Obtain set of clusters C
// Apply classical apriori algorithm on each cluster
Obtain association rule set $AR_i$ from each cluster
// Rules aggregation
Initially aggregated set of rules, $E = \emptyset$
iff $E \cap \{r_{iq}\} = \emptyset$
then $E = E \cup \{r_{iq}\}$ where $r_{iq} \in AR_i$
Finally obtain E

C. Flowchart of the proposed algorithm

Fig. 2 is a visual representation of the working process of the proposed algorithm.

Dataset -> Attribute reduction using rough set -> Reduced dataset -> Conversion of crisp dataset into fuzzy dataset -> Fuzzy cluster creation using FCM algorithm -> Association rule set generation from each cluster using classical apriori algorithm -> Rules aggregation

Fig 2: Conceptual diagram of the proposed algorithm
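Steps 3 and 4 can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the encoding of a rule as a pair of frozensets, the thresholds, and the two toy clusters are all assumptions, and the level-wise search is a compact Apriori-style routine written for readability, not for speed.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support count}."""
    min_count = min_support * len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: merge frequent k-itemsets that differ in exactly one item.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, len(c) - 1))]
    return frequent


def rules_from(frequent, min_confidence):
    """Rules (antecedent, consequent) with confidence >= min_confidence."""
    rules = set()
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                if count / frequent[antecedent] >= min_confidence:
                    rules.add((antecedent, itemset - antecedent))
    return rules


def mine(clusters, min_support, min_confidence):
    """Steps 3 and 4: mine each cluster, then merge rules without duplicates."""
    aggregated = set()                       # E starts empty
    for transactions in clusters:
        found = rules_from(frequent_itemsets(transactions, min_support),
                           min_confidence)
        for rule in found:
            if rule not in aggregated:       # add a rule only if E lacks it
                aggregated.add(rule)
    return aggregated


# Hypothetical clusters of transactions; the rule {b} -> {a} arises in both.
cluster_1 = [{"a", "b"}, {"a", "b"}, {"a", "c"}]
cluster_2 = [{"a", "b", "d"}, {"a", "b", "d"}, {"a"}]
E = mine([cluster_1, cluster_2], min_support=0.6, min_confidence=0.8)
print(len(E))
```

Because `E` is a set of hashable rule pairs, the membership test mirrors the condition of Step 4: a rule that already occurs in `E` is not added a second time.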


D. Analysis

A simple example illustrates how the proposed algorithm works. Table I shows a simple dataset of animals and their features.

TABLE I: EXAMPLE DATASET

Animal     Hair  Teeth    Eye      Feather  Feet  Eat    Milk  Fly  Swim
Lion       Yes   Pointed  Forward  No       Claw  Meat   Yes   No   Yes
Dolphin    No    No       Sideway  No       No    Fish   No    No   Yes
Cow        Yes   Blunt    Sideway  No       Hoof  Grass  Yes   No   No
Tiger      Yes   Pointed  Forward  No       Claw  Meat   Yes   No   Yes
Cheetah    Yes   Pointed  Forward  No       Claw  Meat   Yes   No   Yes
Giraffe    Yes   Blunt    Sideway  No       Hoof  Grass  Yes   No   No
Zebra      Yes   Blunt    Sideway  No       Hoof  Grass  Yes   No   No
Ostrich    No    No       Sideway  Yes      Claw  Grain  No    No   No
Penguin    No    No       Sideway  Yes      Web   Fish   No    No   Yes
Albatross  No    No       Sideway  Yes      Claw  Grain  No    Yes  Yes
Eagle      No    No       Forward  Yes      Claw  Meat   No    Yes  No
Viper      No    Pointed  Forward  No       No    Meat   No    No   No

Attribute reduction is now performed by generating reducts. Consider the dependency between the attributes Teeth and Hair. With the objects numbered 1 to 12 in the order of Table I, the dependency of the attribute Hair on the attribute Teeth is computed as

$$k = \frac{|\{3, 6, 7\}| + |\{2, 8, 9, 10, 11\}|}{|U|} = \frac{3 + 5}{12} = \frac{2}{3} \approx 0.67$$

Here {3, 6, 7} are the animals with blunt teeth, all of which have hair, and {2, 8, 9, 10, 11} are the animals with no teeth, none of which has hair; the animals with pointed teeth include both kinds and so do not count. Hair is therefore partially dependent on Teeth. The dependency within other attribute pairs can be computed in the same way. We use a threshold value of 0.5 for the dependency between attribute pairs. Two other attribute pairs satisfy this condition, with dependency degrees 0.5 and 1. This leaves the attributes Hair, Teeth, Feet, and Milk, and the rest of the attributes can be dropped. Since the attributes are reduced, the whole dataset is reduced as well. Table II shows the reduced dataset.

TABLE II: REDUCED DATASET

Animal     Hair  Teeth    Feet  Milk
Lion       Yes   Pointed  Claw  Yes
Dolphin    No    No       No    No
Cow        Yes   Blunt    Hoof  Yes
Tiger      Yes   Pointed  Claw  Yes
Cheetah    Yes   Pointed  Claw  Yes
Giraffe    Yes   Blunt    Hoof  Yes
Zebra      Yes   Blunt    Hoof  Yes
Ostrich    No    No       Claw  No
Penguin    No    No       Web   No
Albatross  No    No       Claw  No
Eagle      No    No       Claw  No
Viper      No    Pointed  No    No
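The dependency degree of Hair on Teeth from this example can be reproduced with a short sketch. The Teeth and Hair columns are copied from Table I; the function names are hypothetical, and the positive region is computed under the usual rough set reading that a block of the condition partition counts only if it lies entirely inside one block of the decision partition.

```python
from collections import defaultdict

# Teeth and Hair columns of Table I, in table order (objects 1..12).
teeth = ["Pointed", "No", "Blunt", "Pointed", "Pointed", "Blunt",
         "Blunt", "No", "No", "No", "No", "Pointed"]
hair = ["Yes", "No", "Yes", "Yes", "Yes", "Yes",
        "Yes", "No", "No", "No", "No", "No"]


def partition(values):
    """Group 1-based object indices by attribute value."""
    blocks = defaultdict(set)
    for index, value in enumerate(values, start=1):
        blocks[value].add(index)
    return list(blocks.values())


def dependency_degree(condition, decision):
    """Return (positive region, |positive region| / |U|)."""
    decision_blocks = partition(decision)
    positive = set()
    for block in partition(condition):
        # Keep a condition block only if one decision block contains it fully.
        if any(block <= d for d in decision_blocks):
            positive |= block
    return positive, len(positive) / len(condition)


positive, k = dependency_degree(teeth, hair)
print(sorted(positive), round(k, 2))  # [2, 3, 6, 7, 8, 9, 10, 11] 0.67
```

The block of animals with pointed teeth, {1, 4, 5, 12}, is excluded because the Viper (object 12) has no hair while the other three do.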

After this, the crisp dataset is converted into a fuzzy dataset. The FCM algorithm is applied to the fuzzy dataset, and fuzzy clusters are formed using a Gaussian membership function. Fig. 3 shows the clusters formed by applying the FCM algorithm to the reduced dataset.

Fig 3: Clusters
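The text does not state how attribute values are encoded before fuzzification, so the following is only a sketch of what a Gaussian membership function looks like. The numeric encoding of an attribute in [0, 1], the two fuzzy sets "low" and "high", and their centres and widths are all assumptions chosen for illustration.

```python
import math


def gaussian_membership(x, centre, sigma):
    """Degree in (0, 1] to which x belongs to the fuzzy set centred at `centre`."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


# Hypothetical fuzzy sets on an attribute encoded in [0, 1]: name -> (centre, sigma)
fuzzy_sets = {"low": (0.0, 0.4), "high": (1.0, 0.4)}


def fuzzify(x):
    """Map a crisp value to its membership degree in every fuzzy set."""
    return {name: gaussian_membership(x, centre, sigma)
            for name, (centre, sigma) in fuzzy_sets.items()}


print(fuzzify(0.0))
print(fuzzify(0.5))
```

A crisp value thus becomes a vector of membership degrees, one per fuzzy set, which is the form of input that a clustering step such as FCM can work with.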

The first cluster contains the animals {Dolphin, Ostrich, Penguin, Albatross, Eagle, Viper} and the second cluster contains the animals {Lion, Cow, Tiger, Cheetah, Giraffe, Zebra}. Next, the classical apriori algorithm is applied to each cluster to generate a rule set per cluster. The minimum support was set to 0.5 and the minimum confidence to 80%. In the first cluster only one frequent item was found (Claw), with a support count of 3. Since there were no other frequent items, no association rule could be found. In the second cluster we found two frequent itemsets containing more than one item: {Hair, Pointed Teeth, Claw, Milk} and {Hair, Blunt Teeth, Hoof, Milk}. We then calculated the confidence values of the association rules generated from them. The generated association rules are as follows:

[List of generated association rules, eight in total, each of the form (antecedent items) → (consequent items); item names not recoverable]

All of these rules satisfy the minimum confidence threshold, with 100% confidence. Lastly, rule aggregation is performed and the final output $E$, the aggregated set of association rules, is produced. Initially $E$ is empty. At the end of the algorithm it is a non-empty set of aggregated association rules in which no rule is repeated.

IV. CONCLUSION AND FUTURE WORK

Fuzzy ARM has problems such as redundant rule generation, incomplete knowledge, and inefficiency on huge mining tasks. Rough ARM emerged as an alternative to fuzzy ARM and was found to perform better. However, as mining tasks grow larger, the need for better algorithms has emerged. This paper presents a hybrid mining algorithm that incorporates the key concepts of both rough set theory and fuzzy set theory to generate association rules more efficiently. A simple analysis of the proposed algorithm has been given here. We are now moving towards the simulation phase, in which we will test our algorithm on real-time datasets and compare it with existing state-of-the-art rough ARM and fuzzy ARM algorithms.

REFERENCES

[1] Frawley, William J.; Piatetsky-Shapiro, Gregory; Matheus, Christopher J.: Knowledge Discovery in Databases: An Overview. AAAI/MIT Press, 1992.
[2] J. Han and M. Kamber: Data Mining: Concepts and Techniques. The Morgan Kaufmann Series, 2001.
[3] Waterman, D. A.: A Guide to Expert Systems. Addison Wesley, Boston, MA, 1986.
[4] Zadeh, L. A.: Fuzzy sets. Inf. Control, 8, pp. 338-358, 1965.
[5] Ashish Mangalampalli, Vikram Pudi: Fuzzy Association Rule Mining Algorithm for Fast and Efficient Performance on Very Large Datasets. FUZZ-IEEE 2009, Korea, ISSN: 1098-7584, E-ISBN: 978-1-4244-3597-5, pp. 1163-1168, August 20-24, 2009.
[6] Bezdek, J. C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Norwell, MA, 1981.


[7] Hoppner, F., Klawonn, F., Kruse, R., Runkler, T.: Fuzzy Cluster Analysis, Methods for Classification, Data Analysis and Image Recognition. Wiley, New York, 1999.
[8] Ehsan Vejdani Mahmoudi, Vahid Aghighi, Masood Niazi Torshiz, Mehrdad Jalali, Mahdi Yaghoobi: Mining generalized fuzzy association rules via determining minimum supports. IEEE Iranian Conference on Electrical Engineering (ICEE) 2011, E-ISBN: 978-964-463-428-4, Print ISBN: 978-1-4577-0730-8, pp. 1-6, 2011.
[9] Toshihiko Watanabe: Fuzzy Association Rules Mining Algorithm Based on Output Specification and Redundancy of Rules. IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2011, ISSN: 1062-922X, Print ISBN: 978-1-4577-0652-3, pp. 283-289, 2011.
[10] M. Delgado, N. Marin, M. J. Martin-Bautista, D. Sanchez, and M.-A. Vila: "Mining Fuzzy Association Rules: An Overview," Studies in Fuzziness and Soft Computing, Springer, vol. 164/2005, pp. 351-373, 2006.
[11] M. Delgado, N. Marin, D. Sanchez, and M.-A. Vila: "Fuzzy Association Rules: General Model and Applications," IEEE Trans. on Fuzzy Systems, vol. 11, no. 2, pp. 214-225, 2003.
[12] Toshihiko Watanabe, Ryosuke Fujioka: Fuzzy Association Rules Mining Algorithm Based on Equivalence Redundancy of Items. IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2012, E-ISBN: 978-1-4673-1712-2, Print ISBN: 978-1-4673-1713-9, pp. 1960-1965, 2012.
[13] Pawlak, Z.: Rough Sets. International Journal of Computer and Information Sciences, pp. 341-356, 1982.
[14] Guan, J. W.; Bell, D. A.; Liu, D. Y.: "The Rough Set Approach to Association Rule Mining", Proceedings of the Third IEEE International Conference on Data Mining (ICDM'03), Print ISBN: 0-7695-1978-4, pp. 529-532, 2003.
[15] Feldman, R.; Aumann, Y.; Amir, A.; Zilberstein, A.; Kloesgen, W.; Ben-Yehuda, Y.: Maximal association rules: a new tool for mining for keyword co-occurrences in document collection. In Proceedings of the 3rd International Conference on Knowledge Discovery (KDD 1997), pp. 167-170, 1997.
[16] Chen Chu-xiang; Shen Jian-jing; Chen Bing; Shang Chang-xing; Wang Yun-cheng: "An Improvement Apriori Arithmetic based on Rough set Theory", Third Pacific-Asia Conference on Circuits, Communications and System (PACCS), Print ISBN: 978-1-4577-0855-8, pp. 1-3, 2011.
[17] Xun Jiao; Xu Lian-cheng; Qi Lin: "Association Rules Mining Algorithm Based on Rough Set", International Symposium on Information Technology in Medicine and Education, Print ISBN: 978-14673-2109-9, vol. 1, pp. 361-364, 2012.
[18] Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht, 1991.
[19] Agrawal, Rakesh; Imielinski, Tomasz; Swami, Arun: Mining Association Rules between Sets of Items in Large Databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 1993.
[20] Aritra Roy, Rajdeep Chatterjee: "A Survey on Fuzzy Association Rule Mining Methodologies", IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, vol. 15, issue 6, pp. 1-8, 2013.
