datamining methods and models by Saber bouabid

CHAPTER 5

NAIVE BAYES ESTIMATION AND BAYESIAN NETWORKS

classifier, in other words, just the number of predictor variables times the number of distinct values of the target variable. Bayesian classification can be extended from categorical to continuous predictor variables, provided that we know the relevant probability distribution.

Bayesian belief networks (BBNs) are designed to allow joint conditional independencies to be defined among subsets of variables. BBNs, also called Bayesian networks or Bayes nets, take the form of a directed acyclic graph (DAG), where directed means that the arcs are traversed in one direction only, and acyclic means that no child node cycles back up to any progenitor. The nodes represent variables, and the arcs represent the (directed) dependence among the variables. In general, node A is a parent or immediate predecessor of node X, and node X is a descendant of node A, if there exists a directed arc from node A to node X. The intrinsic relationship among the variables in a Bayesian network is as follows: Each variable in a Bayesian network is conditionally independent of its nondescendants in the network, given its parents. The Bayesian network represents the joint probability distribution by providing (1) a specified set of assumptions regarding the conditional independence of the variables, and (2) the probability tables for each variable, given its direct predecessors. For each variable, information regarding both (1) and (2) is provided.
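The way a Bayesian network encodes a joint distribution can be illustrated with a minimal sketch. The three-node network below (A -> X, A -> Y), its variable names, and all probability values are hypothetical and chosen only for illustration; they do not come from the chapter's data sets. The sketch stores (1) the graph structure as a parent map and (2) one probability table per variable given its parents, checks that the graph is acyclic, and multiplies the table entries to obtain a joint probability.

```python
from itertools import product

# Hypothetical structure: A -> X and A -> Y, all variables binary.
# parents[node] lists the direct predecessors of each node.
parents = {"A": (), "X": ("A",), "Y": ("A",)}

# Hypothetical tables: cpt[node][parent_values] = P(node = True | parents).
cpt = {
    "A": {(): 0.3},
    "X": {(True,): 0.9, (False,): 0.2},
    "Y": {(True,): 0.6, (False,): 0.1},
}


def is_acyclic(parent_map):
    """Return True if the parent map describes a directed acyclic graph."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path: a cycle
        state[node] = "visiting"
        ok = all(visit(p) for p in parent_map[node])
        state[node] = "done"
        return ok

    return all(visit(n) for n in parent_map)


def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment."""
    p = 1.0
    for node, pars in parents.items():
        key = tuple(assignment[q] for q in pars)
        p_true = cpt[node][key]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


def total_probability():
    """Sum the joint probability over every assignment (should be 1)."""
    nodes = list(parents)
    return sum(
        joint_probability(dict(zip(nodes, values)))
        for values in product([True, False], repeat=len(nodes))
    )
```

For example, `joint_probability({"A": True, "X": True, "Y": False})` multiplies P(A) = 0.3, P(X | A) = 0.9, and P(not Y | A) = 0.4. Because X and Y have no arc between them, the tables imply that each is conditionally independent of the other given their common parent A.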

REFERENCES

1. Thomas Bayes, Essay towards solving a problem in the doctrine of chances, Philosophical Transactions of the Royal Society of London, 1763.
2. Tom Mitchell, Machine Learning, WCB–McGraw-Hill, Boston, 1997.
3. Churn data set, in C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, University of California, Department of Information and Computer Science, Irvine, CA, 1998. Also available at the book series Web site.
4. Daniel Larose, Discovering Knowledge in Data: An Introduction to Data Mining, Wiley, Hoboken, NJ, 2005.
5. David Hand, Heikki Mannila, and Padhraic Smyth, Principles of Data Mining, MIT Press, Cambridge, MA, 2001.
6. Stuart Russell, John Binder, Daphne Koller, and Keiji Kanazawa, Local learning in probabilistic networks with hidden variables, in Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 1146–1152, Morgan Kaufmann, San Francisco, CA, 1995.
7. Peter Spirtes, Clark Glymour, and Richard Scheines, Causation, Prediction, and Search, Springer-Verlag, New York, 1993.
8. Marco Ramoni and Paola Sebastiani, Bayesian methods, in Michael Berthold and David J. Hand, eds., Intelligent Data Analysis, Springer-Verlag, Berlin, 1999.
9. Breast cancer data set, compiled by Dr. William H. Wolberg, University of Wisconsin Hospitals, Madison, WI; cited in O. L. Mangasarian and W. H. Wolberg, Cancer diagnosis via linear programming, SIAM News, Vol. 23, No. 5, September 1990.