Data Mining Methods and Models

USING THE PRINCIPAL COMPONENTS AS PREDICTORS

TABLE 3.26 Correlation Matrix, on Which the Principal Components Are Based

                   cal-z    prot-z     fat-z  sodium-z   fiber-z   carbs-z  sugars-z   potas-z vitamin-z
Calories           1.000     0.026     0.509     0.299    −0.290     0.271     0.565    −0.068     0.268
Protein            0.026     1.000     0.185    −0.002     0.516    −0.018    −0.302     0.561     0.050
Fat                0.509     0.185     1.000     0.019     0.020    −0.277     0.289     0.189    −0.008
Sodium             0.299    −0.002     0.019     1.000    −0.061     0.320     0.047    −0.025     0.347
Fiber             −0.290     0.516     0.020    −0.061     1.000    −0.397    −0.133     0.907    −0.030
Carbohydrates      0.271    −0.018    −0.277     0.320    −0.397     1.000    −0.461    −0.381     0.217
Sugar              0.565    −0.302     0.289     0.047    −0.133    −0.461     1.000     0.026     0.105
Potassium         −0.068     0.561     0.189    −0.025     0.907    −0.381     0.026     1.000     0.026
Vitamins           0.268     0.050    −0.008     0.347    −0.030     0.217     0.105     0.026     1.000
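
The eigenvalues used by the extraction criteria come from an eigendecomposition of this correlation matrix. The following is a minimal sketch in Python, assuming NumPy is available; the variable names are illustrative, and the matrix is transcribed from Table 3.26. Because the table entries are rounded to three decimals, the computed eigenvalues may differ slightly from those reported in Table 3.27.

```python
import numpy as np

# Variable order follows the columns of Table 3.26.
names = ["cal-z", "prot-z", "fat-z", "sodium-z", "fiber-z",
         "carbs-z", "sugars-z", "potas-z", "vitamin-z"]

# Correlation matrix transcribed from Table 3.26 (rounded to 3 decimals).
R = np.array([
    [ 1.000,  0.026,  0.509,  0.299, -0.290,  0.271,  0.565, -0.068,  0.268],
    [ 0.026,  1.000,  0.185, -0.002,  0.516, -0.018, -0.302,  0.561,  0.050],
    [ 0.509,  0.185,  1.000,  0.019,  0.020, -0.277,  0.289,  0.189, -0.008],
    [ 0.299, -0.002,  0.019,  1.000, -0.061,  0.320,  0.047, -0.025,  0.347],
    [-0.290,  0.516,  0.020, -0.061,  1.000, -0.397, -0.133,  0.907, -0.030],
    [ 0.271, -0.018, -0.277,  0.320, -0.397,  1.000, -0.461, -0.381,  0.217],
    [ 0.565, -0.302,  0.289,  0.047, -0.133, -0.461,  1.000,  0.026,  0.105],
    [-0.068,  0.561,  0.189, -0.025,  0.907, -0.381,  0.026,  1.000,  0.026],
    [ 0.268,  0.050, -0.008,  0.347, -0.030,  0.217,  0.105,  0.026,  1.000],
])

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

for i, ev in enumerate(eigenvalues, start=1):
    print(f"Component {i}: eigenvalue = {ev:.3f}")
```

The eigenvalues of a correlation matrix sum to the number of variables (here, nine), which is why each eigenvalue divided by nine gives that component's proportion of the total variance.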

1. Eigenvalue criterion. According to this criterion, only components with eigenvalues of at least 1.0 should be extracted. Table 3.27 shows three such components, with a fourth component very close to 1.0, at an eigenvalue of 0.997. Thus, this criterion would suggest either three or four components.

2. Proportion of variance explained criterion. There is no concrete threshold for this criterion. We would, however, like to account for as much of the variability as possible while retaining a relatively small number of components. Table 3.27 shows that 82% of the variability is accounted for by the first four components, and 89% by the first five components. Thus, this criterion would suggest perhaps four or five components.

3. Minimum communality criterion. This criterion recommends that enough components be extracted so that the communality (the proportion of variance of a particular variable that is shared by the other variables) for each variable in the model exceeds a certain threshold, such as 50%. Table 3.28 shows that the communality for each variable is greater than 60% when four components are extracted, but the communality for vitamin-z is below 50% when only

TABLE 3.27 Eigenvalues and Proportion of Variance Explained by the Nine Components

             Initial Eigenvalues
Component    Total    % of Variance    Cumulative %
1            2.634           29.269          29.269
2            2.074           23.041          52.310
3            1.689           18.766          71.077
4            0.997           11.073          82.149
5            0.653            7.253          89.402
6            0.518            5.752          95.154
7            0.352            3.916          99.070
8            0.065            0.717          99.787
9            0.019            0.213         100.000
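
The eigenvalue criterion and the proportion of variance explained criterion can both be checked directly against the eigenvalues in Table 3.27. The following is a minimal Python sketch under stated assumptions: the helper `components_needed` is a hypothetical name, and the 80% and 85% targets are illustrative only, since the text gives no concrete threshold for the second criterion.

```python
# Eigenvalues transcribed from the "Total" column of Table 3.27.
eigenvalues = [2.634, 2.074, 1.689, 0.997, 0.653, 0.518, 0.352, 0.065, 0.019]

# With standardized variables, total variance equals the number of variables.
n_variables = len(eigenvalues)

# Eigenvalue criterion: keep components whose eigenvalue is at least 1.0.
n_by_eigenvalue = sum(1 for ev in eigenvalues if ev >= 1.0)

# Percentage of variance explained by each component, and the running total.
# The rounded eigenvalues sum to 9.001, so the last cumulative value comes
# out marginally above 100.
pct_variance = [100.0 * ev / n_variables for ev in eigenvalues]
cumulative_pct = []
running = 0.0
for pct in pct_variance:
    running += pct
    cumulative_pct.append(running)


def components_needed(target_pct):
    """Smallest number of components whose cumulative % reaches target_pct.

    Hypothetical helper; the target is the analyst's choice, not a value
    prescribed by the text.
    """
    for k, cum in enumerate(cumulative_pct, start=1):
        if cum >= target_pct:
            return k
    return len(cumulative_pct)


print("Components with eigenvalue >= 1.0:", n_by_eigenvalue)
for k, (pct, cum) in enumerate(zip(pct_variance, cumulative_pct), start=1):
    print(f"Component {k}: {pct:6.2f}% of variance, {cum:6.2f}% cumulative")
print("Components needed for 80%:", components_needed(80.0))
print("Components needed for 85%:", components_needed(85.0))
```

The fourth eigenvalue (0.997) falls just short of the 1.0 cutoff, so a strict comparison keeps three components; treating it as borderline is what leads the text to suggest three or four.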

