Med 2016 12 issue 1


G. Sočan, Bootstrapping Congruence Coefficients in PCA

2. Variance explained by the first principal component (EVP = 25%, 50%, or 75%). Although the choice of values is to some extent arbitrary, such values might be obtained in analyses of well-designed dichotomous items (25%), graded response items (50%), and factorially homogeneous test scores (75%).⁴

3. Coefficient of congruence in the population (ϕ = 1, .95, or .85). While the inclusion of the value 1 was obviously necessary, because it corresponded to the null hypothesis of perfect congruence in the population, the remaining two values were determined according to the results of Lorenzo-Seva and ten Berge (2006). They investigated the relation between ϕ and practitioners’ subjective judgments of factor similarity. They concluded that in cases when ϕ > .95, the factors’ interpretations could be considered equal, whereas the value range .85–.94 corresponded to a fair factor similarity. Therefore, the values of .95 and .85 can be considered limiting values for a small and a large effect size, respectively.

The manipulation of the sample size and the population value of ϕ was essential for the evaluation of both error levels. On the other hand, we included EVP because we expected this factor to differentiate between mCLCHYP and CLCHYP. As explained above, the two procedures differ in the definition of the residual part of the model, Ψ. When EVP is high, the residuals in Ψ are generally small compared to the case when EVP is low; therefore, the behavior of the two procedures should differ more in the low EVP condition than in the high EVP condition.⁵

Procedure

In each of the 5 × 3 × 3 = 45 experimental conditions, we generated 5000 sample pairs. For each pair, we first constructed a pair of corresponding population covariance matrices. The pair of the first eigenvectors of these matrices had a fixed value of ϕ, and the eigenvalues of both matrices were identical. The eigenvalues had been set in advance according to the desired value of EVP and were the same for all population matrices within the same level of EVP (see Table 1). The eigenvalues were determined in such a way that the last p − 1 eigenvalues formed a linear scree.


Table 1. Population eigenvalues for different levels of EVP

          Explained variance percentage
e.v.        25%       50%       75%
  1        2.50      5.00      7.50
  2        1.50      1.00      0.50
  3        1.33      0.89      0.44
  4        1.17      0.78      0.39
  5        1.00      0.67      0.33
  6        0.83      0.56      0.28
  7        0.67      0.44      0.22
  8        0.50      0.33      0.17
  9        0.33      0.22      0.11
 10        0.17      0.11      0.06

Note. e.v. = eigenvalue.
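The values in Table 1 can be reproduced with a short sketch. It rests on one reading of "linear scree" that is an assumption here, not something the text states: the first eigenvalue equals EVP × p, and the remaining p − 1 eigenvalues are proportional to p − 1, p − 2, ..., 1 and scaled so that all eigenvalues sum to p. The function name `linear_scree_eigenvalues` is hypothetical.

```python
import numpy as np

def linear_scree_eigenvalues(evp, p=10):
    """Eigenvalues that sum to p: the first equals evp * p, and the
    remaining p - 1 decrease linearly (assumed reading of 'linear scree')."""
    first = evp * p
    # Weights p-1, p-2, ..., 1 give equally spaced, decreasing eigenvalues.
    weights = np.arange(p - 1, 0, -1, dtype=float)
    rest = (p - first) * weights / weights.sum()
    return np.concatenate(([first], rest))

for evp in (0.25, 0.50, 0.75):
    print(f"{evp:.0%}:", np.round(linear_scree_eigenvalues(evp), 2))
```

Rounded to two decimals, the three printed vectors agree with the columns of Table 1, which suggests (but does not prove) that the eigenvalues were generated this way.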

On the other hand, a separate pair of eigenvector matrices was constructed for each pair of population matrices as follows. The first eigenvector of each matrix was obtained as a column of matrix W*, computed as

\[
W^{*} = W \begin{bmatrix} 1 & \phi \\ 0 & \sqrt{1 - \phi^{2}} \end{bmatrix}, \tag{7}
\]

where W was an orthonormalized p × 2 matrix of random numbers from the uniform distribution, and ϕ was the desired population value of the congruence coefficient. The remaining nine eigenvectors were obtained by means of the procedure described in Step 2 of mCLCHYP: the respective column of W was used in place of UT, and a p × (p – 1) matrix of uniformly distributed random numbers was used in place of VR. We used different population eigenvectors (and, consequently, a different pair of population covariance matrices) for each sample pair to prevent confounding the effects of our independent variables with idiosyncratic structure properties of a single population loading matrix. On the other hand, we used only three sets of eigenvalues. We controlled the variance explained by the first component, and at the same time we wished the structure to be close to unidimensional, making the extraction of a single principal component the optimal choice. Because of the fixed value of the sum of the eigenvalues, little room for variation of individual eigenvalues remained. Our sampling scheme therefore included a sampling of populations from a metapopulation
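The construction of the first eigenvector pair in Equation 7 can be illustrated with a minimal sketch. Several details are assumptions, since the text does not specify them: the random numbers are drawn from U(0, 1), the orthonormalization is done by QR decomposition, and the names `congruence` and `first_eigenvector_pair` are hypothetical. The remaining eigenvectors (Step 2 of mCLCHYP) are not covered.

```python
import numpy as np

def congruence(x, y):
    """Congruence coefficient: x'y / sqrt((x'x)(y'y))."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

def first_eigenvector_pair(p, phi, rng):
    """Return W* (p x 2): two unit-length columns with congruence phi."""
    raw = rng.uniform(size=(p, 2))       # assumed range U(0, 1)
    w, _ = np.linalg.qr(raw)             # assumed orthonormalization method
    t = np.array([[1.0, phi],
                  [0.0, np.sqrt(1.0 - phi ** 2)]])
    return w @ t                         # Equation 7: W* = W T

rng = np.random.default_rng(2016)
w_star = first_eigenvector_pair(p=10, phi=0.85, rng=rng)
print(congruence(w_star[:, 0], w_star[:, 1]))
```

Because the columns of W are orthonormal, the first column of W* is the first column of W, and the second is ϕ times the first column plus √(1 − ϕ²) times the second. Both have unit length, so their congruence equals ϕ up to rounding error.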

⁴ For example, the first principal component explains 27.3% of variance in the classical set of ten dichotomously scored LSAT items (Bock & Lieberman, 1970). Schmitt and Allik (2005) applied PCA to the 10 items of the Rosenberg Self-Esteem Scale, scored on a 4-point scale. The first principal component explained 50.3% of variance in the largest (USA) sample, and 41.4% of variance across all 53 nations. Finally, the EVP value of 75% for 10 variables (as used in this study) would imply average inter-test correlation of around .72, which could be expected when analyzing reasonably reliable tests measuring the same construct.

⁵ This relation is not exact. The main difference between the procedures lies in the off-diagonal elements of Ψ, which are zero in CLCHYP and nonzero in mCLCHYP. On the other hand, EVP is related to all elements of Ψ. However, we see no quantity which would be directly related only to the off-diagonal elements and could be easily interpreted, computed, and manipulated.
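The figure of .72 quoted in the footnote on EVP values can be checked under a simplifying assumption that is ours rather than the source's: all inter-test correlations equal a common value r̄. The first eigenvalue of such a p × p correlation matrix is 1 + (p − 1)r̄, so with p = 10 and EVP = .75:

```latex
\[
\mathrm{EVP} = \frac{1 + (p - 1)\,\bar{r}}{p}
\quad\Longrightarrow\quad
\bar{r} = \frac{p \cdot \mathrm{EVP} - 1}{p - 1}
        = \frac{10 \times .75 - 1}{9}
        \approx .72
\]
```

When the correlations are unequal, the first eigenvalue is at least 1 + (p − 1)r̄, so this is an approximation rather than an identity.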

© 2016 Hogrefe Publishing

Methodology (2016), 12(1), 11–20

