Med 2016 12 issue 1

H. Du & L. Wang, Number of Dyads in Dyadic Data Analysis

Hox (2005) used 0.1, 0.2, and 0.3 for the ICC values in their simulations. Hox (2010) further suggests that ICC values could be much higher in small group and family research. For example, Salvy, Howard, Read, and Mele (2009) observed ICCs ranging from .47 to .81 using data from various kinds of eating partner dyads and McIsaac, Connolly, McKenney, Pepler, and Craig (2008) observed an ICC of 0.84 in adolescent romantic dyads. Therefore, we also include 0.5 and 0.7 as levels of ICC in our simulation.

Proportion of Singletons (PS = 0%, 10%, 30%, 50%) and Missing Data Mechanisms (MCAR, MAR)

Missing data generally complicate the estimation of MLM models. For dyadic data, one special type of missing data, singletons, can occur by design or for other reasons. A singleton indicates that data from one member of a dyad are available while data from the other member are missing. For example, in family research that collects data from both mothers and fathers, researchers may plan, by design, to collect data from all mothers but only half of the fathers because of budget constraints. Even when all mothers and all fathers are supposed to be measured by design, researchers have found that fathers are more likely than mothers to be absent from a data collection. Clarke and Wheaton (2007) have shown that singletons may affect estimation. More generally, with large amounts of missing data, bias has been observed in parameter estimates of multilevel models when the sample size is not large enough, even under missing completely at random conditions (Gibson & Olejnik, 2003; Roth, 1994). Missing data proportions vary widely in empirical dyadic studies and can be as low as 0% and as high as 80% (Grover & Vriens, 2006; Strauss et al., 2004). Therefore, in this study, we consider 0%, 10%, 30%, and 50% for the proportion of singletons (PS). In addition, we consider two kinds of missing data due to different missingness mechanisms (Rubin, 1976): MCAR and MAR. The former occurs when missingness is unrelated to either observed or unobserved variables, and the latter occurs when missingness is related to observed variables in the analysis model but not to unobserved variables. To generate MCAR missing data, we randomly set a specific proportion of fathers’ posttest data to be missing. There are different ways to generate MAR data, depending on how strongly missingness depends on observed data.
For example, we can set fathers’ posttest data to be missing when their pretest scores are larger than the (1 − PS)th percentile. Alternatively, we can evenly divide the distribution of fathers’ pretest data into several parts, for example, the upper, middle, and lower parts. Then, for example, PS 50%, PS 35%, PS 15% of the corresponding posttest scores of the three parts are set to be missing, respectively. Zhang & Wang (2016) used both approaches for generating MAR data and found that the former yielded stronger MAR data than the latter. In the current simulation, we use the former approach to generate “stronger” MAR data.

In sum, there are 8 × 5 × 4 × 2 = 320 conditions. For the true parameter values, we set the fixed intercept coefficients ($\gamma_{A00}$) as 1, and all the other fixed coefficients as 0.3, following the simulation design in Maas and Hox (2005). The lower level predictor $X_{id}$ has a standard normal distribution, $X_{id} \sim N(0, 1)$. The level-1 residual variance $\sigma^2_e$ is 0.5. Based on Equation 10, the level-2 residual variance, $\sigma^2_{A0d}$, is determined by a given ICC value and the corresponding $\sigma^2_e$ value in the empty model. Note that if we simply use $\sigma^2_{A0d} / (\sigma^2_{A0d} + \sigma^2_e)$ from the full model in Equation 8 for “ICC” and simulate data based on the full model, the ICC values calculated from the empty models are smaller than the desired ones. From some numerical analyses, we observed the relations between $\sigma^2_{A0d} / (\sigma^2_{A0d} + \sigma^2_e)$ in Equation 8 from the full model and the ICCs in Equation 10 from the empty model. Therefore, we fix the ratio $\sigma^2_{A0d} / (\sigma^2_{A0d} + \sigma^2_e)$ in the full model to certain values to achieve the designed values of ICC (0.1, 0.2, 0.3, 0.5, and 0.7), respectively, in the empty model (see the Appendix for the relations). For evaluating the Type I error rates, we set the corresponding true values to be 0. For example, for the level-2 variance estimates, we set $\mu_{A0d}$ to be 0.

For each condition, 10,000 simulated data sets were generated. Analyses were implemented in SAS 9.3 using the SAS PROC MIXED procedure with REML for estimation.

For evaluation, convergence rates, bias in point estimates, coverage probabilities, and Type I error rates of the estimates of both fixed coefficients and variance components were examined. Nonconvergence can happen when the estimate of a variance parameter becomes near zero or negative during the iterative process, so that the covariance matrix is not positive definite. In this study, nonconvergence was identified when SAS PROC MIXED provided an incomplete output of the variance/covariance estimates and/or an incomplete output of the fixed-effects estimates (e.g., missing standard errors) with a warning that “Estimated G matrix is not positive definite”. For suggesting the minimum number of dyads needed, we consider a convergence rate of 95% or higher as satisfactory. For bias, the absolute bias is computed as the average of $|\hat{\theta} - \theta|$ across replications in conditions when $\theta = 0$, and the relative bias is given by the average of $\frac{\hat{\theta} - \theta}{\theta} \times 100\%$ across replications in conditions when $\theta \neq 0$. When the relative bias is higher than 5%, it is considered unsatisfactory (Hoogland &

© 2016 Hogrefe Publishing
Methodology (2016), 12(1), 21–31
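The two missingness mechanisms described above can be illustrated with a minimal Python sketch. It is not the authors' SAS code: the function names, the seed, the number of dyads, and the toy pretest/posttest generator are all assumptions made for illustration. MCAR deletes a random proportion PS of fathers' posttest scores; the "stronger" MAR variant deletes the posttest scores of fathers whose pretest scores lie above the (1 − PS)th percentile.

```python
import random

def make_mcar(posttest, ps, rng):
    """MCAR: set a randomly chosen proportion `ps` of posttest scores to None."""
    n_missing = round(ps * len(posttest))
    missing_idx = set(rng.sample(range(len(posttest)), n_missing))
    return [None if i in missing_idx else y for i, y in enumerate(posttest)]

def make_mar_percentile(pretest, posttest, ps):
    """MAR: set posttest to None where pretest exceeds the (1 - ps)th percentile."""
    n_missing = round(ps * len(pretest))
    # Rank cases by pretest, largest first; the top n_missing cases lose their posttest.
    order = sorted(range(len(pretest)), key=lambda i: pretest[i], reverse=True)
    missing_idx = set(order[:n_missing])
    return [None if i in missing_idx else y for i, y in enumerate(posttest)]

# Toy data (hypothetical values, not the simulation design of the paper).
rng = random.Random(2016)
n_dyads = 200
pre = [rng.gauss(0, 1) for _ in range(n_dyads)]
post = [0.3 * x + rng.gauss(0, 1) for x in pre]

mcar = make_mcar(post, 0.30, rng)
mar = make_mar_percentile(pre, post, 0.30)
print(sum(v is None for v in mcar), sum(v is None for v in mar))
```

In both cases the mothers' data would stay complete, so every dyad with a deleted father score becomes a singleton.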

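As a rough illustration of how a level-2 variance follows from a target ICC in the empty model, the sketch below inverts the usual definition ICC = level-2 variance / (level-2 variance + level-1 variance). The function names are hypothetical, and the residual variance of 1.0 is only a placeholder: the empty-model level-1 variance is not stated in this section, and the full-model ratios the authors actually used are given in their Appendix.

```python
def level2_variance_from_icc(icc, sigma2_e):
    """Solve icc = s2u / (s2u + sigma2_e) for the level-2 variance s2u."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must lie in [0, 1)")
    return icc * sigma2_e / (1.0 - icc)

def icc_from_variances(sigma2_u, sigma2_e):
    """ICC as the share of total variance that lies at level 2."""
    return sigma2_u / (sigma2_u + sigma2_e)

# Placeholder empty-model level-1 variance (an assumption, not a value from the paper).
SIGMA2_E_EMPTY = 1.0

for icc in (0.1, 0.2, 0.3, 0.5, 0.7):
    s2u = level2_variance_from_icc(icc, SIGMA2_E_EMPTY)
    print(f"ICC={icc:.1f} -> level-2 variance={s2u:.4f}")
```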

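The two bias measures defined above can be computed from a list of replication estimates as in the following sketch. The function names and the example estimates are hypothetical; only the formulas (mean of |estimate − true value| when the true value is 0, and mean of (estimate − true value) / true value × 100% otherwise) come from the text.

```python
from statistics import fmean

def absolute_bias(estimates, theta):
    """Average of |estimate - theta| across replications (used when theta = 0)."""
    return fmean([abs(est - theta) for est in estimates])

def relative_bias(estimates, theta):
    """Average of (estimate - theta) / theta, in percent (used when theta != 0)."""
    if theta == 0:
        raise ValueError("relative bias is undefined when theta is 0")
    return fmean([(est - theta) / theta for est in estimates]) * 100.0

# Hypothetical estimates of a fixed coefficient whose true value is 0.3.
example_estimates = [0.31, 0.29, 0.33, 0.30]
print(relative_bias(example_estimates, 0.3))

# Hypothetical estimates of a parameter whose true value is 0.
print(absolute_bias([0.02, -0.01, 0.03], 0.0))
```

A relative bias above 5% would be flagged as unsatisfactory under the criterion stated in the text.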