Page 1 of 23

Modeling Strategies in Epidemiology: I. Traditional methods

Sander Greenland1, Neil Pearce2,3 (17 April 2014)

[1] Department of Epidemiology and Department of Statistics, University of California, Los Angeles, USA. lesdomes@ucla.edu
[2] Departments of Medical Statistics and Non-communicable Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, United Kingdom*
[3] Centre for Public Health Research, Massey University, Wellington, New Zealand

*Address for correspondence: Professor Neil Pearce, Department of Medical Statistics, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom. E-mail: neil.pearce@lshtm.ac.uk Website: http://www.lshtm.ac.uk

ABSTRACT

Most epidemiology textbooks that discuss models are vague on details of model selection. This may be appropriate: model selection should be strongly influenced by factors that are specific to the particular study, including prior information about covariates that may confound, modify, or mediate the effect under study. Nonetheless, if many covariates are available for control it is important that authors document their modeling goals and strategy. Thus most researchers could benefit from some simple strategies as starting points for more context-driven analyses. We review several traditional strategies and discuss their shortcomings; in a second paper we provide refinements that do not require special macros or programming. Throughout, we assume the main goal is to derive the most accurate effect estimates obtainable from the data and commercial software. This goal shifts the focus to prediction of exposure or potential outcomes (or both) to adjust for confounding, and thus differs from the goal of ordinary statistical modeling, which is to accurately predict only observed outcomes.

Introduction

Few topics cause as much confusion and consternation, both in students and in experienced epidemiologists, as modeling strategies. It is one thing to take a course on the basic principles of confounder control and assessment of "interaction" and mediation, and to learn how to run regression models. It is another thing to be confronted with a new data set including many potentially important covariates, and then have to decide what to do with it.

Most epidemiologic textbooks are vague on the practicalities of model selection, and understandably so. Arguably, model selection should be strongly influenced by factors that are specific to the particular study, including possibly controversial prior information about which variables are important potential confounders, modifiers, or mediators. It is thus inappropriate to impose rigid rules for all aspects of the modeling process. Nonetheless, the lack of guidelines can leave confusion about how to proceed and can raise questions about biases from choosing models that favor a preferred hypothesis. Most researchers could thus benefit from recommendations even if they deviate from those in specific analyses. Similarly, critical reading of analyses requires knowledge of what strategy or guidelines were used to select analysis models.

Scope, Aims, and Assumptions

The present paper, together with a companion paper1, is our attempt to update commentaries from decades ago2-4 and to supplement more recent overviews5, noting that few researchers employ modeling methods now considered at the forefront of theory. Rather, most studies fall back on strategies that have become entrenched in basic teaching and common commercial software. We therefore critically examine these strategies to discuss how they can be recast and upgraded with little effort to minimize harmful practices and follow more sound methodologic principles, without requiring new software.

In this paper and its companion1, we focus on ‘traditional’ strategies for confounder assessment and control, and we do not consider issues of effect measure modification. In subsequent papers, we will consider more advanced methods for confounding control6 and effect measure modification7.

We will begin with a review of traditional modeling approaches in epidemiology: stepwise regression and related strategies; and change in estimates (CIE) strategies. We avoid unfamiliar concepts and special software or programming, although we will also cite better methods available to those with more technical resources. Thus, our coverage is not intended to be a comprehensive review for highly skilled practitioners; rather, we target teachers, students and working epidemiologists who would like to do better with data analysis, but who lack resources such as R programming skills or a bona fide modeling expert committed to their project.

Throughout, we assume that we are applying a conventional risk or rate regression model (e.g., logistic, Cox, or Poisson regression) to estimate the effects of an exposure variable X on the distribution of a disease variable Y, while controlling for other variables. These other variables may be forced variables, such as age and sex, which we may always want to control, or unforced variables about which we are unsure whether to control. The unforced variables may include products among variables, in which case it will be important that models containing a product also contain the components of the product as main effects (the "hierarchy principle").8 9 We will also assume that data checking, description, and summarization have been done carefully.10 We will not address problems arising from preliminary data examinations (e.g., by collapsing away small categories)11; see Chatfield12 13 for a discussion.

To facilitate interpretation of coefficients and minimize numeric problems, we assume that all quantitative variables have been recentered to ensure that zero is a meaningful reference value present in the data, and rescaled so that their units are meaningful differences spanning a range present in the data.8 For example, diastolic blood pressure could be recentered so that 0 represents 80mm, and then rescaled to cm instead of mm, so that 95mm would become (95−80)/10 = 1.5; adult age could be recentered so that 0 represents age 60, and rescaled to decades instead of years, so that 80 years would become (80−60)/10 = 2.0. Recentering and rescaling is especially important when employing product terms ("interactions"),9 and also when using prior distributions or penalty functions in the analysis, as discussed in a subsequent paper6. We emphasize that the rescaling should be done in contextually meaningful units in order to facilitate interpretation, rather than in terms of arbitrary study-specific quantities such as sample standard deviations or interquartile ranges.14 15
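As a minimal sketch of the recentering and rescaling just described (the function name is ours; the reference points and units come from the examples above):

```python
def recenter_rescale(value, reference, unit):
    """Recenter so that `reference` maps to 0, then rescale to `unit`-sized steps."""
    return (value - reference) / unit

# Diastolic blood pressure: reference 80 mm, unit 10 mm (i.e., cm)
dbp = recenter_rescale(95, reference=80, unit=10)   # (95 - 80)/10 = 1.5

# Adult age: reference 60 years, unit 10 years (decades)
age = recenter_rescale(80, reference=60, unit=10)   # (80 - 60)/10 = 2.0
```

The same transformation applies to any quantitative covariate, provided the reference value and unit are chosen for contextual meaning rather than from sample statistics.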

The strategies we propose apply to exposures and covariates of any form. We assume that univariate distributions and background (contextual) information have been used to select categories or an appropriately flexible form (e.g., splines) for detailed modeling of quantitative variables.8 We focus only on basic variable selection, leaving many difficult issues about model specification and diagnostics to more detailed discussions.8 16-22 Finally, we will not consider special models and problems that arise with time-varying exposures or mediation analysis.23-26

Traditional Approaches to Variable Selection

Unfortunately, it can be difficult to tell what modeling strategy, if any, has been used in published studies. To the extent we have discerned strategies, however, they are often a variant of these three:
1) Enter all the potential confounders in a model including the exposure (which is not really a variable-selection strategy, since only one set of variables is considered).
2) Eliminate or include variables based on statistical significance, as in conventional stepwise regression27 (which is still widely used, despite the extensive literature documenting how it can seriously distort P-values and confidence intervals2 12 17 28-41).
3) Eliminate or include variables based on the change in the exposure-effect estimate.42-44

More recently, adjustment methods based on propensity scores16 45 and related inverse-probability weighting methods have become popular. While those methods may adopt any strategy when building the score or weights, traditional strategies (which focus exclusively on exposure prediction) can cause as much harm as when used for disease modeling.46-49

A problem with all the above strategies is that none are based on maximizing accuracy (minimizing bias and variance) in estimating target effects defined in an explicit causal model. Stepwise regression is the worst in this sense, falling firmly into a noncausal prediction50 framework because it attempts to explain most of the variation in the regression outcome variable (whether disease or exposure) with a minimal number of covariates, without regard to causal relations among the variables or their impact on the estimated exposure-disease relation. It is thus unsurprising that stepwise has long been condemned in epidemiology textbooks,16 42-44 although it is still found in published papers.

Suppose we exclude variables that cannot be confounders on causal grounds, such as instrumental variables, variables affected by exposure or disease, and variables that have only "back-door" connections to exposure and disease.51-55 Even then, the covariates that statistically "explain" (are associated with) the most observed disease variation, or are most statistically significant, need not be the same as the covariates that are most important in terms of confounding control or causation of cases.2 56 57 Similarly, if we are building a propensity score, then the set of covariates that is most associated with exposure variation may be insufficient for confounding control, and may even introduce bias. Conversely, a covariate need not be a "statistically significant" predictor of exposure or disease to be an important confounder.56 58 These seemingly paradoxical facts arise because the target of noncausal prediction is only one of the associations of the covariate with exposure or disease, whereas confounding by the covariate is a function of both associations.
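For intuition on why confounding depends on both associations, a small simulation sketch (a hypothetical linear data-generating model of our own choosing, not from the paper) shows that the crude exposure slope is shifted by roughly the product of the covariate-outcome coefficient and the covariate-exposure association, while adjustment for Z (computed here by residualizing both X and Y on Z) recovers the true effect:

```python
import random

random.seed(1)
n = 50_000

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Z affects both exposure X and outcome Y, so it confounds the X-Y relation.
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + random.gauss(0, 1) for zi in z]                          # Z-X association
y = [0.5 * xi + 1.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]  # true X effect = 0.5

crude = slope(x, y)   # biased upward by about 1.0 * cov(X,Z)/var(X), i.e. ~0.49

# Adjust for Z: regress the Z-residuals of Y on the Z-residuals of X
bzx, bzy = slope(z, x), slope(z, y)
x_res = [xi - bzx * zi for xi, zi in zip(x, z)]
y_res = [yi - bzy * zi for yi, zi in zip(y, z)]
adjusted = slope(x_res, y_res)   # close to the true effect 0.5
```

Weakening either association (Z with X, or Z with Y) toward zero shrinks the crude-versus-adjusted gap toward zero, which is the point made in the text: neither association alone determines the confounding.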

The change-in-estimate (CIE) approach represents a step in the direction of causal modeling because it selects covariates based on how much their control changes exposure effect estimates, which is presumed to measure confounding by the covariate. (There are different types of causal models59 60; here the term denotes any approach targeting the causal effect of a particular factor, rather than just describing variation or association.) Nonetheless, neither stepwise nor CIE has a strong theoretical foundation, and thus they have been largely superseded in the causal-inference literature by more advanced concepts and techniques which we will discuss below.

In that literature, prediction plays a crucial role, but not mere passive prediction of observed outcomes: rather, the goal is to predict health outcomes under alternative (and possibly counterfactual) interventions or treatment patterns (potential outcomes).23 60-64 Expressed in formal terms, passive prediction seeks only to forecast an observed outcome Yobs from an exposure indicator X and a list of covariates Z; in contrast, causal prediction seeks to forecast two distinct potential outcomes from Z: Y1, the outcome if exposure (X=1) occurs; and Y0, the outcome if no exposure (X=0) occurs. Causal prediction thus takes on a more difficult task of predicting two outcomes at once (Y1, Y0), and can be distorted by focusing only on prediction of Yobs, which is the composite Yobs = XY1 + (1−X)Y0.
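The composite identity at the end of this paragraph can be written out directly (a trivial sketch; variable names are ours):

```python
def observed_outcome(x, y1, y0):
    """Consistency: Yobs = X*Y1 + (1 - X)*Y0 -- we see Y1 if exposed, Y0 otherwise."""
    return x * y1 + (1 - x) * y0

# A subject who would get disease only if exposed (Y1 = 1, Y0 = 0):
exposed_obs = observed_outcome(1, y1=1, y0=0)     # exposed -> we observe Y1 = 1
unexposed_obs = observed_outcome(0, y1=1, y0=0)   # unexposed -> we observe Y0 = 0
```

Only one of (Y1, Y0) is ever observed for a given subject, which is why causal prediction is a harder task than predicting Yobs alone.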

Why Not Adjust for Every Available Covariate?

There are of course variables for which control may be inappropriate based on preliminary causal considerations. These include intermediates (variables on the causal pathway between exposure and disease) and their descendants65, and any other variable influenced by the exposure or outcome51 53 55. They also include variables that are not part of minimal sufficient adjustment sets, whose control may increase bias.51-55 65 66 We assume that these variables have been identified and eliminated (e.g., using causal diagrams52 53 65 to display contextual theory67), leaving us with a set of potential adjustment covariates (often called "potential confounders"), comprising those variables that we are reasonably confident would reduce bias if controlled and if our study size were unlimited.

Unfortunately, even the largest studies can be small relative to the number of potential confounders. While rules similar to "at least 10 subjects per regression coefficient" are sometimes given, such rules take no account of exposure and disease frequencies and so can be quite overoptimistic when some exposure-disease combination is rare. In those cases, controlling too many variables by conventional means can lead to severe bias or unnecessary imprecision in estimates. In particular, unnecessary control can produce or aggravate two closely related problems: (i) data sparsity, in which full control results in too few subjects at crucial combinations of the variables, with consequent bias in estimates3 68; and (ii) multicollinearity, by which we mean high multiple correlation (or more generally, high association) of the controlled variables with exposure.3 When sparsity or multicollinearity is so severe that one cannot fit the pre-specified model by conventional means, then unless one is willing to turn to more advanced fitting methods, one is usually forced to use the data to make decisions about how to reduce the model.3
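A common refinement of the subjects-per-coefficient rule counts "events" as the rarer outcome state, which makes the frequency-dependence noted above concrete. A rough sketch (the counts and the threshold of 10 are illustrative conventions, which the text above notes can still be overoptimistic):

```python
def events_per_variable(n_cases, n_noncases, n_coefficients):
    """Crude sparsity check: the effective sample size for a binary outcome
    is limited by the rarer of the two outcome states, not the total n."""
    return min(n_cases, n_noncases) / n_coefficients

# 10,000 subjects sounds ample, but with only 50 cases and 12 coefficients:
epv = events_per_variable(n_cases=50, n_noncases=9_950, n_coefficients=12)
sparse = epv < 10   # True: far below the conventional 10 events per coefficient
```

Even this refinement ignores the joint distribution of exposure and covariates, so it can flag sparsity too late when particular exposure-covariate combinations are rare.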

Stepwise Regression and Related Strategies

As one of the oldest selection strategies27, stepwise regression attempts to achieve parsimonious noncausal prediction, searching for a model that explains most outcome variation with the fewest variables. The goal itself is reasonable for clinical prediction; e.g., if we can predict cardiovascular disease risk with reasonable accuracy by collecting information on five variables instead of 30, then this has clear practical benefits. Even with this goal, however, ordinary stepwise has many flaws, and requires further cautions when used for estimating effects2 17 28 30-41 69. For example, stepwise algorithms do not automatically limit selection to potential confounders, so it is essential to exclude nonconfounders (such as intermediates) at the start and then apply the algorithm to the remaining variables. If we wish to ensure that particular confounders are included (e.g., age, sex) then those confounders must be forced into the model. Similarly, if we wish to estimate the effect of a particular exposure using a model for disease, this exposure must be forced into the model.

For covariates subject to selection, decisions about adding (or deleting) a covariate are made according to whether adding (or deleting) the covariate "significantly" improves (or reduces) the fit of the model, where "significantly" is usually defined by some arbitrary cut-off for the P-value of an improvement test (usually 0.05 to 0.20), such as the likelihood-ratio (deviance) test comparing the models with and without the covariate. This is equivalent to assessing whether the covariate explains a "significant" proportion of the residual variation (the outcome variation that remains given the variables already in the model). It is also equivalent to using the P-value for testing the coefficients for the covariate when it is in the model. One can include product terms ("interactions") in the process, e.g., by adding products to the data set and declaring them candidates for selection.
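As a sketch of the improvement test just described: for a single added coefficient, the likelihood-ratio (deviance) difference between the nested models is referred to a chi-squared distribution with 1 df, whose tail probability has a closed form. The deviance values below are hypothetical placeholders, not from any real fit:

```python
import math

def lr_pvalue_1df(deviance_without, deviance_with):
    """P-value for adding one covariate: the LR statistic is the deviance drop,
    compared to chi-squared with 1 df. For 1 df, P(chi2 > x) = erfc(sqrt(x/2))."""
    lr = deviance_without - deviance_with
    return math.erfc(math.sqrt(lr / 2))

# Hypothetical deviances for models without and with the candidate covariate:
p = lr_pvalue_1df(deviance_without=210.4, deviance_with=205.1)  # ~0.02
add_covariate = p < 0.10   # stepwise-style decision at an arbitrary cut-off
```

The cut-off (here 0.10) is exactly the arbitrary element the text criticizes: nothing in the P-value connects the decision to confounding of the exposure effect.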

The most general objection is that ordinary stepwise (as well as all-subsets) regression as implemented in standard software uses no cross-validation of its choices, and so does not account for preliminary testing (the same data being used to both fit and test the model). As a result, it produces P-values that are too small (i.e. overstates significance) and confidence intervals that are too narrow for variables that are forced in the model, including the study exposure in disease models2 17 28 30-41 69; in addition, the resulting model often yields much poorer predictions than can be obtained with modern techniques.17 21 These defects are especially bad for a field like epidemiology that is plagued by charges of generating too many false positives and inaccurate predictions. They can be corrected by using advanced resampling and cross-validation methods,18 20-22 but these corrections remain uncommon in commercial software and epidemiologic practice.

A more specialized objection is that model-fitting criteria such as significance testing are inappropriate for the assessment of confounding2 56 58 (and indeed there are questions about whether they can be justified for any epidemiologic purpose70 71). A risk factor may be strongly associated with the disease with a small P-value and yet not be associated with exposure, in which case it will not be a confounder. For example, age is a strong risk factor for most diseases, but will not be a confounder if exposure is independent of age. In parallel, a covariate may have a "nonsignificant" (P>0.05) association with disease, but it may be an important confounder. For example, if the variable is only moderately associated with disease but strongly associated with exposure, adding it to the model may achieve better confounder control, as reflected in the change in the main exposure coefficient and as predicted by background information26 66.

Furthermore, when deciding whether or not to control for a variable, the null hypothesis may not be the appropriate one to test.2 3 It may be more appropriate to start with the hypothesis that a particular variable is an important confounder, and to only decide to exclude the variable from the analysis if there is good evidence that it can be ignored. For example, instead of testing whether a confounder coefficient is zero, one could test whether its absolute magnitude is larger than a given “important” size (an equivalence test)72 73.
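A sketch of the equivalence-test idea just mentioned (the importance bound delta and the 90% interval are our illustrative choices): the variable is declared ignorable only when its confidence interval lies entirely inside the bound, i.e., only when there is positive evidence that its coefficient is small.

```python
def ignorable_by_equivalence(beta_hat, se, delta, z=1.645):
    """Two one-sided tests at the 5% level: exclude the covariate only if the
    90% CI for its coefficient lies entirely within (-delta, +delta)."""
    lo, hi = beta_hat - z * se, beta_hat + z * se
    return -delta < lo and hi < delta

small_and_precise = ignorable_by_equivalence(beta_hat=0.02, se=0.05, delta=0.20)  # True
small_but_noisy = ignorable_by_equivalence(beta_hat=0.02, se=0.30, delta=0.20)    # False
```

Note the asymmetry with null-hypothesis testing: an imprecise estimate fails the equivalence test and the variable is retained, matching the burden-of-proof reversal described above.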

Regardless of the method chosen, simulation studies66 72 74 appear to confirm earlier suggestions75 that false negatives (incorrect exclusion of confounders) are a greater threat to effect estimates than false positives (incorrect inclusion of nonconfounders), supporting recommendations for liberal inclusion criteria (or, equivalently, weak exclusion criteria). But if many variables are available, we are led back to the original problem that conventional fitting methods may be incapable of fitting more than a limited subset without sparse-data bias or convergence failure.

The above issues become more difficult when building propensity-score (exposure) models, for then the only association being tested by stepwise selection is that of the potential confounder with exposure. This reorientation eliminates concern about illusory precision due to preliminary testing in cohort studies. Nonetheless, the overfitting it produces can lead to bias in case-control studies,76 77 and the inclusion of disease in the exposure model to remove this bias resurrects concerns about illusory precision. Furthermore, variables that predict exposure well will be preferentially selected even if they are not risk factors, which can produce bias and unnecessary variance inflation (by worsening the multicollinearity problems we discuss below).46-48 54 78 79

Change-in-estimate strategies

Since the late 1970s, the modeling approaches in many epidemiologic textbooks and articles resemble stepwise regression, but with one important difference: decisions about adding or deleting variables from the model are made on the basis of the change in the size of the main exposure coefficient or its antilog, rather than significance testing of the covariate42-44 (for examples, see80-82). Later versions2 58 suggest using change in the confidence limits instead, since the confidence interval is usually the final analysis product. One caution to these approaches is that an accurate assessment of confounding may require examining changes from moving groups of variables. Another caution, which arises if the disease frequency is high and measured by odds or rates, is that the change may partly reflect noncollapsibility of the effect measure rather than confounding.53 83 Nonetheless, these change-in-estimate (CIE) approaches are an improvement over selection based only on noncausal prediction2 84. Confounding is assessed by dropping a variable if its control produces less than a given change in the estimated exposure effect on the scale used for contextual interpretation. In logistic and loglinear regressions this change is usually taken to be 10% of the ratio effect (the antilog of the exposure coefficient). Others have extended CIE to choosing the form of the confounder (e.g., linear vs. quadratic)42.
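The 10% ratio-scale criterion can be sketched as follows (the coefficient values are hypothetical, and the 10% cut-point is the convention described above, not a recommendation):

```python
import math

def keep_confounder_cie(beta_adjusted, beta_unadjusted, cut=0.10):
    """Change-in-estimate on the ratio scale: keep the covariate when dropping it
    moves the estimated ratio effect (antilog of the coefficient) by >= `cut`."""
    ratio_change = math.exp(beta_unadjusted - beta_adjusted)
    return abs(ratio_change - 1) >= cut

big_shift = keep_confounder_cie(beta_adjusted=0.40, beta_unadjusted=0.55)    # True: ~16% shift
small_shift = keep_confounder_cie(beta_adjusted=0.40, beta_unadjusted=0.42)  # False: ~2% shift
```

The comparison is made on the scale used for contextual interpretation; as the text cautions, single-variable comparisons like this can miss confounding that only appears when groups of variables are moved together.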

Confounding

Both stepwise and CIE strategies assess confounding solely on the basis of the analysis data, ignoring the a priori causal considerations that led to consideration of certain variables as potential confounders85. Because confounding has this causal component, purely data-based procedures can be even more misleading for confounder selection than they are in noncausal prediction problems.85 86

Both strategies also have parsimony as a key goal: stepwise regression aims to achieve a model which maximizes prediction using few covariates, taking parsimony as an important goal in itself; CIE aims to achieve a model that controls confounding well with few variables, arguing for deletion of weak confounders from the model because "the use of a reduced model … can sometimes lead to a gain in precision"43. Again, in practice the improvement in precision can be largely illusory, since the component of variance due to selection is ignored in the standard errors computed from the final model. Some precision enhancement can be obtained, however, when there is severe multicollinearity (see below).

In fact, if our goal is to estimate the effect of a particular exposure, then there is no logical reason to favour a parsimonious model. Rather, it can be argued that we should adjust for all measured potential confounders, or at least a maximal number.2 48 87 88 No simple selection strategy has proven uniformly better than this approach.5 68 84 Nonetheless, although it can provide a reasonable starting point, it may result in inflated or unstable estimates due to data sparsity or collinearity of the covariates with exposure.5 68 In particular, if we include nonconfounders that are strongly associated with the main exposure (e.g. cigarette-lighter use when studying smoking), then the main exposure effect estimates may be unbiased but have unnecessarily wide confidence intervals.3 46 In this situation, it may be best to reduce sources of sparsity and multicollinearity to improve estimation accuracy. We revisit this issue below.

Significance testing (as in stepwise) and CIE are not the only possible approaches to obtaining a "reduced" set of variables when it is not possible to adjust for all possible confounders. For example, one simulation study72 compared (i) CIE; (ii) significance-test-of-the-covariate; (iii) significance-test-of-the-change (collapsibility testing); (iv) equivalence-test-of-the-change; and (v) a hybrid strategy that takes a weighted average of adjusted and unadjusted estimates. CIE and equivalence-test-of-the-change strategies performed best when the cut-point for deciding whether crude and adjusted estimates differed by an important amount was set to a low value (e.g. 10%), whereas significance-test strategies performed best when the α-level was very high (e.g. using P<0.20 instead of P<0.05). In other words, good performance appeared to hinge more on avoiding false negatives (excluding strong confounders) than on avoiding false positives (including weak confounders or nonconfounders).

Discussion

Like more sophisticated but computationally intensive methods,22 the strategies we describe differ from stepwise regression and other purely predictive approaches in that their goal is to improve accuracy of exposure effect estimates rather than to simply predict outcomes (tables 1-2). At the same time, recognizing that the gap between state-of-the-art methods and what is done in most publications has only grown over time, they are intended to fall within the limits on software and effort that constrain typical researchers. Thus, parsimony is replaced by the goal of minimizing error in effect estimation.


A related point is that, as with parsimony, pursuit of goodness-of-fit may lead to inappropriate decisions about confounder control; in particular, some variables may not be included in the model because they do not significantly improve the fit, even though they are important confounders. “Global” tests of fit are especially inadequate for confounder selection16 since there can be many “good-fitting” models that correspond to very different confounder effects and exposure effect estimates3.

Parsimony and goodness-of-fit are helpful only to the extent they reduce variance and bias of the targeted effect estimate. The general inappropriateness of parsimony as a goal in causal analysis is supported by simulation studies in which full-model analysis has often outperformed conventional selection strategies.72 74 84 This result raises the question: if we can control for all potential confounders, then why wouldn’t we? If indeed we have numbers so large that there is no problem from controlling too many variables, we would generally expect covariate elimination to provide little benefit for the accuracy of effect estimates. But the harsh reality is that even databases of studies with hundreds of thousands of patients often face severe limits in crucial categories, such as the number of exposed cases. Coupled with the availability of what may be hundreds or even thousands of variables, some kind of algorithmic approach to potential confounders becomes essential.89 90 The strategies we describe are designed for common borderline situations in which control of all the variables may be possible, but some accuracy improvement may be expected from eliminating some or all variables whose inclusion is of uncertain benefit.

Conclusions

Epidemiology students are often taught that statistical significance testing is inappropriate for confounder evaluation and that change-in-estimate is preferable.16 42-44 91 Nonetheless, they are also often taught that the goal of modeling is to achieve a model that is as simple as possible while providing an adequate fit to the data, without recognition that this goal is closer to that of noncausal prediction (as in ordinary stepwise regression) than to that of accurate effect estimation. Confusion is exacerbated when the different goals of noncausal prediction and effect estimation are not made explicit. Different goals produce different modeling strategies, and thus it is important to delineate these goals. Furthermore, it is important to assess the shortcomings of the various strategies in simulations and case studies involving a variety of conditions.92 Regardless of the strategy adopted, however, it is important that authors document how they chose their models, so that readers can interpret their results in light of the strengths and weaknesses attendant to the strategy that they used.

We would prefer to see a shift toward methods with good theoretical foundation, especially computationally intensive methods based on robustness, resampling, and cross-validation. Nonetheless, our experience indicates that considerable software development and training will be needed before more primitive strategies can be retired. Furthermore, no strategy is foolproof, and none can compensate for bias due to uncontrolled confounding, selection bias, or measurement error.93-95

Acknowledgements

We would like to thank Tony Blakely, Naomi Brewer, Tim Clayton, Simon Cousens, Rhian Daniel, Shah Ebrahim, Chris Frost, Michael Hills, Katherine Hoggatt, Nancy Krieger, Dorothea Nitsch, George Davey Smith, Lorenzo Richiardi, Bianca de Stavola, and Jan Vandenbroucke, and the reviewers, for their valuable comments on the revisions. The Centre for Public Health Research is supported by a Programme Grant from the Health Research Council of New Zealand.

Key messages:

• When we can discern a modeling strategy in epidemiological and medical studies, it is usually a variant of one of these approaches: (i) use of a model with all measured variables a priori identified as confounders; (ii) significance testing (as in stepwise regression) to eliminate some of the variables in this list; or (iii) a change-in-estimate approach to eliminate variables.

• The main goal of a statistical analysis of effects should be the production of the most accurate (valid and precise) effect estimates obtainable from the data and available software.

• This goal is quite different from that of variable selection, which is to obtain a model which predicts observed outcomes well with the minimal number of variables; it is only indirectly related to the goal of change-in-estimate approaches, which is to obtain a model that controls most or all confounding with a minimal number of variables.

• Regardless of the modeling strategy chosen, it is important that authors document the strategy used so that readers can interpret the results in light of the strengths and weaknesses of the strategy.


REFERENCES
1. Greenland S, Pearce N. Modeling strategies in epidemiology: II. Basic alternatives. 2013; submitted for publication.
2. Greenland S. Modeling and variable selection in epidemiologic analysis. American Journal of Public Health 1989;79(3):340-49.
3. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. American Journal of Epidemiology 1986;123:392-402.
4. Vandenbroucke JP. Should we abandon statistical modeling altogether? American Journal of Epidemiology 1987;126(1):10-13.
5. Greenland S. Invited commentary: Variable selection versus shrinkage in the control of multiple confounders. American Journal of Epidemiology 2008;167(5):523-29.
6. Greenland S, Pearce N. Modeling strategies in epidemiology: III. Alternative methods for confounder control. 2013; in preparation.
7. Greenland S, Pearce N. Modeling strategies in epidemiology: IV. Assessment of heterogeneity. 2013; in preparation.
8. Greenland S. Chapter 20: Introduction to regression models. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
9. McCullagh P, Nelder JA. Generalized linear models. 2nd ed. New York: Chapman and Hall, 1989.
10. Greenland S, Rothman KJ. Chapter 13: Fundamentals of epidemiologic data analysis. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
11. von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 2007;370(9596):1453-57.
12. Chatfield C. Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society, Series A 1995;158:419-66.
13. Chatfield C. Confessions of a pragmatic statistician. Journal of the Royal Statistical Society, Series D (The Statistician) 2002;51:1-20.
14. Greenland S, Schlesselman JJ, Criqui MH. The fallacy of employing standardized regression coefficients and correlations as measures of effect. American Journal of Epidemiology 1986;123:203-08.
15. Greenland S, Maclure M, Schlesselman JJ, Poole C. Standardized regression coefficients: a further critique and review of some alternatives. Epidemiology 1991;2:387-92.
16. Greenland S. Chapter 21: Introduction to regression modeling. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
17. Harrell F. Regression modeling strategies. New York: Springer, 2001.
18. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference and prediction. 2nd ed. New York: Springer, 2009.
19. Leamer E. Specification searches. New York: Wiley, 1978.
20. Royston P, Sauerbrei W. Multivariable model-building: a pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. Chichester, UK: John Wiley & Sons, 2008.

21. Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. New York, NY: Springer, 2008.
22. van der Laan MJ, Rose S. Targeted learning: causal inference for observational and experimental data. New York: Springer, 2011.
23. Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Halloran ME, Berry D, editors. Institute for Mathematics and its Applications 116. New York: Springer, 1999:95-134.
24. Robins JM, Hernán MA. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Longitudinal data analysis. New York: Chapman and Hall/CRC Press, 2009.
25. VanderWeele TJ. Subtleties of explanatory language: what is meant by "mediation"? European Journal of Epidemiology 2011;26:343-46.
26. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods 2012; in press.
27. Efroymson MA. Multiple regression analysis. In: Ralston A, Wilf HS, editors. Mathematical methods for digital computers. Hoboken, NJ: Wiley, 1960.
28. Bancroft TA, Han CP. Inference based on conditional specification: a note and a bibliography. International Statistical Review 1977;45(2):117-27.
29. Copas JB. Regression, prediction and shrinkage. Journal of the Royal Statistical Society Series B (Methodological) 1983;45(3):311-54.
30. Draper NR, Guttman I, Lapczak L. Actual rejection levels in a certain stepwise test. Communications in Statistics Part A: Theory and Methods 1979;8(2):99-105.
31. Faraway JJ. On the cost of data analysis. Journal of Computational and Graphical Statistics 1992;1:213-19.
32. Flack VF, Chang PC. Frequency of selecting noise variables in subset regression analysis: a simulation study. American Statistician 1987;41(1):84-86.
33. Freedman DA. A note on screening regression equations. American Statistician 1983;37(2):152-55.
34. Freedman DA, Navidi W, Peters SC. On the impact of variable selection in fitting regression equations. In: Dijkstra TK, editor. On model uncertainty and its statistical implications. Berlin: Springer-Verlag, 1988:1-16.
35. Greenland S. Methods for epidemiologic analyses of multiple exposures: a review and a comparative study of maximum-likelihood, preliminary testing, and empirical-Bayes regression. Statistics in Medicine 1993;12:717-36.
36. Greenland S. When should epidemiologic regressions use random coefficients? Biometrics 2000;56(3):915-21.
37. Hurvich CM, Tsai CL. The impact of model selection on inference in linear regression. American Statistician 1990;44(3):214-17.
38. Sclove SL, Radhakrishnan R, Morris C. Non-optimality of preliminary-test estimators for the mean of a multivariate normal distribution. Annals of Mathematical Statistics 1972;43(5):1481-90.
39. Steyerberg EW, Eijkemans MJC, Habbema JDF. Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. Journal of Clinical Epidemiology 1999;52(10):935-42.
40. Viallefont V, Raftery AE, Richardson S. Variable selection and Bayesian model averaging in case-control studies. Statistics in Medicine 2001;20(21):3215-30.
41. Weiss RE. The influence of variable selection: a Bayesian diagnostic perspective. Journal of the American Statistical Association 1995;90(430):619-25.

42. Breslow NE, Day NE. Statistical methods in cancer research: Volume 1 - the analysis of case-control studies. Lyon: IARC, 1980.
43. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. Belmont, CA: Lifetime Learning Publications, 1982.
44. Schlesselman JJ. Case-control studies: design, conduct, analysis. New York: Oxford University Press, 1982.
45. Rosenbaum PR. Observational studies. New York: Springer, 2002.
46. Brookhart MA, Schneeweiss S, Rothman KJ, Glynn RJ, Avorn J, Sturmer T. Variable selection for propensity score models. American Journal of Epidemiology 2006;163(12):1149-56.
47. De Luna X, Waernbaum I, Richardson TS. Covariate selection for the nonparametric estimation of an average treatment effect. Biometrika 2011;98(4):861-75.
48. Vansteelandt S, Bekaert M, Claeskens G. On model selection and model misspecification in causal inference. Statistical Methods in Medical Research 2012;21(1):7-30.
49. Westreich D, Cole SR, Funk MJ, Brookhart MA, Stuermer T. The role of the c-statistic in variable selection for propensity score models. Pharmacoepidemiology and Drug Safety 2011;20(3):317-20.
50. Geisser S. Predictive inference: an introduction. New York: Chapman and Hall, 1993.
51. Cole SR, Platt RW, Schisterman EF, Chu HT, Westreich D, Richardson D, et al. Illustrating bias due to conditioning on a collider. International Journal of Epidemiology 2010;39(2):417-20.
52. Glymour MM, Greenland S. Causal diagrams. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
53. Greenland S, Pearl J. Adjustments and their consequences: collapsibility analysis using graphical models. International Statistical Review 2011;79(3):401-26.
54. Pearl J. On a class of bias-amplifying covariates that endanger effect estimates. In: Grunwald P, Spirtes P, editors. Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence. Corvallis, OR: AUAI, 2010:417-24.
55. Rothman KJ, Greenland S, Lash TL. Chapter 9: Validity in epidemiologic studies. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
56. Greenland S, Neutra RR. Control of confounding in the assessment of medical technology. International Journal of Epidemiology 1980;9:361-67.
57. Pearce N. Epidemiology in a changing world: variation, causation and ubiquitous risk factors. International Journal of Epidemiology 2011;40:503-12.
58. Greenland S, Rothman KJ. Chapter 15: Introduction to stratified analysis. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
59. Greenland S, Brumback B. An overview of relations among causal modelling methods. International Journal of Epidemiology 2002;31(5):1030-37.
60. Maldonado G, Greenland S. Estimating causal effects. International Journal of Epidemiology 2002;31(2):422-29.
61. Greenland S. The logic and philosophy of causal inference: a statistical perspective. In: Bandyopadhyay PS, Forster MR, editors. Handbook of the Philosophy of Statistics. North Holland: Elsevier, 2011:813-32.
62. Greenland S. Causal inference as a prediction problem: assumptions, identification, and evidence synthesis. In: Berzuini C, Dawid AP, Bernadinelli L, editors. Causal inference: statistical perspectives and applications. New York: Wiley, in press.
63. Rubin DB. Bayesian inference for causal effects: the role of randomization. Annals of Statistics 1978;6(1):34-58.

64. Vansteelandt S, Keiding N. G-computation: lost in translation? American Journal of Epidemiology 2011;173(7):739-42.
65. Pearl J. Causality: models, reasoning, and inference. 2nd ed. New York: Cambridge University Press, 2009.
66. Myers JA, Rassen JA, Gagne JJ, et al. Effects of adjusting for instrumental variables on bias and precision of effect estimates. American Journal of Epidemiology 2011;174:1223-27.
67. Krieger N. Epidemiology and the people's health: theory and context. New York: Oxford University Press, 2011.
68. Greenland S, Schwartzbaum JA, Finkle WD. Problems due to small samples and sparse data in conditional logistic regression analysis. American Journal of Epidemiology 2000;151(5):531-39.
69. Greenland S. Putting background information about relative risks into conjugate prior distributions. Biometrics 2001;57(3):663-70.
70. Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990;1:421-29.
71. Sterne JAC, Smith GD. Sifting the evidence: what's wrong with significance tests? BMJ 2001;322:226-31 (reprinted in Physical Therapy 2001;81(8):1464-69).
72. Maldonado G, Greenland S. Simulation study of confounder-selection strategies. American Journal of Epidemiology 1993;138(11):923-36.
73. Maldonado G, Greenland S. Interpreting model coefficients when the true model form is unknown. Epidemiology 1993;4(4):310-18.
74. Mickey RM, Greenland S. The impact of confounder selection criteria on effect estimation. American Journal of Epidemiology 1989;129(1):125-37.
75. Dales LD, Ury HK. An improper use of statistical significance testing in studying covariables. International Journal of Epidemiology 1978;4:373-75.
76. Mansson R, Joffe MM, Sun WG, Hennessy S. On the estimation and use of propensity scores in case-control and case-cohort studies. American Journal of Epidemiology 2007;166(3):332-39.
77. Tchetgen EJT, Rotnitzky A. Double-robust estimation of an exposure-outcome odds ratio adjusting for confounding in cohort and case-control studies. Statistics in Medicine 2011;30(4):335-47.
78. Day NE, Byar DP, Green SB. Overadjustment in case-control studies. American Journal of Epidemiology 1980;112(5):696-706.
79. Pearl J. Understanding bias amplification. American Journal of Epidemiology 2012;174:1223-27.
80. Alexander N, Rodriguez M, Perez L, Caicedo JC, Cruz J, Prieto G, et al. Case-control study of mosquito nets against malaria in the Amazon region of Colombia. American Journal of Tropical Medicine and Hygiene 2005;73(1):140-48.
81. Rodriguez MDM, Obasi A, Mosha F, Todd J, Brown D, Changalucha J, et al. Herpes simplex virus type 2 infection increases HIV incidence: a prospective study in rural Tanzania. AIDS 2002;16(3):451-62.
82. Snow RW, Peshu N, Forster D, Bomu G, Mitsanze E, Ngumbao E, et al. Environmental and entomological risk factors for the development of clinical malaria among children on the Kenyan coast. Transactions of the Royal Society of Tropical Medicine and Hygiene 1998;92(4):381-85.
83. Greenland S, Rothman KJ. Chapter 4: Measures of effect and measures of association. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.

84. Weng HY, Hsueh YH, Messam LLM, Hertz-Picciotto I. Methods of covariate selection: directed acyclic graphs and the change-in-estimate procedure. American Journal of Epidemiology 2009;169(10):1182-90.
85. Greenland S. The effect of misclassification in the presence of covariates. American Journal of Epidemiology 1980;112(4):564-69.
86. Pearce N, Greenland S. Confounding and interaction. In: Ahrens W, Krickeberg K, Pigeot I, editors. Handbook of epidemiology. Heidelberg: Springer-Verlag, 2004:375-401.
87. Greenland S. Bayesian perspectives for epidemiological research. II. Regression analysis. International Journal of Epidemiology 2007;36(1):195-202.
88. VanderWeele TJ, Shpitser I. A new criterion for confounder selection. Biometrics 2011;67(4):1406-13.
89. Joffe MM. Exhaustion, automation, theory, and confounding. Epidemiology 2009;20(4):523-24.
90. Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology 2009;20(4):512-22.
91. Checkoway H, Pearce N, Kriebel D. Research methods in occupational epidemiology. 2nd ed. New York: Oxford University Press, 2004.
92. Greenland S. Intuitions, simulations, theorems: the role and limits of methodology. Epidemiology 2012;23(3):440-42.
93. Greenland S. Sensitivity analysis and bias analysis. In: Ahrens W, Pigeot I, editors. Handbook of Epidemiology. 2nd ed. New York: Springer, 2013; in press.
94. Greenland S, Lash TL. Chapter 19: Bias analysis. In: Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
95. Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data. Boston: Springer, 2009.
