BOX 6.6 ORGANIZING ANALYSIS CODE: A CASE STUDY FROM THE DEMAND FOR SAFE SPACES PROJECT

The Demand for Safe Spaces team defined the control variables in globals in the master analysis script. Doing so guaranteed that control variables were used consistently across regressions. It also provided an easy way to update control variables consistently across all regressions when needed. In an analysis script, a regression that includes all demographic controls would then be expressed as regress y x ${demographics}.

    /****************************************************************************************
    *       Set control variables
    ****************************************************************************************/

        global star               star (* .1 ** .05 *** .01)
        global demographics       d_lowed d_young d_single d_employed d_highses
        global interactionvars    pink_highcompliance mixed_highcompliance        ///
                                  pink_lowcompliance mixed_lowcompliance
        global interactionvars_oc pos_highcompliance zero_highcompliance          ///
                                  pos_lowcompliance zero_lowcompliance
        global wellbeing          CO_concern CO_feel_level CO_happy CO_sad        ///
                                  CO_tense CO_relaxed CO_frustrated CO_satisfied  ///
                                  CO_feel_compare

        * Balance variables (Table 1)
        global balancevars1       d_employed age_year educ_year ride_frequency    ///
                                  home_rate_allcrime home_rate_violent            ///
                                  home_rate_theft grope_pink_cont grope_mixed_cont ///
                                  comments_pink_cont comments_mixed_cont
        global balancevars2       usual_car_cont nocomp_30_cont nocomp_65_cont    ///
                                  fullcomp_30_cont fullcomp_65_cont

        * Other adjustment margins (Table A7)
        global adjustind          CI_wait_time_min d_against_traffic CO_switch    ///
                                  RI_spot CI_time_AM CI_time_PM

For the complete master do-file from which this code is excerpted, visit the GitHub repository at https://git.io/JtgeT.
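
The pattern of defining a control set once and reusing it can be tried with a minimal sketch like the one below. It is only an illustration under stated assumptions: it uses Stata's built-in auto data set rather than the Demand for Safe Spaces data, and the global name carcontrols and the choice of variables are hypothetical, not taken from the project's do-files.

```stata
* Illustrative sketch only: uses Stata's built-in auto data set,
* not the Demand for Safe Spaces data. The global name carcontrols
* and the variables chosen as controls are hypothetical.
sysuse auto, clear

* Define the control set once, as a master script would
global carcontrols weight length foreign

* Every regression that references the global uses the same controls
regress price mpg      ${carcontrols}
regress price headroom ${carcontrols}

* Redefining the global in one place changes all later regressions
global carcontrols weight length foreign turn
regress price mpg      ${carcontrols}
```

Because each regression refers to ${carcontrols} rather than repeating the variable list, a change to the control set is made in a single line and cannot be applied to some regressions but forgotten in others.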
Creating this setup entails having an effective data management system, including file naming, organization, and version control. Just as for the analysis data sets, each of the individual analysis files needs to have a descriptive name. File names such as spatial-diff-indiff.do, matching-villages.R, and summary-statistics.py are clear indicators of what each file is doing and make it easy to find code quickly. If the script files will be ordered numerically to correspond to