Development Research in Practice

and, for each output it creates, explicitly loads data before analyzing them. This setup encourages data manipulation to be done earlier in the workflow (that is, in separate cleaning and construction scripts). It also prevents the common problem of having analysis scripts that depend on other analysis scripts being run before them. Such dependencies tend to require manual instructions so that all necessary chunks of code are run in the right order. Coding each task so that it is completely independent of all other code, except for the master script, is recommended. It is possible to go so far as to code every output in a separate script, but the key is to make sure that it is clear which data sets are used for each output and which code chunks implement each piece of analysis (see box 6.5 for an example of an analysis script structured like this).

BOX 6.5 WRITING ANALYSIS CODE: A CASE STUDY FROM THE DEMAND FOR SAFE SPACES PROJECT

The Demand for Safe Spaces team split the analysis scripts into one script per output and reloaded the analysis data before each output. This process ensured that the final exhibits could be generated independently from the analysis data. No variables were constructed in the analysis scripts: the only transformation performed was to subset the data or aggregate them to a higher unit of observation. This transformation guaranteed that the same data were used across all analysis scripts. The following is an example of a short analysis do-file:

/****************************************************************************************
*       Demand for "Safe Spaces": Avoiding Harassment and Stigma
*****************************************************************************************
    OUTLINE:     PART 1: Load data
                 PART 2: Run regressions
                 PART 3: Export table

    REQUIRES:    ${dt_final}/platform_survey_constructed.dta

    CREATES:     ${out_tables}/priming.tex

    WRITTEN BY:  Luiza Andrade

*****************************************************************************************
*       PART 1: Load data
****************************************************************************************/

    use "${dt_final}/platform_survey_constructed.dta", clear

/****************************************************************************************
*       PART 2: Run regressions
****************************************************************************************/

    reg scorereputation i.q_group, robust
    est sto priming1

(Box continues on next page)
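
The independence described above relies on a master script that defines the folder globals and then calls each analysis script. The following is a minimal sketch of what such a master do-file might look like, not the project's actual master script. The globals `dt_final` and `out_tables` appear in the do-file above; the root path, the `project` and `do_analysis` globals, and the script names `priming.do` and `other_output.do` are hypothetical and used only for illustration.

```stata
* Hypothetical master do-file: a sketch under stated assumptions.
* Only ${dt_final} and ${out_tables} come from the example above;
* every other path, global, and file name here is a placeholder.

clear all
set more off

* Root folder of the project (placeholder; replace with a real path)
global project     "C:/path/to/project"

* Folder globals referenced by the analysis scripts
global dt_final    "${project}/data/final"
global out_tables  "${project}/outputs/tables"
global do_analysis "${project}/code/analysis"

* One script per output. Each script starts with its own
* use "...", clear so it does not rely on what ran before it.
do "${do_analysis}/priming.do"
do "${do_analysis}/other_output.do"
```

Because each analysis script reloads the constructed data itself, the two `do` lines could be reordered, or one of them commented out, without affecting the other output.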

CHAPTER 6: CONSTRUCTING AND ANALYZING RESEARCH DATA
