FIGURE 8.1 Research data work outputs
[Figure: outputs of research data work across the stages Design, Acquisition, Processing, Analysis, and Publication. Outputs shown: master data sets, data map, preanalysis plan, raw data, data acquisition documentation, clean data, data processing documentation, data processing code, analysis data, data analysis documentation, analysis code, raw outputs, research outputs, data publication package, and reproducibility package.]
Source: DIME (Development Impact Evaluation), World Bank.
The first chapter begins the book with a discussion of credibility, transparency, and reproducibility in research. The overarching idea is that research should always be accessible and available to others, both within and outside the research team. The handbook treats data work as a “social process” involving multiple team members with different roles and technical abilities. The fundamental theme of accessible open science research provides structure for all subsequent tasks and offers both private and public benefits: data work that is intentionally designed for others to interact with will also be easier for teams to collaborate on and maintain over time. This idea is carried through to the second chapter, which introduces the reader to the technical tools and concepts needed to develop a work environment conducive to accessible research. An open science approach necessitates cooperation with a diverse group of collaborators using modern approaches to computing technology. It requires collective agreement on specific tools and methods of collaboration and record keeping as well as on technical approaches like version control, file sharing, and directory organization.
DEVELOPMENT RESEARCH IN PRACTICE: THE DIME ANALYTICS DATA HANDBOOK