
Chapter 4: Acquiring development data
Many research questions require original data because no source of publicly available data addresses the inputs or outcomes of interest for the relevant population. Data acquisition can take many forms, including primary data generated through surveys; private sector partnerships granting access to new data sources, such as administrative and sensor data; digitization of paper records, including administrative data; web scraping; data captured by unmanned aerial vehicles or other types of remote sensing; and novel integration of various types of data sets, such as combining survey and sensor data. Much of the recent push toward credibility in the social sciences has focused on analytical practices. However, credible development research depends, first and foremost, on the quality of the data acquired. Clear and careful documentation of the data acquisition process is essential for research to be reproducible.
This chapter covers reproducible data acquisition, special considerations for generating high-quality survey data, and protocols for handling confidential data safely and securely. The first section discusses acquiring data reproducibly, by establishing and documenting the right to use the data. This discussion applies to all original data, whether collected for the first time through surveys or sensors or acquired through a unique partnership. The second section examines the process of acquiring data through surveys, which is typically more involved than acquiring secondary data and has more in-built opportunities for quality control. It provides detailed guidance on the electronic survey workflow, from designing electronic survey instruments to monitoring data quality once fieldwork is ongoing. The final section discusses handling data safely, providing guidance on how to receive, transfer, store, and share confidential data. Secure file management is a basic requirement for complying with the legal and ethical agreements that allow access to personal information for research purposes. Box 4.1 summarizes the main points, lists the responsibilities of different members of the research team, and supplies a list of key tools and resources for implementing the recommended practices.
BOX 4.1 SUMMARY: ACQUIRING DEVELOPMENT DATA
The process of obtaining research data is unique to every project. However, some basic structures and processes are common to both data acquired from others and data generated by surveys:
1. When receiving data from others, ownership and licensing are critical. Before any data are transferred, knowing all of the formal rights associated with those data is essential:
• Ensure that the partner has the right to share the data, especially data containing personally identifying information.
• Identify the data owner and any restrictions on the use, storage, or handling of data.
• Secure a data use agreement or license from the partner, outlining the rights and responsibilities regarding analysis, publication of results and derived data, redistribution of data, and data destruction.
2. Collecting high-quality data requires careful planning and attention to detail throughout the workflow. The following best practices apply to all surveys, with further details for electronic surveys:
• Produce and pilot draft instruments on paper and focus on survey content.
• Structure questionnaires for electronic programming and pilot them for function, considering features like pagination, ordering, looping, conditional execution, and instructions for enumerators.
• Test data outputs for analytical compatibility, such as code-friendly variable and value labels.
• Train enumerators carefully, using a paper survey before an electronic template, assess their performance objectively throughout training, and transparently select the top performers.
• Assess data quality in real time, through scripted high-frequency checks and diligent field validation.
3. No matter how data are acquired, handling data securely is essential:
• Encrypt data on all devices, in transit and at rest, beginning from the point of collection and including all intermediate locations such as servers and local devices.
• Store encryption keys using appropriate password management software with strong, unique passwords for all applications and devices with access.
• Back up data in case of total loss or failure of hardware and software at any site.
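The scripted high-frequency checks mentioned under point 2 can take many forms, and the right set depends on the survey. The Python sketch below is one minimal illustration, not a prescribed implementation: it flags duplicate IDs, missing required fields, out-of-range answers, and unusually short interviews in a batch of submissions. The field names (household_id, enumerator_id, age, duration_minutes) and the thresholds are hypothetical and would need to match the actual instrument.

```python
from collections import Counter

# Hypothetical field names and thresholds, chosen only for illustration.
REQUIRED_FIELDS = ("household_id", "enumerator_id", "age")
AGE_RANGE = (0, 120)
MIN_DURATION_MINUTES = 10


def high_frequency_checks(submissions):
    """Return a list of (household_id, issue) flags for one batch of submissions."""
    flags = []

    # Count how often each household ID appears in the batch.
    id_counts = Counter(row.get("household_id") for row in submissions)

    for row in submissions:
        hh_id = row.get("household_id")

        # Check 1: each household ID should appear only once.
        if hh_id is not None and id_counts[hh_id] > 1:
            flags.append((hh_id, "duplicate household_id"))

        # Check 2: required fields must not be missing or blank.
        for field in REQUIRED_FIELDS:
            if row.get(field) in (None, ""):
                flags.append((hh_id, f"missing {field}"))

        # Check 3: numeric answers must fall in a plausible range.
        age = row.get("age")
        if isinstance(age, (int, float)) and not AGE_RANGE[0] <= age <= AGE_RANGE[1]:
            flags.append((hh_id, "age out of range"))

        # Check 4: very short interviews may signal rushed or fabricated data.
        duration = row.get("duration_minutes")
        if isinstance(duration, (int, float)) and duration < MIN_DURATION_MINUTES:
            flags.append((hh_id, "interview too short"))

    return flags


# Made-up example batch, used only to show how the checks are called.
sample = [
    {"household_id": "H001", "enumerator_id": "E1", "age": 34, "duration_minutes": 42},
    {"household_id": "H002", "enumerator_id": "E1", "age": 150, "duration_minutes": 38},
    {"household_id": "H002", "enumerator_id": "E2", "age": 29, "duration_minutes": 4},
    {"household_id": "H003", "enumerator_id": "", "age": 51, "duration_minutes": 40},
]

for household_id, issue in high_frequency_checks(sample):
    print(household_id, issue)
```

Running a script of this kind on each new batch of submissions, rather than once at the end of fieldwork, is what allows issues to be raised with field teams while they can still be corrected.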
Key responsibilities for task team leaders and principal investigators
• Obtain appropriate legal documentation and permission agreements for all data.
• For surveys, guide and supervise development of all instruments.
• For surveys, review and provide inputs to the project’s data quality assurance plan.
• For surveys, guide decisions on how to correct issues identified during data quality checks.
• Oversee implementation of security measures and manage access codes, encryption keys, and hardware.
• Determine and communicate institutionally appropriate data storage and backup plans.
Key responsibilities for research assistants
• Coordinate with data providers, develop required technical documentation, and archive all final documentation.
• For surveys, draft, refine, and program all survey instruments, following best practices for electronic survey programming and maintaining up-to-date and version-controlled paper and electronic versions.
• For surveys, coordinate closely with field staff on survey pilots and contribute to enumerator manuals.
• For surveys, draft a data quality assurance plan and manage the quality assurance process.
• Implement storage and security measures for all data.
Key resources
• Manage Successful Impact Evaluation Surveys, a course covering best practices for the survey workflow, from planning to piloting instruments and monitoring data quality, at https://osf.io/resya
• DIME Analytics Continuing Education for field coordinators, technical trainings and courses for staff implementing field surveys that are updated regularly, at https://osf.io/gmn38
• SurveyCTO coding practices, a suite of DIME Wiki articles covering common approaches to sophisticated design and programming in SurveyCTO, at https://dimewiki.worldbank.org/SurveyCTO_Coding_Practices
• Monitoring data quality, a DIME Wiki article covering communication, field monitoring, minimizing attrition, back-checks, and data quality checks, at https://dimewiki.worldbank.org/Monitoring_Data_Quality
Acquiring data ethically and reproducibly
Clearly establishing and documenting access to data are critical for reproducible research. This section provides guidelines for establishing data ownership, receiving data from development partners, and documenting the research team’s right to use data. Researchers are responsible not only for respecting the rights of both people who own and people who are described by the data but also for making that information as available and as accessible as possible. These twin responsibilities can and do come into tension, so it is important for everyone on the team to be informed about what everyone else is doing. Writing down and agreeing to specific details is a good way to do that.