
6.1 Data analysis tasks and outputs

for the analysis (measurement variables) with the information in the data map describing the study design (research variables) creates original data sets that are ready for analysis, as shown in figure 6.1. Doing so is difficult, creative work, and it cannot be reproduced by someone who lacks access to the detailed records and explanations of how the data were interpreted and modified. The chapter stressed that code must be well organized and well documented to allow others to understand how research outputs were created and used to answer the research questions. The next chapter of this book provides a guide to assembling the raw findings into publishable work and describes methods for making data, code, documentation, and other research outputs accessible and reusable alongside the primary outputs.
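The merge described above can be sketched in a few lines. This is only an illustrative sketch, not the book's own code: the variable names (`hh_id`, `consumption`, `treatment`, `stratum`), the example values, and the helper `build_analysis_data` are all hypothetical. It assumes that both inputs are keyed by the same unit identifier and that the match should be strictly one-to-one, so any mismatch stops the script instead of passing silently into the analysis data.

```python
# Hypothetical example data: every name and value here is invented for illustration.

# Measurement variables: indicators constructed from the cleaned data.
measurement = [
    {"hh_id": 101, "consumption": 42.5},
    {"hh_id": 102, "consumption": 37.0},
    {"hh_id": 103, "consumption": 51.2},
]

# Research variables: study design information recorded in the data map.
research = [
    {"hh_id": 101, "treatment": 1, "stratum": "A"},
    {"hh_id": 102, "treatment": 0, "stratum": "A"},
    {"hh_id": 103, "treatment": 1, "stratum": "B"},
]


def build_analysis_data(measurement_rows, research_rows, key="hh_id"):
    """Merge measurement and research variables one-to-one on the unit ID."""
    # Index the research variables by ID, refusing duplicate IDs.
    design_by_id = {}
    for row in research_rows:
        if row[key] in design_by_id:
            raise ValueError(f"duplicate {key} in research variables: {row[key]}")
        design_by_id[row[key]] = row

    analysis_rows = []
    seen = set()
    for row in measurement_rows:
        unit = row[key]
        if unit in seen:
            raise ValueError(f"duplicate {key} in measurement variables: {unit}")
        if unit not in design_by_id:
            raise ValueError(f"{key} {unit} is missing from the research variables")
        seen.add(unit)
        # Combine the design information and the indicators into one record.
        analysis_rows.append({**design_by_id[unit], **row})

    # Every unit in the study design must also appear in the measurement data.
    unmatched = set(design_by_id) - seen
    if unmatched:
        raise ValueError(f"units without measurement variables: {sorted(unmatched)}")
    return analysis_rows


analysis_data = build_analysis_data(measurement, research)
```

Raising an error on any unmatched or duplicated unit is a deliberate choice in this sketch: it forces the mismatch to be explained and documented before the analysis data set is used.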

FIGURE 6.1 Data analysis tasks and outputs

[Flowchart not reproduced. Research stages shown: Design, Acquisition, Processing, Analysis, Publication. Tasks shown: integrate data sources; treat missing observations and distribution patterns; create analysis indicators; conduct exploratory analysis; decide whether to include each result in the final output (yes or no); format and export outputs. Data and outputs shown: clean data, merged data, construction documentation, analysis data, analysis data dictionary, analysis code, raw outputs, analysis archive.]

