The OECD Statistics Newsletter, Issue 74, July 2021


Improving the quality of foreign aid data through machine learning

Shashwat Koirala (shashwat.koirala@oecd.org), Development Co-operation Directorate, Pedro Asti (pedro.asti@oecd.org) and Jan-Anno Schuur (Jan-Anno.schuur@oecd.org), Digital, Knowledge and Information Service, Executive Directorate, OECD

The OECD Creditor Reporting System (CRS) is an activity-level database that enables analysis of where foreign aid goes, what purposes it serves and what policies it aims to implement. It contains financial and qualitative data, as well as project descriptions, for each individual aid project or programme. During the annual data collection process, the OECD ensures that the reported data meet the measurement standards for development finance. This includes using the descriptive information for each project to verify the accuracy of key reporting variables (e.g. sector codes), which is time-intensive because it entails manually reading individual project descriptions. The OECD’s Development Co-operation Directorate (DCD) and the Digital, Knowledge and Information Service (DKI) are collaborating on a semantic analysis tool to automate the verification of reported CRS sector codes. This tool, which will be launched in 2021, will not only speed up the data validation process, but also enhance the quality of CRS data. In addition, it will become part of a free web-based service that will allow donors to submit a text string (whether manually or through an Application Programming Interface) to validate an existing sector code or obtain suggestions for more appropriate codes, thereby improving the alignment between purpose codes and project descriptions even before the data are reported to the OECD.
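The validate-or-suggest workflow described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the article does not describe the tool's model or its interface, so simple keyword overlap stands in for the semantic analysis here, and the function names (`suggest_codes`, `validate_code`), the keyword lists and the code-to-topic pairings are all hypothetical.

```python
import re

# Hypothetical keyword lists per purpose code. A real tool would learn such
# associations from labelled project descriptions rather than hard-code them.
KEYWORDS = {
    "11320": {"secondary", "school", "teacher"},
    "11420": {"university", "scholarship", "higher"},
    "12220": {"clinic", "health", "vaccination"},
}


def suggest_codes(description: str, top_n: int = 2) -> list[str]:
    """Rank purpose codes by keyword overlap with the description (toy scorer)."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    scores = {code: len(tokens & words) for code, words in KEYWORDS.items()}
    # Keep only codes with at least one matching keyword, best score first.
    ranked = sorted(
        (code for code, score in scores.items() if score > 0),
        key=lambda code: (-scores[code], code),
    )
    return ranked[:top_n]


def validate_code(description: str, reported_code: str) -> dict:
    """Check a reported code against the suggestions for the same text."""
    suggestions = suggest_codes(description)
    return {
        "reported_code": reported_code,
        "consistent": reported_code in suggestions,
        "suggestions": suggestions,
    }


result = validate_code("University programme for engineering students", "12220")
```

In this sketch a reported code is flagged as inconsistent whenever it is absent from the suggestions, so a reviewer would only need to read the flagged descriptions.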

CRS and the use of purpose codes to track sectoral focus of aid

The CRS captures, on average, 250,000 aid activities per year, which are reported by 30 Development Assistance Committee (DAC) members, about 40 multilateral institutions, 25 non-DAC providers and 41 private foundations. For each aid activity, the database includes information on around 60 dimensions, broadly grouped into six categories (see Figure 1). In addition to data on the financial aspects of aid, the CRS contains information on the targeted sector, channel of delivery, co-operation modality, and targeted policy and environmental objectives, as well as project titles and short and long descriptions that summarise the main objectives of the project. Purpose codes, which convey information on a given project’s sector of focus, are a crucial element of the CRS database. These five-digit codes not only distinguish between broad sectors (e.g. education versus health), but also delineate different activities within the same sector (e.g. secondary education versus higher education). This enables a granular understanding of an aid provider’s activities and facilitates detailed statistical analysis of which sectors and sub-sectors the aid is targeting. This information is used, for example, to monitor aid targets like SDG 4.b.1 on the volume of official development assistance flows for scholarships by sector.
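The sector and sub-sector analysis described above can be sketched in a few lines. The example assumes that the leading digits of a five-digit purpose code identify its sector, with two digits standing for the broad sector and three for the sub-sector. The records, labels and amounts are made up for illustration and should be checked against the official CRS code list.

```python
from collections import defaultdict

# Illustrative records only: codes, labels and amounts are invented for this
# example and are not taken from the CRS database.
activities = [
    {"purpose_code": "11320", "label": "secondary education", "amount": 2.0},
    {"purpose_code": "11420", "label": "higher education", "amount": 3.5},
    {"purpose_code": "12220", "label": "basic health care", "amount": 1.5},
]


def sector_prefix(purpose_code: str, digits: int = 3) -> str:
    """Return the leading digits of a five-digit purpose code."""
    if len(purpose_code) != 5 or not purpose_code.isdigit():
        raise ValueError(f"expected a five-digit code, got {purpose_code!r}")
    return purpose_code[:digits]


def totals_by_prefix(records, digits: int = 3) -> dict:
    """Sum amounts over all records that share the same code prefix."""
    totals = defaultdict(float)
    for record in records:
        totals[sector_prefix(record["purpose_code"], digits)] += record["amount"]
    return dict(totals)


# Assumption: three digits separate sub-sectors, two digits separate broad sectors.
by_sub_sector = totals_by_prefix(activities, digits=3)
by_broad_sector = totals_by_prefix(activities, digits=2)
```

Under these assumptions, the two education activities fall into different three-digit groups but share a two-digit group, while the health activity stays separate at both levels.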

Figure 1. Components of the CRS database

Creditor Reporting System database:
Identification data (e.g. commitment date, donor country or organisation)
Basic data (e.g. recipient country, short description, sector/purpose code)
Supplementary data (e.g. long description, policy objectives)
Volume data (e.g. commitment amount, disbursement amount)
Loan data (e.g. interest rates, repayment period)
Private sector instrument data (if applicable)


