O.6 Factors to Consider in Choosing a Targeting Method

from Revisiting Targeting in Social Assistance


Overview | 23

delivery systems. Demographic targeting is another mainstay with many programs for children, the elderly, or their families.

The literature is not definitive in ranking among targeting methods, as context and capacities shape the possibilities, nor is it definitive in matching methods to contexts, as preferences and history shape choices. Although there are some patterns as described, there is enormous variation in implementation and virtually every combination of methods has been used somewhere, sometimes in seemingly unlikely places.

The question of whether to use simpler methods, such as self-targeting, geographic targeting, or demographic targeting, or to develop household-specific methods must be based on “fit for purpose” as well as context and capacities (figure O.6). Using geographic targeting to select only some areas in which to work may fit well with geographically delineated natural disasters, but it occasions large errors of exclusion for poverty-oriented programs in normal times. Demographic targeting for some purposes is axiomatically ideal; but for poverty-related purposes, it will be inexact although possibly pragmatic. Further, the fit for purpose can vary by program within a given country. For a school lunch program, geographic targeting to poor areas, possibly excluding any categories of schools (private or upper secondary) that serve students who are less poor, may be appropriate, whereas for a last resort income support program, some household-specific targeting may be used.

Figure O.6 Factors to Consider in Choosing a Targeting Method

METHODS
• Self-targeting (transaction costs, prestige)
• Self-targeting (design features)
• Categorical: place (geographic), age, disability, civil status
• Lottery
• Welfare based: means test, hybrid means test, proxy means test, community based

PURPOSE
• Principally poverty/inequality?
• Principally supporting people in other defined categories?
• Shock response?

FEASIBILITY
• Financial data, and technical capacities?
• Degree of inequality?
• Path dependency?
• Political economy considerations?

Choice of method(s)

Source: Original compilation for this publication.

For household-specific targeting, there is a fairly clear order of preference, although sometimes the context narrows the choice set considerably. In some countries, means testing is feasible and, with no inbuilt statistical errors, it is easily adopted as the first choice. This range of countries may be extended by hybrid means testing, which may have some errors in the imputation of informal income, but the imputations affect only some households and only some of their incomes, so errors are still lower than in many other methods. The range of countries where such methods are applicable is increasing with the secular trend in data availability, and the methods may be applicable in still more countries for programs where the eligibility threshold is set high. In some countries with high informality, means testing or hybrid means testing will not be able to distinguish the poorest, but it may be able to rule out the wealthiest. However, in many developing countries, the degree of informality implies that means testing, or even hybrid means testing, will not be very accurate and so those choices are often deemed to be off the table. In these cases, proxy means testing, community-based targeting, or some combination becomes the common option. In many places, the community has long been a part of targeting processes and, although its role may change from full-out decision making to more supportive roles in outreach, data collection, and monitoring, the community will maintain a degree of involvement due to path dependency. Conversely, in some settings, the degree of community cohesion may not allow community-based targeting. This may be true in urban settings where density and mobility (in both residence and where time is spent during each day) are so high that people do not know their neighbors well, or where geographical communities are socially divided by ethnicity or conflict. Still, proxy means testing, community-based targeting, and combinations remain on the table and are used in a large share of developing countries. Where these are insufficient or undesired, some rationing, such as by geography, demography, other observable characteristics, self-targeting, or even lottery, may still be an option.

In countries that have a well-developed method of household-specific assessment, multiple programs that use household-specific assessments may use the same means of assessment, although with possibly varying thresholds or ancillary criteria. Using a common process or shared social registry as the entry point for multiple social programs has benefits and risks. By providing a shared format, it harmonizes the information collected and can add coherence across social policy. By serving as a common portal, it can lower the costs of application as a household may have to apply only once to receive multiple sets of benefits, or at least receive cross-referrals that improve their knowledge of programs from which they may benefit. Similarly, shared registries may lower the total administrative effort governments put into outreach, intake, and registration, and by uniting the efforts across various programs, they may be able to amass resources and gravitas to do the work well. However, in concentrating provision, a common process also concentrates risks as any failure in outreach or process affects not just a single program but many. The heavier the use of the registry in social policy, the more important it is that it be dynamic, inclusive, and accurate.
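As an illustration of one household assessment serving several programs with different thresholds and ancillary criteria, the following is a minimal sketch. The program names, the welfare score scale, the thresholds, and the child criterion are all hypothetical and are not drawn from any actual registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    max_score: float              # hypothetical threshold on the shared welfare score
    requires_child: bool = False  # hypothetical ancillary criterion


@dataclass(frozen=True)
class Household:
    household_id: str
    welfare_score: float  # assessed once, at registry intake
    has_child: bool


def eligible_programs(household, programs):
    """Return the names of the programs for which the household qualifies."""
    return [
        p.name
        for p in programs
        if household.welfare_score <= p.max_score
        and (household.has_child or not p.requires_child)
    ]


# Illustrative programs sharing one assessment but using different thresholds.
PROGRAMS = [
    Program("last_resort_income_support", max_score=20.0),
    Program("child_grant", max_score=40.0, requires_child=True),
    Program("health_fee_waiver", max_score=60.0),
]

household = Household("hh-001", welfare_score=35.0, has_child=True)
print(eligible_programs(household, PROGRAMS))
# prints ['child_grant', 'health_fee_waiver']
```

The same property that makes this convenient also shows the concentration of risk: an error in the single `welfare_score` propagates to every program that reads it.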

The use of multiple targeting methods is common, but it is not necessary and sometimes not ideal. For example, although categorical methods can help prioritize when budgets are limited or sequencing of the rollout of a new program is required, they are guaranteed to exclude poor people as poverty affects some people in all places, all ages, all genders, or other group definitions. So, if a country has developed the capacity for a household-specific system, does a categorical system add value?

Message 7. There are better and worse ways to implement each targeting method, and lessons have been learned over time.

No matter what targeting method is chosen, its application must be carefully customized. There is no guarantee that what worked well in one place will work the same in another. Although general principles may carry through from one context to another, customization will be needed to account for country- and program-specific details such as the purpose and design of the program; availability of data; capacity of the delivery system; characteristics of the population of concern; and weight put on the different desirable but sometimes conflicting traits of low errors of exclusion, low errors of inclusion, and low costs of all the various sorts. Customization includes the definition of the assistance unit, the thresholds used for each program, the roles assigned to each institutional actor, and the like. Customization involves the detailed decisions involved in moving from abstract concepts to implementation, as described in chapter 3; the delivery system, described in chapter 4; and each method’s know-how, described in chapter 6.
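To make the competing error concepts concrete, the sketch below computes errors of exclusion and inclusion under one commonly used pair of definitions: the exclusion error as the share of the intended population that is not reached, and the inclusion error as the share of beneficiaries who are outside the intended population. Other denominators are also used in practice, and the ten-person population here is invented for illustration.

```python
def targeting_errors(is_poor, is_beneficiary):
    """Return (exclusion_error, inclusion_error) for paired boolean lists.

    exclusion_error: share of the poor who are not beneficiaries.
    inclusion_error: share of beneficiaries who are not poor.
    """
    pairs = list(zip(is_poor, is_beneficiary))
    n_poor = sum(1 for poor, _ in pairs if poor)
    n_beneficiaries = sum(1 for _, benef in pairs if benef)
    excluded_poor = sum(1 for poor, benef in pairs if poor and not benef)
    included_nonpoor = sum(1 for poor, benef in pairs if benef and not poor)
    exclusion = excluded_poor / n_poor if n_poor else 0.0
    inclusion = included_nonpoor / n_beneficiaries if n_beneficiaries else 0.0
    return exclusion, inclusion


# Invented population: 4 poor people out of 10; the program reaches 3 people.
IS_POOR = [True] * 4 + [False] * 6
IS_BENEFICIARY = [True, True, False, False, True] + [False] * 5

exclusion, inclusion = targeting_errors(IS_POOR, IS_BENEFICIARY)
print(f"exclusion error: {exclusion:.2f}, inclusion error: {inclusion:.2f}")
# prints exclusion error: 0.50, inclusion error: 0.33
```

Because the invented program serves three people while four are poor, at least one poor person is excluded however well the selection is done.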

Although customization is needed, a few rules of thumb can guide planning and assessments of practice.


For some facets of eligibility determination, more is probably better than less, although of course tempered by cost. The following are some examples:
• More care to outreach, communication, mechanisms for grievance redress, human-centered design, and client dignity is better than less.
• More care in building capacity at the local level yields benefits, including for community actors who may be involved in community-based targeting, outreach, or monitoring.
• Open registration is more desirable for most kinds of programming, compared with only periodic, especially infrequent, registration.
• Greater coverage of foundational identification (ID) will facilitate many processes. Social protection programs can help facilitate access to foundational IDs but will need to have work-arounds for those still lacking them.
• More data are usually better than fewer, more recent data better than older data.
• A greater degree of interoperability among various government registers that help to define the assistance units (ID agencies or civil registries) and their welfare (records of income or social security contributions, and land, automobiles, payments for government services, or receipt of government benefits) is helpful as long as due data privacy and security provisions are in place and respected.
• More regular and multimodal monitoring of implementation and results can allow faster adjustments and improvements.

For some facets of eligibility determination, there may be a sweet spot between too little and too much. The following are some examples:
• Programs that are smaller than the target population will have exclusion errors by design. A program needs to be at least as large as the population for which it is intended and preferably a little larger; being somewhat larger than needed will likely reduce exclusion errors while those incorrectly included are unlikely to be very wealthy. At the same time, being much larger than needed is costly and begins to include those who are not part of the population meant to be served.
• For recertification, very high frequency may raise costs and errors of exclusion unduly, but excessively long periods without reexamining eligibility will surely result in errors of inclusion. If budget/places in the program are rationed, lack of recertification will lead to errors of exclusion as well.
• Means testing, even hybrid means testing, requires building a reasonably comprehensive data system to measure income and assets, but demanding too much detail can push clients into fraud, disincentives for work, or withdrawal from the program.
• For proxy means testing formulae, many countries use simple, single-model ordinary least squares, but better results might be obtained from a little more complexity. In modeling, the use of quantile regressions and auxiliary data (big data) at the geographic level is likely to improve prediction. Whether more complex methods of machine learning will pay off is less clear or generalizable. Likewise, having a simple national model may be less accurate than having a suite of models for different areas (metropolitan areas, towns, and rural areas) or administrative units (states or provinces), but having many different models may require more data than are available, or it may introduce practical issues for implementation and communication.
• Some systems are designed in a way that requires greater capacity than the program or country can muster and might be better simplified; other countries fail to make improvements that are seemingly within their reach.
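The simple single-model ordinary least squares approach to proxy means testing mentioned above can be sketched roughly as follows. This is an illustration only: the survey rows, the choice of proxies (household size, number of rooms, refrigerator ownership), and the cutoff are invented, and a real formula would be estimated on a representative household survey with many more observations and variables.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(features, target):
    """Estimate OLS coefficients (intercept first) from the normal equations."""
    x = [[1.0] + list(row) for row in features]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def pmt_score(coefs, proxies):
    """Predicted welfare (here, log consumption) from observable proxies."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], proxies))


# Invented survey rows: (household size, rooms, owns refrigerator).
SURVEY_PROXIES = [(2, 3, 1), (5, 2, 0), (4, 1, 0), (3, 4, 1), (6, 2, 1), (1, 1, 0)]
SURVEY_LOG_CONSUMPTION = [5.6, 4.8, 4.9, 5.8, 5.0, 5.2]  # invented values
CUTOFF = 5.0  # hypothetical eligibility threshold on predicted log consumption

coefs = fit_ols(SURVEY_PROXIES, SURVEY_LOG_CONSUMPTION)
applicant = (5, 1, 0)  # proxies collected at intake for a new applicant
score = pmt_score(coefs, applicant)
print(f"predicted log consumption: {score:.2f}, eligible: {score < CUTOFF}")
```

A suite of models for different areas would amount to calling `fit_ols` separately on each area's survey rows, which is where the data requirements noted above begin to bind.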

Message 8. Income dynamics and shocks are significant and pose difficult challenges for eligibility determination processes; some targeting methods are more agile than others.

Even in normal times, the dynamics of welfare and poverty are considerable; shocks can dramatically amplify this. Chapter 3 provides examples of how fluid poverty is in many countries and regions, often with the number of transient poor being at least the same as the number of chronically poor. Volatility in income can be driven by good or bad harvests and seasonality of work in services or by job loss, illness, or an accident. Natural disasters, climate change, economic crises, and pandemics can disrupt livelihoods, at least temporarily, sometimes permanently and for many people at once.

When trying to forestall the long-term negative consequences of shocks, whether idiosyncratic or covariate, the speed of assistance can be of utmost importance. The logic is intuitive and substantiated in the formal academic literature. If assistance is to prevent a negative coping tactic, it must be timely, before a family’s baby becomes malnourished, before it withdraws a child from school, marries off a child bride, sells its assets, racks up high-interest debt, or loses its home, workshop, or land. Each such coping tactic can be very difficult to reverse, ratcheting down the individual or family’s welfare for years or the rest of their lives. Assistance (usually temporary) can help prevent such losses.

The recurrence of shocks and crises and the premium on swift response pose the challenge of how social protection systems can be adequately flexible and dynamic. Given the focus of this book on eligibility determination, it considers this element among the wider aspects of adaptive social protection (building resilience, ensuring adequate financing for crisis response, and building institutional frameworks and capacity). The conceptual and measurement issues are treated in chapter 3. Several of these topics pertain to improvements or adaptations to delivery systems and so are treated in chapter 4. The pros and cons of the different targeting methods for emergency response are treated in chapter 5, and the how-tos are presented in chapter 6, including the use of new data and technology.

Shock responses require thinking through who gets the priority for assistance: those who were poor even before the shock? Those made poor because of it? Those with large losses even if they remain above the poverty line? In the ideal, all three groups would benefit from social policy but likely via different sorts of responses and for different reasons.
• Helping the chronic poor after a crisis may not be sufficient, but it is this group that may most quickly have to resort to negative coping tactics and so they should get first priority. This is often relatively feasible since it is by far the simplest and fastest social protection response to issue an emergency top-up payment to people who are already in some social assistance program. Often, expanding ongoing but low-coverage programs is the next fastest option as there is a base of systems and personnel from which to start.
• At the same time, a crisis response beyond helping the already poor may be needed to reach the new poor or those who have suffered significant losses. Often relatively broad, flat (or minorly customized) benefit designs are used for crisis response programs. This simplifies eligibility decisions and balances poverty reduction and risk management goals.

The government may also mandate or facilitate insurance programs to help cover risks ex ante.

Some targeting methods lend themselves more easily to handling some sorts of income dynamics or shocks than others:
• Geographic targeting fits well for natural disasters, which are usually spatially delimited, but it is not very apt for economic crises, which usually affect all areas of a nation.
• Demographic/categorical targeting is not a natural match for covariate shock response per se: no one’s age changes in response to a shock; natural disasters do not strike only those of some ages; and economic disasters hit workers/those of working age more directly and their dependents only indirectly. Nonetheless, top-up benefits to beneficiaries of demographically targeted programs may be a way to get money out quickly, especially where coverage of such programs is high. Of course, children are so biologically vulnerable that it is always important to protect them. Demographic targeting is something of a recognition of the idiosyncratic shocks that come as families move through the life cycle. Child grants and social pensions help cushion changes in the dependency ratio within families.
• Among household-specific methods, means testing and hybrid means testing, which rely to a large extent on data from interoperable government systems that maintain high-frequency data, can be fairly agile in responding to idiosyncratic and covariate shocks. This can be especially true for eligibility determination that draws on monthly or biweekly data on contributions to social security systems or income tax withholdings as these reflect changes in wages or formal employment in short order. Eligibility that is based on longer term measures, such as annual income tax information or holding of assets, is less responsive.
• Proxy means tests are basically calibrated to reflect long-term welfare, traditionally have been based on characteristics of families and their assets that change slowly, and have tended to be used with data updated only every few years, so these tests are less able to be shock responsive. Some recent innovations merge measures of exposure to weather or geophysical shocks with more traditional proxy means testing data to attenuate the problem, while faster-changing data such as phone data may offer some promise.
• Community-based targeting assessments seem to be able to pick up how households are affected by various shocks, but they require updating after the shocks hit and thus may take some time.
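The difference in agility between high-frequency and longer-term income measures noted above can be illustrated with a small arithmetic sketch. The monthly incomes, the threshold, and the assessment windows are invented; actual means tests define income, reference periods, and thresholds in program-specific ways.

```python
def eligible(monthly_incomes, threshold, window):
    """Assess eligibility on mean income over the most recent `window` months."""
    recent = monthly_incomes[-window:]
    return sum(recent) / len(recent) < threshold


# Invented record: steady formal wages for nine months, then a job loss.
MONTHLY_INCOME = [300] * 9 + [0, 0, 0]
THRESHOLD = 150  # hypothetical monthly income threshold

# Annual measure: mean of 12 months is 2700 / 12 = 225, above the threshold.
print(eligible(MONTHLY_INCOME, THRESHOLD, window=12))  # prints False
# Recent measure: mean of the last 3 months is 0, below the threshold.
print(eligible(MONTHLY_INCOME, THRESHOLD, window=3))   # prints True
```

The household that has just lost its income is missed by the annual measure and picked up by the three-month one, which is the sense in which monthly contribution or withholding data make a means test more shock responsive.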

Many actions that are important for preparing social protection systems to be responsive to shocks are also important for moving to universal social protection (USP) in general and vice versa. Improved coverage of the chronically poor in normal times is important for USP; it also builds resilience before shocks and makes top-up programs feasible. Such full coverage requires a continuously open enrollment process, adequate base financing, and enough flexibility to ensure that entitlement obligations are met, at least through normal swings in need. It thus provides a base of response in times of crisis. High coverage of foundational IDs (especially electronic identification [eID]) can help provide links to many sorts of data and facilitate some rapid (possibly simplified) eligibility assessments. Foundational IDs coupled with extensive financial inclusion also facilitate quick payments. Building out the insurance part of social protection systems serves both USP and resilience.

Message 9. Advances in technology (information and communications technology, big data, and machine learning) offer the promise of significant improvements in targeting accuracy but are not a panacea; better data may matter more than greater sophistication in inference.

A key element of targeting is using data or inference to discern different degrees of poverty or vulnerability. Changes in technology and the availability of new data always excite hope that these will make targeting more accurate or easier. Deeper discussion of these issues is concentrated in chapters 6 and 8.

Improvements in the availability and use of traditional government-held data have been and will continue to be a driver of improvements in the ability to observe welfare and target, especially potentiated by the increasing use of foundational IDs and eIDs. Increasing the coverage and quality of such data systems and the ability to conduct data matching are helpful for most methods of targeting, especially in facilitating means testing and hybrid means testing (and a move away from welfare proxies). Improvements in the scope and quality of traditional government data—on taxes (payroll for firms, sales for value added tax, personal income, and property such as land or automobiles); on fees for government-provided services (especially utilities and border crossings); and on the use of government-provided services targeted in various ways (other social assistance programs and sector-specific preferences or privileges such as fertilizer discounts and fee waivers for any government-provided services)—help make welfare observable. Governments have held such data for many years, but their use in eligibility determination may be improved with attention to the technical details of definitions and data structures, which make it easier to match among data sets, and with attention to policy issues of data privacy and data security that regulate the legality of doing so. Many countries are rolling out or extending coverage of foundational ID systems and often upgrading to eIDs, which will make much more data matching feasible in proximate years as the eIDs become the keys for matching on more government-held data sets. Drawing on the integration and interoperability of such data system matching in eligibility processes can reduce the need to collect data again and again and can facilitate cross-referral processes from one program to another, which can lower transaction costs for applicants and governments alike.

Where welfare is difficult to observe directly, targeting methods try to infer it from observable proxies; whether the proxies are new or old, they need to be highly correlated with welfare. Nonadministrative big data (such as from satellite imagery, mobile phones, and social media) and machine learning are expanding the data and techniques for this at a dizzying pace, although they remain largely proxies for welfare rather than direct measurements per se.

Although they are often still proxies, big data have the advantage of not requiring household-specific data collection by the social assistance agency via lengthy intake interviews or (partial) census sweeps as they are generated by other government or private processes. However, the social assistance agency must acquire and use them. Thus, they offer the prospect of being cheaper and faster for eligibility assessment, allowing not only rapid program start-up, but also more frequent reassessments as conditions change.


Big data are already being combined with traditional data to improve poverty maps and help predict which households and areas are more at risk of natural disasters. Administrative data have long been used alongside traditional data to improve poverty maps; newer big data can similarly be incorporated. Moreover, historical data on localized natural disasters and drought combined with realized household poverty outcomes can be used to predict which households will be at risk in the future. Such models can be used to target the poor or vulnerable for covariate risk-mitigating social protection programs or public insurance schemes, helping administrators to manage covariate shocks.

In a crisis or data-scarce environment such as postconflict, using big data for determining eligibility may be one of the only options and an appropriate one. Big data can fill a gap when traditional data are not available, as in many poor or fragile countries or in postconflict settings, or when the data are not current, as in a crisis. In such circumstances, the ability to target much-needed assistance is vital.

Whether big data will replace the need for traditional data for eligibility assessment depends on whether the challenges arising from their newness can be fully understood and solutions crafted. Some of these challenges are well-known and require more traditional data as a complement, such as for training and assessment. Some of the challenges are well-known and require care in implementation, such as avoiding bias in models. Other challenges require new thinking and new research, such as matching the unit of assistance and understanding impacts on behavior.
• Ground truth for training. Big data are increasingly used to generate poverty maps at a fine-grained level and where traditional data do not facilitate them. Their viability still relies on accurate ground truth training data, that is, household surveys with direct income or consumption measures. In their absence, many big data–based maps use survey data such as the Demographic and Health Survey series where household welfare is not directly measured but instead estimated with proxies; in essence, big data proxies are often used to model another proxy rather than the direct measure of interest. This is a limitation for traditional poverty maps using census data as well, but big data do not overcome this.
• Ground truth for assessing. Proper assessments of different big data maps (from satellites, call detail records, or social media) compared with traditional maps and survey data with directly measured household welfare are still needed to understand whether the big data maps are more or less accurate than traditional methods, and thus whether they should be preferred to traditional maps or only used when the latter are unavailable.
• Avoidance of bias in prediction. Machine learning models use big data to learn and predict. When the data they train on do not represent the whole population, the model predictions can be biased. For example, early face and voice recognition models were much better at recognizing white males than nonwhites or females. Careful checks need to be put in place to ensure that eligibility assessments do not disadvantage particular subpopulations; the marginalized groups of interest to social assistance policy may often be exactly the ones missing from big data.
• Unit of observation. Eligibility is often at the household level, while big data rarely are. Even newly fine bore geospatial analysis remains at a grid level rather than being household specific. Data from call records may pertain to the subscriber identity module (SIM) card or phone number, of which an individual or household may have none, one, or several, separately or shared across individuals or households. These issues add a level of complexity to the use of such proxies.
• Data access and use. Many big data are privately held. What regulation or incentives it will take to make such data available to core government functions on an ongoing basis (beyond just in a crisis), or what is socially acceptable (for government to access and for what purpose), is mostly still to be worked out.
• Incentives. Just as there has been great concern that using more traditional administrative big data might generate undesirable labor incentives, it will be of interest to learn whether the use of phone data or social media for eligibility determination will change behaviors in ways that reduce people’s welfare or reduce the accuracy of the proxy.
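One form the careful checks called for under avoidance of bias in prediction might take is a comparison of exclusion errors across subpopulations in a validation sample. The sketch below is a hypothetical illustration: the group labels, the records, and the choice of exclusion error as the metric are assumptions, and a real audit would use survey data with directly measured welfare.

```python
from collections import defaultdict


def exclusion_error_by_group(records):
    """Share of truly poor households predicted ineligible, per group.

    `records` is an iterable of (group, is_poor, predicted_eligible) tuples.
    Groups with no poor households in the sample are omitted.
    """
    poor = defaultdict(int)
    missed = defaultdict(int)
    for group, is_poor, predicted_eligible in records:
        if is_poor:
            poor[group] += 1
            if not predicted_eligible:
                missed[group] += 1
    return {group: missed[group] / poor[group] for group in poor}


# Invented validation sample: group "B" is poorly represented in the
# training data, so the model misses more of its poor households.
VALIDATION = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1 + [("A", False, False)] * 4
    + [("B", True, True)] * 1 + [("B", True, False)] * 3 + [("B", False, False)] * 2
)

print(exclusion_error_by_group(VALIDATION))
# prints {'A': 0.25, 'B': 0.75}
```

A gap of this size between groups would be a signal to revisit the training data or the model before using its predictions for eligibility.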

More sophisticated inference (machine learning) is probably less important for better targeting outcomes than more and better proxies (big data). The small literature exploring the use of machine learning algorithms finds that the algorithm that produces the best proxy means test depends on which metric is being used to evaluate and how the scoring would be implemented in practice. It also generally finds that the improvements in performance are relatively small compared with traditional models. Thus, it is not clear that the complex analysis required to determine which is the best machine learning model in a particular country context for a particular program objective and design is worth the improvement over more traditional models. Moreover, the increase in opacity (a black box on top of a black box) may concern policy makers in some countries, although machine learning–based proxy means testing models were recently adopted in Colombia and Costa Rica; new visualization tools can also help make the models more intuitive. Where significant improvements in machine learning models have been identified, the improvements are driven more by bringing more proxies into the model, whether administrative data or “feature engineering” (developing new variables from within the traditional data itself), than the choice of model itself.

In the end, the use of new forms of big data and sophisticated inference should be understood as an interim step in the transition to measuring welfare and eligibility directly. Most nontraditional big data remain proxies for the underlying welfare that it would be preferable to measure directly to determine program or benefit eligibility. It remains that analysts resort to machine learning or traditional regressions to estimate the underlying welfare from the proxies, traditional or new. New data and new techniques may help reduce the inherent modeling error of proxy means testing, but such errors will remain. It is expected that as more and bigger data become available on which to train machine learning, the combination will soon become increasingly common. Yet, ultimately, an improved proxy means test is still not a substitute for direct measurement of most or all income nor for the need for interoperability and data integration.

Increasing use of new data and inference will not replace the need for humans in all parts of the provision of social services. New data and inference may improve the accuracy, increase the speed, and lower the cost of eligibility assessments. They may help lower transaction and administrative costs for some clients and some functions. However, some clients may need human social assistance workers to help overcome issues of information, agency, or the digital divide. Some processes, such as grievance redress or referral from income support to social services, may benefit from rapport built between the social assistance staff and clients.

Message 10. How countries target is often and should always be a dynamic story.

In many countries, efforts to target social assistance have evolved over time, often improving aspects of delivery systems or data collection on a continuous or recurrent basis, sometimes improving formulae and data use, and occasionally evolving from one targeting method to another altogether. Stories should be dynamic where new programs, problems, or heightened expectations demand attention and as new capacities, new data sources, and new computing power move the frontiers of what is possible. Sometimes there are reports of government administrations of different political orientation focusing on different sides of the targeting problem, with one putting more emphasis on reducing errors of inclusion and another on reducing errors of exclusion. Where taken in alternating turns or by different levels of government, with balance and technical quality, both emphases can contribute over the years to improved programs and impacts. There have also been occasions of stagnation when countries or programs have stalled in their efforts at improvement. These are a reminder
