The Quality of Health and Education Systems Across Africa
FIGURE B.1 Typical sampling strategy process for SDI surveys
1. Sampling frame
2. Stratification
3. Selection of locations (within stratum)
4. Selection of facilities (within location)
5. Selection of providers (and pupils) (within facility)
Source: Service Delivery Indicators (SDI) core team.
of fourth-grade pupils through a student assessment. The multistage sampling approach makes sampling more practical by breaking the selection from large populations of sampling units into successive steps (figure B.1). After the sampling frame (that is, the complete list from which sampling units are drawn)2 is defined and divided into strata, a first-stage selection of sampling units is carried out independently within each stratum. Often, the first stage selects clusters or geographic locations (districts, communities, counties, neighborhoods) so that survey teams do not have to travel long distances to survey just one facility. Clusters are randomly drawn within each stratum with probability proportional to the size of the cluster (measured by the location's number of facilities, providers, or pupils), which helps to ensure that the sample is representative of the services within those locations. Once locations are selected, the second stage randomly selects facilities within locations (either with equal probability or with probability proportional to size) as secondary sampling units.3 At the third stage, a fixed number of health and education workers and pupils is randomly selected within facilities to provide information for the different questionnaire modules. Replacement facilities are also drawn from each location within its stratum in case the sampling frame includes health or school facilities that no longer exist, are not functional, refuse to participate, or are inaccessible because of security concerns. These replacement facilities are selected in keeping with the probability sampling approach.
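The first two stages described above can be illustrated with a minimal Python sketch. It is not the SDI team's actual procedure: the frame, the location and facility identifiers, the function names (`pps_systematic`, `draw_sample`), and the choice of systematic sampling as the way to draw locations with probability proportional to size are all assumptions made for illustration. Size is measured here by the number of facilities in a location, and facilities are then drawn with equal probability.

```python
import random


def pps_systematic(sizes, n, rng):
    """Draw n units with probability proportional to size (systematic PPS).

    sizes maps each unit to its size measure. The sketch assumes no unit is
    larger than the sampling interval, so no unit can be drawn twice.
    """
    total = sum(sizes.values())
    step = total / n
    if max(sizes.values()) > step:
        raise ValueError("a unit is larger than the sampling interval")
    start = rng.random() * step  # random start in [0, step)
    targets = [start + k * step for k in range(n)]
    selected, cumulative, i = [], 0, 0
    for unit, size in sizes.items():
        cumulative += size
        # A unit is drawn when the next target falls inside its size interval.
        if i < n and targets[i] < cumulative:
            selected.append(unit)
            i += 1
    return selected


def draw_sample(frame, n_locations, n_facilities, seed):
    """Stage 1: PPS draw of locations within each stratum.
    Stage 2: equal-probability draw of facilities within each location."""
    rng = random.Random(seed)
    sample = {}
    for stratum, locations in frame.items():
        sizes = {loc: len(facs) for loc, facs in locations.items()}
        chosen = pps_systematic(sizes, n_locations, rng)
        sample[stratum] = {
            loc: rng.sample(locations[loc], min(n_facilities, len(locations[loc])))
            for loc in chosen
        }
    return sample


def make_locations(prefix, counts):
    """Build hypothetical locations with the given numbers of facilities."""
    return {
        f"{prefix}{i}": [f"{prefix}{i}-F{j}" for j in range(1, count + 1)]
        for i, count in enumerate(counts, 1)
    }


# Hypothetical sampling frame: stratum -> location -> facility identifiers.
frame = {
    "urban": make_locations("U", [10, 20, 5, 15, 10]),
    "rural": make_locations("R", [8, 6, 12, 4, 10]),
}

sample = draw_sample(frame, n_locations=2, n_facilities=3, seed=1)
for stratum, locations in sample.items():
    print(stratum, locations)
```

Replacement facilities could be drawn in the same way, for example by sampling a few extra facilities per location in advance and using them only under the rules fixed in the protocol.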
More important, the rules for replacing facilities are specified in the protocol ex ante to avoid bias in the results, and backup facilities typically may not be used merely for logistical convenience.4
Because of this sampling process, survey results must be properly weighted using a sampling weight (or expansion factor) to ensure that they are representative of the population of interest. The basic weight for each sampling unit is equal to the inverse of its overall probability of selection, which is computed by multiplying the probabilities of selection at each sampling stage. Different weights need to be applied depending on the level of the estimate, which can be the facility, the staff member or provider, or the pupil.5 These different weights are included in the data sets to facilitate re-estimation.
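As a rough illustration of the basic weight, the sketch below multiplies the selection probabilities across stages and takes the inverse. The numbers are hypothetical (2 locations drawn with probability proportional to size from a stratum of 60 facilities, a selected location with 20 facilities of which 3 are drawn, and 5 of 8 providers drawn in a facility), and `base_weight` is an illustrative name, not part of any SDI tool. Any adjustments applied in practice, such as for nonresponse, are outside this sketch.

```python
from fractions import Fraction
from math import prod


def base_weight(stage_probabilities):
    """Return the inverse of the overall selection probability,
    i.e. the product of the selection probabilities at each stage."""
    overall = prod(stage_probabilities)
    if not 0 < overall <= 1:
        raise ValueError("overall selection probability must be in (0, 1]")
    return 1 / overall


# Hypothetical stage probabilities (exact fractions avoid rounding noise).
p_location = Fraction(2 * 20, 60)  # PPS: n_locations * location size / stratum size
p_facility = Fraction(3, 20)       # 3 of the location's 20 facilities
p_provider = Fraction(5, 8)        # 5 of the facility's 8 providers

# Facility-level estimates use the first two stages only.
facility_weight = base_weight([p_location, p_facility])
# Provider-level estimates also include the third stage.
provider_weight = base_weight([p_location, p_facility, p_provider])

print(facility_weight)  # 10
print(provider_weight)  # 16
```

In this example each sampled facility stands in for 10 facilities in the stratum and each sampled provider for 16 providers, which is why estimates at different levels need different weights.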