The Importance of Accurate Sampling Frames for Health Surveys
In many settings, high-quality, granular household (HH)-level data are unavailable because a full census demands significant financial resources, time, and logistical effort, so practitioners must choose among other sampling methods. A recent census is considered the gold-standard sampling frame for a representative survey and is used for Demographic and Health Surveys when one has been conducted.
However, when a census is not possible, alternative sampling frames must be used. These can include outdated census data, gridded population datasets, or remote sensing-derived structure counts. While these methods can provide estimates, they often lack the granularity and accuracy of a full census, which can reduce the representativeness of the final survey sample.
In 2017, the Ministry of Public Health and Population in Haiti (MSPP) partnered with the Malaria Zero consortium to conduct a comprehensive census prior to a malaria household survey in the Artibonite department. This novel approach allowed the team to rigorously evaluate different sampling frame methodologies against the gold standard of the full census data.
Conducting a Comprehensive Census in Artibonite, Haiti
Haiti and the Dominican Republic, which share the island of Hispaniola, are the only remaining malaria-endemic countries in the Caribbean. Plasmodium falciparum is the dominant parasite species, responsible for more than 99% of cases detected on the island in 2017. The main vector of transmission, Anopheles albimanus, is primarily exophilic with zoophilic tendencies and is relatively inefficient at transmitting malaria.
To characterize foci of malaria transmission and inform the elimination strategy, the Malaria Zero consortium, including the MSPP, developed a cross-sectional HH survey in the Artibonite department. However, the most recent national census in Haiti was conducted in 2003, making the data potentially outdated for the planned sampling approach.
The 2017 Artibonite Household Census
To create an accurate sampling frame, the MSPP and its partners conducted a comprehensive census in the study area prior to the HH survey. This census had several key components:
- Satellite Imagery Digitization: High-resolution Maxar satellite imagery from 2016 was used to manually digitize all structures with rooftops ≥3 m² in the study area. This process took 138.5 hours for two communes (Verrettes and La Chapelle).
- Enumeration Area Delineation: The digitized structures and a 1 km² grid were used to define the enumeration areas (EAs) for the census.
- Community Engagement: Meetings were held with local government officials, police, and community leaders to discuss the census logistics and obtain support for the activities.
- Field Data Collection: Thirty-five teams of two enumerators used tablets with geospatial PDF maps to systematically visit each EA, capturing GPS coordinates and details of each inhabited household and point of interest.
- Real-Time Monitoring: Daily coverage maps were created and shared with enumerators to identify any missed areas, which were then revisited. A dashboard was also used to monitor progress and address any operational challenges.
The census identified 33,060 inhabited HHs, with an estimated population of 121,593, as well as 6,126 points of interest. The use of daily coverage maps and the inclusion of digitized structures were novel methods that improved the census quality.
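The daily coverage check can be illustrated with a minimal sketch. It is not the team's actual tooling: it assumes structures and GPS visits are already in a projected coordinate system in metres, it uses a hypothetical 25 m matching radius, and the coordinates are invented for illustration. Each digitized structure is assigned to a 1 km grid cell, and cells that still contain structures with no nearby visit are flagged for a revisit.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 1000.0    # 1 km grid, matching the grid used to define EAs
MATCH_RADIUS_M = 25.0   # hypothetical tolerance between a structure and a GPS visit


def cell_of(x, y, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a projected point."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def coverage_by_cell(structures, visits, radius=MATCH_RADIUS_M):
    """Return {cell: (n_structures, n_unvisited)} for (x, y) points in metres."""
    counts = defaultdict(lambda: [0, 0])
    for sx, sy in structures:
        cell = cell_of(sx, sy)
        counts[cell][0] += 1
        # A structure counts as visited if any GPS point lies within the radius.
        visited = any(math.hypot(sx - vx, sy - vy) <= radius for vx, vy in visits)
        if not visited:
            counts[cell][1] += 1
    return {cell: tuple(pair) for cell, pair in counts.items()}


def cells_to_revisit(coverage):
    """List the cells that still contain at least one unvisited structure."""
    return sorted(cell for cell, (_, missed) in coverage.items() if missed > 0)


# Illustrative coordinates only (metres in an arbitrary projected system).
structures = [(120.0, 340.0), (180.0, 410.0), (1450.0, 300.0), (1490.0, 880.0)]
visits = [(118.0, 338.0), (185.0, 405.0), (1452.0, 310.0)]

coverage = coverage_by_cell(structures, visits)
print(coverage)
print(cells_to_revisit(coverage))
```

The brute-force distance check is adequate for a sketch; with tens of thousands of structures, a spatial index would likely be needed.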
Evaluating Alternative Sampling Frames
To assess the impact of using different sampling frames, the census data were compared with three remote sampling frame methods:
- 2003 Census Enumeration Areas with 2012 Population Projections: The 2003 census EAs were resampled to the 2017 study EAs using areal interpolation, assuming an even population distribution.
- 2016 LandScan™ Population Estimates: The 2016 LandScan™ 1 km² grid cells were used as the sampling frame, with an estimate of 4.5 persons per HH.
- Digitized Structures with Occupancy Estimate: The number of digitized structures ≥3 m² was aggregated to the 2017 study EAs, with an assumption that 70% were inhabited HHs with 4.5 persons per HH.
The census data were considered the gold standard, and the sampling frames were evaluated based on how accurately they represented the number of HHs and the population within each study EA.
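The arithmetic behind these three frames can be sketched as follows. This is an illustration under simplifying assumptions, not the study's GIS workflow: zones are treated as axis-aligned rectangles rather than real EA polygons, the zone names, coordinates, and counts are hypothetical, and only the 4.5 persons per HH and 70% occupancy figures come from the description above. Areal interpolation splits each source zone's population across target zones in proportion to overlapping area, which is what an even population distribution implies.

```python
PERSONS_PER_HH = 4.5    # household size assumed in the comparison
OCCUPANCY_RATE = 0.70   # share of digitized structures assumed to be inhabited


def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax), in square metres."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles (zero if they do not overlap)."""
    return rect_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))


def areal_interpolate(source_zones, target_zones):
    """Reallocate source populations to targets in proportion to overlapping area."""
    estimates = {name: 0.0 for name in target_zones}
    for rect, population in source_zones:
        area = rect_area(rect)
        if area == 0.0:
            continue  # skip degenerate zones rather than divide by zero
        for name, target in target_zones.items():
            estimates[name] += population * overlap_area(rect, target) / area
    return estimates


def households_from_population(population, persons_per_hh=PERSONS_PER_HH):
    """Convert a gridded population estimate into an estimated number of HHs."""
    return population / persons_per_hh


def households_from_structures(n_structures, occupancy=OCCUPANCY_RATE):
    """Estimate inhabited HHs from a count of digitized structures."""
    return n_structures * occupancy


# Hypothetical older EAs (rectangle, projected population) and newer study EAs.
old_eas = [((0, 0, 2000, 1000), 900), ((2000, 0, 4000, 1000), 300)]
study_eas = {
    "A": (0, 0, 1000, 1000),
    "B": (1000, 0, 3000, 1000),
    "C": (3000, 0, 4000, 1000),
}

projected = areal_interpolate(old_eas, study_eas)
print(projected)
print(households_from_population(projected["B"]))

estimated_hh = households_from_structures(200)
print(estimated_hh, estimated_hh * PERSONS_PER_HH)
```

Because the interpolation only redistributes population, the total across target zones equals the total across source zones whenever the targets fully cover the sources.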
Comparison of Sampling Frames
The sampling frame derived from the manual digitization of structures most closely matched the census results, with 30,514 digitized structures in the study area. The LandScan™ method performed better in urban areas but produced the highest number of HHs to sample.
If a census is not possible, remotely digitizing structures and estimating occupancy may provide a close estimate. However, this method requires careful planning, high-resolution imagery, and a strategy to include HHs not visible in satellite imagery.
The 2003 census EA method performed the worst, with the highest absolute difference in population per grid cell compared to the census. This highlights the importance of using up-to-date data, as older census information may not accurately reflect the current population distribution.
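A comparison of this kind can be sketched as a per-cell absolute difference between each frame and the census. The summary statistic used here (the mean absolute difference across cells) is an assumption chosen for illustration, and every population value below is hypothetical rather than taken from the study.

```python
def absolute_differences(census, frame):
    """Per-cell |frame - census|; cells missing from the frame count as zero."""
    return {cell: abs(frame.get(cell, 0.0) - population)
            for cell, population in census.items()}


def mean_absolute_difference(census, frame):
    """Average the per-cell absolute differences over all census cells."""
    diffs = absolute_differences(census, frame)
    return sum(diffs.values()) / len(diffs)


def rank_frames(census, frames):
    """Order frame names from closest to furthest from the census."""
    return sorted(frames, key=lambda name: mean_absolute_difference(census, frames[name]))


# Hypothetical populations per grid cell, for illustration only.
census = {"A": 400, "B": 650, "C": 150}
frames = {
    "old_census_projection": {"A": 250, "B": 900, "C": 100},
    "gridded_population": {"A": 380, "B": 700, "C": 210},
    "digitized_structures": {"A": 410, "B": 630, "C": 160},
}

for name in rank_frames(census, frames):
    print(name, round(mean_absolute_difference(census, frames[name]), 1))
```

An absolute difference treats over- and underestimates alike, so it measures how far a frame is from the census in each cell but not the direction of the error.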
Lessons Learned and Implications for Health Surveys
The comprehensive census conducted in Artibonite, Haiti, demonstrates the value of rigorous sampling frame development for representative health surveys. By comparing alternative methods to the gold standard census data, the team was able to assess the accuracy and limitations of each approach.
Key Lessons:
- Digitizing Structures Improves Census Quality: The manual digitization of structures using high-resolution satellite imagery helped guide enumerators and improve the coverage and quality of the census.
- Relying on Outdated Census Data is Risky: The 2003 census data, even with population projections, differed most from the 2017 census in population per grid cell, highlighting the need for more recent data.
- Remote Sensing Methods Can Provide Estimates: When a full census is not feasible, remotely digitizing structures and estimating occupancy can offer a closer approximation than gridded population datasets.
- Monitoring and Feedback Loops are Critical: The use of daily coverage maps and a progress dashboard allowed the team to identify and address gaps in the census, improving the final results.
These lessons have important implications for health surveys and population-based research in resource-limited settings. By investing in robust sampling frame development, researchers can improve the representativeness of their data and more effectively target interventions to the communities that need them most.
The comprehensive census approach used in Artibonite, Haiti, provides a model for other countries and programs to follow as they work towards malaria elimination and improved health outcomes.