Rwanda - Integrated Household Living Conditions Survey (EICV4), 2013-2014, Cross-Sectional Sample
Reference ID | RWA-NISR-EICV4-CS-2013-2014-V01 |
Year | 2013 - 2014 |
Country | Rwanda |
Producer(s) | National Institute of Statistics Rwanda (NISR) - Ministry of Finance and Economic Planning |
Sponsor(s) | African Development Bank - AfDB - Financial Partner; World Bank - WB - Financial Partner; UKaid - Ukaid - Financial Partner; European Union - EU - Financial Partner; One UN - EU - Financial Partner |
Metadata | Documentation in PDF |
Created on | Jun 29, 2016 |
Last modified | Jun 29, 2016 |
Sampling
Sampling Procedure
The EICV4 cross-sectional (CS) sample includes two independent subsets selected using different sampling frames: 1) a new EICV4 sample of households in enumeration areas (EAs) selected using the 2012 Rwanda Population and Housing Census frame, and 2) a panel of households selected from 177 EICV3 villages. A new listing of households was conducted in both the panel and new sample clusters in order to update the frame for the CS Survey. The sample households in the new CS sample EAs were selected from the new listing.
1) The new EICV4 sample
The main sampling frame for the EICV4 is based on the 2012 Rwanda Census. The primary sampling units (PSUs) are the 2012 census Enumeration Areas (EAs). In the Census, each EA was classified as urban, semi-urban, peri-urban or rural. The urban areas include Kigali-Ville and the district capitals. The semi-urban areas generally correspond to smaller towns that have service facilities and markets. The peri-urban areas currently have the characteristics of rural areas, but they are located on the periphery of urban areas and are designated for future development. For the EICV4 sampling frame, the semi-urban areas were grouped with the urban strata, and the peri-urban areas with the rural strata. This results in a final distribution of 17.2% urban households and 82.8% rural households in the sampling frame. EAs in the 177 EICV3 sample villages selected for the panel study were excluded from the sampling frame, in order to avoid any overlap between the two samples.
The new EICV4 sample of 12,312 households was selected using a stratified two-stage design. At the first stage, sample EAs were selected within each stratum (district) with probability proportional to size (PPS) from the ordered list of EAs in the sampling frame. The EAs were implicitly stratified by urban and rural strata within each district, ordered first by urban, semi-urban, peri-urban and rural areas, and then geographically by sector, cellule, village and EA codes. This first-stage sampling procedure provides a proportional allocation of the sample to the urban and rural areas of each district. At the second stage, households in each sample EA were selected from the listing. For the three districts in Kigali Province, 9 households were selected in each sample EA as original households; for the remaining 27 districts, 12 households were selected in each sample EA as original households. In addition, a reserve sample of 3 replacement households was selected for each sample EA in Kigali Province, and a reserve sample of 4 replacement households for each sample EA in the remaining provinces.
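The first-stage selection described above can be illustrated with a minimal Python sketch. It assumes systematic PPS selection with a random start, which is a common way to draw a PPS sample from an ordered frame; the source does not spell out the exact selection mechanics. The function name, the size values and the seed are hypothetical.

```python
import random

def pps_systematic(sizes, n_sample, seed=None):
    """Systematic PPS selection from an ordered frame (illustrative sketch).

    sizes    -- measure of size per unit (e.g. census household count),
                already sorted in the implicit-stratification order
    n_sample -- number of units to select
    Returns the frame positions (indices) of the selected units.
    """
    rng = random.Random(seed)
    total = sum(sizes)
    interval = total / n_sample            # sampling interval
    start = rng.random() * interval        # random start in [0, interval)
    targets = [start + k * interval for k in range(n_sample)]

    selected = []
    cumulative = 0.0
    t = 0
    for idx, size in enumerate(sizes):
        cumulative += size
        # A unit is hit each time a target falls inside its size range;
        # a unit larger than the interval could be hit more than once.
        while t < n_sample and targets[t] < cumulative:
            selected.append(idx)
            t += 1
    return selected

# Hypothetical frame of 8 EAs in one district, ordered urban to rural.
ea_sizes = [120, 80, 200, 150, 90, 160, 110, 90]
print(pps_systematic(ea_sizes, 3, seed=1))
```

Because the selection walks down the ordered list at a fixed interval, the sample is spread across the urban and rural parts of the list roughly in proportion to their size, which is the proportional allocation effect noted in the text.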
This new EICV4 sample contains 12,312 households, including 12,233 original households and 79 replacement households.
2) Households from 177 EICV3 villages used for panel study
The second component of the EICV4 cross-sectional sample consists of all the sample households interviewed inside the 177 EICV3 villages selected for the panel study (including any replacement households and panel split households inside the clusters).
Within each of the 177 villages, all households that were interviewed during EICV3 were included in the cross-sectional sample. When an EICV3 sample household moved and a new household occupied the same house in the cluster, it was interviewed for the Cross-Sectional Survey, and assigned a PID (dependency) code of 94. If an EICV3 household was empty or not found, a random replacement household was selected for the EICV4 Cross-Sectional Survey from the new listing of the sample cluster, and assigned a PID code of 95. The sample households with PID codes 94 and 95 are only used for the cross-sectional study, not the panel study.
This second component of the cross-sectional sample includes 2,108 households drawn from the 177 EICV3 villages sampled for the panel study: 1,604 original EICV3 households, 181 dependent households that split from an original household and remained in the same cluster, 243 households living in a dwelling formerly occupied by a panel household, and 80 replacement households added so that each cluster reached its target of 9 or 12 households.
The EICV4 data from the new and panel clusters are combined for the CS analysis in order to obtain the most accurate CS estimates. Adding the data from the 177 sample panel clusters substantially reduces the variance component of the mean squared error (MSE). Although the CS data from the sample panel clusters may slightly increase the bias component, this increase is very small compared with the corresponding reduction in the variance component. The CS results from the combined new and panel clusters can therefore be considered more accurate than the corresponding results based only on the new sample clusters.
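The argument above rests on the standard decomposition of the MSE of an estimator into a variance term and a squared bias term. The notation below is introduced only for illustration and does not appear in the survey documentation.

```latex
\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \bigl[\mathrm{Bias}(\hat{\theta})\bigr]^{2}
```

Combining the two samples is advantageous whenever the reduction in the variance term exceeds any increase in the squared bias term.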
In total, the final EICV4 cross-sectional sample contains 14,419 households.
3) Assignment of EAs to cycles and sub-cycles
Data collection covered a period of 12 months and was divided into 10 cycles to represent seasonality in consumption, income, employment and agricultural activity patterns.
For rural enumeration, each cycle is further divided into two sub-cycles. For the 177 EICV3 villages, the cycle and sub-cycle were pre-determined. Households were re-interviewed in the same cycle, corresponding to the same time of the year as in EICV3. To assign cycles to the new EICV4 sample EAs, random cycle numbers from 1 to 10 were generated to identify the selection sequence. For the 27 districts outside Kigali, sub-cycle numbers of 1 or 2 were assigned systematically with a random start. This process ensured that the final distribution of the sample EAs to cycles and sub-cycles was geographically representative within each district.
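The systematic assignment of sub-cycles with a random start can be sketched as follows. This is only an illustration under the assumption that sub-cycles alternate along the geographically ordered list of sample EAs in a district; the function name and EA identifiers are hypothetical, and the actual procedure may have differed in detail.

```python
import random

def assign_sub_cycles(ea_ids, seed=None):
    """Alternate sub-cycles 1 and 2 down an ordered EA list (sketch).

    ea_ids -- sample EAs of one district, in geographic frame order
    Returns a dict mapping each EA to sub-cycle 1 or 2.
    """
    rng = random.Random(seed)
    start = rng.choice([1, 2])             # random start
    plan = {}
    for position, ea in enumerate(ea_ids):
        # Alternate 1, 2, 1, 2, ... (or 2, 1, 2, 1, ...) by list position.
        plan[ea] = 1 + (start - 1 + position) % 2
    return plan

# Hypothetical EA codes for one district outside Kigali.
print(assign_sub_cycles(["EA01", "EA02", "EA03", "EA04", "EA05"], seed=7))
```

Alternating along an ordered list means neighbouring EAs fall into different sub-cycles, so each sub-cycle covers the whole district rather than one part of it.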
Response Rate
Of the 12,312 sample households selected in the new sample clusters for EICV4, only 79 were non-interviews, for a response rate of 99.4% for this sample. All of the 79 non-interviews were replaced. There were only 12 refusals, and there were few cases of houses that were empty or not found, given that the listing was conducted very close to the interviewing period.
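The response rate quoted above follows directly from the two counts given in the text, as this short Python check shows. The variable names are illustrative.

```python
selected = 12_312        # households selected in the new sample clusters
non_interviews = 79      # households not interviewed (all replaced)

response_rate = (selected - non_interviews) / selected
print(f"{response_rate:.1%}")   # prints 99.4%
```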
Weighting
Since the EICV4 cross-sectional sample includes two independent subsets (the new EICV4 sample and a panel of households from 177 EICV3 villages) selected from different sampling frames, the probabilities of selection and the corresponding weights for each subset were calculated separately. Weights were first calculated separately for expanding the sample from each sampling frame to the national level; and then, a factor was applied to each set of weights to reflect the proportion of the overall sample within each district that comes from the corresponding sampling frame. Together, these weights would expand the combined data to the district level.
1) Weighting procedures for the new EICV4 sample
The basic sampling weight is the inverse of the probability of selection, computed at the EA level. Generally it is necessary to adjust the weights to take into account the non-interview households in each sample EA. However, during the EICV4 data collection in the new sample clusters, all of the non-interview households were replaced, so the final number of completed household interviews is exactly 9 for each new sample EA in Kigali Province, and 12 for each new sample EA in the remaining provinces. Therefore there is no need to adjust the weights for nonresponse.
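A minimal Python sketch of the basic weight for the new sample is given below. It assumes the standard two-stage form, in which the first-stage PPS probability is multiplied by the second-stage probability of selecting households from the listing; the source states only that the weight is the inverse of the selection probability. The function names, parameter names and all numbers are hypothetical.

```python
def ea_selection_probability(n_sample_eas, ea_size, stratum_size):
    """First-stage PPS probability of selecting one EA in its district.

    n_sample_eas -- number of EAs sampled in the district (stratum)
    ea_size      -- frame (census) household count of this EA
    stratum_size -- total frame household count of the district
    """
    return n_sample_eas * ea_size / stratum_size

def household_weight(n_sample_eas, ea_size, stratum_size,
                     n_selected, n_listed):
    """Basic weight: inverse of the overall selection probability.

    n_selected -- households interviewed in the EA (9 or 12 in EICV4)
    n_listed   -- households found in the EA by the new listing
    """
    p_first = ea_selection_probability(n_sample_eas, ea_size, stratum_size)
    p_second = n_selected / n_listed
    return 1.0 / (p_first * p_second)

# Hypothetical EA: 40 EAs sampled in a district of 60,000 frame households,
# EA frame size 150, 160 households listed, 12 interviewed.
print(round(household_weight(40, 150, 60_000, 12, 160), 2))
```

Since every sample EA reached its full 9 or 12 completed interviews, `n_selected` equals the number of completed interviews and no separate nonresponse factor appears in the sketch.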
2) Weighting procedures for households from EICV3 villages
The probability of selection is the original probability of selection for the EICV3 sample households multiplied by the probability of selecting the subsample of EICV3 villages for the panel in the district. The basic sampling weight is calculated as the inverse of the probability of selection. Since this weight is based on the number of completed CS interviews inside each sample cluster, it is automatically adjusted for any nonresponse and replacements.
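In symbols, the relationship described above can be written as follows. The symbols are notational assumptions introduced here: the first factor is the original EICV3 selection probability of the household, and the second is the probability that its village was subsampled for the panel within the district.

```latex
p_{\text{panel}} = p_{\text{EICV3}} \times p_{\text{sub}},
\qquad
w_{\text{panel}} = \frac{1}{p_{\text{panel}}}
```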
When the CS data for the new sample clusters and the panel clusters are combined, it is necessary to multiply the weights from each sample component by the proportion of the combined sample that comes from that component for the corresponding stratum.
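The combination step can be sketched in Python as follows. The sketch assumes that the proportion for each component is measured by its share of sample households within the stratum; the function name and the weights are hypothetical.

```python
def combine_weights(new_weights, panel_weights):
    """Scale each component's weights by its share of the combined sample.

    new_weights   -- basic weights of new-sample households in one stratum
    panel_weights -- basic weights of panel-cluster households in the stratum
    Returns the two lists of combined weights.
    """
    n_new, n_panel = len(new_weights), len(panel_weights)
    total = n_new + n_panel
    share_new = n_new / total
    share_panel = n_panel / total
    combined_new = [w * share_new for w in new_weights]
    combined_panel = [w * share_panel for w in panel_weights]
    return combined_new, combined_panel

# Hypothetical stratum: each component alone expands to 1,200 households.
new_w = [100.0] * 12
panel_w = [300.0] * 4
cn, cp = combine_weights(new_w, panel_w)
print(sum(cn) + sum(cp))
```

Each component on its own estimates the full stratum total, so scaling by the sample shares (which sum to one) keeps the combined weights summing to that same total rather than double-counting it.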
The final combined weight variable is included in all cross-sectional sample files. To correctly account for the sample design in the computation of standard errors, analysts should specify the primary sampling units (clust) and strata (district).
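To show why the PSU and stratum identifiers matter, the following Python sketch computes a weighted mean and a design-based standard error using Taylor linearization with the with-replacement (ultimate cluster) approximation and no finite population correction. It is an illustration under those assumptions, not the estimator prescribed by the survey documentation; in practice a survey analysis package would normally be used. The identifiers correspond to `district` and `clust` in the text, while the function name, the record layout and the data are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def weighted_mean_se(records):
    """Weighted mean and linearized standard error (illustrative sketch).

    records -- iterable of (stratum, cluster, weight, value) tuples,
               e.g. (district, clust, final weight, outcome)
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_w

    # Sum the linearized scores w * (y - mean) / total_w within each PSU.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for stratum, cluster, w, y in records:
        cluster_totals[stratum][cluster] += w * (y - mean) / total_w

    variance = 0.0
    for clusters in cluster_totals.values():
        n = len(clusters)
        if n < 2:
            continue  # a single-PSU stratum contributes nothing here
        totals = list(clusters.values())
        centre = sum(totals) / n
        # Between-PSU variability within the stratum.
        variance += n / (n - 1) * sum((t - centre) ** 2 for t in totals)
    return mean, sqrt(variance)

# Hypothetical data: one district, two clusters, one household each.
data = [("d1", "c1", 1.0, 0.0), ("d1", "c2", 1.0, 2.0)]
print(weighted_mean_se(data))
```

Treating the households as a simple random sample would ignore the similarity of households within the same cluster and would typically understate the standard errors.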