Essentially all empirical questions that are addressed with sample data require estimates of sampling variance. The econometrics and statistics literatures show that these estimates depend critically on the design of the sample. The sample for the U.S. Current Population Survey (CPS), which serves as the basis for official poverty, unemployment, and earnings estimates, results from a stratified and clustered design. Unfortunately, analysts are frequently unable to estimate sampling variance for many CPS statistics because the variables marking the strata and clusters are censored from the public-use data files. To compensate for this, the Bureau of the Census provides a method to approximate the sampling variance for several specific point estimates, but no general method exists for estimates that differ from these cases. Similarly, there are no corrections at all for regression estimates. This paper proposes a general approximation method that creates synthetic design variables for the estimation of sampling variance. The results from this method compare well with officially reported standard errors. This methodology allows the analyst to estimate sampling variance for a substantially wider class of estimates than previously possible, and therefore increases the usefulness of research resulting from the CPS data files.

