Understanding and teaching unequal probability of selection
This paper focuses on econometrics pedagogy. It demonstrates the importance of including probability weights in regression analysis when the data come from surveys that do not use simple random sampling (SRS). We use concrete, numerical examples and simulation to show how to effectively teach this difficult material to a student audience. We relax the assumption of simple random sampling and show how unequal probability of selection can lead to biased, inconsistent OLS slope estimates. We then explain and apply probability-weighted least squares, showing how weighting the observations by the reciprocal of the probability of inclusion in the sample improves performance. The exposition is non-mathematical and relies heavily on intuitive, visual displays to make the content accessible to students. This paper will enable professors to incorporate unequal probability of selection into their courses and allow students to use best-practice techniques in analyzing data from complex surveys. The primary delivery vehicle is Microsoft Excel®. Two user-defined array functions, SAMPLE and LINESTW, are included in a prepared Excel workbook. We replicate all results in Stata® and offer a file for easy analysis. Documented code in Excel and Stata allows users to see each step in the sampling and probability-weighted least squares algorithms.
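The effect of weighting by the reciprocal of the inclusion probability can be illustrated with a small simulation. The sketch below is written in Python rather than Excel or Stata and is not the authors' SAMPLE or LINESTW code. Everything in it is an assumption made for illustration: the two-group population (80% with slope 1, 20% with slope 3), the inclusion probabilities (0.01 and 0.04), and the helper names `wls_line` and `simulate`. Because the regressor is drawn independently of group membership, the population slope is about 0.8 × 1 + 0.2 × 3 = 1.4, while the oversampled group makes up roughly half of the sample, so the unweighted sample slope should drift toward 2.

```python
import random


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def simulate(seed=0, n_pop=200_000):
    """Return (population slope, unweighted slope, probability-weighted slope)."""
    rng = random.Random(seed)

    # Hypothetical population: 80% in group A (slope 1), 20% in group B (slope 3).
    pop = []
    for _ in range(n_pop):
        in_b = rng.random() < 0.2
        x = rng.uniform(0, 10)
        y = (3.0 if in_b else 1.0) * x + rng.gauss(0, 1)
        p = 0.04 if in_b else 0.01  # assumed design: group B is oversampled fourfold
        pop.append((x, y, p))

    # Unequal probability of selection: each unit enters the sample
    # independently with its own inclusion probability p.
    sample = [unit for unit in pop if rng.random() < unit[2]]

    px, py, _ = zip(*pop)
    sx, sy, sp = zip(*sample)

    _, pop_slope = wls_line(px, py, [1.0] * len(px))  # census benchmark
    _, ols_slope = wls_line(sx, sy, [1.0] * len(sx))  # ignores the sample design
    _, pw_slope = wls_line(sx, sy, [1.0 / p for p in sp])  # weights = 1 / p
    return pop_slope, ols_slope, pw_slope


if __name__ == "__main__":
    pop_b, ols_b, pw_b = simulate()
    print(f"population slope:           {pop_b:.3f}")
    print(f"unweighted sample slope:    {ols_b:.3f}")
    print(f"probability-weighted slope: {pw_b:.3f}")
```

Each sampled unit with inclusion probability p stands in for roughly 1/p population units, so the weighted fit restores the population mix of the two groups. The unweighted fit treats the sample as if it were a simple random sample and recovers the slope of the distorted mix instead.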
Barreto, Humberto and Raghav, Manu, Understanding and Teaching Unequal Probability of Selection (June 30, 2011). Available at SSRN: https://ssrn.com/abstract=1887806 or http://dx.doi.org/10.2139/ssrn.1887806