Last week, we discussed weighted least squares in my methods class. As a method for dealing with heteroskedastic errors, it has a lot of opponents, notably Angrist and Pischke of Mostly Harmless Econometrics fame. The form of heteroskedasticity, or rather the lack of information about the form, can make weighting a useless proposition. If we’re wrong about the form, which we most likely are, weighting can make the estimates less reliable rather than more.
With regard to population weighting, however, Angrist and Pischke are clearer: we absolutely should weight samples to reflect populations. I’m chest-deep in edits right now, trying to get out my paper on match quality and reading to children, and struggling again with the weighting issue. My advisor, seminar participants at the Census Bureau, and Angrist and Pischke all say to weight for population. If we don’t, the results aren’t meaningful, or so the story goes.
My gut and a friendly editor say, “Isn’t it enough to learn about this sample itself? Why try to extrapolate to the whole population?” Especially when the sample was chosen to identify particular characteristics, I wonder how or whether weighting is a useful exercise. Or rather, I wonder why not weighting is somehow less meaningful.
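One way to see what is at stake is a small simulation. The sketch below is purely illustrative and has nothing to do with my actual data: the two groups, their slopes, the sampling rates, and the helper `wls_fit` are all made up for the example. It builds a hypothetical population in which the effect of `x` differs by group, draws a sample that oversamples the smaller group, and compares the unweighted slope with one weighted by inverse sampling probability.

```python
import numpy as np

def wls_fit(x, y, w=None):
    """Least squares of y on a constant and x, optionally weighted.

    Returns the array (intercept, slope). Hypothetical helper for this sketch.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into plain least squares
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

rng = np.random.default_rng(0)

# Made-up population: about 90% in group 0 (slope 1.0), 10% in group 1 (slope 3.0).
n_pop = 100_000
group = (rng.random(n_pop) < 0.10).astype(int)
x = rng.normal(size=n_pop)
y = np.where(group == 1, 3.0, 1.0) * x + rng.normal(size=n_pop)

# Made-up sampling design: group 1 is sampled at ten times the rate of group 0.
p_sample = np.where(group == 1, 0.20, 0.02)
in_sample = rng.random(n_pop) < p_sample
xs, ys = x[in_sample], y[in_sample]
ws = 1.0 / p_sample[in_sample]  # inverse-probability (population) weights

pop_slope = wls_fit(x, y)[1]             # slope in the full population
unweighted_slope = wls_fit(xs, ys)[1]    # describes the sample as drawn
weighted_slope = wls_fit(xs, ys, ws)[1]  # reweighted toward the population

print(f"population slope:        {pop_slope:.2f}")
print(f"unweighted sample slope: {unweighted_slope:.2f}")
print(f"weighted sample slope:   {weighted_slope:.2f}")
```

In a setup like this, the unweighted slope is an average effect for the sample as drawn, tilted toward the oversampled group, while the weighted slope targets the population average. Neither number is wrong; they answer different questions, which is more or less the debate I’m having with myself.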
Any suggestions on how to resolve this internal debate?