The Rise of Curated Sampling
Pretty much everyone who paid attention and passed Stat 101 got a good laugh at the expense of Daniel Webster, the first-term Republican congressman from Florida who recently sponsored legislation designed to eliminate the American Community Survey. "We're spending $70 per person to fill this out. That's just not cost effective," the mathematically challenged congressman insisted, "especially since in the end this is not a scientific survey. It's a random survey."
Clearly, Congressman Webster has no clue about survey sampling. Ha, ha, ha.
But this incident got me thinking: Is Webster's problem simply that he doesn't understand basic statistics, like so many other people in the US? Was he just trying to knock out the American Community Survey because, from his perspective, it represents a government intrusion into the private lives of citizens? Probably, since that was his main argument against it: "This is a program that intrudes on people's lives, just like the Environmental Protection Agency or the bank regulators."
But I can't help but wonder about what Webster meant by "random" and "scientific." He seems to be saying that a "random" survey would be improved by making it more "scientific." Those of us who got good grades in Stat 101 heard this as contradictory: it's the randomness of the sample that makes the study scientific.
That's clearly not the way Webster understood things. What does he mean by "random," and how is that opposed to "scientific"?
I turned to one of my favorite sources for American cultural intelligence, Urban Dictionary. Here's the current top definition returned by a search there for the word "random":
Uni Student: "Cheese! HA HA!"
Another Uni Student: "Wow that's soooo RANDOM! Let's go and buy some trendy clothes which have meaningless and pretentious words/numbers all over to make us look random."
Here's another (currently definition #6):
Teenage girl 1: Guess what? Cheese! Haha!
Teenage girl 2: Ninja monkeys steal my underwear at night!
Teenage girl 3: Monkey! LOL!
These interchanges sound like they could, in fact, have been produced by a random process. They're almost schizophrenic - they "come out of left field." They're non-sequiturs. Cheese is followed by monkeys and underwear. And, perhaps as important as anything else, this kind of "randomness" is clearly despised as both meaningless and pretentious.
Who in their right mind would want a "random survey" if this is what "random" meant to them?
Urban Dictionary's definition of "scientific" is every bit as informative: "1. pertaining to science. 2. an educated guess passed off as a well thought out explanation that uses large words and concepts often unknown to everyone in the room. 3. a plausible explanation for something not easily explained given to the dismay and perhaps delight of those who are present."
This is as snarky as the stuff about "random," but it comes off quite a lot better. At least an "educated guess" is involved. Although it may be "passed off as well thought out," at least it uses concepts that go beyond the everyday. And it plausibly explains something of interest.
A "scientific survey" would clearly be better than a "random" one, even if it was powered by guesswork.
All of this got me thinking about some of the trends that have been evident in the work we do.
For at least a decade there has been a steady increase in what we sometimes call "curated sample": sample that is carefully, often elaborately, specified. These sample definitions are constructed with the ostensible purpose of ensuring that only real "customers" or "brand loyalists" get surveyed, by terminating potential respondents at the slightest sign of deviation from a "profile." (We once had an actual screener that had 40 termination points. If every one eliminated half of the remaining sample, roughly .0000000001% would have survived to eventually participate in the project!)
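For anyone who wants to check that back-of-the-envelope figure, here is a minimal sketch in Python. It assumes, as the parenthetical does, that each of the 40 termination points screens out exactly half of whoever is left. The function name and its parameters are mine, for illustration only.

```python
def surviving_fraction(termination_points: int, pass_rate: float = 0.5) -> float:
    """Fraction of the starting sample left after every termination point.

    Assumes each termination point passes the same share of the
    respondents who reach it (one half, in the screener example).
    """
    return pass_rate ** termination_points


fraction = surviving_fraction(40)

# Express the surviving share as a percentage of the starting sample.
print(f"{fraction * 100:.10f}% of the starting sample survives")

# How many people you would have to invite to expect one completed interview.
print(f"{1 / fraction:,.0f} invitations per expected complete")
```

Under that halving assumption, you would need to start with roughly 1.1 trillion people to expect a single respondent to make it through.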
The kind of stratified random samples that were commonplace in the days of phone studies have given way, online, to elaborate quota sample. Nested quotas, layered like wedding cakes.
Qualitative research has suffered, if possible, even more than quantitative. "Recruit 10 to show for 8 to be interviewed" has now become "Recruit 20 so we can talk to everyone who shows up and decide which ones we want to hear from." The same principles apply to online bulletin boards and MROCs.
Thanks to Rep. Webster, I think I may understand what's been going on. We're being asked to eliminate "random" people and make our work more "scientific" through our well-thought-out, educated intervention.