Crafting Surveys with the Mobile User in Mind - Part 2


By Walt Dickie, Executive Vice President

If you've followed this argument so far, you know that it's headed straight for Google Surveys.
Like everyone else in the MR business, C+R has been debating the eventual significance of Google's entrance into our world.

I'd like to focus on Google's unique approach to the structure of a questionnaire: "With Google Consumer Surveys, you can run multi-question surveys by asking people one question at a time." This seems designed with interstitial use firmly in mind. It is pure "mobile first, web second" design.

What is sacrificed, of course, is a true respondent-level record. Instead, as the accompanying white paper, "Comparing Google Consumer Surveys to Existing Probability and Non-Probability Based Internet Surveys," explains, Google uses the power of its Big Data store to predict respondent-level data, "using the respondent's IP address and DoubleClick cookie" to generate "inferred demographic and location information." Then, on the back end, Google re-weights the data to better approximate the Internet population. Their "possible weighting options, ordered by priority, include: (age, gender, state), (age, gender, region), (age, state), (age, region), (age, gender), (age), (region), (gender)." The result of this manipulation "produces a close approximation to a random sample of the US Internet population and results that are as accurate as probability based panels."
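To make the mechanics concrete, here is a minimal sketch of how a priority-ordered weighting scheme along those lines might work. The priority list is taken straight from the white paper, but the minimum-cell threshold, the fallback rule, and the simple share-ratio weights are my own assumptions - Google doesn't publish those details.

```python
import pandas as pd

# Weighting dimension sets quoted from Google's white paper, ordered by priority.
PRIORITY = [
    ("age", "gender", "state"),
    ("age", "gender", "region"),
    ("age", "state"),
    ("age", "region"),
    ("age", "gender"),
    ("age",),
    ("region",),
    ("gender",),
]

MIN_CELL = 30  # assumed minimum respondents per cell; the paper doesn't publish one


def pick_dims(sample: pd.DataFrame) -> tuple:
    """Return the highest-priority dimension set whose smallest cell is big enough."""
    for dims in PRIORITY:
        if sample.groupby(list(dims)).size().min() >= MIN_CELL:
            return dims
    return PRIORITY[-1]  # fall through to the coarsest option


def cell_weights(sample: pd.DataFrame, population: pd.DataFrame, dims: tuple) -> pd.Series:
    """Weight each respondent by population share / sample share of their cell.

    `population` is assumed to hold one row per cell with a 'count' column
    of census-style totals; `sample` holds one row per respondent with the
    inferred demographic columns.
    """
    pop = population.groupby(list(dims))["count"].sum()
    pop = pop / pop.sum()                                    # population cell shares
    samp = sample.groupby(list(dims)).size() / len(sample)   # sample cell shares
    ratio = (pop / samp).rename("weight")
    return sample.join(ratio, on=list(dims))["weight"]
```

The design choice worth noticing is the graceful degradation: when the sample can't support fine-grained (age, gender, state) cells, the scheme falls back to coarser ones rather than failing outright.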


Today, in keeping with Google's long-established pattern of launching minimal products and iterating, the offering is somewhat primitive. Surveys are very short, and analytic cross-breaks are limited to demographic and location variables. Still, I see no reason in principle why the same predictive approach to creating a synthetic respondent record could not be extended further, or why questionnaires could not be of unlimited length, given sufficiently large data sets and sample availability.

Once you start thinking about synthetic or predictive approaches to survey data, lots of opportunities open up that were closed before. I need to think more about these, and much more deeply about the modeling some of them would involve, but here are a few approaches that have come to mind so far:


  • Pre-plan which variables must be associated within a single respondent record for analytic purposes, then break a long "web first, mobile second" survey into very short sub-sections, keeping associated variables together within modules.

  • Create many short surveys such that every variable appears with every other variable enough times to model the "complete" data set, with the help of Google-style inferred variables. (See the sketch after this list.)

  • Administer the long, full "web first, mobile second" survey to an "Immobile" or "Heritage" sample pool that will take a survey on a laptop/desktop. Break the same survey into short modules for mobile respondents, and use Google-style inferred variables plus the long-survey data to model the "complete" data set.

  • Accept the diminishing representativeness of Immobile/Heritage sample, as well as any other penalties/costs (time, expense, etc.), and conduct projects that require long surveys on Heritage sample only. (This is what happens today when projects use representative telephone sample.)

  • Work with clients to systematically develop libraries of data from customer communities, social media, etc., with associated demographic and customer-graphic information. Sample from customer bases using the information in this data set to collect data on any "ad hoc/custom" variables not represented in the library, and model the "complete" data set using a short survey plus the library. (Essentially a proprietary version of the Google model.)
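To illustrate the modular idea in the bullets above, here is a rough sketch of stitching short survey modules into a modeled "complete" data set. Everything in it is placeholder: the module design, the fabricated data, and the choice of scikit-learn's IterativeImputer as the modeling step are my assumptions, meant only to show the shape of the approach, not a production method.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)


def short_module(asked, n=200):
    """One short survey: inferred demographics plus a subset of questions."""
    df = pd.DataFrame({
        "age": rng.integers(18, 75, n),     # Google-style inferred variables
        "region": rng.integers(1, 5, n),
    })
    for q in ("q1", "q2", "q3", "q4", "q5"):
        # Questions the module didn't ask are structurally missing.
        df[q] = rng.integers(1, 6, n) if q in asked else np.nan
    return df


# Three modules designed so every question co-occurs with every other
# question in at least one module (the "every variable appears with every
# other variable" condition from the list above).
stacked = pd.concat([
    short_module({"q1", "q2", "q3"}),
    short_module({"q3", "q4", "q5"}),
    short_module({"q1", "q2", "q4", "q5"}),
], ignore_index=True)

# Model the "complete" data set: each question is regressed on the others
# (and on the inferred demographics), iterating until imputations settle.
imputer = IterativeImputer(max_iter=10, random_state=0)
complete = pd.DataFrame(imputer.fit_transform(stacked), columns=stacked.columns)
```

With real data, the hard problems this glosses over - whether the imputation model is adequate, how much overlap the modules need, and how to express the added uncertainty in reported estimates - are exactly the large analytic and statistical challenges mentioned below.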

I'm sure there are other options to consider, and anyone can see at a glance that these options present large analytic, statistical, operational, and cost challenges. But assuming that MR will continue to need surveys longer than can fit into a single interstitial moment, something along the lines of what Google is doing will be necessary. And, if that's right, does the industry have the resources - in expertise, massive individualized data sets, and investment capital - to compete with the likes of Google?