I recently had the opportunity to hear David Rothschild from Microsoft Research speak about polling strategies at the Roper Center at Cornell. He has a remarkable record of polling accuracy, including election predictions, and he was kind enough to share some of his insights with the PollQ community.
What are some of the elements that go into creating a great poll question?
The most important thing is to make sure you think very clearly about what you want to know. The answer will reflect both the question and the ability of people to respond to it. I try to start with the most direct question possible and deviate only when necessary. This may seem obvious, but in order to predict what will happen in an election, the industry asks people "who will YOU vote for?", and to predict how well a company's stock will do, the industry asks "would you recommend this company to a friend?". These questions are very indirect and, consequently, not always that powerful in predicting what the marketer really wants to know.
When polling on a polarizing topic, you've mentioned that you may get richer results when you ask the question in a way that reflects the underlying components of the issue rather than just asking a more broad, general question. Can you elaborate on this and give some examples?
Most people have a partisan trigger when you ask them about Obamacare or Trump's tax plan. These questions become a proxy for party ID (or presidential approval). If you want to know how people really feel about Obamacare, ask them about providing Medicare for people up to 1.4x poverty, and if you want to know how people really feel about Trump's tax plan, ask them about a massive tax cut for higher incomes.
Eliminate the partisan trigger and ask them about a single, clear policy. Most people only have a vague notion of the aggregation of policies that is "Obamacare" or "Trump's tax plan" and will project their ideas onto the question.
What are some of the issues you think about when collecting demographic information?
Two things. First, what data will be useful for predicting responses to what we want to know (i.e., does X correlate with how people answer questions)? Second, what data has known values in the population (i.e., do we know the percent of X in the population)?
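To make the point about known population values concrete, here is a minimal sketch of post-stratification, one common way such values are used to reweight a sample. It is an illustration only and not necessarily the method Rothschild uses. The function name `poststratify`, the age groups, the respondent counts, and the population shares are all hypothetical.

```python
from collections import Counter


def poststratify(responses, population_shares):
    """Estimate the population share answering yes by reweighting groups.

    responses: list of (group, answer) pairs, where answer is True or False.
    population_shares: dict mapping each group to its known population share.
    """
    totals = Counter(group for group, _ in responses)
    yes = Counter(group for group, answer in responses if answer)

    estimate = 0.0
    for group, share in population_shares.items():
        if totals[group] == 0:
            raise ValueError(f"no respondents in group {group!r}")
        # Weight each group's yes-rate by its share of the population,
        # not by its share of the sample.
        estimate += share * yes[group] / totals[group]
    return estimate


# Hypothetical sample that over-represents younger respondents (80 of 100).
responses = (
    [("under_45", True)] * 60
    + [("under_45", False)] * 20
    + [("45_plus", True)] * 5
    + [("45_plus", False)] * 15
)

# Hypothetical known population shares (for example, from a census).
population_shares = {"under_45": 0.5, "45_plus": 0.5}

raw = sum(answer for _, answer in responses) / len(responses)
weighted = poststratify(responses, population_shares)

print(f"raw: {raw:.2f}")            # raw: 0.65
print(f"weighted: {weighted:.2f}")  # weighted: 0.50
```

In this made-up sample, age both correlates with the answer and has a known population value, so reweighting moves the estimate from 65% to 50%. A demographic that fails either test would not help in this way.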
When analyzing poll results, what are some key strategies you use for filtering the data?
Use the plan you had before gathering the data.
What's next for you at Microsoft Research and how can our community support your work?
Non-probability polling is faster, cheaper, and more flexible than probability-based polling. And, it is accurate enough. Embrace the future. When I started studying polling methodology, it was because I wanted to stay a few steps ahead of what could be next. Now, with falling phone response rates and mixed-mode coverage problems, we do not have the luxury, as an industry, to wait for what comes next. We need to embrace reality.
David Rothschild is an economist at Microsoft Research in New York City. He has a Ph.D. in applied economics from the Wharton School of Business at the University of Pennsylvania. His primary body of work is on forecasting and on understanding public interest and sentiment. Since joining Microsoft in 2012, he has been building prediction and sentiment models and organizing novel/experimental polling and prediction games; this work has been utilized by Bing, MSN, Cortana, and Xbox. He correctly predicted 50 of 51 Electoral College outcomes in February of 2012, an average of 20 of 24 Oscars from 2013 to 2016, and 15 of 15 knockout games in the 2014 World Cup. He is also a fellow at the Applied Statistics Center at Columbia and the Penn Program on Opinion Research and Election Studies.