What election missteps teach us about polling science

Just in time for the 2018 midterms, let’s revisit the polling industry’s rude awakening of 2016. Nate Silver, the most prominent interpreter of polls, has a comprehensive post-mortem that targets commentators more than pollsters. But pollsters, especially in some key states, got it wrong.

The absolute size of the errors was not unreasonable for most of the polls (roughly 3 to 4 percentage points), but those errors were not randomly distributed: nearly all of the polls missed in the same direction, favoring Hillary Clinton over Donald Trump. When errors consistently point in one direction rather than scattering randomly, the data contain a bias.

So what went wrong? Certain segments of the voting population were less willing to participate in the pre-election polls, and these segments voted overwhelmingly for Donald Trump. This problem is known in survey methodology as “biased nonresponse” and occurs whenever the people who don’t respond to surveys are systematically different from the people who do. While it may be tempting to believe that biased nonresponse is specific to the political polling industry, it threatens the validity of nearly all survey research. Yet very few researchers take steps to assess or address it.

How biased nonresponse hurts your data

Most researchers know that they will come to incorrect conclusions if only happy customers or unhappy employees participate in their surveys. Both of these cases are simplistic examples of biased nonresponse. When this happens, the results are not generalizable to the population that the sample of respondents was drawn from.

Nevertheless, not all instances of nonresponse bias are obvious. Consider a scenario in which an organization surveys its customers and achieves a respectable response rate of 50%. Unknown to the researcher, however, customers who use the product frequently are twice as likely to participate in the study as customers who use it infrequently, even though the two groups are equally large in the customer population.

Unaware of this mismatch between the sample and the survey population, the researcher would likely analyze the data without any adjustments and take the results to organizational stakeholders such as marketing or engineering. Because the undetected nonresponse bias is related to an outcome the organization cares about (how frequently the product is used), the conclusions drawn from the data, and the decisions based on them, are likely to be wrong.
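To make the effect concrete, here is a minimal simulation of that hypothetical scenario. The satisfaction scores, group sizes, and response rates below are all invented purely for illustration; the point is only that the naive respondent average drifts away from the true population average when frequent users respond at twice the rate of infrequent users.

```python
# Minimal simulation of the hypothetical scenario: frequent and infrequent
# users are equally common, but frequent users respond at twice the rate.
# All numbers are made up for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # customers per group

# Hypothetical outcome: frequent users rate the product higher on average.
frequent = rng.normal(8.0, 1.0, n)    # satisfaction scores, frequent users
infrequent = rng.normal(6.0, 1.0, n)  # satisfaction scores, infrequent users

true_mean = np.concatenate([frequent, infrequent]).mean()

# Response rates chosen so the overall rate works out to 50%: 2/3 vs. 1/3.
resp_frequent = frequent[rng.random(n) < 2 / 3]
resp_infrequent = infrequent[rng.random(n) < 1 / 3]
naive_mean = np.concatenate([resp_frequent, resp_infrequent]).mean()

print(f"true population mean:  {true_mean:.2f}")
print(f"naive respondent mean: {naive_mean:.2f}")  # noticeably too high
```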

How do you know if your research is being affected by biased nonresponse?

While most experienced researchers know that this kind of scenario can occur, very few take the steps necessary to determine whether their survey results are being influenced by nonresponse bias. Detecting nonresponse bias requires some additional steps in the research process, but when you are researching a known population (e.g., a list of customers or employees), it is actually not that difficult and is worth the added effort.

Follow these basic steps to identify nonresponse bias.

Step 1: Identify auxiliary variables that exist for all members of the population (e.g., customers or employees) prior to survey design.
In the case above, the critical variable for the organization is the frequency of product usage, which is information that the researcher can include in the sampling list. The researcher might also identify other variables that could be related to nonresponse, such as customer size and annual spend.
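As a rough sketch, the sampling frame with auxiliary variables attached might look something like the following; the column names and values here are hypothetical.

```python
# Hypothetical sampling frame: one row per customer in the population,
# with auxiliary variables known before the survey is fielded.
import pandas as pd

frame = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104],
        "usage_frequency": ["frequent", "infrequent", "frequent", "infrequent"],
        "company_size": [250, 40, 1200, 15],          # employees
        "annual_spend": [54000, 8000, 210000, 3000],  # USD
    }
)
```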
 
Step 2: Design and field your survey to collect the survey data.
In the example above, the organization would want to include questions to identify key drivers of product usage frequency.
 
Step 3: Merge the datasets that contain the auxiliary variables and the survey data.
In the example above, where the response rate was 50%, the merged dataset will contain twice as many rows (cases) as the survey dataset alone. Customers who participated in the study have their survey responses added, while those who did not remain in the dataset with only their auxiliary variables (a sketch of this merge appears after Step 4).
 
Step 4: Create a variable that indicates whether a customer did or did not participate in the survey.
In most cases, the simplest approach is to create a new variable (column) that is 0 if the customer did not participate in the survey and 1 if the customer did participate in the survey.
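Here is a minimal sketch of Steps 3 and 4 together, assuming a hypothetical customer_id key and made-up data: a left merge keeps every member of the population, and the response indicator simply flags whether any survey data came back for that customer.

```python
# Steps 3 and 4: merge the sampling frame with the survey data, then flag
# respondents. Column names and values are hypothetical.
import pandas as pd

frame = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104],
        "usage_frequency": ["frequent", "infrequent", "frequent", "infrequent"],
    }
)
survey = pd.DataFrame(
    {
        "customer_id": [101, 103],  # only half of the customers responded
        "satisfaction": [9, 7],
    }
)

# Step 3: left-merge so every member of the population keeps a row,
# whether or not they answered the survey.
merged = frame.merge(survey, on="customer_id", how="left")

# Step 4: 1 for respondents, 0 for non-respondents, based on whether any
# survey data is present for that customer.
merged["responded"] = merged["satisfaction"].notna().astype(int)

print(merged)
```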
 
Step 5: Calculate basic descriptive statistics (mean, median, variance, etc.) on the auxiliary variables for the customers who participated in the survey and compare them against those who did not.
In the case above, one would pay special attention to the auxiliary variable about the frequency of product use. Because the hypothetical scenario indicated that there is a nonresponse bias, the researcher would quickly discover that there are substantial differences in the frequency of product use between those customers who responded to the survey and those who did not.
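A sketch of what that comparison might look like, again with hypothetical column names: numeric auxiliary variables can be summarized with descriptive statistics by response status, and categorical ones compared as proportions.

```python
# Step 5: compare auxiliary variables for respondents vs. non-respondents.
# The data and column names are hypothetical.
import pandas as pd

merged = pd.DataFrame(
    {
        "usage_frequency": ["frequent", "infrequent", "frequent",
                            "infrequent", "frequent", "infrequent"],
        "annual_spend": [54000, 8000, 210000, 3000, 90000, 12000],
        "responded": [1, 0, 1, 0, 1, 1],
    }
)

# Descriptive statistics for a numeric auxiliary variable, by response status.
print(merged.groupby("responded")["annual_spend"].agg(["mean", "median", "std"]))

# For a categorical auxiliary variable, compare group proportions instead.
print(pd.crosstab(merged["responded"], merged["usage_frequency"], normalize="index"))
```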

You’ve found nonresponse bias in your data. Now what?

Just because a nonresponse bias exists in the data does not mean that the data are useless. In fact, it is entirely possible to find a nonresponse bias that is completely uncorrelated with outcomes that matter. For example, in the 2016 election, had the only nonresponse bias been that tall people were much more likely to participate in the pre-election polls than short people, it is unlikely that there would have been any impact on the accuracy of polling predictions.

Even when nonresponse bias is present, it can often be adjusted for with weighting. Recall that in our hypothetical scenario frequent users were twice as likely to participate in the customer survey as infrequent users. To adjust for this bias, we could give a weight of less than one to the frequent users who responded and a weight of more than one to the infrequent users who responded, with the weights chosen so that they average 1. A more sophisticated weighting scheme could account for multiple factors at once.
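One simple way to construct such weights, sketched below with the same hypothetical columns, is to give each respondent the ratio of their group’s share of the population to its share of the respondents; by construction, these weights average 1 across respondents.

```python
# A minimal weighting sketch for the hypothetical scenario. Each respondent's
# weight is (group's share of the population) / (group's share of respondents).
# Column names and values are hypothetical.
import numpy as np
import pandas as pd

merged = pd.DataFrame(
    {
        "usage_frequency": ["frequent"] * 4 + ["infrequent"] * 4,
        "responded":       [1, 1, 1, 1, 1, 1, 0, 0],
        "satisfaction":    [9, 8, 9, 8, 6, 7, np.nan, np.nan],
    }
)

pop_share = merged["usage_frequency"].value_counts(normalize=True)
resp = merged[merged["responded"] == 1]
resp_share = resp["usage_frequency"].value_counts(normalize=True)

# Frequent users are overrepresented among respondents, so they get a weight
# below 1; infrequent users get a weight above 1.
weights = (pop_share / resp_share).rename("weight")
resp = resp.join(weights, on="usage_frequency")

print(resp[["usage_frequency", "weight"]].drop_duplicates())
print("mean weight:", resp["weight"].mean())  # averages to 1 across respondents
print("unweighted mean:", resp["satisfaction"].mean())
print("weighted mean:", np.average(resp["satisfaction"], weights=resp["weight"]))
```

In this toy example the weighted mean comes out below the unweighted one, pulling the estimate back toward what the full customer population would have reported.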

By making these adjustments, you can be much more confident that the conclusions you reach about your sample of respondents will also generalize to the broader population the sample was drawn from. Without looking for evidence of nonresponse bias, it is difficult to be certain that your results are actually representative, even when you are using probability-based sampling from a known population.

Important caveat: You can only adjust for nonresponse bias that you can measure. In the scenario above, if the organization did not have auxiliary data on customers’ frequency of product use, it would not have been able to assess whether there was nonresponse bias, much less adjust for it with weighting. Alternatively, if none of the infrequent users had participated in the study at all, there would be no respondents to upweight and a weighting adjustment would be impossible. In that case, knowing about the nonresponse bias would at least tell you that the data are unusable.

A version of this post was published by Dave Vannette, PhD, in November 2016.