Our post last week outlined the steps of conducting a good survey. This week, we turn to sources of errors in survey research.

 

In this context, errors should not be interpreted to mean "mistakes"; rather, errors are sources of uncertainty in both the estimates drawn from the data and the inferences based on those estimates. Last week, we discussed how the goal of a survey is usually to make inferences about a larger population of interest. Evaluations of survey data quality typically reflect the degree of success in that effort.

 

Survey errors reduce, but don’t necessarily eliminate, our ability to accurately make inference to the larger population. Consequently, understanding survey errors is key to understanding survey data quality. Increasing error typically results in larger confidence intervals (reduced certainty) around the estimates in the data and inferences made about the population of interest. If these confidence intervals grow too large, the quality of the data and inferences can be degraded to the point of making them uninformative.
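
As a rough illustration of how uncertainty shows up as a wider confidence interval, the sketch below computes the margin of error for an estimated proportion at several sample sizes. It assumes simple random sampling and a normal approximation, and it captures only sampling variance, which is one of many sources of variable error. The function name `margin_of_error` and the sample sizes are illustrative choices, not part of the original post.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion.

    ASSUMPTION: simple random sampling, so only sampling variance is reflected.
    """
    # Two-sided critical value from the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


# Hypothetical sample sizes; p = 0.5 gives the widest interval for a proportion
for n in (250, 1000, 4000):
    print(n, round(margin_of_error(0.5, n), 3))
```

Under these assumptions, quadrupling the sample size halves the margin of error, which is one reason that budget decisions affecting sample size matter for data quality.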

 

The Total Survey Error (TSE) model** is a helpful conceptual framework for understanding sources of error and their effects on survey estimates and inferences. In this framework, the mean squared error (MSE) summarizes the combined effect of all variable errors (variance) and systematic errors (bias) for a particular survey. These errors are specific to a survey estimate or statistic. In practice the MSE is rarely measured comprehensively and precisely, but the goal is to estimate it as accurately as possible.
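
The standard statistical decomposition behind this idea is MSE = bias² + variance. The following is a minimal sketch of that identity, assuming we could repeat the same survey several times and knew the true population value; neither is usually possible in practice. The function `decompose_mse` and the numbers are hypothetical, chosen only for illustration.

```python
from statistics import fmean, pvariance


def decompose_mse(estimates, true_value):
    """Split the MSE of repeated survey estimates into bias and variance."""
    mean_estimate = fmean(estimates)
    # Systematic error: how far the estimates are from the truth on average
    bias = mean_estimate - true_value
    # Variable error: spread of the estimates around their own mean
    variance = pvariance(estimates, mu=mean_estimate)
    # Mean squared distance of each estimate from the true value
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return {"bias": bias, "variance": variance, "mse": mse}


# HYPOTHETICAL data: five replications estimating a true proportion of 0.50
estimates = [0.52, 0.55, 0.53, 0.56, 0.54]
parts = decompose_mse(estimates, true_value=0.50)
print(parts)
print(parts["bias"] ** 2 + parts["variance"])  # matches parts["mse"] up to rounding
```

In this made-up example most of the MSE comes from bias rather than variance, so a larger sample alone would do little to improve the estimate.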

 

Using the TSE framework, survey errors can be classified in three broad categories illustrated in the figure below.

[Figure TSE1: the three broad categories of survey error]

The list in each category of error above is not exhaustive, as there are many potential sources of error in surveys. The data collection method influences many sources of error and is often the primary focus for efforts aimed at reducing error.

 

For example, to reduce nonresponse error, a researcher may devote a larger portion of her budget to incentives, but this decision leaves less money for everything else, including sample size, which in turn affects other sources of error. When applying the TSE framework to survey design decisions, it is important to make every tradeoff explicitly and with as much information as possible. This will allow you to assess and account for the level of error associated with each design decision. The goal for most researchers will be to minimize error (maximize quality) within the constraints of a particular budget.

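
The incentive-versus-sample-size tradeoff can be sketched as a small search over design alternatives under a fixed budget. Everything numeric here is an assumption made up for illustration: the budget, the per-interview cost, and especially the formula linking incentives to nonresponse bias, which in a real study would come from the literature or prior data collection.

```python
def expected_mse(incentive, budget=10_000.0, base_cost=20.0):
    """Return (approximate MSE, completed interviews) for one incentive level.

    All parameter values and the bias model are HYPOTHETICAL.
    """
    # Completed interviews affordable once the incentive is added to each one
    n = int(budget // (base_cost + incentive))
    # ASSUMPTION: nonresponse bias shrinks as the incentive grows
    bias = 0.05 / (1.0 + 0.2 * incentive)
    # Sampling variance of a proportion near 0.5 under simple random sampling
    variance = 0.25 / n
    return bias ** 2 + variance, n


# Candidate incentive amounts per respondent (hypothetical)
candidates = [0, 5, 10, 20, 40]
results = {i: expected_mse(i) for i in candidates}
best = min(candidates, key=lambda i: results[i][0])

for i in candidates:
    mse, n = results[i]
    print(f"incentive={i:>2}  n={n:>3}  mse={mse:.6f}")
print("lowest-MSE incentive:", best)
```

With these invented numbers, neither the smallest nor the largest incentive minimizes MSE: spending nothing leaves the bias large, while spending too much shrinks the sample and inflates variance.
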
To determine the approach that will minimize TSE, the researcher must assess the likely level of error for each possible alternative procedure in the flow of survey design. In the figure below, we have added the likely sources of error to our survey data collection flow chart from last week to highlight the considerations and tradeoffs.

[Figure TSE2: survey data collection flow chart annotated with likely sources of error]

The success of applying the TSE framework depends on having good information about the costs and errors associated with each step and decision of the survey process. This information may be theoretical, drawn from the survey methodology literature, or empirical, drawn from prior survey data collection efforts. The key is to make use of all information available when making survey design decisions. In future weeks, we will focus on some of the key sources of survey error in greater depth, particularly sources of measurement error that can be controlled by the researcher at minimal cost.

 

**For excellent further reading, see Groves & Lyberg, 2010 (http://poq.oxfordjournals.org/content/74/5/849.short) and Biemer, 2010 (http://poq.oxfordjournals.org/content/74/5/817.short).