Researchers often face a difficult tradeoff: deciding whether to collect more data or better data.


This tradeoff is an important but often neglected dimension of the data collection process. A researcher who doesn’t consider it beforehand may finish collecting data, only to realize that the findings can’t be generalized to the population being studied.


Easy-to-Find Respondent Samples May Hurt Your Data


Not every data collection project requires a probability-based, nationally representative sample of respondents. Many researchers choose to use easy-to-find samples from customer lists, Amazon’s Mechanical Turk (MTurk) or college undergraduates because they’re accessible and inexpensive, meaning you can usually collect more data on a budget.


But if you’re hoping to generalize your results to a population that is broader than your customer list, the population of people working on MTurk, or the undergraduate psychology 101 class that you teach, then you risk collecting inaccurate data. And if your data are inaccurate, then it doesn’t really matter how much money you saved on data collection.
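To make the "more data versus better data" point concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the population size, the two groups, and their average scores are assumptions, not real survey results. It compares a large sample drawn only from an easy-to-reach group with a much smaller random sample drawn from the whole population.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 20% are "easy to reach" (say, on a customer list)
# and score higher on average than everyone else. All numbers are made up.
population = (
    [("easy", random.gauss(70, 10)) for _ in range(20_000)]
    + [("hard", random.gauss(50, 10)) for _ in range(80_000)]
)
all_scores = [score for _, score in population]
easy_scores = [score for group, score in population if group == "easy"]
true_mean = statistics.mean(all_scores)

# Large convenience sample: 5,000 respondents, all from the easy-to-reach group.
convenience_mean = statistics.mean(random.sample(easy_scores, 5_000))

# Small probability sample: 500 respondents drawn at random from everyone.
random_mean = statistics.mean(random.sample(all_scores, 500))

convenience_error = abs(convenience_mean - true_mean)
random_error = abs(random_mean - true_mean)

print(f"True population mean:        {true_mean:.1f}")
print(f"Convenience sample (n=5000): {convenience_mean:.1f}")
print(f"Random sample (n=500):       {random_mean:.1f}")
```

In this made-up setup the convenience sample is ten times larger, yet its estimate sits far from the true mean because the extra respondents all come from the same unrepresentative group. How large the gap is in practice depends entirely on how different your easy-to-find respondents are from the population you care about.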


How to Choose the Right Sample for Your Survey


The key to finding the best sample for your research question is to determine in advance exactly what population you want to generalize your results to. Once you know this, you can collect an appropriate sample from that population.


In many cases, the respondents you have easily available will be perfectly appropriate for your research question – lists of existing or past customers or employees will give you accurate feedback on your company, and your psychology 101 students can accurately tell you how well you taught the class.


It’s important to determine how useful your respondent source is before you begin surveying, so that you don’t waste time and money collecting inaccurate data. Even a source that doesn’t match your population can be useful at the right stage: if you’re hoping to pretest your survey or get a general idea of what your results might look like, MTurk is a good resource to use before you actually run your study on a sample matching your population of interest.


An Exception


One case where finding a sample that adequately represents your population of interest isn’t always necessary is experimental research. Random assignment of respondents to treatment conditions provides a level of practical and theoretical protection against many biases, regardless of the sample source. So if you are using an experimental design, aligning your sample with your population of interest becomes less crucial.
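Random assignment itself is simple to carry out. The sketch below shows one common way to do it, shuffling respondents and dealing them out to conditions in turn. The function name `assign_conditions`, the condition labels, and the respondent IDs are hypothetical and chosen only for illustration.

```python
import random

def assign_conditions(respondent_ids, conditions=("control", "treatment"), seed=None):
    """Randomly assign each respondent to one condition with balanced group sizes."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled respondents out to the conditions in turn, like cards.
    return {rid: conditions[i % len(conditions)] for i, rid in enumerate(shuffled)}

# Example: ten hypothetical respondents with IDs 1 through 10.
assignment = assign_conditions(range(1, 11), seed=42)
for rid in sorted(assignment):
    print(rid, assignment[rid])
```

Because chance alone decides who lands in which group, differences between respondents tend to balance out across conditions. This protects the comparison between groups. It does not, by itself, make the respondents representative of a wider population.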


As a general rule, thinking about how well your sample source addresses your research question before you begin surveying can help you collect better data and draw more valid and reliable conclusions from your data.