What is selection bias in research?
Selection bias, sometimes referred to as the selection effect, is a systematic error that occurs when proper randomization is not achieved. The result is an unrepresentative sample of the population. If selection bias is not taken into account, conclusions from the study are often rendered useless or inaccurate.
There are several reasons for selection bias, including:
- Inconsistencies amongst individuals or groups taking part in the study, e.g. not having similar characteristics to each other and to the larger population from which they are drawn
- Sample selection bias — rather than randomizing the sampling process, researchers choose who they want to study, affecting how representative and accurate the results are
- A lack of preplanned and publicized protocols — studies without plans or clear explanations of processes, for example, are prone to bias with respect to the selection of data and reported outcomes
- Volunteers to the study — those who join, i.e. self-select into the study, may share a characteristic that makes them different from your ideal population or target audience from the get-go.
What’s the impact of selection bias?
In research, there’s always the chance of random or systematic errors, including sampling or non-sampling errors, that affect the validity of results, but why should your business care about selection bias in particular?
Well, there are real-world consequences to inaccurate, unrepresentative data — from missed market opportunities to damaging the brand’s reputation (especially if studies are made public).
From a business perspective, there are several problems that arise from selection bias:
- Insights gleaned from unrepresentative samples (e.g. not matching the target audience) are far less useful for business planning and strategy. If business decisions go ahead based on these insights, there’s the potential for loss of revenue and reputation.
- Data is distorted and leads to unreliable research outcomes. This in turn affects the external validity of the analysis because the results come from a biased sample.
- If the end results are skewed and not representative of the topic under investigation, then the internal validity – how trustworthy the results are – is compromised and the research will be risky to trust and base business decisions on.
- If the results are published in a public-facing journal or news outlet, the ramifications can spread further, as other companies may act on insights and conclusions drawn from the flawed research.
According to healthcare researchers, Odgaard-Jensen J et al, ‘selection bias can have varying effects, and the magnitude of its impact and the direction of the effect is often hard to determine.’
Because of the uncertainty of the relative risk, awareness of selection bias, its types, and how to prevent it is critical to ensuring accurate, representative, and useful results.
What are the types of selection bias to be aware of?
Sampling bias — or sample selection bias — is when some members of a population are systematically more likely to be selected. Your choice of research design or data collection method can influence sampling bias, and sampling bias can occur in both probability and non-probability sampling.
Selective survival, also known as survivorship bias or survival bias (and related to immortal time bias), is when researchers concentrate on people, things or elements that have made it past some kind of selection process while overlooking those that did not. This leads to incorrect conclusions.
Example of selective survival
- A brand is looking to understand why employee turnover is so high, so they do research with their current employees. However, the people who can best explain why employees leave are those who have already left the organization. This is an example of selective survival bias or survivorship bias.
Observer bias, which is often discussed alongside recall bias and cherry-picking, is a bias of perception. It’s when researchers (or observers) disregard what’s in front of them and instead see what they want to see. This often occurs when researchers unconsciously apply their own values, expectations or non-standard behavior to the data they observe.
Example of observer bias
- A researcher from a UK-based food business may ask US target participants to describe a recipe and interpret the numerical results in the metric system used in the UK (liters and grams) instead of the US customary system of measurement (pounds and ounces).
- A researcher conducting time interval tasks with the sample may unconsciously round the time up or down to the nearest decimal place. As the bias is personal to them, other researchers won’t know whether they rounded up or down.
Volunteer bias, or self-selection bias, affects the sample of a study by filling the sample with people who have an agenda. Perhaps they are keen to share their biased views on the topic under investigation, or they were in the vicinity and available on the day, or maybe they have a desire to impress.
Whatever the reason, if the sample population is full of people who volunteer, rather than the ideal target research population, it’s highly likely that researchers will receive skewed results.
Example of volunteer bias
- In a clinical research study about abortion drugs, a person who has pro-life views might volunteer just to convince others to alter their views or condemn the topic.
- In a research study looking at the perceived quality of a new car entering the target market, a pro-car enthusiast may join because they consider themselves an expert. They may answer inappropriately or give extra information that wasn’t requested because of the self-selection bias.
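To see how self-selection can skew results, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the population is made up, opinions are scored from 1 to 10, and the chance of volunteering is assumed to rise with the score (enthusiasts are keener to take part). The function name simulate_volunteer_bias is hypothetical.

```python
import random
from statistics import mean


def simulate_volunteer_bias(population_size=10_000, sample_size=500, seed=42):
    """Compare a random sample with a self-selected one (illustrative only)."""
    rng = random.Random(seed)

    # Hypothetical population: each person holds an opinion score from 1 to 10.
    population = [rng.randint(1, 10) for _ in range(population_size)]

    # Random sample: every person has the same chance of being selected.
    random_sample = rng.sample(population, sample_size)

    # Self-selected sample: ASSUMED volunteering probability of score / 10,
    # so people with higher scores are more likely to opt in.
    volunteers = [score for score in population if rng.random() < score / 10]

    return mean(population), mean(random_sample), mean(volunteers)


population_mean, random_mean, volunteer_mean = simulate_volunteer_bias()
print(f"Population mean:       {population_mean:.2f}")
print(f"Random sample mean:    {random_mean:.2f}")
print(f"Volunteer sample mean: {volunteer_mean:.2f}")
```

Under these assumptions, the random sample should land close to the population average, while the volunteer sample drifts well above it, even though it contains far more respondents.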
For more information about survey bias types, read our article: Survey bias types that researchers need to know about
Where does selection bias come up?
Now, random errors can occur in research at any time — but there are certain research situations that may result in a higher likelihood of selection bias:
- Sampling bias or selection bias occurs when there aren’t a lot of participants available to make up a representative sample.
- In convenience sampling, selection bias occurs when participants are selected for a sample, based on who is conveniently available at the time or who volunteers.
- Selective survival bias can occur in studies that use pre-screening applications, which exclude some potential participants from the sample. This can lead to exclusion bias if there are no new potential participants to replace the removed ones.
- In observational studies (like cross-sectional or cohort studies), selection bias occurs when participant sampling isn’t random or representative, which leaves the study non-randomized.
How to avoid selection bias
If you feel your survey is at high risk of selection bias, what can you do? Here are our tips to review and help you minimize selection bias:
When you are designing your survey:
- Correctly identify your survey goals
- Give clearly defined requirements for your target audience
- Give all potential respondents an equal chance of participating
For your sampling:
- For your sample selection process, ensure you have an up-to-date participant list that covers the right target population, and draw a random sample from it. If you need help working out a good sample size, try our free sample size calculator tool.
- Use proper randomization in your sampling method. Try one of these four methods: simple random sampling, systematic sampling, stratified random sampling, and cluster sampling.
- Ensure subgroups are equivalent to the population (i.e. they share key characteristics)
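The four sampling methods mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production sampler: the participant list (frame), its "region" and "store" fields, and all function names are hypothetical, and the stratified version uses simple proportional allocation, so rounding may shift the total by a respondent or two on other data.

```python
import random
from collections import defaultdict


def simple_random_sample(frame, n, rng):
    # Every unit has the same chance of selection.
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    # Pick a random start, then take every step-th unit from the list.
    step = len(frame) // n
    start = rng.randrange(step)
    return [frame[start + i * step] for i in range(n)]


def stratified_sample(frame, n, rng, key):
    # Split the frame into strata, then sample each in proportion to its size.
    strata = defaultdict(list)
    for unit in frame:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(frame, n_clusters, rng, key):
    # Randomly choose whole clusters and keep every unit inside them.
    clusters = defaultdict(list)
    for unit in frame:
        clusters[key(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


# Hypothetical participant list: 1,000 people, 3 regions, 20 stores.
frame = [
    {"id": i,
     "region": "north" if i < 500 else "south" if i < 800 else "west",
     "store": i % 20}
    for i in range(1000)
]
rng = random.Random(7)

srs = simple_random_sample(frame, 100, rng)
systematic = systematic_sample(frame, 100, rng)
stratified = stratified_sample(frame, 100, rng, key=lambda u: u["region"])
clustered = cluster_sample(frame, 4, rng, key=lambda u: u["store"])
print(len(srs), len(systematic), len(stratified), len(clustered))
```

Which method fits best depends on your participant list: stratified sampling helps when key subgroups must be represented in proportion, while cluster sampling is usually chosen for practical reasons, such as surveying whole stores or sites at once.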
For review and validation, during and after:
- Have another researcher provide oversight of the primary researcher’s work so that there is someone checking for unconscious bias in the sample selection, process and data collection.
- Use technology to keep track of how the data is shaping up as you go so that you can identify any surprising results and investigate fast to correct or prevent distorted data.
- Investigate if there are any ‘baseline’ data trends from other studies that you can use as a guide to see if your data is on track for high internal validity.
- Ask non-respondents to participate in a follow-up survey. Sometimes, people are busy and miss the email invitation. A second round may provide more ‘yeses’ that will help create a better picture of results.
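One simple way to keep track of how the data is shaping up is to compare the make-up of your incoming sample with known baseline figures for your target population. The sketch below is a rough illustration only: the age bands, the baseline shares, the response counts and the 10-point tolerance are all made-up assumptions, and the function names are hypothetical.

```python
def representation_gaps(sample_counts, population_shares):
    """Return each group's share of the sample minus its population share."""
    total = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


def flag_groups(gaps, tolerance=0.10):
    """List groups whose sample share is off by more than the tolerance."""
    return sorted(group for group, gap in gaps.items() if abs(gap) > tolerance)


# ASSUMED baseline shares (e.g. from earlier studies) and responses so far.
population_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
sample_counts = {"18-34": 150, "35-54": 100, "55+": 50}

gaps = representation_gaps(sample_counts, population_shares)
flagged = flag_groups(gaps)

for group, gap in gaps.items():
    print(f"{group}: {gap:+.1%} versus baseline")
print("Groups to investigate:", flagged)
```

A positive gap means a group is over-represented in the responses so far, and a negative gap means it is under-represented, which is a cue to send reminders or widen recruitment for that group.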
If you have already conducted the research and assess that selection bias is present, you can assign a handicap or a ‘weight’ to answers coming from underrepresented groups or people with average views, over extreme ones. This allows for some balancing of the scores to make them fairer and acknowledge the distortions in the results from selection bias.
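As a rough illustration of weighting, the sketch below uses post-stratification, one common approach in which each group's weight is its population share divided by its share of the sample. It is not necessarily the only or best scheme for your study, and the age bands, shares, scores and function names are all hypothetical.

```python
from collections import Counter


def post_stratification_weights(groups, population_shares):
    """Weight per group = population share / share of the sample."""
    counts = Counter(groups)
    total = len(groups)
    return {g: population_shares[g] / (counts[g] / total) for g in counts}


def weighted_mean(responses, weights):
    """Average the scores, counting each response by its group's weight."""
    total_weight = sum(weights[group] for group, _ in responses)
    return sum(weights[group] * score for group, score in responses) / total_weight


# ASSUMED population shares and a sample that over-represents the youngest group.
population_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
responses = (
    [("18-34", 8.0)] * 150   # over-represented, high scores
    + [("35-54", 6.0)] * 100
    + [("55+", 4.0)] * 50    # under-represented, low scores
)

weights = post_stratification_weights([g for g, _ in responses], population_shares)
unweighted = sum(score for _, score in responses) / len(responses)
weighted = weighted_mean(responses, weights)

print("Weights:", {g: round(w, 2) for g, w in weights.items()})
print(f"Unweighted mean: {unweighted:.2f}")
print(f"Weighted mean:   {weighted:.2f}")
```

Under-represented groups receive weights above 1 and over-represented groups receive weights below 1, so the weighted average moves toward what a representative sample would have produced. Weighting can only correct for the characteristics you measure, so it reduces rather than removes the distortion.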
Repeating your survey or using control groups is another way to compensate when selection bias occurs, as you can combine this data with the first survey’s findings to produce a larger dataset to draw conclusions from.
How can technology help keep you on track?
Selection bias occurs when non-neutral samples and system issues come together. So, how can Qualtrics CoreXM, our “built-for-purpose” survey technology solution, help you prepare for and prevent selection bias?
- Makes creating surveys a breeze with easy-to-use drag-and-drop interfaces
- Gives you free survey templates that are based on expert research
- Recommends great questions that encourage feedback
- Is easy to access by the sample, anywhere and on any device
- Chases up and tracks communication with non-responsive participants
- Combines with marketing services like vetted audience research samples
- Collects, compares and analyzes incoming data in real time
- Presents the results in reports and dashboards, giving you constant visibility
- Alerts you or your team to inconsistencies in the data
- Delivers information to the right teams at the right times for action to be taken
The result? You’re left with the highest-quality sample involved in the research, providing a great completion rate, and a supportive system that gives you visibility of your progress.