What is sample size?
Sample size is the beating heart of any research project. It’s the invisible force that gives life to your data, making your findings robust, reliable and believable.
Sample size is what determines if you see a broad view or a focus on minute details; the art and science of correctly determining it involves a careful balancing act. Finding an appropriate sample size demands a clear understanding of the level of detail you wish to see in your data and the constraints you might encounter along the way.
Remember, whether you’re studying a small group or an entire population, your findings are only ever as good as the sample you choose.
Let’s delve into the world of sampling and uncover the best practices for determining sample size for your research.
How to determine sample size
“How much sample do we need?” is one of the most commonly asked questions, and one of the most common stumbling points, in the early stages of research design. Finding the right answer requires first understanding and answering two other questions:
- How important is statistical significance to you and your stakeholders?
- What are your real-world constraints?
How important is statistical significance to you and your stakeholders?
At the heart of this question is the goal to confidently differentiate between groups, by describing meaningful differences as statistically significant. Statistical significance isn’t a difficult concept, but it needs to be considered within the unique context of your research and your measures.
First, you should consider when you deem a difference to be meaningful in your area of research. While the standards for statistical significance are universal, the standards for “meaningful difference” are highly contextual.
For example, a 10% difference between groups might not be enough to merit a change in a marketing campaign for a breakfast cereal, but a 10% difference in efficacy of breast cancer treatments might quite literally be the difference between life and death for hundreds of patients. The exact same magnitude of difference has very little meaning in one context, but has extraordinary meaning in another. You ultimately need to determine the level of precision that will help you make your decision.
Think of sample size as a level of magnification: the larger the sample, the smaller the differences you can reliably detect. The lowest magnification (the smallest sample size) could make the most sense, given the level of precision needed, as well as timeline and budgetary constraints.
If you’re able to detect statistical significance at a difference of 10%, and 10% is a meaningful difference, there is no need for a larger sample size, or higher magnification. However, if the study will only be useful if smaller differences (say, a difference of 5%) can be detected as significant, the sample size must be larger to provide that precision. Similarly, if 5% is enough and 3% is unnecessary, there is no need for a sample large enough to detect a 3% difference.
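To make the 10% versus 5% comparison concrete, here is a rough Python sketch based on the standard normal-approximation formula for comparing two proportions. The 50% baseline, the 5% significance level and the 80% power are illustrative assumptions that do not come from this article, and `sample_size_per_group` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate respondents needed per group to tell p1 apart from p2.

    Normal-approximation formula for a two-sided test of two proportions.
    alpha and power are assumed defaults, not values from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Assumed baseline of 50% in one group, compared with 60% and with 55%.
n_for_10_points = sample_size_per_group(0.50, 0.60)
n_for_5_points = sample_size_per_group(0.50, 0.55)
print(n_for_10_points, n_for_5_points)
```

Under these assumptions, halving the difference you want to detect roughly quadruples the sample needed in each group.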
You should also consider how much you expect your responses to vary. When nearly everyone gives the same answer, any differences between groups will be very small, and it takes a lot more sample to be confident that such small differences are statistically significant.
For instance, it will take a lot more sample to find statistically significant differences between groups if you are asking, “What month do you think Christmas is in?” than if you are asking, “How many miles are there between the Earth and the moon?”. In the former, nearly everybody is going to give the exact same answer, so there is almost no difference between groups to detect, while the latter will give a lot of variation in responses. Simply put, when you expect answers to barely differ between groups, larger sample sizes make sense.
Statistical significance
The likelihood that the results of a study or experiment did not occur randomly or by chance, but are meaningful and indicate a genuine effect or relationship between variables.
Magnitude of difference
The size or extent of the difference between two or more groups or variables, providing a measure of the effect size or practical significance of the results.
Actionable insights
Valuable findings or conclusions drawn from data analysis that can be directly applied or implemented in decision-making processes or strategies to achieve a particular goal or outcome.
It’s crucial to understand the differences between the concepts of “statistical significance”, “magnitude of difference” and “actionable insights” – and how they can influence each other:
- Even if there is a statistically significant difference, it doesn’t mean the magnitude of the difference is large: with a large enough sample, a 3% difference could be statistically significant
- Even if the magnitude of the difference is large, it doesn’t guarantee that this difference is statistically significant: with a small enough sample, an 18% difference might not be statistically significant
- Even if there is a large, statistically significant difference, it doesn’t mean there is a story, or that there are actionable insights
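The first two points can be illustrated with a pooled two-proportion z-test. This is only a sketch: the group sizes and response rates are hypothetical, the helper name `two_proportion_p_value` is made up for this example, and 0.05 is assumed as the significance threshold.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value for a pooled two-proportion z-test with equal group sizes."""
    pooled = (p1 + p2) / 2  # with equal group sizes, the pooled rate is the simple average
    std_error = sqrt(pooled * (1 - pooled) * (2 / n_per_group))
    z = abs(p1 - p2) / std_error
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical results: a 3-point gap with 5,000 per group,
# and an 18-point gap with only 50 per group.
small_gap_large_sample = two_proportion_p_value(0.50, 0.53, 5000)
large_gap_small_sample = two_proportion_p_value(0.50, 0.68, 50)

print(f"3% difference, 5,000 per group: p = {small_gap_large_sample:.4f}")
print(f"18% difference, 50 per group:   p = {large_gap_small_sample:.4f}")
```

With these made-up numbers, the 3% difference falls below the 0.05 threshold while the 18% difference does not, which is why significance and magnitude have to be read together.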
There is no way to guarantee statistically significant differences at the outset of a study – and that is a good thing.
Even with a sample size of a million, there simply may not be any differences – at least, any that could be described as statistically significant. And there are times when a lack of significance is positive.
Imagine if your main competitor ran a multi-million dollar ad campaign in a major city and a huge pre-post study to detect campaign effects, only to discover that there were no statistically significant differences in brand awareness. This may be terrible news for your competitor, but it would be great news for you.
What are your real-world constraints?
As you determine your sample size, you should consider the real-world constraints to your research.
Factors revolving around timings, budget and target population are among the most common constraints, impacting virtually every study. But by understanding and acknowledging them, you can definitely navigate the practical constraints of your research when pulling together your sample.
Gathering a larger sample size naturally requires more time. This is particularly true for elusive audiences, those hard-to-reach groups that require special effort to engage. Your timeline could become an obstacle if it is particularly tight, causing you to rethink your sample size to meet your deadline.
Every sample, whether large or small, inexpensive or costly, signifies a portion of your budget. Samples could be like an open market; some are inexpensive, others are pricey, but all have a price tag attached to them.
Sometimes the individuals or groups you’re interested in are difficult to reach; other times, they’re a part of an extremely small population. These factors can limit your sample size even further.
What’s a good sample size?
A good sample size really depends on the context and goals of the research. In general, a good sample size is one that accurately represents the population and allows for reliable statistical analysis.
Larger sample sizes are typically better because they reduce the likelihood of sampling errors and provide a more accurate representation of the population. However, larger sample sizes often increase the impact of practical considerations, like time, budget and the availability of your audience. Ultimately, you should be aiming for a sample size that provides a balance between statistical validity and practical feasibility.
4 tips for choosing the right sample size
Choosing the right sample size is an intricate balancing act, but following these four tips can take away a lot of the complexity.
1) Start with your goal
The foundation of your research is a clearly defined goal. You need to determine what you’re trying to understand or discover, and use your goal to guide your research methods – including your sample size.
If your aim is to get a broad overview of a topic, a larger, more diverse sample may be appropriate. However, if your goal is to explore a niche aspect of your subject, a smaller, more targeted sample might serve you better. You should always align your sample size with the objectives of your research.
2) Know that you can’t predict everything
Research is a journey into the unknown. While you may have hypotheses and predictions, it’s important to remember that you can’t foresee every outcome – and this uncertainty should be considered when choosing your sample size.
A larger sample size can help to mitigate some of the risks of unpredictability, providing a more diverse range of data and potentially more accurate results. However, you shouldn’t let the fear of the unknown push you into choosing an impractically large sample size.
3) Plan for a sample that meets your needs and considers your real-life constraints
Every research project operates within certain boundaries – commonly budget, timeline and the nature of the sample itself. When deciding on your sample size, these factors need to be taken into consideration.
Be realistic about what you can achieve with your available resources and time, and always tailor your sample size to fit your constraints – not the other way around.
4) Use best practice guidelines to calculate sample size
There are many established guidelines and formulas that can help you in determining the right sample size.
The easiest way to define your sample size is to use a sample size calculator, or you can calculate it manually if you want to test your math skills. Cochran’s formula is perhaps the best-known equation for calculating sample size, and is widely used when the population is large or unknown.
Beyond the formula, it’s vital to consider two inputs. The first is the confidence interval, which plays a significant role in determining the appropriate sample size, especially when working with a random sample. The second is the sample proportion: the expected share of the target population that has the characteristic or response you’re interested in. Both have a big impact on the sample size you need.
If your population is small, or its variance is unknown, there are steps you can still take to determine the right sample size. Common approaches here include conducting a small pilot study to gain initial estimates of the population variance, and taking a conservative approach by assuming a larger variance to ensure a more representative sample size.
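As a minimal sketch of these guidelines, the Python below applies Cochran’s formula, n0 = z^2 x p x (1 - p) / e^2, and then the usual finite population correction for a small, known population. The 20% pilot estimate and the population of 1,000 are hypothetical values chosen for illustration, and the function names are not taken from any particular library.

```python
from math import ceil


def cochran_sample_size(z, p, e):
    """Cochran's formula for a large or unknown population: z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / e ** 2


def adjust_for_population(n0, population):
    """Finite population correction for a small, known population."""
    return n0 / (1 + (n0 - 1) / population)


z_95 = 1.96  # Z-score for a 95% confidence level

conservative = cochran_sample_size(z_95, p=0.5, e=0.05)  # worst case: largest variance
pilot_based = cochran_sample_size(z_95, p=0.2, e=0.05)   # hypothetical pilot estimate of 20%
small_population = adjust_for_population(conservative, population=1000)

print(ceil(conservative), ceil(pilot_based), ceil(small_population))
```

Assuming a proportion of 0.5 gives the largest requirement, which is why it is the conservative choice when no pilot estimate is available.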
Learn how to determine sample size
To choose the correct sample size, you need to consider a few different factors that affect your research, and gain a basic understanding of the statistics involved. You’ll then be able to use a sample size formula to bring everything together and sample confidently, knowing that there is a high probability that your survey is statistically accurate.
The steps that follow are suitable for finding a sample size for continuous data, i.e. data that is measured numerically. They don’t apply to categorical data, i.e. data that is put into categories like green, blue, male, female etc.
Stage 1: Consider your sample size variables
Before you can calculate a sample size, you need to determine a few things about the target population and the level of accuracy you need:
1. Population size
How many people are you talking about in total? To find this out, you need to be clear about who does and doesn’t fit into your group. For example, if you want to know about dog owners, you’ll include everyone who currently owns at least one dog. (You may include or exclude those who owned a dog in the past, depending on your research goals.) Don’t worry if you’re unable to calculate the exact number. It’s common to have an unknown number or an estimated range.
2. Margin of error (confidence interval)
Errors are inevitable: the question is how much error you’ll allow. The margin of error, which this guide also refers to as the confidence interval, is the most difference you’ll allow between the mean of your sample and the mean of your population. (Strictly speaking, the confidence interval is the range that extends one margin of error on either side of your sample result.) If you’ve ever seen a political poll on the news, you’ve seen a confidence interval and how it’s expressed. It will look something like this: “68% of voters said yes to Proposition Z, with a margin of error of +/- 5%.”
3. Confidence level
This is separate from the similarly named confidence interval in step 2. It deals with how confident you want to be that the actual mean falls within your margin of error. The most common confidence levels are 90%, 95% and 99%.
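To see how the margin of error and the confidence level interact with sample size, the sketch below computes the margin of error for a poll result using the normal approximation for a proportion. The 68% figure comes from the poll example above; the sample sizes are hypothetical, and `margin_of_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(proportion, sample_size, confidence_level=0.95):
    """Approximate margin of error for a proportion at a given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)


# 68% is the poll result from the example; the sample sizes are made up.
for n in (100, 335, 1000):
    moe = margin_of_error(0.68, n)
    print(f"n={n}: 68% +/- {moe * 100:.1f}%")
```

Raising the confidence level or shrinking the sample widens the margin of error, so the two settings need to be chosen together.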
4. Standard deviation
This step asks you to estimate how much the responses you receive will vary from each other and from the mean number. A low standard deviation means that all the values will be clustered around the mean number, whereas a high standard deviation means they are spread out across a much wider range with very small and very large outlying figures. Since you haven’t yet run your survey, a safe choice is a standard deviation of .5 which will help make sure your sample size is large enough.
Stage 2: Calculate sample size
Now that you’ve got answers for steps 1 – 4, you’re ready to calculate the sample size you need. This can be done using an online sample size calculator or with paper and pencil.
1. Find your Z-score
Next, you need to turn your confidence level into a Z-score. Here are the Z-scores for the most common confidence levels:
- 90% – Z Score = 1.645
- 95% – Z Score = 1.96
- 99% – Z Score = 2.576
If you chose a different confidence level, use this Z-score table (a resource owned and hosted by SJSU.edu) to find your score.
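If you would rather compute the Z-score than look it up, Python’s standard library can do it. This is a small sketch: `z_score` is a hypothetical helper name, and it assumes the usual two-sided interpretation of a confidence level.

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided Z-score for a confidence level given as a fraction, e.g. 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} -> Z = {z_score(level):.3f}")
```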
2. Use the sample size formula
Plug your Z-score, standard deviation and margin of error (confidence interval) into the sample size calculator, or use this sample size formula to work it out yourself:
Sample size = ((Z-score)^2 x StdDev x (1 - StdDev)) / (margin of error)^2
This equation is for an unknown population size or a very large population size. If your population is smaller and known, just use the sample size calculator.
What does that look like in practice?
Here’s a worked example, assuming you chose a 95% confidence level, .5 standard deviation, and a margin of error (confidence interval) of +/- 5%.
((1.96)^2 x .5(.5)) / (.05)^2
(3.8416 x .25) / .0025
.9604 / .0025
= 384.16, which rounds up to 385 respondents needed
Voila! You’ve just determined your sample size.
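The same calculation can be scripted so you can try other settings. This is a sketch rather than a definitive implementation: it reads the .5(.5) term in the worked example as StdDev x (1 - StdDev), which is an assumption about the formula, and `sample_size` and `Z_SCORES` are names invented for this example.

```python
from math import ceil

# Z-scores for the common confidence levels listed in this guide.
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}


def sample_size(confidence_level, std_dev, margin_of_error):
    """Sample size for an unknown or very large population, rounded up."""
    z = Z_SCORES[confidence_level]
    return ceil(z ** 2 * std_dev * (1 - std_dev) / margin_of_error ** 2)


respondents = sample_size(0.95, std_dev=0.5, margin_of_error=0.05)
print(respondents)  # 385, matching the worked example
```

Rounding up rather than to the nearest whole number ensures the sample never falls short of the calculated requirement.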