What is ANOVA?
ANOVA stands for Analysis of Variance. It’s a statistical test that was developed by Ronald Fisher in 1918 and has been in use ever since. Put simply, ANOVA tells you if there are any statistical differences between the means of three or more independent groups.
One-way ANOVA is the most basic form. There are other variations that can be used in different situations, including:
- Two-way ANOVA
- Factorial ANOVA
- Welch’s F-test ANOVA
- Ranked ANOVA
- Games-Howell pairwise test
How does ANOVA work?
Like the t-test, ANOVA helps you find out whether the differences between groups of data are statistically significant. It works by analyzing the levels of variance within the groups through samples taken from each of them.
If there is a lot of variance (spread of data away from the mean) within the data groups, then there is more chance that the mean of a sample selected from the data will be different due to chance.
As well as looking at variance within the data groups, ANOVA takes into account sample size (the larger the sample, the less chance there will be of picking outliers for the sample by chance) and the differences between sample means (if the means of the samples are far apart, it’s more likely that the means of the whole group will be too).
All these elements are combined into an F value, which can then be analyzed to give a probability (a p-value) of whether or not the differences between your groups are statistically significant.
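The mechanics above can be illustrated with a short sketch (hypothetical data): the F value is the ratio of the between-group variance (how far the group means sit from the overall mean) to the within-group variance (how spread out each group is internally).

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three independent groups
groups = [
    np.array([4.0, 5.0, 6.0, 5.5, 4.5]),
    np.array([6.0, 7.0, 6.5, 7.5, 6.0]),
    np.array([5.0, 4.0, 4.5, 5.5, 5.0]),
]

k = len(groups)                         # number of groups
n = sum(len(g) for g in groups)         # total sample size
grand_mean = np.concatenate(groups).mean()

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_value = (ss_between / (k - 1)) / (ss_within / (n - k))
p_value = stats.f.sf(f_value, k - 1, n - k)  # upper tail of the F distribution
```

A small p-value here would indicate that at least one group mean differs from the others more than chance alone would explain.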
A one-way ANOVA compares the effect of a single independent variable (a factor that influences other things) on a single dependent variable, across the groups that factor defines. A two-way ANOVA does the same thing but with two independent variables, while a factorial ANOVA extends the number of independent variables even further.
How can ANOVA help?
The one-way ANOVA can help you find out whether or not there are significant differences between the group means of your dependent variable across the levels of your independent variable.
Why is that useful?
Because when you understand how the mean differs from one group to the next, you can begin to understand which groups have a connection to your dependent variable (such as landing page clicks) and begin to learn what is driving that behavior.
You could also flip things around and see whether or not a single independent variable (such as temperature) affects multiple dependent variables (such as purchase rates of suncream, attendance at outdoor venues, and likelihood to hold a cook-out) and if so, which ones.
When might you use ANOVA?
You might use Analysis of Variance (ANOVA) as a marketer when you want to test a particular hypothesis. You would use ANOVA to help you understand how your different groups respond, with the null hypothesis for the test being that the means of the different groups are equal. If the result is statistically significant, it means that at least two of the population means are unequal (or different).
Examples of using ANOVA
You may want to use ANOVA to help you answer questions like this:
Do age, sex, or income have an effect on how much someone spends in your store per month?
To answer this question, a factorial ANOVA can be used, since you have three independent variables and one dependent variable. You’ll need to collect data for different age groups (such as 0-20, 21-40, 41-70, 71+), different income brackets, and all relevant sexes. A factorial ANOVA can then simultaneously assess the effect of these variables on your dependent variable (spending) and determine whether they make a difference.
Does marital status (single, married, divorced, widowed) affect mood?
To answer this one, you can use a one-way ANOVA, since you have a single independent variable (marital status). You’ll have 4 groups of data, one for each of the marital status categories, and for each one you’ll be looking at mood scores to see whether there’s a difference between the averages.
When you understand how the groups within the independent variable differ (such as widowed or single, not married or divorced), you can begin to understand which of them has a connection to your dependent variable (mood).
However, you should note that ANOVA will only tell you that the average mood scores across all groups are the same or are not the same. It does not tell you which one has a significantly higher or lower average mood score.
Understanding ANOVA assumptions
Like other types of statistical tests, ANOVA compares the means of different groups and shows you if there are any statistical differences between the means. ANOVA is classified as an omnibus test statistic. This means that it can’t tell you which specific groups were statistically significantly different from each other, only that at least two of the groups were.
It’s important to remember that the main ANOVA research question is whether the sample means are from different populations. There are two assumptions upon which ANOVA rests:
- Whatever the technique of data collection, the observations within each sampled population are normally distributed.
- The sampled populations have a common variance of σ².
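These two assumptions can be checked before running the test. As an illustrative sketch (hypothetical data), scipy’s Shapiro-Wilk test probes normality within each group, and Levene’s test probes whether the groups share a common variance:

```python
from scipy import stats

# Hypothetical samples from three groups
groups = [
    [23.1, 25.4, 24.8, 26.0, 24.2, 25.1],
    [27.3, 26.8, 28.1, 27.5, 26.2, 28.0],
    [22.4, 23.9, 23.1, 22.8, 24.0, 23.3],
]

# Shapiro-Wilk: the null hypothesis is that a group is normally distributed
for g in groups:
    w_stat, p_normal = stats.shapiro(g)
    print(f"Shapiro-Wilk p = {p_normal:.3f}")  # a small p suggests non-normality

# Levene: the null hypothesis is that all groups share a common variance
lev_stat, p_levene = stats.levene(*groups)
print(f"Levene p = {p_levene:.3f}")            # a small p suggests unequal variances
```

If either assumption looks violated, the Welch and ranked variants described below are the usual fallbacks.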
Types of ANOVA
From the basic one-way ANOVA to the variations for special cases, such as the ranked ANOVA for non-normally distributed data, there are a variety of approaches to using ANOVA for your data analysis. Here’s an introduction to some of the most common ones.
What is the difference between one-way and two-way ANOVA tests?
This is defined by how many independent variables are included in the ANOVA test. One-way means the analysis of variance has one independent variable. Two-way means the test has two independent variables. For example, a one-way test might use brand of drink as its only independent variable, while a two-way test might add a second independent variable, such as whether the drink is original or diet.
Factorial ANOVA is an umbrella term that covers ANOVA tests with two or more independent categorical variables. (A two-way ANOVA is actually a kind of factorial ANOVA.) Categorical means that the variables are expressed in terms of non-hierarchical categories (like Mountain Dew vs Dr Pepper) rather than using a ranked scale or numerical value.
Welch’s F Test ANOVA
Stats iQ recommends an unranked Welch’s F test if several assumptions about the data hold:
- The sample size is greater than 10 times the number of groups in the calculation (groups with only one value are excluded), and therefore the Central Limit Theorem satisfies the requirement for normally distributed data.
- There are few or no outliers in the continuous/discrete data.
Unlike the slightly more common F test for equal variances, Welch’s F test does not assume that the variances of the groups being compared are equal. Assuming equal variances leads to less accurate results when variances are not, in fact, equal, while Welch’s results are very similar to the equal-variance test’s when variances actually are equal.
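scipy has no built-in Welch’s ANOVA, but the statistic can be computed directly from Welch’s formula. A sketch with hypothetical data (groups chosen to have visibly unequal variances):

```python
import numpy as np
from scipy import stats

# Hypothetical groups; the middle group is much more spread out
groups = [
    np.array([5.1, 5.3, 5.0, 5.2, 5.4, 5.1]),
    np.array([6.0, 7.5, 5.5, 8.0, 6.5, 7.0]),
    np.array([4.0, 4.8, 4.2, 4.6, 4.4, 4.3]),
]

k = len(groups)
n = np.array([len(g) for g in groups])
means = np.array([g.mean() for g in groups])
variances = np.array([g.var(ddof=1) for g in groups])

w = n / variances                    # weight each group by its precision
grand = (w * means).sum() / w.sum()  # variance-weighted grand mean

# Welch's F statistic: weighted between-group term over a correction term
a = (w * (means - grand) ** 2).sum() / (k - 1)
lam = ((1 - w / w.sum()) ** 2 / (n - 1)).sum()
b = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

f_welch = a / b
df1 = k - 1
df2 = (k ** 2 - 1) / (3 * lam)       # Welch's adjusted denominator df
p_welch = stats.f.sf(f_welch, df1, df2)
```

Because each group is weighted by its own variance, a single noisy group no longer distorts the comparison the way it would in the equal-variance F test.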
When assumptions are violated, the unranked ANOVA may no longer be valid. In that case, Stats iQ recommends the ranked ANOVA (also called “ANOVA on ranks”); Stats iQ rank-transforms the data (replaces values with their rank ordering) and then runs the same ANOVA on that transformed data.
The ranked ANOVA is robust to outliers and non-normally distributed data. Rank transformation is a well-established method for protecting against assumption violation (a “nonparametric” method) and is most commonly seen in the difference between the Pearson and Spearman correlation. Rank transformation followed by Welch’s F test is similar in effect to the Kruskal-Wallis Test.
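The rank-transform step described above can be sketched as follows (hypothetical data with a deliberate outlier): pool all the values, replace each with its rank, split the ranks back into groups, and run the ordinary ANOVA on the ranks.

```python
import numpy as np
from scipy import stats

# Hypothetical skewed data; 25.0 in the first group is an outlier
groups = [
    np.array([1.2, 1.5, 1.1, 1.4, 25.0]),
    np.array([2.8, 3.1, 2.9, 3.3, 3.0]),
    np.array([1.9, 2.1, 2.0, 2.2, 1.8]),
]

# Rank all values together, then split the ranks back into their groups
pooled = np.concatenate(groups)
ranks = stats.rankdata(pooled)
split_points = np.cumsum([len(g) for g in groups])[:-1]
ranked_groups = np.split(ranks, split_points)

# Ordinary one-way ANOVA on the rank-transformed data
f_ranked, p_ranked = stats.f_oneway(*ranked_groups)

# For comparison: Kruskal-Wallis, the classic nonparametric analogue
h_stat, p_kw = stats.kruskal(*groups)
```

Note how the outlier becomes just the largest rank (15) rather than a value twenty times the rest, which is why the ranked test is robust to it.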
Note that Stats iQ’s ranked and unranked ANOVA effect sizes (Cohen’s f) are calculated using the F value from the F test for equal variances.
Games-Howell Pairwise Test
Stats iQ runs Games-Howell tests regardless of the outcome of the ANOVA test (as per Zimmerman, 2010). Stats iQ shows unranked or ranked Games-Howell pairwise tests based on the same criteria as those used for ranked vs. unranked ANOVA, so if you see “Ranked ANOVA” in the advanced output, the pairwise tests will also be ranked.
The Games-Howell is essentially a t-test for unequal variances that accounts for the heightened likelihood of finding statistically significant results by chance when running many pairwise tests. Unlike the slightly more common Tukey’s b-test, the Games-Howell test does not assume that the variances of the groups being compared are equal. Assuming equal variances leads to less accurate results when variances are not in fact equal, and its results are very similar when variances are actually equal (Howell, 2012).
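A sketch of how such a pairwise comparison can be computed (hypothetical data; uses scipy’s studentized range distribution, available in scipy 1.7+): each pair gets Welch-Satterthwaite degrees of freedom from its own two variances, and the studentized-range distribution supplies the multiplicity-adjusted p-value.

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Hypothetical groups with unequal variances
groups = [
    np.array([5.1, 5.3, 5.0, 5.2, 5.4, 5.1]),
    np.array([6.0, 7.5, 5.5, 8.0, 6.5, 7.0]),
    np.array([4.0, 4.8, 4.2, 4.6, 4.4, 4.3]),
]
k = len(groups)

results = {}
for i, j in combinations(range(k), 2):
    a, b = groups[i], groups[j]
    se2_a = a.var(ddof=1) / len(a)   # squared standard error of each mean
    se2_b = b.var(ddof=1) / len(b)
    # Welch-Satterthwaite degrees of freedom for this pair
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    # Studentized-range statistic; sf gives the adjusted p-value across k groups
    q = abs(a.mean() - b.mean()) / np.sqrt((se2_a + se2_b) / 2)
    results[(i, j)] = stats.studentized_range.sf(q, k, df)
```

Each entry of `results` is the adjusted p-value for one pair of groups, so you can see exactly which pairs differ rather than only the omnibus verdict.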
Note that while the unranked pairwise test tests for the equality of the means of the two groups, the ranked pairwise test does not explicitly test for differences between the groups’ means or medians. Rather, it tests for a general tendency of one group to have larger values than the other.
Additionally, while Stats iQ does not show results of pairwise tests for any group with less than four values, those groups are included in calculating the degrees of freedom for the other pairwise tests.
How to conduct an ANOVA test
As with many of the older statistical tests, it’s possible to do ANOVA using a manual calculation based on formulae. You can also run ANOVA using any number of popular stats software packages and systems, such as R, SPSS or Minitab. A more recent development is to use automated tools such as Stats iQ from Qualtrics, which make statistical analysis more accessible and straightforward than ever before.
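In Python, for instance, scipy’s `f_oneway` runs the classic (equal-variance) one-way ANOVA in a single call. A sketch with hypothetical monthly-spend data for three age groups:

```python
from scipy import stats

# Hypothetical monthly spend (in dollars) for three age groups
spend_young = [120, 135, 128, 140, 132]
spend_middle = [150, 160, 155, 148, 158]
spend_older = [110, 118, 115, 122, 112]

f_value, p_value = stats.f_oneway(spend_young, spend_middle, spend_older)
print(f"F = {f_value:.2f}, p = {p_value:.4f}")
```

A p-value below your chosen significance level (commonly 0.05) would indicate that at least two age groups differ in average spend.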
Stats iQ and ANOVA
Stats iQ from Qualtrics can help you run an ANOVA test. When you select one categorical variable with three or more groups and one continuous or discrete variable, Stats iQ runs a one-way ANOVA (Welch’s F test) and a series of pairwise “post hoc” tests (Games-Howell tests).
The one-way ANOVA tests for an overall relationship between the two variables, while the pairwise tests examine each possible pair of groups to see if one group tends to have higher values than the other.
How to run an ANOVA test through Stats iQ
The Overall Stat Test of Averages in Stats iQ acts as an ANOVA, testing the relationship between a categorical and a numeric variable by testing the differences between two or more means. This test produces a p-value to determine whether the relationship is significant or not.
To run an ANOVA in Stats iQ, take the following steps:
- Select a variable with 3+ groups and one with numbers
- Select “Relate”
- You’ll then get an ANOVA, a related “effect size”, and a simple, easy to understand summary
Qualtrics Crosstabs and ANOVA
You can run an ANOVA test through the Qualtrics Crosstabs feature too. Here’s how:
- Ensure your “banner” (column) variable has 3+ groups and your “stub” (rows) variable has numbers (like Age) or numeric recodes (like “Very Satisfied” = 7)
- Select “Overall stat test of averages”
- You’ll see a basic ANOVA p-value
What are the limitations of ANOVA?
Whilst ANOVA will help you to analyze the difference in means between three or more groups, it won’t tell you which specific groups were different from each other. If your test returns a significant F-statistic (the value you get when you run an ANOVA test), you may need to run a post hoc test (like the Least Significant Difference test) to tell you exactly which groups had a difference in means.
Additional considerations with ANOVA
- With smaller sample sizes, data can be visually inspected to determine if it is in fact normally distributed; if it is, unranked t-test results are still valid even for small samples. In practice, this assessment can be difficult to make, so Stats iQ recommends ranked t-tests by default for small samples.
- With larger sample sizes, outliers are less likely to negatively affect results. Stats iQ uses Tukey’s “outer fence” to define outliers as points more than three times the interquartile range above the 75th or below the 25th percentile point.
- Data like “Highest level of education completed” or “Finishing order in marathon” are unambiguously ordinal. Though Likert scales (like a 1 to 7 scale where 1 is Very dissatisfied and 7 is Very satisfied) are technically ordinal, it is common practice in social sciences to treat them as though they are continuous (i.e., with an unranked t-test).