Three Tips for Effectively Designing Rating Scales
Survey data are only as good as the questions asked and the way we ask them. To that end, let’s talk survey rating scales. Some of the common questions people ask are:
- How many scale points should I include in my rating scale?
- Do I want to give a middle response option?
- How should I label the response options?
To get started, let’s outline five basic goals for scale points and their labels:
- It should be easy to interpret the meaning of each scale point
- The meaning of scale points should be interpreted identically by all respondents
- The scale should give enough points to differentiate respondents from one another as much as validly possible
- Responses to the survey rating scale should be reliable, meaning that if we ask the same question again, each respondent should give the same answer
- The scale’s points should map as closely as possible to the underlying idea (construct) of the scale
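The reliability goal in that list can be checked roughly if you happen to ask the same question twice. Here is a minimal sketch in Python, assuming you have each respondent's answer from two administrations in the same order. The function name `agreement_rate` and the sample answers are made up for illustration, and a simple agreement share is only a quick check, not a formal reliability statistic.

```python
def agreement_rate(first, second):
    """Share of respondents who gave the same answer both times."""
    if len(first) != len(second):
        raise ValueError("both administrations must cover the same respondents")
    if not first:
        raise ValueError("need at least one respondent")
    # Count the respondents whose two answers match exactly.
    same = sum(a == b for a, b in zip(first, second))
    return same / len(first)


# Hypothetical answers from five respondents on a 5-point scale (coded 1-5),
# asked the same question on two occasions.
wave_1 = [5, 4, 3, 2, 4]
wave_2 = [5, 4, 2, 2, 4]

rate = agreement_rate(wave_1, wave_2)
print(rate)  # 0.8, since four of the five respondents repeated their answer
```

The closer the share is to 1, the more consistently respondents are placing themselves on the scale.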
Now that we’ve laid that groundwork, let’s jump back to the questions at the top.
How many scale points should I include?
The number of scale points depends on what sort of question you’re asking. If you’re dealing with an idea or construct that ranges from negative to positive (a bipolar construct; think satisfaction levels), use a 7-point scale that includes a middle or neutral point. In practice, this means a satisfaction question offers seven response options: three levels of dissatisfaction, a neutral midpoint, and three levels of satisfaction.
If you’re dealing with an idea or construct that ranges from zero to positive (a unipolar construct; think effectiveness), use a 5-point scale. The response options for this kind of question start at zero, meaning no effectiveness at all, and step up to the highest level.
Since it doesn’t make sense to measure negative effectiveness, this kind of 5-point scale is the best practice.
Tip: Always measure bipolar constructs with bipolar scales and unipolar constructs with unipolar scales.
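To make the two scale types concrete, here is a small Python sketch of how they might be written down when setting up a survey. The `RatingScale` class is hypothetical, and the label wordings are illustrative assumptions rather than required wording. The only things taken from the discussion above are the point counts (7 and 5) and the neutral midpoint on the bipolar scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingScale:
    """Hypothetical container for a rating scale and its labels."""
    name: str
    polarity: str  # "bipolar" or "unipolar"
    labels: tuple

    def midpoint(self):
        """Return the middle label of a bipolar scale, or None if unipolar."""
        if self.polarity != "bipolar":
            return None
        if len(self.labels) % 2 == 0:
            raise ValueError("a bipolar scale needs an odd number of points")
        return self.labels[len(self.labels) // 2]


# Illustrative labels only (an assumption, not prescribed wording).
SATISFACTION = RatingScale("satisfaction", "bipolar", (
    "Extremely dissatisfied",
    "Moderately dissatisfied",
    "Slightly dissatisfied",
    "Neither satisfied nor dissatisfied",
    "Slightly satisfied",
    "Moderately satisfied",
    "Extremely satisfied",
))

EFFECTIVENESS = RatingScale("effectiveness", "unipolar", (
    "Not at all effective",
    "Slightly effective",
    "Moderately effective",
    "Very effective",
    "Extremely effective",
))

print(len(SATISFACTION.labels), SATISFACTION.midpoint())
print(len(EFFECTIVENESS.labels), EFFECTIVENESS.midpoint())
```

Recording the polarity next to the labels makes it harder to pair a unipolar construct with a bipolar scale by accident.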
In short, the goal is to let respondents differentiate themselves as much as is validly possible without providing so many points that the measure becomes noisy or unreliable. Even on an 11-point (0-10) scale, respondents start to have difficulty placing themselves reliably, which lowers the quality of the rating scale. A 3 isn’t so different from a 4, and a 6 isn’t so different from a 7, so adding that many levels beyond the basic 5 or 7 makes survey measures more confusing.
Should I include a middle response option?
Many researchers complain that including middle alternatives basically allows respondents to avoid taking a position. Some even mistakenly assume that midpoint responses are disguised “Don’t knows” or that respondents are satisficing when they provide midpoint responses.
However, research suggests that midpoint responses don’t necessarily mean that respondents don’t know or are avoiding making a choice. In fact, research indicates that if respondents who select the midpoint were forced to choose a side, they would not necessarily answer the question in the same way as the respondents who chose a side on their own.
This suggests that middle alternatives should be provided and that they may be validly and reliably chosen by respondents. Forcing respondents to take a side may introduce unwanted variance or bias to the data.
Labeling response options
Lastly, let’s look at how to label response scale points. Some people prefer to only label the end-points. Others will also label the midpoint. Some people label with words and others label numerically. What’s right?
The most accurate surveys will have a clear and specific label that indicates exactly what each point means. Going back to the goals of survey rating scale points and their labels, we want all respondents to easily interpret the meaning of each scale point and for there to be no room for different interpretations between respondents. Labels are key to avoiding ambiguity and respondent confusion.
This means that partially labeled scales may not perform as well as fully labeled scales, and that numbers should only be used as labels when the scale collects numeric data (not for rating scales). Fully labeled scales have been shown to produce more reliable and valid data.
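If you keep your response options in a list, a quick check can flag scales that are only partially labeled or labeled with bare numbers. This is a minimal Python sketch; the function names and the example scales are assumptions made up for illustration, with `None` standing in for an unlabeled point.

```python
def _is_number(text):
    """True when the text is just a number, such as "3" or "4.5"."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def is_fully_labeled(labels):
    """True when every scale point has a non-empty verbal label."""
    for label in labels:
        text = "" if label is None else str(label).strip()
        # A missing label or a bare number does not count as a verbal label.
        if not text or _is_number(text):
            return False
    return True


# Hypothetical examples of three labeling styles for a 5-point scale.
endpoints_only = ["Not at all effective", None, None, None, "Extremely effective"]
numeric_only = ["1", "2", "3", "4", "5"]
fully_labeled = [
    "Not at all effective",
    "Slightly effective",
    "Moderately effective",
    "Very effective",
    "Extremely effective",
]

print(is_fully_labeled(endpoints_only))  # False
print(is_fully_labeled(numeric_only))    # False
print(is_fully_labeled(fully_labeled))   # True
```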