Designing quality online surveys
At the Social Deck, we use online surveys as a tool to engage with a large number of diverse stakeholders. Whether it’s to gather feedback from community members in a safe and sensitive way, or to generate quantitative data from large samples of our community, surveys are an important addition to many of our engagement activities.
Our Research Lead, Natalie, shares her insights into our methods for developing surveys that are accessible and informed by the latest scientific research on survey design and behaviour change.
Why survey design matters
Poorly designed surveys produce unreliable and misleading data. They also limit the opportunity to develop meaningful, actionable insights. For respondents, poorly written surveys create confusion and frustration. This can also lead to a lack of trust in the broader engagement process.
Well-designed surveys, on the other hand, are easy and satisfying for respondents to complete. They accurately measure the constructs they are intended to measure, and generate reliable data that is transparent and easy to interpret. There is no perfect generic survey; each survey needs to be customised to gather meaningful responses on the topic of interest. That being said, there are some standard best practices to follow, informed by research into optimal survey design.
12 tips for good survey design
1. Avoid leading questions.
Questions shouldn’t lead respondents to answer in a particular way. This often happens when clues in the question steer people towards the response option they think is correct or most desirable.
For example, the question, “Have you ever put pressure on a GP to prescribe you antibiotics?”, prompts respondents to answer ‘no’ due to the negative connotations associated with ‘putting pressure’ on someone. Better phrasing of the question would be: “Have you ever asked a GP to prescribe you antibiotics?”
2. Questions should not be double-barrelled.
Double-barrelled questions lead to messy data, as it’s difficult to separate out which part of the question the respondent really answered.
For example, the question: “To what extent were your teachers knowledgeable and supportive?”, should be split into two questions: one asking about the expertise of teachers, and another asking about how supportive they were.
Similarly, the question: “How often do you bike ride to work for environmental reasons?”, should be two separate questions: one about frequency of bike riding to work, and another about motivations for bike riding.
3. Avoid using absolutes in questions.
Absolutes can make people answer questions in a different way to what they might have intended.
For example, the question, “To what extent do you agree or disagree that all children should be vaccinated?” does not take into account that a small minority of children – such as those with compromised immune systems – may be unable to be immunised. The wording of this question means that it is unclear if pro-vaccination respondents should agree or disagree with it. Removing the word “all” from the question helps to eliminate this ambiguity.
Another example of using absolutes is the question, “Do you floss your teeth every day?” For some respondents, flossing most days might be enough for them to respond ‘yes’ to this question, whereas other respondents might assume that occasionally failing to floss means that they have to answer ‘no’. Better wording would be: “Do you floss your teeth every or almost every day?” or ideally, “During the past two weeks, how often did you floss your teeth?” (with the appropriate response options provided).
4. Avoid using ambiguous or technical words and acronyms.
If people don’t understand the question, they will either skip it or answer randomly. Terms and acronyms that aren’t often used in everyday language should be avoided.
For example, in behavioural science we often use the term ‘enablers’, but this isn’t necessarily familiar to people. Because its meaning is potentially ambiguous for people not familiar with the field of behavioural science, respondents in a survey might misinterpret the intent of the question. It’s better to use ‘plain English’ terms such as “drivers” or “things that encourage you to…”.
5. Avoid response scales with too many points (e.g., 100).
Having too many points is excessive, and respondents typically gravitate to multiples of 10. This means most of the points on a scale become redundant.
Research has shown that ratings tend to be more reliable and valid when five or seven points are offered (Krosnick & Fabrigar, 1997; Krosnick & Presser, 2010).
6. Ensure each point on a response scale is labelled.
Using word labels on each point of a scale helps to ensure that response options are interpreted in the same way by respondents. Numerical labels on scales are relatively meaningless. Scales containing both numerical and word labels create confusion and add to the cognitive burden for respondents. When developing word labels for response scales, ensure that the words used have equally spaced meanings.
For example, the following labels on a 5-point rating scale suggest a linear increase in likelihood: not at all likely, slightly likely, moderately likely, very likely, completely likely.
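One way to enforce tips 5 and 6 when programming a survey is to store each rating scale as an ordered list of word labels and check it before launch. The following is a minimal Python sketch under that assumption. The helper name `validate_scale` and the rule that only five- or seven-point scales pass are illustrative choices, not features of any particular survey platform.

```python
LIKELIHOOD_SCALE = [
    "Not at all likely",
    "Slightly likely",
    "Moderately likely",
    "Very likely",
    "Completely likely",
]


def validate_scale(labels, allowed_lengths=(5, 7)):
    """Return a list of problems found in a rating scale (empty if none).

    Hypothetical helper: `labels` is the ordered list of word labels,
    one per scale point.
    """
    problems = []
    if len(labels) not in allowed_lengths:
        problems.append(
            f"scale has {len(labels)} points; expected one of {allowed_lengths}"
        )
    seen = set()
    for position, label in enumerate(labels, start=1):
        text = label.strip() if isinstance(label, str) else ""
        if not text:
            problems.append(f"point {position} has no word label")
        elif text.isdigit():
            problems.append(f"point {position} is labelled only with a number")
        elif text.lower() in seen:
            problems.append(f"point {position} repeats an earlier label")
        seen.add(text.lower())
    return problems


print(validate_scale(LIKELIHOOD_SCALE))  # prints []
```

A check like this cannot judge whether the labels are equally spaced in meaning; that still needs a human reviewer.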
7. Ensure all respondents are able to answer each of the questions they are asked – when to use ‘Don’t know’, ‘Prefer not to say’ or ‘None of the above’
If a particular question is not applicable, then a “Not applicable” or “Don’t know” response option should be provided. Even better, programming logic should be used so that the question is skipped for that respondent. “Don’t know” options should only be provided when the question is assessing knowledge, or if it is reasonable that the respondent might not be able to provide an opinion on a given topic (i.e., due to unfamiliarity with the topic). Questions that are potentially sensitive should contain a “Prefer not to say” option to allow respondents to opt out of answering those questions. Furthermore, multiple response questions that are not compulsory should include a ‘None of the above’ option, to differentiate between respondents who simply skipped the question and those for whom none of the response options applied.
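The skip logic and the ‘None of the above’ distinction described above can be sketched in a few lines of Python. This is an illustration only: the question IDs, the `ask_if` rule format and both helper functions are hypothetical and do not correspond to any specific survey tool.

```python
NONE_OF_THE_ABOVE = "None of the above"

# Hypothetical survey definition: q2 is only relevant to dog owners.
QUESTIONS = {
    "q1_owns_dog": {"text": "Do you currently own a dog?"},
    "q2_walk_frequency": {
        "text": "During the past two weeks, how often did you walk your dog?",
        "ask_if": ("q1_owns_dog", "Yes"),
    },
}


def should_ask(question_id, answers):
    """Return True if the question applies to this respondent."""
    rule = QUESTIONS[question_id].get("ask_if")
    if rule is None:
        return True  # no condition attached, so always ask
    depends_on, required_answer = rule
    return answers.get(depends_on) == required_answer


def classify_multi_response(selected):
    """Classify the answer to a non-compulsory multiple response question."""
    if not selected:
        return "skipped"  # nothing ticked at all
    if NONE_OF_THE_ABOVE in selected:
        return "none_applied"  # respondent read the list and ruled it out
    return "answered"
```

With this structure, a respondent who answers ‘No’ to the first question never sees the second, and an empty selection is recorded differently from an explicit ‘None of the above’.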
8. Minimise agreement response bias by using item-specific rating scales.
Research has shown that people have a general tendency to provide affirmative responses to survey items. There are a number of reasons for this bias (Saris et al., 2010). Firstly, people are often conditioned to be polite and avoid social friction. Secondly, respondents might think that affirmative responses are the correct answers that the researchers are looking for. Lastly, some respondents look to take shortcuts to complete the survey as quickly as possible by not thinking about their responses, or not even reading the questions properly.
Using Agree/Disagree rating scales, instead of response options that are specific to the question, exacerbates this bias. Agree/Disagree scales are also more cognitively burdensome for respondents.
For example, try answering these two different versions of the same question: 1) “To what extent do you agree or disagree that your health is excellent? (completely agree, somewhat agree, neither agree nor disagree, somewhat disagree, completely disagree)”; 2) “How would you rate your current health? (excellent, very good, good, fair, poor)”. The second version of the question is more straightforward to answer and generates more reliable responses.
9. Minimise response order effects by randomising response options where appropriate.
There are two main types of response order effects: primacy effects (i.e., choosing response options presented near the beginning of a list); and recency effects (choosing response options presented near the end of a list). When the response options are categorical (as opposed to a rating scale), primacy effects predominate for visual surveys and recency effects predominate for oral surveys (i.e., when the list of response options is read out to a respondent).
Randomising the order of categorical response options across respondents spreads response order effects evenly across the options, so that no single option is systematically favoured in the data.
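A minimal Python sketch of per-respondent randomisation follows. It assumes each respondent has a stable ID that can seed the shuffle, so the same person sees the same order if the page reloads. The function name and the `pinned_last` parameter, which keeps options such as ‘Other’ or ‘None of the above’ at the end of the list, are added conventions for illustration rather than something this tip prescribes.

```python
import random


def randomised_options(options, respondent_id, question_id, pinned_last=()):
    """Return the options in a random order that is stable per respondent.

    Hypothetical helper: options named in `pinned_last` keep their place
    at the end of the list; everything else is shuffled.
    """
    pinned = [option for option in options if option in pinned_last]
    shuffled = [option for option in options if option not in pinned_last]
    # Seeding with respondent and question IDs makes the order repeatable
    # for one person but different across respondents and questions.
    rng = random.Random(f"{respondent_id}:{question_id}")
    rng.shuffle(shuffled)
    return shuffled + pinned


options = ["Bus", "Train", "Car", "Bicycle", "Other"]
print(randomised_options(options, respondent_id=101, question_id="q5",
                         pinned_last=("Other",)))
```

Rating scales should keep their natural order, so a helper like this is meant for categorical lists only.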
10. Use realistic time periods for recall and future prediction questions.
When asking respondents to report on their past behaviour, it is important to include a specific timeframe for them to reflect on. Simply asking about general or typical behaviour is likely to lead to biased, inaccurate data.
For behaviours that are performed regularly (e.g., exercise-related behaviours, driving, media consumption, dog walking, etc.), it’s best to ask participants to limit recall to two weeks. For behaviours that are performed very frequently, such as eating, the recall period may need to be even shorter (e.g., one week or a few days).
Recall periods for infrequently performed behaviours, such as visiting the dentist, should be much longer (e.g., 12 or 24 months). Similarly, when asking respondents to make predictions about the future, use sensible time frames that people can realistically envisage.
11. Getting the order right and removing the “Back” button.
When designing a survey, questions should be ordered so we are not leading respondents to answer in a specific way. For example, unprompted awareness of an intervention or campaign should be asked before any information about that campaign is revealed.
In these types of surveys, respondents should not be allowed to go back through the survey and alter their responses.
12. Don’t make surveys too onerous, particularly if respondents are not being reimbursed for their time.
You can increase engagement with your survey and minimise the drop-out rate by limiting the number of onerous tasks or questions.
Examples of more onerous tasks in a survey include:
- Open-ended questions
- Ranking tasks (which can produce higher quality data than ratings, but are more burdensome for respondents and are usually not accessible) (Krosnick, 2000)
- Choice task questions – e.g., choice-based conjoint analysis
- Overly wordy questions or statements
- Questions that require calculations or cognitively demanding recall.
Lastly, don’t forget that surveys should be accessible, produce accurate and meaningful insights, and be satisfying for respondents to complete. It’s always worth spending a little bit of extra time programming and testing a survey among colleagues and some people from the target audience.
If you need help with survey design to ensure you gather meaningful data from your engagement activity, contact us at email@example.com
Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz, & D. Trewin (Eds.), Survey Measurement and Process Quality (pp. 141–164). New York, NY: John Wiley and Sons.
Krosnick, J. A. (2000). The threat of satisficing in surveys: The shortcuts respondents take in answering questions. Survey Methods Newsletter, 20, 4–8.
Krosnick, J. A., & Presser, S. (2010). Questionnaire design. In J. D. Wright & P. V. Marsden (Eds.), Handbook of Survey Research (2nd ed.). West Yorkshire, England: Emerald Group.
Saris, W. E., Revilla, M., Krosnick, J. A., & Shaeffer, E. M. (2010). Comparing questions with agree/disagree response options to questions with item-specific response options. Survey Research Methods, 4, 61–79.