An important part of survey research involves creating good items. It is easy enough to write items, and easier still to write poor ones. Care must be taken to ensure the items align with the research purpose and questions. More importantly, once items have been constructed, researchers need to test them to make sure they are not flawed in any discernible manner. Creating good items is facilitated by following basic design principles.
General Principles for Creating Items on a Questionnaire
There are two parts to any survey question: the item stem and the response options. The item stem poses the question or presents a statement. The response options provide a way for respondents to answer the question or indicate the degree to which they agree with the statement. Depending on the question, direct responses (i.e., open-ended or numerical inputs) can be used instead of selecting options from a list. In addition, many survey software packages provide alternative ways for people to respond to a survey; however, while often creative, these formats can produce results that are hard to interpret and report. The following principles need not be addressed in any particular order, but each should be considered when creating good survey items.
Item Indispensability and Purpose
There should be a reason for including each and every item on a survey. Items must align with the study’s purposes and objectives. As you write each item, consider how the response will help answer the research questions or support a specific research purpose. Ask yourself whether the item is needed or if additional items are required in order to fully understand the response (or use the results).
On some surveys, researchers ask for information that is not needed, given the purposes of the study. Researchers might also ask for information they do not plan to use but feel might be interesting. For example, researchers often collect demographic information about respondents that they had not planned to use. If these demographics do not serve a research purpose, consider removing the items to limit the length of the survey and avoid survey fatigue.
On the other hand, some research fails to ask enough questions. An item on the survey may provide broad impressions but fails to provide enough in-depth understanding to be useful. Two examples of this involve specificity issues and negative case analysis.
Item Specificity Issues
Consider a basic survey item asking the degree to which individuals feel a particular intervention was effective. Obtaining an overall impression of effectiveness may satisfy the research needs; however, an individual may be generally satisfied with the intervention but more or less satisfied with particular aspects of the program. The level of specificity needs to be established. Abstract or complex topics may require multiple items each designed to capture nuanced aspects of the topic. This is especially the case when measuring constructs (see Affective Scale development).
Negative Case Analysis
A negative case analysis can be used to better understand various responses. For example, a negative overall impression of satisfaction may require further investigation. For those individuals who felt generally satisfied with the intervention or program, no additional information may be needed; however, you may wish to know more from those who were less than satisfied. This would require branching and possibly an open-ended response option. Branching avoids asking satisfied respondents unnecessary questions. If you don’t already know all the possible things respondents might be dissatisfied with, you may need open-ended response options to ascertain which specific aspects of the intervention the respondent felt were unsatisfactory. There may be more than one problematic aspect, and some problems may matter more to the respondent than others, which may prompt you to ask participants for more information on those problems.
Branching is a technique that reduces the number of items presented to each respondent by asking certain questions only of respondents who belong to a specific group. Group membership is usually determined by key questions whose answers modify the content or flow of the survey. For example, a respondent may be asked whether they have children living at home. The survey can then present additional questions only to individuals who answer affirmatively.
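The branching logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular survey package implements branching: the `Item` class, the `show_if` field, the `items_to_present` function, and the item keys and answer labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Item:
    """A survey item; all names here are hypothetical, for illustration only."""
    key: str
    stem: str
    # Optional rule: given the answers so far, should this item be shown?
    # Items without a rule are shown to every respondent.
    show_if: Optional[Callable[[dict], bool]] = None


def items_to_present(items, answers):
    """Return the keys of the items a respondent should see, given their answers."""
    return [
        item.key
        for item in items
        if item.show_if is None or item.show_if(answers)
    ]


SURVEY = [
    # Key question that determines group membership.
    Item("children_at_home", "Do you have children living at home?"),
    # Follow-up shown only to respondents who answered affirmatively.
    Item(
        "num_children",
        "How many children live at home?",
        show_if=lambda a: a.get("children_at_home") == "yes",
    ),
    Item("satisfaction", "Overall, how satisfied were you with the program?"),
    # Negative case follow-up: an open-ended item shown only to dissatisfied respondents.
    Item(
        "dissatisfied_why",
        "Which aspects of the program were unsatisfactory?",
        show_if=lambda a: a.get("satisfaction") in {"dissatisfied", "very dissatisfied"},
    ),
]
```

Calling `items_to_present(SURVEY, {"children_at_home": "no", "satisfaction": "satisfied"})` would skip both follow-up items, so a satisfied respondent without children at home sees only the two key questions. Keeping each rule next to the item it controls makes it easier to review, during pilot testing, who will see which questions.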
Another principle involves making sure the questions you ask can be answered by those you are surveying. You should never ask a question that requires the respondent to guess, because guessing invites random response bias. For example, asking someone to speculate on the motives of others would not produce valid information. The information you obtain would represent the perceptions of the respondent (their guesses), not the actual motivations of those involved. It is unlikely that determining the speculative perceptions individuals might have would be a reasonable purpose of any evaluation or research study. It would be better to ask individuals about their own motives (that is, to ask those with direct knowledge).
Clarity and Precision: Audience Appropriate Wording
Surveys should be written for a specific audience. The vocabulary and structure of the item stem must be appropriate for the intended audience. To do this, you must understand those in the target population. When creating items, consider the target population’s reading ability and how they will interpret the questions. Use common, understandable language (i.e., natural language the target audience would easily understand). Pilot test each item with potential participants to verify that the target audience would likely understand what is being asked. Just because you understand what you are asking doesn’t mean they will. Be clear and precise when writing item stems. Avoid double negatives and keep items relatively short.
Singular Purpose (Double-Barreled Items)
A common mistake many survey developers make is asking two related but separate questions in one item. We call this a double-barreled item. For example, a survey might ask, “How useful were the assignments and feedback?” These questions are tricky for respondents to answer because they don’t know which aspect of the question to focus on (i.e., the usefulness of the assignments or the usefulness of the feedback). It is problematic for the researcher because the interpretation of the results becomes incredibly difficult, if not impossible. Double-barreled items should be avoided. They can usually be fixed by separating the item into two questions.
Avoid Loaded or Leading Questions
A loaded question is one that has emotionally charged connotations that influence feelings and response patterns. A leading question is one that indirectly or unintentionally influences respondents to answer in a specific way. Loading a question with words that provoke emotional responses (positive or negative) will skew results. Using phrases that lead a respondent to make a connection or association when answering a question can also sway the way people respond.
The following examples show how an item stem statement might be loaded with emotional words and lead or influence individuals to respond a particular way. The first example loads the response by using the words "wrong," "condone," and "killing" when referring to abortion. This would likely skew the result against abortion. The second example, phrased as a women’s rights issue using words like "rights," "access," and "choice," would likely skew results in the opposite direction. In both cases the wording would likely lead respondents to agree with the statement even though each represents opposite stances towards abortion.
Abortion is wrong because society should not condone the killing of innocent unborn children.
Women have rights and should be allowed access to abortion services if they choose.
Item Stem: Response Scale Alignment
The item stem must align with and be appropriate for the response scale. In addition, respondents must be able to clearly understand each option and how to indicate their response. This can only be done by pilot testing the instrument.
Before implementing a survey, the items should be tested. Testing can take various forms and may require several iterations. A think-aloud technique is commonly used to test specific items. This involves asking a few individuals (preferably ones from the target population) to read the items aloud and express their feelings and thought processes as they take the survey. The goal is to determine whether each item is clear, whether the wording might be improved, how the stem was interpreted, why the individual chose the response they selected, and whether the response options provided were adequate or additional options might be needed. If revisions are required, the process may need to be repeated.
Summary
- It is easy to create flawed items. Carefully considering various design principles will help improve the items you write.
- There are two parts to a survey item: the item stem and the response scale.
- There should be a clear purpose for including each item in the survey.
- Consider whether the items that were included are sufficient to answer the research questions and whether they are needed and adequate for the research purposes.
- Additional items, such as those used in a negative case analysis, may be needed to fully understand participant responses.
- Branching can be used to present specific items to targeted participants.
- Only ask participants to answer questions about which they have direct knowledge.
- The item stem should be carefully worded so the target audience would clearly understand its meaning.
- Avoid double-barreled item stems.
- Avoid loaded or leading item stems.
- Response scales should be appropriate for the item stem used.
- All items should be pilot tested, including a review by a representative of the survey's intended target audience.
Discussion Questions
- Explain how the pilot testing process might be used. What protocols and procedures might need to be implemented?
- What tradeoffs need to be made to keep the survey reasonably short and still ask enough questions to answer the research questions?