You are conducting an expansive international public health survey, with an ultimate goal of cross-cultural comparison. You ask your respondents the following question, adapted from a World Health Organization survey:
“Overall in the last 30 days, how much of a problem have you had with energy and vitality?”
The response categories are: None, Mild, Moderate, Severe, and Extreme/Cannot Do.
A 27-year-old woman who comes home fatigued during a particularly hard few weeks at work answers, “Severe.” An 85-year-old woman who can get out of bed in the morning and dress herself with minimal assistance answers, “Mild.” Does the younger woman have more of a problem than the older woman with energy and vitality, or are the two respondents applying different standards for energy and vitality?
Given their ages, the two women very likely do not possess the same latent level of “energy and vitality”: the older woman probably has objectively less. In addition, your survey spans different countries, so these two women are not only of different ages but also come from different cultures.
Classic anthropological and clinical studies suggest that culture influences perceptions of pain. In some countries, self-reports of health even correlate negatively with objective measures of health (King 2009, Sen 2002). This problem, in which respondents interpret the same question or response categories differently, is called differential item functioning (DIF). While it has been studied most extensively in the public health literature, it poses a problem for political science survey research, too, especially in cross-cultural comparisons of political attitudes (on engagement, efficacy, corruption).
Anchoring vignettes represent one possible solution to DIF. By presenting a set of hypothetical scenarios that correspond to each value of a variable, researchers establish absolute variable thresholds for all respondents. Establishing these thresholds allows for interpersonal comparability across cultures.
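To make the idea concrete, here is a minimal sketch of one simple way to use vignette answers: recoding each respondent's self-rating relative to her own vignette ratings, in the spirit of the nonparametric approach described in King et al. (2004). The function name `relative_rank`, the numeric coding (None = 1 through Extreme/Cannot Do = 5), and the vignette ratings given to the two women are all hypothetical and chosen only for illustration. The sketch also assumes each respondent rates the vignettes in strictly increasing order of severity; ties and out-of-order ratings need extra handling that is omitted here.

```python
def relative_rank(self_rating, vignette_ratings):
    """Recode a self-rating relative to the respondent's own vignette ratings.

    vignette_ratings must be listed from the least to the most severe
    vignette. With J vignettes the result runs from 1 (self-rating below
    every vignette) to 2J + 1 (self-rating above every vignette).
    Simplifying assumption: the ratings are strictly increasing.
    """
    pairs = zip(vignette_ratings, vignette_ratings[1:])
    if any(lower >= higher for lower, higher in pairs):
        raise ValueError("vignette ratings must be strictly increasing")

    rank = 1
    for vignette in vignette_ratings:
        if self_rating < vignette:
            return rank          # strictly below this vignette
        if self_rating == vignette:
            return rank + 1      # tied with this vignette
        rank += 2                # above this vignette, move to the next one
    return rank                  # above every vignette


# Hypothetical data: 1 = None, 2 = Mild, 3 = Moderate, 4 = Severe, 5 = Extreme.
# Each woman rates the same two vignettes (a milder case, then a more severe one).
younger = relative_rank(self_rating=4, vignette_ratings=[3, 5])
older = relative_rank(self_rating=2, vignette_ratings=[1, 2])
print(younger, older)  # 3 4
```

In this made-up example the younger woman's raw answer (“Severe”) is higher than the older woman's (“Mild”), but relative to their own vignette ratings the older woman ranks higher. The recoded values are only comparable across respondents if the two assumptions about how respondents treat vignettes hold.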
Anchoring vignettes rest on the following two assumptions, however:
1. Response consistency: Despite the hypothetical nature of the vignette scenarios, respondents apply the same absolute scale to evaluating the vignette characters as they would to evaluating themselves.
2. Vignette equivalence: Although respondents have differing life experiences, socioeconomic backgrounds, and personalities, they use the same absolute scale to judge the levels of the variables presented in the vignettes (King et al. 2004).
Researchers rarely test the assumptions of response consistency and vignette equivalence, even though these assumptions do not always hold, especially in cross-cultural survey research.
When researchers do test the assumptions and find that they fail, they tend to conclude by questioning whether anchoring vignettes can validly correct for DIF and interpersonal incomparability.
Rather than discount the method altogether, however, why not establish, as Kapteyn et al. (2011) suggest, a “systematic experimental approach to the design of anchoring vignettes”?
Kapteyn, Arie, et al. “Anchoring Vignettes and Response Consistency.” RAND (2011).
King, Gary, et al. “Enhancing the Validity and Cross-cultural Comparability of Measurement in Survey Research.” American Political Science Review 98.1 (2004): 191-207.
King, Gary. “The Anchoring Vignettes Website.” http://gking.harvard.edu/vign (2009).
Sen, Amartya. “Health: Perception versus Observation.” BMJ 324 (2002): 860-861.