Web-scraping

Earlier in the semester, I was trying to collect newspaper articles from online archives. When the structure of the archives changed from PDF to webpage links, I needed to find a way to automate the retrieval process.

Professor Settle then introduced me to Professor Van Der Veen, who uses web-scraping in his own research. He also held a workshop that walked through a web-scraping tutorial, which can be found here.

Web scraping means to “automatically get some information from a website instead of manually copying it.” There are several ways to go about doing this. We used Python along with several packages and tried it out on the William & Mary Government Department website.
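To make the idea concrete, here is a minimal sketch of pulling link text and URLs out of a page's HTML. It uses only Python's standard library rather than the packages from the workshop, and the HTML snippet, names, and paths in it are made up for illustration; it is not the actual Government Department page.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a downloaded page (not the real department site).
SAMPLE_HTML = """
<html><body>
  <ul id="faculty">
    <li><a href="/people/a.html">Professor A</a></li>
    <li><a href="/people/b.html">Professor B</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


for text, href in extract_links(SAMPLE_HTML):
    print(text, href)
```

In a real scraper the HTML string would come from downloading the page first, and the same function could then be run over every page in an archive.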

One of the aspects of web-scraping involves using Firebug, which is a Mozilla Firefox add-on that gives access to a variety of web development tools. Once you open up a webpage and click on Firebug, you can see what part of the webpage corresponds to the HTML code in the web development window at the bottom of the screen, as shown below.

[Image: Firebug]

Personally, I have yet to make it all the way through the tutorial because no attempt at programming really ever goes smoothly, no matter what I try.

Although I’m no longer trying to get the newspaper articles that originally led me to want to learn how to web-scrape, web-scraping is a useful skill that is worth learning.

References

http://www.sciedupress.com/journal/index.php/air/article/view/1390

http://stair.wm.edu/scraping.html

http://getfirebug.com/

Are We Really More Alike Than Unalike? Anchoring Vignettes in Cross-Cultural Survey Research

You are conducting an expansive international public health survey, with an ultimate goal of cross-cultural comparison. You ask your respondents the following question, adapted from a World Health Organization survey:

“Overall in the last 30 days, how much of a problem have you had with energy and vitality?”

The response categories are: None, Mild, Moderate, Severe, and Extreme/Cannot Do.

A 27-year-old woman who comes home fatigued during a particularly hard few weeks at work answers, “Severe.” An 85-year-old woman who can get out of bed in the morning and dress herself with minimal assistance answers, “Mild.” Does the younger woman have more of a problem than the older woman with energy and vitality, or are the two respondents applying differing standards for energy and vitality?

Because of the two women’s ages, you can assume that they very likely do not possess the same latent level of “energy and vitality”: the older woman probably has objectively less. In addition, your survey spans different countries, so these two women are not only of different ages but also come from different cultures.

Classic anthropological and clinical studies suggest that culture influences perceptions of pain. In some countries, self-reports of health correlate negatively with objective measures of health (King 2009, Sen 2002). This problem is called differential item functioning (DIF). While it has been studied most extensively in the public health literature, it also poses a problem for political science survey research, especially in cross-cultural comparisons of political attitudes (on engagement, efficacy, corruption).

Anchoring vignettes represent one possible solution to DIF. By presenting a set of hypothetical scenarios that correspond to each value of a variable, researchers establish absolute variable thresholds for all respondents. Establishing these thresholds allows for interpersonal comparability across cultures.

Anchoring vignettes rest on the following two assumptions, however:

1. Response consistency: Despite the hypothetical nature of the vignette scenarios, respondents apply the same absolute scale to evaluating the vignette characters as they would to evaluating themselves.

2. Vignette equivalence: Although respondents have differing life experiences, socioeconomic backgrounds, and personalities, they use the same absolute scale to judge the levels of the variables presented in the vignettes (King et al. 2004).
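As a rough illustration of how vignettes anchor a self-report, the sketch below recodes a respondent's self-rating relative to her own ratings of the vignettes, in the spirit of the nonparametric approach described by King et al. (2004). It is a simplified version: it assumes both of the conditions above hold, it returns None when a respondent's vignette ratings are tied or out of order instead of handling those cases, and the function name and all of the ratings are hypothetical.

```python
# Response scale from the survey question: higher means more of a problem.
SCALE = {"None": 1, "Mild": 2, "Moderate": 3, "Severe": 4, "Extreme": 5}


def relative_rank(self_rating, vignette_ratings):
    """Place a self-rating relative to the respondent's own vignette ratings.

    vignette_ratings must be listed from the least to the most severe
    vignette. With J vignettes the result runs from 1 (below every
    vignette) to 2J + 1 (above every vignette). Returns None if the
    ratings are tied or out of order (simplification; not handled here).
    """
    pairs = zip(vignette_ratings, vignette_ratings[1:])
    if any(lower >= upper for lower, upper in pairs):
        return None

    rank = 1
    for z in vignette_ratings:
        if self_rating < z:
            return rank          # strictly below this vignette
        if self_rating == z:
            return rank + 1      # tied with this vignette
        rank += 2                # strictly above it; move to the next one
    return rank                  # above every vignette


# Hypothetical ratings of two vignettes (milder case, then more severe case).
younger = relative_rank(SCALE["Severe"], [SCALE["Moderate"], SCALE["Extreme"]])
older = relative_rank(SCALE["Mild"], [SCALE["None"], SCALE["Mild"]])
print(younger, older)
```

With these made-up ratings, the older woman's “Mild” lands higher on the vignette-anchored scale than the younger woman's “Severe,” because she rates the same two hypothetical cases far more leniently.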

Researchers rarely test the assumptions of response consistency and vignette equivalence, even though these assumptions do not always hold, especially in cross-cultural survey research.

When researchers do test the assumptions and find that they do not hold, they conclude by questioning the validity of the anchoring vignettes method for correcting DIF and interpersonal incomparability.

Rather than discount the method altogether, however, why not establish, as Kapteyn et al. (2011) suggest, a “systematic experimental approach to the design of anchoring vignettes”?

References
Kapteyn, Arie, et al. “Anchoring Vignettes and Response Consistency.” RAND. (2011).

King, Gary, et al. “Enhancing the Validity and Cross-cultural Comparability of Measurement in Survey Research.” American Political Science Review 98.01 (2004): 191-207.

King, Gary. “The Anchoring Vignettes Website.” (2008-08-25). http://gking.harvard.edu/vign (2009).

Sen, Amartya. 2002. “Health: Perception versus Observation.” BMJ 324:860–861.