Girl Talk: How to identify gender by online speech patterns

Do patterns of online political discussion differ based on the gender of the writer? One of the keys to answering this question may be LIWC, or Linguistic Inquiry and Word Count, “a computerized text analysis program that categorizes and quantifies language use” (Kahn 263). LIWC analyzes text by recognizing words and grouping them into categories. For example, “I” and “me” fall into the “self-referential words” category, while verbs like “think” and “believe” fall into the “cognitive processes” category. These categories range in specificity from broad language descriptors like “affect” to specific emotions and topics like “sadness” and “occupation.”
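
To make the category idea concrete, here is a minimal Python sketch of dictionary-based word counting. The category names come from the examples above, but the word lists and the function name `category_rates` are made up for illustration; they are not LIWC's actual dictionaries or interface.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: category names follow the examples in the
# text, but the word lists are illustrative, NOT LIWC's real dictionaries.
CATEGORIES = {
    "self_reference": {"i", "me", "my"},
    "cognitive_processes": {"think", "believe", "know"},
    "sadness": {"sad", "cry", "grief"},
}

def category_rates(text):
    """Return the percentage of words in `text` that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for category, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[category] += 1
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {category: 100.0 * counts[category] / total for category in CATEGORIES}

rates = category_rates("I think my vote matters, and I believe it.")
```

Reporting each category as a share of all words, rather than as a raw count, keeps short and long texts comparable.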

LIWC will be especially useful for the Online Political Discussion Computer Science team as we begin working with our 2008 Twitter data set. We will use hashtags that co-occur with #politics to create a social network diagram of political discourse. In this diagram, each node will be a tweet, and it will be connected to every tweet with which it shares a hashtag. Overlaying LIWC data on the social network diagram will show how the language content of tweets is distributed across the network. Specifically, I hope to use LIWC to focus on the relationship between gender and online political discussion. However, the Twitter metadata does not disclose the gender of tweet authors. Instead, I will use LIWC to analyze the language patterns of tweets and infer the gender of Twitter users.
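
The network described above could be built along the following lines. This is only a sketch under stated assumptions: the tweets are toy data invented for illustration (not drawn from the 2008 data set), and the helper names `cooccurring_hashtags` and `build_tweet_graph` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy tweets invented for illustration; they are not from the 2008 data set.
tweets = {
    "t1": {"#politics", "#election"},
    "t2": {"#politics", "#debate"},
    "t3": {"#election", "#vote"},
    "t4": {"#sports"},
}

def cooccurring_hashtags(tweets, seed="#politics"):
    """Collect the seed plus every hashtag used in a tweet containing the seed."""
    tags = set()
    for hashtags in tweets.values():
        if seed in hashtags:
            tags |= hashtags
    return tags

def build_tweet_graph(tweets, kept_tags):
    """Link every pair of tweets that share at least one kept hashtag."""
    by_hashtag = defaultdict(set)
    for tweet_id, hashtags in tweets.items():
        for tag in hashtags & kept_tags:
            by_hashtag[tag].add(tweet_id)

    edges = set()
    for tweet_ids in by_hashtag.values():
        # Sorting gives each undirected edge one canonical (a, b) form.
        for a, b in combinations(sorted(tweet_ids), 2):
            edges.add((a, b))
    return edges

kept = cooccurring_hashtags(tweets)
edges = build_tweet_graph(tweets, kept)
```

In this toy example, t3 never uses #politics itself but still joins the network through #election, which co-occurs with #politics in t1; t4 stays disconnected.
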

How do we differentiate the language patterns of males and females? This is a question that both linguists and feminists have confronted for years. Second-wave feminist writers tackled this question using the language of power and powerlessness. In “Discourse Competence: Or How to Theorize Strong Women Speakers,” Sara Mills argues that the linguistic elements that make women’s speech different from men’s speech, like expressions of uncertainty and reliance on verbal fillers, are not unique to women but are expressions of submissiveness (Mills 4). At the same time, Mills writes that women act as the facilitators of conversation. Instead of steering the course of conversation, women tend to do the “repair-work” of the conversation by asking questions and avoiding awkward silences (Mills 5). It should be noted, however, that some of the feminist writings of the 1970s are more theoretical than quantitative. In Language and Woman’s Place, a text on the linguistics of gender that was ground-breaking in the 1970s, the author admits that “the data on which she bases her claims have been gathered mainly through introspection: she examined her own speech and that of her acquaintances, and used her own intuitions in analyzing it” (Lakoff 46).

Nonetheless, these theories of the linguistics of gender create a useful framework for discussing online political discourse. For example, if women truly are the “facilitators” of conversation, will female-authored tweets have higher measures of centrality? Or does the nature of online communication destroy the need for conversation facilitators, in which case one might predict the marginalization of female-authored tweets? Or does Twitter, a female-dominated social media site, represent a completely different paradigm for female speech?
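
One simple way to put the centrality question to the data is to compare average degree centrality across gender labels. The sketch below assumes an edge list and a gender label per tweet are already available; both are toy values invented here, and degree centrality is just one of several centrality measures that could be used.

```python
from collections import defaultdict

# Toy edge list and gender labels, invented for illustration only.
edges = [("t1", "t2"), ("t1", "t3"), ("t2", "t3"), ("t3", "t4")]
gender = {"t1": "female", "t2": "male", "t3": "female", "t4": "male"}

def degree_centrality(edges, nodes):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    denominator = max(len(nodes) - 1, 1)
    return {node: degree[node] / denominator for node in nodes}

def mean_by_group(values, groups):
    """Average a per-node value within each group label."""
    totals, counts = defaultdict(float), defaultdict(int)
    for node, value in values.items():
        totals[groups[node]] += value
        counts[groups[node]] += 1
    return {label: totals[label] / counts[label] for label in totals}

centrality = degree_centrality(edges, list(gender))
group_means = mean_by_group(centrality, gender)
```

A higher group mean for female-authored tweets would be consistent with the “facilitator” prediction, and a lower one with the marginalization prediction, though real data would also need a test of whether the difference is meaningful.
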

While these questions make a good framework for theorizing about gender in online political discussion, there is still the issue of analyzing tweets for gender. For that, I look to Koppel et al.’s work on automatically categorizing written texts by author gender (Koppel 401-412). Koppel and his team used a comprehensive list of words and grammatical patterns to create an algorithm that was able to predict the gender of a text’s author with roughly eighty percent accuracy. Although Koppel did not use LIWC in his algorithm, his team’s methods will inform how I customize LIWC, which allows users to add words or expressions to its dictionaries.

Works Cited

Kahn, Jeffrey H., Renée M. Tobin, Audra E. Massey, and Jennifer A. Anderson. “Measuring Emotional Expression with the Linguistic Inquiry and Word Count.” The American Journal of Psychology 120.2 (2007): 263. Print.

Koppel, M., et al. “Automatically Categorizing Written Texts by Author Gender.” Literary and Linguistic Computing 17.4 (2002): 401-412. Print.

Lakoff, Robin Tolmach. Language and Woman’s Place. 1975. New York: Octagon Books, 1976. Print.

Mills, Sara. “Discourse Competence: Or How to Theorize Strong Women Speakers.” Hypatia 7.2 (1992): 4-17. Print.


A Closer Look at Facebook Friends

Motivational speaker Jim Rohn said that “you are the average of the five people you spend the most time with,” but does this adage apply to online activity? Am I the average of my eight-hundred-some Facebook friends?

Over the past few weeks I’ve been examining my Facebook friends more closely than ever before. Sound a little creepy? Well, you’re not wrong. With the help and guidance of Meg Schwenzfeier, I was able to download a huge amount of data about my Facebook friends. This data included a lot of basic information like names, genders, ages and hometowns, but I was particularly interested in their profile pictures. Specifically, I’m trying to spot trends related to the equal sign profile picture, which surfaced on March 26th, 2013, thanks to a gay-rights advocacy group called the Human Rights Campaign.

In the process of examining those friends who adopted the profile picture, I’ve learned a lot about my group of Facebook friends as a whole. At first, I was surprised at how many of the profile-picture adopters were female. But after looking at the clusters of friends in my network, from my friends from the all-girls camp I attended as a kid to the hundred-some girls in my sorority, I realized that my entire sample had a decidedly female bias, with a whopping 540 female friends and only 275 male friends.

However, the female bias among my Facebook friends is nothing compared to the age bias. Although Facebook data suggests that the demographic most likely to change their profile picture was 30-somethings, the other stand-out was college towns. In fact, the county with the greatest rate of equal sign profile pictures was the one that includes Ann Arbor, Michigan, with a rate of 6.2%. Since my friends are overwhelmingly college-aged, it’s not surprising that their adoption rate stood at 5.65%, a little over the national average.
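
For a sense of scale, the figures quoted so far can be combined with a few lines of arithmetic. The sketch below assumes the 5.65% adoption rate was computed over the same 815 friends who have a recorded gender, which the post does not state outright.

```python
female_friends = 540
male_friends = 275
total_friends = female_friends + male_friends  # 815 friends with a recorded gender

# Share of the sample that is female, as a percentage.
female_share = 100 * female_friends / total_friends

adoption_rate = 5.65  # percent, as quoted in the post

# Assumption: the 5.65% rate was computed over these same 815 friends.
implied_adopters = round(total_friends * adoption_rate / 100)
```

Under that assumption, about two thirds of the sample is female and the adoption rate corresponds to roughly 46 friends, a small enough group that any subgroup comparisons should be read cautiously.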

So am I the average of my Facebook friends? As someone who adopted the equal sign profile picture, am I the average of that subset? On a superficial level, the data points to yes. My Facebook friends are an overwhelmingly female, college-aged and liberal echo chamber. However, as I keep working with the data, I hope to find more nuance in which factors made my Facebook friends more or less likely to adopt the profile picture.