Gender, writing and ranking in online communities: A case study of the IMDb

Journal Article
Jahna Otterbacher
Knowledge and Information Systems 35(3), 2013, Springer-Verlag

Online review forums provide consumers with essential information about goods and services by facilitating word-of-mouth communication. Although preferences are correlated with demographic characteristics, reviewer gender is not often provided on user profiles. We consider the case of the Internet Movie Database (IMDb), where users exchange views on movies. Like many forums, IMDb employs collaborative filtering such that by default, reviews are ranked by perceived utility. IMDb also provides a unique gender filter that displays an equal number of reviews authored by men and women. Using logistic classification, we compare reviews with respect to writing style, content and metadata features. We find salient differences in stylistic features and content between reviews written by men and women, as predicted by sociolinguistic theory. However, utility is the best predictor of gender, with women’s reviews perceived as being much less useful than those written by men. While we cannot observe who votes at IMDb, we do find that highly rated female-authored reviews exhibit “male” characteristics. Our results have implications for which contributions are likely to be seen, and to what extent participants get a balanced view as to “what others think” about an item.

Learning the lingo? Gender, prestige and linguistic adaptation in review communities

Conference Proceedings
Libby Hemphill and Jahna Otterbacher
In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW), Seattle, WA, February 2012

Women and men communicate differently in both face-to-face and computer-mediated environments. We study linguistic patterns considered gendered in reviews contributed to the Internet Movie Database. IMDb has been described as a male-majority community, in which females contribute fewer reviews and enjoy less prestige than males. Analyzing reviews posted by prolific males and females, we hypothesize that females adjust their communication styles to be in sync with their male counterparts. We find evidence that while certain characteristics of “female language” persevere over time (e.g., frequent use of pronouns), others (e.g., hedging) decrease with time. Surprisingly, we also find that males often increase their use of “female” features. Our results indicate that even when they resemble men’s reviews linguistically, women’s reviews still enjoy less prestige and smaller audiences.

Being heard in review communities: Communication tactics and review prominence

Journal Article
Jahna Otterbacher
Journal of Computer-Mediated Communication 16(3), April 2011, Pages 424-444

Review communities typically display contributions in list format, using participant feedback in determining presentation order. Given the volume of contributions, which are likely to be seen? While previous work has focused on content, we examine the relationship between communication tactics and prominence. We study three communities, comparing front-page reviews versus those on later pages. We consider three types of devices: structural features, textual features, and persuasive writing. Structural features, such as profiles, convey information about authors. Textual properties, such as punctuation use, can make an impression on others. Rhetorical writing strategies are used by reviewers to convince readers of their opinions. When controlling for content, the most salient tactics distinguishing prominent reviews are textual properties and persuasive language.

Our news, their events? A comparison of archived current events on English and Greek Wikipedias

Book Chapter
Jahna Otterbacher
In: Fichman, P. and Hara, N. (Eds.): Global Wikipedia: International and cross-cultural issues in online collaboration. Rowman and Littlefield, Pages 49-68.

This study focuses on the archived current events portal of the English language Wikipedia and the Greek language Wikipedia. In addition to language and culture, these communities differ substantially with respect to size and participation; while the English Wikipedia is the largest, the Greek Wikipedia ranks 49th in size, containing just over 84,000 articles. In addition, while English Wikipedia has received much attention from researchers, we are not aware of previous work on the Greek-language community. As will be motivated and explained, archived current events during the past decade will be compared across the two Wikipedias with respect to their content, using concepts surrounding the production of news as a theoretical lens.

Interacting or just acting? A case study of European, Korean, and American Politicians’ interactions with the public on Twitter

Journal Article
Jahna Otterbacher, Matthew A. Shapiro, and Libby Hemphill
Journal of Contemporary Eastern Asia 12(1), April/May 2013

Social media holds potential to facilitate vertical political communication by giving citizens the opportunity to interact directly with their representatives. However, skeptics claim that even when politicians use “interactive media,” they avoid direct engagement with constituents, using technology to present a façade of interactivity instead of genuine dialog. This study explores how elected officials in three regions of the world are using Twitter to interact with the public. Using the Twitter activity of 15 officials over a period of six months, we show that in addition to the structural features of Twitter that are designed to promote interaction, officials rely on language to foster or to avoid engagement. We also provide evidence that the existence of interactive features does not guarantee interactivity.

What's Congress doing on Twitter?

Conference Proceedings
Libby Hemphill, Jahna Otterbacher, Matthew A. Shapiro
Proceedings of the Association for Computing Machinery Conference on Computer Supported Cooperative Work and Social Computing (ACM CSCW), New York, NY, USA, 877-886.

As Twitter becomes a more common means for officials to communicate with their constituents, it becomes more important that we understand how officials use these communication tools. Using data from 380 members of Congress’ Twitter activity during the winter of 2012, we find that officials frequently use Twitter to advertise their political positions and to provide information but rarely to request political action from their constituents or to recognize the good work of others. We highlight a number of differences in communication frequency between men and women, Senators and Representatives, Republicans and Democrats. We provide groundwork for future research examining the behavior of public officials online and testing the predictive power of officials’ social media behavior.

"Helpfulness" in online communities: a measure of message quality

Conference Proceedings
Jahna Otterbacher
Proceedings of the Association for Computing Machinery Conference on Human Factors in Computing Systems, ACM Press, New York, pages 955-964.

Online communities displaying textual postings require measures to combat information overload. One popular approach is to ask participants whether or not messages are helpful in order to then guide others to interesting content. Adopting a well-established framework for assessing data quality, we examine the nature of “helpfulness.” We study consumer reviews at, deriving 22 measures quantifying their textual properties, authors’ reputations and product characteristics. Confirmatory factor analysis reveals five underlying quality dimensions representing reviewers’ reputations in the community, the topical relevancy of the reviews, the ease of understanding them, their believability and objectivity. A correlation and regression analysis confirms that these dimensions are related to the helpfulness scores assigned by community participants. However, it also uncovers a strong relationship between the chronological ordering of reviews and helpfulness, which both community participants and designers should keep in mind when using this method of social navigation.

Crowdsourcing Stereotypes: Linguistic Bias in Metadata Generated via GWAP

Conference Proceedings
Jahna Otterbacher
Proceedings of the ACM Conference on Human Factors in Computing Systems (ACM CHI)

Games with a Purpose (GWAP) is a popular approach for metadata creation, enabling institutions to collect descriptions of digital artifacts on a mass scale. Creating metadata is challenging not only because one must recognize the artifact; the description must then be encoded into natural language. Language behaviors are influenced by many social factors, particularly when we are asked to describe other people. We consider labels for images of people generated via the ESP Game. While ESP has been shown to produce relevant labels, critics claim they are obvious and stereotypical. Based on theories of linguistic biases, we examine whether there are systematic differences in the ways players describe images of men versus women. Our first analysis considers images of people generally, and reveals a tendency for women to be described with subjective adjectives. A second analysis compares images depicting men and women within each of six occupational roles. Images of women receive more labels related to appearance, whereas those depicting men receive more occupation-related labels. Our work exposes the presence of gender-based stereotypes through linguistic biases, illustrates the forms in which they manifest, and raises important implications for those who design systems or train algorithms using data produced via GWAP.

Adoption of translation support technologies in a multilingual work environment

Book Chapter
Jahna Otterbacher
Ishida, T., Fussell, S.R., Vossen, P.T.J.M. (eds.): Intercultural Collaboration I. Lecture Notes in Computer Science, Springer-Verlag, Pages 276-290.

We study the adoption of translation support technologies by professors at a multilingual university, using the framework of the Technology Adoption Model (TAM). TAM states that a user’s perceived usefulness and ease of use for the technology ultimately determines her actual use of it. Through a survey and a set of interviews with our subjects, we find that there is evidence for TAM in the context of translation support tools. However, we also find that user adoption of these tools is a bit more complicated. Users who are able to successfully employ these tools have not only developed strategies to overcome their inaccuracies (e.g. by post-editing machine translated text), they also often compensate for the weaknesses of a given technology by combining the use of multiple tools.

Biased LexRank: Passage retrieval using random walks with question-based priors

Journal Article
Jahna Otterbacher, Gunes Erkan, Dragomir Radev
Information Processing and Management, 45(1), pages 42-54.

We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user’s natural language question. We then perform a random walk on the lexical similarity graph in order to recursively retrieve additional passages that are similar to other relevant passages. We present results on several benchmarks that show the applicability of our work to question answering and topic-focused text summarization.
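The idea summarized in this abstract (a random walk over a passage similarity graph, biased by a question-based prior) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `biased_walk`, the whitespace tokenization, the bag-of-words cosine similarity, and the `damping` value of 0.85 are all assumptions made for illustration.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def biased_walk(passages, question, damping=0.85, iters=50):
    """Score passages with a random walk biased toward the question.

    damping (an assumed value) is the probability of following a
    similarity edge instead of jumping according to the prior.
    """
    if not passages:
        return []
    bags = [Counter(p.lower().split()) for p in passages]
    q = Counter(question.lower().split())
    n = len(bags)

    # Question-based prior: similarity to the question, normalized to sum to 1.
    prior = [cosine(b, q) for b in bags]
    total = sum(prior)
    prior = [x / total for x in prior] if total else [1.0 / n] * n

    # Row-normalized passage-to-passage similarity, without self-loops.
    # A passage with no similar neighbors falls back to the prior.
    trans = []
    for i in range(n):
        row = [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        s = sum(row)
        trans.append([x / s for x in row] if s else list(prior))

    # Power iteration: mix the prior with mass flowing along similarity edges.
    scores = list(prior)
    for _ in range(iters):
        scores = [
            (1 - damping) * prior[j]
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

Calling `biased_walk(passages, question)` returns one score per passage; the scores sum to 1, and higher values mark passages that are similar either to the question or to other highly scored passages.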

Write Like I Write: Herding in the Language of Online Reviews

Conference Proceedings
Loizos Michael, Jahna Otterbacher
International AAAI Conference on Weblogs and Social Media, June 2014, Pages,

Our behaviors often converge with those of others, and language within social media is no exception. We consider reviews of tourist attractions at TripAdvisor (TA), the world’s largest resource for travel information. Unlike social networking sites, TA review forums do not facilitate direct interaction between participants. Nonetheless, theory suggests that language is guided by writers’ conception of their audience, and that their style can shift in response. We implement a model of herding as a local transmission process, exploring the hypothesis that a reviewer is influenced by how preceding reviews manifest a given stylistic feature (e.g., pronouns, paralinguistic devices). We find that reviewers are more likely to use unusual features when such characteristics appear in their local context. The extent to which reviewers are influenced by context is correlated to attributes shared in their profiles, as well as their sentiment toward the attraction reviewed. Our results suggest that language can be influenced by others, even in an asynchronous environment with little to no interpersonal interaction. In other words, our behaviors may be susceptible to manipulation in social media; it may not always be the case that we write like ourselves.

Different voices, similar perspectives? "Useful" reviews at the Internet Movie Database

Conference Proceedings
Jing Gao, Jahna Otterbacher, Libby Hemphill
Proceedings of the Nineteenth Americas Conference on Information Systems, Chicago, Illinois, August 15-17, 2013.

Digital networked environments have enabled cross-border interactions, which can facilitate understanding and multiperspectivalism when participants’ opportunities to be heard by others are not limited by social or technical factors. We examine the Internet Movie Database (IMDb), where people worldwide contribute movie ratings and reviews. An important feature of IMDb is its social voting mechanism, which serves as a gatekeeping process; participants vote as to whether or not reviews are “useful,” and the most useful reviews are predominately displayed. We question whether or not international voices are represented among these elite reviews, and whether they bring unique views that differ from domestic (U.S.) perspectives. We find that international contributions are among the best-rated reviews at rates we would expect. However, we find little evidence that these reviews differ from those written by locals, and question whether participants are really exposed to alternative views of popular movies, despite IMDb’s international character.

Hierarchical summarization for delivering information to mobile devices

Journal Article
Jahna Otterbacher, Dragomir Radev, Omer Kareem
Information Processing and Management, 44(2), Pages 931-947.

Access to information via handheld devices supports decision making away from one’s computer. However, limitations include small screens and constrained wireless bandwidth. We present a summarization method that transforms online content for delivery to small devices. Unlike previous algorithms, ours assumes nothing about document formatting, and induces a hierarchical structure based on the relative importance of sentences within the document. As compared to delivering full documents, the method reduces the bytes transferred by half. An experiment also demonstrates that when given hierarchical summaries, users are no less accurate in answering questions about the documents.

Exploring fact‐focused relevance and novelty detection

Journal Article
Jahna Otterbacher, Dragomir Radev
Journal of Documentation, 64(4), Pages 496-510

Automated sentence‐level relevance and novelty detection would be of direct benefit to many information retrieval systems. However, the low level of agreement between human judges performing the task is an issue of concern. In previous approaches, annotators were asked to identify sentences in a document set that are relevant to a given topic, and then to eliminate sentences that do not provide novel information. This paper aims to explore a new approach in which relevance and novelty judgments are made within the context of specific, factual information needs, rather than with respect to a broad topic. An experiment is conducted in which annotators perform the novelty detection task in both the topic‐focused and fact‐focused settings. Higher levels of agreement between judges are found on the task of identifying relevant sentences in the fact‐focused approach. However, the new approach does not improve agreement on novelty judgments. The analysis confirms the intuition that making sentence‐level relevance judgments is likely to be the more difficult of the two tasks in the novelty detection framework.

NewsInEssence: Summarizing online news topics

Journal Article
Dragomir Radev, Jahna Otterbacher, Adam Winkel, Sasha Blair-Goldensohn
Communications of the Association for Computing Machinery (CACM), 48(10), Pages 95-98.

A news delivery and summarization system, acting as a user’s agent, gathers and recaps news items based on specifications and interests.

Is the crowd biased? Understanding group value judgments in open contribution systems

Conference Proceedings
Jahna Otterbacher, Libby Hemphill
Proceedings of the ACM CSCW Workshop on Collective Intelligence as Community Discourse and Action.

Binary voting mechanisms are extensively used in open contribution systems (OCS), crowdsourcing the decision as to which contributions are most valued. Often, aggregated votes are used to rank a shared collection of artifacts (e.g., the most helpful reviews of a medication shared by those who use it; the best-liked reader responses to a news story). Previous work suggests that voting mechanisms pick up on salient dimensions of content quality and that users agree which contributions are “helpful” or “likable” in particular contexts. However, researchers have also documented systematic biases in the rankings that few systems reveal to users. After illustrating the need for a comprehensive study of social voting bias in OCS across domains, we propose a framework under which such research can be carried out. Finally, we discuss implications for OCS design as well as for end-users.

Linguistic Bias in Collaboratively Produced Biographies: Crowdsourcing Social Stereotypes?

Conference Proceedings
Jahna Otterbacher
Proceedings of the 9th International AAAI Conference on Web and Social Media (AAAI ICWSM-15)

Language is the primary medium through which stereotypes are conveyed. Even when we avoid using derogatory language, there are many subtle ways in which stereotypes are created and reinforced, and they often go unnoticed. Linguistic bias, the systematic asymmetry in language patterns as a function of the social group of the persons described, may play a key role. We ground our study in the social psychology literature on linguistic biases, and consider two ways in which biases might manifest: through the use of more abstract versus concrete language, and subjective words. We analyze biographies of African American and Caucasian actors at the Internet Movie Database (IMDb), hypothesizing that language patterns vary as a function of race and gender. We find that both attributes are correlated to the use of abstract, subjective language. Theory predicts that we describe people and scenes that are expected, as well as positive aspects of our in-group members, with more abstract language. Indeed, white actors are described with more abstract, subjective language at IMDb, as compared to other social groups. Abstract language is powerful because it implies stability over time; studies have shown that people have better impressions of others described in abstract terms. Therefore, the widespread prevalence of linguistic biases in social media stands to reinforce social stereotypes. Further work should consider the technical and social characteristics of the collaborative writing process that lead to an increase or decrease in linguistic biases.

Competent Men and Warm Women: Gender Stereotypes and Backlash in Image Search Results

Conference Proceedings
Jahna Otterbacher, Jo Bates, Paul Clough
Proceedings of the CHI Conference on Human Factors in Computing Systems, May 2017, Pages 6620-6631

There is much concern about algorithms that underlie information services and the view of the world they present. We develop a novel method for examining the content and strength of gender stereotypes in image search, inspired by the trait adjective checklist method. We compare the gender distribution in photos retrieved by Bing for the query “person” and for queries based on 68 character traits (e.g., “intelligent person”) in four regional markets. Photos of men are more often retrieved for “person,” as compared to women. As predicted, photos of women are more often retrieved for warm traits (e.g., “emotional”) whereas agentic traits (e.g., “rational”) are represented by photos of men. A backlash effect, where stereotype-incongruent individuals are penalized, is observed. However, backlash is more prevalent for “competent women” than “warm men.” Results underline the need to understand how and why biases enter search algorithms and at which stages of the engineering process.

S/he's too Warm/Agentic!: The Influence of Gender on Uncanny Reactions to Robots

Conference Proceedings
Jahna Otterbacher, Michael Talias
Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, Pages 214-223

Gender stereotypes are strong influences on human behavior. Given our tendency to anthropomorphize, incorporating gender cues into a robot’s design can influence acceptance by humans. However, little is known about the interaction between human and robot gender. We focus on the role of gender in eliciting negative, “uncanny” reactions from observers. We create a corpus of YouTube videos featuring robots with female, male and no gender cues. Our experiment is grounded in Gray and Wegner’s (2012) model, which holds that uncanny reactions are driven by the perception of robot agency (i.e., ability to plan and control) and experience (i.e., ability to feel), which in turn, is driven by robot appearance and behavior. Participants watched videos and completed questionnaires to gauge perceptions of robots as well as affective reactions. We used Structural Equation Modeling to test whether the model explains reactions of both men and women. For gender-neutral robots, it does. However, we find a salient human-robot gender interaction. Men’s uncanny reactions to robots with female cues are best predicted by the perception of experience, while women’s negativity toward masculine robots is driven by perceived agency. The result is interpreted in light of the “Big Two” dimensions of person perception, which underlie expectations for women to be warm and men to be agentic. When a robot meets these expectations, it increases the chances of an uncanny reaction in the other-gender observer.