I am a Moore Sloan research fellow in the Center for Data Science at New York University, where I work in the Social Media and Political Participation lab. I received my PhD in Political Science from the University of Washington in 2018.
My research interests encompass political communication, public policy processes, and computational social science. I am particularly interested in how social movements and interest groups influence the political agenda and the decision-making process in the current media environment. My methodological interests and strengths are natural language processing (text as data), computer vision (images as data), and machine learning and artificial intelligence in general.
In addition to my work at the Social Media and Political Participation lab, I collaborate with John Wilkerson and Matt Denny on an NSF-funded project developing text as data methods to study the lawmaking process in the U.S. Congress; and with John Wilkerson and Nora Webb Williams on another NSF-funded project developing computer vision methods to study how images shared on social media contribute to the diffusion of outsider groups such as civil society organizations, social movements, and radical violent groups.
Abstract: For more than half a century, scholars have been studying legislative effectiveness using a single metric - whether the bills a member sponsors progress through the legislative process. We investigate a less orthodox form of effectiveness - bill proposals that become law as provisions of other bills. Counting these “hitchhiker” bills as additional cases of bill sponsorship success reveals a more productive, less hierarchical and less partisan lawmaking process. We argue that agenda and procedural constraints are central to understanding why lawmakers pursue hitchhiker strategies. We also investigate the legislative vehicles that attract hitchhikers and find, among other things, that more Senate bills are enacted as hitchhikers on House laws than become law on their own.
Abstract: Do images affect online political mobilization? If so, how? These questions are of fundamental importance to scholars of social movements, contentious politics, and political behavior generally. However, little prior work has systematically addressed the role of images in mobilizing online participation in social movements. We first confirm that images have a positive mobilizing effect in the context of online protest activity. We then argue that images are mobilizing because they trigger stronger emotional reactions than text. Building on existing political psychology models, we theorize that images evoking enthusiasm, anger, and fear should be particularly mobilizing, while sadness should be demobilizing. We test the argument through a study of Twitter activity related to a Black Lives Matter protest. We find that both images in general and some of the proposed emotional attributes (enthusiasm and fear) contribute to online participation. The results hold when controlling for alternative theoretical mechanisms for why images should be mobilizing, as well as for the presence of frequent image features. Our paper thus provides evidence supporting the broad argument that images increase the likelihood that a protest spreads online, while also teasing out the mechanisms at play in a new media environment.
Abstract: Foundations are much more than disinterested philanthropic institutions that award grants to service-providing nonprofits. Foundations are political actors that seek to produce social change, not only by donating resources to nonprofits that promote causes but also by supporting policy reform in a more direct manner. We investigate engagement in advocacy among community foundations in the United States, which we define as the effort to influence public policy by proposing or endorsing ideas and by mobilizing stakeholders for social change. Drawing primarily on organizational sociology, we posit that the environmental context in which community foundations are situated and particular structural characteristics or operational features of community foundations (institutional logics, identity and embeddedness, and managerialism) will be associated with advocacy. We utilize machine learning techniques to establish an outcome measure of advocacy discourse on community foundation websites and ordinary least squares (OLS) regression to model that outcome with a cross-sectional dataset compiled from multiple sources. We find considerable support for our conceptual frame, and we conclude by offering an agenda for future research on foundations as interest groups.
Abstract: Strong party brands help congressional parties elect candidates, maintain or gain majority control, and advance their policy agendas. Because successful branding efforts depend on consistent messaging, party leaders try to choose issues that most members are willing to promote. But what do leaders do when a party majority pressures them to take up issues that harm the brand for others? We investigate the 2013 government shutdown as a branding event. House Republican leaders instigated the shutdown after learning that a majority of Republicans would not vote for a clean funding bill. However, instead of highlighting the issues that led to the shutdown, they publicized the party's efforts to resolve it. Party leaders sought to exploit the fact that party brands have both position and valence components to simultaneously address the demands of the party base and the electoral concerns of members representing competitive districts.
Abstract: Text has always been an important data source in political science. What has changed in recent years is the feasibility of investigating large amounts of text quantitatively. The internet provides political scientists with more data than their mentors could have imagined, and the research community is providing accessible text analysis software packages, along with training and support. As a result, text as data research is beginning to mainstream in political science. Scholars are tapping new data sources, they are employing more diverse methods, and they are becoming critical consumers of findings based on those methods. In this article, we first introduce readers to the subject by describing the four stages of a typical text as data project. We then review recent political science applications, and explore one important methodological challenge - topic model instability - in greater detail.
Abstract: In May 2011, thousands of outraged citizens (i.e. the indignados) occupied the squares of the main Spanish cities to express their discontent and demand reforms. This article uses Twitter messages to investigate the ability of the 15-M movement to place its claims on the media agenda and to keep ownership of its own discourse. The analysis emphasizes the fact that the social movement originated on the Internet with a highly decentralized structure and with scarce organizational resources. Results show that the protesters' discourse included a great number of claims, although the activists focused their discussions on three specific issues: electoral and party systems, democracy and governance, and finally, civil liberties. Moreover, the study reveals that the indignados managed to keep control over their repertoires and were able to determine the media agenda, even though the latter focused mainly on the most dramatic events.
Social scientists have long argued that images play a crucial role in politics. This role is heightened by the bombardment of images that people experience today. Digitization has both increased the presence of images in daily life and made it easier for scholars to access and collect large quantities of pictures and videos. However, using images as data for social science inference is an arduous task. Political scientists have therefore often turned to other data sources and puzzles, leaving substantive theoretical questions unanswered. Fortunately, recent innovations in computer vision can reduce the costs of using images as data. The goals of this project are twofold. First, we build on existing computer vision methods to present a set of automatic techniques that will aid political scientists working with images. We highlight the potential of Convolutional Neural Nets for automatic object detection and recognition; for face detection and recognition; and for visual sentiment analysis. Second, we apply these techniques to a novel dataset of Black Lives Matter Twitter protest images, demonstrating the ability of computer vision methods to replicate gold standard manual image labels.
Are legislators responsive to the issue demands of the public? If so, are they more likely to be responsive to some citizens than to others? Research on agenda setting and responsiveness finds a correspondence between the issue priorities of the public and politicians, but it does not provide conclusive evidence on who influences whom. We argue that determining the direction of the effect is of great relevance to adjudicate between particular models of representation and to judge political responsiveness generally. We fill this current gap by studying all tweets sent by Members of the United States Congress, sets of Democratic and Republican supporters and attentive citizens, and a random sample of Twitter users located in the United States from January 2013 to December 2014. Using a Latent Dirichlet Allocation model, we extract topics that represent the diversity of issues that legislators discuss. Then, we exploit variation in the distribution of topics over time to test whether Members of Congress lead or follow their constituents in their selection of issues to discuss. We find that legislators are more likely to respond to than to influence discussion of public issues, results that hold even after controlling for media effects. We also find that legislators are more likely to be responsive to their own supporters and attentive voters than to the general public.
The literature on protest mobilization has long suggested that social ties have a strong influence on the decision to protest. Recent literature on social media mobilization also shows that in the current digital environment these social tie effects often take place online, particularly on social media. However, previous research does not provide clear evidence for why personal ties play a mobilizing role. In this paper we lay out four main theoretical mechanisms and test them using real-world protest attendance data: social networks mobilize because a) they provide basic logistic information that is vital to protest coordination, b) they create motivations for others to protest, c) they solve coordination problems, and d) they put pressure on others to participate. We collect data on Twitter activity during the 2018 Women's March that took place in many cities in the United States on January 20th and 21st. We then use geolocated accounts to find a set of users who attended a march and a set of users who did not. We use machine learning techniques to determine the amount of information, motivation, coordination, and pressure frames to which they were exposed through their Twitter networks. In line with current theories, we find users who protested to be more connected among themselves than those who did not. With regard to the mechanism analysis, we find that users who saw a larger number of information, coordination, and, to a lesser extent, pressure frames were more likely to protest, but we find the opposite effect for motivation frames.
A rapidly growing body of research in political science uses unsupervised topic modeling techniques such as Latent Dirichlet Allocation (LDA) models to construct measures of topic attention for the purpose of hypothesis testing. A central advantage of unsupervised topic modeling, compared to supervised approaches, is that it does not require advance knowledge of the topics to be studied, or a sizable set of training examples. However, the topics discovered using these methods can be unstable. This is potentially problematic to the extent that researchers report results based on a single topic model specification. We propose an approach to using topic model results for hypothesis testing that incorporates information from multiple specifications. We then illustrate this robust approach by replicating an influential political science study. An R package (ldaRobust) for its implementation is provided.
Political scientists frequently study the public communications of members of Congress to better understand their electoral strategies, policy responsiveness, ability to influence public opinion, and media coverage of Congress. However, different studies base their conclusions on different communication channels, including (among others) member websites, newsletters, press releases, and social media. These studies have taken each individual source as fully representative of the elected official's communication strategy as a whole. What has not been asked is whether members communicate the same or different messages across these channels. In this paper we look at the press releases, tweets, and Facebook messages that members sent from August to December 2014 in order to study to what extent their communication strategy is consistent across channels. We use an automatic semi-supervised method to classify the messages into political issues and to assess the validity of inferring broader communication patterns from a single source.
These are courses I have taught as a Teaching Assistant at the University of Washington:
A Python module that provides a set of functions to fit multiple LDA models to a text corpus and then search for the robust topics present in multiple models.
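The idea of searching for topics that recur across several models can be illustrated with a minimal sketch. It is not the module's actual API: the function names `cosine` and `robust_topics`, the use of cosine similarity, and the 0.9 threshold are all assumptions made for illustration. The model-fitting step is omitted; each model is assumed to be already available as a list of topic-word probability vectors over a shared vocabulary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic-word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def robust_topics(reference, alternatives, threshold=0.9):
    """Indices of reference topics that have a close match in every alternative model.

    `reference` is one model (a list of topic vectors); `alternatives` is a list
    of other models. The threshold is an illustrative assumption.
    """
    robust = []
    for i, ref_topic in enumerate(reference):
        matched_everywhere = all(
            max(cosine(ref_topic, alt_topic) for alt_topic in model) >= threshold
            for model in alternatives
        )
        if matched_everywhere:
            robust.append(i)
    return robust


# Toy topic-word distributions over a four-word vocabulary (made-up numbers).
reference = [
    [0.70, 0.20, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.25, 0.25, 0.25, 0.25],
]
model_b = [
    [0.05, 0.05, 0.55, 0.35],
    [0.65, 0.25, 0.05, 0.05],
]
model_c = [
    [0.72, 0.18, 0.05, 0.05],
    [0.04, 0.06, 0.62, 0.28],
    [0.10, 0.80, 0.05, 0.05],
]

print(robust_topics(reference, [model_b, model_c]))  # prints [0, 1]
```

In this toy example the first two reference topics reappear in both alternative models, possibly in a different order, while the third (a flat distribution) has no close counterpart and is dropped. In practice the topic vectors would come from LDA models fitted with different numbers of topics or different random seeds.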