Research

I am a computational social scientist with a substantive focus on social media, digital politics, and elite political networks. A common theme in my research is the use of large-scale data from social media (e.g. YouTube, Twitter) and other digital platforms (e.g. Wikipedia, Wikidata, Google Search) to understand politics and society. Such data often require specialized computational methods for text analysis, network analysis, and machine learning. Some of my work relates directly to the use of the now booming transformer-based language models in social science research.

Please find information on my publications and working papers below, grouped by sub-themes.

Social Media and Society

Kevin Munger, James Bisbee, Omer Yalcin, Joseph Phillips, Matthew Hindman. 2025. “Pressing Play on Politics: Quantitative Description of YouTube.” In Journal of Quantitative Description

Abstract: We present a large-scale quantitative analysis of anglophone politics channels on YouTube, with three distinct units of analysis: channels, comments, and videos. We demonstrate that although channels have been entering the YouTube system at a roughly constant rate since 2008, there is serious inequality in the attention received by different channels and videos. Furthermore, prolific commenters are responsible for an astonishing amount of activity: 50% of total comments are written by just over 2% of all commenters. The toxicity for which YouTube comments are famous tends to be more pronounced among these super-users than among infrequent commenters. Our findings have important implications for the way in which YouTube viewers interpret what they see as representative of public opinion.

One of our most striking findings is the extreme inequality in subscriber, video, and view counts. Note that the y-axes are on a log scale.

Akbiyik, Ahmet, Omer F. Yalcin, Abdullatif Köksal, and M. Tahir Kilavuz. “Blame Thy Neighbor: Terrorist Attacks and Anti-Refugee Sentiment” (SSRN preprint: https://ssrn.com/abstract=4986999)

Abstract: We scrutinize how terrorist attacks influence public attitudes toward refugees, focusing on the complex interplay between such events and societal perceptions in host countries. Analyzing a large dataset of about 1 million Turkish language tweets and conducting a survey experiment, we find a significant rise in anti-refugee sentiment following attacks. Notably, those who support the government and are initially less prejudiced against refugees exhibit the most substantial change in attitude. This shift is not driven by dissatisfaction with government policies but by a dormant tendency to blame refugees, which emerges in the wake of terrorism. The findings emphasize the need to understand the nuanced effects of security-related events on refugee perceptions.

We find a sharp drop in sentiment toward refugees (that is, a shift toward the negative) in the immediate aftermath of terrorist attacks.

Sangyeon Kim, Omer F. Yalcin, Samuel E. Bestvater, Kevin Munger, Burt L. Monroe, and Bruce A. Desmarais. “The Effects of an Informational Intervention on Attention to Anti-Vaccination Content on YouTube.” 2020. In Proceedings of the International AAAI Conference on Web and Social Media, vol. 14, pp. 949-953.

Abstract: The spread of misinformation related to health, especially vaccination, is a potential contributor to myriad public health problems. This misinformation is frequently spread through social media. Recently, social media companies have intervened in the dissemination of misinformation regarding vaccinations. In the current study we focus on YouTube. Recognizing the extent of the problem, YouTube implemented an informational modification that affected many videos related to vaccination beginning in February 2019. We collect original data and analyze the effects of this intervention on video viewership. We find that this informational intervention reduced traffic to the affected videos, both overall, and in comparison to a carefully-matched set of control videos that did not receive the informational modification.

We use comment counts as a proxy for view counts and find that YouTube’s informational intervention resulted in a reduction in traffic to the videos affected by the intervention.

Computational Linguistics

Abdullatif Köksal, Omer Faruk Yalcin, Ahmet Akbiyik, M. Tahir Kilavuz, Anna Korhonen, and Hinrich Schütze. 2023. “Language-Agnostic Bias Detection in Language Models.” In Findings of the Association for Computational Linguistics: EMNLP 2023

Abstract: Pretrained language models (PLMs) are key components in NLP, but they contain strong social biases. Quantifying these biases is challenging because current methods focusing on fill-the-mask objectives are sensitive to slight changes in input. To address this, we propose a bias probing technique called LABDet, for evaluating social bias in PLMs with a robust and language-agnostic method. For nationality as a case study, we show that LABDet “surfaces” nationality bias by training a classifier on top of a frozen PLM on non-nationality sentiment detection. We find consistent patterns of nationality bias across monolingual PLMs in six languages that align with historical and political context. We also show for English BERT that bias surfaced by LABDet correlates well with bias in the pretraining data; thus, our work is one of the few studies that directly links pretraining data to PLM behavior. Finally, we verify LABDet’s reliability and applicability to different templates and languages through an extensive set of robustness checks. We publicly share our code and dataset in https://github.com/akoksal/LABDet.

We find large differences in the sentiment detected by various language models, across different languages, when merely changing the nationality associated with the person mentioned in the sentence.

Nitheesha Nakka, Omer Yalcin, Bruce Desmarais, Sarah Rajtmajer, and Burt Monroe. “The study of short texts in digital politics: Document aggregation for topic modeling” (arXiv preprint: https://arxiv.org/abs/2503.05065)

Abstract: Statistical topic modeling is widely used in political science to study text. Researchers examine documents of varying lengths, from tweets to speeches. There is ongoing debate on how document length affects the interpretability of topic models. We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. In our study, we analyze one million tweets by U.S. state legislators from April 2016 to September 2020. We find that for documents aggregated at the account level, topics are more associated with individual states than when using individual tweets. This finding is replicated with Wikipedia pages aggregated by birth cities, showing how document definitions can impact topic modeling results.

Number of topics whose top 10 FREX words include state-related words (sum of proportions of state-related topics): tweet-level vs. legislator-level document definition. The document definition influences which topics one finds in the same underlying text corpus.

Elite Political Networks

Omer F. Yalcin. 2022. “Empirical Study of Elite Networks with Wikidata” (OSF preprint: https://osf.io/yrfvs)

Abstract: The elusiveness of cross-country empirical measurement is a significant hurdle in the study of comparative political elite networks. Most studies of elites focus on a few countries or a region, limit their scope to specific institutions, and use ad-hoc data collection methods, making comparative work difficult. Wikidata stores not only entities like people, places, and institutions but also the relationships that connect them, such as parent, alma mater, country of citizenship, and so on. Wikidata’s bias toward notable people and its graph structure make it suitable for elite networks. An application to kinship networks with data on 272,930 actors across the globe shows that kinship ties among the elite increase with authoritarianism, conforming to theories of coup-proofing and corruption.

The largest component of the Saudi elite kinship network in Wikidata

Firat Kimya, Abdullah Sait Ozcan, and Omer Faruk Yalcin. “Elite Networks and Personalized Rule” (SSRN preprint: https://ssrn.com/abstract=5173713)

Abstract: The breakdown of democracy by elected incumbents has become a global phenomenon. The existing literature falls short in adequately explaining how populist-minded strongmen ascend to power within party organizations. We argue that party personalization precedes democratic backsliding, as would-be autocrats often dominate their party institutions before aggrandizing executive authority at the national level. We analyze the elite networks of Turkey’s ruling party, focusing on two key units: the Central Decision-making and Administrative Committee and the Cabinet, whose members occupy pivotal roles within the party and regime. Using an original dataset of 320 individuals who served in leadership positions in the governing party over the course of 22 years and network analysis, we demonstrate that party personalization intensified well before democratic breakdown. These findings reveal that potential autocrats consolidate their control over party structures as a precursor to undermining institutional and electoral barriers. We contribute to the literature on democratic backsliding and elite networks, highlighting the subtle nature of party personalization, which can occur even in regimes considered democratic.

Using Google search data to track elite network structure over time, we find that the cabinet is one of the principal organs of government that lost influence as the Turkish regime became increasingly personalized. Over time, an actor's membership in the cabinet becomes less and less predictive of tie formation with that actor.

Other Work

Nicole Schwitter and Omer Faruk Yalcin. “Web Data Collection.” In Handbook of Quantitative Methods in Sociology, ed. Ulf Liebe, Edward Elgar Publishing, forthcoming. (SSRN preprint: https://ssrn.com/abstract=5009050)