Lookup NU author(s): Dr Christine Cuskley
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
Language change is influenced by many factors, but often starts from synchronic variation, where multiple linguistic patterns or forms coexist, or where different speech communities use language in increasingly different ways. Besides regional or economic reasons, communities may form and segregate based on political alignment. The latter, referred to as political polarization, is of growing societal concern across the world. Here we map and quantify linguistic divergence across the partisan left-right divide in the United States, using social media data. We develop a general methodology to delineate (social) media users by their political preference, based on which (potentially biased) news media accounts they do and do not follow on a given platform. Our data consists of 1.5M short posts by 10k users (about 20M words) from the social media platform Twitter (now “X”). Delineating this sample involved mining the platform for the lists of followers (n = 422M) of 72 large news media accounts. We quantify divergence in topics of conversation and word frequencies, messaging sentiment, and lexical semantics of words and emoji. We find signs of linguistic divergence across all these aspects, especially in topics and themes of conversation, in line with previous research. While US American English remains largely intelligible within its large speech community, our findings point at areas where miscommunication may eventually arise given ongoing polarization and therefore potential linguistic divergence. Our flexible methodology — combining data mining, lexicostatistics, machine learning, large language models and a systematic human annotation approach — is largely language and platform agnostic. In other words, while we focus here on US political divides and US English, the same approach is applicable to other countries, languages, and social media platforms.
Author(s): Karjus A, Cuskley C
Publication type: Article
Publication status: Published
Journal: Humanities and Social Sciences Communications
Year: 2024
Volume: 11
Online publication date: 15/03/2024
Acceptance date: 04/03/2024
Date deposited: 01/05/2024
ISSN (electronic): 2662-9992
Publisher: Nature Publishing Group
URL: https://doi.org/10.1057/s41599-024-02922-9
DOI: 10.1057/s41599-024-02922-9
Data Access Statement: The code used to run the analyses is available at https://github.com/andreskarjus/evolving_divergence. Unfortunately, and exceptionally, at this time we cannot make either the collected data or the tweet or user IDs publicly available, in order to avoid potential conflicts with the current Terms of Service of the Twitter/X platform regarding potentially political and sensitive contexts. The data may be shared directly upon reasonable request.