Evolving linguistic divergence on polarizing social media

NU author(s): Dr Christine Cuskley


Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

Language change is influenced by many factors, but often starts from synchronic variation, where multiple linguistic patterns or forms coexist, or where different speech communities use language in increasingly different ways. Besides regional or economic reasons, communities may form and segregate based on political alignment. The latter, referred to as political polarization, is of growing societal concern across the world. Here we map and quantify linguistic divergence across the partisan left-right divide in the United States, using social media data. We develop a general methodology to delineate (social) media users by their political preference, based on which (potentially biased) news media accounts they do and do not follow on a given platform. Our data consists of 1.5M short posts by 10k users (about 20M words) from the social media platform Twitter (now “X”). Delineating this sample involved mining the platform for the lists of followers (n = 422M) of 72 large news media accounts. We quantify divergence in topics of conversation and word frequencies, messaging sentiment, and lexical semantics of words and emoji. We find signs of linguistic divergence across all these aspects, especially in topics and themes of conversation, in line with previous research. While US American English remains largely intelligible within its large speech community, our findings point at areas where miscommunication may eventually arise given ongoing polarization and therefore potential linguistic divergence. Our flexible methodology — combining data mining, lexicostatistics, machine learning, large language models and a systematic human annotation approach — is largely language and platform agnostic. In other words, while we focus here on US political divides and US English, the same approach is applicable to other countries, languages, and social media platforms.
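As a rough illustration of the delineation idea described in the abstract (assigning users to a side according to which news media accounts they do and do not follow), the following is a minimal sketch only. The outlet handles, the lean labels, the `min_follows` threshold and the function name `classify_user` are all hypothetical and are not taken from the paper; the authors' actual criteria are in the article and the linked code repository.

```python
from typing import Optional, Set

# Hypothetical outlet handles and lean labels, for illustration only.
# The study used 72 large news media accounts; none are reproduced here.
OUTLET_LEAN = {
    "outlet_a": "left",
    "outlet_b": "left",
    "outlet_c": "right",
    "outlet_d": "right",
}


def classify_user(followed: Set[str], min_follows: int = 2) -> Optional[str]:
    """Label a user by the news accounts they follow and do not follow.

    Assumed rule (not the paper's): a user counts as "left" if they follow
    at least `min_follows` left-labelled outlets and no right-labelled ones,
    and symmetrically for "right". Everyone else is left unlabelled.
    """
    left = sum(1 for outlet in followed if OUTLET_LEAN.get(outlet) == "left")
    right = sum(1 for outlet in followed if OUTLET_LEAN.get(outlet) == "right")
    if left >= min_follows and right == 0:
        return "left"
    if right >= min_follows and left == 0:
        return "right"
    return None  # mixed, too few follows, or no listed outlets
```

Under this assumed rule, a user following both a left-labelled and a right-labelled outlet is excluded from either group, which mirrors the "do and do not follow" wording in the abstract.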


Publication metadata

Author(s): Karjus A, Cuskley C

Publication type: Article

Publication status: Published

Journal: Humanities and Social Sciences Communications

Year: 2024

Volume: 11

Online publication date: 15/03/2024

Acceptance date: 04/03/2024

Date deposited: 01/05/2024

ISSN (electronic): 2662-9992

Publisher: Nature Publishing Group

URL: https://doi.org/10.1057/s41599-024-02922-9

DOI: 10.1057/s41599-024-02922-9

Data Access Statement: The code used to run the analyses is available at https://github.com/andreskarjus/evolving_divergence. Unfortunately, and exceptionally, at this time we cannot make either the collected data or the tweet or user IDs publicly available, in order to avoid potential conflicts with the current Terms of Service of the Twitter/X platform regarding potentially political and sensitive contexts. The data may be shared directly upon reasonable request.



Funding

Funder reference / Funder name
ESRC Research Grant ES/T005955/1
the European Union Horizon 2020 research and innovation program (Project No. 810961)