Carnegie Mellon University

Center for Informed Democracy & Social-cybersecurity (IDeaS)

CMU's center for disinformation, hate speech and extremism online

Table of facts for protests

March 15, 2022

Exploring Polarization in User Behavior on Twitter During the 2019 South American Protests

By Ramon Villa-Cox

Link to Publication:

Ramon Villa-Cox, Helen (Shuxuan) Zeng, Ashiqur R. KhudaBukhsh, Kathleen M. Carley, 2021, "Exploring Polarization of Users Behavior on Twitter During the 2019 South American Protests,"



The 2019 South American protests were a series of protests that shocked the region at the end of 2019. They started in Ecuador, spread to Chile, Bolivia and Colombia, and effectively paralyzed these countries for months. Apart from Bolivia, the protests resulted from populist movements seeking to resist austerity measures being imposed in each country and demanding more government spending on social programs. In Bolivia, they were a response to alleged electoral fraud undertaken by the government in favor of the president, who was seeking reelection. These protests also had in common a massive online presence and the reported involvement of international and regional actors that sought to influence their evolution. These include international news agencies such as RT en Español, funded in part by the Russian government, and TeleSUR and NTN24, funded in part by the Venezuelan government, which were more critical of local governments (except in Bolivia) and provided more favorable coverage of the protesters. In contrast, local news agencies tended to be more critical of the protesters and more favorable towards the governments [1].

We sought to explore the polarization observed throughout the Twitter discussions between users supporting or opposing the governments studied during the protests. The dataset consists of 100 million tweets from more than 15 million users, collected through Twitter’s API v1 using more than 500 hashtags and terms for the different countries. A special effort was made to collect conversations around antagonistic positions by including hashtags used by different groups (for and against the different governments). To identify a user’s stance towards each government, we developed a weak labeling methodology that requires minimal labeling effort and leverages users’ endorsement of politicians' tweets and hashtag campaigns with defined stances towards the protests (for or against). Relying not only on hashtags but also on the endorsement of political figures provides different levels of validation for assessing the robustness of the labels obtained. We believe that this methodology holds promise for the development of large-scale databases for the analysis of similar contentious events (with the active involvement of local political figures). The analysis focuses on two dimensions: polarization in language and polarization in news sharing patterns.
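
As a rough illustration of the weak labeling idea described above, the following Python sketch assigns a stance to a user from two kinds of signals: retweets of politicians with a known stance and use of hashtags with a known stance. It is not the authors' implementation; the seed accounts, hashtags, thresholds (`min_signals`, `min_share`) and the function name `label_user` are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical seed lists: a handful of manually labeled politician accounts
# and hashtag campaigns with a clear stance towards the government.
SEED_POLITICIANS = {
    "@minister_example": "pro_government",
    "@opposition_example": "anti_government",
}
SEED_HASHTAGS = {
    "#supportthegovernment": "pro_government",
    "#nationalstrike": "anti_government",
}


def label_user(retweeted_accounts, hashtags_used, min_signals=3, min_share=0.8):
    """Return a weak stance label for one user, or None if the evidence is thin.

    min_signals and min_share are assumed thresholds, not values from the study.
    """
    votes = Counter()
    for account in retweeted_accounts:
        stance = SEED_POLITICIANS.get(account.lower())
        if stance:
            votes[stance] += 1
    for tag in hashtags_used:
        stance = SEED_HASHTAGS.get(tag.lower())
        if stance:
            votes[stance] += 1

    total = sum(votes.values())
    if total < min_signals:
        return None  # too few stance-bearing signals to label this user
    stance, count = votes.most_common(1)[0]
    # Only label users whose signals point overwhelmingly one way.
    return stance if count / total >= min_share else None


print(label_user(["@opposition_example", "@opposition_example"], ["#NationalStrike"]))
```

Counting the politician and hashtag signals separately would also make it possible to check whether the two sources agree for a given user, in the spirit of the validation levels mentioned above.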

Language Polarization

We quantify linguistic polarization by training word embeddings on separate corpora built from the tweets of users of each stance. We then estimate a translation matrix between the two embeddings and explore systematic differences between a translated word and its closest neighbors in the target language (the other group's embedding space). We find a clear polarization in language, mainly manifested along ideological, political, or protest-related lines (e.g., socialism in one group is discussed in a similar context to fascism in the other, and police in the same way as vandals). Notable instances of the observed mistranslations for Ecuador are presented in Table 1 (we find the same patterns across all countries). We also show that this methodology can be used to mine knowledge, as we find that local political leaders from the protests in one country translate to their counterparts in another country.
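
To make the translation step concrete, here is a minimal Python sketch under simplifying assumptions: the two embedding spaces are tiny hand-made 2-D toys rather than trained embeddings, a linear map `W` is fitted by plain gradient descent on a few anchor words assumed to be used the same way by both groups, and neighbors are ranked by cosine similarity. The vectors, anchor words, learning rate and function names are invented for illustration and do not come from the study.

```python
import math

# Toy 2-D "embeddings" for the two groups; all numbers are made up.
GROUP_A = {"economy": [1.0, 0.0], "president": [0.0, 1.0],
           "protest": [1.0, 1.0], "police": [-1.0, 1.0]}
GROUP_B = {"economy": [0.0, 1.0], "president": [-1.0, 0.0],
           "protest": [-1.0, 1.0], "police": [1.0, -0.2],
           "vandals": [-0.9, -1.1]}
ANCHORS = ["economy", "president", "protest"]  # assumed stance-neutral words


def apply(W, x):
    """Multiply matrix W by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def fit_translation_matrix(pairs, dim, lr=0.1, steps=500):
    """Fit W minimizing sum ||W x - y||^2 over (x, y) pairs by gradient descent."""
    W = [[0.0] * dim for _ in range(dim)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for x, y in pairs:
            pred = apply(W, x)
            for i in range(dim):
                for j in range(dim):
                    grad[i][j] += 2.0 * (pred[i] - y[i]) * x[j]
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * grad[i][j]
    return W


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def translate(word, W, source, target):
    """Map a source-group word into the target space; return its nearest word."""
    projected = apply(W, source[word])
    return max(target, key=lambda t: cosine(projected, target[t]))


W = fit_translation_matrix([(GROUP_A[w], GROUP_B[w]) for w in ANCHORS], dim=2)
for word in ["economy", "police"]:
    print(word, "->", translate(word, W, GROUP_A, GROUP_B))
```

In this toy setup the anchor word `economy` maps to itself, while `police` lands closest to `vandals`, which mirrors the kind of mistranslation described above. With real data the embeddings would be trained on each group's tweets and the nearest neighbors inspected over the full vocabulary.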

Polarization in News Sharing Behavior

The current fragmented way of consuming news facilitated by social media not only helps users gain a quicker overview of current events but can also lead to confirmation bias. For this reason, it is important to explore how it may enable the manipulation of public opinion by actors with devious motives. By clustering news media based on the homogeneity of their user bases, we show that agencies cluster both geographically and ideologically. We also find strong evidence of polarization in news sharing and information diffusion by users, consistent with their stances towards the government: users tend to stay within the community of news media that shares information they are more likely to agree with. Moreover, we show the important role that regional Russian and Venezuelan news outlets such as RT en Español and TeleSUR played in the social media discussion of the protests throughout the region. This underscores how effective these outlets have been in gathering an audience of left-leaning users in the region, an initiative that has been identified by other studies of these news outlets.
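
One simple way to group outlets by the overlap of their audiences is sketched below in Python. The text above does not specify the clustering procedure, so this is only an assumed stand-in: it measures audience overlap with Jaccard similarity and merges outlets whose overlap reaches a threshold. The outlet names, user IDs and the `threshold` value are hypothetical.

```python
from itertools import combinations

# Hypothetical data: outlet -> set of users who shared its links.
SHARERS = {
    "outlet_a": {"u1", "u2", "u3"},
    "outlet_b": {"u2", "u3", "u4"},
    "outlet_c": {"u7", "u8"},
    "outlet_d": {"u7", "u8", "u9"},
}


def jaccard(a, b):
    """Share of users two outlets have in common (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_outlets(sharers, threshold=0.5):
    """Group outlets linked by audience overlap >= threshold (assumed cutoff)."""
    outlets = sorted(sharers)
    parent = {o: o for o in outlets}

    def find(o):
        # Follow parent links to the representative of o's group.
        while parent[o] != o:
            parent[o] = parent[parent[o]]
            o = parent[o]
        return o

    for a, b in combinations(outlets, 2):
        if jaccard(sharers[a], sharers[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for o in outlets:
        groups.setdefault(find(o), []).append(o)
    return sorted(groups.values())


print(cluster_outlets(SHARERS))
```

Comparing the resulting groups with the stance labels of the users in each audience would then indicate whether the clusters line up with positions towards the government.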

This work falls within the emerging field of social cybersecurity, which is concerned with the study of disinformation, hate speech and extremism online.

[1] Lara Jakes, “As Protests in South America Surged, So Did Russian Trolls on Twitter, U.S. Finds,” New York Times, January 19, 2020, accessed December 19, 2020,