
August 17, 2022

Disruptive Trolls and How to Find Them: Introducing TrollHunter

By Joshua Uyheng & JD Moffitt

Direct link to paper, published July 08, 2022: https://doi.org/10.1016/j.ipm.2022.103012

tags: trolls; bots; disinformation; social media; psycholinguistics

Image Credit: J.D. Moffitt, image generated through OpenAI Full DALL-E API

If you’ve been on social media over the last decade, chances are you’ve heard of trolls.

Maybe a celebrity or political leader you follow has gotten trolled by people who actively dislike them. Or maybe someone you know personally has been on the receiving end of trolling for a headstrong opinion they expressed on their own account.

In its basic form, trolling has been considered a messy facet of modern online culture. Defined as online interaction intended to harass or disrupt conversation, trolling has evolved from meme communities and specialized forums into common parlance among much of the general, internet-savvy public.

But the problem is that trolling hasn’t been confined to run-of-the-mill interactions or pop culture ridicule. As recent research demonstrates, trolling has been weaponized in high-stakes settings like elections and international conflict. In this context, there’s an urgent need for both a strong social scientific understanding of online trolling and scalable tools to rapidly detect it.

We decided to confront these challenges in a recent paper published in the journal Information Processing & Management. We found that online trolling tends to follow a distinct linguistic signature, which allowed us to build a machine learning tool called TrollHunter to detect it automatically. We then applied that tool to further understand the dynamics of trolling in the wild.

Trolling’s Linguistic Signature

Though the notion of trolling seems to be well-known, it’s often conflated with other concepts or treated as though “you’ll know it when you see it.” Hence, even before building a tool to detect trolling, we sought to study its properties systematically and empirically distinguish it from related forms of online harm.

Using the Netmapper software, we characterized and statistically analyzed trolling using previously validated psycholinguistic measures. Overall, we found that trolling is defined by its tendency to use simpler language, abusive terms, and speech targeted at various named entities. While this squares with colloquial understandings of trolling, it also formalizes how trolling can be quantified using measures grounded in social scientific theory.
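To make that concrete, here is a minimal, hypothetical sketch of what this kind of psycholinguistic feature extraction could look like in Python. It is not Netmapper's implementation; the abusive-term lexicon, the feature names, and the use of @-mentions as a crude proxy for named-entity targeting are all illustrative placeholders.

```python
# Illustrative sketch only: Netmapper computes validated psycholinguistic
# cues; this toy version approximates three of the signals discussed above.
import re

ABUSIVE_TERMS = {"idiot", "moron", "trash", "pathetic"}  # placeholder lexicon

def troll_features(text: str) -> dict:
    tokens = re.findall(r"[A-Za-z@#']+", text.lower())
    n = max(len(tokens), 1)
    return {
        # Simpler language: shorter words on average.
        "avg_word_length": sum(len(t) for t in tokens) / n,
        # Abusive terms: fraction of tokens found in an insult lexicon.
        "abusive_ratio": sum(t in ABUSIVE_TERMS for t in tokens) / n,
        # Targeting named entities: crude proxy via @-mentions.
        "mention_count": sum(t.startswith("@") for t in tokens),
    }

print(troll_features("@newsanchor you are a pathetic idiot, delete this"))
```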

In contrast, trolling was found to be independent of an account's number of followers, its verified status, and the likelihood that it was a bot. This indicated that trolling could come from a variety of sources, including official and automated accounts.

Who Trolls, Gets Trolled, and Why

These signature features of troll-like language were so distinct that TrollHunter, a machine learning model trained on these properties, achieved 89% accuracy at automatically detecting trolling.
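For readers curious what training a model on such properties can look like in practice, below is a generic sketch using scikit-learn with synthetic feature vectors and labels. The actual TrollHunter features, algorithm, training data, and the 89% figure are reported in the paper linked above; nothing here reproduces them.

```python
# Hypothetical sketch: fit a classifier on psycholinguistic feature vectors.
# The data and labels are synthetic; this is not the TrollHunter model itself.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                    # e.g. word length, abuse ratio, mentions
y = (X[:, 1] + 0.5 * X[:, 2] > 0.5).astype(int)   # toy "troll" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```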

Using the TrollHunter model, we discovered that Chinese state-sponsored accounts tended to engage in more trolling behavior than Russian state-sponsored accounts on average. This made sense given prior studies which had indicated that while Chinese influence campaigns may seek to harass opponents to defend Beijing’s international reputation, Russian influence campaigns may instead wish to infiltrate existing organic groups in the West. Whereas the former may benefit from higher levels of trolling, the latter may require more stealthy behaviors for success.

Furthermore, we found that the Twitter accounts of various American news agencies attracted different levels of troll interaction. Interestingly, these levels largely depended on whether an outlet's audience leaned right or left. When we examined bot activity as a comparison, bots were generally more prevalent among news sources trusted by the right, whereas trolls sent more replies to news sources trusted by both the left and the right.

This observation indicated that trolls seeking to stir up conflict could potentially do so more reliably in settings with more diverse audiences. Bots, on the other hand, could perhaps be used to amplify messages among those who already shared relatively homogeneous opinions.

Responding to the Problem of Trolling

Through this study, we took a systematic, social scientific look at the phenomenon of online trolling. We built the TrollHunter tool based on our insights, and used it to answer downstream questions about who engages in trolling and to what ends.

This work also highlights the need for caution in how the vocabulary of bots, trolls, and state-sponsored accounts gets mixed up. Even though all three are used in information operations, these terms refer to different types of agents that may have different impacts on the health of our online ecosystems.

We thus demonstrate how TrollHunter, and the broader pipeline of social cybersecurity methodologies it belongs to, helps us better understand disruptive and manipulative online behaviors. By clarifying these otherwise taken-for-granted understandings of online harms, such insights may help us devise more precise and effective measures to counter them.