
May 04, 2022

Hoaxes and Hidden agendas: A Twitter Conspiracy Theory Dataset

By Samantha Phillips

Tags: Conspiracy Theory; Dataset; Twitter; Stance Detection; Classification; Natural Language Processing

Forthcoming Publication:

Samantha C. Phillips, Lynnette Hui Xian Ng, and Kathleen M. Carley. 2022.  Hoaxes and Hidden agendas: A Twitter Conspiracy Theory Dataset: Data Paper.  In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion), April 25-29, 2022, Virtual Event, Lyon, France.  ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3487553.3524665

Think back to the conflicting information you've seen or heard about COVID-19 origins, vaccines, mitigation strategies, and more over the last few years. Spread maliciously or not, conspiracy theories thrive in times of uncertainty and crisis, and the COVID-19 pandemic has provided a ripe breeding ground for such theories to spread and evolve. If individuals believe COVID-19 is a hoax or hidden agenda, they may be less likely to abide by precautionary measures like wearing a mask in public spaces or social distancing.

Why do we care about conspiracy theories on Twitter?

Conspiracy theories are unsubstantiated narratives designed to explain significant social or political events as secret plots by malicious and powerful actors. While many of these theories are ultimately innocuous, others have the potential to do real harm by instigating real-world action in support of or opposition to the theories. This is further fueled by social media, which provides a platform for conspiracy theories to spread at unprecedented rates. One study found that over one-quarter of the most viewed YouTube videos about COVID-19 contained misleading information.

A well-known example of the potentially harmful offline effects of online conspiratorial discourse is the Pizzagate conspiracy theory, which went viral during the 2016 United States presidential election and alleged a child sex ring in the basement of a pizza shop. The theory resulted in harassment of the shop's owner and employees, and in shots being fired in the restaurant by an individual attempting to "save the children". Another example comes from a recent paper investigating the association between belief in 5G COVID-19 conspiracy theories and willingness to participate in violence. The authors surveyed over 600 participants on conspiracy mentality, belief in COVID-19 conspiracy theories, current state of anger, reactions to real-life violence, and general justification of and willingness to use violence. Through this survey, they conclude that there is a positive association between belief in 5G COVID-19 conspiracy theories, as well as general conspiratorial thinking, and willingness to engage in violence.

Given the propensity of conspiracy theories to contain misinformation and to contribute to offline consequences, it is critically important to identify conspiracy theories so that governmental entities, journalists, and social media companies can mitigate the spread of these claims. Another vital research area is the development of models that classify the stance of social media texts as supportive of, against, or neutral towards a conspiracy theory, allowing further exploration of the dynamics of users engaged with conspiracy theories online. By publishing a dataset of annotated conspiracy theories, we hope to contribute to these efforts and improve the performance of methods for detecting conspiracy theories at scale.

Dataset Description

We collect and analyze Twitter data surrounding four conspiracy theory topics: climate change, COVID-19 origins, COVID-19 vaccines, and Jeffrey Epstein/Ghislaine Maxwell. We manually annotate and publicly share a conspiracy theory dataset of 3,100 tweets across these domains. Additionally, we fine-tune text classification models to detect (1) whether a tweet contains a conspiracy theory, (2) the stance of the tweet towards the theory, and (3) the topic of the tweet.
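To make the annotation scheme concrete, here is a minimal sketch of loading and inspecting such a dataset with pandas. The file name and column names ("text", "conspiracy", "stance", "topic") are illustrative assumptions for this sketch, not the released data's exact schema; consult the published dataset for the actual format.

```python
# Minimal sketch of loading the annotated dataset. The file name and the
# column names below are assumptions for illustration only.
import pandas as pd

df = pd.read_csv("conspiracy_tweets.csv")  # hypothetical file name

# Each row pairs a tweet with the three annotations used in the experiments:
# conspiracy (yes/no), stance (support/neutral/against), and topic.
print(df[["text", "conspiracy", "stance", "topic"]].head())

# Label distributions, e.g. the support/against imbalance discussed later.
print(df["conspiracy"].value_counts())
print(df["stance"].value_counts(normalize=True))
print(df["topic"].value_counts())
```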

Experiments Performed

We analyze the dataset through the following three experiments, each training language models on a different aspect of the annotations, to demonstrate how the dataset can be used.

  • Conspiracy-detection: Given a tweet, classify whether it contains a conspiracy theory or not.
  • Stance-detection: Given a tweet that contains a conspiracy theory, classify its stance towards the conspiracy theory, i.e. {support, neutral, against}.
  • Topic-detection: Given a tweet that contains a conspiracy theory, classify which of the four conspiracy theory topics it addresses.

For each experiment, we built five classifiers: a majority classifier as a baseline, which assigns the majority class to every instance, and four neural network classifiers: BERT, ALBERT, RoBERTa, and DistilBERT. We report performance with the macro-F1 metric to adjust for the imbalance in the proportion of class labels in the dataset.
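For concreteness, below is a hedged sketch of one such experiment (conspiracy-detection) using the Hugging Face transformers library: a majority-class baseline plus a fine-tuned pretrained encoder, both scored with macro-F1. The train/test split, hyperparameters, and label encoding are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the conspiracy-detection experiment under assumed settings.
import numpy as np
import pandas as pd
import torch
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

df = pd.read_csv("conspiracy_tweets.csv")  # assumed file, as in the sketch above
texts = df["text"].tolist()
labels = df["conspiracy"].astype("category").cat.codes.to_numpy(np.int64)
X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42)

# Baseline: always predict the majority class from the training split.
majority = np.bincount(y_tr).argmax()
baseline = np.full_like(y_te, majority)
print("majority macro-F1:", f1_score(y_te, baseline, average="macro"))

# Fine-tune a pretrained encoder; swap in "bert-base-uncased",
# "albert-base-v2", or "distilbert-base-uncased" for the four-way comparison.
name = "roberta-base"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=len(set(labels)))

class TweetDataset(torch.utils.data.Dataset):
    """Tokenized tweets paired with integer labels."""
    def __init__(self, texts, labels):
        self.enc = tok(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=TweetDataset(X_tr, y_tr),
)
trainer.train()

preds = trainer.predict(TweetDataset(X_te, y_te)).predictions.argmax(axis=-1)
print("fine-tuned macro-F1:", f1_score(y_te, preds, average="macro"))
```

Changing only the model name reproduces the four-model comparison; the same pipeline applies to the stance-detection and topic-detection tasks by substituting the label column.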

How accurately can we predict the presence, stance towards, and topic of a conspiracy theory?

For all three experiments, the language classifiers outperform the baseline majority-class classifier, highlighting the value of contextualized tweet embeddings in helping a classification model differentiate labels. The best model is RoBERTa for the conspiracy-detection task (macro-F1 = 0.813), BERT for stance-detection (macro-F1 = 0.722), and RoBERTa for topic-detection (macro-F1 = 0.944). However, we note that DistilBERT, a distilled, smaller version of BERT, took at most half the time of BERT or RoBERTa, with a macro-F1 drop of at most 0.083 relative to the best-performing model of each experiment. These results indicate that the additional parameters and time required by more complex text analysis models like BERT and RoBERTa may not be necessary for this dataset, depending on the performance desired.

The ratio of conspiracy supporters to deniers in the dataset is roughly 4:1. In this sample of tweets, users were more outspoken in promoting these theories than in shutting them down. We posit that users with extreme opinions are typically more vocal on social media, suggesting caution when extrapolating findings. Moreover, homophily in social media interactions has been well studied: there is evidence that people tend to interact with other users similar to them in beliefs or other attributes.

Users who do not believe in conspiracy theories may not seek out or interact with users spreading them. A future research direction is to use these models to detect stance towards conspiracy theories at a much larger scale, which could allow researchers to determine whether supporters or deniers of conspiracy theories discuss them more online.

This study demonstrates that identifying and classifying conspiracy theories within tweets is possible, even across domains that range from climate change to public health conspiracy theories.  We hope to contribute to the development of tools for identifying conspiracy theories before they become widespread, enabling effective public messaging and mitigation.