With the proliferation of misinformation on the web, automatic methods for detecting misinformation are becoming an increasingly important subject of study. Before automatic misinformation detection is applied in a real-world setting, the methods being used must be validated. Large language models (LLMs) have produced the best results among text-based methods. However, fine-tuning such a model requires a significant amount of training data, which has led to the automatic creation of large-scale misinformation detection datasets. In this paper, we explore the biases present in one such dataset for misinformation detection in English, NELA-GT-2019. We find that models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. Furthermore, we use SHAP to interpret the outputs of a fine-tuned LLM and validate the explanation method using an inherently interpretable logistic regression baseline. We critically analyze the suitability of SHAP for text applications by comparing the outputs of SHAP to the most important features from our logistic regression models.
|Title of host publication||Artificial Intelligence and Machine Learning|
|Subtitle of host publication||34th Joint Benelux Conference, BNAIC/Benelearn 2022, Mechelen, Belgium, November 7–9, 2022, Revised Selected Papers|
|Editors||Toon Calders, Celine Vens, Jefrey Lijffijt, Bart Goethals|
|Place of Publication||Cham|
|Publication status||Published - 2023|
|Series name||Communications in Computer and Information Science|