Bots into the Fediverse dataset

This dataset contains anonymized features for bot detection on Mastodon (Fediverse). It was created for the accompanying paper and consists of accounts labeled as bot or non-bot, collected from publicly accessible content via the Mastodon Application …

Multilingual vs Crosslingual Retrieval of Fact-Checked Claims: A Tale of Two Approaches

Retrieval of previously fact-checked claims is a well-established task, whose automation can assist professional fact-checkers in the initial steps of information verification. Previous works have mostly tackled the …

Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance

When solving NLP tasks with limited labelled data, researchers typically either use a general large language model without further update, or use …

Face the Facts! Evaluating RAG-based Pipelines for Professional Fact-Checking

Natural Language Processing and Generation systems have recently shown the potential to complement and streamline the costly and time-consuming job of professional fact-checkers. In this work, we lift several constraints of …

A Survey on Automatic Credibility Assessment Using Textual Credibility Signals in the Era of Large Language Models

In the age of social media and generative AI, the ability to automatically assess the credibility of online content has become increasingly critical, …

MuLTa-Telegram: A Fine-Grained Italian and Polish Dataset for Hate Speech and Target Detection

This paper introduces the MuLTa-Telegram dataset, a Multi-Lingual and multi-Target dataset specifically developed to detect hate speech on Telegram, an understudied yet influential platform in which …

Generative AI and the Threat to Thinking

Information security is concerned with maintaining the integrity of the information ecosystem. The proliferation of content created using generative artificial intelligence can overwhelm the ability of people to process information. Consideration of a …

Autonomation, Not Automation: Activities and Needs of European Fact-checkers as a Basis for Designing Human-Centered AI Systems

To mitigate the negative effects of false information more effectively, the development of Artificial Intelligence (AI) systems to assist fact-checkers is needed. Nevertheless, …

MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts

Recent LLMs are able to generate high-quality multilingual texts that humans cannot distinguish from authentic human-written ones. Research in machine-generated text detection is, however, mostly focused on the English language and …

EuroVerdict: A multilingual dataset for verdict generation against misinformation

Misinformation is a global issue that shapes public discourse, influencing opinions and decision-making across various domains. While automated fact-checking (AFC) has become essential in combating misinformation, most work in multilingual settings …