Typographic Attacks in a Multi-Image Setting

Large Vision-Language Models (LVLMs) are susceptible to typographic attacks, which are misclassifications caused by an attack text that is added to an image. In this paper, we introduce a multi-image setting for studying typographic …

Fine-grained Fallacy Detection with Human Label Variation

We introduce FAINA, the first dataset for fallacy detection that embraces multiple plausible answers and natural disagreement. FAINA includes over 11K span-level annotations with overlaps across 20 fallacy types on social media posts …

LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?

Generative large language models (LLMs) are increasingly being used for data augmentation tasks, where text samples are LLM-paraphrased and then used for classifier fine-tuning. …

ModaFact dataset with Event Factuality and Modality in Italian

Authors: Rovera Marco, Cristoforetti Serena, Tonelli Sara

ModaFact is a textual dataset annotated with Event Factuality and Modality in Italian. ModaFact’s goal is to jointly model the factuality and modality values of event-denoting expressions in text. Original texts (sentences) …

ModaFact: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection

Authors: Rovera Marco, Cristoforetti Serena, Tonelli Sara

Factuality and modality are two crucial aspects concerning events, since they convey the speaker’s commitment to a situation in discourse as well as how this event is supposed to occur in terms of …

Authorship Obfuscation in Multilingual Machine-Generated Text Detection

Creators: Macko, Dominik; Moro, Robert; Uchendu, Adaku; Srba, Ivan; Lucas, Jason Samuel; Yamashita, Michiharu; Tripto, Nafis Irtiza; Simko, Jakub; Bielikova, Maria

Description: The high-quality text generation capability of the latest Large Language Models (LLMs) causes concerns about their misuse (e.g., in …

A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

Description: In the realm of text manipulation and linguistic transformation, the question of authorship has been a subject of fascination and philosophical inquiry. Much like the Ship of Theseus paradox, which ponders whether a ship remains the same when …

KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection

Description: SemEval-2024 Task 8 is focused on multigenerator, multidomain, and multilingual black-box machine-generated text detection. Such detection is important for preventing potential misuse of large language models (LLMs), the newest of which are very capable in generating …

SIDBench: A Python framework for reliably assessing synthetic image detection methods

Description: Generative AI technology offers an increasing variety of tools for generating entirely synthetic images that are increasingly indistinguishable from real ones. Unlike methods that alter portions of an image, the creation of completely synthetic images presents a …

Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation

Description: The latest generative large language models (LLMs) have found application in data augmentation tasks, where small numbers of text samples are LLM-paraphrased and then used to fine-tune downstream models. However, more research is needed to assess how …