Definition Extraction Feature Analysis: From Canonical to Naturally-Occurring Definitions

Textual definitions constitute a fundamental source of knowledge when seeking the meaning of words, and they are the cornerstone of …

Mireia Roig Mirapeix, Luis Espinosa-Anke, Jose Camacho-Collados

Don't Patronize Me! An Annotated Dataset with Patronizing and Condescending Language towards Vulnerable Communities

In this paper, we introduce a new annotated dataset which is aimed at supporting the development of NLP models to identify and …

Carla Pérez-Almendros, Luis Espinosa-Anke, Steven Schockaert

Embeddings in Natural Language Processing

Embeddings have been one of the most important topics of interest in NLP for the past decade. Representing knowledge through a …

Jose Camacho-Collados, Mohammad Taher Pilehvar

Go Simple and Pre-Train on Domain-Specific Corpora: On the Role of Training Data for Text Classification

Pre-trained language models provide the foundations for state-of-the-art performance across a wide range of natural language processing …

Aleks Edwards, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

Towards Preemptive Detection of Depression and Anxiety in Twitter

Depression and anxiety are psychiatric disorders that are observed in many areas of everyday life. For example, these disorders …

David Owen, Jose Camacho-Collados, Luis Espinosa-Anke

Combining BERT with Static Word Embeddings for Categorizing Social Media

Pre-trained neural language models (LMs) have achieved impressive results in various natural language processing tasks, across …

Israa Alghanmi, Luis Espinosa-Anke, Steven Schockaert

Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation

State-of-the-art methods for Word Sense Disambiguation (WSD) combine two different features: the power of pre-trained language models …

Daniel Loureiro, Jose Camacho-Collados

TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification

The experimental landscape in natural language processing for social media is too fragmented. Each year, new shared tasks and datasets …

Francesco Barbieri, Jose Camacho-Collados, Luis Espinosa-Anke, Leonardo Neves

Understanding the Source of Semantic Regularities in Word Embeddings

Semantic relations are core to how humans understand and express concepts in the real world using language. Recently, there has been a …

Hsiao-Yu Chiang, Jose Camacho-Collados, Zachary Pardos

XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization

The ability to correctly model distinct meanings of a word is crucial for the effectiveness of semantic representation techniques. …

Alessandro Raganato, Tommaso Pasini, Jose Camacho-Collados, Mohammad Taher Pilehvar