Definition Extraction Feature Analysis: From Canonical to Naturally-Occurring Definitions

Abstract

Textual definitions constitute a fundamental source of knowledge when seeking the meaning of words, and they are the cornerstone of lexical resources such as glossaries, dictionaries, encyclopedias and thesauri. In this paper, we present an in-depth analytical study of the main features relevant to the task of definition extraction. Our main goal is to study whether linguistic structures from canonical definitions (the Aristotelian or genus et differentia model) can be leveraged to retrieve definitions from corpora across different domains of knowledge and textual genres. To this end, we develop a simple linear classifier and analyze the contribution of several (sets of) linguistic features. Finally, as a result of our experiments, we also shed light on the particularities of existing benchmarks and on the most challenging aspects of the task.

Publication
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon
Luis Espinosa-Anke
Senior Lecturer
Jose Camacho-Collados
Professor & UKRI Future Leaders Fellow