One of the main challenges of NLP is how to deal with subjectivity. From the process of labeling datasets, challenges such as deciding a gold standard from the labels of multiple annotators already arise. Commonly, the final label is decided by some majority vote style consensus. However, with this approach, divergent information (those annotators with different opinions) is lost from the design of the dataset and cannot be used for training. We hypothesize that this information is valuable, and study how to exploit it using federated learning.
Invited Speaker: Nuria Rodríguez Barroso