Seminar: "Prudent NLG Evaluation with Humans"

Abstract

Every year, research teams spend large amounts of money evaluating the quality of NLG systems (WMT for machine translation, inter alia). We'll first look at how to speed up and improve the quality of annotators' work by pre-filling annotations with automatic quality estimation (ESA, ESAᴬᴵ). In the second part, we'll take the automation a step further and try to determine which segments do not need to be evaluated at all. For this, we borrow methods from psychometrics, originally developed for constructing efficient yet informative test sets for human students. In our case, the students to be tested are NLG systems.
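The psychometric idea of picking only informative test items can be sketched with a toy item-response-theory example. The sketch below uses a Rasch model, where the Fisher information of an item peaks when its difficulty matches the test-taker's ability; all function names, abilities, and difficulties here are illustrative assumptions, not details from the talk.

```python
import math

def rasch_p(theta, b):
    """Probability that a system with ability theta handles an item of difficulty b (Rasch model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def select_items(difficulties, theta, k):
    """Keep the k most informative items: Fisher information p(1-p) peaks where b is close to theta."""
    scored = [(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)), i)
              for i, b in enumerate(difficulties)]
    top = sorted(scored, reverse=True)[:k]
    return sorted(i for _, i in top)

# Items that are far too easy or far too hard tell us little and can be skipped.
difficulties = [-2.0, -0.5, 0.0, 0.4, 3.0]
print(select_items(difficulties, theta=0.0, k=2))  # → [2, 3]
```

Items 0 and 4 (very easy, very hard) are dropped because nearly every system, strong or weak, would score the same on them; the retained items discriminate best among systems near the assumed ability level.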

Date
Jan 16, 2025, 13:00–14:00
Location
Abacws

Invited Speaker: Vilém Zouhar (ETH Zürich, Switzerland)

Bio: Vilém is a PhD student at ETH Zürich working on both human and automatic evaluation of MT/NLG systems, balancing costs, quality, and bias.

Fernando Alva-Manchego
Lecturer

My research interests include text adaptation, evaluation of natural language generation, and NLP for education.