Masked language models (MLMs) trained on massive amounts of textual data have been found to encode concerning levels of social biases. While numerous prior works have evaluated the social biases in word embeddings generated by pretrained MLMs, the biases in sense embeddings remain relatively understudied. Moreover, multiple underlying factors are associated with an MLM, such as its model size, the size of the training data, its training objectives, the domain from which the pretraining data is sampled, its tokenization, and the languages present in the pretraining corpora. It remains unclear which of these factors influence the social biases learnt by MLMs. In this talk, I will show that sense embeddings are also biased and describe a measure for evaluating social bias in sense embeddings. I will then present a factor analysis of social bias in MLMs.
Invited Speaker: Yi Zhou (Cardiff University - COMSC)