The task of coreference resolution is to identify all mentions in a text that refer to the same real-world entity. Consider the following example -
Barack Obama nominated Hillary Rodham Clinton as his secretary of state on Monday. He chose her because she had foreign affairs experience as former First Lady.
Barack Obama, he and his all refer to the entity Obama.
Hillary Rodham Clinton, secretary of state, her, she and First Lady all refer to Hillary Clinton.
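The expected output can be represented as clusters of mentions, one cluster per entity. A minimal sketch of that representation (the structure and names here are my own, not from any particular library; real systems use token spans rather than strings):

```python
# Toy representation of coreference output for the example sentence:
# each cluster is a list of mention strings referring to one entity.
clusters = [
    ["Barack Obama", "his", "He"],
    ["Hillary Rodham Clinton", "secretary of state", "her", "she", "First Lady"],
]

# Map each mention back to the index of its entity cluster.
entity_of = {mention: i for i, cluster in enumerate(clusters) for mention in cluster}
print(entity_of["He"])   # 0 (Obama)
print(entity_of["she"])  # 1 (Clinton)
```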
OntoNotes 5.0 is the largest coreference dataset available - it has around 3,000 human-annotated documents. You can download the dataset from LDC here. CoNLL has annotated the dataset, and you can find the annotated dataset and scripts to merge them here.
Metrics & SotA
- For each mention, compute a precision and a recall.
- Average the individual precisions and recalls.
The F1 score over these averaged values is usually reported.
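As a concrete (and simplified) illustration, here is link-level precision, recall and F1 over predicted vs. gold coreference links. This is a sketch for intuition only - the official evaluation uses the MUC/B-cubed/CEAF family of metrics:

```python
def prf1(predicted_links, gold_links):
    """Link-level precision, recall and F1 (simplified; real coref
    evaluation uses the MUC / B-cubed / CEAF metrics)."""
    predicted, gold = set(predicted_links), set(gold_links)
    tp = len(predicted & gold)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# ("He", "Obama") means the system linked mention "He" to "Obama".
pred = [("He", "Obama"), ("her", "Clinton"), ("she", "Obama")]
gold = [("He", "Obama"), ("her", "Clinton"), ("she", "Clinton")]
p, r, f1 = prf1(pred, gold)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.67 0.67 0.67
```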
|Model|F1|
|---|---|
|Wiseman et al. (2015)|63.3|
|Clark & Manning (2016)|65.4|
|Lee et al. (2017)|67.2|
 - 2018 Stanford NLP Lecture - Slides
 - HuggingFace Demo and Github repository - Github
 - Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution - Paper
 - Improving Coreference Resolution by Learning Entity-Level Distributed Representations - Paper
If you think this is an easy task, consider the following 2 examples -
She poured water from the pitcher into the cup until it was full/empty.
The trophy would not fit in the suitcase because it was too big/small.
In each of the above 2 examples, the coreference changes based on a single word. Resolving it also requires a lot of common-sense knowledge that is not written down anywhere. Such sentences are called Winograd Schemas, and the Winograd Schema Challenge was recently proposed as an alternative to the Turing Test.
- Document Understanding
- Machine Translation
- Dialog Systems
Anaphora is a kind of reference in which one term in the document (the anaphor) refers to another term (the antecedent).
Barack Obama said he would sign the bill.
We went to see a concert last night. The tickets were really expensive.
Coreference resolution can be broadly divided into 2 steps -
- Detect the mentions
- can be nested
- Cluster the mentions
- multiple ways (see below)
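The two steps above can be sketched as a toy pipeline (function names and heuristics are placeholders of my own, not a real system):

```python
def detect_mentions(tokens):
    """Step 1: toy mention detector - flags pronouns and capitalized
    tokens as mentions (real systems use NER and a constituency parser,
    and mentions can be nested spans)."""
    pronouns = {"he", "she", "his", "her", "it", "they", "them"}
    return [t for t in tokens if t.lower() in pronouns or t[0].isupper()]

def cluster_mentions(mentions):
    """Step 2: toy clustering - groups exact-string matches
    (real systems use one of the models described below)."""
    clusters = {}
    for m in mentions:
        clusters.setdefault(m.lower(), []).append(m)
    return list(clusters.values())

tokens = "Obama said he would sign the bill because he supported it".split()
print(cluster_mentions(detect_mentions(tokens)))
# [['Obama'], ['he', 'he'], ['it']]
```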
Mention is a span of text referring to some entity. There are 3 kinds of mentions, each typically detected with a different tool -
- Pronouns - detected with a Part-of-Speech Tagger
- Named Entities - detected with a Named Entity Recognizer
- Noun Phrases - detected with a Constituency Parser
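Given Penn-Treebank-style POS tags, pronoun and proper-noun mentions can be read off almost directly; full noun phrases would need a parser. A toy sketch (the tags are hand-supplied here rather than produced by a real tagger):

```python
def extract_simple_mentions(tagged):
    """Pronouns come from POS tags (PRP / PRP$); contiguous NNP runs
    approximate named entities; full noun phrases would need a
    constituency parser, which this sketch omits."""
    mentions, i = [], 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag in ("PRP", "PRP$"):
            mentions.append(word)
            i += 1
        elif tag == "NNP":  # collect a run of proper nouns
            j = i
            while j < len(tagged) and tagged[j][1] == "NNP":
                j += 1
            mentions.append(" ".join(w for w, _ in tagged[i:j]))
            i = j
        else:
            i += 1
    return mentions

tagged = [("Barack", "NNP"), ("Obama", "NNP"), ("nominated", "VBD"),
          ("Hillary", "NNP"), ("Clinton", "NNP"), ("as", "IN"),
          ("his", "PRP$"), ("secretary", "NN")]
print(extract_simple_mentions(tagged))  # ['Barack Obama', 'Hillary Clinton', 'his']
```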
Mention Clustering Models
There are 3 kinds of coref models -
- Mention Pair
- Mention Ranking
- Mention Clustering
In the Mention Pair model, we train a binary classifier that predicts if a pair of mentions are coreferent.
At train time, we minimize a standard cross-entropy loss -
- Assume we have N mentions in the document
- y_ij = 1 if mentions m_i and m_j are coreferent, -1 otherwise
- i iterates through the mentions
- j iterates through the candidate antecedents (mentions appearing before m_i)
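In symbols, the loss described by the bullets above can be written as follows (reconstructed here to match the formulation in the Stanford lecture slides referenced above, with p(m_j, m_i) the predicted probability that the pair is coreferent):

```latex
J = -\sum_{i=2}^{N} \sum_{j=1}^{i-1} y_{ij} \, \log p(m_j, m_i)
```

With y_ij = 1 the term rewards a high p; with y_ij = -1 the sign flips and a high p is penalized.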
At test time, we pick some threshold (say 0.5), add coreference links between all mention pairs scored above it, and take the transitive closure of those links to form clusters.
- Clusters could easily ball up and all form 1 big cluster - a single wrong link merges two clusters.
- Most mentions have only 1 antecedent, but we predict links to all of them.
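The "one big cluster" failure mode falls directly out of taking the transitive closure of positive links. A union-find sketch of that test-time clustering (mention names and scores are illustrative):

```python
def cluster_links(mentions, positive_pairs):
    """Test-time clustering for the mention-pair model: every pair the
    classifier scores above threshold gets a link, and clusters are the
    transitive closure of those links (computed with union-find)."""
    parent = {m: m for m in mentions}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b in positive_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())

mentions = ["Obama", "he", "Clinton", "she"]
# One spurious positive link ("he", "she") merges both entities
# into a single big cluster - the weakness noted above.
print(cluster_links(mentions, [("Obama", "he"), ("Clinton", "she"), ("he", "she")]))
# [['Obama', 'he', 'Clinton', 'she']]
```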
In the Mention Ranking model, we assign each mention to its highest-scoring candidate antecedent. We also have a dummy NA antecedent that allows the model to decline linking the current mention to anything (e.g., for the first mention of an entity).
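A sketch of the ranking decision, with the dummy NA antecedent given a fixed score of 0 (the candidate names and scores below are hand-picked for illustration, not from a trained model):

```python
def best_antecedent(scores, na_score=0.0):
    """Mention-ranking decision: pick the highest-scoring candidate
    antecedent, or NA (decline to link) if nothing beats the dummy score."""
    best = max(scores, key=scores.get) if scores else None
    if best is None or scores[best] <= na_score:
        return None  # NA: this mention starts a new entity
    return best

# Candidate antecedents for the mention "she", with made-up scores.
print(best_antecedent({"Obama": -1.2, "Clinton": 2.3, "trophy": -0.5}))  # Clinton
print(best_antecedent({"the bill": -0.8}))  # None - no antecedent, new entity
```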
- Non-Neural Statistical Classifier
- Feed-Forward Neural Network
  - input features: word embeddings, distance, document genre, speaker information
- LSTMs and Attention
HuggingFace Coref Implementation
The HuggingFace Coref Implementation is very similar to this paper by Clark and Manning.