Dialog State Tracking: A Neural Reading Comprehension Approach

Aug 19, 2019 Dialog

Overview

The goal of dialog state tracking (DST) is to estimate the current belief state of a dialog given all of the preceding conversation.
The dialog state at each turn is defined as a distribution over a set of predefined variables.
Most DST problems assume a fixed ontology.
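Key contributions of the paper:
- Formulate DST as a reading comprehension problem.
- Break the prediction for each slot into 3 decisions:
  - slot carryover decision: whether to keep the slot value from the previous turn
  - slot type decision: which of {Yes, No, DontCare, Span} the slot belongs to
  - slot span decision: which span of the dialog holds the value, found using attention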

Architecture

Dialog

A sub-dialog \(D_t\) of a dialog \(D\) is the part of the conversation up to turn \(t\). The dialog state at that turn is defined by the values of its \(M\) constituent slots: \(S_t = \{s_1(t), s_2(t), \ldots, s_M(t)\}\).

Encoding

Concatenate the user utterances and agent utterances into one sequence: \(\{u_1, a_1, u_2, a_2, \ldots, u_t\}\).
To differentiate the speakers, add [U] before each user utterance and [A] before each agent utterance.
Use BERT to form a contextual embedding \(p_i\) for each token in the dialog sequence and pass these as input to an RNN: \(\{d_1, d_2, \ldots, d_L\} = \mathrm{RNN}(p_1, p_2, \ldots, p_L)\).
The dialog embedding at turn \(t\) is \(e_t = (d_1; d_L)\), the concatenation of the first and last RNN outputs.
To encode the question, we ask "what is the value of slot i?" for each of the \(M\) slots, which gives \(M\) questions, each encoded as a question vector \(q_i\).
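The input preparation above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: whitespace splitting stands in for BERT's tokenizer, the question wording is a guess at a template, and the helper names (`flatten_dialog`, `build_questions`) and slot names (`hotel-area`, `hotel-parking`) are hypothetical rather than taken from the paper.

```python
from typing import List, Tuple

USER_TAG = "[U]"   # marker placed before each user utterance
AGENT_TAG = "[A]"  # marker placed before each agent utterance


def flatten_dialog(turns: List[Tuple[str, str]]) -> List[str]:
    """Concatenate (speaker, utterance) turns into one tagged token sequence."""
    tokens: List[str] = []
    for speaker, utterance in turns:
        if speaker == "user":
            tokens.append(USER_TAG)
        elif speaker == "agent":
            tokens.append(AGENT_TAG)
        else:
            raise ValueError(f"unknown speaker: {speaker!r}")
        # Assumption: whitespace splitting replaces a real subword tokenizer.
        tokens.extend(utterance.split())
    return tokens


def build_questions(slot_names: List[str]) -> List[str]:
    """One question per slot, so M slots give M questions."""
    # Assumption: this wording is illustrative, not the paper's template.
    return [f"what is the value of {name} ?" for name in slot_names]


turns = [
    ("user", "i need a hotel"),
    ("agent", "which area ?"),
    ("user", "the north"),
]
dialog_tokens = flatten_dialog(turns)
questions = build_questions(["hotel-area", "hotel-parking"])
```

In a full system the tagged tokens would go through BERT and the RNN to produce \(d_1, \ldots, d_L\); that part is left out here on purpose.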

Models

Slot Carryover Model - A model that decides whether to carry over the slot value from the previous turn. The carryover probability for slot \(i\) is \(\mathrm{sigmoid}(e_t \cdot W_i)\), where there is one weight vector \(W_i\) per slot; the carryover decisions for all slots are predicted together.
Slot Type Model - For every slot, predict which of {Yes, No, DontCare, Span} it belongs to. We concatenate the dialog representation with the question embedding and pass it through a feed-forward layer followed by a softmax.
Slot Span Model - The question vector \(q_i\) acts as the question and the dialog encoding \(\{d_1, d_2, \ldots, d_L\}\) acts as the context. The model picks the span within the context that contains the slot value.
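To make the three models concrete, here is a toy pure-Python sketch of their output layers. It rests on several assumptions: the vectors are tiny hand-written lists instead of learned representations, the slot type model uses a single linear layer (`V`, a hypothetical name), and the span model is reduced to choosing the best (start, end) pair from per-token start and end scores. That scoring scheme is a common reading-comprehension convention, not something this post spells out.

```python
import math
from typing import List, Tuple

SLOT_TYPES = ["Yes", "No", "DontCare", "Span"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def carryover_probs(e_t: List[float], W: List[List[float]]) -> List[float]:
    """sigmoid(e_t . W_i) for every slot i, computed in one pass."""
    return [sigmoid(dot(e_t, w_i)) for w_i in W]


def slot_type(e_t: List[float], q_i: List[float], V: List[List[float]]) -> str:
    """Concatenate e_t and q_i, apply one linear layer, then softmax."""
    features = e_t + q_i  # list concatenation
    probs = softmax([dot(features, row) for row in V])
    return SLOT_TYPES[probs.index(max(probs))]


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 5) -> Tuple[int, int]:
    """Pick (start, end) with start <= end that maximises the summed score."""
    best, best_score = (0, 0), float("-inf")
    for s in range(len(start_scores)):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


e_t = [1.0, -1.0]
probs = carryover_probs(e_t, W=[[2.0, 0.0], [0.0, 2.0]])
kind = slot_type(e_t, q_i=[0.5], V=[[0.0] * 3, [0.0] * 3, [0.0] * 3, [1.0, 0.0, 0.0]])
span = best_span([0.1, 2.0, 0.3, 0.0], [0.0, 0.5, 1.5, 0.2], max_len=3)
```

With these toy numbers the first slot is likely carried over and the second is not, the type comes out as Span, and the chosen span covers token positions 1 through 2.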

Setup

For the slot carryover model, we do a joint prediction and get a binary vector of length \(M\) (whether to carry over the value, for each slot).
For the slot type and slot span models, we treat the dialog-question pairs \((D_t, q_i)\) as separate prediction tasks, one for each slot.
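One way the three predictions could be combined into a state update at each turn is sketched below. The cascade order (carryover first, then type, then span) is an assumption made for illustration, and `update_state` and the slot names are hypothetical; the post does not say exactly how the paper merges the decisions.

```python
from typing import Dict, Optional


def update_state(
    prev_state: Dict[str, Optional[str]],
    carryover: Dict[str, bool],
    slot_types: Dict[str, str],
    span_values: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Build the state at turn t from the state at turn t-1.

    Assumed cascade: keep the old value if carryover says so; otherwise use
    the extracted span when the type is "Span"; otherwise use the type label
    itself ("Yes", "No" or "DontCare") as the value.
    """
    new_state: Dict[str, Optional[str]] = {}
    for slot, old_value in prev_state.items():
        if carryover[slot]:
            new_state[slot] = old_value
        elif slot_types[slot] == "Span":
            new_state[slot] = span_values[slot]
        else:
            new_state[slot] = slot_types[slot]
    return new_state


prev_state = {"hotel-area": "north", "hotel-parking": None, "hotel-name": None}
new_state = update_state(
    prev_state,
    carryover={"hotel-area": True, "hotel-parking": False, "hotel-name": False},
    slot_types={"hotel-area": "Span", "hotel-parking": "Yes", "hotel-name": "Span"},
    span_values={"hotel-area": "south", "hotel-name": "acorn guest house"},
)
```

Here `carryover` plays the role of the jointly predicted binary vector, while `slot_types` and `span_values` come from the per-slot \((D_t, q_i)\) predictions.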

Results

Model	Accuracy on MultiWOZ 2.0 (%)
This paper	39.21
HyST (ensemble)	44.22
This paper + JST	47.33

Kaushik Rangadurai
