DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation

NLG Comments

Paper Link



The dataset is filtered by removing instances that meet any of the following conditions:

  1. The source or target contains a URL.
  2. The target contains word repetitions of at least three words.
  3. The response does not contain at least one of the top-50 most frequent English words (e.g., “the”, “of”, “a”), which suggests it may not be an English sentence.
  4. The response contains special markers such as “[” or “]”, which could indicate markup language.
  5. The source and target sequences together are longer than 200 words.
  6. The target contains offensive language.
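The filtering rules above can be sketched as a single predicate. This is a minimal illustration, not the paper's actual pipeline: the word list is a small stand-in for the top-50 frequent English words, the repetition check uses one reading of rule 2 (the same token three or more times in a row), and the offensive-language blocklist is omitted since it is not given here.

```python
import re

# Illustrative stand-in for the top-50 most frequent English words
# (assumption: the paper's exact list is not reproduced here).
TOP_ENGLISH_WORDS = {"the", "of", "and", "a", "to", "in", "is", "you", "that", "it"}

URL_RE = re.compile(r"https?://|www\.")

def has_triple_repetition(text):
    # One reading of "word repetitions of at least three words":
    # the same token appears three or more times consecutively.
    words = text.lower().split()
    return any(words[i] == words[i + 1] == words[i + 2]
               for i in range(len(words) - 2))

def should_filter(source, target):
    """Return True if the (source, target) pair should be dropped."""
    target_words = set(target.lower().split())
    if URL_RE.search(source) or URL_RE.search(target):   # 1. URL in source or target
        return True
    if has_triple_repetition(target):                    # 2. word repetition
        return True
    if not target_words & TOP_ENGLISH_WORDS:             # 3. likely non-English
        return True
    if "[" in target or "]" in target:                   # 4. markup markers
        return True
    if len(source.split()) + len(target.split()) > 200:  # 5. too long
        return True
    # 6. offensive-language check omitted: the paper uses a blocklist
    #    that is not reproduced here.
    return False
```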

Model Architecture

Maximum Mutual Information
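In the paper, MMI is applied as a rescoring step: candidate responses are generated with the forward model, then reranked with a backward model that scores P(source | target), which penalizes bland responses that could follow almost any source. A minimal sketch, assuming a hypothetical `backward_log_prob` callable that wraps such a backward model:

```python
def mmi_rerank(source, candidates, backward_log_prob):
    """Rerank candidate responses by the backward score log P(source | target).

    `backward_log_prob` is a hypothetical callable wrapping a backward
    (target -> source) model; generic candidates that give little
    information about the source receive low scores and drop in rank.
    """
    return sorted(candidates,
                  key=lambda target: backward_log_prob(source, target),
                  reverse=True)
```

With a real backward model, `backward_log_prob` would sum the token log-probabilities of the source conditioned on the candidate response.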


In human evaluations on the DSTC-7 dialogue generation challenge, the model achieves near-human performance.

Kaushik Rangadurai

Code. Learn. Explore
