DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation
The dataset is filtered by removing instances where any of the following conditions hold:
- The source or target contains a URL.
- The target contains word repetitions of at least three words.
- The response does not contain at least one of the top-50 most frequent English words (e.g., “the”, “of”, “a”), which suggests it may not be an English sentence.
- The response contains special markers such as “[” or “]”, which may indicate markup language.
- The source and target sequences together are longer than 200 words.
- The target contains offensive language.
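The filtering rules above can be sketched as simple predicate functions. This is a minimal illustration, not the paper's actual pipeline: the top-50 word list here is a small illustrative subset, and the repetition check reads "word repetitions of at least three words" as one word repeated three or more times in a row, which is one possible interpretation.

```python
import re

# Illustrative subset only; the real filter uses the top-50 most frequent English words.
COMMON_ENGLISH = {"the", "of", "and", "a", "to", "in", "is", "you", "that", "it", "i", "are"}

def has_url(text):
    """Detect obvious URLs in the text."""
    return bool(re.search(r"https?://|www\.", text))

def has_word_repetition(text, n=3):
    """True if any word repeats at least n times consecutively (one reading of the rule)."""
    words = text.lower().split()
    run = 1
    for prev, cur in zip(words, words[1:]):
        run = run + 1 if cur == prev else 1
        if run >= n:
            return True
    return False

def lacks_common_english(text):
    """True if the text shares no word with the common-English list."""
    return not (set(text.lower().split()) & COMMON_ENGLISH)

def has_markup(text):
    """Bracket characters may indicate markup language."""
    return "[" in text or "]" in text

def too_long(source, target, limit=200):
    """Source and target together longer than `limit` words."""
    return len(source.split()) + len(target.split()) > limit

def keep_pair(source, target):
    """Keep a (source, target) pair only if no filter condition fires."""
    return not (
        has_url(source) or has_url(target)
        or has_word_repetition(target)
        or lacks_common_english(target)
        or has_markup(target)
        or too_long(source, target)
    )
```

For example, `keep_pair("how are you", "i am fine thanks")` passes every filter, while a target containing a URL, bracket markup, or a three-word run like "well well well" is dropped.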
- The model is based on the GPT-2 architecture.
Maximum Mutual Information
- Train a backward model P(Source | Target) that predicts the source given the target.
- Use top-K sampling to generate a set of candidate responses, then rerank them by P(Source | Hypothesis) under the backward model; this penalizes bland, generic hypotheses that could follow many different sources.
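The MMI reranking step can be sketched as follows. The backward scorer here is a toy word-overlap heuristic standing in for a trained P(Source | Target) model, so only the reranking logic reflects the method described above.

```python
import math

def rerank_mmi(source, candidates, backward_logprob):
    """Return the candidate with the highest backward score
    log P(source | candidate); generic replies that could follow
    any source tend to score poorly under the backward model."""
    return max(candidates, key=lambda c: backward_logprob(source, c))

def toy_backward_logprob(source, candidate):
    """Toy stand-in for a trained backward model: reward word overlap
    with the source, with a mild length penalty (hypothetical heuristic)."""
    src = set(source.lower().split())
    cand = set(candidate.lower().split())
    return math.log(1 + len(src & cand)) - 0.01 * len(cand)

candidates = ["i do not know", "the weather in paris is sunny today"]
best = rerank_mmi("what is the weather in paris", candidates, toy_backward_logprob)
# The generic "i do not know" is ranked below the source-specific reply.
```

In practice the candidates would come from top-K sampling the forward model, and `backward_logprob` would be the trained backward model's log-likelihood of the source given each candidate.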
The model achieves near-human performance on the DSTC-7 dialogue generation task.