BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension


Paper Link

Introduction

Architecture

BART is implemented as a sequence-to-sequence model with a bidirectional encoder over corrupted text and a left-to-right autoregressive decoder.
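To make the architecture concrete, here is a minimal sketch assuming the Hugging Face transformers library and the facebook/bart-base checkpoint (neither is mentioned in this post; the authors' own implementation is in fairseq). The encoder reads a corrupted input and the decoder generates the reconstruction left-to-right:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# A corrupted input: one span has been replaced by a single <mask> token.
text = "BART is a denoising <mask> for pretraining sequence-to-sequence models."
inputs = tokenizer(text, return_tensors="pt")

# The bidirectional encoder reads the corrupted text; the autoregressive
# decoder generates a reconstruction token by token (beam search here).
output_ids = model.generate(**inputs, max_length=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```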

BART vs BERT vs GPT

Differences from BERT

Compared to BERT, each layer of BART's decoder additionally performs cross-attention over the final hidden layer of the encoder (as in the standard transformer sequence-to-sequence model), and BART does not use the additional feed-forward network that BERT applies before word prediction.

Pre-training BART

BART is pre-trained by corrupting documents with a set of noising functions and learning to reconstruct the original text, optimizing the cross-entropy between the decoder's output and the original document.
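As a sketch of what this objective looks like in code (again assuming the Hugging Face transformers API rather than the authors' fairseq code), the corrupted text is fed to the encoder, the original text serves as the decoder's target, and the loss is the usual token-level cross-entropy:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
model.train()

original = "BART is trained to reconstruct the original document from corrupted text."
corrupted = "BART is trained to <mask> the original document from <mask> text."

batch = tokenizer(corrupted, return_tensors="pt")            # encoder input
labels = tokenizer(original, return_tensors="pt").input_ids  # decoder target

outputs = model(**batch, labels=labels)
outputs.loss.backward()  # cross-entropy reconstruction loss
```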

BART Transformations - the paper studies five noising functions: token masking, token deletion, text infilling, sentence permutation, and document rotation.
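To illustrate one of these transforms, below is a rough, standalone sketch of text infilling (my own simplification in plain NumPy, not the authors' implementation): span lengths are drawn from a Poisson distribution with lambda = 3 and each sampled span is replaced by a single mask token, so the model must also learn how many tokens are missing. The per-position masking probability here is arbitrary and chosen only for illustration; the paper's final setup masks roughly 30% of tokens.

```python
import numpy as np


def text_infilling(tokens, lam=3.0, mask_prob=0.15, mask_token="<mask>", seed=0):
    """Replace randomly chosen spans (Poisson-distributed length) with one mask token."""
    rng = np.random.default_rng(seed)
    corrupted, i = [], 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            span = int(rng.poisson(lam))   # how many tokens this span hides
            corrupted.append(mask_token)   # the whole span becomes ONE mask token
            i += span                      # span == 0 inserts a mask without removing tokens
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted


tokens = "BART corrupts documents with noising functions and learns to reconstruct them".split()
print(text_infilling(tokens))
```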

Fine-tuning BART

  1. Sequence Classification - the same input is fed to both the encoder and the decoder, and the final hidden state of the final decoder token is passed to a new linear classifier (analogous to BERT's CLS token, but placed at the end so it can attend to the full input); see the sketch after this list.
  2. Token Classification - the complete document is fed to both the encoder and the decoder, and the top decoder hidden state of each token is used as that token's representation for classification.
  3. Sequence Generation - BART is used directly as a standard encoder-decoder: the encoder reads the input and the decoder generates the output autoregressively (e.g., summarization or abstractive question answering).
  4. Machine Translation - the entire BART model is used as a single pre-trained decoder-side model; a new, randomly initialized encoder replaces BART's encoder embedding layer and is trained to map foreign-language input into representations that BART can de-noise into English.
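A minimal sketch of the first setup (sequence classification), assuming the Hugging Face transformers library and the facebook/bart-large-mnli checkpoint, neither of which appears in this post: the same text is fed to the encoder and the decoder, and a classification head sits on top of the final decoder token's representation.

```python
import torch
from transformers import BartTokenizer, BartForSequenceClassification

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-mnli")
model = BartForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

premise = "BART is pre-trained by corrupting text and learning to reconstruct it."
hypothesis = "BART is a denoising autoencoder."

# The same (paired) input is passed to both the encoder and the decoder;
# the head classifies the representation of the final decoder token.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(logits.softmax(dim=-1))  # probabilities over contradiction / neutral / entailment
```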

Results

BART Results

Kaushik Rangadurai
