CTRL: A Conditional Transformer Language Model for Controllable Generation
Overview
- CTRL is trained to condition on control codes that govern style, content, and task-specific behavior.
- With 1.63 billion parameters, the Conditional Transformer Language (CTRL) model can generate text conditioned on control codes that specify domain, style, topics, dates, entities, relationships between entities, plot points, and task-related behavior.
- For example, large resources like Wikipedia, Project Gutenberg, and Amazon Reviews can each be assigned a domain-related control code. Smaller resources, like the content extracted from individual subreddits, often occur with both a broader domain name, reddit, and subdomain information, r/subdomain.
- Text can be generated in more predictable ways by controlling for content or changing the domain even when the initial prompt remains fixed.
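The idea of holding the prompt fixed while varying the control code can be sketched in a few lines of Python. This is only an illustration of how conditioned inputs might be assembled: the helper name `build_conditioned_prompt`, the single-space separator, and the example prompt are assumptions, not details taken from the CTRL release.

```python
def build_conditioned_prompt(control_code: str, prompt: str) -> str:
    """Prepend a control code to a prompt (hypothetical helper).

    Assumption: the control code and the prompt are joined by one space.
    """
    return f"{control_code} {prompt}".strip()


# The same prompt, paired with different domain control codes.
FIXED_PROMPT = "A knife"  # illustrative prompt, chosen for this sketch
CONTROL_CODES = ("Horror", "Reviews", "Wikipedia")

conditioned = {
    code: build_conditioned_prompt(code, FIXED_PROMPT) for code in CONTROL_CODES
}

for code, text in conditioned.items():
    print(f"{code!r} -> {text!r}")
```

Each resulting string would be fed to the model as its input, so only the leading control code differs between runs.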
Architecture
- A standard Transformer language model in which a control code is the first token of every sequence.
- An unknown token is also introduced so that, during preprocessing, sequences containing more than 2 unknown tokens can be filtered out.
- Control Codes - Wikipedia, Books, Horror, Reviews, Relationships, Legal, Science Title, Politics Title, Running Text, Horror Text.
- Controlled Generation - sampling from a temperature-controlled softmax, where the logits are divided by a temperature T before normalization.
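The preprocessing and sampling steps in this list can be sketched as follows. This is a minimal, standard-library illustration, not the authors' implementation: the `<unk>` string, the function names, and the default values are assumptions made for the example.

```python
import math
import random

UNK = "<unk>"  # assumed spelling of the unknown token


def keep_sequence(tokens, max_unknown=2):
    """Return True if the sequence has at most `max_unknown` unknown tokens."""
    return sum(1 for t in tokens if t == UNK) <= max_unknown


def add_control_code(control_code, tokens):
    """Place the control code as the first token of the sequence."""
    return [control_code] + list(tokens)


def temperature_softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T > 0.

    T < 1 sharpens the distribution; T > 1 flattens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=None):
    """Sample a token index from the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Usage: filter a sequence, prepend a control code, then sample an index.
sequence = ["the", UNK, "cat", UNK]
if keep_sequence(sequence):
    model_input = add_control_code("Wikipedia", sequence)
toy_logits = [1.0, 2.0, 3.0]  # made-up scores for a 3-token vocabulary
next_index = sample_token(toy_logits, temperature=0.7, rng=random.Random(0))
```

Lowering the temperature concentrates probability on the highest-scoring token, which makes generation more predictable at the cost of diversity.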
Kaushik Rangadurai