Publication record · 18.cifr/2020.brown.gpt3-few-shot
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions, something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
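To make the few-shot setting described in the abstract concrete, the sketch below assembles K demonstrations plus an unanswered query into a single prompt that a language model completes, with no gradient updates. The `build_few_shot_prompt` and `generate` names are illustrative stand-ins, not an API from the paper or any specific library.

```python
# Minimal sketch of task-agnostic few-shot prompting (in-context learning):
# the model sees K (input, output) demonstrations plus a query in one prompt
# and is asked to continue the text. No task-specific fine-tuning is involved.

def build_few_shot_prompt(instruction, demonstrations, query):
    """Concatenate an instruction, K demonstrations, and the unanswered query."""
    lines = [instruction, ""]
    for x, y in demonstrations:
        lines.append(f"Input: {x}")
        lines.append(f"Output: {y}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def generate(prompt: str) -> str:
    """Placeholder for any autoregressive LM's text-completion call (assumption)."""
    raise NotImplementedError("plug in a language model here")


if __name__ == "__main__":
    demos = [
        ("cheese", "fromage"),  # K = 2 demonstrations (2-shot)
        ("house", "maison"),
    ]
    prompt = build_few_shot_prompt("Translate English to French.", demos, "apple")
    print(prompt)  # the model would be asked to complete the final "Output:" line
```

The same prompt-construction routine works for any task expressible as input/output text pairs, which is what makes the approach task-agnostic.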
The authors flag brittleness on precise logical reasoning tasks and potential benchmark contamination from web-scale training data. Open directions include understanding the mechanistic basis for in-context learning, developing contamination-aware evaluation protocols, and exploring instruction tuning or RLHF to close the remaining gap to fine-tuned models.
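As a concrete illustration of a contamination-aware evaluation step, the sketch below flags evaluation examples whose word n-grams also appear in the training corpus, in the spirit of the paper's overlap-based contamination analysis. The 13-gram size roughly follows the paper's choice, but the tokenization, normalization, and in-memory index here are simplifying assumptions, not the authors' exact procedure.

```python
# Simplified contamination check: flag evaluation examples sharing any
# long word n-gram with the training corpus, then report scores on the
# "clean" and "dirty" splits separately.

import re


def ngrams(text, n=13):
    """Lowercase, strip punctuation, and return the set of word n-grams."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_train_index(train_docs, n=13):
    """Union of all n-grams observed in the training corpus (toy index)."""
    index = set()
    for doc in train_docs:
        index |= ngrams(doc, n)
    return index


def is_contaminated(example, train_index, n=13):
    """True if any n-gram of the evaluation example was seen during training."""
    return bool(ngrams(example, n) & train_index)


if __name__ == "__main__":
    train_index = build_train_index(["a very long training document ..."])
    eval_set = ["some benchmark question ..."]
    clean = [ex for ex in eval_set if not is_contaminated(ex, train_index)]
    dirty = [ex for ex in eval_set if is_contaminated(ex, train_index)]
    print(len(clean), "clean /", len(dirty), "possibly contaminated")
```

Reporting metrics on both splits lets readers judge how much of a benchmark gain could be attributable to memorized overlap rather than generalization.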