Publication record · 18.cifr/2020.brown.gpt3-few-shot
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions, something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
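To make the few-shot setting described in the abstract concrete, the sketch below assembles K demonstrations plus an unanswered query into a single prompt that a language model completes, with no gradient updates. The `build_few_shot_prompt` and `generate` names are illustrative stand-ins, not an API from the paper or any specific library.

```python
# Minimal sketch of task-agnostic few-shot prompting (in-context learning):
# the model sees K (input, output) demonstrations plus a query in one prompt
# and is asked to continue the text. No task-specific fine-tuning is involved.

def build_few_shot_prompt(instruction, demonstrations, query):
    """Concatenate an instruction, K demonstrations, and the unanswered query."""
    lines = [instruction, ""]
    for x, y in demonstrations:
        lines.append(f"Input: {x}")
        lines.append(f"Output: {y}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def generate(prompt: str) -> str:
    """Placeholder for any autoregressive LM's text-completion call (assumption)."""
    raise NotImplementedError("plug in a language model here")


if __name__ == "__main__":
    demos = [
        ("cheese", "fromage"),  # K = 2 demonstrations (2-shot)
        ("house", "maison"),
    ]
    prompt = build_few_shot_prompt("Translate English to French.", demos, "apple")
    print(prompt)  # the model would be asked to complete the final "Output:" line
```

The same prompt-construction routine works for any task expressible as input/output text pairs, which is what makes the approach task-agnostic.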
The authors flag brittleness on precise logical reasoning tasks and potential benchmark contamination from web-scale training data. Open directions include understanding the mechanistic basis for in-context learning, developing contamination-aware evaluation protocols, and exploring instruction tuning or RLHF to close the remaining gap to fine-tuned models.
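As a concrete illustration of a contamination-aware evaluation step, the sketch below flags evaluation examples whose word n-grams also appear in the training corpus, in the spirit of the paper's overlap-based contamination analysis. The 13-gram size roughly follows the paper's choice, but the tokenization, normalization, and in-memory index here are simplifying assumptions, not the authors' exact procedure.

```python
# Simplified contamination check: flag evaluation examples sharing any
# long word n-gram with the training corpus, then report scores on the
# "clean" and "dirty" splits separately.

import re


def ngrams(text, n=13):
    """Lowercase, strip punctuation, and return the set of word n-grams."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_train_index(train_docs, n=13):
    """Union of all n-grams observed in the training corpus (toy index)."""
    index = set()
    for doc in train_docs:
        index |= ngrams(doc, n)
    return index


def is_contaminated(example, train_index, n=13):
    """True if any n-gram of the evaluation example was seen during training."""
    return bool(ngrams(example, n) & train_index)


if __name__ == "__main__":
    train_index = build_train_index(["a very long training document ..."])
    eval_set = ["some benchmark question ..."]
    clean = [ex for ex in eval_set if not is_contaminated(ex, train_index)]
    dirty = [ex for ex in eval_set if is_contaminated(ex, train_index)]
    print(len(clean), "clean /", len(dirty), "possibly contaminated")
```

Reporting metrics on both splits lets readers judge how much of a benchmark gain could be attributable to memorized overlap rather than generalization.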