Publication record · 18.cifr/2023.touvron.llama2-rlhf
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.
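As an illustration of the dialogue use case, here is a minimal sketch of querying a Llama 2-Chat checkpoint through the Hugging Face transformers library. The checkpoint identifier "meta-llama/Llama-2-7b-chat-hf" and the generation settings are assumptions about the public release, not details from this record.

```python
# Minimal dialogue sketch for Llama 2-Chat via Hugging Face transformers.
# The checkpoint name "meta-llama/Llama-2-7b-chat-hf" and the generation
# settings are assumptions about the public release, not from this record.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 2-Chat's prompt template wraps each user turn in [INST] ... [/INST].
prompt = "[INST] Explain RLHF in two sentences. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```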
The authors identify the 4,096-token context window, multilingual capability, and code generation as areas needing further work. Bias and limited diversity among reward-model annotators are flagged as open problems. Safety mitigations still require evaluation against sophisticated adversarial attacks beyond the red-teaming described in the paper.
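The 4,096-token context window has a practical consequence for long dialogues: inputs must be shortened before generation or the oldest turns fall outside the model's context. A minimal sketch of left-truncation with the Hugging Face tokenizer follows; the checkpoint name and the 256-token generation budget are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: keep a prompt within Llama 2's 4096-token context window.
# Assumes the Hugging Face checkpoint "meta-llama/Llama-2-7b-chat-hf";
# the 256-token generation budget is an illustrative choice, not from the paper.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 4096  # Llama 2 context length reported in the paper
GEN_BUDGET = 256       # tokens reserved for the model's reply (assumption)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

def fit_prompt(prompt: str) -> str:
    """Left-truncate the prompt so prompt + reply fits in the context window."""
    max_prompt_tokens = CONTEXT_WINDOW - GEN_BUDGET
    ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    if len(ids) <= max_prompt_tokens:
        return prompt
    # Drop the oldest tokens; keep the most recent dialogue turns.
    return tokenizer.decode(ids[-max_prompt_tokens:])
```

Left-truncation keeps the most recent turns, which usually matter most for a chat model; summarizing older turns instead is a common alternative when early context must be preserved.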