Publication record · 18.cifr/2021.su.roformer-rope
Position encoding has recently been shown to be effective in the transformer architecture. It enables valuable supervision for dependency modeling between elements at different positions of the sequence. In this paper, we first investigate various methods to integrate positional information into the learning process of transformer-based language models. We then propose a novel method named Rotary Position Embedding (RoPE) to effectively leverage the positional information. Specifically, the proposed RoPE encodes the absolute position with a rotation matrix and meanwhile incorporates the explicit relative position dependency in the self-attention formulation.
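The rotation described in the abstract can be made concrete with a short sketch: each consecutive pair of query/key features is rotated by an angle proportional to the token position, so the dot product between a rotated query at position m and a rotated key at position n depends only on the offset m − n. The snippet below is a minimal NumPy illustration under that reading; the function name `rope`, the default `base=10000.0`, and the use of NumPy are assumptions for illustration, not an implementation taken from the paper.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply a rotary position embedding to x of shape (seq_len, dim).

    Each consecutive feature pair (2i, 2i+1) is rotated by the angle
    m * theta_i, where m is the token position and
    theta_i = base ** (-2i / dim).
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0, "feature dimension must be even"
    # Per-pair rotation frequencies theta_i.
    theta = base ** (-np.arange(0, dim, 2) / dim)           # (dim/2,)
    # Angles m * theta_i for every position m and every pair i.
    angles = np.arange(seq_len)[:, None] * theta[None, :]   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                          # split into pairs
    # Standard 2D rotation applied to each (x1, x2) pair.
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# The rotation makes the attention logits relative-position aware:
# rope(q)[m] . rope(k)[n] depends on q[m], k[n] and only the offset (m - n).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 64))
k = rng.standard_normal((8, 64))
scores = rope(q) @ rope(k).T
```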
Extension to 2D and higher-dimensional position spaces (images, graphs) is flagged as future work. Combining RoPE with sparse or linear attention for very long sequences remains an open direction, and per-task tuning of the frequency schedule has not been explored.