Publication record · 18.cifr/2014.sutskever.sequence-to-sequence-learning-with-neura
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
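The abstract's two-LSTM pipeline can be sketched compactly. Below is a minimal, hedged illustration assuming PyTorch; the class name Seq2Seq and its interface are our own, while the rough hyperparameters (4 layers, 1000 units, 1000-dimensional embeddings, 160k/80k vocabularies) follow the paper's description. The paper also reverses the source word order before encoding, which it reports improves results.

```python
# Minimal sketch of the paper's encoder-decoder architecture (not the
# authors' code): one deep LSTM compresses the source into a fixed-size
# state, and a second deep LSTM decodes the target conditioned on it.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=1000, hidden=1000, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, layers, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden, layers, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encode: the final (h, c) state is the fixed-dimensional summary vector.
        # (Per the paper, src would be fed in reversed word order.)
        _, state = self.encoder(self.src_emb(src))
        # Decode: condition the target-side LSTM on that state.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # logits over the target vocabulary

# Usage with toy sizes (the paper's scale is 4 layers, 1000 units,
# 160k source / 80k target vocabularies):
model = Seq2Seq(src_vocab=200, tgt_vocab=200, emb_dim=32, hidden=32, layers=2)
logits = model(torch.randint(0, 200, (2, 12)), torch.randint(0, 200, (2, 10)))
```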
The fixed-length encoding vector is an information bottleneck that limits performance on long sentences; attention mechanisms, which let the decoder consult all encoder states rather than a single summary vector, were later introduced to address it. Handling of out-of-vocabulary words, which are all collapsed to a single <UNK> token, can likewise be improved with subword models. The authors also suggest exploring convolutional encoders and other architectures as alternatives to the LSTM encoder.
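For concreteness, here is a hedged sketch of the attention idea referenced above (it is follow-up work, not part of this paper): instead of decoding from one fixed vector, each decoder step computes a weighted sum over all encoder states. The sketch uses Luong-style dot-product scoring, assuming PyTorch and illustrative shapes.

```python
# Minimal dot-product attention sketch (follow-up work, not this paper's
# method): the decoder re-weights the encoder states at every step,
# removing the single fixed-length bottleneck.
import torch
import torch.nn.functional as F

def attend(dec_state, enc_states):
    # dec_state: (batch, hidden) current decoder hidden state
    # enc_states: (batch, src_len, hidden) all encoder outputs
    scores = torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    weights = F.softmax(scores, dim=1)            # attention distribution over source
    context = torch.bmm(weights.unsqueeze(1), enc_states).squeeze(1)   # (batch, hidden)
    return context, weights
```

The returned context vector would be combined with the decoder state before predicting the next token, so the effective summary of the source is recomputed at every step instead of being fixed once.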