Publication record · 18.cifr/2015.ioffe.batch-normalization
Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch.
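As a rough illustration of the core idea described in the abstract, the sketch below normalizes each feature over a training mini-batch and then applies a learned scale and shift. The function name, shapes, and NumPy usage are illustrative assumptions, not the authors' implementation; a real layer would also track running statistics for inference.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Minimal batch-norm forward pass for a 2-D activation matrix.

    x:     (batch_size, num_features) pre-activations
    gamma: (num_features,) learned scale
    beta:  (num_features,) learned shift
    """
    # Per-feature statistics computed over the mini-batch dimension.
    mu = x.mean(axis=0)
    var = x.var(axis=0)

    # Normalize, then restore representational capacity with gamma/beta.
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Toy usage: a batch of 4 examples with 3 features each.
x = np.random.randn(4, 3) * 10 + 5
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0), out.std(axis=0))  # roughly 0 and 1 per feature
```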
The authors note that Batch Normalization's benefits for recurrent networks were not fully explored, since applying BN across the time steps of an RNN is non-trivial. Extending the approach to very small mini-batches or purely online settings also remained an open challenge, one that later motivated the Layer Normalization and Group Normalization variants.
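For intuition on why small mini-batches are problematic, the sketch below contrasts the axis over which statistics are computed: Batch Normalization averages over examples, while Layer Normalization averages over features. The helper function and toy shapes are assumptions for illustration, not code from any of the cited papers.

```python
import numpy as np

def normalize(x, axis, eps=1e-5):
    """Normalize x to zero mean / unit variance along the given axis."""
    mu = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.randn(2, 6)  # tiny batch: 2 examples, 6 features

# Batch Normalization: statistics over the batch dimension (axis 0),
# so the estimates become noisy when the mini-batch is very small.
bn = normalize(x, axis=0)

# Layer Normalization: statistics over the feature dimension (axis 1),
# independent of batch size, so it also applies online or per time step.
ln = normalize(x, axis=1)
```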