Publication record · 18.cifr/2012.hinton.dropout
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This overfitting is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random dropout gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
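The mechanism described in the abstract amounts to a single multiplicative binary mask per training case. Below is a minimal NumPy sketch, assuming a drop rate of 0.5 and the now-common "inverted" scaling convention (the paper instead halves the weights at test time, which is equivalent in expectation); `dropout_forward` and its parameters are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, p_drop=0.5, train=True):
    """Zero each hidden unit with probability p_drop during training.

    Inverted scaling divides the surviving activations by the keep
    probability so that the expected activation matches test time.
    """
    if not train:
        return h                                  # test time: identity
    mask = rng.random(h.shape) >= p_drop          # keep with prob 1 - p_drop
    return h * mask / (1.0 - p_drop)

# Toy usage: hidden activations for a batch of 4 examples, 8 units each.
h = rng.standard_normal((4, 8))
print(dropout_forward(h))                 # training: ~half the units zeroed
print(dropout_forward(h, train=False))    # test: unchanged activations
```

Because each training case draws a fresh mask, the network effectively trains an exponentially large ensemble of thinned subnetworks that share weights.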
Adaptive per-neuron dropout rates, theoretical grounding in variational Bayes, and interaction with batch normalization are flagged as open questions. Extensions to recurrent and convolutional architectures via structured dropout patterns are natural next steps suggested by the method's assumptions; a sketch of the per-unit variant follows.
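For the first open question, one hypothetical shape the idea could take is a separate keep probability per hidden unit. A minimal sketch, assuming fixed per-unit rates are supplied externally; `per_unit_dropout` and its arguments are illustrative, and an adaptive scheme would tune or learn `p_drop` rather than fix it:

```python
import numpy as np

rng = np.random.default_rng(1)

def per_unit_dropout(h, p_drop):
    """Dropout with a separate drop probability for each hidden unit.

    p_drop has shape (n_units,). This sketch only applies given rates;
    making them adaptive is the open question flagged above.
    """
    keep = 1.0 - np.asarray(p_drop)
    mask = rng.random(h.shape) < keep     # broadcast over the batch dim
    return h * mask / keep                # rescale so E[output] equals h

# Toy usage: 3 units with low, medium, and high drop rates.
h = rng.standard_normal((4, 3))
print(per_unit_dropout(h, p_drop=[0.1, 0.5, 0.8]))
```

Structured variants for recurrent or convolutional architectures would replace the independent per-element mask with masks shared across time steps or feature maps.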