Publication record · 18.cifr/2012.hinton.dropout
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This overfitting is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random dropout gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
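The mechanism described in the abstract amounts to a single multiplicative binary mask per training case. Below is a minimal NumPy sketch, assuming a drop rate of 0.5 and the now-common "inverted" scaling convention (the paper instead halves the weights at test time, which is equivalent in expectation); `dropout_forward` and its parameters are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, p_drop=0.5, train=True):
    """Zero each hidden unit with probability p_drop during training.

    Inverted scaling divides the surviving activations by the keep
    probability so that the expected activation matches test time.
    """
    if not train:
        return h                                  # test time: identity
    mask = rng.random(h.shape) >= p_drop          # keep with prob 1 - p_drop
    return h * mask / (1.0 - p_drop)

# Toy usage: hidden activations for a batch of 4 examples, 8 units each.
h = rng.standard_normal((4, 8))
print(dropout_forward(h))                 # training: ~half the units zeroed
print(dropout_forward(h, train=False))    # test: unchanged activations
```

Because each training case draws a fresh mask, the network effectively trains an exponentially large ensemble of thinned subnetworks that share weights.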
Adaptive per-neuron dropout rates, theoretical grounding in variational Bayes, and interaction with batch normalization are flagged as open questions. Extensions to recurrent and convolutional architectures via structured dropout patterns are natural next steps suggested by the method's assumptions; a sketch of the per-unit variant follows.
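For the first open question, one hypothetical shape the idea could take is a separate keep probability per hidden unit. A minimal sketch, assuming fixed per-unit rates are supplied externally; `per_unit_dropout` and its arguments are illustrative, and an adaptive scheme would tune or learn `p_drop` rather than fix it:

```python
import numpy as np

rng = np.random.default_rng(1)

def per_unit_dropout(h, p_drop):
    """Dropout with a separate drop probability for each hidden unit.

    p_drop has shape (n_units,). This sketch only applies given rates;
    making them adaptive is the open question flagged above.
    """
    keep = 1.0 - np.asarray(p_drop)
    mask = rng.random(h.shape) < keep     # broadcast over the batch dim
    return h * mask / keep                # rescale so E[output] equals h

# Toy usage: 3 units with low, medium, and high drop rates.
h = rng.standard_normal((4, 3))
print(per_unit_dropout(h, p_drop=[0.1, 0.5, 0.8]))
```

Structured variants for recurrent or convolutional architectures would replace the independent per-element mask with masks shared across time steps or feature maps.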