Randomly zero out entire channels (a channel is a 2D feature map,
e.g., the \(j\)-th channel of the \(i\)-th sample in the
batched input is a 2D tensor \(\mbox{input}[i, j]\)).
Usage
nn_dropout2d(p = 0.5, inplace = FALSE)
Arguments
p
(float, optional): probability of a channel being zeroed. Default: 0.5.
inplace
(bool, optional): If TRUE, performs the operation in place.
Default: FALSE.
Shape
Input: \((N, C, H, W)\)
Output: \((N, C, H, W)\) (same shape as input)
Details
Each channel will be zeroed out independently on every forward call with
probability p using samples from a Bernoulli distribution.
Usually the input comes from nn_conv2d modules.
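The per-channel behaviour described above can be checked with a minimal sketch. It assumes the torch package is installed and that the module is in training mode. The variable names (drop, x, y, channel_is_uniform, channel_is_dropped) are illustrative only and do not come from the package.

```r
library(torch)

torch_manual_seed(42)

drop <- nn_dropout2d(p = 0.5)
drop$train()                       # dropout is only applied in training mode

x <- torch_ones(2, 4, 3, 3)        # (N, C, H, W)
y <- as.array(drop(x))             # plain R array with dim c(2, 4, 3, 3)

# TRUE where all values of a 3 x 3 feature map are identical, i.e. the
# channel was either dropped as a whole or kept as a whole
channel_is_uniform <- apply(
  y, c(1, 2),
  function(ch) length(unique(as.vector(ch))) == 1
)

# TRUE where the whole feature map was zeroed
channel_is_dropped <- apply(y, c(1, 2), function(ch) all(ch == 0))

# In evaluation mode the module should pass its input through unchanged
drop$eval()
unchanged_in_eval <- torch_equal(drop(x), x)
```

Here channel_is_uniform should be TRUE for every (sample, channel) pair, since a channel is never partially zeroed. Which entries of channel_is_dropped are TRUE depends on the random draw.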
As described in the paper
Efficient Object Localization Using Convolutional Networks,
if adjacent pixels within feature maps are strongly correlated
(as is normally the case in early convolution layers), then i.i.d. dropout
will not regularize the activations and will instead just result
in an effective learning rate decrease.
In this case, nn_dropout2d will help promote independence between
feature maps and should be used instead.
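As an illustration of the typical placement after a convolution, the following sketch puts nn_dropout2d behind an nn_conv2d layer. The layer sizes, the value p = 0.2, and the names net, x, and out are arbitrary choices for this example, not recommendations from the package.

```r
library(torch)

# Small convolutional block: convolution, non-linearity, channel-wise dropout
net <- nn_sequential(
  nn_conv2d(in_channels = 3, out_channels = 8, kernel_size = 3, padding = 1),
  nn_relu(),
  nn_dropout2d(p = 0.2)
)

x <- torch_randn(4, 3, 16, 16)     # (N, C, H, W)

net$train()                        # enable dropout
out <- net(x)

# Dropout does not change the shape: (N, 8, 16, 16) with padding = 1
out_shape <- as.integer(out$shape)
```

Calling net$eval() before inference disables the dropout layer, so predictions no longer depend on a random mask.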