Similar to subspace sampling and bagging: each training step effectively trains a different subnetwork

  • During each training step (i.e., each gradient update), a fraction of the units (typically half) in the specified hidden layers is “dropped”. Only the weights and biases of units that were not dropped are updated in that gradient step
  • At test time, the entire network is used, with the weights scaled according to the dropout rate so that expected activations match those seen during training (see the sketch after this list)
  • Prevents co-adaptation among neurons, since no unit can rely on the presence of any particular other unit
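
A minimal NumPy sketch of this train/test behavior, using standard (non-inverted) dropout as described above; the function name `dropout_forward` and the `drop_rate` default are illustrative, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, drop_rate=0.5, training=True):
    """Standard (non-inverted) dropout applied to a layer's activations h.

    Training: each unit is zeroed independently with probability drop_rate,
    so only the surviving units' weights and biases receive gradient updates.
    Test: all units are kept, but activations are scaled by the keep
    probability so their expected value matches training.
    """
    keep = 1.0 - drop_rate
    if training:
        mask = rng.random(h.shape) < keep  # 1 = unit kept, 0 = unit dropped
        return h * mask
    return h * keep                        # scale at test time

h = rng.standard_normal((4, 8))            # batch of 4 examples, 8 hidden units
print(dropout_forward(h, training=True))   # roughly half the units zeroed
print(dropout_forward(h, training=False))  # full network, scaled activations
```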

Dilution

Randomly remove individual weights. Dropout is a special case of dilution in which an entire row w_j of the weight matrix W is removed with some probability p, instead of individual weights w_ij; zeroing the whole row is equivalent to dropping unit j, as sketched below.
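
A short sketch contrasting the two, assuming a weight matrix W whose rows correspond to units (the variable names and the removal probability p = 0.5 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 3))  # weight matrix: 5 units (rows) x 3 inputs
p = 0.5                          # removal probability

# Dilution: each individual weight w_ij is zeroed independently.
weight_mask = rng.random(W.shape) >= p
W_diluted = W * weight_mask

# Dropout as a special case: an entire row w_j (all weights into unit j)
# is zeroed at once, which is the same as dropping unit j itself.
row_mask = rng.random((W.shape[0], 1)) >= p
W_dropped = W * row_mask         # row mask broadcasts across the columns

print(W_diluted)                 # scattered zeros at individual positions
print(W_dropped)                 # whole rows of zeros
```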