(2016. 7) Layer Normalization
Published in July 2016
Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
Simple summary
Similar to batch normalization, but the mean and variance are computed over the hidden units of a single example rather than over the mini-batch.
Works well for recurrent networks, where batch normalization is difficult to apply.
Independent of batch size; the same computation is used at training and test time.
Invariant to re-scaling of the input data (demonstrated in the sketch below).
Invariant to re-scaling and re-centering of the weight matrix.
Normalization implicitly reduces the effective learning rate as the norm of the weights grows during training, which stabilizes learning.
However, batch normalization still performs better for CNNs.
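A minimal NumPy sketch of the computation summarized above, not the authors' code; `layer_norm`, `gain`, and `bias` are illustrative names (the paper calls the learnable parameters gain g and bias b), and adding `eps` to the standard deviation is just one common convention:

```python
import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    """Normalize the summed inputs `a` over the hidden units of a layer.

    Unlike batch norm, the mean and variance are taken over the feature
    (last) axis of each example, so the result does not depend on the
    other examples in the mini-batch.
    """
    mu = a.mean(axis=-1, keepdims=True)
    sigma = a.std(axis=-1, keepdims=True)
    return gain * (a - mu) / (sigma + eps) + bias

# A mini-batch of 4 examples with 8 hidden units each.
a = np.random.randn(4, 8)
gain = np.ones(8)   # learnable gain, initialized to 1
bias = np.zeros(8)  # learnable bias, initialized to 0

y = layer_norm(a, gain, bias)

# Invariance to input re-scaling: multiplying the inputs by a constant
# leaves the output essentially unchanged (exactly so if eps were zero).
assert np.allclose(y, layer_norm(10.0 * a, gain, bias), atol=1e-3)
```

Because the statistics come from a single example, the assertion holds for any batch size, including a batch of one, which is what makes the method attractive for RNNs and online learning.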