Optimization

  • Understanding the difficulty of training deep feedforward neural networks (2010)

  • On the difficulty of training Recurrent Neural Networks (2012. 11)

    • Gradient Clipping, RNN
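
    As a quick illustration of the clipping heuristic this paper popularized (a sketch, not code from the paper), gradient clipping by global norm rescales all gradients when their combined norm exceeds a threshold:

    ```python
    import numpy as np

    def clip_by_global_norm(grads, threshold):
        """Rescale a list of gradient arrays so their global L2 norm
        does not exceed `threshold` (the exploding-gradient remedy
        proposed by Pascanu et al.)."""
        norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        if norm > threshold:
            grads = [g * (threshold / norm) for g in grads]
        return grads, norm
    ```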

  • Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (2015. 2)

  • A Simple Way to Initialize Recurrent Networks of Rectified Linear Units (2015. 4)

    • Weight Initialization, RNN, Identity Matrix
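
    The core trick of this paper (IRNN) is easy to sketch: initialize the recurrent weight matrix of a ReLU RNN to the identity and biases to zero, so the untrained network initially just accumulates its inputs. A minimal sketch (the small input-weight scale is an illustrative assumption, not a value from the paper):

    ```python
    import numpy as np

    def irnn_init(hidden_size, input_size, scale=1.0):
        """IRNN-style initialization (Le et al., 2015): recurrent weights
        set to a (scaled) identity matrix, biases to zero."""
        W_hh = scale * np.eye(hidden_size)                        # recurrent weights = identity
        W_xh = np.random.randn(hidden_size, input_size) * 0.001   # small random input weights (assumed scale)
        b_h = np.zeros(hidden_size)
        return W_hh, W_xh, b_h
    ```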

  • Cyclical Learning Rates for Training Neural Networks (2015. 6)

    • CLR, Triangular, ExpRange, Long-term Benefit
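
    The paper's basic `triangular` policy can be sketched in a few lines: the learning rate ramps linearly between a base and a maximum value, with a half-cycle length of `step_size` iterations:

    ```python
    import math

    def clr_triangular(it, base_lr, max_lr, step_size):
        """Triangular cyclical learning rate (Smith, 2015): the LR moves
        linearly between base_lr and max_lr; one full cycle takes
        2 * step_size iterations."""
        cycle = math.floor(1 + it / (2 * step_size))
        x = abs(it / step_size - 2 * cycle + 1)
        return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
    ```

    The `triangular2` and `exp_range` variants differ only in how the amplitude `(max_lr - base_lr)` decays across cycles.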

  • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (2016. 9)

    • Generalization, Sharpness of Minima

  • Neural Optimizer Search with Reinforcement Learning (2017. 9)

    • Neural Optimizer Search (NOS), PowerSign, AddSign
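
    The best update rules discovered by NOS are compact enough to sketch directly. PowerSign, for instance, scales the gradient up when its sign agrees with a moving average of past gradients and down when it disagrees (hyperparameter defaults below are illustrative assumptions):

    ```python
    import numpy as np

    def powersign_update(w, g, m, lr=0.1, beta=0.9, alpha=np.e):
        """PowerSign (Bello et al., 2017): multiply the gradient by
        alpha ** (sign(g) * sign(m)), where m is an exponential moving
        average of gradients. AddSign replaces the power with the
        factor (1 + sign(g) * sign(m))."""
        m = beta * m + (1 - beta) * g                            # gradient moving average
        w = w - lr * (alpha ** (np.sign(g) * np.sign(m))) * g    # boosted/damped step
        return w, m
    ```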

  • On the Convergence of Adam and Beyond (2018. 2)

  • Adafactor: Adaptive Learning Rates with Sublinear Memory Cost (2018. 4)

    • Adafactor, Adaptive Method, Update Clipping
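
    Adafactor's sublinear memory cost comes from factoring the second-moment accumulator: for an n x m weight matrix it stores only row and column sums (O(n + m) memory instead of O(nm)) and reconstructs the full estimate as a normalized outer product. A rough sketch of that factored estimate:

    ```python
    import numpy as np

    def factored_second_moment(R, C, G, beta2=0.999):
        """Factored second-moment estimate from Adafactor (Shazeer & Stern,
        2018): R and C are exponential moving averages of the row and
        column sums of the squared gradient G**2; the full matrix is
        approximated by outer(R, C) / sum(R)."""
        R = beta2 * R + (1 - beta2) * (G ** 2).sum(axis=1)   # per-row statistics, shape (n,)
        C = beta2 * C + (1 - beta2) * (G ** 2).sum(axis=0)   # per-column statistics, shape (m,)
        V_hat = np.outer(R, C) / R.sum()                     # rank-1 reconstruction, shape (n, m)
        return R, C, V_hat
    ```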

  • Revisiting Small Batch Training for Deep Neural Networks (2018. 4)

    • Generalization Performance, Training Stability
