The goal of all of these techniques is to keep the distribution of activations well-behaved throughout the network.

Activation Functions

**[brain analogy]** activation: the mechanism by which a neuron responds to a stimulus

Matrix multiplication is a linear operation, but most of the problems we need to solve are nonlinear. Without a nonlinear activation between layers, any stack of linear layers collapses into a single linear map.
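A quick numerical sketch of this point (the layer sizes here are arbitrary, chosen just for illustration): two linear layers with no activation are equivalent to one linear layer, while inserting a ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))    # a batch of 4 inputs with 8 features
W1 = rng.normal(size=(8, 16))  # first "layer"
W2 = rng.normal(size=(16, 3))  # second "layer"

# Two linear layers with no activation collapse into a single linear map:
no_activation = x @ W1 @ W2
collapsed = x @ (W1 @ W2)      # one equivalent weight matrix
print(np.allclose(no_activation, collapsed))   # True: still just linear

# A nonlinearity (ReLU) between the layers breaks this equivalence:
with_relu = np.maximum(x @ W1, 0) @ W2
print(np.allclose(with_relu, collapsed))       # False: genuinely nonlinear
```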


Weight Initialization

activation at std=0.01

activation at std=0.05

Proper initialization is an active area of research…

  1. Understanding the difficulty of training deep feedforward neural networks by Glorot and Bengio, 2010
  2. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks by Saxe et al., 2013
  3. Random walk initialization for training very deep feedforward networks by Sussillo and Abbott, 2014
  4. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification by He et al., 2015
  5. Data-dependent Initializations of Convolutional Neural Networks by Krähenbühl et al., 2015
  6. All you need is a good init by Mishkin and Matas, 2015
  7. Fixup Initialization: Residual Learning Without Normalization by Zhang et al., 2019
  8. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks by Frankle and Carbin, 2019