What is batch normalization?

 - Normalizes each layer's inputs to zero mean and unit standard deviation (normalization; keeps the input distribution consistent)

 - Improves training speed (because a higher learning rate can be used)

 - Reduces dependence on the choice of weight initialization (because the outputs are normalized at every training step)

 - Can reduce the risk of overfitting

 - At test time, inputs are normalized with the moving averages stored during training (see the sketch below)
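
The following is a minimal NumPy sketch of this train/test behavior. The function name `batch_norm` and the `momentum`/`eps` values are illustrative assumptions, not any particular library's API:

```python
import numpy as np

def batch_norm(x, gamma, beta, running_mean, running_var,
               training=True, momentum=0.9, eps=1e-5):
    # x has shape [N, D]: N samples in the mini-batch, D features.
    if training:
        # Normalize with the statistics of the current mini-batch.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        # Update the moving averages that will be used at test time.
        running_mean = momentum * running_mean + (1 - momentum) * mean
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        # At test time, use the moving averages stored during training.
        mean, var = running_mean, running_var
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma (scale) and beta (shift) are learned parameters.
    return gamma * x_hat + beta, running_mean, running_var
```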

 

Batch normalization tips

 

1) Use With Different Network Types

It can be used with most network types, such as Multilayer Perceptrons, Convolutional Neural Networks, and Recurrent Neural Networks.
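
As a rough illustration, here is how batch normalization layers might appear in an MLP and a CNN using PyTorch (the layer sizes are arbitrary placeholders):

```python
import torch.nn as nn

# Batch normalization in a multilayer perceptron (per-feature statistics).
mlp = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Batch normalization in a convolutional network (per-channel statistics).
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)
```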

 

2) Probably Use Before the Activation

Batch normalization may be applied before or after the activation function. Using it after the activation may be more appropriate for s-shaped functions such as tanh or the logistic sigmoid, while using it before the activation may suit activations such as ReLU that can produce non-Gaussian distributions.
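
A short sketch of the two orderings in PyTorch (the layer widths are placeholders):

```python
import torch.nn as nn

# Before the activation (the common choice with ReLU-like activations):
bn_before = nn.Sequential(nn.Linear(128, 64), nn.BatchNorm1d(64), nn.ReLU())

# After the activation (sometimes preferred with s-shaped activations):
bn_after = nn.Sequential(nn.Linear(128, 64), nn.Tanh(), nn.BatchNorm1d(64))
```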


3) Use Large Learning Rates

Using batch normalization makes the network more stable during training. This may allow the use of much larger than normal learning rates, which in turn may further speed up the learning process.
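
For example, in PyTorch, where the learning rate of 0.1 is purely illustrative rather than a recommendation:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 10), nn.BatchNorm1d(10), nn.ReLU())
# With batch normalization, a larger-than-usual learning rate is often viable.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```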

 

4) Alternative to Data Preparation

If the mean and standard deviation of each input feature are calculated over the mini-batch instead of over the entire training dataset, then the batch size must be large enough to be representative of the range of each variable.
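
A quick check of this effect with synthetic data (the distribution parameters are arbitrary): small mini-batches give noisy estimates of a feature's true mean and standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)
feature = rng.normal(loc=5.0, scale=2.0, size=10_000)  # true mean 5, std 2

for batch_size in (4, 64, 1024):
    batch = feature[:batch_size]
    # Larger batches approximate the full-dataset statistics more closely.
    print(f"batch={batch_size:5d}  mean={batch.mean():.2f}  std={batch.std():.2f}")
```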
