
Hyperparameter optimization: how to set hyperparameters so that training goes well

How should we set the hyperparameters that need tuning, such as the learning rate, the learning rate schedule, and the regularization strength?

Here I summarize the process, taking the learning rate as an example.

First, run a random search over a rough range.

For example, a typical sampling of the learning rate would look as follows: learning_rate = 10 ** uniform(-6, 1).
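A minimal Python sketch of that sampling (assuming NumPy; the note's uniform is pseudocode for a uniform draw):

import numpy as np

# Log-uniform sampling of the learning rate over 10 ** [-6, 1],
# matching learning_rate = 10 ** uniform(-6, 1) from the note above.
def sample_learning_rate(low=-6.0, high=1.0):
    return 10 ** np.random.uniform(low, high)

# Each draw is equally likely to land in any order of magnitude,
# e.g. near 1e-5 or near 1e-1.
candidates = [sample_learning_rate() for _ in range(10)]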

Set rough values in this way. On why random sampling beats a grid:

Prefer random search to grid search. As argued by Bergstra and Bengio in Random Search for Hyper-Parameter Optimization, “randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid”. As it turns out, this is also usually easier to implement.
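As a sketch under stated assumptions, a random search loop might look like this; train_and_evaluate is a hypothetical stand-in for one's own training routine that returns validation accuracy, and the reg_strength range is likewise an assumption:

import numpy as np

def random_search(train_and_evaluate, num_trials=20):
    # Sample each hyperparameter independently on a log scale
    # instead of stepping through a fixed grid.
    best_params, best_acc = None, -np.inf
    for _ in range(num_trials):
        params = {
            "learning_rate": 10 ** np.random.uniform(-6, 1),
            "reg_strength": 10 ** np.random.uniform(-5, 0),  # assumed range
        }
        acc = train_and_evaluate(**params)  # hypothetical helper
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc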


Then narrow the search down to the range that yields good accuracy.

How should that narrowing process proceed?

Stage your search from coarse to fine. In practice, it can be helpful to first search in coarse ranges (e.g. 10 ** [-6, 1]), and then depending on where the best results are turning up, narrow the range. Also, it can be helpful to perform the initial coarse search while only training for 1 epoch or even less, because many hyperparameter settings can lead the model to not learn at all, or immediately explode with infinite cost. The second stage could then perform a narrower search with 5 epochs, and the last stage could perform a detailed search in the final range for many more epochs (for example).

When searching over 10 ** [-6, 1], do the first narrowing pass while training for at most 1 epoch.

Use about 5 epochs for the next narrowing stage.

In the final stage, narrow the range further with many more epochs (see the sketch below).
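A rough sketch of this coarse-to-fine staging, again assuming a hypothetical train_and_evaluate(lr, epochs) that returns validation accuracy; the epoch counts and the way the range shrinks are illustrative choices, not part of the original note:

import numpy as np

def staged_search(train_and_evaluate, trials_per_stage=20):
    low, high = -6.0, 1.0                  # search lr in 10 ** [low, high]
    for epochs in (1, 5, 50):              # coarse -> narrower -> detailed
        results = []
        for _ in range(trials_per_stage):
            exponent = np.random.uniform(low, high)
            acc = train_and_evaluate(lr=10 ** exponent, epochs=epochs)
            results.append((acc, exponent))
        # Shrink the exponent range around the best-performing trials.
        top = sorted(results, reverse=True)[: max(3, trials_per_stage // 5)]
        exponents = [e for _, e in top]
        low, high = min(exponents), max(exponents)
    return 10 ** ((low + high) / 2)        # a representative lr from the final range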

cs231n.github.io