
"Hello world" of deep learning

Three Steps

  • Step 1: define a set of functions
  • Step 2: evaluate the goodness of a function
  • Step 3: pick the best function

The task in the lecture is done with Keras:

from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()                                # declare a model
model.add(Dense(units=500, input_dim=28*28))        # Dense: fully connected layer, 784 inputs -> 500 units
model.add(Activation('sigmoid'))                    # activation applied to the first layer
model.add(Dense(units=500, activation='relu'))
model.add(Dense(units=10, activation='softmax'))    # 10 outputs, each between 0 and 1, summing to 1
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])  # goodness of function
model.fit(x_train, y_train, batch_size=100, epochs=20)  # pick the best function
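
The call to `model.fit` above assumes that the training inputs are flattened 28×28 images and that `y_train` is one-hot encoded, which is what `categorical_crossentropy` expects. Below is a minimal NumPy sketch of that preparation. It uses random stand-in data rather than the real dataset, and the helper name `prepare`, the variable name `x_train`, and the array sizes are assumptions made for illustration.

```python
import numpy as np

def prepare(images, labels, num_classes=10):
    """Flatten images to vectors scaled to [0, 1] and one-hot encode integer labels."""
    x = images.reshape(len(images), -1).astype("float32") / 255.0
    y = np.eye(num_classes, dtype="float32")[labels]  # row k of the identity = one-hot for class k
    return x, y

# Stand-in data: 5 fake 28x28 grayscale images with integer labels in 0..9.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28))
labels = np.array([3, 0, 9, 1, 3])

x_train, y_train = prepare(images, labels)
print(x_train.shape, y_train.shape)  # (5, 784) (5, 10)
```

The flattened width 784 matches `input_dim=28*28`, and the 10 one-hot columns match the 10 softmax outputs.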

Mini-batch

Pseudo code:

  1. Randomly initialize network parameters
  2. while (!all mini-batches have been picked)

    pick the $i^\text{th}$ batch, compute its loss $L' = C^1 + C^{31} + \cdots$ (the summed cost of the examples in that batch), and update parameters once

One complete run of the while-loop, that is, one pass in which every mini-batch has been picked once, is one epoch.

  • If batch_size = 1: SGD (stochastic gradient descent)
  • If batch_size = #training data: (full-batch) gradient descent
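
The pseudo code and the two special cases can be sketched in NumPy as follows. This is an illustrative toy, not the lecture's code: it fits a single slope `w` to noise-free data, uses the mean (rather than the sum) of the per-example costs as the batch loss, and the function name `train` and all hyperparameters are assumptions.

```python
import numpy as np

def train(x, y, batch_size, epochs=1, lr=0.1, seed=0):
    """Fit y ~ w * x with mini-batch gradient descent; return (w, number of updates)."""
    rng = np.random.default_rng(seed)
    w = rng.normal()                      # 1. randomly initialize the parameter
    updates = 0
    for _ in range(epochs):               # one pass over all mini-batches = one epoch
        order = rng.permutation(len(x))   # shuffle, then cut into mini-batches
        for start in range(0, len(x), batch_size):
            idx = order[start:start + batch_size]
            err = w * x[idx] - y[idx]
            grad = 2.0 * np.mean(err * x[idx])  # gradient of this batch's mean squared error
            w -= lr * grad                      # update parameters once per batch
            updates += 1
    return w, updates

x = np.linspace(-1.0, 1.0, 100)
y = 3.0 * x                               # toy data with true slope 3

_, n_sgd = train(x, y, batch_size=1)      # batch_size = 1: SGD
_, n_mini = train(x, y, batch_size=10)    # mini-batch
_, n_full = train(x, y, batch_size=100)   # batch_size = #training data: full-batch GD
print(n_sgd, n_mini, n_full)              # 100 10 1 updates in one epoch

w, _ = train(x, y, batch_size=10, epochs=50)
print(round(w, 3))                        # close to the true slope 3.0
```

With 100 training examples, one epoch performs 100 updates at batch_size = 1, 10 updates at batch_size = 10, and a single update at batch_size = 100.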

Matrix Operation

Stochastic Gradient Descent

Mini-batch

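GPU acceleration works because, once the examples are grouped into a batch, they can be stacked into a matrix and computed in parallel. Without batching (that is, processing one example at a time), there is nothing to parallelize across examples, so the GPU cannot provide this speedup.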
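
A small NumPy sketch of the idea, run on the CPU: processing a mini-batch one example at a time and processing it as a single matrix product give the same activations, but the second form is one large operation that parallel hardware can exploit. The layer sizes follow the network above; the random weights and the names `W`, `b`, and `X` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(500, 784))      # weights of one fully connected layer
b = rng.normal(size=500)             # biases of that layer
X = rng.normal(size=(100, 784))      # a mini-batch of 100 flattened inputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One example at a time: 100 separate matrix-vector products.
one_by_one = np.stack([sigmoid(W @ x + b) for x in X])

# Whole mini-batch at once: a single matrix-matrix product.
batched = sigmoid(X @ W.T + b)

print(one_by_one.shape, batched.shape)      # (100, 500) (100, 500)
print(np.allclose(one_by_one, batched))     # True
```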