Convolutional Neural Networks Explained in Plain Language
Convolutional Neural Networks
Figure 2: Structure of a convolutional neural network
Structurally, a convolutional neural network differs quite a bit from a fully connected network: in a fully connected network every node is connected to every node in the adjacent layer, whereas in a convolutional network each node is connected to only a subset of the nodes in the previous layer.
The biggest problem with using a fully connected network on images is that the fully connected layers have far too many parameters. Too many parameters make the model prone to overfitting and slow down computation; a convolutional network is designed to reduce the parameter count.
Suppose the input is a 28*28*3 image and the first hidden layer has 500 nodes. That first layer alone already has 28*28*3*500+500 = 1,176,500 parameters. With larger images the count grows even further, and this is only the first layer.
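To make that arithmetic concrete, here is a minimal sketch in plain Python (the function name and the 224*224 example are my own, purely for illustration) that reproduces the fully connected parameter count:

```python
def fc_param_count(height, width, channels, hidden_units):
    # One weight per input value per hidden unit, plus one bias per hidden unit
    return height * width * channels * hidden_units + hidden_units

print(fc_param_count(28, 28, 3, 500))    # 1176500
print(fc_param_count(224, 224, 3, 500))  # 75264500 -- larger images blow up quickly
```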
So why can a convolutional neural network reduce the number of parameters?
The most important components of a convolutional neural network are the convolutional layer, the pooling layer, and the fully connected layer.
Convolutional Layer
Each node in a convolutional layer takes as input only a small patch of the previous layer. This is implemented with a convolution kernel, which acts as a filter: think of it as a scanning window laid over the image and slid across it according to a chosen size and stride. The computation rule is that the image pixels under the window are multiplied element-wise with the kernel's weights and summed, producing one output value per scan position. The work done by a convolutional layer can thus be understood as extracting abstract features from each small patch of the image. Several different kernels can be applied to the same image, and the number of kernels is exactly the depth of the output volume after convolution. The number of parameters in a convolutional layer is independent of the image size; it depends only on the filter's size and depth and on the number of kernels (the output depth). For the same 28*28*3 image, with kernels of size 3*3*3 and an output depth of 500, the parameter count is 3*3*3*500+500 = 14,000, far fewer than in the fully connected layer above.
Figure 3: An intuitive illustration of a convolutional layer
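The scan-multiply-sum rule described above is easy to reproduce directly. The sketch below uses NumPy with a stride of 1 and no padding (my own illustrative choices, not taken from the figure) to compute a single output channel; applying 500 such 3*3*3 kernels is what gives the 3*3*3*500+500 = 14,000 parameters quoted above.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide a (kh, kw, c) kernel over an (H, W, c) image with stride 1 and no padding;
    each output value is an element-wise product of the patch and the kernel, then a sum."""
    H, W, _ = image.shape
    kh, kw, _ = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw, :]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(28, 28, 3)
kernel = np.random.rand(3, 3, 3)           # one 3x3x3 filter
print(conv2d_single(image, kernel).shape)  # (26, 26)
```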
Pooling Layer
A pooling layer can be thought of as converting a high-resolution image into a lower-resolution one. It shrinks the size of the feature matrix very effectively, which reduces the number of parameters in the subsequent fully connected layers. This speeds up computation while also guarding against overfitting; pooling shrinks the model, improves speed, and makes the extracted features more robust.
Max pooling with a 2*2 filter and a stride of 2 is shown in the figure below:
Figure 4: Max pooling with a 2*2 filter
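As a quick check of what the figure shows, here is a minimal NumPy sketch of max pooling with a 2*2 filter and stride 2 (single channel, even height and width assumed; the input matrix is my own example):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single-channel (H, W) array;
    H and W are assumed to be even for simplicity."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 1],
              [4, 6, 5, 7],
              [8, 2, 0, 1],
              [3, 4, 6, 2]])
print(max_pool_2x2(x))
# [[6 7]
#  [8 6]]
```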
We can simply regard the convolutional and pooling layers as automatic feature extraction.
With the intuitive introduction above, we can now see why a convolutional neural network manages to reduce the number of parameters.
Compared with a fully connected network, the advantages of a convolutional network are shared weights and sparse connections. Weight sharing means the parameters depend only on the filter, not on the image size. The other reason a convolutional network has fewer parameters is sparse connectivity: each output node depends only on a small part of the input image matrix, namely the patch covered by the kernel at that position. That is what sparse connectivity means.
A convolutional neural network, then, reduces its parameters through weight sharing and sparse connectivity, which in turn helps prevent overfitting.
Training Process
The training process of a convolutional neural network can be broken down roughly into the following steps:
Step 1: Import the required libraries and load the data
import math
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import h5py
from tensorflow.python.framework import ops
from tf_utils import *

np.random.seed(1)

# Load the dataset and display one training example with its label
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
index = 0
plt.imshow(X_train_orig[index])
print("y=" + str(np.squeeze(Y_train_orig[:, index])))
plt.show()
 
Step 2: Normalize the inputs, which helps gradient descent converge faster
# Scale pixel values to [0, 1] and one-hot encode the labels (6 classes)
X_train = X_train_orig / 255.0
X_test = X_test_orig / 255.0
Y_train = convert_to_one_hot(Y_train_orig, 6)
Y_test = convert_to_one_hot(Y_test_orig, 6)
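convert_to_one_hot comes from the tf_utils helper module, whose source is not shown here. Judging from how Y_train is used later (shape (6, 1080), i.e. classes by examples), it presumably behaves like the sketch below; treat this as an assumption, not the helper's actual code:

```python
import numpy as np

def convert_to_one_hot(Y, C):
    # Assumed behavior: map integer labels of shape (1, m) to a (C, m) one-hot matrix
    return np.eye(C)[Y.reshape(-1)].T
```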
 
Step 3: Define the placeholders, parameters, and network structure
def create_placeholder(num_px, channel, n_y):
    # Placeholders for a batch of num_px x num_px x channel images and their one-hot labels
    X = tf.placeholder(tf.float32, shape=(None, num_px, num_px, channel), name='X')
    Y = tf.placeholder(tf.float32, shape=(None, n_y), name='Y')
    return X, Y

X, Y = create_placeholder(64, 3, 6)
print("X=" + str(X))
print("Y=" + str(Y))

def weight_variable(shape):
    # Weights drawn from a truncated normal with a small standard deviation
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    # Biases initialized to a small positive constant
    return tf.Variable(tf.constant(0.1, shape=shape))

def conv2d(x, W):
    # Convolution with stride 1 and 'SAME' padding keeps the spatial size unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def initialize_parameters():
    w_conv1 = weight_variable([5, 5, 3, 32])      # conv1: 5x5 filters, 3 -> 32 channels
    b_conv1 = bias_variable([32])
    w_conv2 = weight_variable([5, 5, 32, 64])     # conv2: 5x5 filters, 32 -> 64 channels
    b_conv2 = bias_variable([64])
    w_fc1 = weight_variable([16 * 16 * 64, 512])  # fully connected: flattened 16x16x64 -> 512
    b_fc1 = bias_variable([512])
    w_fc2 = weight_variable([512, 6])             # output layer: 512 -> 6 classes
    b_fc2 = bias_variable([6])
    parameters = {
        "w_conv1": w_conv1,
        "b_conv1": b_conv1,
        "w_conv2": w_conv2,
        "b_conv2": b_conv2,
        "w_fc1": w_fc1,
        "b_fc1": b_fc1,
        "w_fc2": w_fc2,
        "b_fc2": b_fc2
    }
    return parameters
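One detail worth spelling out is the 16*16*64 in w_fc1: the inputs are 64*64 images, each convolution uses 'SAME' padding (so it preserves height and width), and each 2x2 max pooling halves both dimensions, so two pooling layers take 64 -> 32 -> 16 on each side, and the second convolution leaves 64 channels. A quick check of that arithmetic:

```python
size = 64                 # input height/width
for _ in range(2):        # two conv ('SAME') + 2x2 max pool blocks
    size //= 2            # each pooling halves the spatial size
print(size, size * size * 64)  # 16 16384, i.e. w_fc1 has shape [16*16*64, 512]
```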
 
Step 4: Forward propagation
def forward_propagation(X, parameters):
    # First convolution + ReLU + max pooling block
    w_conv1 = parameters["w_conv1"]
    b_conv1 = parameters["b_conv1"]
    h_conv1 = tf.nn.relu(conv2d(X, w_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)
    # Second convolution + ReLU + max pooling block
    w_conv2 = parameters["w_conv2"]
    b_conv2 = parameters["b_conv2"]
    h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)
    # Flatten and apply the first fully connected layer
    w_fc1 = parameters["w_fc1"]
    b_fc1 = parameters["b_fc1"]
    h_pool2_flat = tf.reshape(h_pool2, [-1, 16 * 16 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
    # Optional dropout (left disabled here)
    # keep_prob = tf.placeholder(tf.float32)
    # h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    # Output layer: raw logits; softmax is applied inside the cost function
    w_fc2 = parameters["w_fc2"]
    b_fc2 = parameters["b_fc2"]
    y_conv = tf.matmul(h_fc1, w_fc2) + b_fc2
    return y_conv
 
Step 5: Cost function
def compute_cost(y_conv, Y):
    # Softmax cross-entropy between the logits and the one-hot labels, averaged over the batch
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=Y))
    return cost
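For intuition about what compute_cost measures: the softmax turns the logits into a probability distribution, and the cross-entropy penalizes the log-probability assigned to the true class. A small NumPy check of the same formula, with made-up numbers rather than values from the model:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
label = np.array([1.0, 0.0, 0.0])    # one-hot true class

probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax
loss = -np.sum(label * np.log(probs))            # cross-entropy
print(loss)  # roughly 0.417
```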
 
Step 6: Update the parameters with mini-batch gradient descent; the helper below partitions the training set into mini-batches
def random_mini_batches1(X, Y, mini_batch_size=64, seed=0):
    m = X.shape[0]  # number of training examples
    mini_batches = []
    np.random.seed(seed)
    Y = Y.T  # now (1080, 6), so examples are on the first axis

    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation, :, :, :]
    shuffled_Y = Y[permutation, :].reshape((m, Y.shape[1]))

    # Step 2: Partition (shuffled_X, shuffled_Y), minus the end case
    num_complete_minibatches = math.floor(m / mini_batch_size)  # number of full mini-batches
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : (k + 1) * mini_batch_size, :, :, :]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : (k + 1) * mini_batch_size, :]
        mini_batches.append((mini_batch_X, mini_batch_Y))

    # Handle the end case (last mini-batch smaller than mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m, :, :, :]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m, :]
        mini_batches.append((mini_batch_X, mini_batch_Y))

    return mini_batches
 
Step 7: Train the model
def model(X_train, Y_train, X_test, Y_test, learning_rate=0.001, num_epochs=20,
          minibatch_size=32, print_cost=True):
    ops.reset_default_graph()  # X_train has shape (1080, 64, 64, 3)
    tf.set_random_seed(1)      # Y_train has shape (6, 1080)
    seed = 3
    (m, num_px1, num_px2, c) = X_train.shape
    n_y = Y_train.shape[0]
    costs = []

    # Build the graph: placeholders, parameters, forward pass, cost, and optimizer
    X, Y = create_placeholder(64, 3, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)
    optm = tf.train.AdamOptimizer(learning_rate).minimize(cost)

    # Accuracy: compare predicted and true classes along axis 1
    # (forgetting the axis argument here made the loss fall while accuracy stayed at 0)
    correct_prediction = tf.equal(tf.argmax(Z3, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        for epoch in range(num_epochs):
            epoch_cost = 0
            num_minibatches = int(m / minibatch_size)
            seed += 1
            # random_mini_batches1 expects Y in its (6, 1080) layout and transposes it internally
            minibatches = random_mini_batches1(X_train, Y_train, minibatch_size, seed)
            for minibatch in minibatches:
                (minibatch_X, minibatch_Y) = minibatch
                _, minibatch_cost = sess.run([optm, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                epoch_cost += minibatch_cost / num_minibatches
            if print_cost and epoch % 2 == 0:
                # print("Epoch", '%04d' % (epoch + 1), "cost={:.9f}".format(epoch_cost))
                print("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost and epoch % 1 == 0:
                costs.append(epoch_cost)

        # Evaluate on the full training and test sets (labels transposed to (m, 6))
        print("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train.T}))
        print("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test.T}))

        # Plot the cost per epoch
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('epochs')
        plt.title("learning rate=" + str(learning_rate))
        plt.show()

        parameters = sess.run(parameters)
        return parameters

parameters = model(X_train, Y_train, X_test, Y_test)
 