In the previous installment, Amusi walked everyone through "浅入浅出TensorFlow 5 — 可视化工具TensorBoard" (A Light Introduction to TensorFlow 5: the visualization tool TensorBoard). Today we continue with linolzhang's TensorFlow course series and learn how to implement the classic networks.
Main Text
1. Introduction to the Classic Networks
First, let's introduce the most mainstream classic networks: AlexNet [2012], VGG16 [2014], GoogLeNet [2014], and ResNet [2015].
All of these networks distinguished themselves in the ILSVRC competition, and the later ones are increasingly deep and complex: ResNet uses a 152-layer structure and achieved a top-5 test error of 3.57%.
To give everyone a broader view, the author compiled a figure summarizing the current applications of CNNs across multiple domains:
Below, we still focus on the four mainstream models mentioned above, mainly for the classification task. Applications in other directions build on this foundation, and we hope to expand on them in later articles.
2. AlexNet
Building on LeNet, AlexNet added the ReLU activation and Dropout, which effectively tamed the training instability seen with large-scale data. It marked a milestone: AlexNet can be regarded as a starting point of deep learning. The network consists of 5 convolutional layers + 3 fully connected layers. Here is the network structure diagram (this classic figure shows the dual-GPU implementation):
AlexNet's contributions were substantial and can be summarized as follows:
● It replaced sigmoid with ReLU, improving the network's non-linearity;
● It introduced Dropout during training, improving the network's robustness;
● It used LRN (Local Response Normalization) to improve the network's generalization ability (LRN is rarely used today and has largely been replaced by Batch Normalization);
● It demonstrated, through Data Augmentation, how much large amounts of data help the model (a minimal augmentation sketch follows below).
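As an aside, the kind of augmentation AlexNet relied on (random crops and horizontal flips from larger training images) is easy to express with standard TensorFlow image ops. The snippet below is only an illustrative sketch and is not part of the benchmark code that follows; the 256/224 sizes match the original paper, while the brightness jitter is a simplification of AlexNet's PCA-based color augmentation.

import tensorflow as tf

def augment(image):
    # image: a single [256, 256, 3] training image
    image = tf.random_crop(image, [224, 224, 3])          # random 224x224 crop
    image = tf.image.random_flip_left_right(image)        # random horizontal mirror
    image = tf.image.random_brightness(image, max_delta=32.0 / 255.0)  # mild color jitter
    return image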
Let's look at the timing benchmark code for AlexNet under TensorFlow:
from datetime import datetime
import math
import time
import tensorflow as tf

batch_size = 32
num_batches = 100


def print_activations(t):
    # Print a tensor's name and output shape, to trace the network structure.
    print(t.op.name, ' ', t.get_shape().as_list())


def inference(images):
    parameters = []

    # conv1: 11x11 kernel, 64 output channels, stride 4
    with tf.name_scope('conv1') as scope:
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64], dtype=tf.float32, stddev=1e-1), name='weights')
        wx = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[64], dtype=tf.float32), trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(wx, biases)
        conv1 = tf.nn.relu(wx_add_b, name=scope)
        parameters += [kernel, biases]
        print_activations(conv1)

    # LRN followed by 3x3 / stride-2 max pooling
    lrn1 = tf.nn.lrn(conv1, 4, bias=1.0, alpha=0.001 / 9, beta=0.75, name='lrn1')
    pool1 = tf.nn.max_pool(lrn1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool1')
    print_activations(pool1)

    # conv2: 5x5 kernel, 192 output channels
    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype=tf.float32, stddev=1e-1), name='weights')
        wx = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[192], dtype=tf.float32), trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(wx, biases)
        conv2 = tf.nn.relu(wx_add_b, name=scope)
        parameters += [kernel, biases]
        print_activations(conv2)

    lrn2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9, beta=0.75, name='lrn2')
    pool2 = tf.nn.max_pool(lrn2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool2')
    print_activations(pool2)

    # conv3: 3x3 kernel, 384 output channels
    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384], dtype=tf.float32, stddev=1e-1), name='weights')
        wx = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32), trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(wx, biases)
        conv3 = tf.nn.relu(wx_add_b, name=scope)
        parameters += [kernel, biases]
        print_activations(conv3)

    # conv4: 3x3 kernel, 256 output channels
    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256], dtype=tf.float32, stddev=1e-1), name='weights')
        wx = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32), trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(wx, biases)
        conv4 = tf.nn.relu(wx_add_b, name=scope)
        parameters += [kernel, biases]
        print_activations(conv4)

    # conv5: 3x3 kernel, 256 output channels
    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256], dtype=tf.float32, stddev=1e-1), name='weights')
        wx = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32), trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(wx, biases)
        conv5 = tf.nn.relu(wx_add_b, name=scope)
        parameters += [kernel, biases]
        print_activations(conv5)

    pool5 = tf.nn.max_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool5')
    print_activations(pool5)
    return pool5, parameters


def time_tensorflow_run(session, target, info_string):
    # Run `target` num_batches times (after a warm-up phase) and report
    # the mean and standard deviation of the per-batch duration.
    num_steps_burn_in = 10
    total_duration = 0.0
    total_duration_squared = 0.0

    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' % (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration

    mn = total_duration / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' % (datetime.now(), info_string, num_batches, mn, sd))


def run_benchmark():
    with tf.Graph().as_default():
        # Random images are enough here: we only measure compute time, not accuracy.
        image_size = 224
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype=tf.float32, stddev=1e-1))
        pool5, parameters = inference(images)

        init = tf.global_variables_initializer()
        sess = tf.Session()
        sess.run(init)

        # Time the forward pass, then forward + backward via gradients of an L2 objective.
        time_tensorflow_run(sess, pool5, "Forward")
        objective = tf.nn.l2_loss(pool5)
        grad = tf.gradients(objective, parameters)
        time_tensorflow_run(sess, grad, "Forward-backward")


run_benchmark()
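Note that the benchmark only covers the five convolutional layers: it stops at pool5 and omits the three fully connected layers mentioned earlier. For completeness, a minimal sketch of what fc6/fc7/fc8 could look like on top of pool5 is shown below. The layer widths 4096/4096/1000 follow the original paper; the helper name fc_layers and the initialization constants are our own illustrative choices, not part of the benchmark code above.

def fc_layers(pool5, keep_prob=0.5, num_classes=1000):
    # With the 224x224 input used above, pool5 has shape [batch, 6, 6, 256].
    flat = tf.reshape(pool5, [-1, 6 * 6 * 256])
    with tf.name_scope('fc6'):
        w = tf.Variable(tf.truncated_normal([6 * 6 * 256, 4096], stddev=1e-2), name='weights')
        b = tf.Variable(tf.constant(0.1, shape=[4096]), name='biases')
        fc6 = tf.nn.dropout(tf.nn.relu(tf.matmul(flat, w) + b), keep_prob)
    with tf.name_scope('fc7'):
        w = tf.Variable(tf.truncated_normal([4096, 4096], stddev=1e-2), name='weights')
        b = tf.Variable(tf.constant(0.1, shape=[4096]), name='biases')
        fc7 = tf.nn.dropout(tf.nn.relu(tf.matmul(fc6, w) + b), keep_prob)
    with tf.name_scope('fc8'):
        w = tf.Variable(tf.truncated_normal([4096, num_classes], stddev=1e-2), name='weights')
        b = tf.Variable(tf.constant(0.0, shape=[num_classes]), name='biases')
        logits = tf.matmul(fc7, w) + b   # raw class scores; softmax is applied in the loss
    return logits

Hooking this into the benchmark would simply be logits = fc_layers(pool5) after the call to inference(images).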
3. VGG16
People believed that the deeper a network is, the more parameters it has and the more accurately it can model the data. Based on this assumption, AlexNet was extended to obtain VGG16/19, which were the deepest networks whose training could still be kept stable under the conditions of the time (for even deeper networks, see the later installments ^v^).
The name VGG16 refers to its 16 weight layers: 13 convolutional layers (grouped as 2 + 2 + 3 + 3 + 3) + 3 fully connected layers.
The VGG network structure diagram [click to view the full-size image]:
VGG19 adds one extra convolutional layer to each of the conv3, conv4 and conv5 groups (the red boxes in the figure above).
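Before reading the code, it helps to track how the spatial resolution shrinks through the five groups. The short standalone sketch below (not part of the benchmark code that follows) just verifies the arithmetic: the 3x3 'SAME' convolutions preserve the size, and each 2x2 / stride-2 pooling halves it, so a 224x224 input ends at a 7x7x512 feature map, i.e. 25088 values feeding fc6.

size = 224
for group, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
    # the group's 3x3 'SAME' convolutions keep the spatial size;
    # the trailing 2x2 / stride-2 max-pool halves it
    size //= 2
    print('group %d: %d convs, feature map after pooling: %dx%d' % (group, n_convs, size, size))
# final map: 7 x 7 x 512 = 25088 values, which is what fc6 consumes

The VGG16 benchmark code itself follows the same pattern as the AlexNet example above: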
from datetime import datetime
import math
import time
import tensorflow as tf

batch_size = 32
num_batches = 100


def conv(input, name, kh, kw, n_out, dh, dw, p):
    # Convolution + bias + ReLU helper: kh x kw kernel, dh x dw stride,
    # n_out output channels; the created variables are appended to the list p.
    n_in = input.get_shape()[-1].value

    with tf.name_scope(name) as scope:
        kernel = tf.get_variable(scope + "w", shape=[kh, kw, n_in, n_out], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer_conv2d())
        conv = tf.nn.conv2d(input, kernel, [1, dh, dw, 1], padding='SAME')
        bias_init_val = tf.constant(0.0, shape=[n_out], dtype=tf.float32)
        biases = tf.Variable(bias_init_val, trainable=True, name='biases')
        wx_add_b = tf.nn.bias_add(conv, biases)
        relu = tf.nn.relu(wx_add_b, name=scope)
        p += [kernel, biases]
        return relu


def fc(input, name, n_out, p):
    # Fully connected layer + ReLU helper.
    n_in = input.get_shape()[-1].value

    with tf.name_scope(name) as scope:
        kernel = tf.get_variable(scope + "w", shape=[n_in, n_out], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer())
        biases = tf.Variable(tf.constant(0.1, shape=[n_out], dtype=tf.float32), name="biases")
        relu = tf.nn.relu_layer(input, kernel, biases, name=scope)
        p += [kernel, biases]
        return relu


def max_pool(input, name, kh, kw, dh, dw):
    return tf.nn.max_pool(input, ksize=[1, kh, kw, 1], strides=[1, dh, dw, 1], padding='SAME', name=name)


def inference(input, keep_prob):
    p = []
    # Group 1: two 3x3 convolutions with 64 channels, then 2x2 max pooling
    conv1_1 = conv(input, name="conv1_1", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
    conv1_2 = conv(conv1_1, name="conv1_2", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
    pool1 = max_pool(conv1_2, name="pool1", kh=2, kw=2, dw=2, dh=2)

    # Group 2: two 3x3 convolutions with 128 channels
    conv2_1 = conv(pool1, name="conv2_1", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
    conv2_2 = conv(conv2_1, name="conv2_2", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
    pool2 = max_pool(conv2_2, name="pool2", kh=2, kw=2, dw=2, dh=2)

    # Group 3: three 3x3 convolutions with 256 channels
    conv3_1 = conv(pool2, name="conv3_1", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    conv3_2 = conv(conv3_1, name="conv3_2", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    conv3_3 = conv(conv3_2, name="conv3_3", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    pool3 = max_pool(conv3_3, name="pool3", kh=2, kw=2, dw=2, dh=2)

    # Group 4: three 3x3 convolutions with 512 channels
    conv4_1 = conv(pool3, name="conv4_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv4_2 = conv(conv4_1, name="conv4_2", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv4_3 = conv(conv4_2, name="conv4_3", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    pool4 = max_pool(conv4_3, name="pool4", kh=2, kw=2, dw=2, dh=2)

    # Group 5: three 3x3 convolutions with 512 channels
    conv5_1 = conv(pool4, name="conv5_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv5_2 = conv(conv5_1, name="conv5_2", kh=3