python - Face recognition with a TensorFlow convolutional neural network reaches only 0.05 accuracy

Tags: python machine-learning tensorflow deep-learning conv-neural-network

OS: Windows 10

Face database: Yale face database (15 people, about 160 images in total)

Programming language: Python with TensorFlow

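I am using TensorFlow to do face recognition with a CNN, but the accuracy is only about 0.05. (The convolutional layers use no padding.) The network structure is: Conv1 --> max pooling --> Conv2 --> max pooling --> fully connected (15 outputs)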

The code is below. Some helper definitions follow the TensorFlow examples:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="VALID") # no padding

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                         strides=[1, 2, 2, 1], padding="VALID")  # no padding 

First conv layer:

# first layer
SHAPE = [None, 64, 64, 1]
Y_SHAPE = [None, 15]
x = tf.placeholder(tf.float32, shape=SHAPE, name="x_data")
y = tf.placeholder(tf.float32, shape=Y_SHAPE, name="y_true")

W1_shape = [7, 7, 1, 6]
b1_shape = [6]
with tf.name_scope("Conv1"):
    W_conv1 = weight_variable(W1_shape)
    b_conv1 = bias_variable(b1_shape)
    tf.summary.histogram("weights", W_conv1)
#     tf.summary.histogram("bias", b_conv1)

    a_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
    a_pool1 = max_pool_2x2(a_conv1)

    # a_pool1 shape : (29, 29, 6)



# second layer
W2_shape = [8, 8, 6, 16]
b2_shape = [16]
with tf.name_scope("Conv2"):
    W_conv2 = weight_variable(W2_shape)
    b_conv2 = bias_variable(b2_shape)
    tf.summary.histogram("weights", W_conv2)
#     tf.summary.histogram("bias", b_conv2)

    a_conv2 = tf.nn.relu(conv2d(a_pool1, W_conv2) + b_conv2)
    a_pool2 = max_pool_2x2(a_conv2)

    # a_pool2 shape (11, 11, 16)



# full connect
W_out_shape = [11*11*16, 15]
b_out_shape = [15]

with tf.name_scope("sigmoid"):
    W_out= weight_variable(W_out_shape)
    b_out = bias_variable(b_out_shape)

    a_pool2_flat = tf.reshape(a_pool2, [-1, 11*11*16])
    z_out = tf.matmul(a_pool2_flat, W_out) + b_out

    a_out = tf.nn.sigmoid(z_out)



# train and evaluate
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=a_out)

batch_size = 40
train_index = np.arange(90)

train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

correct_prediction = tf.equal(tf.argmax(a_out, 1), tf.argmax(y, 1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):    # epochs=1000
        # index shuffle
        np.random.shuffle(train_index)
        batch_train = train_data[train_index[:batch_size]] 
        batch_labels = train_labels[train_index[:batch_size]]

        if i % 10 == 0:   # print accuracy every ten iterations
            train_accuracy = accuracy.eval(feed_dict={x:batch_train, y:batch_labels})
            print("step %d, train accuracy %g"%(i, train_accuracy))

        _, loss_ = sess.run([train_step, loss], feed_dict={x:batch_train, y:batch_labels})

    test_index = np.arange(74)
    np.random.shuffle(test_index)
    print("test accuracy:", sess.run(accuracy, feed_dict={x:test_data[test_index], y:test_labels[test_index]}))

writer.close()
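
The shapes noted in the code comments (29x29x6 after the first pooling step, 11x11x16 after the second) can be checked with a few lines of arithmetic. This is only a sketch: `valid_out` is a hypothetical helper, not a TensorFlow function, and it assumes the usual output-size rule for "VALID" (no padding) convolution and pooling.

```python
def valid_out(size, kernel, stride=1):
    """Output length along one axis for a 'VALID' (no padding) conv or pool."""
    return (size - kernel) // stride + 1

size = 64                                    # input images are 64x64
size = valid_out(size, kernel=7)             # Conv1, 7x7 kernel, stride 1
size = valid_out(size, kernel=2, stride=2)   # 2x2 max pool, stride 2
pool1 = size

size = valid_out(size, kernel=8)             # Conv2, 8x8 kernel, stride 1
size = valid_out(size, kernel=2, stride=2)   # 2x2 max pool, stride 2
pool2 = size

flat_features = pool2 * pool2 * 16           # 16 channels after Conv2
print(pool1, pool2, flat_features)
```

The printed values agree with the `11*11*16` used for the fully connected layer, so the layer shapes themselves are consistent.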

The following images show my output:

[image]

[image]

Best answer

The main problem is that the network structure is incorrect:

a_out = tf.nn.sigmoid(z_out)
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=a_out)

should be

a_out = z_out
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=a_out)
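
To see what the extra sigmoid does to the loss, here is a minimal NumPy sketch of a softmax cross-entropy computed from raw scores. It is an illustration, not TensorFlow's implementation; the function names and the example scores (`z_out` with values of 10 and -10) are assumptions made up for the demonstration.

```python
import numpy as np

def softmax_cross_entropy(labels, logits):
    """Per-example cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

num_classes = 15
labels = np.eye(num_classes)[[3]]          # one example whose true class is 3
z_out = np.full((1, num_classes), -10.0)   # hypothetical, very confident scores
z_out[0, 3] = 10.0

loss_raw = softmax_cross_entropy(labels, z_out)[0]
loss_squashed = softmax_cross_entropy(labels, sigmoid(z_out))[0]
print("raw scores as logits:    ", loss_raw)
print("sigmoid outputs as logits:", loss_squashed)
```

With raw scores the loss can approach zero. With sigmoid outputs it stays well above zero even for a confident, correct prediction, so the optimizer keeps pushing on units that are already saturated.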

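softmax_cross_entropy_with_logits applies softmax internally, so applying a sigmoid beforehand makes no sense (and makes training much harder, if not impossible). In your current setup the sigmoid confines every value passed as logits to (0, 1), so the probability of a single class lies in roughly [0.026, 0.163] instead of [0, 1]. Even with fully saturated sigmoids (1 for the true class, 0 for the other 14), the softmax gives:

exp(1) / (14*exp(0) + exp(1)) ~= 0.163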
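
The size of this cap depends on the range the squashed values can take. The sketch below is a small NumPy check, with hypothetical helper names, that computes the largest probability a single class can receive out of 15 for two ranges: [0, 1], the output range of a sigmoid, and [-1, 1], a tanh-like range shown for comparison.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def max_true_class_prob(low, high, num_classes=15):
    """Largest softmax probability one class can get when every input is
    confined to [low, high]: that class sits at `high`, all others at `low`."""
    v = np.full(num_classes, float(low))
    v[0] = high
    return softmax(v)[0]

p_sigmoid_range = max_true_class_prob(0.0, 1.0)    # e / (14 + e), about 0.163
p_tanh_range = max_true_class_prob(-1.0, 1.0)      # e / (14/e + e), about 0.345
print(p_sigmoid_range, p_tanh_range)
```

In both cases the cap is far below 1, which is why the squashing has to be removed before the values are passed as logits.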

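Other issues:

  • The learning rate appears to be chosen completely arbitrarily; you may want to switch to Adam, which is less sensitive to a poorly chosen learning rate.
  • The initialization scheme also looks arbitrary; you may want to reduce the standard deviation (stddev).
  • Print the loss instead of the accuracy. If it does not decrease on the training set, there is a bug in your training. If it does decrease, but too slowly, tune the learning rate, and so on.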
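
The three suggestions above can be tried together on a toy problem. The sketch below is not the question's CNN: it is a plain softmax classifier written in NumPy on synthetic data (an assumption standing in for the Yale images), with a hand-written Adam update, a small initialization stddev of 0.01, and the training loss printed as it goes. The step size `lr = 1e-2` is an arbitrary choice for this toy case; `beta1`, `beta2` and `eps` are the commonly used Adam defaults. In the question's TensorFlow 1.x code, the corresponding change would be to use `tf.train.AdamOptimizer` in place of `tf.train.GradientDescentOptimizer` and to print `loss_`.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, num_features, num_samples = 15, 20, 90

# Synthetic stand-in data (NOT the Yale faces): one noisy cluster per class.
centers = rng.normal(size=(num_classes, num_features))
targets = np.arange(num_samples) % num_classes
inputs = centers[targets] + 0.1 * rng.normal(size=(num_samples, num_features))
one_hot = np.eye(num_classes)[targets]

# Small initial weights (stddev 0.01 rather than 0.1).
W = rng.normal(scale=0.01, size=(num_features, num_classes))
b = np.zeros(num_classes)

def mean_loss_and_grads(W, b):
    logits = inputs @ W + b                               # raw scores, no sigmoid
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss = -(one_hot * log_probs).sum(axis=1).mean()
    d_logits = (np.exp(log_probs) - one_hot) / num_samples
    return loss, inputs.T @ d_logits, d_logits.sum(axis=0)

# Adam state: running first and second moment estimates per parameter.
lr, beta1, beta2, eps = 1e-2, 0.9, 0.999, 1e-8
params = [W, b]
m = [np.zeros_like(p) for p in params]
v = [np.zeros_like(p) for p in params]

losses = []
for step in range(1, 201):
    loss, grad_W, grad_b = mean_loss_and_grads(W, b)
    losses.append(loss)
    if step == 1 or step % 20 == 0:
        print("step %d, training loss %g" % (step, loss))
    for i, g in enumerate([grad_W, grad_b]):
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        m_hat = m[i] / (1 - beta1 ** step)                 # bias correction
        v_hat = v[i] / (1 - beta2 ** step)
        params[i] -= lr * m_hat / (np.sqrt(v_hat) + eps)   # updates W and b in place
```

The first printed loss should be close to ln(15), about 2.71, which is what a 15-class classifier gives when it predicts uniformly. A loss that starts far from that value, or that does not fall on the training set, points to a bug rather than a tuning problem.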

For "python - Face recognition with a TensorFlow convolutional neural network reaches only 0.05 accuracy", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46892880/
