python - My neural network is not improving its accuracy

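I am training a neural network on the notMNIST dataset to recognize characters, but when I run it, the accuracy stays roughly constant from one iteration to the next.

I tried lowering the learning rate, but it made no difference. What could the problem be?

I think the problem may lie in how I use tf.nn.relu() and in how I compute the predictions, since I am still new to TensorFlow and neural networks.

Here is a screenshot of my program running; as you can see, the accuracy on the training, validation, and test sets is poor in every case.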

num_steps=801

def accuracy(predictions, labels):
    return (100.0 * np.sum(np.argmax(predictions,1) == np.argmax(labels,1))
        / predictions.shape[0])

with tf.Session(graph=graph) as session:
    #this is a one-time operation which ensure the parameters get initialized
    #we described in the graph: random weights for the matrix, zeros for the
    #biases.
    tf.global_variables_initializer().run()
    print("initialized")
    for step in range(num_steps):
        #run the computations. we tell .run() that we  want to run the optimizer,
        #and get the loss value and the training predictions returned as numpy
        #arrays.
        _, l, predictions = session.run([optimizer,loss, train_prediction])
        if (step % 100 ==0):
            print("loss at step %d: %f" % (step,l))
            print("Training accuracy: %.1f%%" % accuracy(
                predictions, train_labels[:train_subset,:]))
            #calling .eval() on valid_prediction is basically like calling run(), but
            #just to get that one numpy array. Note that it recomputes all its graph
            #dependencies.
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval(), valid_labels))
            print("test accuracy: %.1f%%" % accuracy(test_prediction.eval(),test_labels))

batch_size = 128
hidden_nodes = 1024
graph = tf.Graph()
with graph.as_default():
    #input data. For the training data, we use a placeholder that will be fed
    #at run time with a training minibatch
    tf_train_dataset = tf.placeholder(tf.float32,
                                    shape=(batch_size, image_size*image_size), name="td")
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels), name="tl")
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    #variables
    weights1 = tf.Variable(
        tf.truncated_normal([image_size*image_size, hidden_nodes]))
    biases1 = tf.Variable(tf.zeros([hidden_nodes]))
    weights2 =tf.Variable(
        tf.truncated_normal([hidden_nodes, num_labels]))
    biases2 = tf.Variable(tf.zeros([num_labels]))

    #training computation.
    relu1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
    relu_out= tf.nn.relu(tf.matmul(relu1, weights2) + biases2)

    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=relu_out,labels=tf_train_labels))

    #optimizer
    optimizer = tf.train.GradientDescentOptimizer(0.25).minimize(loss)

    #predictions for the training, validation, and test data
    train_prediction = relu_out
    valid_prediction = tf.nn.relu(tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, weights1) + biases1), weights2) + biases2) 
    test_prediction = tf.nn.relu(tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, weights1) + biases1), weights2) + biases2)

num_steps = 3001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print("initialized")
    for step in range(num_steps):
        #pick an offset within the training data, which has been randomized.
        #note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        #generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        #prepare a dictionary telling the session where to feed the minibatch.
        #the key of the dictionary is the placeholder node of the graph to be fed,
        #and the value is the numpy array to feed to it
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run(
            [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            print("minibatch loss at step %d: %f" % (step,l))
            print("minibatch accuracy: %.1f%%" % accuracy(predictions,batch_labels))
            print("validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval(), valid_labels))
            print("test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))
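As a side note on how the accuracy() helper above scores a batch, here is a minimal NumPy sketch with made-up numbers (the arrays are illustrative assumptions, not data from the question). It compares the arg-max of each prediction row with the arg-max of the matching one-hot label row. Note that a row of all zeros has its arg-max at index 0, so a network whose output layer emits only zeros is scored as always predicting the first class.

```python
import numpy as np

def accuracy(predictions, labels):
    # Percentage of rows whose highest-scoring class matches the one-hot label.
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])

# Hypothetical scores for 4 examples and 3 classes.
predictions = np.array([
    [0.1, 0.9, 0.0],   # arg-max 1
    [0.8, 0.1, 0.1],   # arg-max 0
    [0.2, 0.2, 0.6],   # arg-max 2
    [0.0, 0.0, 0.0],   # all zeros: np.argmax returns index 0
])
# One-hot labels for true classes 1, 0, 2, 2.
labels = np.eye(3)[[1, 0, 2, 2]]

score = accuracy(predictions, labels)
print(score)  # 3 of 4 rows match, so 75.0
```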

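Best answer

As I suspected, the problem was in how I used relu().

In the training computation I applied relu() twice when I should have applied it only once. After the change, it looks like this.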

logits_1 = tf.matmul(tf_train_dataset, weights1) + biases1
relu1 = tf.nn.relu(logits_1)
logits_2 = tf.matmul(relu1, weights2) + biases2

I changed the logits argument of the loss from relu_out to logits_2.

loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits_2,labels=tf_train_labels))

Finally, I changed the prediction variables so that they are computed from logits_2 instead of relu_out.

train_prediction = tf.nn.softmax(logits_2)
valid_prediction = tf.nn.softmax(
    tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, weights1) + biases1), weights2) + biases2)
test_prediction = tf.nn.softmax(
    tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, weights1) + biases1), weights2) + biases2)
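One thing worth checking about the prediction change: softmax is strictly increasing in each input, so it does not change which class has the highest score. The sketch below uses NumPy and random stand-in logits (both are assumptions made for illustration, not part of the original program) to show that an arg-max based accuracy is the same on raw logits and on softmax probabilities. That suggests the improvement comes from dropping the relu on the output layer rather than from adding softmax to the predictions.

```python
import numpy as np

def softmax(x):
    # Subtract the row maximum first for numerical stability.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Stand-in logits for 5 examples and 10 classes (random, illustrative only).
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10))
probs = softmax(logits)

# The winning class per row is identical before and after softmax.
same_argmax = np.array_equal(np.argmax(logits, axis=1),
                             np.argmax(probs, axis=1))
print(same_argmax)
```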

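As you can see, the accuracy improved to about 90%.

I am still not sure why applying relu() twice causes a problem, though. If I am not mistaken, relu() returns either 0 or the value of its argument, so shouldn't the result be the same?

If anyone knows, please answer.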
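One possible explanation, sketched in NumPy rather than TensorFlow: relu(relu(x)) does equal relu(x), but here the second relu is applied to a different value, the output layer's pre-activation, which softmax_cross_entropy_with_logits expects to be an unbounded score. Clipping it at zero discards every negative score. In the extreme case where all of an example's output pre-activations are negative, the clipped scores are all zero, softmax is uniform, and relu passes no gradient back, so the output weights get no update from that example. The numbers below are hypothetical and chosen only to show this case.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Hypothetical output-layer pre-activations for one example, 4 classes.
pre_activation = np.array([-3.0, -1.0, -5.0, -2.0])
label = np.array([0.0, 1.0, 0.0, 0.0])  # one-hot, true class is index 1

# Without the extra relu: softmax still ranks class 1 highest.
probs_plain = softmax(pre_activation)

# With the extra relu: every score is clipped to 0, so softmax is uniform.
probs_clipped = softmax(relu(pre_activation))

# Gradient of softmax cross-entropy w.r.t. its input is (probs - label).
grad_plain = probs_plain - label
# relu only passes gradient where its input was positive; none is here.
relu_mask = (pre_activation > 0).astype(float)
grad_clipped = (probs_clipped - label) * relu_mask

print(probs_plain)    # highest probability at index 1
print(probs_clipped)  # 0.25 for every class
print(grad_clipped)   # all zeros: nothing to learn from
```

With the relu removed, grad_plain is non-zero, so gradient descent can still adjust the output layer for this example.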

Regarding python - My neural network is not improving its accuracy, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47956261/
