machine-learning - TensorFlow performs poorly on a binary classification task, while scikit-learn GBM works well

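I am using the following TensorFlow implementation for a binary classification task, but the accuracy is very poor. However, when I train sklearn.ensemble.GradientBoostingClassifier on the same dataset without any tuning, the results are very good. When I looked into the neural network's out-of-sample predictions, I realized that most of the predictions were the positive class.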

             precision    recall  f1-score   support

          0       0.01      1.00      0.02         8
          1       1.00      0.37      0.55      1630

avg / total       1.00      0.38      0.54      1638

Implementation of the 2-layer fully connected network:

import math
import tensorflow as tf
batch_size = 200
feature_size = len(train_features.columns)

graph = tf.Graph()
with graph.as_default():

  # Input data. For the training data, we use a placeholder that will be fed
  # at run time with a training minibatch.
  tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, feature_size))
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  # Variables.
  weights1 = tf.Variable(tf.truncated_normal([feature_size, 512]))
  biases1 = tf.Variable(tf.zeros([512]))

  weights2 = tf.Variable(tf.truncated_normal([512, 512], stddev=0.005))
  biases2 = tf.Variable(tf.zeros([512]))

  weights = tf.Variable(tf.truncated_normal([512, num_labels], stddev=0.005))
  biases = tf.Variable(tf.zeros([num_labels]))

  hidden_layer1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
  hidden_layer2 = tf.nn.relu(tf.matmul(hidden_layer1, weights2) + biases2)
  logits = tf.matmul(hidden_layer2, weights) + biases

  # Keyword arguments are required here from TensorFlow 1.0 onward.
  loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))

  # Optimizer.
  optimizer = tf.train.AdamOptimizer(0.0005).minimize(loss)

  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(logits)

  valid_hidden_layer1 = tf.nn.relu(tf.matmul(tf_valid_dataset, weights1) + biases1)
  valid_hidden_layer2 = tf.nn.relu(tf.matmul(valid_hidden_layer1, weights2) + biases2)
  valid_prediction = tf.nn.softmax(tf.matmul(valid_hidden_layer2, weights) + biases)

  test_hidden_layer1 = tf.nn.relu(tf.matmul(tf_test_dataset, weights1) + biases1)
  test_hidden_layer2 = tf.nn.relu(tf.matmul(test_hidden_layer1, weights2) + biases2)
  test_prediction = tf.nn.softmax(tf.matmul(test_hidden_layer2, weights) + biases)

Any suggestions on how to debug this?

Best answer

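sklearn's GradientBoostingClassifier is a different algorithm from a neural network. It builds an ensemble of regression trees, which need less fine-tuning than neural networks to give good performance. That is the trade-off when working with neural networks: if you want better performance than alternative algorithms such as random forests and SVMs, you need to tune the hyperparameters.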

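As it stands, the first thing you should do is initialize the biases of your ReLU units to a nonzero value. This helps keep them from entering a "dead" state in which they output 0 and receive 0 gradient forever. You should also try different learning rates: a learning rate that is too high prevents the algorithm from learning properly, and one that is too low wastes resources.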
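To make the "dead" ReLU idea concrete, here is a minimal pure-Python sketch that checks which units of one layer output 0 for every row of a batch. The helper names (`layer_outputs`, `dead_units`) and the toy numbers are hypothetical and chosen only for illustration; they are not taken from the question's data. A unit that is silent on one batch is not necessarily dead for good, so treat this as a rough diagnostic.

```python
def relu(z):
    """Rectified linear unit for a single scalar."""
    return z if z > 0.0 else 0.0


def layer_outputs(inputs, weights, biases):
    """Compute ReLU(x @ W + b) for each row x.

    weights[i][j] connects input feature i to hidden unit j.
    """
    outputs = []
    for row in inputs:
        pre_activations = [
            sum(x * weights[i][j] for i, x in enumerate(row)) + biases[j]
            for j in range(len(biases))
        ]
        outputs.append([relu(z) for z in pre_activations])
    return outputs


def dead_units(inputs, weights, biases):
    """Return indices of units whose output is 0 for every input row."""
    outputs = layer_outputs(inputs, weights, biases)
    return [
        j for j in range(len(biases))
        if all(row[j] == 0.0 for row in outputs)
    ]


# Hypothetical toy batch: 2 rows, 2 features, 2 hidden units.
inputs = [[1.0, 1.0], [2.0, 0.5]]
weights = [[-0.05, 0.3], [-0.05, -0.1]]

dead_with_zero_bias = dead_units(inputs, weights, [0.0, 0.0])      # [0]
dead_with_positive_bias = dead_units(inputs, weights, [0.2, 0.2])  # []
```

In the question's TensorFlow 1.x code, the analogous change would presumably be something like `biases1 = tf.Variable(tf.constant(0.1, shape=[512]))`. The value 0.1 is a common starting point, not something the answer prescribes.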

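You should also experiment with the number of neurons and layers. I see 512 neurons in each hidden layer, which is probably too many unless your problem is that high-dimensional and you have enough data. What do your training and test/cross-validation errors look like? You should track these during training. If your training error is low but your validation error is high, reduce the number of neurons, because you are overfitting. You can also try using just one hidden layer and see whether that helps.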
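One way to act on the advice about tracking errors is to record the training and validation error after each epoch and compare them. The sketch below assumes you already collect both as lists of error rates in [0, 1]; the function name `overfitting_report` and the `gap_threshold` default of 0.10 are arbitrary assumptions for illustration, not a standard rule.

```python
def overfitting_report(train_errors, valid_errors, gap_threshold=0.10):
    """Summarize per-epoch training and validation error histories.

    Flags overfitting when the final validation error exceeds the final
    training error by more than gap_threshold (an assumed cutoff).
    """
    if not train_errors or len(train_errors) != len(valid_errors):
        raise ValueError("need two non-empty histories of equal length")

    final_gap = valid_errors[-1] - train_errors[-1]
    # Epoch (0-based) at which the validation error was lowest.
    best_valid_epoch = min(range(len(valid_errors)),
                           key=valid_errors.__getitem__)
    return {
        "final_gap": final_gap,
        "best_valid_epoch": best_valid_epoch,
        "overfitting": final_gap > gap_threshold,
    }


# Hypothetical histories: training error keeps falling while the
# validation error bottoms out and then rises again.
train_history = [0.40, 0.20, 0.05, 0.01]
valid_history = [0.42, 0.30, 0.28, 0.33]
report = overfitting_report(train_history, valid_history)
```

If the report flags overfitting, the answer's suggestions apply: try fewer neurons per layer or a single hidden layer, then compare the histories again.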

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39021463/
