python - TensorFlow/Keras: "logits and labels must have the same first dimension". How to squeeze logits or expand labels?

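I am trying to build a simple CNN classifier model. My training images (BATCH_SIZEx227x227x1) and labels (BATCH_SIZEx7) are numpy ndarrays, which are fed to the model in batches through ImageDataGenerator. The loss function I use is sparse_categorical_crossentropy (passed by name to model.compile). The problem appears when the model tries to train: the model (with a batch size of 1 here, for my simplified experiment) outputs shape [1, 7], while the labels have shape [7].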

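I am almost certain I know the cause of this problem, but I am not sure how to fix it. My hypothesis is that sparse categorical cross-entropy flattens the labels (for example, when BATCH_SIZE is 2, the ground-truth labels fed in are flattened from shape [2, 7] to [14]), which leaves me unable to fix the label shape, and all my attempts to fix the logits shape have come to nothing.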

I first tried to fix the label shape with np.expand_dims. But no matter how I expand the dimensions, the loss function always flattens the labels.
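
The shape arithmetic behind this can be sketched in plain NumPy. This is only an illustration, assuming the loss flattens whatever label tensor it receives, as hypothesized above; the label values are made up and no Keras code is involved.

```python
import numpy as np

# Hypothetical one-hot labels for a batch of 2 samples and 7 classes.
labels = np.eye(7, dtype=np.float32)[[3, 5]]   # shape (2, 7)

# Adding an axis changes the shape but not the number of elements.
expanded = np.expand_dims(labels, axis=0)      # shape (1, 2, 7)

# Assumption: the sparse loss flattens the label tensor it is given.
flat_labels = labels.reshape(-1)               # shape (14,)
flat_expanded = expanded.reshape(-1)           # shape (14,)

print(labels.shape, expanded.shape)
print(flat_labels.shape, flat_expanded.shape)
```

Under that assumption, any amount of np.expand_dims leaves the flattened labels with BATCH_SIZE * 7 entries, which cannot match a logits tensor whose first dimension is BATCH_SIZE.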

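After that, I tried adding tf.keras.layers.Flatten() at the end of the model to get rid of the extraneous first dimension, but it had no effect; I still got the exact same error. Next, I tried tf.keras.layers.Reshape((-1,)) to squeeze all the dimensions. However, that resulted in a different error: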

in sparse_categorical_crossentropy
    logits = array_ops.reshape(output, [-1, int(output_shape[-1])])
TypeError: __int__ returned non-int (type NoneType)

Question: how can I squeeze the shape of the logits so that it matches the shape of the labels produced by sparse_categorical_crossentropy?

import tensorflow as tf

### BUILD SHAPE OF THE MODEL ###

model = tf.keras.Sequential([
   tf.keras.layers.Conv2D(32, (3,3), padding='same', activation=tf.nn.relu, 
                          input_shape=(227,227,1)),
   tf.keras.layers.MaxPooling2D((2,2), strides=2),
   tf.keras.layers.Conv2D(64, (3,3), padding='same', activation=tf.nn.relu),
   tf.keras.layers.MaxPooling2D((2,2), strides=2),
   tf.keras.layers.Flatten(),
   tf.keras.layers.Dense(128, activation=tf.nn.relu),
   tf.keras.layers.Dense(7, activation=tf.nn.softmax), # final layer with node for each classification
   #tf.keras.layers.Reshape((-1,))
])

# specify the loss function and the optimizer
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

### TRAIN THE MODEL ###
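# specify training metadata
BATCH_SIZE = 1
print("about to train")
# train the model on the training data; `generator` (an ImageDataGenerator),
# train_images and train_labels are defined earlier in the script (not shown)
num_epochs = 1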
model.fit_generator(generator.flow(train_images, train_labels, batch_size=BATCH_SIZE), epochs=num_epochs)

--- Full error traceback ---

Traceback (most recent call last):
  File "classifier_model.py", line 115, in <module>
    model.fit_generator(generator.flow(train_images, train_labels, batch_size=BATCH_SIZE), epochs=num_epochs)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1426, in fit_generator
    initial_epoch=initial_epoch)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 191, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1191, in train_on_batch
    outputs = self._fit_function(ins)  # pylint: disable=not-callable
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [1,7] and labels shape [7]
     [[{{node loss/dense_1_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]

Best answer

No, you have the cause wrong. You are providing one-hot encoded labels, but sparse_categorical_crossentropy expects integer labels, because it does the one-hot encoding itself (hence "sparse").
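
A short NumPy sketch of the two label encodings may help here. The class indices below are made up for illustration, and np.argmax is just one way to turn one-hot rows into the integer labels the sparse loss expects.

```python
import numpy as np

NUM_CLASSES = 7

# Hypothetical one-hot labels for three samples (classes 2, 0 and 6).
one_hot_labels = np.eye(NUM_CLASSES, dtype=np.float32)[[2, 0, 6]]  # shape (3, 7)

# Integer ("sparse") labels: a single class index per sample.
int_labels = np.argmax(one_hot_labels, axis=-1)                    # shape (3,)

# Converting back recovers the original one-hot rows.
restored = np.eye(NUM_CLASSES, dtype=np.float32)[int_labels]

print(one_hot_labels.shape, int_labels.shape)
print(int_labels)
```

With integer labels of shape (BATCH_SIZE,), the flattened labels and the logits would share the same first dimension, so the sparse loss could presumably be kept as is.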

A simple solution is to change the loss to categorical_crossentropy instead of the sparse version. Also note that a y_true of shape (7,) is incorrect; it should be (1, 7).
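
To see that the two losses differ only in the label format they accept, here is a hand-written NumPy sketch. The two helper functions are hypothetical stand-ins written for this illustration, not the Keras implementations, and the probabilities are made up.

```python
import numpy as np

def dense_cross_entropy(y_true_one_hot, probs):
    """Per-sample cross-entropy for one-hot labels of shape (batch, classes)."""
    return -np.sum(y_true_one_hot * np.log(probs), axis=-1)

def sparse_cross_entropy(y_true_int, probs):
    """Per-sample cross-entropy for integer labels of shape (batch,)."""
    rows = np.arange(probs.shape[0])
    return -np.log(probs[rows, y_true_int])

# Hypothetical softmax outputs for a batch of 2 samples and 7 classes.
probs = np.array([
    [0.05, 0.05, 0.60, 0.10, 0.10, 0.05, 0.05],
    [0.70, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
])

one_hot = np.eye(7)[[2, 0]]    # shape (2, 7), for the categorical variant
integers = np.array([2, 0])    # shape (2,), for the sparse variant

loss_dense = dense_cross_entropy(one_hot, probs)
loss_sparse = sparse_cross_entropy(integers, probs)

print(np.allclose(loss_dense, loss_sparse))  # True
```

In the question's code, this corresponds to either keeping the (BATCH_SIZE, 7) one-hot labels and compiling with loss='categorical_crossentropy', or converting the labels to integers and keeping the sparse loss.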

Original question on Stack Overflow: https://stackoverflow.com/questions/56301426/
