python - ValueError: logits and labels must have the same shape ((None, 10) vs (None, 1))

I'm new to TensorFlow and I'm trying to build a simple model that outputs the probability of an install (the `install` column).

Here is a subset of the dataset:

{'A': {0: 12, 2: 28, 3: 26, 4: 9, 5: 36},
 'B': {0: 10, 2: 17, 3: 22, 4: 2, 5: 31},
 'C': {0: 1, 2: 0, 3: 5, 4: 0, 5: 1},
 'D': {0: 5, 2: 0, 3: 0, 4: 0, 5: 0},
 'E': {0: 12, 2: 1, 3: 4, 4: 3, 5: 1},
 'F': {0: 12, 2: 2, 3: 14, 4: 9, 5: 11},
 'install': {0: 0, 2: 0, 3: 1, 4: 0, 5: 0},
 'G': {0: 21, 2: 12, 3: 8, 4: 13, 5: 19},
 'H': {0: 0, 2: 5, 3: 1, 4: 6, 5: 5},
 'I': {0: 21, 2: 22, 3: 5, 4: 10, 5: 20},
 'J': {0: 0.0, 2: 136.5, 3: 0.0, 4: 0.1, 5: 29.5},
 'K': {0: 0.15220949263502456,
  2: 0.08139534883720931,
  3: 0.15625,
  4: 0.15384584755440725,
  5: 0.04188829787234043},
 'L': {0: 649, 2: 379, 3: 531, 4: 660, 5: 242},
 'M': {0: 0, 2: 0, 3: 0, 4: 1, 5: 1},
 'N': {0: 1, 2: 1, 3: 1, 4: 0, 5: 0},
 'O': {0: 0, 2: 1, 3: 0, 4: 1, 5: 0},
 'P': {0: 0, 2: 0, 3: 0, 4: 0, 5: 0},
 'Q': {0: 1, 2: 0, 3: 1, 4: 0, 5: 1}}

Here is the code I'm working with:

X = df.drop('install', axis=1) #data
y = df['install'] #target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 42, test_size = 0.3)

X_train = ss.fit_transform(X_train)
X_test = ss.fit_transform(X_test)

model = keras.models.Sequential([
  keras.layers.Flatten(),
  keras.layers.Dense(128, activation='softmax'),
  keras.layers.Dropout(0.2),
  keras.layers.Dense(10)
])

loss = keras.losses.BinaryCrossentropy(from_logits=True)
optim = keras.optimizers.Adam(lr=0.001)
metrics = ["accuracy"]

model.compile(loss=loss, optimizer=optim, metrics=metrics)

batch_size = 32
epoch = 5
model.fit(X_train, y_train, batch_size=batch_size, epochs=epoch, shuffle=True, verbose=1)

Can you help me understand this error? I understand that the problem is with the shapes of my X and y.

Best answer

Note: you have not said which class the `ss` object belongs to, so I will leave it out of everything discussed below.

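First, let's discuss your target, i.e. the `install` column. Judging by its values, I assume this is a binary classification problem (predicting 0 or 1) and that you want the probability of each class.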
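The two shapes in the error message are the model output, `(None, 10)` from `Dense(10)`, and the labels, `(None, 1)`, one `install` value per sample. Binary cross-entropy compares logits and labels element by element, so it needs exactly one logit per label. The sketch below illustrates that in plain Python; `bce_from_logits` is a hypothetical helper written for this illustration, not a Keras function, and the zero logits are made-up values.

```python
import math

def bce_from_logits(logits, labels):
    """Element-wise binary cross-entropy on raw logits (illustrative helper)."""
    if len(logits) != len(labels):
        raise ValueError(
            f"logits and labels must have the same shape ({len(logits)} vs {len(labels)})"
        )
    # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
    losses = [
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, labels)
    ]
    return sum(losses) / len(losses)

label_row = [1.0]        # one `install` value for one sample
ten_logits = [0.0] * 10  # what a Dense(10) head would emit for that sample
one_logit = [0.0]        # what a Dense(1) head would emit for that sample

try:
    bce_from_logits(ten_logits, label_row)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(err)

# With one logit per label the loss is defined; a zero logit gives log(2).
loss_ok = bce_from_logits(one_logit, label_row)
print(loss_ok)
```

Either fix that follows amounts to making the last layer agree with the labels and the loss: two softmax units with a sparse categorical loss, or one unit with a binary loss.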

To get those probabilities, define the model as follows.

model = keras.models.Sequential([
  keras.layers.Flatten(),
  keras.layers.Dense(128, activation='relu'),
  keras.layers.Dropout(0.2),
  keras.layers.Dense(2, activation='softmax')
])

'''
Note: I have changed the activation of the first `Dense` layer from
`softmax` to `relu`. `softmax` is not a good choice for inner layers because
it greatly reduces the information carried by each node. Using `softmax`
there does not raise any error, but it is methodologically wrong.

The next major change is the number of units in the last `Dense` layer,
from 10 to 2. What you want is the probability of the result being 0 or 1.
If the model outputs `[a, b]`, where `a` is a value corresponding to 0 and
`b` a value corresponding to 1, the `softmax` activation turns them into
probabilities. Without an activation, the values we get are called
'logits'.
'''

# Now you have to change your loss function as below
loss = tf.keras.losses.SparseCategoricalCrossentropy()

# The rest stays the same. After training the model with your code, we run a dummy prediction.

preds = model.predict(X_test)
preds
'''
This gives the results:
array([[9.9999726e-01, 2.7777487e-06],
       [9.5156413e-01, 4.8435837e-02]], dtype=float32)

This says the probability of sample 1 being 0 is `9.9999726e-01`, i.e.
`0.999..`, and the probability of it being 1 is `2.7777487e-06`, i.e.
`0.00000277..`, and the two sum to 1. The same holds for sample 2.
'''
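As a rough illustration of what the two-unit `softmax` head and the sparse categorical loss compute for one sample, here is a plain-Python sketch. The logits `[2.0, -1.0]` are made-up values, and `softmax` and `sparse_categorical_crossentropy` here are small stand-ins written for this example, not the Keras implementations.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Loss for an integer label: minus the log-probability of the true class."""
    return -math.log(probs[label])

logits = [2.0, -1.0]     # hypothetical raw outputs [a, b] for one sample
probs = softmax(logits)  # probs[0] ~ P(install=0), probs[1] ~ P(install=1)

loss_if_label_0 = sparse_categorical_crossentropy(probs, 0)
loss_if_label_1 = sparse_categorical_crossentropy(probs, 1)

print(probs, sum(probs))
print(loss_if_label_0, loss_if_label_1)
```

The labels stay as plain integers (0 or 1), which is why the sparse variant of the loss fits here without one-hot encoding `y`.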

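There is another way to do this. Since you have only one label, once you have the probability for that label you can get the probability of the other class by subtracting it from 1. You can implement it as follows: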

model = keras.models.Sequential([
  keras.layers.Flatten(),
  keras.layers.Dense(128, activation='relu'),
  keras.layers.Dropout(0.2),
  keras.layers.Dense(1, activation='sigmoid')
])

'''
The difference between `softmax` and `sigmoid` is that `softmax` is applied
across all the units of a layer jointly, while `sigmoid` is applied to each
unit individually. So you can say that `softmax` acts on the 'layer' and
`sigmoid` acts on the 'units'.

The output of the `sigmoid` is the probability of the result being 1. The
result can then be taken as 0 or 1 by comparing that probability with some
threshold, and hence we will now use a different loss, BinaryCrossentropy,
because the labels are binary (either 0 or 1).
'''

loss = keras.losses.BinaryCrossentropy() # again without logits

# We once again train the model using the rest of the code and analyze
# the outputs.

preds = model.predict(X_test)
preds
'''
This gives the results:
array([[1.6424768e-13],
       [2.0349980e-06]], dtype=float32)

So for sample 1 the probability of it being 1 is `1.6424768e-13`, and as
the only classes are 1 and 0, the probability of it being 0 is
`1 - 1.6424768e-13`. The same holds for sample 2.
'''
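The two approaches are closely related: a two-way softmax over the logits `[0, z]` assigns class 1 the same probability as `sigmoid(z)`. The sketch below checks this numerically in plain Python and also shows the thresholding step mentioned above. The logit value `z = 1.5` and the threshold `0.5` are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

z = 1.5                           # hypothetical single logit
p_sigmoid = sigmoid(z)            # P(install=1) from a one-unit head
p_softmax = softmax([0.0, z])[1]  # P(install=1) from a two-unit head

threshold = 0.5                   # assumed decision threshold
predicted_class = int(p_sigmoid >= threshold)

print(p_sigmoid, p_softmax, predicted_class)
```

So the choice between `Dense(2, activation='softmax')` and `Dense(1, activation='sigmoid')` is mostly about output format: the first returns both class probabilities, the second returns only the probability of class 1.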

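Now, coming to the answer from @Mattpats. That answer also works, but in that case you will not get probabilities as output. You will get logits instead, because no activation is used and the loss is computed on the logits by specifying `from_logits=True`. To get probabilities from that model, you have to use it like this: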

preds = model.predict(X_test)
sigmoid_preds = tf.math.sigmoid(preds).numpy()
preds, sigmoid_preds
'''
This gives the following results:
preds = array([[-51.056973],
              [-32.444508]], dtype=float32)

sigmoid_preds = array([[6.702527e-23],
                      [8.119502e-15]], dtype=float32)
'''
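The conversion from logits to probabilities can be checked by hand. The following plain-Python sketch applies the logistic function to the two logits printed above and compares the result with the `sigmoid_preds` values; `sigmoid` here is a small stand-in for `tf.math.sigmoid`, written only for this check.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

logits = [-51.056973, -32.444508]       # `preds` from the output above
reported = [6.702527e-23, 8.119502e-15]  # `sigmoid_preds` from the output above

probs = [sigmoid(z) for z in logits]

# A logit below 0 always maps to a probability below 0.5, so thresholding
# logits at 0 gives the same classes as thresholding probabilities at 0.5.
classes_from_logits = [int(z >= 0.0) for z in logits]
classes_from_probs = [int(p >= 0.5) for p in probs]

print(probs)
print(classes_from_logits, classes_from_probs)
```

Both samples come out as class 0 either way, so the sigmoid is only needed when the probability itself is wanted, as in the question.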

Regarding python - ValueError: logits and labels must have the same shape ((None, 10) vs (None, 1)), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68837104/
