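python - How can I get the gradient of the loss with respect to the model predictions in TensorFlow?

Tags: python tensorflow keras

I want to compute the gradient of the error with respect to the predictions, dJ/dpredictions (where J is the cost function). In the function train_step() you can see that the gradients are computed with respect to the model weights.

When I try to compute the gradient like this: gradients = tape.gradient(loss, predictions), it returns None, which would mean that my loss function does not depend on the predictions.

How is that possible?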

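import tensorflow as tf
from tensorflow.keras import layers, losses, metrics, models
from tensorflow.keras.optimizers import Adam


class SimpleModel(models.Model):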
    def __init__(self, nb_classes, X_dim: int, batch_size: int):
        super().__init__()
        self.model_input_layer = layers.InputLayer(input_shape=(X_dim,), batch_size=batch_size)

        self.d1 = layers.Dense(64, name="d1")
        self.a1 = layers.Activation("relu", name="a1")

        self.d2 = layers.Dense(32, name="d2")
        self.a2 = layers.Activation("relu", name="a2")

        self.d3 = layers.Dense(nb_classes, name="d3")
        self.a3 = layers.Activation("softmax", name="a3")

        self.model_input = None
        self.d1_output = None
        self.a1_output = None
        self.d2_output = None
        self.a2_output = None
        self.d3_output = None
        self.a3_output = None

    def call(self, inputs, training=None, mask=None):
        self.model_input = self.model_input_layer(inputs)
        self.d1_output = self.d1(self.model_input)
        self.a1_output = self.a1(self.d1_output)
        self.d2_output = self.d2(self.a1_output)
        self.a2_output = self.a2(self.d2_output)
        self.d3_output = self.d3(self.a2_output)
        self.a3_output = self.a3(self.d3_output)
        return self.a3_output


model = SimpleModel(NB_CLASSES, X_DIM, BATCH_SIZE)
model.build((BATCH_SIZE, X_DIM))

optimizer = Adam()
loss_object = losses.CategoricalCrossentropy()

train_loss = metrics.Mean(name='train_loss')
test_loss = metrics.Mean(name='test_loss')


@tf.function
def train_step(X, y):
    with tf.GradientTape() as tape:
        predictions = model(X)
        loss = loss_object(y, predictions)
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))

    train_loss(loss)

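Best answer

The problem is that, by default, GradientTape only tracks trainable variables and not other tensors. You therefore need to tell it explicitly to track the tensor you are interested in. Try this: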

predictions = model(X)  # if you also need gradients for model variables, move this back into the tape context
with tf.GradientTape() as tape:
    tape.watch(predictions)
    loss = loss_object(y, predictions)
gradients = tape.gradient(loss, [predictions])

Note the use of the watch method, which tracks an arbitrary tensor. With this change the call should no longer return None.

Regarding "python - How can I get the gradient of the loss with respect to the model predictions in TensorFlow?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56521956/
