python - Building a custom loss function in Keras

I am trying to write a custom loss function in Keras, taken from this paper. Specifically, the loss I want to create is this:

[Image: the loss formula from the paper; not reproduced here]

This is a ranking loss for multi-class, multi-label problems. The details are as follows:

Y_i = set of positive labels for sample i
Y_i^bar = set of negative labels for sample i (complement of Y_i)
c_j^i = prediction on i^th sample at label j

In what follows, y_true and y_pred both have dimension 18.

def multilabel_loss(y_true, y_pred):
    """ Multi-label loss function.

    More complete description here...

    """    
    zero = K.tf.constant(0, dtype=tf.float32)
    where_one = K.tf.not_equal(y_true, zero)
    where_zero = K.tf.equal(y_true, zero)

    Y_p = K.tf.where(where_one)
    Y_n = K.tf.where(where_zero)

    n = K.tf.shape(y_true)[0]
    loss = 0

    for i in range(n):
        # Here i is the ith sample; for a specific i, I find all locations
        # where Y_p, Y_n belong to the ith sample; axis 0 denotes
        # the sample index space
        Y_p_i = K.tf.equal(Y_p[:,0], K.tf.constant(i, dtype=tf.int64))
        Y_n_i = K.tf.equal(Y_n[:,0], K.tf.constant(i, dtype=tf.int64))

        # Here I plug in those locations to get the values
        Y_p_i = K.tf.where(Y_p_i)
        Y_n_i = K.tf.where(Y_n_i)

        # Here I get the indices of the values above
        Y_p_ind = K.tf.gather(Y_p[:,1], Y_p_i)
        Y_n_ind = K.tf.gather(Y_n[:,1], Y_n_i)

        # Here I compute Y_i and its complement
        yi = K.tf.shape(Y_p_ind)[0]
        yi_not = K.tf.shape(Y_n_ind)[0]

        # The value to normalize the inner summation
        normalizer = K.tf.divide(1, K.tf.multiply(yi, yi_not))

        # This creates a matrix of all combinations of indices k, l from the 
        # above equation; then it is reshaped
        prod = K.tf.map_fn(lambda x: K.tf.map_fn(lambda y: K.tf.stack( [ x, y ] ), Y_n_ind ), Y_p_ind )
        prod = K.tf.reshape(prod, [-1, 2, 1])
        prod = K.tf.squeeze(prod)

        # Next, the indices are fed into the corresponding prediction
        # matrix, where the values are then exponentiated and summed
        y_pred_gather = K.tf.gather(y_pred[i,:].T, prod)
        s = K.tf.cast(K.sum(K.tf.exp(K.tf.subtract(y_pred_gather[:,0], y_pred_gather[:,1]))), tf.float64)
        loss = loss + K.tf.multiply(normalizer, s)
    return loss

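My questions are as follows:

When I compile the graph, I get an error around n, namely TypeError: 'Tensor' object cannot be interpreted as an integer. I have looked around but cannot find a way to prevent it. My hunch is that I need to avoid the for loop altogether, which brings me to my second question.
How can I write this loss without a for loop? I am fairly new to Keras and spent a few hours writing this custom loss myself. I would like to write it more concisely. What stops me from using matrices throughout is that Y_i and its complement can have a different size for each i.

Please let me know if you would like me to elaborate on my code. Happy to do so.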

Update 3

Based on @Parag S. Chandakkar's suggestions, I now have the following:

def multi_label_loss(y_true, y_pred):

    # set consistent casting
    y_true = tf.cast(y_true, dtype=tf.float64)
    y_pred = tf.cast(y_pred, dtype=tf.float64)

    # this get all positive predictions and negative predictions
    # it also exponentiates them in their respective Y_i classes
    PT = K.tf.multiply(y_true, tf.exp(-y_pred))
    PT_complement = K.tf.multiply((1-y_true), tf.exp(y_pred))

    # this step gets the weight vector that we'll normalize by
    m = K.shape(y_true)[0]
    W = K.tf.multiply(K.sum(y_true, axis=1), K.sum(1-y_true, axis=1))
    W_inv = 1./W
    W_inv = K.reshape(W_inv, (m,1))

    # this step computes the outer product of two tensors
    def outer_product(inputs):
        """
        inputs: list of two tensors of equal dimensions
            for which you need to compute the outer product
        """
        x, y = inputs

        batchSize = K.shape(x)[0]

        outerProduct = x[:,:, np.newaxis] * y[:,np.newaxis,:]
        outerProduct = K.reshape(outerProduct, (batchSize, -1))

        # returns a flattened batch-wise set of tensors
        return outerProduct

    # set up inputs to outer product
    inputs = [PT, PT_complement]

    # compute final loss
    loss = K.sum(K.tf.multiply(W_inv, outer_product(inputs)))

    return loss

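Best answer

This is not so much an answer as my thought process, which should help you write concise code.

First, I don't think you should worry about that error for now, because your code will probably look very different once you eliminate the for loop.

Now, I have not read the paper, but the predictions c_j^i should be the raw values from the last non-softmax layer (that is my assumption).

So you can add an extra exp layer and compute exp(c_j^i) for each prediction. The for loop arises because of the summation. If you look closely, all it does is first form pairs of all labels and then subtract their corresponding predictions. So first express the exponentiated difference as a product, exp(c_l^i) * exp(-c_k^i). To see what is going on, take a simple example.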

import numpy as np
a = [1, 2, 3]
a = np.reshape(a, (3,1))

Following the explanation above, you want the following result.

r1 = sum([1 * 2, 1 * 3, 2 * 3]) = sum([2, 3, 6]) = 11

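The same result can be obtained from a matrix product (here, the outer product of a with itself), which is one way to eliminate the for loop.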

r2 = a * a.T
# r2 = array([[1, 2, 3],
#             [2, 4, 6],
#             [3, 6, 9]])

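Extract the upper triangular part (excluding the diagonal), i.e. 2, 3, 6, and sum it to get 11, which is the result you want. There may be some differences in your case; for example, you may need to form all pairs exhaustively. You should still be able to put it into matrix-multiplication form.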
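The toy example can be checked end to end with a few lines of NumPy. This is only a sketch of the arithmetic outside of any Keras graph; the use of np.triu with k=1 to drop the diagonal is my choice and is not named in the answer.

```python
import numpy as np

a = np.reshape([1, 2, 3], (3, 1))

# Outer product via broadcasting: entry (p, q) is a[p] * a[q].
r2 = a * a.T

# k=1 keeps only the entries strictly above the diagonal, i.e. pairs with p < q.
upper = np.triu(r2, k=1)

# Sum of 1*2, 1*3 and 2*3.
r1 = upper.sum()
print(r1)  # 11
```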

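Once the summation term is taken care of, the normalization term is easy to compute if you precompute the quantities |Y_i| and |\bar{Y_i}| for each sample i. Pass them in as input arrays and on to the loss as part of y_pred. The final summation over i will be done by Keras.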

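Edit 1: Even if |Y_i| and |\bar{Y_i}| take different values, once they are precomputed you should be able to construct a general formula for extracting the upper triangular part regardless of the matrix size.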

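Edit 2: I don't think you fully understood me. In my opinion, NumPy should not be used in the loss function at all. This can (mostly) be done with TensorFlow alone. I will explain again while keeping my earlier explanation.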

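I now understand that there is a Cartesian product between the positive and negative labels (of sizes |Y_i| and |\bar{Y_i}| respectively). First, put a layer of exp after the raw predictions (in TF, not in NumPy).
Next, you need to know which of the 18 dimensions of y_true correspond to positive labels and which to negative ones. If you are using one-hot encoding, you can find this on the fly with tf.where and tf.gather (see here).
At this point you know which indices j (in c_j^i) correspond to positive and negative labels. All you need to do is compute \sum_{(k, l)} exp(c_k^i) * (1 / exp(c_l^i)) over the pairs (k, l). To do that, form one tensor consisting of exp(c_k^i) for all k (call it A) and another consisting of exp(c_l^i) for all l (call it B). Then compute sum(A * B^T). If you take the Cartesian product, you do not need to extract the upper triangular part. You should now have the result of the innermost summation.
Contrary to what I said earlier, I think you can also compute the normalization factor on the fly from y_true.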
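To make the single-sample case concrete, here is a minimal NumPy sketch that checks the outer-product form against an explicit double loop. It is only a check of the arithmetic, not code for the loss function itself: np.where and integer indexing stand in for tf.where and tf.gather. The sample values and the names pos_terms and neg_terms are hypothetical. The sign convention, exp(-c_k) for positive labels and exp(c_l) for negative labels, is an assumption taken from the question's updated code; swap the signs if the paper's formula differs.

```python
import numpy as np

# Hypothetical 0/1 labels and raw scores for a single sample with 5 labels.
y_true = np.array([1.0, 0.0, 1.0, 0.0, 0.0])
y_pred = np.array([0.5, -1.0, 2.0, 0.3, -0.2])

pos_idx = np.where(y_true != 0)[0]   # indices k in Y_i
neg_idx = np.where(y_true == 0)[0]   # indices l in the complement of Y_i

pos_terms = np.exp(-y_pred[pos_idx])  # exp(-c_k) for every positive label
neg_terms = np.exp(y_pred[neg_idx])   # exp(c_l) for every negative label

# Cartesian product of positives and negatives as an outer product.
pair_terms = pos_terms[:, np.newaxis] * neg_terms[np.newaxis, :]

# Normalizer computed on the fly from the label counts.
normalizer = 1.0 / (len(pos_idx) * len(neg_idx))
loss_vectorized = normalizer * pair_terms.sum()

# Reference: the explicit double loop that the outer product replaces.
loss_loop = 0.0
for k in pos_idx:
    for l in neg_idx:
        loss_loop += np.exp(-(y_pred[k] - y_pred[l]))
loss_loop *= normalizer

print(np.isclose(loss_vectorized, loss_loop))
```

Because every positive is paired with every negative, the sum over the outer product also equals pos_terms.sum() * neg_terms.sum(), so the full pair matrix never has to be materialized.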

You just need to figure out how to extend this to three dimensions so that it handles multiple samples.
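One way to sketch the three-dimensional extension is to mask instead of gather, so every sample keeps the same shape even though the number of positive labels differs. The NumPy version below only checks the arithmetic against a per-sample loop. The function names and test data are hypothetical, the sign convention follows the question's updated code, and it assumes every sample has at least one positive and one negative label (otherwise the normalizer divides by zero). Each operation has a direct TensorFlow counterpart (tf.exp, tf.reduce_sum, broadcasting with tf.newaxis).

```python
import numpy as np

def ranking_loss_batch(y_true, y_pred):
    """Per-sample ranking loss for arrays of shape (batch, labels)."""
    y_true = y_true.astype(np.float64)
    y_pred = y_pred.astype(np.float64)
    pos = y_true * np.exp(-y_pred)          # exp(-c_k) at positive labels, 0 elsewhere
    neg = (1.0 - y_true) * np.exp(y_pred)   # exp(c_l) at negative labels, 0 elsewhere
    # (batch, labels, 1) * (batch, 1, labels) -> (batch, labels, labels)
    pairs = pos[:, :, np.newaxis] * neg[:, np.newaxis, :]
    n_pos = y_true.sum(axis=1)
    n_neg = (1.0 - y_true).sum(axis=1)
    return pairs.sum(axis=(1, 2)) / (n_pos * n_neg)

def ranking_loss_loop(y_true, y_pred):
    """Reference implementation with explicit loops over samples and label pairs."""
    losses = []
    for t, c in zip(y_true, y_pred):
        pos_idx = np.where(t != 0)[0]
        neg_idx = np.where(t == 0)[0]
        total = sum(np.exp(-(c[k] - c[l])) for k in pos_idx for l in neg_idx)
        losses.append(total / (len(pos_idx) * len(neg_idx)))
    return np.array(losses)

# Hypothetical batch: the number of positive labels differs per sample.
y_true = np.array([[1, 0, 1, 0], [0, 0, 0, 1], [1, 1, 1, 0]])
y_pred = np.array([[0.5, -1.0, 2.0, 0.3],
                   [0.1, 0.2, -0.4, 1.5],
                   [1.0, 0.0, -1.0, 0.5]])

batch_losses = ranking_loss_batch(y_true, y_pred)
loop_losses = ranking_loss_loop(y_true, y_pred)
print(np.allclose(batch_losses, loop_losses))
```

The masked entries contribute zero to the pair sum, which is why the differing sizes of Y_i and its complement no longer prevent a single batched expression.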

Note: Using NumPy is probably possible via tf.py_func, but it does not seem necessary here. Just use TF's own functions.

The original Stack Overflow question for "python - Building a custom loss function in Keras" can be found at: https://stackoverflow.com/questions/51794398/
