python - TensorFlow: Performing this loss computation

Tags: python python-2.7 neural-network tensorflow recurrent-neural-network

My question and problem are stated below the two blocks of code.


Loss function

import numpy as np

def loss(labels, logits, sequence_lengths, label_lengths, logit_lengths):
    scores = []
    for i in xrange(runner.batch_size):
        sequence_length = sequence_lengths[i]
        for j in xrange(sequence_length):
            label_length = label_lengths[i, j]
            logit_length = logit_lengths[i, j]

            # get top k indices <==> argmax_k(labels[i, j, 0, :], label_length)
            top_labels = np.argpartition(labels[i, j, 0, :], -label_length)[-label_length:]
            top_logits = np.argpartition(logits[i, j, 0, :], -logit_length)[-logit_length:]

            scores.append(edit_distance(top_labels, top_logits))

    return np.mean(scores)

# Levenshtein distance
def edit_distance(s, t):
    n = s.size
    m = t.size
    d = np.zeros((n+1, m+1))
    d[:, 0] = np.arange(n+1)  # cost of deleting all of s
    d[0, :] = np.arange(m+1)  # cost of inserting all of t

    for j in xrange(1, m+1):
        for i in xrange(1, n+1):
            if s[i-1] == t[j-1]:  # 0-based access into s and t
                d[i, j] = d[i-1, j-1]
            else:
                d[i, j] = min(d[i-1, j] + 1,    # deletion
                              d[i, j-1] + 1,    # insertion
                              d[i-1, j-1] + 1)  # substitution

    return d[n, m]
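
A quick sanity check of the corrected implementation (the values here are illustrative, not from my data):

print(edit_distance(np.array([1, 2, 3]), np.array([1, 3])))  # 1.0: one deletion turns [1, 2, 3] into [1, 3]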

Used in

I've tried to flatten my code so that everything happens in one place. Let me know if there are typos/points of confusion.

# Placeholders for the data fed at each training step
sequence_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size,))
labels_placeholder = tf.placeholder(tf.float32, shape=(batch_size, max_feature_length, label_size))
label_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size, max_feature_length))
loss_placeholder = tf.placeholder(tf.float32, shape=(1,))

# Output projections: one for the logits, one for the predicted lengths
logit_W = tf.Variable(tf.zeros([lstm_units, label_size]))
logit_b = tf.Variable(tf.zeros([label_size]))

length_W = tf.Variable(tf.zeros([lstm_units, max_length]))
length_b = tf.Variable(tf.zeros([max_length]))

# Stacked LSTM over the input features
lstm = rnn_cell.BasicLSTMCell(lstm_units)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * layer_count)

rnn_out, state = rnn.rnn(stacked_lstm, features, dtype=tf.float32, sequence_length=sequence_lengths_placeholder)

# Project each timestep's output and concatenate along the time axis
logits = tf.concat(1, [tf.reshape(tf.matmul(t, logit_W) + logit_b, [batch_size, 1, 2, label_size]) for t in rnn_out])

logit_lengths = tf.concat(1, [tf.reshape(tf.matmul(t, length_W) + length_b, [batch_size, 1, max_length]) for t in rnn_out])

# Minimize the externally computed loss (this is what raises the error below)
optimizer = tf.train.AdamOptimizer(learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss_placeholder, global_step=global_step)

...
...
# Inside the training loop

np_labels, np_logits, np_sequence_lengths, np_label_lengths, np_logit_lengths = sess.run(
    [labels_placeholder, logits, sequence_lengths_placeholder, label_lengths_placeholder, logit_lengths],
    feed_dict=feed_dict)

# Use a distinct name so the loss() function isn't rebound on the first iteration
loss_value = loss(np_labels, np_logits, np_sequence_lengths, np_label_lengths, np_logit_lengths)
_ = sess.run([train_op], feed_dict={loss_placeholder: loss_value})

My Problem

The problem is that this returns the error:

  File "runner.py", line 63, in <module>
    train_op = optimizer.minimize(loss_placeholder, global_step=global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 188, in minimize
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 277, in apply_gradients
    (grads_and_vars,))

  ValueError: No gradients provided for any variable: <all my variables>

So I assume this is TensorFlow complaining that it can't compute the gradients of my loss, since the loss is computed by numpy, outside the scope of TF.
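
A minimal sketch (with hypothetical names) of why this fails: a fed placeholder has no dependency on any tf.Variable, so there is no path for gradients to flow back.

import tensorflow as tf

w = tf.Variable(1.0)
fed_loss = tf.placeholder(tf.float32, shape=())
print(tf.gradients(fed_loss, [w]))  # [None] -- no path from fed_loss to w,
                                    # hence "No gradients provided for any variable"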

So naturally, to fix that, I would try to implement this in TensorFlow. The issue is, my logit_lengths and label_lengths are both Tensors, so when I try to access a single element, I get back a Tensor of shape []. This is a problem when I try to use tf.nn.top_k(), which takes an Int for its k parameter.
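
For illustration (values are hypothetical), indexing into a length tensor yields a rank-0 Tensor, not a Python int:

lengths = tf.constant([3, 5, 2])
k = lengths[0]  # a Tensor of shape [], not an int
# tf.nn.top_k(values, k=k) then fails in TF versions where k must be a Python int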

Another issue is that my label_lengths is a placeholder, and since my loss value needs to be defined before the optimizer.minimize(loss) call, I also get an error saying that a value needs to be fed for the placeholder.

I'm just wondering how I can go about implementing this loss function. Or whether I'm missing something obvious.


Edit: After some further reading, I see that losses like the one I describe are usually used for validation, while training minimizes a surrogate loss in the same place the true loss would be used. Does anyone know what surrogate loss is used for an edit-distance-based scenario like mine?

Best Answer

The first thing I would do is compute the loss using tensorflow instead of numpy. That allows tensorflow to compute the gradients for you, so you can backpropagate, which means you can minimize the loss.

There is a tf.edit_distance (https://www.tensorflow.org/api_docs/python/tf/edit_distance) function in the core library.
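
A minimal sketch of calling it (the values here are illustrative; both arguments must be SparseTensors whose last dimension indexes positions within each sequence):

import tensorflow as tf

# One hypothesis sequence [1, 2] against one truth sequence [1, 2, 3]
hypothesis = tf.SparseTensor(indices=[[0, 0], [0, 1]],
                             values=[1, 2],
                             dense_shape=[1, 3])
truth = tf.SparseTensor(indices=[[0, 0], [0, 1], [0, 2]],
                        values=[1, 2, 3],
                        dense_shape=[1, 3])

dist = tf.edit_distance(hypothesis, truth, normalize=False)

with tf.Session() as sess:
    print(sess.run(dist))  # [1.0] -- one insertion turns [1, 2] into [1, 2, 3]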

"So naturally to fix that I would try and implement this in TensorFlow. The issue is, my logit_lengths and label_lengths are both Tensors, so when I try and access a single element, I'm returned a Tensor of shape []. This is an issue when I'm trying to use tf.nn.top_k() which takes an Int for its k parameter."

Could you provide more detail about why that is an issue?

Regarding python - TensorFlow: Performing this loss computation, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35325480/
