python - TensorFlow: Implementing Spearman distance as an objective function

Tags: python ranking tensorflow

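To make my question reproducible, I use the iris dataset (10 arbitrary rows, all columns standardized) and a minimal neural network model that predicts petal width from sepal length, sepal width, and petal length. I implemented it by modifying an MNIST example I found on the internet. Scroll down to see my question!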

iris.csv

"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
0.0551224773430978,-0.380319414627833,-0.335895230408602,-0.548226210538025,"versicolor"
1.48830688826362,-1.01418510567422,1.37931445678426,0.614677872421422,"virginica"
0.606347250774068,0.887411967464943,0.450242542888127,0.780807027129915,"virginica"
-0.606347250774067,-1.64805079672061,0.235841331989019,0.44854871771293,"virginica"
1.15757202420504,-1.01418510567422,0.950512034986045,0.44854871771293,"virginica"
-1.92928670700839,0.887411967464943,-2.33697319880027,-2.37564691233144,"setosa"
0.38585734140168,0.253546276418555,0.307308402288722,1.1130653365469,"virginica"
-0.826837160146455,0.253546276418555,-0.478829371008007,-0.548226210538025,"versicolor"
0.0551224773430978,1.52127765851133,-0.192961089809197,-0.21596790112104,"versicolor"
-0.385857341401679,0.253546276418555,0.021440121089911,0.282419563004437,"virginica"

nn.py

import pandas as pd
import numpy as np
import tensorflow as tf
import scipy.stats

# Import iris data
data = pd.read_csv("iris.csv")
input = data[["Sepal.Length", "Sepal.Width", "Petal.Length"]]
target = data[["Petal.Width"]]

# Parameters
learning_rate = 0.001
training_epochs = 6000

# Network Parameters
n_hidden_1 = 5 # 1st layer number of features
n_hidden_2 = 5 # 2nd layer number of features
n_input = 3 # data input
n_output = 1 # data output

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_output])

# Create model
def multilayer_network(x, weights, biases):
  # Hidden layer with TanH activation
  layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
  layer_1 = tf.tanh(layer_1)
  # Hidden layer with TanH activation
  layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
  layer_2 = tf.tanh(layer_2)
  # Output layer with linear activation
  out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
  return out_layer

# Store layers weight & bias
weights = {
  'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
  'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
  'out': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
}
biases = {
  'b1': tf.Variable(tf.random_normal([n_hidden_1])),
  'b2': tf.Variable(tf.random_normal([n_hidden_2])),
  'out': tf.Variable(tf.random_normal([n_output]))
}

# Construct model
pred = multilayer_network(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred-y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
  sess.run(init)

  # Training cycle
  for epoch in range(training_epochs):
    # Run optimization op (backprop) and cost op (to get loss value)
    _, c = sess.run([optimizer, cost], feed_dict={x: input, y: target})

    # Display logs per epoch step
    if epoch % 1000 == 0:
      print("Epoch: {:04d} cost= {:.9f}".format(epoch + 1, c))

print("Optimization Finished!")

Here is an example of the output from a training run:

$ python nn.py
Epoch: 0001 cost= 3.000185966
Epoch: 1001 cost= 0.031734336
Epoch: 2001 cost= 0.000614795
Epoch: 3001 cost= 0.000008422
Epoch: 4001 cost= 0.000000057
Epoch: 5001 cost= 0.000000000
Optimization Finished!

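My idea is to use the Spearman distance, which I recently learned about, as my objective function in place of mean squared error. Following the definition: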

[formula not preserved: definition of the Spearman distance]
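
Assuming the definition meant here is the usual one (the original formula is not reproduced, so this is an assumption), the Spearman distance between two vectors $x$ and $y$ of length $n$ is the sum of squared differences between their ranks. The symbols $d_S$ and $\operatorname{rank}$ are notation chosen for this sketch:

```latex
d_S(x, y) = \sum_{i=1}^{n} \bigl(\operatorname{rank}(x_i) - \operatorname{rank}(y_i)\bigr)^2
```

A cost that takes the mean of the squared rank differences instead of the sum differs from this only by the constant factor $1/n$.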

I wrote a function that returns the ranks of a vector:

import scipy.stats

def rank(vector):
  return scipy.stats.rankdata(vector, method="min")

Using TensorFlow's py_func, I defined the cost tensor as follows.

pred = tf.to_float(tf.py_func(rank, [pred], [tf.int64])[0])
y = tf.to_float(tf.py_func(rank, [y], [tf.int64])[0])

cost = tf.reduce_mean(tf.square(y-pred))

However, this gives me the following error:

ValueError: No gradients provided for any variable: ((None, <tensorflow.python.ops.variables.Variable object at 0x7f67ffe4ee90>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3c4990>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357310>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357190>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed380350>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3801d0>))

I don't understand what the underlying problem is. Any guidance you can give me would be greatly appreciated!

Best answer

Your error comes from the fact that tf.py_func has no gradient defined.

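In any case, as @user20160 said in the comments, the rank operation does not even have a gradient, so this is not a loss you can use directly to train your algorithm.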
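
To illustrate the point about the missing gradient, here is a minimal pure-Python sketch (no TensorFlow involved). `min_rank` mimics `scipy.stats.rankdata(..., method="min")`. The functions `soft_rank` and `soft_loss`, the `temperature` parameter, and the example numbers are assumptions made for illustration and are not part of the original answer. The sketch shows that a rank-based loss is piecewise constant, so a small change in a prediction leaves it unchanged, while a smooth surrogate built from pairwise sigmoids does respond to small changes.

```python
import math


def min_rank(values):
    """Rank where ties share the lowest rank, like rankdata(method="min")."""
    return [1 + sum(other < v for other in values) for v in values]


def hard_loss(pred, target):
    """Mean squared difference between the ranks of pred and target."""
    rp, rt = min_rank(pred), min_rank(target)
    return sum((a - b) ** 2 for a, b in zip(rp, rt)) / float(len(pred))


def soft_rank(values, temperature=0.1):
    """Smooth surrogate for the rank (an assumption, not from the post).

    Each hard comparison "other < v" is replaced by a sigmoid of the
    scaled difference, written with tanh for numerical stability.
    """
    def sigmoid(z):
        return 0.5 * (1.0 + math.tanh(0.5 * z))
    return [0.5 + sum(sigmoid((v - other) / temperature) for other in values)
            for v in values]


def soft_loss(pred, target):
    """Like hard_loss, but with soft ranks for pred (target ranks are constants)."""
    sp, rt = soft_rank(pred), min_rank(target)
    return sum((a - b) ** 2 for a, b in zip(sp, rt)) / float(len(pred))


def finite_diff(loss, pred, target, i, eps=1e-4):
    """Central finite-difference estimate of d loss / d pred[i]."""
    up, down = list(pred), list(pred)
    up[i] += eps
    down[i] -= eps
    return (loss(up, target) - loss(down, target)) / (2 * eps)


# Hypothetical predictions against four of the Petal.Width values above.
pred = [0.1, 0.3, -0.2, 0.5]
target = [-0.548, 0.615, 0.781, 0.449]

grad_hard = finite_diff(hard_loss, pred, target, 0)
grad_soft = finite_diff(soft_loss, pred, target, 0)
print("hard rank loss gradient estimate:", grad_hard)  # 0.0: ranks do not move
print("soft rank loss gradient estimate:", grad_soft)  # nonzero
```

A smaller `temperature` brings the soft ranks closer to the true ranks but makes the gradients shrink toward zero again, so it is a trade-off. The same pairwise construction could in principle be written with differentiable TensorFlow ops in place of `py_func`, though that is not something the original answer proposes.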

Source: a similar question on Stack Overflow about implementing Spearman distance as an objective function in TensorFlow: https://stackoverflow.com/questions/38635182/
