python - Problem with a KNN implementation in TensorFlow

Tags: python machine-learning tensorflow knn

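I am struggling to implement the K-Nearest Neighbors algorithm in TensorFlow. Either I am overlooking a mistake or I am doing something badly wrong.

The following code always predicts the MNIST label as 0.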

from __future__ import print_function

import numpy as np
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data

K = 4
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# In this example, we limit mnist data
Xtr, Ytr = mnist.train.next_batch(55000)  # whole training set
Xte, Yte = mnist.test.next_batch(10000)  # whole test set

# tf Graph Input
xtr = tf.placeholder("float", [None, 784])
ytr = tf.placeholder("float", [None, 10])
xte = tf.placeholder("float", [784])

# Euclidean Distance
distance = tf.neg(tf.sqrt(tf.reduce_sum(tf.square(tf.sub(xtr, xte)), reduction_indices=1)))
# Prediction: Get min distance neighbors
values, indices = tf.nn.top_k(distance, k=K, sorted=False)
nearest_neighbors = []
for i in range(K):
    nearest_neighbors.append(np.argmax(ytr[indices[i]]))

sorted_neighbors, counts = np.unique(nearest_neighbors, return_counts=True)

pred = tf.Variable(nearest_neighbors[np.argmax(counts)])

# not works either
# neighbors_tensor = tf.pack(nearest_neighbors)
# y, idx, count = tf.unique_with_counts(neighbors_tensor)
# pred = tf.slice(y, begin=[tf.arg_max(count, 0)], size=tf.constant([1], dtype=tf.int64))[0]

accuracy = 0.

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # loop over test data
    for i in range(len(Xte)):
        # Get nearest neighbor
        nn_index = sess.run(pred, feed_dict={xtr: Xtr, xte: Xte[i, :]})
        # Get nearest neighbor class label and compare it to its true label
        print("Test", i, "Prediction:", nn_index,
              "True Class:", np.argmax(Yte[i]))
        # Calculate accuracy
        if nn_index == np.argmax(Yte[i]):
            accuracy += 1. / len(Xte)
    print("Done!")
    print("Accuracy:", accuracy)

Any help is greatly appreciated.

Best answer

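In general, it is not a good idea to fall back on numpy functions while defining a TensorFlow model, and that is exactly why your code does not work. The graph is only evaluated inside sess.run, so a numpy call made while the graph is being built never sees the fed data. I made only two changes to your logic: I replaced np.argmax with tf.argmax, and I uncommented the block under "# not works either". The code below also uses the newer op names tf.negative, tf.subtract and tf.stack in place of tf.neg, tf.sub and tf.pack, and it feeds ytr in feed_dict, which the prediction now depends on.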
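One plausible way to see where the constant 0 comes from is sketched below. It uses a hypothetical stand-in class, OpaqueHandle, rather than a real tf.Tensor, and whether TensorFlow's own tensor objects behave the same way depends on the version, so treat it as an illustration of the failure mode and not as a trace of the original run.

```python
import numpy as np


class OpaqueHandle:
    """Hypothetical stand-in for a graph node: it holds no numeric data yet."""


# NumPy cannot look inside the handle, so it wraps it in a 0-d object array.
wrapped = np.asarray(OpaqueHandle())

# argmax over a single element is always index 0, whatever the element is.
labels = [int(np.argmax(OpaqueHandle())) for _ in range(4)]

# The vote then runs on constants computed before any data is fed in.
_, counts = np.unique(labels, return_counts=True)
build_time_pred = labels[int(np.argmax(counts))]

print(wrapped.shape, labels, build_time_pred)
```

Under that assumption, the value wrapped in tf.Variable is fixed when the graph is built, which would also explain why the original feed_dict could omit ytr without raising an error.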

Here is the complete working code:

from __future__ import print_function

import numpy as np
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data

K = 4
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# In this example, we limit mnist data
Xtr, Ytr = mnist.train.next_batch(55000)  # whole training set
Xte, Yte = mnist.test.next_batch(10000)  # whole test set

# tf Graph Input
xtr = tf.placeholder("float", [None, 784])
ytr = tf.placeholder("float", [None, 10])
xte = tf.placeholder("float", [784])

# Euclidean Distance
distance = tf.negative(tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(xtr, xte)), reduction_indices=1)))
# Prediction: Get min distance neighbors
values, indices = tf.nn.top_k(distance, k=K, sorted=False)

nearest_neighbors = []
for i in range(K):
    nearest_neighbors.append(tf.argmax(ytr[indices[i]], 0))

neighbors_tensor = tf.stack(nearest_neighbors)
y, idx, count = tf.unique_with_counts(neighbors_tensor)
pred = tf.slice(y, begin=[tf.argmax(count, 0)], size=tf.constant([1], dtype=tf.int64))[0]

accuracy = 0.

# Initialize variables (tf.initialize_all_variables is the deprecated name)
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # loop over test data
    for i in range(len(Xte)):
        # Get nearest neighbor
        nn_index = sess.run(pred, feed_dict={xtr: Xtr, ytr: Ytr, xte: Xte[i, :]})
        # Get nearest neighbor class label and compare it to its true label
        print("Test", i, "Prediction:", nn_index,
              "True Class:", np.argmax(Yte[i]))
        # Calculate accuracy
        if nn_index == np.argmax(Yte[i]):
            accuracy += 1. / len(Xte)
    print("Done!")
    print("Accuracy:", accuracy)
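To sanity-check the graph's predictions, it can help to compare them against a plain NumPy version of the same steps: Euclidean distance, the k nearest rows, then a majority vote over their labels. The following is a minimal sketch and not part of the original answer; the function name knn_predict and the toy data are made up for illustration.

```python
import numpy as np


def knn_predict(train_x, train_y_onehot, query, k):
    """Predict the class of `query` by majority vote among its k nearest rows."""
    # Euclidean distance from the query to every training row.
    dists = np.sqrt(np.sum((train_x - query) ** 2, axis=1))
    # Indices of the k smallest distances (order among them is not needed).
    nearest = np.argpartition(dists, k - 1)[:k]
    # Turn one-hot rows into integer labels, then count votes per label.
    labels = np.argmax(train_y_onehot[nearest], axis=1)
    values, counts = np.unique(labels, return_counts=True)
    return int(values[np.argmax(counts)])


# Toy data (made up for illustration): two well-separated 2-D clusters.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
train_y = np.eye(2)[[0, 0, 0, 1, 1, 1]]

print(knn_predict(train_x, train_y, np.array([0.05, 0.05]), k=3))
print(knn_predict(train_x, train_y, np.array([5.0, 5.1]), k=3))
```

Calling knn_predict(Xtr, Ytr, Xte[i], K) on a few test rows should generally agree with the graph's output. The two can differ when votes are tied: this sketch picks the smallest tied label, while tf.unique_with_counts returns values in order of first appearance.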

For this question (python - Problem with a KNN implementation in TensorFlow), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41131728/
