python-3.x - Training on a custom dataset in TensorFlow gives an error

Tags: python-3.x tensorflow machine-learning classification

I want to perform image classification on my custom dataset using TensorFlow. I have imported my own dataset, but I am stuck at the training step (I am not sure whether it imports the complete dataset or only a single batch of 50 images, even though the file list contains all the file names).

Dataset info: image resolution = 88*128 (single channel), batch size = 50.

Here is the list of things I want to do:

  1. Import the complete dataset (and change the code if it only creates a single batch of 50 images)
  2. Train the model on my own dataset (training and test images)
  3. Find the correct way to create batches.

Here is the complete code so far:

import tensorflow as tf
import os


def init_weights(shape):
    init_random_dist = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init_random_dist)

def init_bias(shape):
    init_bias_vals = tf.constant(0.1, shape=shape)
    return tf.Variable(init_bias_vals)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2by2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


def convolutional_layer(input_x, shape):
    W = init_weights(shape)
    b = init_bias([shape[3]])

    return tf.nn.relu(conv2d(input_x, W) + b)


def normal_full_layer(input_layer, size):
    input_size = int(input_layer.get_shape()[1])
    W = init_weights([input_size, size])
    b = init_bias([size])

    return tf.matmul(input_layer, W) + b


def get_labels(path):
    return os.listdir(path)


def files_list(path):
    return [val for sublist in [[os.path.join(j) for j in i[2]] for i in os.walk(path)] for val in sublist]


def image_tensors(filesQueue): 
    reader = tf.WholeFileReader()
    filename, content = reader.read(filesQueue)
    image = tf.image.decode_jpeg(content, channels=1)
    image = tf.cast(image, tf.float32)
    resized_image = tf.image.resize_images(image, [88, 128])

    return resized_image


path = './data/train'
trainLabels = get_labels(path)
trainingFiles = files_list(path)

trainQueue = tf.train.string_input_producer(trainingFiles)
trainBatch = tf.train.batch([image_tensors(trainQueue)], batch_size=50)
# ^^^^^^^^ a complete dataset or only a single batch? How to check?

path = './data/test'
testLabels = get_labels(path)
testingFiles = files_list(path)

testQueue = tf.train.string_input_producer(testingFiles)
testBatch = tf.train.batch([image_tensors(testQueue)], batch_size=50)
# ^^^^^^^ same here

x = tf.placeholder(tf.float32,shape=[88, 128])
y_true = tf.placeholder(tf.float32,shape=[None,len(trainLabels)])

x_image = tf.reshape(x,[-1,88,128,1])

convo_1 = convolutional_layer(x_image,shape=[6,6,1,32])
convo_1_pooling = max_pool_2by2(convo_1)

convo_2 = convolutional_layer(convo_1_pooling,shape=[6,6,32,64])
convo_2_pooling = max_pool_2by2(convo_2)


convo_2_flat = tf.reshape(convo_2_pooling,[-1,22*32*64])
full_layer_one = tf.nn.relu(normal_full_layer(convo_2_flat,1024))

hold_prob = tf.placeholder(tf.float32)
full_one_dropout = tf.nn.dropout(full_layer_one,keep_prob=hold_prob)


y_pred = normal_full_layer(full_one_dropout,10)

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true,logits=y_pred))

optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
train = optimizer.minimize(cross_entropy)

init = tf.global_variables_initializer()



steps = 4000

with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        batch_x , batch_y = tf.train.batch(trainBatch, batch_size=50)
        #                                  ^^^^^^^^^^^ Error
        sess.run(train,feed_dict={x:batch_x,y_true:batch_y,hold_prob:0.5})

        if i%400 == 0:
            print('Currently on step {}'.format(i))
            print('Accuracy is:')

            matches = tf.equal(tf.argmax(y_pred,1),tf.argmax(y_true,1))
            acc = tf.reduce_mean(tf.cast(matches,tf.float32))
            print(sess.run(acc,feed_dict={x:testBatch,y_true:testLabels,hold_prob:1.0}))
            #                             ^^^^^^^^^^^^ Test Images?
            print('\n')

This is the error I get:

TypeError                                 Traceback (most recent call last)
<ipython-input-24-5d0dac5724cd> in <module>()
      5     sess.run(init)
      6     for i in range(steps):
----> 7         batch_x , batch_y = tf.train.batch([trainBatch], batch_size=50)
      8         sess.run(train,feed_dict={x:batch_x,y_true:batch_y,hold_prob:0.5})
      9 

c:\users\TF_User\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py in __iter__(self)
    503       TypeError: when invoked.
    504     """
--> 505     raise TypeError("'Tensor' object is not iterable.")
    506 
    507   def __bool__(self):

TypeError: 'Tensor' object is not iterable.

It seems the wrong type is being passed instead of a list of tensors, but I can't figure it out. Please correct the problem and help me with the issues listed above.

Best answer

It looks like you are making an unnecessary second call to tf.train.batch.

Normally you would do something like this:

...     
images, labels = tf.train.batch([images, labels], batch_size=50)

with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        sess.run(train, feed_dict={x:images,y_true:labels,hold_prob:0.5})
...

I think TensorFlow: does tf.train.batch automatically load the next batch when the batch has finished training? should give you a better idea of what tf.train.batch is doing and how to use it. The documentation on Reading Data should also help.
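For completeness, here is a minimal sketch of how the batch tensors are usually consumed with the queue-based API the question is written against. It is not a drop-in fix: the pipeline in the question never builds label tensors, so the training step is left commented out. The point it illustrates is that trainBatch is an op that dequeues the next 50 images each time it is evaluated inside a session with running queue runners; it is not a fixed array holding the whole dataset.

# Sketch only: continues the question's graph (trainBatch, steps, x, y_true, hold_prob, train).
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # string_input_producer / tf.train.batch are backed by queues that only
    # start filling once the queue runners are launched.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for i in range(steps):
        # Each evaluation dequeues the next 50 images as a numpy array of
        # shape (50, 88, 128, 1), cycling through the whole file list.
        batch_x = sess.run(trainBatch)

        # Labels are not produced by the pipeline in the question, so this
        # part is left as a placeholder:
        # batch_y = ...
        # sess.run(train, feed_dict={x: batch_x, y_true: batch_y, hold_prob: 0.5})

    coord.request_stop()
    coord.join(threads)

Note that the x placeholder in the question is declared with shape [88, 128], so it would also need a batch dimension (and the channel dimension) before the commented-out feed would work.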

Regarding python-3.x - Training on a custom dataset in TensorFlow gives an error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48325733/
