tensorflow - tf.data OutOfRangeError (see above for traceback): End of sequence

Tags: tensorflow tensorflow-datasets

I ran into a strange problem while trying to generate test data with a generator.
Here is my code:

from __future__ import absolute_import, division, print_function
import tensorflow as tf
import os
# tf.enable_eager_execution()

def _parse_function(data):
    split_data = tf.string_split([data], ",")
    tmp = tf.string_to_number(split_data.values, out_type=tf.int32)
    result = tf.map_fn(lambda x: (tmp[0], x), tmp[1:], dtype=(tf.int32, tf.int32))
    return result

data_path = "data"
file_names = os.listdir(data_path)
file_names = list(map(lambda x: os.path.join(data_path, x), file_names))
dataset = tf.data.TextLineDataset(file_names)
dataset = dataset.map(_parse_function)
dataset = dataset.apply(tf.data.experimental.unbatch())
dataset = dataset.batch(20)
user_id, item_id = dataset.make_one_shot_iterator().get_next()

user_id = tf.reshape(user_id, shape=(-1, ))
item_id = tf.reshape(item_id, shape=(-1, ))
print(user_id)
print(item_id)

with tf.Session() as sess:
    for i in range(10):
        user_ids = sess.run([user_id])
        item_ids = sess.run([item_id])
        print(user_ids)
        print(item_ids)

Here is the raw data to process:
1,2,3,4,5
6,7,8,9,10,11
12,13,14,15,16,17
18,19,20
21,22,23
24,25,26

The first column is the user ID; the remaining columns are item IDs.

The target data is:
1,2
1,3
1,4
...
24,25
24,26
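
The expansion from raw lines to target pairs can be sketched in plain Python, without TensorFlow. This is only an illustration of the intended result: the helper name `expand_line` is hypothetical and is not part of the question's code, although it mirrors what `_parse_function` is meant to compute.

```python
# Raw sample lines copied from the question.
RAW_LINES = [
    "1,2,3,4,5",
    "6,7,8,9,10,11",
    "12,13,14,15,16,17",
    "18,19,20",
    "21,22,23",
    "24,25,26",
]


def expand_line(line):
    """Pair the first value (user ID) with every later value (item ID)."""
    values = [int(v) for v in line.split(",")]
    user_id, item_ids = values[0], values[1:]
    return [(user_id, item_id) for item_id in item_ids]


# Flatten the per-line pair lists into one list, like unbatch() does.
pairs = [pair for line in RAW_LINES for pair in expand_line(line)]

print(pairs[:3])   # [(1, 2), (1, 3), (1, 4)]
print(pairs[-2:])  # [(24, 25), (24, 26)]
print(len(pairs))  # 20
```

The six lines produce exactly 20 pairs in total, which matters for the batch size used in the question.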

Here is the error:
Caused by op 'IteratorGetNext', defined at:
  File "C:/Users/Liheng/Desktop/xlearning/tensorflow_data.py", line 22, in <module>
    user_id, item_id = dataset.make_one_shot_iterator().get_next()
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 421, in get_next
    name=name)), self._output_types,
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 2068, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): End of sequence
  [[node IteratorGetNext (defined at C:/Users/Liheng/Desktop/xlearning/tensorflow_data.py:22)  = IteratorGetNext[output_shapes=[[?], [?]], output_types=[DT_INT32, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

But if I run it in eager mode, the code works fine and the output is:
tf.Tensor([ 1  1  1  1  6  6  6  6  6 12 12 12 12 12 18 18 21 21 24 24], shape=(20,), dtype=int32)
tf.Tensor([ 2  3  4  5  7  8  9 10 11 13 14 15 16 17 19 20 22 23 25 26], shape=(20,), dtype=int32)

Best answer

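The dataset is too small for this batch size. The six input lines expand to only 20 (user, item) pairs, so batch(20) yields a single batch. In graph mode, every sess.run call that evaluates user_id or item_id pulls a new element from the one-shot iterator: the first call consumes that only batch and the next call raises OutOfRangeError. In eager mode get_next() runs just once, which is why the same code appears to work there. Try removing this line:

dataset = dataset.batch(20)
or changing it to:
dataset = dataset.batch(2)

Note that batch(2) yields 10 batches, so the loop as written (two sess.run calls per iteration) would still run out after five iterations. Fetching both tensors in one call, sess.run([user_id, item_id]), consumes one batch per iteration and keeps each user ID aligned with its item ID.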
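
The batch counts behind this answer can be checked with simple arithmetic. The following is a minimal sketch rather than TensorFlow code: it assumes the loop from the question (10 iterations, each making two separate sess.run calls), that every sess.run call consumes one batch, and that the final partial batch is kept. The function name `first_failing_call` is hypothetical.

```python
import math

# 4 + 5 + 5 + 2 + 2 + 2 pairs from the six sample lines in the question.
NUM_PAIRS = 20


def first_failing_call(num_pairs, batch_size, num_calls):
    """Return the 1-based index of the first call that finds no batch left,
    or None if every call succeeds."""
    num_batches = math.ceil(num_pairs / batch_size)
    return num_batches + 1 if num_calls > num_batches else None


# Loop as written: 10 iterations x 2 separate sess.run calls = 20 calls.
print(first_failing_call(NUM_PAIRS, 20, 20))  # 2    (fails on the second call)
print(first_failing_call(NUM_PAIRS, 2, 20))   # 11   (fails in the sixth iteration)
print(first_failing_call(NUM_PAIRS, 1, 20))   # None (one pair per call, just enough)

# One combined fetch per iteration: 10 calls in total.
print(first_failing_call(NUM_PAIRS, 2, 10))   # None
```

Under these assumptions, a smaller batch size only postpones the error unless the number of fetches matches the number of batches. Calling repeat() on the dataset or catching tf.errors.OutOfRangeError are the usual ways to avoid depending on an exact count.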

Regarding tensorflow - tf.data OutOfRangeError (see above for traceback): End of sequence, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53613515/
