I want to provide the data using the tf.data.Dataset class:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10  # public API path instead of the private tensorflow_core internals

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
I'm doing this so that I can feed the datasets through an input pipeline and take advantage of Dataset's other features.
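For illustration, chaining a few of those Dataset features might look like the sketch below; the normalize function and the parameter values are assumptions for this sketch, not taken from the question:

def normalize(image, label):
    # Scale uint8 pixels to floats in [0, 1].
    return tf.cast(image, tf.float32) / 255.0, label

train_dataset = (train_dataset
                 .map(normalize)               # per-element preprocessing
                 .shuffle(buffer_size=10000)   # randomize sample order each epoch
                 .prefetch(tf.data.experimental.AUTOTUNE))  # overlap input prep with training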
I define my model like this:
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPool2D((2, 2)))
# more layers
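For reference, a hypothetical completion of the elided layers, modeled on the standard Keras CIFAR-10 CNN example; everything below is my assumption, since the question doesn't show the remaining layers:

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPool2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))  # hypothetical: 10 CIFAR-10 classes, logits output
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])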
But when I call fit to train the model:

model.fit(train_dataset, epochs=10, validation_data=test_dataset, callbacks=[cp_callback])

I get this error:
ValueError: Error when checking input: expected conv2d_input to have 4 dimensions, but got array with shape (32, 32, 3)
What exactly is happening here, and how can I use a Dataset with a Conv2D layer that has input_shape=(32, 32, 3)?
The Tensorflow tutorial (https://www.tensorflow.org/tutorials/load_data/numpy) doesn't cover this case, and I couldn't find an explanation that would help me solve the problem.
Best answer
You need to batch the dataset, with whatever batch size you choose. According to Tensorflow's documentation here, the batch function:
Combines consecutive elements of this dataset into batches. The components of the resulting element will have an additional outer dimension, which will be batch_size (or N % batch_size for the last element if batch_size does not divide the number of input elements N evenly and drop_remainder is False). If your program depends on the batches having the same outer dimension, you should set the drop_remainder argument to True to prevent the smaller batch from being produced.
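A quick self-contained check shows what batch does to element shapes; the ten zero tensors below stand in for images and are purely illustrative:

ds = tf.data.Dataset.from_tensor_slices(tf.zeros([10, 32, 32, 3]))

for batch in ds.batch(4):
    print(batch.shape)   # (4, 32, 32, 3), (4, 32, 32, 3), then (2, 32, 32, 3) since 10 % 4 == 2

for batch in ds.batch(4, drop_remainder=True):
    print(batch.shape)   # (4, 32, 32, 3), (4, 32, 32, 3); the short batch is dropped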
Suppose your batch size is 16. Then:
my_batch_size = 16
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
# Element shapes are (32, 32, 3) here
train_dataset = train_dataset.batch(my_batch_size)  # batch() returns a new dataset; reassign it
test_dataset = test_dataset.batch(my_batch_size)
# Element shapes are (None, 32, 32, 3), i.e. (16, 32, 32, 3) per full batch, here
Then you can train your model.
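Putting it together, a minimal end-to-end sketch; cp_callback is omitted because its definition isn't shown, and model is assumed to be the compiled network from the question:

import tensorflow as tf
from tensorflow.keras.datasets import cifar10

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

my_batch_size = 16
train_dataset = (tf.data.Dataset.from_tensor_slices((train_images, train_labels))
                 .shuffle(buffer_size=50000)  # buffer covers the full training set
                 .batch(my_batch_size))
test_dataset = (tf.data.Dataset.from_tensor_slices((test_images, test_labels))
                .batch(my_batch_size))

model.fit(train_dataset, epochs=10, validation_data=test_dataset)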
On python - how to use a Tensorflow Dataset for CNN model training, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59109662/