From Train and evaluate with Keras:
The argument validation_split (generating a holdout set from the training data) is not supported when training from Dataset objects, since this feature requires the ability to index the samples of the datasets, which is not possible in general with the Dataset API.
Is there a workaround? How can I still use a validation set with TF datasets?
Best answer
No, you cannot use validation_split (as the documentation clearly states), but you can build a validation Dataset "manually" and pass it as validation_data.
You can see an example in the same TensorFlow tutorial:
# Prepare the training dataset
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
# Prepare the validation dataset
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.batch(64)
model.fit(train_dataset, epochs=3, validation_data=val_dataset)
The NumPy arrays behind these two datasets, (x_train, y_train) and (x_val, y_val), can be produced with simple slicing, as shown there:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]
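The slices above hold out the last 10,000 samples, which works for MNIST. If your own arrays are ordered (for example, sorted by class), you may want to shuffle before splitting. Here is a minimal sketch of that idea; the helper name train_val_split, its parameters, and the synthetic arrays are assumptions for illustration and do not come from the tutorial.

```python
import numpy as np


def train_val_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle x and y with the same permutation, then hold out a validation part.

    Hypothetical helper for illustration; not part of Keras or TensorFlow.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))           # one permutation shared by x and y
    n_val = int(len(x) * val_fraction)       # number of held-out samples
    n_train = len(x) - n_val
    train_idx, val_idx = perm[:n_train], perm[n_train:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])


# Synthetic stand-in data: 50 samples with 2 features each.
x = np.arange(100).reshape(50, 2)
y = np.arange(50)

(x_train, y_train), (x_val, y_val) = train_val_split(x, y, val_fraction=0.2)
print(x_train.shape, x_val.shape)
```

The resulting arrays can then go into tf.data.Dataset.from_tensor_slices exactly as in the tutorial code above.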
There are other ways to create tf.data.Dataset objects as well; see the tf.data.Dataset documentation and the related tutorials/notebooks.
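If the data only exists as a tf.data.Dataset (so there are no arrays to slice), one common approach is to split it with take and skip. This is a sketch under assumptions rather than something from the tutorial: the synthetic tensors, the sizes, and the variable names are made up for illustration. The initial shuffle uses reshuffle_each_iteration=False so that the element order stays fixed and the two parts remain disjoint across epochs.

```python
import tensorflow as tf

# Synthetic stand-in for a dataset you cannot index: 100 (features, label) pairs.
features = tf.random.uniform((100, 4), seed=1)
labels = tf.range(100)
full_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

val_size = 20  # assumed hold-out size for this example

# Shuffle once with a fixed order. Without reshuffle_each_iteration=False the
# order would change on every pass and samples could leak between the splits.
full_dataset = full_dataset.shuffle(
    buffer_size=100, seed=42, reshuffle_each_iteration=False
)

# The first val_size elements become the validation set, the rest the training set.
val_dataset = full_dataset.take(val_size).batch(16)
train_dataset = full_dataset.skip(val_size).shuffle(buffer_size=80).batch(16)

# These can then be passed to fit as in the tutorial snippet:
# model.fit(train_dataset, epochs=3, validation_data=val_dataset)
```

Note that take and skip need the split size up front, and skip still reads through the skipped elements on every epoch, so for large datasets it may be cheaper to split at the file or array level before building the Dataset.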
Source: the Stack Overflow question on using a validation set with TensorFlow Datasets (python): https://stackoverflow.com/questions/61595081/