python - Program hangs on Estimator.evaluate in TensorFlow 1.6

As a learning exercise, I am trying to do something simple.

I have two training CSV files:

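One file has 36 columns (3500 records) containing 0s and 1s; I think of each row as a flattened 6x6 matrix. The other CSV file has a single column of ground-truth values, 0 or 1 (3500 records), indicating whether at least 4 of the 6 elements on the diagonal of the 6x6 matrix are 1.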
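To make the labeling rule concrete, here is a minimal pure-Python sketch of how a label could be derived from one flattened row. It assumes row-major flattening and that "diagonal" means the main diagonal; the function name `has_diag` and its parameters are hypothetical and are not part of the original program.

```python
def has_diag(flat_row, n_dims=6, threshold=4):
    """Return 1 if at least `threshold` main-diagonal entries equal 1, else 0.

    Assumes `flat_row` is an n_dims x n_dims matrix flattened in row-major order.
    """
    if len(flat_row) != n_dims * n_dims:
        raise ValueError("expected %d values, got %d" % (n_dims * n_dims, len(flat_row)))
    # In row-major order, element (i, i) sits at index i * n_dims + i.
    diagonal = [flat_row[i * n_dims + i] for i in range(n_dims)]
    return int(sum(diagonal) >= threshold)


# A 6x6 identity matrix has six 1s on the main diagonal.
identity = [1 if r == c else 0 for r in range(6) for c in range(6)]
print(has_diag(identity))
```

A row of all zeros, or a row whose 1s all lie off the diagonal, would get label 0 under the same assumptions.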

I also have two test CSV files with the same structure as the training files, except that each has 500 records.

When I step through the program in a debugger, it appears that...

estimator.train(
    input_fn=lambda: get_inputs(x_paths=[x_train_file], y_paths=[y_train_file], batch_size=32), steps=100)

...runs fine. I see files in the checkpoint directory and a loss curve in TensorBoard.

But when the program reaches...

eval_result = estimator.evaluate(
    input_fn=lambda: get_inputs(x_paths=[x_test_file], y_paths=[y_test_file], batch_size=32))

...it just hangs.

I have checked the test files, and I also tried running estimator.evaluate with the training files. It still hangs.

I am using TensorFlow 1.6 and Python 3.6.

Here is the full code:

import tensorflow as tf
import os
import numpy as np

x_train_file = os.path.join('D:', 'Diag', '6x6_train.csv')
y_train_file  = os.path.join('D:', 'Diag', 'HasDiag_train.csv')
x_test_file = os.path.join('D:', 'Diag', '6x6_test.csv')
y_test_file  = os.path.join('D:', 'Diag', 'HasDiag_test.csv')
model_chkpt = os.path.join('D:', 'Diag', "checkpoints")

def get_inputs(
        count=None, shuffle=True, buffer_size=1000, batch_size=32,
        num_parallel_calls=8, x_paths=[x_train_file], y_paths=[y_train_file]):
    """
    Get x, y inputs.

    Args:
        count: number of epochs. None indicates infinite epochs.
        shuffle: whether or not to shuffle the dataset
        buffer_size: used in shuffle
        batch_size: size of batch. See outputs below
        num_parallel_calls: used in map. Note if > 1, intra-batch ordering
            will be shuffled
        x_paths: list of paths to x-value files.
        y_paths: list of paths to y-value files.

    Returns:
        x: dict mapping column names '0'..'35' to (batch_size,) int32 tensors
        y: (batch_size,) int32 tensor of 0/1 labels
    """

    def x_map(line):
        n_dims = 6
        columns = [str(i1) for i1 in range(n_dims**2)]
        # Decode the line into its fields
        fields = tf.decode_csv(line, record_defaults=[[0]] * (n_dims ** 2))

        # Pack the result into a dictionary
        features = dict(zip(columns, fields))
        return features

    def y_map(line):
        y_row = tf.string_to_number(line, out_type=tf.int32)
        return y_row

    def xy_map(x, y):
        return x_map(x), y_map(y)

    # Read lines from the files given by the caller
    x_ds = tf.data.TextLineDataset(x_paths)
    y_ds = tf.data.TextLineDataset(y_paths)

    combined = tf.data.Dataset.zip((x_ds, y_ds))
    combined = combined.repeat(count=count)
    if shuffle:
        combined = combined.shuffle(buffer_size)
    combined = combined.map(xy_map, num_parallel_calls=num_parallel_calls)
    combined = combined.batch(batch_size)
    x, y = combined.make_one_shot_iterator().get_next()
    return x, y

columns = [str(i1) for i1 in range(6 ** 2)]

feature_columns = [
    tf.feature_column.numeric_column(name)
    for name in columns]

estimator = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                   hidden_units=[18, 9],
                                   activation_fn=tf.nn.relu,
                                   n_classes=2,
                                   model_dir=model_chkpt)

estimator.train(
    input_fn=lambda: get_inputs(x_paths=[x_train_file], y_paths=[y_train_file], batch_size=32), steps=100)

eval_result = estimator.evaluate(
    input_fn=lambda: get_inputs(x_paths=[x_test_file], y_paths=[y_test_file], batch_size=32))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))

Best answer

Two parameters together cause this problem:

tf.data.Dataset.repeat has a count parameter:

count: (Optional.) A tf.int64 scalar tf.Tensor, representing the number of times the dataset should be repeated. The default behavior (if count is None or -1) is for the dataset be repeated indefinitely.

In your case, count is always None, so the dataset repeats indefinitely.

tf.estimator.Estimator.evaluate has a steps parameter:

steps: Number of steps for which to evaluate model. If None, evaluates until input_fn raises an end-of-input exception.

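steps is set for training but not for evaluation, so the estimator keeps running until input_fn raises an end-of-input exception, which, as described above, never happens.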

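You should set either one. I think count=1 is the most reasonable choice for evaluation, for example by passing count=1 to get_inputs in the input_fn given to evaluate.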
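The interaction between the two parameters can be modeled in plain Python without TensorFlow. In this toy sketch, `repeat`, `batches`, and `evaluate` are hypothetical stand-ins written for illustration, not TensorFlow APIs; they only mimic the documented behavior quoted above (count=None repeats forever, steps=None consumes until the input ends).

```python
import itertools


def repeat(records, count=None):
    """Toy stand-in for Dataset.repeat: count of None or -1 repeats forever."""
    if count is None or count == -1:
        return itertools.cycle(records)
    return itertools.chain.from_iterable(itertools.repeat(records, count))


def batches(stream, batch_size):
    """Yield lists of up to batch_size items until the stream is exhausted."""
    iterator = iter(stream)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return  # end of input
        yield batch


def evaluate(batch_stream, steps=None):
    """Toy stand-in for Estimator.evaluate: returns the number of batches consumed.

    With steps=None it runs until the input ends, so an endless input never returns.
    """
    if steps is not None:
        batch_stream = itertools.islice(batch_stream, steps)
    return sum(1 for _ in batch_stream)


test_records = list(range(500))  # same size as the test files in the question

# Option 1: a single epoch (count=1); the input ends on its own.
one_epoch_batches = evaluate(batches(repeat(test_records, count=1), 32))

# Option 2: endless input, but evaluation is capped with steps.
capped_batches = evaluate(batches(repeat(test_records), 32), steps=16)

print(one_epoch_batches, capped_batches)

# evaluate(batches(repeat(test_records), 32)) would loop forever:
# endless input combined with steps=None, which is the situation in the question.
```

With count=1 every test record is seen exactly once, which is why it is usually the more natural choice for evaluation; capping steps on an endless, shuffled input may evaluate some records twice and skip others.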

A similar question about this topic can be found on Stack Overflow: https://stackoverflow.com/questions/49962323/
