python - Training crashes when saving the model: "tensorflow.GraphDef was modified concurrently during serialization"

Tags: python tensorflow machine-learning deep-learning tensorflow-datasets

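I am currently trying to train a model, and my input pipeline is built following this answer (here). I want to save my model after every epoch, but after a few epochs the training crashes. I have read that this happens because the input is added to the graph as constant tensors. There is a suggested solution (here) that uses tf.placeholder to work around the problem. Unfortunately, it does not solve my problem. The input pipeline looks like this: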

....
filenames = [P_1]
dataset = tf.data.TFRecordDataset(filenames)
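def _parse_function(example_proto):
    # The feature spec must match how the TFRecord file was written.
    keys_to_features = {'data': tf.VarLenFeature(tf.float32)}
    parsed_features = tf.parse_single_example(example_proto, keys_to_features)
    # VarLenFeature yields a SparseTensor; convert it to a dense tensor.
    return tf.sparse_tensor_to_dense(parsed_features['data'])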
# Parse the record into tensors.
dataset = dataset.map(_parse_function)
# Shuffle the dataset
dataset = dataset.shuffle(buffer_size=1000)
# Repeat the input indefinitely
dataset = dataset.repeat()
# Generate batches
dataset = dataset.batch(Batch_size)
# Create a one-shot iterator
iterator = dataset.make_one_shot_iterator()
data = iterator.get_next()   
....
for i in range(epochs):
    for ii in range(iteration):
        image = sess.run(data)
        ....
    # Save a checkpoint at the end of every epoch.
    saver.save(sess, 'filename')

The error message looks like this:

[libprotobuf FATAL external/protobuf_archive/src/google/protobuf/message_lite.cc:68] CHECK failed: (byte_size_before_serialization) == (byte_size_after_serialization): tensorflow.GraphDef was modified concurrently during serialization.
terminate called after throwing an instance of 'google::protobuf::FatalException'  
what():  CHECK failed: (byte_size_before_serialization) == (byte_size_after_serialization): tensorflow.GraphDef was modified concurrently during serialization.
Aborted

Best answer

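The problem looks like it is in _parse_function. Make sure the parser does things the same way as when the TFRecord file was created, for example that both sides use the same data types.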
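To illustrate why the writer and the parser have to agree on data types, here is a minimal stdlib-only analogy. It is not TensorFlow code and does not reproduce the TFRecord format; the helper names `write_floats` and `read_as` are hypothetical. It only shows that bytes written as 32-bit floats round-trip when read back with the same type and turn into unrelated numbers when read with a different one.

```python
import struct


def write_floats(values):
    """Pack values as little-endian 32-bit floats (the 'writer' side)."""
    return struct.pack("<%df" % len(values), *values)


def read_as(payload, type_code):
    """Unpack payload assuming every element has struct type `type_code`."""
    item_size = struct.calcsize("<" + type_code)
    if len(payload) % item_size:
        raise ValueError("payload size does not match the declared type")
    count = len(payload) // item_size
    return list(struct.unpack("<%d%s" % (count, type_code), payload))


payload = write_floats([1.0, 2.0, 0.5])

matching = read_as(payload, "f")    # reader declares the writer's type
mismatched = read_as(payload, "i")  # same bytes, declared as 32-bit ints

print(matching)
print(mismatched)
```

In the question's pipeline, the equivalent check would be to confirm that the 'data' feature was written as a float list, since _parse_function declares it as tf.float32.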

Original question on Stack Overflow: https://stackoverflow.com/questions/52717630/
