python - My image classification model training on Colab keeps stopping without any error

Tags: python tensorflow machine-learning deep-learning google-colaboratory

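When I set up a Colab environment to train an image classifier model, the training process starts and then eventually stops by itself. I suspect the allocated 12 GB of RAM is not enough, because the RAM bar turns orange and the process stops, after which the output shows ^C (which means training was interrupted). Can I increase the RAM?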

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /content/drive/My Drive/models/research/slim/nets/inception_resnet_v2.py:373: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From train.py:55: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From train.py:55: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING:tensorflow:From train.py:184: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/absl/app.py:250: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
W1001 13:13:15.483837 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/absl/app.py:250: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From train.py:90: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

W1001 13:13:15.484074 139753866016640 deprecation_wrapper.py:119] From train.py:90: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

W1001 13:13:15.484555 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From train.py:95: The name tf.gfile.Copy is deprecated. Please use tf.io.gfile.copy instead.

W1001 13:13:15.490095 139753866016640 deprecation_wrapper.py:119] From train.py:95: The name tf.gfile.Copy is deprecated. Please use tf.io.gfile.copy instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/legacy/trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
W1001 13:13:15.501523 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/legacy/trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:182: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

W1001 13:13:15.505989 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:182: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:197: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

W1001 13:13:15.506183 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:197: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:64: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

W1001 13:13:15.524426 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:64: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:71: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

W1001 13:13:15.527117 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:71: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W1001 13:13:15.527241 139753866016640 dataset_builder.py:72] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:86: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
W1001 13:13:15.533276 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:86: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/data/python/ops/interleave_ops.py:77: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W1001 13:13:15.533428 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/data/python/ops/interleave_ops.py:77: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:155: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W1001 13:13:15.562883 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:155: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:43: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
W1001 13:13:33.349572 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
I1001 13:13:33.351701 139753866016640 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I1001 13:13:33.607376 139753866016640 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Starting Session.
I1001 13:13:38.220966 139753866016640 learning.py:754] Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
I1001 13:13:38.410431 139752680122112 supervisor.py:1117] Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
I1001 13:13:38.413790 139753866016640 learning.py:768] Starting Queues.
2019-10-01 13:13:49.720631: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 1382 of 2048
INFO:tensorflow:global_step/sec: 0
I1001 13:13:49.738999 139752671729408 supervisor.py:1099] global_step/sec: 0
2019-10-01 13:13:54.929910: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
INFO:tensorflow:Recording summary at step 0.
I1001 13:13:56.814973 139752663336704 supervisor.py:1050] Recording summary at step 0.
INFO:tensorflow:global step 1: loss = 13.7762 (20.265 sec/step)
I1001 13:14:00.905406 139753866016640 learning.py:507] global step 1: loss = 13.7762 (20.265 sec/step)
^C

Best answer

Since the bar turns yellow, it can also go on to turn red, which means the RAM is full. You cannot increase the RAM. That issue has been fixed by Google.
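
To confirm that RAM is really the limit, rather than reading the color of the bar, it may help to print the numbers directly. The following is only a minimal sketch: it assumes a Linux runtime that exposes `/proc/meminfo` (Colab VMs normally do), and the helper names `parse_meminfo` and `used_fraction` are made up for this illustration.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_fraction(meminfo):
    """Return the fraction of total RAM that is no longer available."""
    return 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]


# Assumption: the runtime is Linux and exposes /proc/meminfo.
MEMINFO_PATH = "/proc/meminfo"
if os.path.exists(MEMINFO_PATH):
    with open(MEMINFO_PATH) as f:
        info = parse_meminfo(f.read())
    if "MemTotal" in info and "MemAvailable" in info:
        total_gib = info["MemTotal"] / 1024**2  # kB -> GiB
        print(f"RAM in use: {used_fraction(info):.0%} of {total_gib:.1f} GiB")
```

Running this in a cell before starting training, and again after a run has been killed, gives a rough picture of how much headroom the session has.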

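One way to work around this is to reduce the size (most likely the batch size); if that does not help, you can also reduce the number of neurons in the model's layers (that is, the number of parameters).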
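
As a rough illustration of why reducing sizes helps, the sketch below estimates how much host RAM an input buffer and a single batch might occupy. Only the shuffle buffer length of 2048 comes from the log in the question. The record size, image resolution, and batch sizes are assumptions chosen for the arithmetic, and the helper names are hypothetical.

```python
def decoded_image_bytes(height, width, channels=3, bytes_per_value=4):
    """Bytes for one decoded image (float32 by default)."""
    return height * width * channels * bytes_per_value


def buffer_bytes(num_items, bytes_per_item):
    """Bytes needed to hold num_items items of the given size at once."""
    return num_items * bytes_per_item


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 1024**3


SHUFFLE_BUFFER_ITEMS = 2048           # from the "Filling up shuffle buffer" log line
ASSUMED_RECORD_BYTES = 2 * 1024**2    # assumption: about 2 MiB per stored record
ASSUMED_HEIGHT = ASSUMED_WIDTH = 640  # assumption: input resolution

shuffle_gib = to_gib(buffer_bytes(SHUFFLE_BUFFER_ITEMS, ASSUMED_RECORD_BYTES))
print(f"shuffle buffer: about {shuffle_gib:.2f} GiB")

# Memory for one batch of decoded images scales linearly with batch size.
for batch_size in (24, 12, 6):  # assumption: candidate batch sizes
    per_image = decoded_image_bytes(ASSUMED_HEIGHT, ASSUMED_WIDTH)
    batch_gib = to_gib(buffer_bytes(batch_size, per_image))
    print(f"batch_size={batch_size}: about {batch_gib:.3f} GiB per batch")
```

Under these assumed numbers the shuffle buffer alone would take about 4 GiB, so a smaller buffer or smaller images may matter as much as the batch size. The real figures depend on the dataset and on what the input pipeline actually keeps in memory.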

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/58186074/
