python - Calling tf.Session() twice causes fatal error: failed to get device attribute 13 for device 0

Tags: python tensorflow

I just installed TensorFlow 1.14.0 with CUDA 10.0.130 and cuDNN v7.6.1.34. The first call to tf.Session() in a Python session works fine, but a second call crashes, even though I closed the first session.

A minimal example that reproduces the error:

(tensorflow-gpu) C:\Users\Argen>python
Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> a = tf.Session()
2019-07-20 12:04:23.279225: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2019-07-20 12:04:23.912859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce 940M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:01:00.0
2019-07-20 12:04:23.921996: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-07-20 12:04:23.927364: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-20 12:04:23.931103: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-07-20 12:04:23.938320: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce 940M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:01:00.0
2019-07-20 12:04:23.944323: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-07-20 12:04:23.950175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-20 12:04:26.671775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-20 12:04:26.678254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2019-07-20 12:04:26.681610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2019-07-20 12:04:26.686087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1391 MB memory) -> physical GPU (device: 0, name: GeForce 940M, pci bus id: 0000:01:00.0, compute capability: 5.0)
>>> a.close()
>>> a = tf.Session()
2019-07-20 12:06:57.801849: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error

My environment: Windows 10 Pro, Intel(R) HD Graphics 520 and NVIDIA GeForce 940M, Python 3.7.3.

Best answer

By default, TensorFlow allocates GPU memory for the lifetime of the process, not for the lifetime of the session object. More details: https://www.tensorflow.org/programmers_guide/using_gpu#allowing_gpu_memory_growth

So if you want to free the memory, you have to exit the Python interpreter, not just close the session.
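
One way to act on this without restarting an interactive session by hand is to run each TensorFlow workload in its own short-lived interpreter, so the GPU memory goes away when that process exits. The following is only a sketch under assumptions: `run_isolated` and `TF_SNIPPET` are hypothetical names introduced here, the snippet assumes the TF 1.x API from the question, and whether this avoids the crash on the asker's machine is not confirmed.

```python
import subprocess
import sys

# Hypothetical TF 1.x workload (an assumption for illustration).
# It is only a string here; it runs inside the child interpreter.
TF_SNIPPET = """
import tensorflow as tf
with tf.Session() as sess:
    print(sess.run(tf.constant(21) * 2))
"""


def run_isolated(code, timeout=None):
    """Run `code` in a fresh Python interpreter and return its stdout.

    Any resources the child process holds, including GPU memory
    allocated by TensorFlow, are released when that process exits.
    Raises subprocess.CalledProcessError if the child exits non-zero.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output as str
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

With TensorFlow installed, calling `run_isolated(TF_SNIPPET)` twice in a row starts two separate processes, so the second call never reuses a process that already held the GPU.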

Hope this helps.

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/57121326/
