python - W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries

Tags: python tensorflow installation centos centos7

How can I fix this problem on CentOS 7?

[jalal@goku ~]$ pip freeze | grep tensorflow
tensorflow-estimator==2.2.0
tensorflow-gpu==2.2.0
[jalal@goku ~]$ python
Python 3.8.5 (default, Mar 31 2021, 02:37:07) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-06-07 23:50:07.811271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-07 23:50:07.867796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:05:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-07 23:50:07.869403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:06:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-07 23:50:07.870136: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:
2021-06-07 23:50:07.874249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-07 23:50:07.877819: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-06-07 23:50:07.878745: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-06-07 23:50:07.882687: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-06-07 23:50:07.884788: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-06-07 23:50:07.890952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-07 23:50:07.891011: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Num GPUs Available:  0
This happens even though the machine has two GPUs (the original post included a screenshot here).
[jalal@goku ~]$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.9.2009 (Core)
Release:    7.9.2009
Codename:   Core
Also:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
I tried the following, as suggested in https://github.com/tensorflow/tensorflow/issues/38194#issuecomment-629801937, but it did not work:
[jalal@goku djrn]$ ls /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2
lrwxrwxrwx. 1 root root 20 Sep 21  2020 /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 -> libcudart.so.10.2.89
[jalal@goku djrn]$ sudo ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 /usr/lib/x86_64-linux-gnu/libcudart.so.10.1
[sudo] password for jalal: 
ln: failed to create symbolic link ‘/usr/lib/x86_64-linux-gnu/libcudart.so.10.1’: No such file or directory
because:
ls: cannot access /usr/lib/x86_64-linux-gnu: No such file or directory
Specifically, I need TensorFlow to work with CUDA 10.2. I can use any version of TensorFlow (2+ preferred), but I could not find one built for CUDA 10.2. https://www.tensorflow.org/install/source#tested_build_configurations
Also, based on the following check, my CUDA version is 10.2, which differs from the versions reported by nvidia-smi and nvcc --version:
$ stat /usr/local/cuda
  File: ‘/usr/local/cuda’ -> ‘/usr/local/cuda-10.2’
  Size: 20          Blocks: 0          IO Block: 4096   symbolic link
Device: fd00h/64768d    Inode: 67157410    Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:usr_t:s0
Access: 2021-05-20 10:43:06.864530636 -0400
Modify: 2020-09-21 09:39:18.559883390 -0400
Change: 2020-09-21 09:39:18.559883390 -0400
 Birth: -
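When several CUDA toolkits are installed side by side, as here (LD_LIBRARY_PATH points at cuda-10.0 while /usr/local/cuda points at cuda-10.2), it can help to list which libcudart versions each one actually ships. The following is a minimal sketch, not part of the original post; the function name find_cudart is hypothetical, and it assumes the toolkits live under /usr/local/cuda* with the lib64 and targets/*/lib layouts that appear in the paths above.

```python
import glob
import os


def find_cudart(prefix="/usr/local"):
    """Map every libcudart.so.* found under prefix/cuda*/ to its resolved real path."""
    patterns = [
        # Layout seen in LD_LIBRARY_PATH above: /usr/local/cuda-10.0/lib64
        os.path.join(prefix, "cuda*", "lib64", "libcudart.so.*"),
        # Layout seen in the ln command above: .../targets/x86_64-linux/lib
        os.path.join(prefix, "cuda*", "targets", "*", "lib", "libcudart.so.*"),
    ]
    found = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            # realpath follows symlinks such as libcudart.so.10.2 -> libcudart.so.10.2.89
            found[path] = os.path.realpath(path)
    return found


if __name__ == "__main__":
    for path, target in find_cudart().items():
        print(f"{path} -> {target}")
```

If no toolkit provides libcudart.so.10.1, the loader warning in the log is expected regardless of what /usr/local/cuda points to.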
P.S.: I created my virtual environment with the python venv command and do not want to use conda or pyenv.
P.P.S.: I created this soft link, but it still does not work:
(djrn) [jalal@goku djrn]$ sudo ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 /usr/lib/libcudart.so.10.1
[sudo] password for jalal: 
(djrn) [jalal@goku djrn]$ ls /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2
lrwxrwxrwx. 1 root root 20 Sep 21  2020 /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 -> libcudart.so.10.2.89
(djrn) [jalal@goku djrn]$ python
Python 3.8.5 (default, Mar 31 2021, 02:37:07) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-06-08 01:40:39.152040: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-08 01:40:39.401399: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:05:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-08 01:40:39.403106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:06:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-08 01:40:39.403438: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:
2021-06-08 01:40:39.406985: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-08 01:40:39.410320: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-06-08 01:40:39.410912: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-06-08 01:40:39.414628: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-06-08 01:40:39.416297: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-06-08 01:40:39.422208: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-08 01:40:39.422260: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Num GPUs Available:  0
>>> 

Best answer

Credit to jonno_FTW.

$ sudo ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 /usr/lib/x86_64-linux-gnu/libcudart.so.10.1
$ export LD_LIBRARY_PATH=/usr/lib
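solved the problem. Note that /usr/lib/x86_64-linux-gnu does not exist on this CentOS machine, so the link that actually takes effect is /usr/lib/libcudart.so.10.1 from the P.P.S. in the question, which LD_LIBRARY_PATH=/usr/lib makes visible to the dynamic loader. I now see the following output: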
(djrn) [jalal@goku djrn]$ python
Python 3.8.5 (default, Mar 31 2021, 02:37:07) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-06-08 01:45:59.138197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-08 01:45:59.191833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:05:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-08 01:45:59.193773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:06:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-06-08 01:45:59.194216: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-06-08 01:45:59.197372: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-08 01:45:59.200555: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-06-08 01:45:59.201078: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-06-08 01:45:59.204664: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-06-08 01:45:59.206295: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-06-08 01:45:59.212072: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-08 01:45:59.217509: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
Num GPUs Available:  2
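
To check which libraries the dynamic loader can find before starting TensorFlow, one option is to try opening each of them with ctypes. This is a minimal sketch, not part of the original answer: the helper name probe_libraries is hypothetical, and the library names are copied from the dso_loader log lines above for tensorflow-gpu 2.2.0. Run it from the same shell in which LD_LIBRARY_PATH was exported, since the loader reads that variable at process start.

```python
import ctypes

# Library names taken from the dso_loader log lines above (tensorflow-gpu 2.2.0).
TF22_LIBS = [
    "libcuda.so.1",
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]


def probe_libraries(names, loader=ctypes.CDLL):
    """Try to load each library; map name -> None on success, error text on failure."""
    results = {}
    for name in names:
        try:
            loader(name)
            results[name] = None
        except OSError as exc:
            # ctypes.CDLL raises OSError when dlopen cannot find or load the file
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in probe_libraries(TF22_LIBS).items():
        status = "OK" if error is None else f"MISSING ({error})"
        print(f"{name}: {status}")
```

Any name reported as missing here should match a "Could not load dynamic library" warning in the TensorFlow log.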

Source: this question on Stack Overflow, https://stackoverflow.com/questions/67881193/
