I'm trying to set up TensorFlow with GPU support in WSL2. I'm following this guide.
When I run this code:
>>> from tensorflow import keras
>>> import numpy as np
>>> t = np.ones([5,32,32,3])
>>> c = keras.layers.Conv2D(32, 3, activation="relu")
>>> c(t)
I get this error:
2023-07-09 09:59:38.820408: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:39.031437: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:39.031864: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:39.034068: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:39.034535: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:39.034921: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:40.590457: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:40.590941: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:40.591052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1722] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-07-09 09:59:40.591459: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-07-09 09:59:40.591526: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3858 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5
Could not load library libcublasLt.so.12. Error: libcublasLt.so.12: cannot open shared object file: No such file or directory
Aborted
What confuses me is that when I run this code instead:
>>> from tensorflow import keras
>>> import numpy as np
>>> t = np.ones([5,32,32,3])
>>> c = keras.layers.Dense(32, activation="relu")
>>> c(t)
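I get the expected output and no error.
- I tried reinstalling CUDA and cuDNN
- I tried installing everything on fresh WSL installs of Ubuntu 20.04 and 22.04.2
- I tried TensorFlow 2.10, 2.11, 2.12 and 2.13
- I also tried
apt install libcublasLt
None of this helped.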
Environment:
- Windows 11 Home
- WSL 2
- Intel i7-9750H
- Nvidia RTX 2060 (laptop)
- tensorflow 2.12.1
- Python 3.9
- WSL2 Ubuntu 20.04
- CUDA 11.8
- cuDNN 8.6
I'm also running this inside a conda environment.
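Best answer
Like you, I suddenly started seeing these errors asking for CUDA 12 libraries while using TensorFlow with CUDA 11.
After some searching, I found that the libcublas-12-0 Ubuntu package does provide all the required files: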
/usr/local/cuda-12.0/targets/x86_64-linux/lib/
libcublas.so.12 -> libcublas.so.12.0.2.224
libcublasLt.so.12 -> libcublasLt.so.12.0.2.224
libnvblas.so.12 -> libnvblas.so.12.0.2.224
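To see which of these sonames the dynamic loader can actually resolve on a given machine, something like the following may help. It is only a minimal sketch: the helper names can_load and report are made up for illustration, and it simply tries to open each library with ctypes, which roughly mirrors the lookup that fails in the error message.

```python
import ctypes

# Sonames taken from the package listing above.
CUBLAS_SONAMES = ["libcublas.so.12", "libcublasLt.so.12", "libnvblas.so.12"]


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False


def report(sonames=CUBLAS_SONAMES) -> dict:
    """Map each soname to whether it could be loaded in this process."""
    return {name: can_load(name) for name in sonames}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it in the same shell and conda environment as TensorFlow, since the result depends on that process's library search path.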
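However, I didn't want to mess with my perfectly working CUDA 11 installation, so I just downloaded the package manually and extracted the required files into my current CUDA directory: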
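# Add the CUDA apt repo first: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-ubuntu
sudo apt update
# Download the .deb into the current directory without installing it
sudo apt download libcublas-12-0
mkdir contents
# Unpack the package into ./contents (use the version and arch from the downloaded file name)
dpkg-deb -xv libcublas-12-0_<VERSION>_<ARCH>.deb contents/
# Move the CUDA 12 cuBLAS libraries next to the existing CUDA 11 ones
sudo mv contents/usr/local/cuda-12.0/targets/x86_64-linux/lib/* /usr/local/cuda/lib64/
rm -rf contents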
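To double-check that the files ended up where expected, a small script along these lines could be used. This is a sketch, not part of the original steps: find_cublas12 is a hypothetical helper, and it only lists matching file names in the target directory from the mv command. It does not prove that TensorFlow's loader searches that directory.

```python
from pathlib import Path


def find_cublas12(lib_dir: str) -> list:
    """Return sorted names of CUDA 12 cuBLAS-family files found in lib_dir."""
    directory = Path(lib_dir)
    if not directory.is_dir():
        return []
    patterns = ("libcublas.so.12*", "libcublasLt.so.12*", "libnvblas.so.12*")
    found = set()
    for pattern in patterns:
        found.update(p.name for p in directory.glob(pattern))
    return sorted(found)


if __name__ == "__main__":
    # Target directory used in the mv command above.
    for name in find_cublas12("/usr/local/cuda/lib64"):
        print(name)
```

If the list is empty, the move most likely went to a different directory than the one your CUDA installation uses.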
I still have no idea why TensorFlow needs these libraries in the first place...
Source: Stack Overflow question "Could not load library libcublasLt.so.12. Error: libcublasLt.so.12: cannot open shared object file: No such file or directory, Conv2D TensorFlow": https://stackoverflow.com/questions/76646474/