linux - Installing tensorflow: cannot stat '/usr/include/cudnn.h'

Tags: linux tensorflow

I am trying to install tensorflow on my Jetson TX2, so I am following the Jetsonhacks tutorial: https://www.youtube.com/watch?v=V51IO7kNXCg

When I try to run ./setTensorFlowEV.sh, I get the following output:

~/installTensorFlowTX2$ ./setTensorFlowEV.sh 
mkdir: cannot create directory ‘/usr/lib/aarch64-linux-gnu/include/’: File exists
cp: cannot stat '/usr/include/cudnn.h': No such file or directory
You have bazel 0.5.2- installed.
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Invalid path to CUDA 8.0 toolkit. /usr/local/cuda/lib64/libcudart.so.8.0 cannot be found

Contents of the setTensorFlowEV.sh file: https://github.com/jetsonhacks/installTensorFlowTX2/blob/master/setTensorFlowEV.sh

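I tried to find cudnn.h on my system ($ locate cudnn.h), but it is nowhere to be found. I also looked up which package I would need to install to get the shared object (sudo apt-file search libcudart.so.8.0), but that returned nothing either.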

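So I would like to know what I should do to stop getting this error message.

Important note: I do not have physical access to the board to flash it or anything like that.

I have tried disabling cuda like this: TF_NEED_CUDA=0

This gives: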

~/installTensorFlowTX2$ ./setTensorFlowEV.sh 
mkdir: cannot create directory ‘/usr/lib/aarch64-linux-gnu/include/’: File exists
cp: cannot stat '/usr/include/cudnn.h': No such file or directory
You have bazel 0.5.2- installed.
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Configuration finished

But when trying to build Tensorflow I get:

~/installTensorFlowTX2$ ./buildTensorFlow.sh 
ERROR: /home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD:4:1: Traceback (most recent call last):
    File "/home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD", line 4
        error_gpu_disabled()
    File "/home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/error_gpu_disabled.bzl", line 3, in error_gpu_disabled
        fail("ERROR: Building with --config=c...")
ERROR: Building with --config=cuda but TensorFlow is not configured to build with GPU support. Please re-run ./configure and enter 'Y' at the prompt to build with GPU support.
ERROR: no such target '@local_config_cuda//crosstool:toolchain': target 'toolchain' not declared in package 'crosstool' defined by /home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD.
INFO: Elapsed time: 0.403s

I do not have a ./configure script anywhere, so I set the line export TF_NEED_CUDA=0 in my ./buildTensorFlow.sh file like this:

#this is my modified buildTensorFlow.sh file
export TF_NEED_CUDA=0
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=6.0.21
export CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu/
export TF_CUDA_COMPUTE_CAPABILITIES=6.2

# Build Tensorflow
cd $HOME/tensorflow
bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package
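A note on the error above: the script sets TF_NEED_CUDA=0 but still passes --config=cuda to bazel, which is the combination the error message complains about. The following is only a sketch of keeping the two consistent. The function name bazel_build_command is hypothetical, and it prints the command rather than running it; the flags themselves are copied from the script above.

```shell
#!/usr/bin/env bash
# Sketch: print the bazel command line, adding --config=cuda only when
# CUDA support is enabled (first argument "1"). Nothing is executed here.
# bazel_build_command is a hypothetical helper, not part of installTensorFlowTX2.
bazel_build_command() {
    local need_cuda="${1:-0}"
    local cmd="bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures"
    if [ "$need_cuda" = "1" ]; then
        cmd="$cmd --config=cuda"
    fi
    echo "$cmd //tensorflow/tools/pip_package:build_pip_package"
}

# Usage:
#   bazel_build_command "$TF_NEED_CUDA"
```

Called with 0 it prints a CPU-only build command; called with 1 it prints the same command as in the script above.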

Best answer

DISCLAIMER for other readers: I cannot test this, and I am assuming that the board has previously been flashed with an Nvidia L4T Ubuntu 16.04 image. If it has not, stop reading and good luck: the board needs to be flashed with that image to run reliably and to be stable for an embedded application. Any deviation from that may cause all sorts of unknown behavior.

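The OP states that the board was flashed with L4T 27.1. That release corresponds to Nvidia JetPack 3.0, which you can download from the Nvidia archives. To find out which JetPack version your L4T release needs, check Nvidia's version listing.

After downloading JetPack, we need to unpack it and run one of its internal binaries to create the repository json file.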

bash ./JetPack-L4T-3.0-linux-x64.run --noexec
cd _installer
./Chooser

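Chooser requires (at least) libpng12 to be installed on your host. If you look in the directory afterwards, it has generated a repository.json that we need to inspect. From that file it appears that NVIDIA provides the same packages for the TX1 and the TX2, so we need to focus on the TX1 packages.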
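One way to pull the relevant download links out of repository.json is sketched below. It assumes the file stores each URL as a quoted http(s) string and that the package names contain "cuda-repo" or "cudnn"; neither assumption has been checked against the real file, and list_package_urls is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list candidate CUDA / cuDNN download URLs found in a JSON file.
# Assumption: URLs appear as quoted "http(s)://..." strings in the file.
# list_package_urls is a hypothetical helper, not part of JetPack.
list_package_urls() {
    local json_file="$1"
    grep -oE 'https?://[^"]+' "$json_file" \
        | grep -iE 'cuda-repo|cudnn' \
        | sort -u
}

# Usage (inside the _installer directory):
#   list_package_urls repository.json
```

If the output is empty, open the file by hand and adjust the patterns.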

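Inspecting the json shows the two packages we need: the cuda repository package (cuda-repo*.deb) and the cuDNN archive (cuDNN-....zip).

You have to download these two packages on the board over ssh (wget http...).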

The first one you should install is the cuda repository:

 sudo dpkg -i cuda-repo*.deb

This makes many packages available locally, such as libcudart, which you need to install:

 sudo apt update
 sudo apt install cuda-toolkit-8.0 # (this may be enough)
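To check whether the toolkit install was in fact enough, you can test for the file that the configure step in the question complained about. This is a minimal sketch: missing_files is a hypothetical helper, and the path in the usage comment is simply the one from the question's error message.

```shell
#!/usr/bin/env bash
# Sketch: print every given path that does not exist.
# Returns 0 if all paths exist, 1 if at least one is missing.
# missing_files is a hypothetical helper name.
missing_files() {
    local path status=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Usage on the board:
#   missing_files /usr/local/cuda/lib64/libcudart.so.8.0
```

The same helper can be reused after the cuDNN step by passing it /usr/include/cudnn.h.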

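There are other packages that may need to be installed (use ls /var/cuda* to list them all).

To install cudnn, you have to unzip the file downloaded earlier into a temporary directory: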

unzip cuDNN-....zip
cd cuDNN

There are three deb files to install:

sudo dpkg -i *.deb

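This should install all the needed files in the correct directories. At this point you should try restarting the build process. Before that, though, I would change the cuDNN version line (TF_CUDNN_VERSION, set to 6.0.21 in the script shown in the question) to version 5.1.x (5.1.5 in this case).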
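The version change can be made by hand in an editor, or with something like the sketch below. It assumes GNU sed and that the script contains a line of the form "export TF_CUDNN_VERSION=<value>", as the script in the question does; set_cudnn_version is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the TF_CUDNN_VERSION export of a script in place.
# Assumptions: GNU sed (for -i) and a line of the form
# "export TF_CUDNN_VERSION=<value>" in the target script.
# set_cudnn_version is a hypothetical helper name.
set_cudnn_version() {
    local script="$1" version="$2"
    sed -i "s/^export TF_CUDNN_VERSION=.*/export TF_CUDNN_VERSION=${version}/" "$script"
}

# Usage:
#   set_cudnn_version setTensorFlowEV.sh 5.1.5
```

Keep a copy of the original script before editing it in place.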

Regarding linux - Installing tensorflow: cannot stat '/usr/include/cudnn.h', a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46544757/
