python - Enabling multithreading on Caffe2

When compiling my program with Caffe2, I get the following warnings:

[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.

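Since I do want multithreading support in Caffe2, I searched for what to do. I found that Caffe2 has to be recompiled with certain options set, either when invoking cmake or in the CMakeLists file.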

Since I already had pytorch installed in a conda environment, I first uninstalled Caffe2 with:

pip uninstall -y caffe2

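Then I followed the instructions in the Caffe2 docs to build it from source. I first installed the dependencies as indicated, then downloaded pytorch inside my conda environment: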

git clone https://github.com/pytorch/pytorch.git && cd pytorch
git submodule update --init --recursive

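At this point I think it is time to change the pytorch\caffe2\CMakeLists file I just downloaded. I have read that enabling the option USE_NATIVE_ARCH in this CMakeLists is enough to enable multithreading support, but I cannot find that option in the file. Maybe I am doing something wrong. Any thoughts? Thanks.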

Here are some details about my platform:

I am on macOS Big Sur
My python version is 3.8.5

Update:

To answer Nega, this is what I get:

python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
    at::get_num_threads() : 1
    at::get_num_interop_threads() : 4
OpenMP not found
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
    mkl_get_max_threads() : 4
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
std::thread::hardware_concurrency() : 8
Environment variables:
    OMP_NUM_THREADS : [not set]
    MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP

Update 2:

It turned out that the Clang that ships with Xcode does not support OpenMP. The gcc I was using was just a symbolic link to Clang. In fact, running gcc --version gave me:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

I installed gcc-10 from Homebrew and set an alias with alias gcc='gcc-10'. Now gcc --version gives me:

gcc-10 (Homebrew GCC 10.2.0_4) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
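
The check described above (is the gcc on the PATH really GCC, or Apple clang under another name?) can be sketched in Python. This is a minimal sketch, not part of the original post: the helper names is_apple_clang and compiler_version are hypothetical, and the only assumption about the output is the "Apple clang" banner shown earlier. Because it looks the command up on the PATH, it sees the same binary a build tool launched as a subprocess would see; a shell alias is not visible to it.

```python
import shutil
import subprocess
from typing import Optional


def is_apple_clang(version_text: str) -> bool:
    """Return True if `<compiler> --version` output carries the Apple clang banner."""
    return "apple clang" in version_text.lower()


def compiler_version(command: str = "gcc") -> Optional[str]:
    """Run `<command> --version` and return its output, or None if not on the PATH."""
    path = shutil.which(command)  # PATH lookup only; shell aliases are ignored
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some toolchains print part of the banner to stderr, so keep both streams.
    return result.stdout + result.stderr


if __name__ == "__main__":
    text = compiler_version("gcc")
    if text is None:
        print("gcc not found on PATH")
    elif is_apple_clang(text):
        print("gcc on PATH is Apple clang")
    else:
        print("gcc on PATH does not report itself as Apple clang")
```

Passing "gcc-10" instead of "gcc" checks the Homebrew binary directly.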

I also tried a simple OpenMP Hello World using 8 threads, and everything seemed to work. However, after re-running the command:

python3 -c 'import torch; print(torch.__config__.parallel_info())'

I got the same result. Any thoughts?

Best answer

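AVX, AVX2, and FMA are CPU instruction sets and have nothing to do with multithreading. If the pip package for pytorch/caffe2 used these instructions on a CPU that does not support them, the software would not run, so the warning only means the prebuilt binary was compiled without them. Pytorch installed via pip does have multithreading enabled. You can confirm this with torch.__config__.parallel_info():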

❯ python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
    at::get_num_threads() : 6
    at::get_num_interop_threads() : 6
OpenMP 201107 (a.k.a. OpenMP 3.1)
    omp_get_max_threads() : 6
Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
    mkl_get_max_threads() : 6
Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
std::thread::hardware_concurrency() : 12
Environment variables:
    OMP_NUM_THREADS : [not set]
    MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
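
To compare two such reports without reading them line by line, the text can be turned into a dictionary. The following is a minimal sketch that assumes only the "name : value" layout visible in the output above; parse_parallel_info is a hypothetical helper, not a torch API, and SAMPLE is an excerpt of the report from the question.

```python
def parse_parallel_info(text: str) -> dict:
    """Collect the `name : value` lines of a parallel_info() report into a dict."""
    info = {"openmp_found": False}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # The OpenMP line is either "OpenMP not found" or "OpenMP <version> ...".
        if line.startswith("OpenMP") and "not found" not in line:
            info["openmp_found"] = True
        if " : " in line:
            key, value = line.split(" : ", 1)
            info[key.strip()] = value.strip()  # values are kept as strings
    return info


# Excerpt of the report posted in the question.
SAMPLE = """ATen/Parallel:
    at::get_num_threads() : 1
    at::get_num_interop_threads() : 4
OpenMP not found
std::thread::hardware_concurrency() : 8
"""

report = parse_parallel_info(SAMPLE)
print(report["openmp_found"], report["at::get_num_threads()"])
```

With torch installed, the same function can be fed the live report: parse_parallel_info(torch.__config__.parallel_info()).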

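That said, if you still want to build pytorch and caffe2 from source, the flag you are looking for, USE_NATIVE, is in pytorch/CMakeLists.txt, one level up from caffe2. Edit that file and change USE_NATIVE to ON, then continue building pytorch with python3 setup.py build. Note that the USE_NATIVE flag does not do what you think it does. It only allows MKL-DNN to be built with CPU-native optimization flags. It does not trickle down to caffe2 (except where caffe2 uses MKL-DNN, obviously).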
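
The manual edit described above could also be scripted. This is a minimal sketch under stated assumptions: it assumes the flag is declared with CMake's standard option(NAME "help text" ON|OFF) form, set_cmake_option is a hypothetical helper, and the sample line is made up for illustration (the real help text in CMakeLists.txt will differ).

```python
import re


def set_cmake_option(cmake_text: str, name: str, value: str = "ON") -> str:
    """Rewrite the default of `option(<name> "<help>" ON|OFF)` to `value`."""
    pattern = re.compile(
        r'(option\(\s*' + re.escape(name) + r'\s+"[^"]*"\s+)(ON|OFF)(\s*\))'
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(3), cmake_text
    )
    if count == 0:
        # Fail loudly rather than leave the file unchanged without notice.
        raise ValueError(f"option {name!r} not found")
    return new_text


# Hypothetical option line, for illustration only.
sample = 'option(USE_NATIVE "Use native CPU optimizations" OFF)\n'
print(set_cmake_option(sample, "USE_NATIVE"), end="")
```

The name must match exactly, so if the option is spelled differently in your checkout (the question refers to it as USE_NATIVE_ARCH), the function raises ValueError instead of silently doing nothing.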

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/66315250/
