python - Keras 看到我的 GPU 但在训练神经网络时不使用它

我的 GPU 没有被 Keras/TensorFlow 使用。

为了让我的 GPU 与 tensorflow 一起工作，我通过 pip 安装了 tensorflow-gpu(我在 Windows 上使用 Anaconda)

我有英伟达1080ti

print(tf.test.is_gpu_available())

True

print(tf.config.experimental.list_physical_devices())

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), 
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

我绑

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

但没用

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))
print(sess)

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1

<tensorflow.python.client.session.Session object at 0x000001A2A3BBACF8>

仅来自 tf 的警告:

W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows

整个日志:

2019-10-18 20:06:26.094049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-10-18 20:06:35.078225: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-10-18 20:06:35.090832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-10-18 20:06:35.180744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:35.185505: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:35.189328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:06:35.898592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:06:35.901683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:06:35.904235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:06:35.906687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:06:38.694481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:38.700482: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:38.704020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[I 20:06:47.324 NotebookApp] Saving file at /Untitled.ipynb
2019-10-18 20:07:22.227110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:22.246012: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:22.261643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:22.272150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:22.275457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:22.277980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:22.316260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2019-10-18 20:07:32.986802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:32.990509: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:32.993763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:32.995570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:32.997920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:32.999435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:33.001380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:07:36.048204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-10-18 20:07:37.971703: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-10-18 20:07:38.576861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll

还尝试用 pip 重新安装 tensorflow-gpu

为什么我认为 GPU 不起作用？ - 因为我的 python 内核使用 CPU 99%、RAM 99%，有时 GPU ~7%，但大多数时候是 0
我使用自定义数据生成器，但现在它只选择批处理并调整它们的大小(skimage.io.resize) 1 个纪元 ~ 44s 也有奇怪的行为，每 ~10 个样本在随机点卡住，最后一个样本几乎不卡住(37/38)(~10-15 秒)

编辑:

我发布了我的自定义数据生成 here

train_gen = DataGenerator(x = x_train,
                              y = y_train,
                              batch_size = 128,
                              target_shape = (100, 100, 3), 
                              sample_std = False,
                              feature_std = False,
                              proj_parameters = None,
                              blur_parameters = None,
                              nois_parameters = None,
                              flip_parameters = None,
                              gamm_parameters = None)

验证是一样的

更新:

所以它是一个引起问题的生成器，但我该如何解决它？
我只使用了 skimage 和 numpy 操作

最佳答案

日志显示确实使用了 GPU。您几乎肯定会遇到 IO 瓶颈:您的 GPU 正在处理 CPU 扔给它的任何东西，速度快于 CPU 加载和预处理它的速度。这在深度学习中很常见，并且有很多方法可以解决它。

如果不详细了解您的数据管道(批处理的字节大小、预处理步骤……)以及数据的存储方式，我们将无法提供很多帮助。加快速度的一种典型方法是将数据存储为二进制格式，例如 TFRecords，以便 CPU 可以更快地加载它。查看official documentation for this.

编辑:我快速浏览了您的输入管道。这个问题很可能确实是由 IO 造成的:

您还应该在 GPU 上运行预处理步骤，您使用的大量增强技术都在 tf.image 中实现。如果可以，您应该考虑使用 Tensorflow 2.0，因为它包含 Keras，并且其中还有很多帮助程序。
检查 tf.data.Dataset API，它有很多帮助程序可以在不同的线程中加载所有数据，这可以根据您拥有的内核数量粗略地加快进程。<
您应该将图像存储为 TFRecords。如果您的输入图像较小，这可能会将加载速度提高一个数量级。
您也可以尝试更大的批量大小，我认为您的图像可能非常小。

关于python - Keras 看到我的 GPU 但在训练神经网络时不使用它，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/58455765/

python - Keras 看到我的 GPU 但在训练神经网络时不使用它

编辑:

更新:

上一篇：python - Pandas 数据框 : how to fill `nan` with 0s but only nans existing between valid values?

下一篇：python - main() 接受 0 个位置参数，但给出了 2 个