neural-network - Batch size does not work with Caffe's deploy.prototxt

Tags: neural-network classification deep-learning caffe pycaffe

I am trying to speed up my classification a bit. I wanted to increase the first input_dim in my deploy.prototxt, but that does not seem to work. It is even slightly slower than classifying the images one by one.

deploy.prototxt

input: "data"  
input_dim: 128  
input_dim: 1  
input_dim: 120  
input_dim: 160  
... net description ...

Python net initialization

net = caffe.Net('deploy.prototxt', 'model.caffemodel', caffe.TEST)
net.blobs['data'].reshape(128, 1, 120, 160)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# transformer settings
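
The transformer settings are elided in the question. For single-channel images loaded with caffe.io.load_image(path, False), a typical setup might look like the sketch below; whether raw scaling or mean subtraction is needed depends on how the model was trained, so these calls are assumptions rather than the original configuration:

transformer.set_transpose('data', (2, 0, 1))  # HxWxC -> CxHxW, as Caffe expects
transformer.set_raw_scale('data', 255.0)      # assumption: model trained on [0, 255] inputs
# transformer.set_mean('data', mean_array)    # hypothetical mean_array, if mean subtraction was used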

Python classification

images = [None] * 128
for i in range(len(images)):
  images[i] = caffe.io.load_image('image_path', False)  # load as grayscale
for j in range(len(images)):
  net.blobs['data'].data[j, :, :, :] = transformer.preprocess('data', images[j])
out = net.forward()['prob']

I skipped some details, but this should cover the important parts. I tried different batch sizes, e.g. 32, 64, ..., 1024, but they all perform nearly the same. So my question is: does anyone know what I am doing wrong or what I need to change? Thanks for your help!

Edit:
Some timing results; the avg times are simply the total time divided by the number of processed images (1044).

Batch size: 1

2016-05-04 10:51:20,721 - detector - INFO - data shape: (1, 1, 120, 160)
2016-05-04 10:51:35,149 - main - INFO - GPU timings:
2016-05-04 10:51:35,149 - main - INFO - processed images: 1044
2016-05-04 10:51:35,149 - main - INFO - total-time: 14.43s
2016-05-04 10:51:35,149 - main - INFO - avg-time: 13.82ms
2016-05-04 10:51:35,149 - main - INFO - load-time: 8.31s
2016-05-04 10:51:35,149 - main - INFO - avg-load-time: 7.96ms
2016-05-04 10:51:35,149 - main - INFO - classify-time: 5.99s
2016-05-04 10:51:35,149 - main - INFO - avg-classify-time: 5.74ms

Batch size: 32

2016-05-04 10:52:30,773 - detector - INFO - data shape: (32, 1, 120, 160)
2016-05-04 10:52:45,135 - main - INFO - GPU timings:
2016-05-04 10:52:45,135 - main - INFO - processed images: 1044
2016-05-04 10:52:45,135 - main - INFO - total-time: 14.36s
2016-05-04 10:52:45,136 - main - INFO - avg-time: 13.76ms
2016-05-04 10:52:45,136 - main - INFO - load-time: 7.13s
2016-05-04 10:52:45,136 - main - INFO - avg-load-time: 6.83ms
2016-05-04 10:52:45,136 - main - INFO - classify-time: 7.13s
2016-05-04 10:52:45,136 - main - INFO - avg-classify-time: 6.83ms

Batch size: 128

2016-05-04 10:53:17,478 - detector - INFO - data shape: (128, 1, 120, 160)
2016-05-04 10:53:31,299 - main - INFO - GPU timings:
2016-05-04 10:53:31,299 - main - INFO - processed images: 1044
2016-05-04 10:53:31,299 - main - INFO - total-time: 13.82s
2016-05-04 10:53:31,299 - main - INFO - avg-time: 13.24ms
2016-05-04 10:53:31,299 - main - INFO - load-time: 7.06s
2016-05-04 10:53:31,299 - main - INFO - avg-load-time: 6.77ms
2016-05-04 10:53:31,299 - main - INFO - classify-time: 6.66s
2016-05-04 10:53:31,299 - main - INFO - avg-classify-time: 6.38ms

Batch size: 1024

2016-05-04 10:54:11,546 - detector - INFO - data shape: (1024, 1, 120, 160)
2016-05-04 10:54:25,316 - main - INFO - GPU timings:
2016-05-04 10:54:25,316 - main - INFO - processed images: 1044
2016-05-04 10:54:25,316 - main - INFO - total-time: 13.77s
2016-05-04 10:54:25,316 - main - INFO - avg-time: 13.19ms
2016-05-04 10:54:25,316 - main - INFO - load-time: 7.04s
2016-05-04 10:54:25,316 - main - INFO - avg-load-time: 6.75ms
2016-05-04 10:54:25,316 - main - INFO - classify-time: 6.63s
2016-05-04 10:54:25,316 - main - INFO - avg-classify-time: 6.35ms

Best answer

I am quite sure the problem is in these lines:

for j in range(len(images)):
  net.blobs['data'].data[j, :, :, :] = transformer.preprocess('data', images[j])
out = net.forward()['prob']

Doing this only sets the single image from the last iteration of the for loop as the network's sole input. Try stacking the N images up front (say, into stackedimages) and calling that line only once, e.g.

stackedimages = []
for j in range(len(images)):
  stackedimages.append(transformer.preprocess('data', images[j]))

and then call

net.blobs['data'].data[...] = stackedimages
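
Put together, a minimal batched-inference sketch could look like the following. It assumes the net and transformer from the question and a hypothetical list image_paths of 128 file names (the question only shows the placeholder 'image_path'):

import numpy as np

# Hypothetical list of 128 grayscale image paths (not in the original question).
batch = [transformer.preprocess('data', caffe.io.load_image(p, False))
         for p in image_paths]

# Copy the whole stacked batch into the input blob with a single assignment,
# then run one forward pass over all 128 images.
net.blobs['data'].data[...] = np.stack(batch)
out = net.forward()['prob']  # shape: (128, num_classes)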

Regarding "neural-network - Batch size does not work with Caffe's deploy.prototxt", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37003575/
