我在 Windows 上使用 Python 运行并行处理。这是我的代码:
from joblib import Parallel, delayed
def f(x):
return sqrt(x)
if __name__ == '__main__':
a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
这是错误信息:
Process PoolWorker-2:
Process PoolWorker-1:
Traceback (most recent call last):
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\pool.py", line 102, in worker
task = get()
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\User\lib\site-packages\joblib\pool.py", line 363, in get
return recv()
AttributeError: 'module' object has no attribute 'f'
最佳答案
根据 this site问题是特定于 Windows 的:
Yes: under linux we are forking, thus their is no need to pickle the function, and it works fine. Under windows, the function needs to be pickleable, ie it needs to be imported from another file. This is actually good practice: making modules pushes for reuse.
我试过你的代码,它在 Linux 下完美运行。
在 Windows 下,如果它从脚本运行,它运行正常,例如 python script_with_your_code.py
。但是在交互式 python session 中运行时失败。当我将 f
函数保存在单独的模块中并将其导入到我的交互式 session 中时,它对我有用。
不工作:
互动环节:
>>> from math import sqrt
>>> from joblib import Parallel, delayed
>>> def f(x):
... return sqrt(x)
>>> if __name__ == '__main__':
... a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
Process PoolWorker-1:
Traceback (most recent call last):
File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "C:\Python27\lib\multiprocessing\pool.py", line 102, in worker
task = get()
File "C:\Python27\lib\site-packages\joblib\pool.py", line 359, in get
return recv()
AttributeError: 'module' object has no attribute 'f'
工作:
有趣的.py
from math import sqrt
def f(x):
return sqrt(x)
互动环节:
>>> from joblib import Parallel, delayed
>>> from fun import f
>>> if __name__ == '__main__':
... a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
>>> a
[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]
关于Windows 上的 python joblib Parallel 即使添加了 "if __name__ == ' __main_ _':"也无法正常工作,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/35452694/