Python joblib Parallel on Windows does not work even after adding "if __name__ == '__main__':"

Tags: python windows python-2.7 parallel-processing syntax-error

I am running parallel processing with Python on Windows. Here is my code:

from math import sqrt  # f() below calls sqrt
from joblib import Parallel, delayed

def f(x): 
    return sqrt(x)

if __name__ == '__main__':
    a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))

Here is the error message:

Process PoolWorker-2:
Process PoolWorker-1:
Traceback (most recent call last):
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\User\lib\site-packages\joblib\pool.py", line 363, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'

Best answer

According to this site, the problem is specific to Windows:

Yes: under linux we are forking, thus their is no need to pickle the function, and it works fine. Under windows, the function needs to be pickleable, ie it needs to be imported from another file. This is actually good practice: making modules pushes for reuse.

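I tried your code and it runs perfectly under Linux. Under Windows it also runs fine when executed as a script, e.g. python script_with_your_code.py, but it fails when run in an interactive Python session. Once I saved the function f in a separate module and imported it into my interactive session, it worked for me.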
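The pickling requirement mentioned in the quote can be observed with the standard pickle module alone. The following is a minimal sketch, assuming Python 3 (the question itself uses Python 2.7), with math.sqrt standing in for any function that lives in an importable module. It illustrates pickling by reference in general and is not a reproduction of joblib's internals.

```python
import pickle
from math import sqrt

# A function is pickled by reference: the payload stores only the module
# name and the function name, never the function's code.
payload = pickle.dumps(sqrt)
print(b"math" in payload, b"sqrt" in payload)  # True True

# Unpickling imports the module and looks the name up again, which is
# what a worker process has to do on its side.
restored = pickle.loads(payload)
print(restored is sqrt)  # True

# A function with no importable name cannot be pickled at all.
try:
    pickle.dumps(lambda x: x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)  # False
```

This is consistent with the traceback in the question: a function defined only in the interactive session has no module the worker can import it from, so the name lookup for f fails on the worker's side.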

Not working:
Interactive session:

>>> from math import sqrt
>>> from joblib import Parallel, delayed

>>> def f(x):
...     return sqrt(x)

>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
Process PoolWorker-1:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python27\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "C:\Python27\lib\site-packages\joblib\pool.py", line 359, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'


Working:
fun.py

from math import sqrt

def f(x):
    return sqrt(x)

Interactive session:

>>> from joblib import Parallel, delayed
>>> from fun import f

>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
>>> a
[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]

Regarding "Python joblib Parallel on Windows does not work even after adding "if __name__ == '__main__':"", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35452694/
