When I run GridSearchCV() and RandomizedSearchCV() in parallel (with n_jobs>1 or n_jobs=-1 set), I get this message:
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
I put the code in a class in a .py file and call it under if __name__ == '__main__' in another .py file, but it still shows this message.
It works when n_jobs=1.
import platform; print(platform.platform())
Windows-10-10.0.10586-SP0
import numpy; print("NumPy", numpy.__version__)
NumPy 1.13.1
import scipy; print("SciPy", scipy.__version__)
SciPy 0.19.1
import sklearn; print("Scikit-Learn", sklearn.__version__)
Scikit-Learn 0.19.0
Update
I tried this code, but it still gives me the same error:
import numpy as np
import pandas as pd  # needed for pd.read_csv below
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

class Test():
    def __init__(self):
        attributes = [..]
        dataset = pd.read_csv("..")
        X = dataset[[..]]
        Y = dataset[...]
        model = DecisionTreeClassifier()
        model = RandomizedSearchCV(....)
        model.fit(X, Y)

if __name__ == '__main__':
    Test()
Best answer
joblib is known for this behavior, and its documentation is quite explicit about it:
Warning
Under Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses when using joblib.Parallel. In other words, you should be writing code like this:
import ....

def function1(...):
    ...

def function2(...):
    ...

...
if __name__ == '__main__':
    # do stuff with imports and functions defined above
    ...
No code should run outside of the "if __name__ == '__main__'" blocks, only imports and definitions.
So refactor your code to meet this clearly stated requirement, and it will start to benefit from the power of the joblib tools.
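As a rough illustration of that layout, here is a minimal sketch of a script where only imports and definitions sit at module level, and the parallel search is started from inside the guard. The function names run_search and main, the parameter grid, and the synthetic data from make_classification are assumptions made for this example; they stand in for the CSV data and the elided settings in the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier


def run_search(X, y, n_jobs=2):
    """Fit a randomized search over a decision tree and return the fitted search."""
    # Hypothetical search space, chosen only for illustration.
    param_distributions = {
        "max_depth": [2, 4, 6, 8, None],
        "min_samples_split": [2, 5, 10],
    }
    search = RandomizedSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_distributions=param_distributions,
        n_iter=5,
        cv=3,
        n_jobs=n_jobs,  # values other than 1 start worker processes
        random_state=0,
    )
    search.fit(X, y)
    return search


def main():
    # Synthetic stand-in for the dataset loaded from CSV in the question.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    search = run_search(X, y, n_jobs=2)
    print(search.best_params_)


if __name__ == "__main__":
    # Nothing above this line does any work at import time, so a worker
    # process that re-imports this file will not start a search of its own.
    main()
```

If the search lives in a class in a separate module, as in the question, the same idea should apply: the guard goes in the script that is actually launched, and no imported file should fit a model at module level.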
Regarding python - running CrossValidationCV in parallel, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48631907/