python - GridSearchCV - Error: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Tags: python scikit-learn grid-search gridsearchcv

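I am trying to do neural network classification in Python with scikit-learn.

I generated the data, split it into training and test sets, and used it with an MLPClassifier() model.

My next step is to use sklearn.model_selection.GridSearchCV to evaluate the parameters used in that model.

Here is my code: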

import matplotlib.pyplot as plt
import numpy as np
import itertools

from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_blobs, make_moons  # samples_generator submodule was removed in later scikit-learn releases
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV

X, y = make_blobs(n_samples=500, centers=5, n_features=2, random_state=10, cluster_std=2.5)
y[y==0] = -1
X_train, X_test,  y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=10)

X_train and X_test are arrays with 2 features.

model_MLP_RAW = MLPClassifier()
model_MLP_RAW.fit(X_train, y_train)
model_MLP_RAW.predict(X_test) == y_test
model_MLP_RAW.score(X_test, y_test)

model_MLP_RAW = MLPClassifier()

param_gridMLPC = {
    'learning_rate': ["constant", "invscaling", "adaptive"],
    'hidden_layer_sizes': [x for x in itertools.product((10,20,30,40,50,100),repeat=3)],
    'alpha': [10.0 ** -np.arange(1, 7)],
    'activation': ["logistic", "relu", "tanh"]
}

CV_unknwnMLPC = GridSearchCV(estimator=model_MLP_RAW, param_grid=param_gridMLPC, cv= 5)
CV_unknwnMLPC.fit(X_train, y_train)

print(CV_unknwnMLPC.best_params_)

Everything works until the line CV_unknwnMLPC.fit(X_train, y_train), where I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-30-90faf7e56738> in <module>()
     10 
     11 CV_unknwnMLPC = GridSearchCV(estimator=model_MLP_RAW, param_grid=param_gridMLPC, cv= 5)
---> 12 CV_unknwnMLPC.fit(X_train, y_train)
     13 
     14 print(CV_unknwnMLPC.best_params_)

~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    638                                   error_score=self.error_score)
    639           for parameters, (train, test) in product(candidate_params,
--> 640                                                    cv.split(X, y, groups)))
    641 
    642         # if one choose to see train score, "out" will contain train score info

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self, iterable)
    777             # was dispatched. In particular this covers the edge
    778             # case of Parallel used with an exhausted iterator.
--> 779             while self.dispatch_one_batch(iterator):
    780                 self._iterating = True
    781             else:

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in dispatch_one_batch(self, iterator)
    623                 return False
    624             else:
--> 625                 self._dispatch(tasks)
    626                 return True
    627 

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in _dispatch(self, batch)
    586         dispatch_timestamp = time.time()
    587         cb = BatchCompletionCallBack(dispatch_timestamp, len(batch), self)
--> 588         job = self._backend.apply_async(batch, callback=cb)
    589         self._jobs.append(job)
    590 

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py in apply_async(self, func, callback)
    109     def apply_async(self, func, callback=None):
    110         """Schedule a func to be run"""
--> 111         result = ImmediateResult(func)
    112         if callback:
    113             callback(result)

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py in __init__(self, batch)
    330         # Don't delay the application, to avoid keeping the input
    331         # arguments in memory
--> 332         self.results = batch()
    333 
    334     def get(self):

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
    132 
    133     def __len__(self):

~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in <listcomp>(.0)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
    132 
    133     def __len__(self):

~\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, error_score)
    456             estimator.fit(X_train, **fit_params)
    457         else:
--> 458             estimator.fit(X_train, y_train, **fit_params)
    459 
    460     except Exception as e:

~\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in fit(self, X, y)
    971         """
    972         return self._fit(X, y, incremental=(self.warm_start and
--> 973                                             hasattr(self, "classes_")))
    974 
    975     @property

~\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _fit(self, X, y, incremental)
    324 
    325         # Validate input parameters.
--> 326         self._validate_hyperparameters()
    327         if np.any(np.array(hidden_layer_sizes) <= 0):
    328             raise ValueError("hidden_layer_sizes must be > 0, got %s." %

~\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _validate_hyperparameters(self)
    390         if self.max_iter <= 0:
    391             raise ValueError("max_iter must be > 0, got %s." % self.max_iter)
--> 392         if self.alpha < 0.0:
    393             raise ValueError("alpha must be >= 0, got %s." % self.alpha)
    394         if (self.learning_rate in ["constant", "invscaling", "adaptive"] and

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

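I looked at several answers online and double-checked the parameters in param_gridMLPC to make sure they are specified correctly, but the error persists.

What am I doing wrong?

Thanks in advance.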

Accepted answer

'alpha': [10.0 ** -np.arange(1, 7)]

The problem is this entry in the parameter grid. From the documentation of MLPClassifier:

alpha : float, optional, default 0.0001

L2 penalty (regularization term) parameter.

"alpha" must be a float. In the parameter grid it can therefore be a list of different floats.

But when you write:

'alpha': [10.0 ** -np.arange(1, 7)]

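this becomes a list containing a numpy array, that is, a sequence of sequences (a list of lists, an array of arrays, a 2-D array, and so on). The first and only element of the list is a numpy array, and that whole array is passed to the internal MLPClassifier as "alpha". The check self.alpha < 0.0 shown at the bottom of the traceback then compares an entire array with a scalar, and using that result in an if statement raises the error.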
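The failure can be reproduced without scikit-learn. The following is a minimal sketch, not the library's actual code: check_alpha is a hypothetical helper that only imitates the `if self.alpha < 0.0` line visible in the traceback.

```python
import numpy as np

# The value that ends up in `alpha` when the grid entry is [10.0 ** -np.arange(1, 7)]
alpha_as_array = 10.0 ** -np.arange(1, 7)  # 1-D array with 6 floats


def check_alpha(alpha):
    """Hypothetical stand-in for the `if self.alpha < 0.0` validation in the traceback."""
    if alpha < 0.0:  # with an array, `alpha < 0.0` is itself an array of booleans
        raise ValueError("alpha must be >= 0, got %s." % alpha)
    return True


try:
    check_alpha(alpha_as_array)  # the whole array is passed: `if` cannot decide
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# A single element of the array is a scalar, so the same check works
scalar_ok = check_alpha(alpha_as_array[0])

print(error_message)
print(scalar_ok)
```

The first call fails with the same "truth value of an array" message as in the question, while the call with a single float passes.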

You can do the following instead:

'alpha': 10.0 ** -np.arange(1, 7)

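This is a plain one-dimensional array, so the grid search picks its elements (float values) one at a time and passes each of them to the model.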
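One way to see the difference before running a long search is to expand both grids with sklearn.model_selection.ParameterGrid, which enumerates the parameter combinations a grid describes. This is a small sketch limited to the "alpha" entry; the variable names are illustrative, and list(alphas) is used here as an equivalent way of passing the six floats.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

alphas = 10.0 ** -np.arange(1, 7)  # six floats from 1e-1 down to 1e-6

# Original grid entry: a list whose single element is the whole array
wrapped = list(ParameterGrid({"alpha": [alphas]}))

# Corrected grid entry: one candidate per float value
flat = list(ParameterGrid({"alpha": list(alphas)}))

print(len(wrapped), np.ndim(wrapped[0]["alpha"]))  # one candidate, holding a 1-D array
print(len(flat), np.ndim(flat[0]["alpha"]))        # six candidates, each a scalar
```

With the wrapped form there is only one candidate and its "alpha" is the full array; with the flat form there are six candidates, each carrying a single float.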

Source for "python - GridSearchCV - Error: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()": a similar question on Stack Overflow: https://stackoverflow.com/questions/54028667/
