python - How to fix NameError: name 'X_train' is not defined?

Tags: python machine-learning scikit-learn multilabel-classification scikit-multilearn

I am running code for multi-label classification. How do I fix the NameError saying that 'X_train' is not defined? The Python code is below.

import scipy
from scipy.io import arff
data, meta = scipy.io.arff.loadarff('./yeast/yeast-train.arff')
from sklearn.datasets import make_multilabel_classification

# this will generate a random multi-label dataset
X, y = make_multilabel_classification(sparse=True, n_labels=20,
                                      return_indicator='sparse',
                                      allow_unlabeled=False)

# using binary relevance
from skmultilearn.problem_transform import BinaryRelevance
from sklearn.naive_bayes import GaussianNB

# initialize binary relevance multi-label classifier
# with a gaussian naive bayes base classifier
classifier = BinaryRelevance(GaussianNB())

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)

from sklearn.metrics import accuracy_score
accuracy_score(y_test,predictions)

Best answer

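You forgot to split the dataset into training and test sets. The script only defines X and y, so the names X_train, y_train, X_test and y_test do not exist when classifier.fit() is called, which is what raises the NameError.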

Import the splitting function:

from sklearn.model_selection import train_test_split

Then add this line before classifier.fit():

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
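For reference, here is a minimal end-to-end sketch of the split followed by fit, predict and scoring. It is an illustration under assumptions, not the asker's exact setup: it swaps in scikit-learn's OneVsRestClassifier for skmultilearn's BinaryRelevance (both train one binary classifier per label) so that only scikit-learn is required, it uses dense arrays instead of sparse ones, and the values of n_samples, n_features and n_classes are arbitrary choices.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import GaussianNB

# Generate a small random multi-label dataset.
# The sizes below are illustrative assumptions, not values from the question.
X, y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=5,
    allow_unlabeled=False, random_state=42,
)

# Split BEFORE fitting: this is the step that defines X_train, X_test,
# y_train and y_test, and its absence caused the NameError.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# One Gaussian naive Bayes classifier per label
# (stand-in for skmultilearn's BinaryRelevance; an assumption of this sketch).
classifier = OneVsRestClassifier(GaussianNB())
classifier.fit(X_train, y_train)

# Predict a label-indicator matrix for the held-out samples.
predictions = classifier.predict(X_test)

# For multi-label targets, accuracy_score is the subset accuracy:
# a sample counts as correct only if every label matches.
score = accuracy_score(y_test, predictions)
print(score)
```

With the original BinaryRelevance classifier the same order applies: call train_test_split first, then fit on the training part and predict on the test part.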

Regarding "python - How to fix NameError: name 'X_train' is not defined?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51258645/
