python - IndexError: index 1 is out of bounds for axis 0 with size 1 in decision tree classification

I am running a classification on my dataset, which has more than 46 columns and about 500,000 rows.

import numpy as np 
import pandas as pd
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
%matplotlib inline

Here I import the dataset:

df=pd.read_csv('Terror.csv', sep=',')
df.head()

Here I split the columns into target and train:

column_target=['success']
column_train=['iyear','country','region','latitude','longitude','specificity','vicinity','doubtterr','alternative','attacktype1','multiple','targtype1','natlty1','gname_id']
x=df[column_train]
y=df[column_target]

Here I fill the missing (NA) values with each column's median:

x['latitude']=x['latitude'].fillna(x['latitude'].median())
x['longitude']=x['longitude'].fillna(x['longitude'].median())
x['doubtterr']=x['doubtterr'].fillna(x['doubtterr'].median())
x['alternative']=x['alternative'].fillna(x['alternative'].median())
x['natlty1']=x['natlty1'].fillna(x['natlty1'].median())
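
The per-column assignments above can probably be written more compactly. The following is only a sketch on a small made-up frame (the values and the three-column subset are assumptions, not the real Terror.csv data): it takes an explicit copy of the selected columns, which should avoid pandas' SettingWithCopyWarning, and fills every column with its own median in one call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few of the feature columns in Terror.csv
df = pd.DataFrame({
    "latitude": [10.0, np.nan, 30.0],
    "longitude": [1.0, 2.0, np.nan],
    "natlty1": [4.0, np.nan, 8.0],
})

# Explicit copy so later assignments do not touch a view of df
x = df[["latitude", "longitude", "natlty1"]].copy()

# x.median() is a Series of per-column medians; fillna matches it by column name
x = x.fillna(x.median())
```

Columns without missing values are left unchanged by this call, so it can be applied to the whole feature frame rather than to a hand-picked list.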

Here I split x and y into training and test sets:

x_train, x_test, y_train, y_test=train_test_split(x, y, test_size=0.33, 
random_state=42)

Then I try to plot the decision tree:

from sklearn import tree
Tree=tree.DecisionTreeClassifier()
Tree=Tree.fit(x_train,y_train)
import pydotplus
from IPython.display import Image
dot_data= tree.export_graphviz(Tree, out_file=None,feature_names=x_train.columns,class_names=y_train.columns,filled=True,rounded=True,special_characters=True,max_depth=10)
graph= pydotplus.graph_from_dot_data(dot_data)
Image(graph.create_png())

But it gives me this error:

IndexError                                Traceback (most recent call last)
<ipython-input-42-1ac22988949f> in <module>()
1 import pydotplus
2 from IPython.display import Image
----> 3 dot_data= tree.export_graphviz(Tree, out_file=None, 

feature_names=x_train.columns,class_names=y_train.columns,
filled=True,rounded=True,special_characters=True,max_depth=10)
4 graph= pydotplus.graph_from_dot_data(dot_data)
5 Image(graph.create_png())

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in 
export_graphviz(decision_tree, out_file, max_depth, feature_names, class_names, label, filled, leaves_parallel, impurity, node_ids, proportion, rotate, rounded, special_characters)
431             recurse(decision_tree, 0, criterion="impurity")
432         else:
--> 433             recurse(decision_tree.tree_, 0, criterion=decision_tree.criterion)
434 
435         # If required, draw leaf nodes at same depth as each other

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in 
recurse(tree, node_id, criterion, parent, depth)
319             out_file.write('%d [label=%s'
320                            % (node_id,
--> 321                               node_to_str(tree, node_id, criterion)))
322 
323             if filled:

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in 
node_to_str(tree, node_id, criterion)
284                 node_string += 'class = '
285             if class_names is not True:
--> 286                 class_name = class_names[np.argmax(value)]
287             else:
288                 class_name = "y%s%s%s" % (characters[1],

C:\Users\dell\Anaconda2\lib\site-packages\pandas\indexes\base.pyc in 
__getitem__(self, key)
1421 
1422         if is_scalar(key):
-> 1423             return getitem(key)
1424 
1425         if isinstance(key, slice):

IndexError: index 1 is out of bounds for axis 0 with size 1

I don't know what is wrong with my code; it does not give me the decision tree plot.

Best answer

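The traceback shows export_graphviz looking up class_names[np.argmax(value)]. y_train.columns has only one entry ('success'), so asking for the name of the second class (index 1) is out of bounds. You can replace "class_names=y_train.columns" with class_names = df.columns.values[index of the 'success' column in df]. That should solve your problem.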
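
As a complementary sketch (not part of the original answer): the scikit-learn documentation describes class_names as the names of the target classes in ascending order, so one label per class is expected. A single column name such as 'success' would be indexed character by character, so building the list from the fitted classifier's classes_ attribute may be the safer choice. The toy data below is made up and only stands in for Terror.csv; the names clf and class_names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn import tree

# Hypothetical toy data standing in for the real dataset
rng = np.random.RandomState(42)
x = pd.DataFrame({
    "iyear": rng.randint(1970, 2017, size=200),
    "region": rng.randint(1, 13, size=200),
})
y = pd.DataFrame({"success": (x["region"] > 6).astype(int)})

clf = tree.DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(x, y["success"])  # pass a 1-D target

# One label per class, in the order the classifier stores them
class_names = [str(c) for c in clf.classes_]

dot_data = tree.export_graphviz(
    clf,
    out_file=None,  # return the DOT source as a string
    feature_names=list(x.columns),
    class_names=class_names,
    filled=True,
    rounded=True,
    special_characters=True,
    max_depth=10,
)
```

The resulting dot_data string can then be passed to pydotplus.graph_from_dot_data as in the question. If more readable labels are wanted, a hand-written list such as ['failure', 'success'] should also work, provided its order matches clf.classes_.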

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/43349040/
