python - "index 1 is out of bounds for axis 0 with size 1" error in decision tree classification

Tags: python csv

I am running a classification on my dataset, which has more than 46 columns and about 500,000 rows.

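import numpy as np
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection provides train_test_split
from sklearn.model_selection import train_test_split
%matplotlib inline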

Here I import the dataset:

df=pd.read_csv('Terror.csv', sep=',')
df.head()

Here I split the columns into the target and the training features:

column_target=['success']
column_train=['iyear','country','region','latitude','longitude','specificity','vicinity','doubtterr','alternative','attacktype1','multiple','targtype1','natlty1','gname_id']
x=df[column_train].copy()  # copy to avoid SettingWithCopyWarning on the fillna assignments below
y=df[column_target]

Here I fill the missing (NA) values with each column's median:

x['latitude']=x['latitude'].fillna(x['latitude'].median())
x['longitude']=x['longitude'].fillna(x['longitude'].median())
x['doubtterr']=x['doubtterr'].fillna(x['doubtterr'].median())
x['alternative']=x['alternative'].fillna(x['alternative'].median())
x['natlty1']=x['natlty1'].fillna(x['natlty1'].median())
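
The per-column fills above can also be written as a single call. The following is only a minimal sketch on a made-up toy frame (the values are illustrative and not taken from Terror.csv), assuming every column to be filled is numeric:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the feature frame; the values are invented for illustration.
x = pd.DataFrame({
    'latitude': [10.0, np.nan, 30.0],
    'longitude': [1.0, 2.0, np.nan],
    'natlty1': [np.nan, 4.0, 8.0],
})

# DataFrame.median() returns one median per column (NaN values are skipped),
# and fillna() with that Series fills each column with its own median.
x_filled = x.fillna(x.median())
print(x_filled)
```

Because fillna returns a new frame here, nothing is assigned into a slice of df, so the SettingWithCopyWarning does not come up.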

Here I split x and y into training and test sets:

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

Then I try to draw the decision tree:

from sklearn import tree
Tree=tree.DecisionTreeClassifier()
Tree=Tree.fit(x_train,y_train)
import pydotplus
from IPython.display import Image
dot_data= tree.export_graphviz(Tree, out_file=None,feature_names=x_train.columns,class_names=y_train.columns,filled=True,rounded=True,special_characters=True,max_depth=10)
graph= pydotplus.graph_from_dot_data(dot_data)
Image(graph.create_png())

But it gives me this error:

IndexError                                Traceback (most recent call last)
<ipython-input-42-1ac22988949f> in <module>()
1 import pydotplus
2 from IPython.display import Image
----> 3 dot_data= tree.export_graphviz(Tree, out_file=None, feature_names=x_train.columns,class_names=y_train.columns, filled=True,rounded=True,special_characters=True,max_depth=10)
4 graph= pydotplus.graph_from_dot_data(dot_data)
5 Image(graph.create_png())

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in export_graphviz(decision_tree, out_file, max_depth, feature_names, class_names, label, filled, leaves_parallel, impurity, node_ids, proportion, rotate, rounded, special_characters)
431             recurse(decision_tree, 0, criterion="impurity")
432         else:
--> 433             recurse(decision_tree.tree_, 0, criterion=decision_tree.criterion)
434 
435         # If required, draw leaf nodes at same depth as each other

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in recurse(tree, node_id, criterion, parent, depth)
319             out_file.write('%d [label=%s'
320                            % (node_id,
--> 321                               node_to_str(tree, node_id, criterion)))
322 
323             if filled:

C:\Users\dell\Anaconda2\lib\site-packages\sklearn\tree\export.pyc in node_to_str(tree, node_id, criterion)
284                 node_string += 'class = '
285             if class_names is not True:
--> 286                 class_name = class_names[np.argmax(value)]
287             else:
288                 class_name = "y%s%s%s" % (characters[1],

C:\Users\dell\Anaconda2\lib\site-packages\pandas\indexes\base.pyc in __getitem__(self, key)
1421 
1422         if is_scalar(key):
-> 1423             return getitem(key)
1424 
1425         if isinstance(key, slice):

IndexError: index 1 is out of bounds for axis 0 with size 1

I don't know what is wrong with my code. It does not produce the decision tree diagram.

Best answer

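You can replace "class_names=y_train.columns" with class_names = df.columns.values[index of the "success" column in df]. That should solve your problem.

The error arises because export_graphviz looks up class_names[np.argmax(value)] for every node (line 286 in the traceback), and y_train.columns contains only one entry, "success". As soon as a node's majority class has index 1, the lookup is out of bounds. Note that class_names is meant to hold one name per target class, in ascending order of the class labels. In the scikit-learn version shown in the traceback, a single column name avoids the IndexError but is then indexed character by character, so a list such as ['0', '1'] gives more readable labels.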
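
A minimal sketch of that idea, using a small synthetic data set in place of Terror.csv (the two feature columns, their values, and the name clf are assumptions made only for this example). It first shows the failing lookup in isolation and then builds class_names from the fitted classifier's classes_ attribute:

```python
import numpy as np
import pandas as pd
from sklearn import tree

# Synthetic stand-in for the real data: two features and a binary 'success' target.
rng = np.random.RandomState(42)
x_train = pd.DataFrame({'iyear': rng.randint(1970, 2017, size=200),
                        'region': rng.randint(1, 13, size=200)})
y_train = pd.DataFrame({'success': (x_train['region'] > 6).astype(int)})

clf = tree.DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(x_train, y_train['success'])

# y_train.columns holds a single entry, so asking for the name of
# class index 1 fails in the same way as in the traceback.
try:
    y_train.columns[1]
except IndexError as exc:
    print('lookup failed:', exc)

# One name per class, in the order the classifier stores them in clf.classes_.
class_names = [str(c) for c in clf.classes_]

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_data = tree.export_graphviz(clf, out_file=None,
                                feature_names=list(x_train.columns),
                                class_names=class_names,
                                filled=True, rounded=True,
                                special_characters=True)
```

The resulting dot_data string can then be passed to pydotplus.graph_from_dot_data as in the question, provided pydotplus and Graphviz are installed.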

Regarding python - "index 1 is out of bounds for axis 0 with size 1" error in decision tree classification, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43349040/
