python - TypeError: can only concatenate str (not "numpy.int64") to str while trying to plot the graph tree

Tags: python jupyter-notebook decision-tree

I tried to plot the tree with the following code:

import graphviz  # needed for graphviz.Source below
import sklearn.tree
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

model1 = sklearn.tree.DecisionTreeClassifier()


covidCases['New_cases'].value_counts()
feature_cols = ['New_cases', 'New_deaths']
X = covidCases[feature_cols] # Features
y = covidCases['New_deaths']
print(X)
print(y)

X_train, X_test, y_train, y_test = train_test_split(X,    # predictive features
                                                      y,      # target column
                                                      test_size=0.30,    # 30% of dataset will be set aside for test set
                                                      random_state=1)

clf = DecisionTreeClassifier()

# Train the decision tree classifier
clf = clf.fit(X_train, y_train)

# Predict the response for the test dataset
y_pred = clf.predict(X_test)

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
dot_data = sklearn.tree.export_graphviz(clf, out_file=None, 
                                feature_names=X.columns,  
                                class_names=y.unique(),
                                filled=True)

graph = graphviz.Source(dot_data, format="png") 
graph

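But I get the error TypeError: can only concatenate str (not "numpy.int64") to str. I am fairly new to Python, so any help would be appreciated. The error is raised when the graph is plotted.

Update: the full error message is as follows: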

  TypeError                                 Traceback (most recent call last)
Input In [18], in <cell line: 1>()
----> 1 dot_data = sklearn.tree.export_graphviz(clf, out_file=None, 
      2                                 feature_names=X.columns,  
      3                                 class_names=y.unique(),
      4                                 filled=True)
      6 graph = graphviz.Source(dot_data, format="png") 
      7 graph

File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/tree/_export.py:889, in export_graphviz(decision_tree, out_file, max_depth, feature_names, class_names, label, filled, leaves_parallel, impurity, node_ids, proportion, rotate, rounded, special_characters, precision, fontname)
    870     out_file = StringIO()
    872 exporter = _DOTTreeExporter(
    873     out_file=out_file,
    874     max_depth=max_depth,
   (...)
    887     fontname=fontname,
    888 )
--> 889 exporter.export(decision_tree)
    891 if return_string:
    892     return exporter.out_file.getvalue()

File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/tree/_export.py:462, in _DOTTreeExporter.export(self, decision_tree)
    460     self.recurse(decision_tree, 0, criterion="impurity")
    461 else:
--> 462     self.recurse(decision_tree.tree_, 0, criterion=decision_tree.criterion)
    464 self.tail()

File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/tree/_export.py:521, in _DOTTreeExporter.recurse(self, tree, node_id, criterion, parent, depth)
    517 else:
    518     self.ranks[str(depth)].append(str(node_id))
    520 self.out_file.write(
--> 521     "%d [label=%s" % (node_id, self.node_to_str(tree, node_id, criterion))
    522 )
    524 if self.filled:
    525     self.out_file.write(
    526         ', fillcolor="%s"' % self.get_fill_color(tree, node_id)
    527     )

File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/tree/_export.py:374, in _BaseTreeExporter.node_to_str(self, tree, node_id, criterion)
    368     else:
    369         class_name = "y%s%s%s" % (
    370             characters[1],
    371             np.argmax(value),
    372             characters[2],
    373         )
--> 374     node_string += class_name
    376 # Clean up any trailing newlines
    377 if node_string.endswith(characters[4]):

TypeError: can only concatenate str (not "numpy.int64") to str

The data looks like this: [image not included]

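Best answer

I think I found the problem. It is related to y.unique(), which returns an array of integers, while export_graphviz concatenates each class name into the node label string (the node_string += class_name line in the traceback). Converting the labels to strings with val = np.array(y.unique()).astype('str').tolist() and passing class_names=val worked.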
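
The fix can be sketched as a small self-contained example. The toy DataFrame below is hypothetical and only stands in for covidCases, whose real contents are not shown, and it uses a single feature column for brevity. As a variation on the answer, the sketch builds the names from clf.classes_ rather than y.unique(): the scikit-learn documentation asks for class names in ascending order of the classes, whereas pandas unique() returns values in order of appearance, so the two can differ.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Hypothetical toy data standing in for covidCases (the real data is not shown).
covid_cases = pd.DataFrame({
    "New_cases": [10, 20, 30, 40, 50, 60],
    "New_deaths": [0, 0, 1, 1, 2, 2],
})

X = covid_cases[["New_cases"]]   # feature column(s)
y = covid_cases["New_deaths"]    # integer class labels

clf = DecisionTreeClassifier(random_state=1).fit(X, y)

# clf.classes_ holds the sorted class labels (numpy integers here).
# export_graphviz needs them as plain strings, in that same order.
class_names = [str(label) for label in clf.classes_]

dot_data = export_graphviz(
    clf,
    out_file=None,                    # return the DOT source as a string
    feature_names=list(X.columns),
    class_names=class_names,
    filled=True,
)

# Show the start of the generated DOT source.
print(dot_data[:60])
```

The resulting dot_data string can then be passed to graphviz.Source(dot_data, format="png") as in the question, assuming the graphviz Python package and the Graphviz binaries are installed.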

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/73497699/
