python - index 38 is out of bounds for axis 1 with size 38 - Sklearn

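I get this error when using the Naive Bayes CategoricalNB algorithm.

The error appears from the second run of the cell onward. The first run completes without any error, but when I change something (even something as small as a comment) and run the notebook again, it raises: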

IndexError: index 38 is out of bounds for axis 1 with size 38

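I don't know what is wrong or how to fix it. When I restart the kernel and try again, it works, and then every attempt after the first one fails with the error above.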

%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd

dataframe = pd.read_csv("hr_dataset.csv")
# dataframe = pd.read_csv("WA_Fn-UseC_-HR-Employee-Attrition.csv")

dataframe.head(2)

from sklearn.naive_bayes import CategoricalNB
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
# inputs = scaled_df
X_train, X_test, y_train, y_test = train_test_split(inputs, target, test_size=0.2)

categoricalNB_ = CategoricalNB()


categoricalNB_.fit(X_train, y_train)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

pred = categoricalNB_.predict(X_test) # --------------> gives the error for every attempt after the 1st attempt. weird

categoricalNB_.score(X_test, y_test)
# accuracy_score(y_test,pred)

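Best answer

I think your problem is that a feature takes a different set of values in the training set than in the test set.

I looked at your dataset and found that only one record has a TotalWorkingYears value of 38. If that record ends up only in the test set, the model fitted on the training set has no probability for the value 38, which raises the out-of-bounds error.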
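The situation described above can be checked before calling predict. CategoricalNB infers the number of categories of each feature from the largest integer code it sees during fit, so any test value above the training maximum indexes past the end of the fitted probability table. The following is a minimal sketch, assuming the features are already encoded as non-negative integers; the helper name `unseen_category_columns` and the toy arrays are hypothetical and not taken from the question.

```python
import numpy as np

def unseen_category_columns(X_train, X_test):
    """Return {column index: sorted test values above the training maximum}."""
    X_train = np.asarray(X_train)
    X_test = np.asarray(X_test)
    problems = {}
    for col in range(X_train.shape[1]):
        train_max = X_train[:, col].max()
        # Test codes larger than anything seen during fit would be out of bounds.
        too_large = np.unique(X_test[X_test[:, col] > train_max, col])
        if too_large.size:
            problems[col] = too_large.tolist()
    return problems

# Toy data (hypothetical): column 1 takes values 0..37 in training,
# but the value 38 only shows up in the test split.
X_train_demo = np.column_stack([np.arange(38) % 3, np.arange(38)])
X_test_demo = np.array([[1, 5], [2, 38]])

report = unseen_category_columns(X_train_demo, X_test_demo)
print(report)
```

An empty dictionary means every test code fits inside the range learned from the training split. Because train_test_split shuffles randomly unless random_state is set, the result can differ from one run to the next.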

You can solve this with the class_prior parameter (see the docs for more details), or you can make sure that every category of every feature has at least a certain number of records.
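One related option worth knowing about: class_prior only sets the prior probabilities of the classes, whereas in scikit-learn 0.24 and later CategoricalNB also accepts a min_categories argument that fixes a minimum number of categories per feature. The sketch below is an illustration under assumptions, not the asker's code: the toy arrays are hypothetical, and the category counts are taken from the full data before splitting.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Toy data (hypothetical): one integer-coded feature, values 0..37 in training.
X_train_demo = np.arange(38).reshape(-1, 1)
y_train_demo = np.arange(38) % 2
X_test_demo = np.array([[38]])  # category never seen during fit

# Default model: the number of categories is inferred from the training data only.
default_model = CategoricalNB().fit(X_train_demo, y_train_demo)
try:
    default_model.predict(X_test_demo)
except IndexError as exc:
    print("default model:", exc)

# Count categories per feature on the full data (train + test) and reserve
# room for all of them up front. Requires scikit-learn >= 0.24.
X_all = np.vstack([X_train_demo, X_test_demo])
min_categories = X_all.max(axis=0) + 1

roomy_model = CategoricalNB(min_categories=min_categories)
roomy_model.fit(X_train_demo, y_train_demo)
pred = roomy_model.predict(X_test_demo)
print("with min_categories:", pred)
```

With the extra slot reserved, the unseen category gets only the smoothing mass (alpha) instead of indexing outside the fitted table, so the prediction for that row rests on the class priors and the other features.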

For python - index 38 is out of bounds for axis 1 with size 38 - Sklearn, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62011091/
