python - Why does a pandas categorical column give a truth value error?

Tags: python pandas machine-learning scikit-learn

My data contains a "Married" column whose categorical values are "Yes" or "No". I converted it to a numeric type:

 train['Married']=train['Married'].astype('category')
 train['Married'].cat.categories=[0,1]

Now I use the following code to fill in the missing values:

train['Married']=train['Married'].fillna(train['Married'].mode())

It raises this error:

 ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can anyone explain why?

Best answer

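The error means that one of plain Python's logical operators (not, and, or) is being applied to a NumPy array or a pandas Series. These operators need a single True or False, and a Series with several elements has no unambiguous truth value.

For example: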

s = pd.Series([1,1,2,2])
not pd.isnull(s.mode())

raises the same error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
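
As a minimal sketch of why the message suggests a.any() or a.all(): the snippet below reproduces the failure and then reduces the boolean Series explicitly. The variable names (modes, message, any_missing, all_present) are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 2])
modes = s.mode()  # both 1 and 2 are modes, so this Series has two elements

message = None
try:
    # `not` calls bool() on the boolean Series, which pandas refuses to do
    not pd.isnull(modes)
except ValueError as exc:
    message = str(exc)

# An explicit reduction states which truth value is meant
any_missing = pd.isnull(modes).any()    # is at least one mode missing?
all_present = pd.notnull(modes).all()   # are all modes non-missing?
```

Which reduction is appropriate depends on the question being asked; neither is a drop-in replacement for a scalar check.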

If you look at the stack trace, the error comes from this line:

fillna(self, value, method, limit)
   1465         else:
   1466 
-> 1467             if not isnull(value) and value not in self.categories:
   1468                 raise ValueError("fill value must be in categories")
   1469

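So fillna checks whether the value you are filling with is one of the categories. That line requires the value to be a scalar so that it works with not and and. However, Series.mode() always returns a Series, even when there is only one mode, and that makes the line fail. Try extracting the value from mode() and filling with that: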

train['Married']=train['Married'].fillna(train['Married'].mode().iloc[0])
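
To make the distinction concrete, here is a small sketch on a made-up train DataFrame (the data is hypothetical, not the asker's). It shows that mode() returns a Series even for a single mode, and that .iloc[0] yields the scalar that fillna expects.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data
train = pd.DataFrame({"Married": ["Yes", "No", "Yes", None, "Yes"]})

mode_result = train["Married"].mode()  # a Series, even though there is one mode
fill_value = mode_result.iloc[0]       # the scalar "Yes"

train["Married"] = train["Married"].fillna(fill_value)
```

If several values tie for the mode, mode() returns them sorted, so .iloc[0] picks the first of them; whether that tie-break is acceptable is a judgment call for the data at hand.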

A working example:

s = pd.Series(["YES", "NO", "YES", "YES", None])    
s1 = s.astype('category')
s1.cat.categories = [0, 1]

s1
#0    1.0
#1    0.0
#2    1.0
#3    1.0
#4    NaN
#dtype: category
#Categories (2, int64): [0, 1]

s1.fillna(s1.mode().iloc[0])
#0    1
#1    0
#2    1
#3    1
#4    1
#dtype: category
#Categories (2, int64): [0, 1]
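
A note for newer pandas versions: assigning directly to .cat.categories was deprecated and, as far as I know, removed in pandas 2.0. A sketch of the same example using rename_categories, which exists in both old and new versions, might look like this (the explicit mapping and the name filled are my own choices, not part of the original answer):

```python
import pandas as pd

s = pd.Series(["YES", "NO", "YES", "YES", None])
s1 = s.astype("category")

# Map each label explicitly rather than relying on the sorted category order
s1 = s1.cat.rename_categories({"NO": 0, "YES": 1})

# mode() returns a Series; .iloc[0] extracts the scalar fill value
filled = s1.fillna(s1.mode().iloc[0])
```

Passing a dict makes the label-to-number mapping visible in the code, whereas a bare list such as [0, 1] depends on the categories being sorted as NO, YES.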

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/45340989/
