python - Why does pandas isnull() work but == None doesn't?

I am trying to select the rows of df where the column label has the value None. (It is the value None that I get from another function, not NaN.)

Why does df[df['label'].isnull()] return the rows I want,

but df[df['label'] == None] returns Empty DataFrame Columns: [path, fanId, label, gain, order] Index: []?

Best answer

As the comments above note, pandas represents missing data with NaN, which is a numeric value of type float. None, however, is Python's NoneType, so NaN does not compare equal to None.

In [27]: np.nan == None
Out[27]: False
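The type difference can be checked with plain Python, without pandas or NumPy. This is a minimal sketch; the variable name nan is only illustrative, and float("nan") stands in for np.nan, which is an ordinary Python float.

```python
import math

nan = float("nan")  # stand-in for np.nan, which is a plain Python float

nan_type = type(nan).__name__    # "float"
none_type = type(None).__name__  # "NoneType"

print(nan_type, none_type)
print(nan == None)      # False: a float NaN never compares equal to None
print(math.isnan(nan))  # True: the reliable way to test for NaN
```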

This GitHub thread discusses it further, noting:

This was done quite a while ago to make the behavior of nulls consistent, in that they don't compare equal. This puts None and np.nan on an equal (though not-consistent with python, BUT consistent with numpy) footing.

This means that when you run df[df['label'] == None], pandas treats None as a null value, so each element is effectively checked as np.nan == np.nan, which we know is False.

In [63]: np.nan == np.nan
Out[63]: False
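The effect on Boolean indexing can be reproduced with a small frame. This is a sketch under assumptions: the frame and its contents are made up for illustration, and dtype=object is set explicitly so that the None values are stored as-is.

```python
import pandas as pd

# Hypothetical data: an object column that holds real None values.
df = pd.DataFrame({"label": pd.Series(["a", None, "b", None], dtype=object)})

eq_mask = df["label"] == None      # noqa: E711 (deliberately the problematic form)
null_mask = df["label"].isnull()

print(eq_mask.tolist())    # every entry is False, so the selection is empty
print(null_mask.tolist())  # True exactly where the value is missing
print(len(df[eq_mask]), len(df[null_mask]))
```

Because the == None mask is False everywhere, df[eq_mask] comes back empty, while df[null_mask] returns the two rows holding None.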

Also, you should not write df[df['label'] == None] when applying Boolean indexing. Using == with NoneType is not best practice, as PEP 8 states:

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

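For example, you can run tst.value.apply(lambda x: x is None), which yields the same result as .isnull(), showing that pandas treats these values as NaN. Note that this applies to the tst dataframe example below, where tst.value.dtypes is object and I explicitly specified the NoneType elements.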
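A minimal sketch of that identity check on a standalone object Series. The name value and its contents are illustrative, and dtype=object is passed explicitly as an assumption so that None is not converted on construction.

```python
import pandas as pd

# Illustrative object Series with explicit None entries.
value = pd.Series(["A", "B", None, "C", None], dtype=object)

is_none_mask = value.apply(lambda x: x is None)  # identity check, as PEP 8 recommends
isnull_mask = value.isnull()

print(is_none_mask.tolist())
print(isnull_mask.tolist())
print(is_none_mask.equals(isnull_mask))  # the two masks agree on this column
```

The two masks only agree because the column really stores None objects; .isnull() is the more general choice since it also catches NaN.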

The pandas documentation has a nice example that illustrates this and its effect.

For example, if you have two columns, one of type float and the other of type object, you can see that pandas handles the None type gracefully; note that for the float column it uses NaN.

In [32]: tst = pd.DataFrame({"label" : [1, 2, None, 3, None], "value" : ["A", "B", None, "C", None]})

In [39]: tst
Out[39]:
   label value
0    1.0     A
1    2.0     B
2    NaN  None
3    3.0     C
4    NaN  None

In [51]: type(tst.value[2])
Out[51]: NoneType

In [52]: type(tst.label[2])
Out[52]: numpy.float64
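The same coercion can be checked in a script. This sketch rebuilds the tst frame from the example above; passing dtype=object for the value column is an added assumption, made so the None entries are kept regardless of how a given pandas version infers string columns.

```python
import math
import pandas as pd

tst = pd.DataFrame({
    "label": [1, 2, None, 3, None],  # numeric column: None is coerced to NaN
    "value": pd.Series(["A", "B", None, "C", None], dtype=object),  # None kept as-is
})

label_is_none = tst["label"].apply(lambda x: x is None).tolist()
label_isnull = tst["label"].isnull().tolist()

print(tst.dtypes)
print(math.isnan(tst["label"].iloc[2]))  # True: the None became a float NaN
print(tst["value"].iloc[2] is None)      # True: the object column still holds None
print(label_is_none)  # all False, no None objects are left in the float column
print(label_isnull)   # still flags the missing rows
```

On the float column the identity check finds nothing because no None objects remain, which is why .isnull() is the safer test for both columns.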

This post explains the difference between NaN and None well and is definitely worth a look:

What is the difference between NaN and None?

Regarding python - Why does pandas isnull() work but == None doesn't?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58190953/
