python - Replacing NaN with None in Pandas exhibits counterintuitive behaviour

Given a Series

import numpy as np
import pandas as pd

s = pd.Series([1.1, 1.2, np.nan])
s
0    1.1
1    1.2
2    NaN
dtype: float64

If the NaN needs to be converted to None (for example, to work with Parquet), then I would like to get

0     1.1
1     1.2
2    None
dtype: object

I assumed Series.replace would be the obvious way to do this, but here is what the function returns:

s.replace(np.nan, None)

0    1.1
1    1.2
2    1.2
dtype: float64

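The NaN is forward-filled rather than replaced. Going through the docs, I see that if the second argument is None, the first argument should be a dictionary. Based on this, I would expect replace to either replace as intended or raise an exception.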

I believe the workaround here is

pd.Series([x if pd.notna(x) else None for x in s], dtype=object) 
0     1.1
1     1.2
2    None
dtype: object
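
A vectorized alternative to the list comprehension above can be sketched as follows. It assumes the Series is first cast to object dtype so that it can hold None; the names s and as_object are only illustrative, and the handling of other=None in where may differ between pandas versions.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.1, 1.2, np.nan])

# Cast to object first so the Series is able to hold None, then keep the
# non-missing values and put None wherever the original value was NaN.
as_object = s.astype(object).where(s.notna(), None)
print(as_object)
```

The cast to object is the important step: a float64 Series has no way to store None, so any missing value would otherwise be represented as NaN again.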

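This is fine. But I would like to understand why this behaviour occurs, whether it is documented, or whether it is simply a bug and I have to dust off my git profile and log one on the issue tracker... any ideas?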

Best answer

This behaviour is described in the documentation for the method parameter:

method : {‘pad’, ‘ffill’, ‘bfill’, None}

The method to use when for replacement, when to_replace is a scalar, list or tuple and value is None.

So in your example, to_replace is a scalar and value is None. The default method is pad, which the fillna documentation describes as:

pad / ffill: propagate last valid observation forward to next valid
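
To make the explanation above concrete, here is a minimal sketch. It shows that a forward fill reproduces the output from the question, and that the dict form of to_replace, which supplies the replacement value directly instead of going through the method parameter, yields None. The variable names are illustrative. The result of s.replace(np.nan, None) itself depends on the pandas version (newer releases document that an explicitly passed None is respected), so it is deliberately left out here.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.1, 1.2, np.nan])

# Forward fill ("pad"): the last valid value is propagated into the NaN slot.
forward_filled = s.ffill()
print(forward_filled.tolist())  # [1.1, 1.2, 1.2]

# Dict form: the mapping {NaN: None} carries the replacement value itself,
# so no fill method is involved.
via_dict = s.replace({np.nan: None})
print(via_dict.tolist(), via_dict.dtype)
```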

Regarding python - Replacing NaN with None in Pandas exhibits counterintuitive behaviour, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54021490/
