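python - Preserving None values when using pandas apply

Tags: python pandas

I have a data frame to which I need to add a column, c3. Each entry in this column depends on the entries in the same row of two other columns, c1 and c2. c3 was originally created by mapping a function over pairs of entries from c1 and c2. Because there is a lot of data, I am trying to speed up the creation of c3 by using apply. This is what I have now: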

frame['c3'] = frame.apply(lambda x: my_func(x['c1'], x['c2'],
                          extra_arg1, extra_arg2), axis=1)

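However, when I do this, 'c3' becomes float64, whereas I need it to be of type object so that it keeps the None values I use for further processing of the data frame. The line above converts them to NaN instead, because the other values the function produces are ints. I know a column's type can be changed with astype, but using it on the already-created column does not help: the NaN values stay NaN. Is there a way to tell apply that I want to keep the None values? Do I need to do something special in the lambda expression or inside my_func?

Best answer

Series.apply in pandas (at least as of version 0.18.0) has a convert_dtype parameter: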

convert_dtype : boolean, default True

Try to find better dtype for elementwise function results. If False, leave as dtype=object

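import pandas as pd

a = pd.Series(['1', '2', '3', None])
# convert_dtype=False keeps the result as dtype=object, so None is not turned into NaN
a.apply(lambda x: int(x) if x is not None else None, convert_dtype=False)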

Out[101]: 

0       1
1       2
2       3
3    None
dtype: object

map has no equivalent option.
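
For the row-wise case in the question, as far as I know DataFrame.apply does not accept convert_dtype, so one alternative is to skip apply's dtype inference entirely: compute the values in a plain Python loop and build the column with dtype=object. The following is only a minimal sketch. The body of my_func, the sample data, and the values passed for extra_arg1 and extra_arg2 are hypothetical stand-ins, not the asker's actual code.

```python
import pandas as pd

def my_func(a, b, extra_arg1, extra_arg2):
    # Hypothetical stand-in for the asker's function:
    # returns an int, or None when no result can be computed.
    if b == 0:
        return None
    return (a // b) * extra_arg1 + extra_arg2

# Hypothetical sample data
frame = pd.DataFrame({'c1': [10, 20, 30], 'c2': [2, 0, 5]})

# Compute the values row by row without going through apply
values = [my_func(a, b, 1, 0) for a, b in zip(frame['c1'], frame['c2'])]

# dtype=object stops pandas from inferring float64 and replacing None with NaN
frame['c3'] = pd.Series(values, index=frame.index, dtype=object)
```

Passing index=frame.index keeps the new column aligned with the frame even when its index is not the default 0..n-1 range.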

This is based on a similar question found on Stack Overflow, "python - Preserving None values when using pandas apply": https://stackoverflow.com/questions/33484751/
