I have read that pandas.apply() is not applied to null elements. However, that is not what happens in the following code. Why is that?
import pandas as pd
df = pd.Series([[1,2],[2,3,4,5],None])
df
0 [1, 2]
1 [2, 3, 4, 5]
2 None
dtype: object
df.apply(lambda x: len(x))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\series.py", line 2169, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1059, in pandas.lib.map_infer (pandas\lib.c:62578)
  File "<stdin>", line 1, in <lambda>
TypeError: object of type 'NoneType' has no len()
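Best answer
pandas treats None and NaN as equivalent missing-value markers, so replacing None with numpy.nan makes no difference: apply
still calls the function on NaN elements.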
import numpy
df[2] = numpy.nan
df.apply(lambda x: print(x))
Output: [1, 2]
[2, 3, 4, 5]
nan
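You have to either check for missing values inside the function you apply, or drop them first with
Series.dropna()
and apply the function to the result: df.dropna().apply(lambda x: print(x))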
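Alternatively, use
notnull() (available as pandas.notnull() or as a Series method),
which returns a Series of bool values that can be used as a mask: df[df.notnull()].apply(lambda x: print(x))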
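The first option, checking for missing values inside the applied function, is not shown above. Here is a minimal sketch of it next to the two filtering approaches. The helper name safe_len and the variable names are made up for illustration, and the Series is called s here rather than df.

```python
import numpy as np
import pandas as pd

s = pd.Series([[1, 2], [2, 3, 4, 5], None])

def safe_len(x):
    """Hypothetical helper: len(x), or NaN when x is a missing value."""
    # pd.isnull() on a list returns an element-wise array, not a single bool,
    # so the missing-value test is written out for scalars explicitly.
    if x is None or (isinstance(x, float) and np.isnan(x)):
        return np.nan
    return len(x)

# Option 1: handle missing values inside the applied function.
lengths_checked = s.apply(safe_len)

# Option 2: drop missing values first, then apply.
lengths_dropped = s.dropna().apply(len)

# Option 3: select the non-missing rows with a boolean mask, then apply.
lengths_masked = s[s.notnull()].apply(len)
```

Option 1 keeps the original index and leaves a NaN where the input was missing. Options 2 and 3 return a shorter Series that contains only the rows that had a value.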
Also read: http://pandas.pydata.org/pandas-docs/stable/missing_data.html
Specifically, this:
Warning:
One has to be mindful that in python (and numpy), the nan's don’t compare equal, but None's do. Note that Pandas/numpy uses the fact that np.nan != np.nan, and treats None like np.nan.
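The behaviour described in the warning can be checked directly. This is a small sketch with made-up variable names, not part of the quoted documentation.

```python
import numpy as np
import pandas as pd

# NaN never compares equal to itself, whereas None does.
nan_equals_nan = (np.nan == np.nan)
none_equals_none = (None == None)

# In an object Series, pandas flags both None and NaN as missing.
s = pd.Series([[1, 2], None, np.nan])
missing_flags = s.isnull().tolist()
```

Because of the first comparison, a test such as x == np.nan never detects a missing value. Use isnull() or notnull() instead.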
Regarding "python - Why does pandas.apply() execute on null elements?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34574499/