python - Counting np.nan values in a Pandas DataFrame

I have a Pandas DataFrame that contains np.nan (NumPy not-a-number) values:

            field1
2020-12-24  NaN
2020-12-25  NaN
2020-12-26  1.0
2020-12-27  2.0
2020-12-28  NaN
2020-12-29  1.0
2020-12-30  2.0

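(The index is a datetime index.)
I would like to get a new DataFrame holding the start date of each run of consecutive np.nan values together with the number of np.nan values in that run, i.e.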

            field1
2020-12-24  2
2020-12-28  1

I tried this code:

prev = 1
for col_name, el in df.iterrows():
    print(el)
    if prev != np.nan and el[0] == np.nan:
        cnt = 1
    if prev == np.nan and el[0] == np.nan:
        cnt = cnt + 1
    if prev == np.nan and el[0] != np.nan:
        print(cnt)
    prev = el[0]

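But it does not work as expected, and I would like to avoid "for" loops because I expect them to be very slow on larger DataFrames. Any help would be appreciated!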
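
One reason the loop above cannot work: NaN never compares equal to anything, including itself, so every `== np.nan` test is False and none of the three `if` branches ever runs. A minimal sketch of that behaviour follows; the variable name `x` is only for illustration, and `pd.isna` is the usual way to test for a missing value.

```python
import numpy as np
import pandas as pd

x = np.nan  # illustrative value standing in for el[0] or prev

# Equality against NaN is always False, inequality is always True
print(x == np.nan)
print(x != np.nan)

# pd.isna (or np.isnan for floats) is the reliable missing-value test
print(pd.isna(x))
```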

Best answer

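You can create groups by testing for non-missing values with Series.notna and taking Series.cumsum of the result, then keep only the NaN rows. Next, get the size of each group with Series.map and Series.value_counts, and keep only the first row of each group by filtering out the repeats with Series.duplicated: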

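# True where field1 holds a value, False where it is NaN
m = df['field1'].notna()
# the running count of non-NaN values is constant within a NaN run; keep only the NaN rows
s = m.cumsum()[~m]

# map each NaN row to the size of its run, then keep the first row of each run
df1 = s.map(s.value_counts())[~s.duplicated()].to_frame()
print(df1)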
            field1
2020-12-24       2
2020-12-28       1
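
For anyone who wants to run the answer end to end, here is a self-contained sketch that rebuilds the sample data from the question and names each intermediate step. The names `idx`, `run_id` and `counts` are introduced here for illustration and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question: seven consecutive days
idx = pd.date_range("2020-12-24", periods=7, freq="D")
df = pd.DataFrame(
    {"field1": [np.nan, np.nan, 1.0, 2.0, np.nan, 1.0, 2.0]}, index=idx
)

# True for real values, False for NaN
m = df["field1"].notna()

# Running count of non-NaN values; it only changes on non-NaN rows,
# so every row inside one NaN run shares the same number
run_id = m.cumsum()

# Keep only the NaN rows, labelled by the run they belong to
s = run_id[~m]

# Number of NaN rows per run label
counts = s.value_counts()

# Replace each label by its run size, then keep the first row of each run
df1 = s.map(counts)[~s.duplicated()].to_frame()
print(df1)
```

Because two NaN runs are always separated by at least one non-NaN row, the cumulative sum gives every run a distinct label, which is what makes `value_counts` usable as a per-run counter here.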

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/65559423/
