python - Group by and having in pandas

Tags: python pandas group-by

select df.id, count(distinct airport) as num
from df
group by df.id
having count(distinct airport) > 3

I am trying to do the same thing as the query above in Python pandas. I have tried different combinations of filter, nunique, and agg, but nothing worked. Any suggestions?

For example, given this df:

id     airport
1      lax
1      ohare
2      phl
3      lax
2      mdw
2      lax
2      sfw
2      tpe

So I would like the result to be:

id     num
2      5

Best answer

You can use SeriesGroupBy.nunique together with boolean indexing or query:

# count distinct airports per id (the COUNT(DISTINCT ...) part)
s = df.groupby('id')['airport'].nunique()
print(s)
id
1    2
2    5
3    1
Name: airport, dtype: int64

# keep only the ids with more than 3 distinct airports (the HAVING part)
df1 = s[s > 3].reset_index()
print(df1)
   id  airport
0   2        5
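
The snippet above assumes df already exists and leaves the count column named airport rather than num. A minimal self-contained sketch, assuming the sample data from the question and using the name argument of Series.reset_index to get the num column the SQL alias asks for (the variable name result is only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 2, 2, 2, 2],
    'airport': ['lax', 'ohare', 'phl', 'lax', 'mdw', 'lax', 'sfw', 'tpe'],
})

# COUNT(DISTINCT airport) per id
s = df.groupby('id')['airport'].nunique()

# HAVING count > 3, then name the count column 'num' like the SQL alias
result = s[s > 3].reset_index(name='num')
print(result)
```

With this data the result should contain a single row, id 2 with num 5.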

Or:

# same idea in one chain: count, turn the result into a DataFrame, then filter
df1 = df.groupby('id')['airport'].nunique().reset_index().query('airport > 3')
print(df1)
   id  airport
1   2        5
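
A possible variant of the chained form, not part of the original answer: named aggregation can produce the num column directly, so query can filter on it by name. This is a sketch that assumes pandas 0.25 or later, where named aggregation is available, and the sample data from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 2, 2, 2, 2],
    'airport': ['lax', 'ohare', 'phl', 'lax', 'mdw', 'lax', 'sfw', 'tpe'],
})

result = (
    df.groupby('id', as_index=False)      # keep id as a regular column
      .agg(num=('airport', 'nunique'))    # COUNT(DISTINCT airport) AS num
      .query('num > 3')                   # HAVING num > 3
)
print(result)
```

As in the chained version above, the surviving row keeps its original positional index; append .reset_index(drop=True) if a fresh index is wanted.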

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/46387992/
