python - Pandas + groupby

Tags: python pandas pandas-groupby data-analysis

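The dataset has 4 columns: name is the child's name, yearofbirth is the year the child was born, number is the number of babies given that name in that year, and sex is the child's sex.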

   For example, entry 1 reads: in the year 1880, 7065 girls were named Mary.

Using pandas, I am trying to find the most common name for each year. My code:

   df.groupby(['yearofbirth']).agg({'number':'max'}).reset_index()

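The code above only partially answers the question: it returns the maximum number for each year, but not the name that goes with it.

I want both the name and the maximum number.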
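
To make the gap concrete, here is a minimal sketch on a small sample frame. Only the 1880 count for Mary comes from the dataset described above; the other rows and the variable name `per_year_max` are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the 1880 Mary count is taken from the question.
df = pd.DataFrame({
    "name": ["Mary", "Anna", "Mary", "Anna"],
    "sex": ["F", "F", "F", "F"],
    "number": [7065, 2604, 6919, 2698],
    "yearofbirth": [1880, 1880, 1881, 1881],
})

# Aggregating only 'number' keeps the per-year maximum but drops the name column.
per_year_max = df.groupby(["yearofbirth"]).agg({"number": "max"}).reset_index()
print(per_year_max.columns.tolist())  # ['yearofbirth', 'number']
```

The result has one row per year, but nothing in it says which name reached that maximum.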

Best answer

Based on the answers to this question, I came up with this solution:

# Boolean mask: True where a row's number equals the maximum for its year
idx = df.groupby('yearofbirth')['number'].transform('max') == df['number']
df = df[idx]

print(df)

    name    number  sex yearofbirth
0   Mary    7065    F   1880
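
As an alternative sketch, not part of the original answer, `idxmax` can pick out one row per year directly. The sample rows and the names `top_idx` and `top_names` are assumptions for illustration; only the 1880 Mary count comes from the question.

```python
import pandas as pd

# Hypothetical sample rows; only the 1880 Mary count is taken from the question.
df = pd.DataFrame({
    "name": ["Mary", "Anna", "Mary", "Anna"],
    "sex": ["F", "F", "F", "F"],
    "number": [7065, 2604, 6919, 2698],
    "yearofbirth": [1880, 1880, 1881, 1881],
})

# For each year, idxmax gives the index label of the row with the largest number.
top_idx = df.groupby("yearofbirth")["number"].idxmax()

# Look those rows up in the original frame to recover the name as well.
top_names = df.loc[top_idx, ["yearofbirth", "name", "number"]].reset_index(drop=True)
print(top_names)
```

One difference worth knowing: if two names tie for the maximum in a year, `idxmax` keeps only the first of them, while the boolean mask built with `transform` keeps every tied row.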

For more on this topic, see the similar question on Stack Overflow: https://stackoverflow.com/questions/52422195/
