python - Custom groupby function in pandas

I have the following DataFrame:

I want to group by id and add a flag column that is Y whenever Y appears anywhere for that id. The resulting DataFrame would look like this:

Here is my approach. It is too slow, and I am not sure it is correct:

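import numpy as np
import pandas as pd

# Assumes df already exists with 'id' and 'flag' columns.
parts = []
j = 'flag'
for i in df['id'].unique():
    test = df[df['id'] == i].copy()  # copy so the assignment does not hit a view of df
    # If any row for this id is 'Y', mark every row of the id as 'Y'
    test[j] = np.where((test[j] == 'Y').any(), 'Y', test[j])
    parts.append(test)  # collect inside the loop; DataFrame.append was removed in pandas 2.0
temp = pd.concat(parts)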

Best answer

You can use groupby + max, because the string 'Y' compares greater than 'N':

df.groupby('id', as_index=False)['flag'].max()
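
As a rough sketch of how this behaves, the example below runs the same idea on made-up data. The sample values and the column names flag_any and flag_explicit are assumptions for illustration only, since the original DataFrame is not shown. Note that max() on its own returns one row per id; if the flag should be written back onto every original row, transform('max') is one way to broadcast it. The last variant does not rely on string ordering, which may be safer if flag can hold values other than 'Y' and 'N'.

```python
import pandas as pd

# Hypothetical sample data; the question's actual DataFrame is not shown.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3],
    "flag": ["N", "Y", "N", "N", "Y"],
})

# One row per id: 'Y' > 'N' as strings, so max is 'Y' whenever it appears.
per_id = df.groupby("id", as_index=False)["flag"].max()

# Same aggregation, broadcast back onto every original row.
df["flag_any"] = df.groupby("id")["flag"].transform("max")

# Variant that does not depend on string ordering:
# True for every row whose id has at least one 'Y'.
has_y = df["flag"].eq("Y").groupby(df["id"]).transform("any")
# Keep the original flag where the id has no 'Y', otherwise set 'Y'.
df["flag_explicit"] = df["flag"].where(~has_y, "Y")

print(per_id)
print(df)
```

Both row-level variants avoid the per-id Python loop, which is usually where the time goes on larger frames.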

For this question (python - Custom groupby function in pandas), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72116380/
