Python pandas: Is there a faster way to split and recombine a DataFrame based on criteria?

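I want to group a DataFrame by a particular column, "ContactID", but if a group's "PaymentType" column does not contain a specific value, I want to drop that entire group from the DataFrame.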

I have something like this:

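import pandas as pd

# `data` is the existing DataFrame with 'ContactID' and 'PaymentType' columns.
UniqueID = data.drop_duplicates('ContactID')['ContactID'].tolist()
OnlyRefinance = []
for i in UniqueID:
    # Select every row that belongs to this contact.
    splits = data[data['ContactID'] == i].reset_index(drop=True)
    # Keep the group only if at least one row has PaymentType 160.
    if (splits['PaymentType'] == 160).any():
        OnlyRefinance.append(splits)
OnlyRefinance = pd.concat(OnlyRefinance)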

This works, but it is very slow, and I would like to know whether there is a faster way to accomplish it.

Best answer

Another option is to use groupby.filter:

data.groupby("ContactID").filter(lambda g: (g.PaymentType == 160).any())

This keeps only the groups whose PaymentType column contains 160; every row of any other group is dropped.
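To make the behavior concrete, here is a minimal sketch on a small made-up DataFrame (the sample values are assumptions for illustration, not data from the question). It runs the groupby.filter call from the answer and, as a further option that is not part of the original answer, builds the same result with a boolean mask from isin, which avoids calling a Python function once per group and may be faster when there are many groups; it is worth timing both on your own data.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
data = pd.DataFrame({
    "ContactID":   [1, 1, 2, 2, 3],
    "PaymentType": [160, 200, 200, 300, 160],
})

# Approach from the answer: keep groups where any row has PaymentType == 160.
filtered = data.groupby("ContactID").filter(
    lambda g: (g["PaymentType"] == 160).any()
)

# Alternative (an assumption, not from the answer): collect the qualifying
# ContactIDs once, then select all of their rows with a boolean mask.
ids_with_160 = data.loc[data["PaymentType"] == 160, "ContactID"].unique()
masked = data[data["ContactID"].isin(ids_with_160)]

print(filtered)
```

Both results keep the rows for ContactID 1 and 3 and drop ContactID 2. Unlike the loop in the question, they preserve the original row order and index labels; append .reset_index(drop=True) if a fresh index is wanted.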

Regarding "Python pandas: Is there a faster way to split and recombine a DataFrame based on criteria?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42095584/
