import pandas as pd

dfa = {'account': ['a', 'b', 'a', 'c', 'a'],
       'ret_type': ['CTR', 'WO', 'T', 'CTR', 'T'],
       'val': ['0.0', '0.1', '0.2', '0.3', '0.4'],
       'ins_date': ['11', '12', '11', '13', '14']}
df = pd.DataFrame(dfa)
  account ret_type  val ins_date
0       a      CTR  0.0       11
1       b       WO  0.1       12
2       a        T  0.2       11
3       c      CTR  0.3       13
4       a        T  0.4       14
I have a requirement to eliminate duplicate rows, such that:
1 A duplicate row means a repeated combination of (account, ins_date)
2 If a duplicate is found, I need to keep the row with ret_type CTR and drop the row with T
3 I don't want to delete T rows that have no duplicate, such as row 4
4 In this example, row 2 should be deleted from the final output
How should I do this?
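Best answer
Please check this; it should give you what you need. The first line flags every row whose (account, ins_date) pair occurs more than once, and the second keeps only rows that are either CTR or not flagged. Note that the helper column "duplicated" remains in the result.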
df["duplicated"] = df[["account", "ins_date"]].duplicated(keep=False)
df = df[(df.ret_type == 'CTR') | ~df["duplicated"]]
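For reference, here is a self-contained sketch of a slightly different variant, not the answerer's exact code. It uses a temporary mask (the names is_dup and result are illustrative) instead of adding a column, and it drops only T rows that belong to a duplicated (account, ins_date) pair. It assumes that duplicated rows of other types, such as WO, should be kept, which the question does not state; the answer above would drop those. On the sample data both approaches keep rows 0, 1, 3 and 4.

```python
import pandas as pd

df = pd.DataFrame({
    'account': ['a', 'b', 'a', 'c', 'a'],
    'ret_type': ['CTR', 'WO', 'T', 'CTR', 'T'],
    'val': ['0.0', '0.1', '0.2', '0.3', '0.4'],
    'ins_date': ['11', '12', '11', '13', '14'],
})

# True for every row whose (account, ins_date) pair occurs more than once.
is_dup = df.duplicated(subset=['account', 'ins_date'], keep=False)

# Assumption: only 'T' rows inside a duplicated pair are dropped;
# every other row, including unduplicated 'T' rows, is kept.
result = df[~(is_dup & (df['ret_type'] == 'T'))]

print(result)
```

If the original row labels are not needed afterwards, result.reset_index(drop=True) renumbers the rows from 0.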
Regarding "python - Eliminate duplicate rows in a DataFrame and keep rows with a specific string value", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54144880/