I need to drop consecutive rows based on a column value. My DataFrame looks like this:
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1",
               "c2", "c2", "c2", "c2", "c2", "c2"],
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})
The DataFrame looks like this:
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 4 1
c1 2 1
c1 11 0
c1 17 0
c1 1 1
c1 5 1
c2 8 1
c2 2 1
c2 16 0
c2 13 0
c2 1 1
c2 4 1
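For each CustID, I need to drop consecutive rows only where Status is 1, keeping the first row of each such run. Could you tell me the best way to do this?
The output should look like this: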
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 11 0
c1 17 0
c1 1 1
c2 8 1
c2 16 0
c2 13 0
c2 1 1
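Best answer
Create a single boolean mask for the whole DataFrame.
Given that the DataFrame is already grouped by ID (the rows for each ID are contiguous), find the rows where Status is 1, the previous row's Status is also 1, and the ID matches the previous row's ID. These are the rows to drop, so keep the rest.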
to_drop = (df['Status'].eq(1) & df['Status'].shift().eq(1) # Consecutive 1s
& df['CustID'].eq(df['CustID'].shift())) # Within same ID
df[~to_drop]
CustID saleValue Status
0 c1 10 0
1 c1 12 0
2 c1 13 0
3 c1 6 1
6 c1 11 0
7 c1 17 0
8 c1 1 1
10 c2 8 1
12 c2 16 0
13 c2 13 0
14 c2 1 1
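The mask above relies on the rows for each CustID being contiguous. As a hedged variation (not part of the original answer), the same idea can be sketched with `groupby(...).shift()`, which takes the previous Status within each customer and so should not depend on row order between customers. The names `prev_status` and `result` are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1"] * 10 + ["c2"] * 6,
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})

# Previous Status within the same CustID; the first row of each
# customer has no predecessor and gets NaN.
prev_status = df.groupby("CustID")["Status"].shift()

# A row is dropped when it has Status 1 and the previous row of the
# same customer also has Status 1. NaN never equals 1, so the first
# row of each customer is always kept.
to_drop = df["Status"].eq(1) & prev_status.eq(1)

result = df[~to_drop]
print(result)
```

On the sample data this keeps the same rows as the mask in the answer. Whether it is preferable depends on the data: the plain `shift()` version avoids a groupby, while this one does not need the explicit CustID comparison.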
Regarding "Python dataframe - drop consecutive rows based on a column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64231937/