I have a DataFrame like this:
product_id dt stock_qty
226870 2948259 2017-11-11 17.000
233645 2948259 2017-11-12 17.000
240572 2948260 2017-11-13 5.000
247452 2948260 2017-11-14 5.000
233644 2948260 2017-11-12 5.000
226869 2948260 2017-11-11 5.000
247451 2948262 2017-11-14 -2.000
226868 2948262 2017-11-11 -1.000 <- not duplicated
240571 2948262 2017-11-13 -2.000
240570 2948264 2017-11-13 5.488
233643 2948264 2017-11-12 5.488
244543 2948269 2017-11-11 2.500
247450 2948276 2017-11-14 3.250
226867 2948276 2017-11-11 3.250
I need to remove the rows that share the same product_id but have different stock_qty values. So I should end up with a DataFrame like this:
product_id dt stock_qty
226870 2948259 2017-11-11 17.000
233645 2948259 2017-11-12 17.000
240572 2948260 2017-11-13 5.000
247452 2948260 2017-11-14 5.000
233644 2948260 2017-11-12 5.000
226869 2948260 2017-11-11 5.000
240570 2948264 2017-11-13 5.488
233643 2948264 2017-11-12 5.488
244543 2948269 2017-11-11 2.500
247450 2948276 2017-11-14 3.250
226867 2948276 2017-11-11 3.250
Thanks for your help!
Best answer
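You need drop_duplicates with keep=False to get the product_id values of the rows whose stock_qty occurs only once in the frame, then flag those products with isin. The second condition, duplicated on product_id, is chained with xor (^):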
m1 = df['product_id'].isin(df.drop_duplicates('stock_qty', keep=False)['product_id'])
m2 = df.duplicated('product_id', keep=False)
df = df[m1 ^ m2]
print (df)
product_id dt stock_qty
226870 2948259 2017-11-11 17.000
233645 2948259 2017-11-12 17.000
240572 2948260 2017-11-13 5.000
247452 2948260 2017-11-14 5.000
233644 2948260 2017-11-12 5.000
226869 2948260 2017-11-11 5.000
240570 2948264 2017-11-13 5.488
233643 2948264 2017-11-12 5.488
244543 2948269 2017-11-11 2.500
247450 2948276 2017-11-14 3.250
226867 2948276 2017-11-11 3.250
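To try the snippet above end to end, the sample frame can be rebuilt by hand. This is a minimal sketch: the values are copied from the question, dt is kept as plain strings (an assumption, since the original dtype is not shown), and the name result is introduced here only to avoid overwriting df.

```python
import pandas as pd

# Sample data copied from the question; the index labels are the original row labels.
df = pd.DataFrame(
    {
        "product_id": [2948259, 2948259, 2948260, 2948260, 2948260, 2948260,
                       2948262, 2948262, 2948262, 2948264, 2948264, 2948269,
                       2948276, 2948276],
        # dt is kept as strings here (assumption: the real dtype is unknown).
        "dt": ["2017-11-11", "2017-11-12", "2017-11-13", "2017-11-14",
               "2017-11-12", "2017-11-11", "2017-11-14", "2017-11-11",
               "2017-11-13", "2017-11-13", "2017-11-12", "2017-11-11",
               "2017-11-14", "2017-11-11"],
        "stock_qty": [17.0, 17.0, 5.0, 5.0, 5.0, 5.0, -2.0, -1.0, -2.0,
                      5.488, 5.488, 2.5, 3.25, 3.25],
    },
    index=[226870, 233645, 240572, 247452, 233644, 226869, 247451, 226868,
           240571, 240570, 233643, 244543, 247450, 226867],
)

# m1: rows whose product_id owns at least one stock_qty seen only once in the frame.
m1 = df["product_id"].isin(df.drop_duplicates("stock_qty", keep=False)["product_id"])
# m2: rows whose product_id appears more than once.
m2 = df.duplicated("product_id", keep=False)

# XOR keeps rows where exactly one of the two masks is True.
result = df[m1 ^ m2]
print(result)
```

With this data, the three rows of product_id 2948262 are dropped because both masks are True for them, while the single-row product 2948269 is kept because only m1 is True.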
Details:
print (m1)
226870 False
233645 False
240572 False
247452 False
233644 False
226869 False
247451 True
226868 True
240571 True
240570 False
233643 False
244543 True
247450 False
226867 False
Name: product_id, dtype: bool
print (m2)
226870 True
233645 True
240572 True
247452 True
233644 True
226869 True
247451 True
226868 True
240571 True
240570 True
233643 True
244543 False
247450 True
226867 True
dtype: bool
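One caveat, as far as I can tell: m1 is built from stock_qty values that occur exactly once in the whole frame, not within each product. If a product has differing quantities that each also appear under some other product, the XOR mask would keep it. A possible alternative, which is not part of the original answer, is to count the distinct stock_qty values per product_id with groupby and transform("nunique") and keep only the groups with a single distinct value. The frame below is hypothetical and built only to show the difference.

```python
import pandas as pd

# Hypothetical data: product 2 has two different quantities (5.0 and 17.0),
# but both values also occur under other products.
df = pd.DataFrame({
    "product_id": [1, 1, 2, 2, 3, 4, 4],
    "stock_qty": [5.0, 5.0, 5.0, 17.0, 2.5, 17.0, 17.0],
})

# Approach from the answer: XOR of the two masks.
m1 = df["product_id"].isin(df.drop_duplicates("stock_qty", keep=False)["product_id"])
m2 = df.duplicated("product_id", keep=False)
xor_result = df[m1 ^ m2]

# Alternative sketch: number of distinct stock_qty values per product_id,
# broadcast back to every row of the group (NaN values are not counted).
n_distinct = df.groupby("product_id")["stock_qty"].transform("nunique")
consistent = df[n_distinct == 1]

print(xor_result)   # still contains the rows of product 2
print(consistent)   # rows of product 2 are removed
```

On the data from the question both approaches select the same rows, so the choice only matters when quantities can repeat across products.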
Regarding Python/Pandas - removing non-duplicated rows, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47312040/