Python Pandas: removing rows not matching multiple conditions from a DataFrame

Suppose I have a pandas DataFrame like this:

index  fruits    origin      attribute
 1     apple     USA         tasty
 2     apple     France      yummy
 3     apple     USA         juicy
 4     apple     England     juicy
 5     apple     Japan       normal
 6     banana    Canada      nice
 7     banana    Italy       good
 .....

I want to keep the yummy apple from France (row 2) and remove the apples that don't match from the table, like this:

index  fruits    origin      attribute
 1     apple     France      yummy
 2     banana    Canada      nice
 3     banana    Italy       good
 .....

I thought the following should work, but it doesn't:

df.drop(df[(df.fruits == "apple") & (df.origin != "France") | (df.fruits == "apple") & (df.attribute != "yummy")].index)

Then I tried the following, which doesn't work either:

df = df[~df[(df.fruits == "apple") & (df.origin != "France") & (df.attribute != "yummy")]

Any help, guys?

Best answer

To select by the matching condition:

df[(df.fruits != 'apple') | ((df.fruits == 'apple') & (df.origin == 'France') & (df.attribute == 'yummy'))]

#index  fruits  origin  attribute
#1  2    apple  France      yummy
#5  6   banana  Canada       nice
#6  7   banana   Italy       good
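For a self-contained check, the sketch below rebuilds the seven rows visible in the question (the rows hidden behind "....." are left out) and applies the same keep condition. It assumes the question's index values are the DataFrame index rather than a column, and the names `is_apple`, `is_target`, and `kept` are illustrative, not from the original answer.

```python
import pandas as pd

# Sample data copied from the question; rows hidden behind "....." are omitted.
# Assumption: the question's "index" values are used as the DataFrame index.
df = pd.DataFrame(
    {
        "fruits": ["apple", "apple", "apple", "apple", "apple", "banana", "banana"],
        "origin": ["USA", "France", "USA", "England", "Japan", "Canada", "Italy"],
        "attribute": ["tasty", "yummy", "juicy", "juicy", "normal", "nice", "good"],
    },
    index=range(1, 8),
)

is_apple = df["fruits"] == "apple"
is_target = (df["origin"] == "France") & (df["attribute"] == "yummy")

# Keep every non-apple row, plus apples that satisfy both conditions.
kept = df[~is_apple | is_target]
print(kept)
```

Naming the intermediate masks is optional, but it makes the precedence explicit: `~` applies to `is_apple` only, and the result is then combined with `is_target` by `|`.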

To drop by the non-matching condition instead: the rows to remove are those where fruits is apple and either origin is not France or attribute is not yummy:

df[~((df.fruits == 'apple') & ((df.origin != 'France') | (df.attribute != 'yummy')))]

# index fruits  origin  attribute
#1    2  apple  France      yummy
#5    6 banana  Canada       nice
#6    7 banana   Italy       good
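The mask in the question's first attempt with `df.drop` is logically the same as the one above. Judging from the snippet alone, it seems to fail only because `drop` returns a new DataFrame by default and the result is never assigned. A minimal sketch, assuming the same seven sample rows as the question, with illustrative names `to_drop`, `dropped`, and `filtered`:

```python
import pandas as pd

# Sample data copied from the question; rows hidden behind "....." are omitted.
df = pd.DataFrame(
    {
        "fruits": ["apple", "apple", "apple", "apple", "apple", "banana", "banana"],
        "origin": ["USA", "France", "USA", "England", "Japan", "Canada", "Italy"],
        "attribute": ["tasty", "yummy", "juicy", "juicy", "normal", "nice", "good"],
    },
    index=range(1, 8),
)

# Apples that fail at least one of the two conditions.
to_drop = (df["fruits"] == "apple") & (
    (df["origin"] != "France") | (df["attribute"] != "yummy")
)

# drop() returns a new DataFrame by default, so the result must be assigned.
dropped = df.drop(df[to_drop].index)

# Boolean indexing with the negated mask selects the same rows.
filtered = df[~to_drop]

print(dropped.equals(filtered))
```

The second attempt in the question differs in two ways: it joins the two inequality tests with `&`, which would only remove apples that fail both conditions, and it applies `~` to an already filtered DataFrame rather than to the boolean mask.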

Source: a similar question about "Python Pandas : removing rows not matching multiple conditions from dataframe" on Stack Overflow: https://stackoverflow.com/questions/45558673/
