I am writing a Python script with pandas that works with the following DataFrame:
dog    dog    1 1 1 1 1 1 0 0 1 1
       fox    1 1 1 1 1 1 0 0 1 1
       the    1 1 1 1 1 1 1 0 1 1
       jumps  1 1 1 1 1 1 0 1 1 1
       over   1 1 1 1 1 1 0 0 1 1
fox    dog    1 1 1 1 1 1 0 0 1 1
       fox    1 1 1 1 1 1 0 0 1 1
       the    1 1 1 1 1 1 1 0 1 1
       jumps  1 1 1 1 1 1 0 1 1 1
       over   1 1 1 1 1 1 0 0 1 1
jumps  dog    1 1 1 1 1 1 1 0 1 0
       fox    1 1 1 1 1 1 1 0 1 0
       the    1 0 1 1 1 1 0 0 1 0
       jumps  1 1 1 1 1 1 0 0 1 0
       over   1 0 1 1 1 0 0 1 1 0
over   dog    1 1 1 1 1 1 0 0 1 0
       fox    1 1 1 1 1 1 0 0 1 0
       the    1 0 1 1 1 0 0 1 1 0
       jumps  1 1 0 1 0 1 1 0 1 0
       over   1 1 1 1 1 1 0 0 1 0
the    dog    1 1 1 1 1 1 0 1 1 0
       fox    1 1 1 1 1 1 0 1 1 0
       the    1 1 1 1 1 1 0 0 1 0
       jumps  1 1 0 1 1 1 0 0 1 0
       over   1 1 0 1 0 1 1 0 1 0
Here I want to remove every row whose first-level or second-level row index contains the word "fox", so that the new DataFrame becomes:
dog    dog    1 1 1 1 1 1 0 0 1 1
       the    1 1 1 1 1 1 1 0 1 1
       jumps  1 1 1 1 1 1 0 1 1 1
       over   1 1 1 1 1 1 0 0 1 1
jumps  dog    1 1 1 1 1 1 1 0 1 0
       the    1 0 1 1 1 1 0 0 1 0
       jumps  1 1 1 1 1 1 0 0 1 0
       over   1 0 1 1 1 0 0 1 1 0
over   dog    1 1 1 1 1 1 0 0 1 0
       the    1 0 1 1 1 0 0 1 1 0
       jumps  1 1 0 1 0 1 1 0 1 0
       over   1 1 1 1 1 1 0 0 1 0
the    dog    1 1 1 1 1 1 0 1 1 0
       the    1 1 1 1 1 1 0 0 1 0
       jumps  1 1 0 1 1 1 0 0 1 0
       over   1 1 0 1 0 1 1 0 1 0
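It would be a plus if I could remove several such words in a single query, for example "fox" and "over". I have tried combinations of df.xs and df.drop, but nothing seems to work correctly. Any ideas?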
Best answer
Here is a minimal example:
import pandas as pd

df = pd.DataFrame([['dog', 'dog', 1], ['dog', 'fox', 1], ['dog', 'the', 1],
                   ['fox', 'dog', 0], ['fox', 'fox', 0], ['fox', 'the', 0],
                   ['jumps', 'dog', 1], ['jumps', 'fox', 1], ['jumps', 'the', 1]],
                  columns=['A', 'B', 'C'])
df = df.set_index(['A', 'B'])
#            C
# A     B
# dog   dog  1
#       fox  1
#       the  1
# fox   dog  0
#       fox  0
#       the  0
# jumps dog  1
#       fox  1
#       the  1
# Drop every row whose first-level or second-level index label is in lst.
def remover(df, lst):
    return df.drop(lst, level=0).drop(lst, level=1)

df = df.pipe(remover, ['fox', 'dog'])
#            C
# A     B
# jumps the  1
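As a possible alternative to chaining drop calls, the same filtering can be sketched with a boolean mask built from get_level_values and isin. The helper name drop_words and the demo frame below are hypothetical and not part of the original answer. One difference worth noting: isin simply ignores words that do not occur in the index, whereas drop can raise a KeyError for missing labels depending on the pandas version and the errors argument.

```python
import numpy as np
import pandas as pd

def drop_words(df, words):
    """Return the rows of df whose index holds none of `words` at any level.

    Hypothetical helper for illustration; works for any number of index levels.
    """
    hit = np.zeros(len(df), dtype=bool)
    for level in range(df.index.nlevels):
        # isin gives a boolean array: True where this level's label is in words.
        hit |= df.index.get_level_values(level).isin(words)
    return df[~hit]

# Demo frame with the same two-level index layout as in the question
# (the values in column C are made up).
labels = ['dog', 'fox', 'the', 'jumps', 'over']
idx = pd.MultiIndex.from_product([labels, labels], names=['A', 'B'])
demo = pd.DataFrame({'C': range(len(idx))}, index=idx)

out = drop_words(demo, ['fox', 'over'])
print(len(out))  # 9 rows remain: 3 first-level labels x 3 second-level labels
```

Passing a single-element list such as ['fox'] reproduces the first request in the question, and a longer list covers the "several words in one query" case.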
Regarding "python - completely eliminating row indexes and their rows from a DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48873887/