from pandas import DataFrame

df = DataFrame({'A': ['Cat had a nap', 'Dog had puppies', 'Did you see a Donkey', 'kitten got angry', 'puppy was cute'],
                'Cat': [1, 0, 0, 1, 0],
                'Dog': [0, 1, 0, 0, 1]})
                      A  Cat  Dog
0         Cat had a nap    1    0
1       Dog had puppies    0    1
2  Did you see a Donkey    0    0
3      kitten got angry    1    0
4        puppy was cute    0    1
Edit 1: How can I map each row to the concatenated names of the columns that contain a 1 in that row?
Expected output:
                      A  Cat  Dog  Category
0         Cat had a nap    1    0  Cat, Dog
1       Dog had puppies    0    1       Dog
2  Did you see a Donkey    0    0       NaN
3      kitten got angry    1    0  Cat, Dog
4        puppy was cute    0    1       Dog
Best answer
Compare all values of the DataFrame with eq, then use any to check whether each row (or each column) contains at least one True.
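To make the intermediate steps visible, here is a small sketch of the Boolean mask that eq produces and of the two any reductions. The names mask, rows_with_one and cols_with_one are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Cat had a nap', 'Dog had puppies', 'Did you see a Donkey',
          'kitten got angry', 'puppy was cute'],
    'Cat': [1, 0, 0, 1, 0],
    'Dog': [0, 1, 0, 0, 1],
})

# Element-wise comparison: True wherever a cell equals 1.
# The text column 'A' never equals 1, so it is False throughout.
mask = df.eq(1)

# axis=1 reduces across the columns: one Boolean per row.
rows_with_one = mask.any(axis=1)

# The default axis=0 reduces down the rows: one Boolean per column.
cols_with_one = mask.any()

print(rows_with_one.tolist())  # [True, True, False, True, True]
print(cols_with_one.tolist())  # [False, True, True]
```

These two Boolean Series are what the row and column filters select on.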
To filter rows:
# Keep only the rows with at least one 1; the original df is left untouched
df1 = df[df.eq(1).any(axis=1)]
print(df1)
                  A  Cat  Dog
0     Cat had a nap    1    0
1   Dog had puppies    0    1
3  kitten got angry    1    0
4    puppy was cute    0    1
To filter columns:
# Keep only the columns with at least one 1; the original df is left untouched
df2 = df.loc[:, df.eq(1).any()]
print(df2)
   Cat  Dog
0    1    0
1    0    1
2    0    0
3    1    0
4    0    1
To filter both columns and rows:
# Build the mask once and reuse it for both axes
m = df.eq(1)
df3 = df.loc[m.any(axis=1), m.any()]
print(df3)
   Cat  Dog
0    1    0
1    0    1
3    1    0
4    0    1
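Edit:
# Multiply the Boolean mask by the column names (each with a trailing comma), then strip the last comma
df['Category'] = df.eq(1).dot(df.columns + ',').str[:-1]
print(df)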
                      A  Cat  Dog  Category
0         Cat had a nap    1    0       Cat
1       Dog had puppies    0    1       Dog
2  Did you see a Donkey    0    0
3      kitten got angry    1    0       Cat
4        puppy was cute    0    1       Dog
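The dot approach joins names with a bare comma and leaves an empty string for rows without a 1, whereas the expected output in the question uses ", " as the separator and NaN for such rows. The following is a minimal sketch of one way to get that shape, assuming only the Cat and Dog columns should contribute names. The helper join_names and the list flag_cols are hypothetical names introduced here, and the row-wise apply is a swapped-in technique, not the method from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['Cat had a nap', 'Dog had puppies', 'Did you see a Donkey',
          'kitten got angry', 'puppy was cute'],
    'Cat': [1, 0, 0, 1, 0],
    'Dog': [0, 1, 0, 0, 1],
})

# Assumption: only these indicator columns are checked for a 1.
flag_cols = ['Cat', 'Dog']
flags = df[flag_cols].eq(1)

def join_names(row):
    # Collect the names of the columns that are True in this row.
    names = [col for col, flag in row.items() if flag]
    # Return NaN instead of an empty string when no column matched.
    return ', '.join(names) if names else np.nan

df['Category'] = flags.apply(join_names, axis=1)
print(df)
```

A row-wise apply is slower than dot on large frames, but it makes the separator and the missing-value handling explicit.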
Regarding python - subsetting a DataFrame on axis 1 based on a row condition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49996410/