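I wrote some code to extract indexes from a dataframe, but I don't know how to use those indexes to create another dataframe from the original one.
Is it also possible to shorten my current code? It is rather long.
Edited: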
import pandas as pd
a = pd.DataFrame({"a": ["I have something", "I have nothing", "she has something", "she is nice", "she is not nice", "Me", "He"],
                  "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]]})
a = a[a.b.apply(lambda x: len(x)) == 1]  # is it possible to shorten the code from here
c = a.explode("b").groupby("b")
k = ["man", "cat"]
bb = a
for x in k:
    bb = c.get_group(x).head(2).index  # to here?.... this part is supposed to take the first 2 indexes of each element in k
Current result:
                 a      b
4  she is not nice  [cat]
Expected results:
                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]
Best answer
First filter by Series.str.len, then convert the one-element lists to strings so that duplicates can be tested with Series.duplicated. Invert the boolean mask with ~ and filter by boolean indexing:
a = a[a.b.str.len() == 1]
b = a[~a['b'].str[0].duplicated()]
print (b)
                   a      b
0   I have something  [man]
2  she has something  [cat]
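The two lines above compress several steps, so here is a minimal, self-contained sketch that spells them out on the sample data from the question. The intermediate names `single`, `first`, and `is_repeat` are illustrative only and do not appear in the original answer.

```python
import pandas as pd

a = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})

# Keep only rows whose list in column "b" has exactly one element.
single = a[a["b"].str.len() == 1]

# Pull the lone element out of each list, giving a Series of plain strings.
# (Lists are not hashable, so duplicated() is applied to the strings instead.)
first = single["b"].str[0]

# True marks a value that already appeared in an earlier row.
is_repeat = first.duplicated()

# Invert the mask with ~ so that only the first row per value survives.
b = single[~is_repeat]
print(b)
```

With the default `keep='first'`, `duplicated` flags every occurrence after the first, so this keeps one row per distinct value.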
Edit: to keep multiple rows per value, use GroupBy.head:
b1 = a.groupby(a['b'].str[0]).head(2)
print (b1)
                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]
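The question also asks how to turn collected indexes into a new dataframe and how to restrict the result to the values listed in `k`. The following is one possible sketch, not part of the original answer: it assumes `Series.isin` is an acceptable way to restrict to `k`, and the names `single`, `subset`, `idx`, and `result` are hypothetical.

```python
import pandas as pd

a = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})
k = ["man", "cat"]

# Rows with exactly one element in "b".
single = a[a["b"].str.len() == 1]

# Restrict to rows whose single element is one of the wanted keys.
subset = single[single["b"].str[0].isin(k)]

# Index labels of the first two rows for each key.
idx = subset.groupby(subset["b"].str[0]).head(2).index

# Use the labels to select rows from the original dataframe.
result = a.loc[idx]
print(result)
```

`GroupBy.head` already returns the rows themselves, so the extra `.index` and `.loc` step is only needed when the labels are to be applied to a different dataframe that shares the same index.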
Regarding python - Get rows by index and then create another separate dataframe, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58133717/