python - Get rows by index, then create another separate DataFrame

Tags: python pandas

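I wrote some code to extract index labels from a DataFrame, but I don't know how to use those labels to create another DataFrame from the original one.

Is it also possible to shorten my current code? It is rather long.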

Edited:

import pandas as pd

a = pd.DataFrame({"a":["I have something", "I have nothing", "she has something", "she is nice", "she is not nice","Me", "He"],
                 "b":[["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]]})
a = a[a.b.apply(lambda x:len(x)) == 1] # is it possible to shorten the code from here
c = a.explode("b").groupby("b")
k = ["man", "cat"]
bb = a
for x in k:
    bb = c.get_group(x).head(2).index # to here?.... this part is supposed to take the first 2 indexes of each element in k

Current result:

                 a      b
4  she is not nice  [cat]

Expected results:


                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]

Best answer

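First filter by Series.str.len to keep the one-element lists, then convert each one-element list to a plain string with .str[0], so that duplicates can be tested with Series.duplicated. Invert the boolean mask with ~ and filter by boolean indexing: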

a = a[a.b.str.len() == 1]

b = a[~a['b'].str[0].duplicated()]
print (b)
                   a      b
0   I have something  [man]
2  she has something  [cat]
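
The intermediate steps of that one-liner can be spelled out to show what each piece contributes. This is only a minimal sketch, assuming the sample data from the question; the names `first` and `mask` are illustrative and not part of the original answer:

```python
import pandas as pd

a = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})

# Keep only the rows whose list in column b has exactly one element.
a = a[a["b"].str.len() == 1]

# Pull the single element out of each list so the values are plain strings.
first = a["b"].str[0]

# duplicated() flags every repeat of a value after its first occurrence
# (the default is keep="first").
mask = first.duplicated()

# Inverting the mask keeps only the first row for each distinct value.
b = a[~mask]
print(b)
```

Passing keep="last" to duplicated() would instead keep the last row of each value.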

Edit: to keep multiple rows per value, use GroupBy.head:

b1 = a.groupby(a['b'].str[0]).head(2)
print (b1)
                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]
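
The question also asked how to turn a set of index labels back into rows of the original frame, and its code only looked at the values listed in `k`. The following is a hedged sketch of one way to combine both, assuming the sample data from the question; `sub`, `idx`, and `result` are illustrative names, not part of the original answer:

```python
import pandas as pd

a = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})
a = a[a["b"].str.len() == 1]

k = ["man", "cat"]

# Restrict to rows whose single element is one of the wanted values.
sub = a[a["b"].str[0].isin(k)]

# Take the index labels of the first two rows for each value.
idx = sub.groupby(sub["b"].str[0]).head(2).index

# Index labels select rows of the original frame through .loc.
result = a.loc[idx]
print(result)
```

If only the rows are needed and not the labels themselves, the `.index` and `.loc` steps can be dropped, since GroupBy.head already returns the matching rows in their original order.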

Regarding "python - Get rows by index, then create another separate DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58133717/
