python - Merge specific rows of a DataFrame and drop the unused rows

Tags: python database python-2.7 pandas dataframe

I have a DataFrame named df1 that looks like this:

details              endFrame  id   indexID  object      startFrame
'series of numbers'  1111      78   0        Motorbike   1
'series of numbers'  3647      78   1        Motorbike   1112
'series of numbers'  3678      78   2        Motorbike   3649
'series of numbers'  704       120  3        Pedestrian  66
'series of numbers'  817       120  4        Pedestrian  705
'series of numbers'  922       120  5        Pedestrian  818
'series of numbers'  121       110  6        Pedestrian  69
'series of numbers'  140       109  7        Pedestrian  69
'series of numbers'  4161      109  8        Pedestrian  140
'series of numbers'  4344      109  9        Pedestrian  4163
'series of numbers'  3603      79   10       Motorbike   70
'series of numbers’  3603   79  10  Motorbike   70

I also have another DataFrame, df2, that looks like this:

indexID matchID
0   1
1   2
3   4
4   5
7   8
8   9

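The matchID column shows which rows need to be joined. For example, the first two rows of df2 mean that indexID 0, 1 and 2 should be merged into one row. In df1, the details of all merged rows should be concatenated. The final DataFrame should look like this: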

details                                                    id    indexID
'series of numbers''series of numbers''series of numbers'  78    0
'series of numbers''series of numbers''series of numbers'  120   3
'series of numbers'                                        110   6
'series of numbers''series of numbers''series of numbers'  109   7
'series of numbers'                                        79    10

How can I do this?

Edit: the series of numbers is actually a list. Instead of output like this:

details                                                  id    indexID
[series of numbers][series of numbers][series of numbers]     78    0
[series of numbers][series of numbers][series of numbers]     120   3
[series of numbers]                                           110   6
[series of numbers][series of numbers][series of numbers]     109   7
[series of numbers]                                            79   10

I would like output like this:

details                                                  id    indexID
[series of numbersseries of numbersseries of numbers]     78    0
[series of numbersseries of numbersseries of numbers]     120   3
[series of numbers]                                           110   6
[series of numbersseries of numbersseries of numbers]     109   7
[series of numbers]                                            79   10

Best answer

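Use mask together with isin to replace the indexID values that appear in df2['matchID'] with missing values, then forward-fill them with the previous value: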

g = df1['indexID'].mask(df1['indexID'].isin(df2['matchID'])).ffill().astype(int)
print (g)
0      0
1      0
2      0
3      3
4      3
5      3
6      6
7      7
8      7
9      7
10    10
Name: indexID, dtype: int32
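
To make the step above reproducible, here is a minimal sketch that rebuilds only the columns needed for the grouping key. The frames are toy stand-ins for the question's data, and the name `is_continuation` is introduced here purely for illustration. Note that forward-filling assumes the rows of each chain are adjacent and already in order in df1, as they are in the question.

```python
import pandas as pd

# Toy stand-ins for the question's frames; only the columns needed here.
df1 = pd.DataFrame({
    "indexID": list(range(11)),
    "id": [78, 78, 78, 120, 120, 120, 110, 109, 109, 109, 79],
})
df2 = pd.DataFrame({
    "indexID": [0, 1, 3, 4, 7, 8],
    "matchID": [1, 2, 4, 5, 8, 9],
})

# True for rows that continue a chain (their indexID appears as a matchID).
# `is_continuation` is an illustrative name, not part of the original answer.
is_continuation = df1["indexID"].isin(df2["matchID"])

# Blank out continuation rows, then carry the chain's first indexID forward.
g = df1["indexID"].mask(is_continuation).ffill().astype(int)

print(g.tolist())
```

Each row now carries the indexID of the first row in its chain, which is what the grouping step uses as its key.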

Then use groupby with join:

# if you want to group only by the new Series g
df = df1.groupby(g).agg({'details':' '.join, 'id':'first'}).reset_index()
print (df)
   indexID                                            details   id
0        0  'series of numbers' 'series of numbers' 'serie...   78
1        3  'series of numbers' 'series of numbers' 'serie...  120
2        6                                'series of numbers'  110
3        7  'series of numbers' 'series of numbers' 'serie...  109
4       10                                'series of numbers'   79

# or group by the id column as well
df = df1.groupby(['id',g], sort=False)['details'].agg(' '.join).reset_index()
print (df)
    id  indexID                                            details
0   78        0  'series of numbers' 'series of numbers' 'serie...
1  120        3  'series of numbers' 'series of numbers' 'serie...
2  110        6                                'series of numbers'
3  109        7  'series of numbers' 'series of numbers' 'serie...
4   79       10                                'series of numbers'
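
The `' '.join` aggregation above works when details holds strings. For the case described in the question's edit, where each details value is a list and the merged row should hold one flat list, a possible sketch is to aggregate with a flattening function instead. The data below is made up for illustration, and `flatten` is a hypothetical helper name, not part of the original answer.

```python
from itertools import chain

import pandas as pd

# Made-up data: each details value is a list, as in the question's edit.
df1 = pd.DataFrame({
    "details": [[1, 2], [3], [4, 5], [6], [7, 8]],
    "id": [78, 78, 78, 110, 79],
    "indexID": [0, 1, 2, 3, 4],
})
df2 = pd.DataFrame({"indexID": [0, 1], "matchID": [1, 2]})

# Same grouping key as in the answer: first indexID of each chain.
g = df1["indexID"].mask(df1["indexID"].isin(df2["matchID"])).ffill().astype(int)


def flatten(lists):
    """Concatenate the lists of one group into a single flat list."""
    return list(chain.from_iterable(lists))


# Flatten details per group and keep the first id of each group.
df = df1.groupby(g).agg({"details": flatten, "id": "first"}).reset_index()
print(df)
```

`chain.from_iterable` is used rather than `sum(lists, [])` because it avoids repeatedly copying the growing list, which matters when the lists are long.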

For python - Merge specific rows of a DataFrame and drop the unused rows, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52438106/
