我有 2 个 pandas 数据帧(df1、df2),我正在尝试从中提取数据并创建第三个数据帧(df3)
df1 有 2 列(一个 id 列和另一列,用于保存第二个数据帧 (df2) 中的列名称
df1 looks like:
===============
id1 name
--- ----
1 df2_column1_name
5 df2_column1_name
33 df2_column3_name
...
... and so on
df2 looks like:
===============
id2 df2_column1_name df2_column2_name df2_column2_name .... and so on
--- ---------------- ---------------- ----------------
12 Jimmy male 25 ....
16 Becky female 30 ....
75 Mike male 80 ....
....
.... and so on
I am trying to create df3 to look like:
=======================================
column1 Column2 Column3
------- ------- -------
1 12 Jimmy
5 12 male
33 12 25
.
.
1 16 Becky
5 16 female
33 16 30
.
.
1 75 Mike
5 75 male
33 75 80
.
.
.
数据帧可能非常大。我试图找出最有效的方法来做到这一点,如果可能的话,无需双循环。请建议最好的方法来做到这一点。谢谢
最佳答案
堆栈和合并可以帮助您完成大部分工作:
In [11]: df2.set_index("id2").stack().reset_index(name='value')
Out[11]:
id2 level_1 value
0 12 df2_column1_name Jimmy
1 12 df2_column2_name male
2 12 df2_column3_name 25
3 16 df2_column1_name Becky
4 16 df2_column2_name female
5 16 df2_column3_name 30
6 75 df2_column1_name Mike
7 75 df2_column2_name male
8 75 df2_column3_name 80
In [12]: df2.set_index("id2").stack().reset_index(name='value').merge(df1, right_on="name", left_on="level_1")
Out[12]:
id2 level_1 value id1 name
0 12 df2_column1_name Jimmy 1 df2_column1_name
1 16 df2_column1_name Becky 1 df2_column1_name
2 75 df2_column1_name Mike 1 df2_column1_name
3 12 df2_column2_name male 5 df2_column2_name
4 16 df2_column2_name female 5 df2_column2_name
5 75 df2_column2_name male 5 df2_column2_name
6 12 df2_column3_name 25 33 df2_column3_name
7 16 df2_column3_name 30 33 df2_column3_name
8 75 df2_column3_name 80 33 df2_column3_name
最后,您必须仅选择所需的列并进行排序:
In [13]: df2.set_index("id2").stack().reset_index(name='value').merge(df1, right_on="name", left_on="level_1")[["id1", "id2", "value"]].sort_v
...: alues("id2")
Out[13]:
id1 id2 value
0 1 12 Jimmy
3 5 12 male
6 33 12 25
1 1 16 Becky
4 5 16 female
7 33 16 30
2 1 75 Mike
5 5 75 male
8 33 75 80
关于python - pandas dataframes 将行合并为列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/54583621/