I have the following dataframes:
df1:
+-----+------+------+------+------+-----+
| No. | col1 | col2 | col3 | Type | ... |
+-----+------+------+------+------+-----+
| 123 |    2 |    5 |    2 | MN   | ... |
| 453 |    4 |    3 |    1 | MN   | ... |
| 146 |    7 |    9 |    4 | AA   | ... |
| 175 |    2 |    4 |    3 | MN   | ... |
| 643 |    0 |    0 |    0 | NAN  | ... |
+-----+------+------+------+------+-----+
df2:
+-----+------+------+------+------+
| No. | col1 | col2 | col3 | Type |
+-----+------+------+------+------+
| 123 |   24 |   57 |   22 | MN   |
| 453 |   41 |   39 |   15 | MN   |
| 175 |   21 |   43 |   37 | MN   |
+-----+------+------+------+------+
I want to replace the values of col1, col2 and col3 in df1 with the corresponding values from df2 wherever Type equals MN.
Expected output:
df1:
+-----+------+------+------+------+-----+
| No. | col1 | col2 | col3 | Type | ... |
+-----+------+------+------+------+-----+
| 123 |   24 |   57 |   22 | MN   | ... |
| 453 |   41 |   39 |   15 | MN   | ... |
| 146 |    7 |    9 |    4 | AA   | ... |
| 175 |   21 |   43 |   37 | MN   | ... |
| 643 |    0 |    0 |    0 | NAN  | ... |
+-----+------+------+------+------+-----+
Edit
I tried:
df1[df1.Type == 'MN'] = df2.values
But I get this error:
ValueError: Must have equal len keys and value when setting with an ndarray
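My guess is that this happens because df2 does not have the same number of columns as df1. So how can I make sure that only specific columns (col1 to col3) are replaced in df1?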
Best answer
I think you need combine_first, matching rows on the No. column:
#filter only `MN` rows if necessary
df22 = df2[df2['Type'] == 'MN'].set_index('No.')
df1 = df22.combine_first(df1.set_index('No.')).reset_index().reindex(columns=df1.columns)
print (df1)
   No.  col1  col2  col3 Type  col
0  123  24.0  57.0  22.0   MN  ...
1  146   7.0   9.0   4.0   AA  ...
2  175  21.0  43.0  37.0   MN  ...
3  453  41.0  39.0  15.0   MN  ...
4  643   0.0   0.0   0.0  NAN  ...
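Note that the combine_first output above comes back sorted by No. and with col1 to col3 as floats. If the original row order matters, one possible alternative is a label-based assignment restricted to the three columns. The following is only a sketch, not part of the original answer: it rebuilds the sample frames without the elided "..." column, assumes that No. is unique in both frames, and the names cols, left, right, keys and result are made up for illustration.

```python
import pandas as pd

# Sample data from the question (the elided "..." column is left out).
df1 = pd.DataFrame({
    "No.": [123, 453, 146, 175, 643],
    "col1": [2, 4, 7, 2, 0],
    "col2": [5, 3, 9, 4, 0],
    "col3": [2, 1, 4, 3, 0],
    "Type": ["MN", "MN", "AA", "MN", "NAN"],
})
df2 = pd.DataFrame({
    "No.": [123, 453, 175],
    "col1": [24, 41, 21],
    "col2": [57, 39, 43],
    "col3": [22, 15, 37],
    "Type": ["MN", "MN", "MN"],
})

# Only these columns are overwritten (hypothetical helper name).
cols = ["col1", "col2", "col3"]

# Assumption: "No." is unique in both frames, so it can serve as the index.
left = df1.set_index("No.")
right = df2.set_index("No.")

# Keys whose Type is MN in df1 and that also exist in df2.
keys = left.index[(left["Type"] == "MN") & left.index.isin(right.index)]

# Label-based assignment: rows and columns are matched by label,
# so only col1..col3 of the selected rows change.
left.loc[keys, cols] = right.loc[keys, cols]

result = left.reset_index()
print(result)
```

Because nothing is reindexed or sorted here, the rows stay in df1's original order, and rows with another Type (or with no match in df2) are left untouched.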
Source: "python - Pandas: replace values in a dataframe with values from another dataframe based on a condition" on Stack Overflow: https://stackoverflow.com/questions/50324600/