python - Replace NaN in a DataFrame during a merge/left join

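I am merging two DataFrames with a left join. However, if the value in a particular column is blank or NaN, I want to substitute the value from the "right" DataFrame (and only in that case; otherwise I want to ignore the 'Cost' data in df2).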

import numpy as np
import pandas as pd

df1 = pd.DataFrame({
         'ID':[1,2,3,4,5,6],
         'Version':[1,1,2,2,1,2],
         'Cost':[17,np.nan,24,21,'',8]})

df2 = pd.DataFrame({
         'ID':[1,2,3,4,5,6,7,8,9],
         'Color':["Red","Orange","Green","Blue","Indigo", "Violet","Black","White","Gold"],
         'UnUsedData': ['foo','bar','foo','bar','foo','bar','foo','bar','foo'],
         'Cost':[17,34,54,28,22,8,43,23,12]})

The merge statement is:

df_new = pd.merge(df1, df2[['ID', 'Color']], on='ID', how='left')

which yields:

   ID  Version Cost   Color
0   1        1   17     Red
1   2        1   NaN  Orange
2   3        2   24   Green
3   4        2   21    Blue
4   5        1       Indigo
5   6        2    8  Violet

But I would like the output to look like this (the Cost values in index rows 1 and 4 change):

   ID  Version Cost   Color
0   1        1   17   Red
1   2        1   34   Orange
2   3        2   24   Green
3   4        2   21   Blue
4   5        1   22   Indigo
5   6        2    8   Violet

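I could loop over the individual values in df_new's Cost column and look up each blank or NaN one in df2, but it seems there should be a more elegant/simpler way. Perhaps using fillna() somehow? The examples I have seen replace NaN with a constant value, not with a value that varies by item.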

Best answer

You can use combine_first to take the first non-NA value:

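# merge, bringing Cost along from df2 as well; the two Cost columns
# get the default suffixes and become Cost_x (df1) and Cost_y (df2)
dfx = pd.merge(df1, df2[['ID', 'Color', 'Cost']], on='ID', how='left')

# replace empty strings with NaN
dfx = dfx.replace("", np.nan)

# coalesce the cost columns to get the first non-NA value
dfx['Cost'] = dfx['Cost_x'].combine_first(dfx['Cost_y']).astype(int)

# remove the helper columns (the positional axis argument, drop(..., 1),
# is no longer accepted by recent pandas versions)
dfx = dfx.drop(columns=['Cost_x', 'Cost_y'])
print(dfx)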

   ID  Version   Color  Cost
0   1        1     Red    17
1   2        1  Orange    34
2   3        2   Green    24
3   4        2    Blue    21
4   5        1  Indigo    22
5   6        2  Violet     8
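
Since the question asks about fillna(), here is a minimal sketch of an alternative that is not part of the answer above: fillna() also accepts a Series, so the fallback can vary per row. It assumes the IDs in df2 are unique, and it uses pd.to_numeric(errors='coerce') to turn the blank string into NaN, which would also turn any other non-numeric entry into NaN. The names cost and fallback are only illustrative, and the UnUsedData column is left out for brevity.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Version': [1, 1, 2, 2, 1, 2],
    'Cost': [17, np.nan, 24, 21, '', 8],
})
df2 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'Color': ["Red", "Orange", "Green", "Blue", "Indigo",
              "Violet", "Black", "White", "Gold"],
    'Cost': [17, 34, 54, 28, 22, 8, 43, 23, 12],
})

# Merge only the columns that are always wanted from df2.
df_new = pd.merge(df1, df2[['ID', 'Color']], on='ID', how='left')

# Treat blanks as missing: anything that is not a number becomes NaN.
cost = pd.to_numeric(df_new['Cost'], errors='coerce')

# Per-row fallback: look up each row's ID in df2 (assumes unique IDs).
fallback = df_new['ID'].map(df2.set_index('ID')['Cost'])

# Fill only the missing entries; existing costs are left untouched.
df_new['Cost'] = cost.fillna(fallback).astype(int)
print(df_new)
```

This keeps the original column order (ID, Version, Cost, Color) and avoids creating and then dropping the Cost_x / Cost_y columns.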

Regarding python - Replace NaN in a DataFrame during a merge/left join, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63007584/
