python - Pandas: replace column values based on a match with another column

In my first dataframe I have a column, df1["ItemType"], that looks like this:

Dataframe1

ItemType1
redTomato
whitePotato
yellowPotato
greenCauliflower
yellowCauliflower
yelloSquash
redOnions
YellowOnions
WhiteOnions
yellowCabbage
GreenCabbage

I need to replace these values using a dictionary built from another dataframe.

Dataframe2

ItemType2            newType
whitePotato          Potato
yellowPotato         Potato
redTomato            Tomato
yellowCabbage
GreenCabbage
yellowCauliflower    yellowCauliflower
greenCauliflower     greenCauliflower
YellowOnions         Onions
WhiteOnions          Onions
yelloSquash          Squash
redOnions            Onions

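Please note:

In dataframe2, some ItemType values are the same as the ItemType values in dataframe1.
Some ItemType entries in dataframe2 have null values, like yellowCabbage.
The ItemType values in dataframe2 are in a different order from the ItemType values in dataframe1.

I need to replace the values in Dataframe1's ItemType column with the corresponding newType wherever a matching value exists in Dataframe2's ItemType column, keeping in mind the exceptions listed in the points above.
If there is no match, the value must stay as it is [no change].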

This is what I have so far.

import pandas as pd

#read second `csv-file`
df2 = pd.read_csv('mappings.csv',names = ["ItemType", "newType"])
#convert to dict
df2=df2.set_index('ItemType').T.to_dict('list')

The replacement attempts below do not work. They insert NaN values instead of the actual values. They are based on a discussion here on SO.

df1.loc[df1['ItemType'].isin(df2['ItemType'])]=df2[['NewType']]

or

df1['ItemType']=df2['ItemType'].map(df2)

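Thanks in advance.

Edit
The column headers in the two dataframes have different names. The dataframe1 column is ItemType1, and the first column of the second dataframe is ItemType2. I missed this in the first edit.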

Best answer

Use map

All the logic you need:

def update_type(t1, t2, dropna=False):
    return t1.map(t2).dropna() if dropna else t1.map(t2).fillna(t1)

Let's set 'ItemType2' as the index of Dataframe2

update_type(Dataframe1.ItemType1,
            Dataframe2.set_index('ItemType2').newType)

0                Tomato
1                Potato
2                Potato
3      greenCauliflower
4     yellowCauliflower
5                Squash
6                Onions
7                Onions
8                Onions
9         yellowCabbage
10         GreenCabbage
Name: ItemType1, dtype: object

update_type(Dataframe1.ItemType1,
            Dataframe2.set_index('ItemType2').newType,
            dropna=True)

0                Tomato
1                Potato
2                Potato
3      greenCauliflower
4     yellowCauliflower
5                Squash
6                Onions
7                Onions
8                Onions
Name: ItemType1, dtype: object
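
To reproduce the two results above without a CSV file, the frames can be rebuilt by hand. The following is a minimal sketch, assuming the blank newType cells for yellowCabbage and GreenCabbage are NaN; the names `lookup`, `kept` and `dropped` are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the tables in the question (blank newType assumed to be NaN).
Dataframe1 = pd.DataFrame({
    "ItemType1": [
        "redTomato", "whitePotato", "yellowPotato", "greenCauliflower",
        "yellowCauliflower", "yelloSquash", "redOnions", "YellowOnions",
        "WhiteOnions", "yellowCabbage", "GreenCabbage",
    ]
})
Dataframe2 = pd.DataFrame({
    "ItemType2": [
        "whitePotato", "yellowPotato", "redTomato", "yellowCabbage",
        "GreenCabbage", "yellowCauliflower", "greenCauliflower",
        "YellowOnions", "WhiteOnions", "yelloSquash", "redOnions",
    ],
    "newType": [
        "Potato", "Potato", "Tomato", np.nan, np.nan, "yellowCauliflower",
        "greenCauliflower", "Onions", "Onions", "Squash", "Onions",
    ],
})


def update_type(t1, t2, dropna=False):
    # Look up each value of t1 in the index of t2; misses and null targets become NaN.
    mapped = t1.map(t2)
    # Either drop those rows or fall back to the original value.
    return mapped.dropna() if dropna else mapped.fillna(t1)


# Hypothetical helper name: a Series indexed by ItemType2 whose values are newType.
lookup = Dataframe2.set_index("ItemType2")["newType"]

kept = update_type(Dataframe1["ItemType1"], lookup)
dropped = update_type(Dataframe1["ItemType1"], lookup, dropna=True)

print(kept)
print(dropped)
```

Because the lookup is done by index label, the row order of Dataframe2 does not matter. Note that map with a Series expects the lookup index to be unique, so duplicated ItemType2 values would need to be removed first.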

Verification

updated = update_type(Dataframe1.ItemType1, Dataframe2.set_index('ItemType2').newType)

pd.concat([Dataframe1, updated], axis=1, keys=['old', 'new'])
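
A side-by-side check can also flag which rows actually changed. This is a small sketch on a four-row subset of the question's data, again assuming the blank newType is NaN; the `check` frame and its `changed` column are hypothetical names added here, and a plain DataFrame constructor is used in place of the concat call.

```python
import numpy as np
import pandas as pd


def update_type(t1, t2, dropna=False):
    mapped = t1.map(t2)
    return mapped.dropna() if dropna else mapped.fillna(t1)


# Subset of the question's data, kept short for readability.
Dataframe1 = pd.DataFrame(
    {"ItemType1": ["redTomato", "whitePotato", "greenCauliflower", "yellowCabbage"]}
)
Dataframe2 = pd.DataFrame({
    "ItemType2": ["whitePotato", "redTomato", "yellowCabbage", "greenCauliflower"],
    "newType": ["Potato", "Tomato", np.nan, "greenCauliflower"],
})

updated = update_type(
    Dataframe1["ItemType1"], Dataframe2.set_index("ItemType2")["newType"]
)

# Old and new values next to each other, plus a flag for rows that were replaced.
check = pd.DataFrame({"old": Dataframe1["ItemType1"], "new": updated})
check["changed"] = check["old"] != check["new"]
print(check)
```

Rows whose newType is null or identical to the original value show up with changed set to False, which matches the "no change" requirement in the question.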

Timing

def root(Dataframe1, Dataframe2):
    return Dataframe1['ItemType1'].replace(Dataframe2.set_index('ItemType2')['newType'].dropna())

def piRSquared(Dataframe1, Dataframe2):
    t1 = Dataframe1.ItemType1
    t2 = Dataframe2.set_index('ItemType2').newType
    return update_type(t1, t2)
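
No measurements are reproduced here. One way to compare the two functions is the standard timeit module, as in the sketch below. The data size (the question's values repeated 1000 times), the run count, and the `timings` dictionary are assumptions chosen for illustration, and `root` converts the lookup to a plain dict before calling replace, which is a small variation on the version above.

```python
import timeit

import numpy as np
import pandas as pd


def update_type(t1, t2, dropna=False):
    mapped = t1.map(t2)
    return mapped.dropna() if dropna else mapped.fillna(t1)


def root(Dataframe1, Dataframe2):
    # Variation: pass replace() an explicit dict of non-null mappings.
    mapping = Dataframe2.set_index("ItemType2")["newType"].dropna().to_dict()
    return Dataframe1["ItemType1"].replace(mapping)


def piRSquared(Dataframe1, Dataframe2):
    t1 = Dataframe1["ItemType1"]
    t2 = Dataframe2.set_index("ItemType2")["newType"]
    return update_type(t1, t2)


items = ["redTomato", "whitePotato", "yelloSquash", "yellowCabbage"]
# Assumed benchmark input: the sample values repeated to 4000 rows.
Dataframe1 = pd.DataFrame({"ItemType1": items * 1000})
Dataframe2 = pd.DataFrame({
    "ItemType2": items,
    "newType": ["Tomato", "Potato", "Squash", np.nan],
})

timings = {}
for fn in (root, piRSquared):
    # Total wall-clock seconds for 10 calls of each function.
    timings[fn.__name__] = timeit.timeit(lambda: fn(Dataframe1, Dataframe2), number=10)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 10 runs")
```

Results will vary with the pandas version, the number of rows, and the number of distinct keys, so it is worth timing on data shaped like your own.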

Regarding python - Pandas: replace column values based on a match with another column, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38466682/
