python - str.match does not match exactly because it ignores the trailing characters

I have a CSV file:

State,  Region                  
AK,     Pacific Non Continuous
HI,     Pacific Non Continuous 
AL,     East South Central  
AZ,     Mountain                
CA,     Pacific                
OR,     Pacific

When I run:

df = pd.read_csv('C:...\input.csv')

df['SuperRegion'] = pd.np.where(df.Region.str.match("New England|Middle Atlantic|South Atlantic"), "East",
                pd.np.where(df.Region.str.match("East North Central|East South Central|West North Central|West South Central"), "Mid West",
                pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other")))

df.to_csv('C:...\Output.csv', index=False)

I expect the SuperRegion value of the first two rows to be Other:

State,  Region,                  SuperRegion
AK,     Pacific Non Continuous,  **Other**
HI,     Pacific Non Continuous,  **Other**
AL,     East South Central,      Mid West
AZ,     Mountain,                West
CA,     Pacific,                 West
OR,     Pacific,                 West

But what I get is:

State,  Region,                  SuperRegion
AK,     Pacific Non Continuous,  **West**
HI,     Pacific Non Continuous,  **West**
AL,     East South Central,      Mid West
AZ,     Mountain,                West
CA,     Pacific,                 West
OR,     Pacific,                 West

I assume that when this runs, it does not distinguish between Pacific and Pacific Non Continuous the way I want it to. Any suggestions?

Best answer

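str.match only anchors the pattern at the start of each string (like re.match), so "Pacific" also matches "Pacific Non Continuous". Why not change:

pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other")))

to a pattern that is also anchored at the end of the string:

pd.np.where(df.Region.str.match("(?:Mountain|Pacific)$"), "West", "Other")))

This assumes the Region values have no trailing whitespace; strip them first (for example with .str.strip()) if they do.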

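Or handle the case separately, testing for Pacific Non Continuous before the shorter Pacific pattern:

# pd.np follows the question's code; it was removed in pandas 2.0,
# where "import numpy as np" and np.where are needed instead.
df['SuperRegion'] = pd.np.where(df.Region.str.match("New England|Middle Atlantic|South Atlantic"), "East",
                pd.np.where(df.Region.str.match("East North Central|East South Central|West North Central|West South Central"), "Mid West",
                pd.np.where(df.Region.str.match("Pacific Non Continuous"), "Other",
                pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other"))))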
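The root cause is that Series.str.match only requires the pattern to match at the beginning of each value, so the prefix "Pacific" is enough. As a minimal sketch of an alternative, assuming pandas 1.1 or later (where Series.str.fullmatch is available), the whole value can be required to match. The DataFrame below is built inline from the question's sample rows rather than read from the CSV, and the variable names (prefix_hits, east, mid_west, west) are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (assumption: no padding around values).
df = pd.DataFrame({
    "State": ["AK", "HI", "AL", "AZ", "CA", "OR"],
    "Region": [
        "Pacific Non Continuous", "Pacific Non Continuous",
        "East South Central", "Mountain", "Pacific", "Pacific",
    ],
})

# str.match anchors only at the start, so "Pacific Non Continuous" is a hit.
prefix_hits = df["Region"].str.match("Mountain|Pacific")

# Strip whitespace in case the CSV fields are padded, then require that the
# entire value equals one of the alternatives.
region = df["Region"].str.strip()
east = region.str.fullmatch("New England|Middle Atlantic|South Atlantic")
mid_west = region.str.fullmatch(
    "East North Central|East South Central|West North Central|West South Central"
)
west = region.str.fullmatch("Mountain|Pacific")

# Same nested np.where structure as the question, now with exact matches.
df["SuperRegion"] = np.where(east, "East",
                    np.where(mid_west, "Mid West",
                    np.where(west, "West", "Other")))
print(df)
```

With exact matching, the order of the checks no longer matters, so no special case for Pacific Non Continuous is needed.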

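The ideal solution is to build a dictionary with the regions as keys and the super regions as values (called region_to_super here), and then use map, filling the regions that are not in the dictionary with Other:

df['SuperRegion'] = df['Region'].map(region_to_super).fillna("Other")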
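A minimal sketch of the dictionary approach, assuming the region names listed in the question's patterns are the complete set of regions to classify. The name REGION_TO_SUPER is hypothetical, and the sample DataFrame is built inline from the question's rows instead of being read from the CSV:

```python
import pandas as pd

# Hypothetical lookup table: region name -> super region.
# Keys are taken from the regex alternatives in the question.
REGION_TO_SUPER = {
    "New England": "East",
    "Middle Atlantic": "East",
    "South Atlantic": "East",
    "East North Central": "Mid West",
    "East South Central": "Mid West",
    "West North Central": "Mid West",
    "West South Central": "Mid West",
    "Mountain": "West",
    "Pacific": "West",
}

# Sample rows copied from the question.
df = pd.DataFrame({
    "State": ["AK", "HI", "AL", "AZ", "CA", "OR"],
    "Region": [
        "Pacific Non Continuous", "Pacific Non Continuous",
        "East South Central", "Mountain", "Pacific", "Pacific",
    ],
})

# map() looks up each value exactly; regions missing from the dictionary
# become NaN, which fillna() turns into the "Other" bucket.
df["SuperRegion"] = (
    df["Region"].str.strip().map(REGION_TO_SUPER).fillna("Other")
)
print(df)
```

Because the lookup is an exact dictionary match rather than a regular expression, Pacific Non Continuous can never be mistaken for Pacific.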

Regarding "python - str.match does not match exactly because it ignores the trailing characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46717122/
