python - If statement for remapping values

Tags: python pandas

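I'm trying to write an if statement that fills the "Tool Type" column with "Legal" when the "Pay Category" column is "Expense".

However, it marks every row whose description contains Legal as Legal, regardless of the Pay Category.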

import pandas as pd

test = {
    "Pay Category": ["Indemnity", "Indemnity", "Indemnity", "Indemnity", "Expense", "Expense", "Expense", "Medical"],
    "Description of Payment": ["Legal", "Legal", "Legal", "Legal", "Legal", "Legal", "Frog", "Legal"],
}
test = pd.DataFrame(test)

test["Tool Type"] = ""
if (test["Pay Category"] == "Medical") is not False:
    test["Tool Type"][test["Description of Payment"].str.contains("Pharmacy|Prescription|RX", case=False)] = "Pharmacy"

if (test["Pay Category"] == 'Expense') is not False:
    test["Tool Type"][test["Description of Payment"].str.contains("Legal|Attorney|Court|Defense", case=False)] = "Legal"

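My understanding is that (test["Pay Category"]=='Expense') is not False evaluates to a bool, True or False, so the body of the if statement should only run when the "is not False" condition is met. What am I missing?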

Brandon

Best answer

I think you need to build the conditions as boolean masks and chain them with & (and):

test["Tool Type"]=""
m1 = test["Description of Payment"].str.contains("Pharmacy|Prescription|RX",case=False)
m2 = test["Pay Category"]=="Medical"

m3 = test["Description of Payment"].str.contains("Legal|Attorney|Court|Defense",case=False)
m4 = test["Pay Category"]=="Expense"

test.loc[m1 & m2, "Tool Type"]="Pharmacy"
test.loc[m3 & m4, "Tool Type"]="Legal"
print(test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          
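
As for why the original if blocks always ran: the comparison produces a boolean Series rather than a single bool, and `is not False` only checks object identity. The following is a minimal sketch of that behaviour; the names pay, cond, always_true and raised are made up for illustration and do not appear in the question.

```python
import pandas as pd

pay = pd.Series(["Indemnity", "Expense", "Medical"])

# Comparing a Series with a scalar gives a boolean Series, one value per row.
cond = pay == "Expense"
print(cond.tolist())   # [False, True, False]

# `is` compares object identity. A Series object is never the singleton False,
# so this expression is True whatever the data is, and the `if` body always runs.
always_true = cond is not False
print(always_true)     # True

# Asking for the truth value of a multi-row Series is ambiguous and raises.
try:
    bool(cond)
    raised = False
except ValueError:
    raised = True
print(raised)          # True
```

So the if statement never filtered any rows; only the str.contains mask inside its body did, which is why every "Legal" description was tagged.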

Another solution uses a nested numpy.where:

import numpy as np

test['Tool Type'] = np.where(m1 & m2, 'Pharmacy',
                    np.where(m3 & m4, 'Legal', ''))
print(test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          
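
One thing to watch when combining masks with &: if the comparison is written inline instead of being stored in a variable first, it needs its own parentheses, because & binds more tightly than == in Python. A small sketch on a shortened, illustrative frame (df and mask are hypothetical names, not from the question):

```python
import pandas as pd

df = pd.DataFrame({
    "Pay Category": ["Indemnity", "Expense", "Expense"],
    "Description of Payment": ["Legal", "Legal", "Frog"],
})

# The parentheses around the comparison are required: without them Python
# would try to evaluate `"Expense" & <Series>` before the == comparison.
mask = (df["Pay Category"] == "Expense") & df["Description of Payment"].str.contains(
    "Legal|Attorney|Court|Defense", case=False
)

df["Tool Type"] = ""
df.loc[mask, "Tool Type"] = "Legal"
print(df["Tool Type"].tolist())   # ['', 'Legal', '']
```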

EDIT: A very nice solution from unutbu's comment is to use numpy.select:

test['Tool Type'] = np.select([(m1 & m2), (m3 & m4)], ['Pharmacy', 'Legal'], default='')
print(test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          
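
For reference, the numpy.select approach can be put together as one self-contained script. This is only a sketch under the assumptions of the question's sample data; the helper name add_tool_type is hypothetical and is not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def add_tool_type(df):
    """Return a copy of df with a 'Tool Type' column (hypothetical helper)."""
    desc = df["Description of Payment"]
    cat = df["Pay Category"]
    # Each condition pairs a category check with a keyword check.
    conditions = [
        (cat == "Medical") & desc.str.contains("Pharmacy|Prescription|RX", case=False),
        (cat == "Expense") & desc.str.contains("Legal|Attorney|Court|Defense", case=False),
    ]
    choices = ["Pharmacy", "Legal"]
    out = df.copy()
    # Rows matching no condition get the default, an empty string.
    out["Tool Type"] = np.select(conditions, choices, default="")
    return out


test = pd.DataFrame({
    "Pay Category": ["Indemnity", "Indemnity", "Indemnity", "Indemnity",
                     "Expense", "Expense", "Expense", "Medical"],
    "Description of Payment": ["Legal", "Legal", "Legal", "Legal",
                               "Legal", "Legal", "Frog", "Legal"],
})
result = add_tool_type(test)
print(result)
```

numpy.select picks the choice for the first condition that is True in each row, so the order of the conditions list matters if the rules can ever overlap.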

Regarding "python - If statement for remapping values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45576329/
