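python - Adding a new column to a pandas DataFrame

Tags: python pandas numpy

I'm new to Python, so please excuse any typos.

I am trying to add a new column to a DataFrame based on a condition on a different column. Instead of the computed value, each row ends up with the whole expression I passed in.

I don't know why this happens or how to fix it.

Screenshot attached.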

vdx_access_table["Delivered_Engagements"] = vdx_access_table["Delivered_Engagements"].astype(int)
vdx_access_table["Delivered_Impressions"] = vdx_access_table["Delivered_Impressions"].astype(int)

choices_vdx_eng = vdx_access_table["Delivered_Engagements"] / vdx_access_table["BOOKED_IMP#BOOKED_ENG"]
choices_vdx_cpcv = vdx_access_table["Delivered_Impressions"] / vdx_access_table["BOOKED_IMP#BOOKED_ENG"]

vdx_access_table['Delivery%'] = [choices_vdx_eng if x == 'CPE' or x == 'CPE+' else choices_vdx_cpcv
                                 for x in vdx_access_table['COST_TYPE']]


Best answer

Use numpy.where with a boolean mask built by isin:

import numpy as np

choices_vdx_eng = vdx_access_table["Delivered_Engagements"] / vdx_access_table['BOOKED_IMP#BOOKED_ENG']
choices_vdx_imp = vdx_access_table["Delivered_Impressions"] / vdx_access_table['BOOKED_IMP#BOOKED_ENG']

# True for rows whose cost type is CPE or CPE+
mask = vdx_access_table['COST_TYPE'].isin(['CPE', 'CPE+'])
vdx_access_table['Delivery%'] = np.where(mask, choices_vdx_eng, choices_vdx_imp)

Or:

mask = vdx_access_table['COST_TYPE'].isin(['CPE','CPE+'])
vdx_access_table['Delivery%']= np.where(mask, 
                                        vdx_access_table["Delivered_Engagements"], 
                                        vdx_access_table["Delivered_Impressions"]) /vdx_access_table['BOOKED_IMP#BOOKED_ENG'] 
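
The question does not say why the list comprehension fails, so here is a hedged sketch of the likely cause. The data is a small hypothetical frame that only borrows the column names from the question. The comprehension chooses between two whole Series once per row, so every element of the resulting list is an entire Series rather than a single number. np.where chooses element by element instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Delivered_Engagements": [10, 20, 30],
    "Delivered_Impressions": [5, 4, 8],
    "BOOKED_IMP#BOOKED_ENG": [5, 2, 4],
    "COST_TYPE": ["CPE", "CPE+", "CPM"],
})

eng = df["Delivered_Engagements"] / df["BOOKED_IMP#BOOKED_ENG"]
imp = df["Delivered_Impressions"] / df["BOOKED_IMP#BOOKED_ENG"]

# The comprehension from the question picks a whole Series for each row,
# so each list element is a Series, not one number.
per_row = [eng if x in ("CPE", "CPE+") else imp for x in df["COST_TYPE"]]
first_is_series = isinstance(per_row[0], pd.Series)
print(first_is_series)

# np.where picks per element: eng where the mask is True, imp elsewhere.
mask = df["COST_TYPE"].isin(["CPE", "CPE+"])
df["Delivery%"] = np.where(mask, eng, imp)
print(df["Delivery%"].tolist())
```

The vectorized version also avoids a Python-level loop over the rows.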

Edit:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Delivered_Engagements': [10, 20, 30, 40, 50],
                   'Delivered_Impressions': [5, 4, 8, 7, 3],
                   'BOOKED_IMP#BOOKED_ENG': [3, 2, 0, 4, 2],
                   'COST_TYPE': ['CPE', 'CPE+', 'CPM', 'CPCV', 'AAA']})

df["Delivered_Engagements"] = df["Delivered_Engagements"].astype(int)
df["Delivered_Impressions"] = df["Delivered_Impressions"].astype(int)

eng = df["Delivered_Engagements"] / df["BOOKED_IMP#BOOKED_ENG"]
cpcv = df["Delivered_Impressions"] / df["BOOKED_IMP#BOOKED_ENG"]

mask1 = df["COST_TYPE"].isin(['CPE', 'CPE+'])
mask2 = df["COST_TYPE"].isin(['CPM', 'CPCV'])

# The first matching mask wins; rows matching neither mask get the default of 0.
df['Delivery%'] = np.select([mask1, mask2], [eng, cpcv], default=0)

# Dividing a positive value by zero yields inf; replace it with 0.
df['Delivery%'] = df['Delivery%'].replace(np.inf, 0)

print(df)
   BOOKED_IMP#BOOKED_ENG COST_TYPE  Delivered_Engagements  \
0                      3       CPE                     10   
1                      2      CPE+                     20   
2                      0       CPM                     30   
3                      4      CPCV                     40   
4                      2       AAA                     50   

   Delivered_Impressions  Delivery%  
0                      5   3.333333  
1                      4  10.000000  
2                      8   0.000000  
3                      7   1.750000  
4                      3   0.000000  
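
One caveat on the cleanup step: replace(np.inf, 0) only catches positive infinity. A 0/0 division produces NaN, and a negative numerator over zero produces -inf. The following is a minimal sketch, assuming you want 0 for every invalid division; the data is hypothetical and not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 divides by zero, row 1 is 0/0, row 2 is a normal division.
df = pd.DataFrame({
    "Delivered_Impressions": [8, 0, 7],
    "BOOKED_IMP#BOOKED_ENG": [0, 0, 4],
})

ratio = df["Delivered_Impressions"] / df["BOOKED_IMP#BOOKED_ENG"]

# Map both infinities to NaN first, then fill every NaN with 0.
cleaned = ratio.replace([np.inf, -np.inf], np.nan).fillna(0)
print(cleaned.tolist())
```

Whether 0 is the right fill value depends on how the column is used downstream; leaving NaN in place can be the safer choice if missing bookings should stay visible.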

For "python - Adding a new column to a pandas DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49063589/
