Python: split a string by a prefix

Tags: python pandas

I have a DataFrame column full of text and prices:

0   £75 BT Reward Card
1   £125 BT Reward Card 
2   £50 Retail Voucher
3   £100 BT Reward Card 
4   £150 BT Reward Card 
5   £50 Cashback
6   Fibre Connection Fee (£50 Credit
7   £75 BT Reward Card  
8   £125 BT Reward Card 
9   £50 Cashback
10  £0 Fibre Connection Fee (£50 Credit

I only want to return the number that directly follows the £ sign.

This is what I have so far, but it falls apart for indices 6 and 10:

df['col'] = df['col'].apply(lambda x: x.split(' ')[0])

I have also tried this:

df['col'] = df['col'].apply(lambda x: x.split('£')[1])
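
For illustration, here is a minimal sketch of what the two attempts return on three of the rows, using plain Python strings instead of the DataFrame. The names `rows`, `by_space` and `by_pound` are only placeholders for this example.

```python
# Three of the rows from the column above, keyed by their index.
rows = {
    0: "£75 BT Reward Card",
    6: "Fibre Connection Fee (£50 Credit",
    10: "£0 Fibre Connection Fee (£50 Credit",
}

# Attempt 1: first whitespace-separated token.
by_space = {i: s.split(" ")[0] for i, s in rows.items()}

# Attempt 2: everything between the first and second £ (or end of string).
by_pound = {i: s.split("£")[1] for i, s in rows.items()}

print(by_space)  # {0: '£75', 6: 'Fibre', 10: '£0'}
print(by_pound)  # {0: '75 BT Reward Card', 6: '50 Credit', 10: '0 Fibre Connection Fee ('}
```

Neither split isolates the digits: the first keeps the £ sign and picks up a word when the price is not at the start, and the second keeps all the text that follows the number.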

Best answer

If you only need the first value, use str.extract and convert to integers if necessary:

# raw string avoids the invalid-escape warning for \d; expand=False returns a Series
df['new'] = df['col'].str.extract(r'£(\d+)', expand=False).astype(int)
print(df)
                                      col  new
0                      £75 BT Reward Card   75
1                    £125 BT Reward Card   125
2                      £50 Retail Voucher   50
3                    £100 BT Reward Card   100
4                    £150 BT Reward Card   150
5                            £50 Cashback   50
6        Fibre Connection Fee (£50 Credit   50
7                    £75 BT Reward Card     75
8                    £125 BT Reward Card   125
9                            £50 Cashback   50
10    £0 Fibre Connection Fee (£50 Credit    0
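
As a self-contained sketch of the same idea, the snippet below rebuilds the column and also copes with a row that has no £ amount at all. That extra "No offer" row is hypothetical and not part of the question's data. Plain `astype(int)` would raise on the resulting missing value, so this version assumes a nullable `Int64` column is acceptable.

```python
import pandas as pd

df = pd.DataFrame({
    "col": [
        "£75 BT Reward Card",
        "£125 BT Reward Card",
        "£50 Retail Voucher",
        "£100 BT Reward Card",
        "£150 BT Reward Card",
        "£50 Cashback",
        "Fibre Connection Fee (£50 Credit",
        "£75 BT Reward Card",
        "£125 BT Reward Card",
        "£50 Cashback",
        "£0 Fibre Connection Fee (£50 Credit",
        "No offer",  # hypothetical row without any £ amount
    ]
})

# First run of digits directly after a £ sign; NaN where nothing matches.
first = df["col"].str.extract(r"£(\d+)", expand=False)

# Convert to numbers, keeping the missing value as <NA> in a nullable integer column.
df["new"] = pd.to_numeric(first, errors="coerce").astype("Int64")
print(df)
```

Note that row 10 yields 0, because only the first match is kept.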

If you need all values in a list, use str.findall:

# values are strings
df['new'] = df['col'].str.findall(r'£(\d+)')
# values are integers
# df['new'] = df['col'].str.findall(r'£(\d+)').apply(lambda x: [int(y) for y in x])
print(df)
                                      col      new
0                      £75 BT Reward Card     [75]
1                    £125 BT Reward Card     [125]
2                      £50 Retail Voucher     [50]
3                    £100 BT Reward Card     [100]
4                    £150 BT Reward Card     [150]
5                            £50 Cashback     [50]
6        Fibre Connection Fee (£50 Credit     [50]
7                    £75 BT Reward Card       [75]
8                    £125 BT Reward Card     [125]
9                            £50 Cashback     [50]
10    £0 Fibre Connection Fee (£50 Credit  [0, 50]
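
To see what the pattern does on a single string, here is a small sketch with the standard `re` module, outside pandas. The helper name `amounts` is made up for this example.

```python
import re

# £ followed by one or more digits; the parentheses capture only the digits.
PATTERN = re.compile(r"£(\d+)")

def amounts(text):
    """Return every integer that directly follows a £ sign in text."""
    return [int(m) for m in PATTERN.findall(text)]

print(amounts("£75 BT Reward Card"))                   # [75]
print(amounts("Fibre Connection Fee (£50 Credit"))     # [50]
print(amounts("£0 Fibre Connection Fee (£50 Credit"))  # [0, 50]
```

A string without any £ amount gives an empty list, which is also what str.findall returns for such a row.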

If you need them in new columns, use extractall with unstack, add_prefix and join:

df = df.join(df['col'].str.extractall(r'£(\d+)')[0].unstack().astype(float).add_prefix('new'))
print(df)
                                      col   new0  new1
0                      £75 BT Reward Card   75.0   NaN
1                    £125 BT Reward Card   125.0   NaN
2                      £50 Retail Voucher   50.0   NaN
3                    £100 BT Reward Card   100.0   NaN
4                    £150 BT Reward Card   150.0   NaN
5                            £50 Cashback   50.0   NaN
6        Fibre Connection Fee (£50 Credit   50.0   NaN
7                    £75 BT Reward Card     75.0   NaN
8                    £125 BT Reward Card   125.0   NaN
9                            £50 Cashback   50.0   NaN
10    £0 Fibre Connection Fee (£50 Credit    0.0  50.0
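
The one-liner is easier to follow when split into steps. This is only a sketch on three of the rows, and the intermediate names `matches`, `wide` and `out` are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "col": [
        "£75 BT Reward Card",
        "Fibre Connection Fee (£50 Credit",
        "£0 Fibre Connection Fee (£50 Credit",
    ]
})

# One row per match, indexed by (original row, match number); [0] selects the capture group.
matches = df["col"].str.extractall(r"£(\d+)")[0]

# Pivot the match number into columns, convert to float (NaN marks a missing
# second match), and rename the columns 0, 1 to new0, new1.
wide = matches.unstack().astype(float).add_prefix("new")

# Align back to the original rows by index.
out = df.join(wide)
print(out)
```

The float dtype is needed here because rows with a single amount have NaN in the second column.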

Source: the original question on Stack Overflow about splitting a string by a prefix in Python: https://stackoverflow.com/questions/54868242/
