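I applied a transformation to one column of an Excel file. Now I want to export that processed column together with all the other, unprocessed columns.
My data (small example):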
A        B          C
French   house      Phone <phone_numbers>
English  house      email blablabla@gmail.com
French   apartment  my name is Liam
French   house      Hello George
English  apartment  Ethan, my phone is <phone_numbers>
My script:
import re
import pandas as pd
from pandas import Series
df = pd.read_excel('data.xlsx')
data = Series.to_string(df['C'])
def emails(data):
    mails = re.compile(r'[\w\.-]+@[\w\.-]+')
    replace_mails = mails.sub('<adresse_mail>', data)
    return replace_mails
no_mails = emails(data)
no_mails.to_excel('new_data.xlsx')
My output:
AttributeError Traceback (most recent call last)
<ipython-input-7-8fd973998937> in <module>()
7
8 no_mails = emails(data)
----> 9 no_mails.to_excel('new_data.xlsx')
AttributeError: 'str' object has no attribute 'to_excel'
Desired output:
A        B          C
French   house      Phone <phone_numbers>
English  house      email <adresse_mail>
French   apartment  my name is Liam
French   house      Hello George
English  apartment  Ethan, my phone is <phone_numbers>
My script works fine, except that the line
no_mails.to_excel('new_data.xlsx')
doesn't seem to work.
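Best answer
The error occurs because Series.to_string returns a plain Python string, so emails() returns a str, which has no to_excel method. Instead, you can call str.replace directly on the pandas Series and then export the whole DataFrame: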
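# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df['C'] = df['C'].str.replace(r'[\w\.-]+@[\w\.-]+', '<adresse_mail>', regex=True)
# index=False omits the row-index column so the file matches the desired output
df.to_excel('new_data.xlsx', index=False)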
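To make the difference concrete, here is a minimal sketch that can be run without an Excel file. The in-memory DataFrame is a hypothetical stand-in for data.xlsx (three of the rows from the question), and the names as_text, has_to_excel and mail_pattern are illustrative only. It shows that to_string yields a plain str, while str.replace keeps column C inside the DataFrame.

```python
import pandas as pd

# Hypothetical in-memory stand-in for data.xlsx (assumption: same three columns)
df = pd.DataFrame({
    "A": ["French", "English", "French"],
    "B": ["house", "house", "apartment"],
    "C": ["Phone <phone_numbers>", "email blablabla@gmail.com", "my name is Liam"],
})

# to_string() renders the whole column as one plain Python str,
# so the result of any re.sub on it has no to_excel method.
as_text = df["C"].to_string()
has_to_excel = hasattr(as_text, "to_excel")

# Replacing through the .str accessor returns a Series, so the
# DataFrame (and its to_excel method) stays intact.
mail_pattern = r"[\w.-]+@[\w.-]+"
df["C"] = df["C"].str.replace(mail_pattern, "<adresse_mail>", regex=True)

print(type(as_text).__name__, has_to_excel)
print(df["C"].tolist())
```

After this, df still holds columns A, B and C, so a single df.to_excel call writes the processed column alongside the untouched ones.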
This answer about exporting data with pandas in Python comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/53062244/