I have a file with the following contents:
first_name,last_name,uid,email,dep_code,dep_name
john,smith,jsmith,jsmith@gmail.com,finance,21230
john,king,jking,jjing@gmail.com,human resource,31230
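I want to copy the column "email" into a new column "email2", and then replace gmail.com with hotmail.com in the email2 column.
I am new to Python, so I need help from the experts. I tried a few scripts, but if there is a better way, please let me know. The original file contains 60000 rows.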
with open('c:\\Python27\\scripts\\colnewfile.csv', 'rb') as fp_in1, open('c:\\Python27\\scripts\\final.csv', 'wb') as fp_out1:
    writer1 = csv.writer(fp_out1, delimiter=",")
    reader1 = csv.reader(fp_in1, delimiter=",")
    domain = "@hotmail.com"
    for row in reader1:
        if row[2:3] == "uid":
            writer1.append("Email2")
        else:
            writer1.writerow(row+[row[2:3]])
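This is the final script. The only problem is that it does not complete the whole output file: it only shows 61409 rows, while the input file has 61438 rows.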
inFile = 'c:\Python27\scripts\in-093013.csv'
outFile = 'c:\Python27\scripts\final.csv'

with open(inFile, 'rb') as fp_in1, open(outFile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fp_in1, delimiter=",")
    for col in reader:
        del col[6:]
        writer.writerow(col)
    headers = next(reader)
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            email = email.split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
Best answer
If you call next() on the reader, you get one row at a time; use that to copy the headers. Copying the email column is simple enough:
import csv

infilename = r'c:\Python27\scripts\colnewfile.csv'
outfilename = r'c:\Python27\scripts\final.csv'

with open(infilename, 'rb') as fp_in, open(outfilename, 'wb') as fp_out:
    reader = csv.reader(fp_in, delimiter=",")
    headers = next(reader)  # read first row

    writer = csv.writer(fp_out, delimiter=",")
    writer.writerow(headers + ['email2'])

    for row in reader:
        if len(row) > 3:
            # make sure there are at least 4 columns
            email = row[3].split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
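The script above targets Python 2, where CSV files are opened in binary mode ('rb' / 'wb'). On Python 3 the csv module works with text-mode files opened with newline='' instead. The following is only a sketch of the same approach under that assumption; the function name copy_email_column and its parameters are hypothetical and not part of the original answer. It accepts open file objects so it can be tried in memory first.

```python
import csv
import io


def copy_email_column(fp_in, fp_out, email_index=3, new_domain="hotmail.com"):
    """Append an 'email2' column holding the email with a replaced domain.

    Hypothetical helper: the name and parameters are illustrative
    assumptions, not taken from the original answer.
    """
    reader = csv.reader(fp_in, delimiter=",")
    writer = csv.writer(fp_out, delimiter=",")

    headers = next(reader)  # the first row holds the column names
    writer.writerow(headers + ["email2"])

    for row in reader:
        if len(row) > email_index:
            # keep the part before the first '@' and append the new domain
            local_part = row[email_index].split("@", 1)[0]
            writer.writerow(row + [local_part + "@" + new_domain])


# In-memory demonstration with one sample row from the question.
sample = io.StringIO(
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
)
result = io.StringIO()
copy_email_column(sample, result)
print(result.getvalue())
```

With real files on Python 3, the two file objects would come from open(infilename, newline='') and open(outfilename, 'w', newline='').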
This code splits the email address at the first @ sign, takes the first part of the split, and appends @hotmail.com to it:
>>> 'example@gmail.com'.split('@', 1)[0]
'example'
>>> 'example@gmail.com'.split('@', 1)[0] + '@hotmail.com'
'example@hotmail.com'
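As a small sketch of the same idea, the split can be wrapped in a helper function. The name replace_domain is hypothetical. One detail to be aware of: str.split returns the whole string when no '@' is present, so the plain expression would still append the new domain to such a value. The sketch returns those values unchanged, which is an assumption about the desired behaviour rather than something the answer specifies.

```python
def replace_domain(address, new_domain="hotmail.com"):
    """Replace everything after the first '@' with new_domain.

    Hypothetical helper; values without an '@' are returned unchanged
    (an assumption, the original answer does not special-case them).
    """
    if "@" not in address:
        return address
    return address.split("@", 1)[0] + "@" + new_domain


print(replace_domain("example@gmail.com"))
print(replace_domain("not-an-email"))
```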
The script above produces:
first_name,last_name,uid,email,dep_code,dep_name,email2
john,smith,jsmith,jsmith@gmail.com,finance,21230,jsmith@hotmail.com
john,king,jking,jjing@gmail.com,human resource,31230,jjing@hotmail.com
for your sample input.
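Regarding the row-count difference mentioned in the question: one possible cause, offered here as an assumption rather than a diagnosis, is the len(row) > 3 guard, which silently drops any row with fewer than four columns (blank lines included). A minimal Python 3 sketch for counting such rows follows; the function name count_rows and the sample data are hypothetical.

```python
import csv
import io


def count_rows(fp_in, min_columns=4):
    """Return (kept, skipped) counts for the data rows after the header.

    Hypothetical diagnostic helper, not part of the original answer.
    """
    reader = csv.reader(fp_in, delimiter=",")
    next(reader)  # skip the header row
    kept = skipped = 0
    for row in reader:
        if len(row) >= min_columns:
            kept += 1
        else:
            skipped += 1  # the guard in the script would drop this row
    return kept, skipped


# Made-up sample: one complete row, one short row, one blank line.
sample = io.StringIO(
    "first_name,last_name,uid,email\n"
    "john,smith,jsmith,jsmith@gmail.com\n"
    "broken,row\n"
    "\n"
)
print(count_rows(sample))
```

If the skipped count matches the number of missing rows, the guard is the likely explanation; if it is zero, the difference has another source.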
The original question about copying a CSV column in Python can be found on Stack Overflow: https://stackoverflow.com/questions/19324968/