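[Using Python 3] I have a CSV file with two columns (an email address and a country code; if the original file is not laid out that way, the script effectively turns it into two columns, sort of), and I want to split it by the value in the second column and write the results to separate CSV files.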
eppetj@desrfpkwpwmhdc.com us ==> output-us.csv
uheuyvhy@zyetccm.com de ==> output-de.csv
avpxhbdt@reywimmujbwm.com es ==> output-es.csv
gqcottyqmy@romeajpui.com it ==> output-it.csv
qscar@tpcptkfuaiod.com fr ==> output-fr.csv
qshxvlngi@oxnzjbdpvlwaem.com gb ==> output-gb.csv
vztybzbxqq@gahvg.com us ==> output-us.csv
... ... ...
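My code currently does this, but instead of writing every email address to the CSV, it overwrites the email that was written before. Can someone help me fix this?
I am very new to programming and Python, and I may not have written the code in the most Pythonic way, so I would really appreciate any feedback on the code!
Thanks in advance!
Code: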
import csv

def tsv_to_dict(filename):
    """Creates a reader of a specified .tsv file."""
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter='\t') # '\t' implies tab
        email_list = []
        # Checks each list in the reader list and removes empty elements
        for lst in reader:
            email_list.append([elem for elem in lst if elem != '']) # List comprehension
        # Stores the list of lists as a dict
        email_dict = dict(email_list)
    return email_dict

def count_keys(dictionary):
    """Counts the number of entries in a dictionary."""
    return len(dictionary.keys())

def clean_dict(dictionary):
    """Removes all whitespace in keys from specified dictionary."""
    return { k.strip():v for k,v in dictionary.items() } # Dictionary comprehension

def split_emails(dictionary):
    """Splits out all email addresses from dictionary into output csv files by country code."""
    # Creating a list of unique country codes
    cc_list = []
    for v in dictionary.values():
        if not v in cc_list:
            cc_list.append(v)
    # Writing the email addresses to a csv based on the cc (value) in dictionary
    for key, value in dictionary.items():
        for c in cc_list:
            if c == value:
                with open('output-' +str(c) +'.csv', 'w') as f_out:
                    writer = csv.writer(f_out, lineterminator='\r\n')
                    writer.writerow([key])
Best answer
You can use a defaultdict to simplify this:
import csv
from collections import defaultdict

emails = defaultdict(list)

with open('email.tsv','r') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        if row:
            if '@' in row[0]:
                emails[row[1].strip()].append(row[0].strip()+'\n')

for key,values in emails.items():
    with open('output-{}.csv'.format(key), 'w') as f:
        f.writelines(values)
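Since each output file is a single column rather than comma-delimited data, you don't need the csv module to write it; you can simply write the lines.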
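The emails dictionary contains one key per country code, each mapped to a list of all matching email addresses. To make sure the addresses are written out correctly, we strip the surrounding whitespace and append a newline (so that we can use writelines later).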
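To make the shape of that dictionary concrete, here is a minimal sketch of the grouping step on its own. The rows list is hypothetical sample data, shaped like what csv.reader would yield for the tab-separated input; it is not taken from the asker's real file.

```python
from collections import defaultdict

# Hypothetical sample rows, shaped like the lists csv.reader yields.
rows = [
    ["eppetj@desrfpkwpwmhdc.com ", "us"],
    ["uheuyvhy@zyetccm.com", "de"],
    [],  # a blank line in the file comes back as an empty list
    ["vztybzbxqq@gahvg.com", " us"],
]

emails = defaultdict(list)
for row in rows:
    # Skip blank rows and rows whose first field is not an email address.
    if row and "@" in row[0]:
        # Strip surrounding whitespace, then add the newline writelines needs.
        emails[row[1].strip()].append(row[0].strip() + "\n")

print(dict(emails))
```

Each country code appears once as a key, and both "us" addresses end up in the same list even though their fields carried stray spaces.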
Once the dictionary is populated, simply step through its keys to create the files and write out the resulting lists.
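The two steps can also be wrapped in a function, sketched below under a few assumptions. The name split_by_country and the out_dir parameter are hypothetical, and the filtering of empty fields is an assumption based on the question's code, which removes empty elements from each row. Because every output file is opened exactly once after grouping, mode 'w' no longer discards earlier addresses; in the question, open(..., 'w') ran once per email and truncated the file each time.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_country(tsv_path, out_dir="."):
    """Group emails by country code, then write one output-<cc>.csv per code.

    The function name and the out_dir parameter are hypothetical.
    """
    emails = defaultdict(list)
    with open(tsv_path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            # Assumption: stray tabs may leave empty fields, so drop them first.
            fields = [field.strip() for field in row if field.strip()]
            if len(fields) >= 2 and "@" in fields[0]:
                emails[fields[1]].append(fields[0] + "\n")

    written = []
    for code, addresses in emails.items():
        path = Path(out_dir) / f"output-{code}.csv"
        # Each file is opened exactly once, so "w" never discards earlier rows.
        with open(path, "w") as f:
            f.writelines(addresses)
        written.append(path)
    return written
```

Calling split_by_country('email.tsv') would write the output files to the current directory and return their paths.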
Regarding python - writing keys to separate CSV files based on the values in a dictionary, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16956523/