python - Merging two dictionaries together and adding None where needed

Tags: python python-2.7 csv dictionary

I have scraped some data. Because of the way the website is structured, the data ended up in two dictionaries.

>>>pprint(dict(data))
{u'Additional compensation': [u'$32,241'],
 u'Agency': [u'Chesterfield County Schools', u'City of Richmond Schools'],
 u'Bonuses or other allowances': [u'$12,500'],
 u'COMMENTS': [u'$28,088 - Board Paid Annuity; $4,153 - Excess Health Benefit Contribution;',
               u''],
 u'Full Name': [u'Marcus J. Newsome', u'Dana T. Bedden'],
 u'Total Compensation': [u'$282,258', u'']}

>>>pprint(dict(data2))
{u'Base Salary': [u'$229,758', u'$234,068'],
 u'COMMENTS': [u'12,500 CAR ALLOWANCE, 40,000 DEFFERRED COMPENSATION'],
 u'Deferred compensation': [u'$40,000'],
 u'Job Title': [u'SUPERINTENDENT', u'SUPERINTENDENT'],
 u'Total Compensation': [u'$266,309'],
 u'Work location': [u'Office Of Superintendent']}

I merged the data into one master dictionary and tried to write it to a csv file.

for d in data2, data:
    for k, v in d.iteritems():
        master_data[k].append(v)

with open('test2.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(zip(*([k] + master_data[k] for k in sorted(master_data))))

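The problem is that only the first person's information (Marcus J. Newsome) is exported to the csv. I think this is because some keys/values that belong to Dana T. Bedden (e.g. Additional compensation) do not exist in Marcus J. Newsome's data.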
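
One likely contributing factor is that zip stops at the shortest input, so any column with fewer values cuts off every later row. The following is a minimal sketch of that behavior, written for Python 3 (on Python 2.7 the padded variant is itertools.izip_longest). The two columns are hypothetical, shortened stand-ins for the real data.

```python
from itertools import zip_longest  # itertools.izip_longest on Python 2.7

# Hypothetical columns of unequal length: a header followed by its values.
columns = [
    ["Full Name", "Marcus J. Newsome", "Dana T. Bedden"],
    ["Additional compensation", "$32,241"],
]

# zip stops as soon as the shortest column is exhausted.
truncated = list(zip(*columns))

# zip_longest keeps going and pads the short columns with fillvalue.
padded = list(zip_longest(*columns, fillvalue=None))

print(truncated)
print(padded)
```

Note that padding only appends None at the end of a short column. It cannot tell which person a value belongs to, so the rows may still be misaligned.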

To work around this, I tried adding None in the missing positions.

for d in data2, data:
    master_data.update((k, [None, master_data[k]]) for k in master_data if k not in d)

>>>pprint(dict(master_data))
{u'Additional compensation': [None, [[u'$32,241']]],
 u'Agency': [None,
             [[u'Chesterfield County Schools', u'City of Richmond Schools']]],
 u'Base Salary': [None, [[u'$229,758', u'$234,068']]],
 u'Bonuses or other allowances': [None, [[u'$12,500']]],
 u'COMMENTS': [[u'12,500 CAR ALLOWANCE, 40,000 DEFFERRED COMPENSATION'],
               [u'$28,088 - Board Paid Annuity; $4,153 - Excess Health Benefit Contribution;',
                u'']],
 u'Deferred compensation': [None, [[u'$40,000']]],
 u'Full Name': [None, [[u'Marcus J. Newsome', u'Dana T. Bedden']]],
 u'Job Title': [None, [[u'SUPERINTENDENT', u'SUPERINTENDENT']]],
 u'Total Compensation': [[u'$266,309'], [u'$282,258', u'']],
 u'Work location': [None, [[u'Office Of Superintendent']]]}

Unfortunately, this does not seem to work the way I want. Ultimately I would like my output to look like this:

Desired output

{u'Additional compensation': [None, [u'$32,241']],
 u'Agency': [[u'Chesterfield County Schools'], [u'City of Richmond Schools']],
 u'Base Salary': [[u'$229,758'], [u'$234,068']],
 u'Bonuses or other allowances': [[u'$12,500'], None],
 u'COMMENTS': [[u'12,500 CAR ALLOWANCE, 40,000 DEFFERRED COMPENSATION'],
               [u'$28,088 - Board Paid Annuity; $4,153 - Excess Health Benefit Contribution;',
                u'']],
 u'Deferred compensation': [[u'$40,000'], None],
 u'Full Name': [[u'Marcus J. Newsome'], [u'Dana T. Bedden']],
 u'Job Title': [[u'SUPERINTENDENT'], [u'SUPERINTENDENT']],
 u'Total Compensation': [[u'$266,309'], [u'$282,258', u'']],
 u'Work location': [None, [u'Office Of Superintendent']]}

Does anyone have any ideas?

Best answer

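It would be better to change the way you store the scraped data: build one dictionary per person instead of one list per field.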

Pseudocode:

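data = []  # one dict per person
for row in table:
    # get_data_from_row and get_data_from_person_page are placeholders for
    # your own scraping code; each should return a dict of field -> value.
    person = get_data_from_row(row)
    person.update(get_data_from_person_page(row))
    data.append(person)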

Then you can use csv.DictWriter without any complicated data manipulation:

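import csv

# Collect every field that appears in any record, since not every person
# has the same keys; taking data[0].keys() alone would miss some of them.
fieldnames = sorted(set(k for row in data for k in row))

# 'wb' is for Python 2.7; on Python 3 use open('data.csv', 'w', newline='').
with open('data.csv', 'wb') as f:
    # Fields missing from a row are written as '' (the restval default).
    writer = csv.DictWriter(f, fieldnames)
    writer.writeheader()
    writer.writerows(data)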
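
As a self-contained illustration of the per-person approach, here is a minimal sketch written for Python 3 (the question is tagged python-2.7, so treat it as an adaptation). The names table_rows, detail_pages, build_records and records_to_csv are hypothetical stand-ins for the real scraping code. Which value belongs to which person is an assumption read off the question's data.

```python
import csv
import io

# Hypothetical stand-ins for scraped data: one dict per table row, plus
# detail-page fields keyed by name. The assignment of values to people
# is an assumption based on the question's data.
table_rows = [
    {"Full Name": "Marcus J. Newsome", "Base Salary": "$229,758"},
    {"Full Name": "Dana T. Bedden", "Base Salary": "$234,068"},
]
detail_pages = {
    "Marcus J. Newsome": {"Bonuses or other allowances": "$12,500"},
    "Dana T. Bedden": {"Additional compensation": "$32,241"},
}


def build_records(rows, details):
    """Merge each row with its detail-page fields into one dict per person."""
    records = []
    for row in rows:
        person = dict(row)
        person.update(details.get(row["Full Name"], {}))
        records.append(person)
    return records


def records_to_csv(records, restval=""):
    """Return CSV text; fields a person lacks are filled with restval."""
    # Union of all keys, so no record contains a field unknown to the writer.
    fieldnames = sorted({key for rec in records for key in rec})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval=restval)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = build_records(table_rows, detail_pages)
csv_text = records_to_csv(records)
print(csv_text)
```

Because each value stays attached to its own person's dict, a missing field becomes an empty cell in that person's row, and no manual None padding is needed. Passing a different restval (for example "None") changes what is written for missing fields.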

A similar question about merging two dictionaries together and adding None where needed can be found on Stack Overflow: https://stackoverflow.com/questions/39923922/
