python - Avoid exporting data to csv in bulk when the header is dynamic

Tags: python csv stream python-3.6

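I've run into a seemingly simple situation that I can't find a solution for.

What I want to do is simple: write some data to a .csv file which contains:

  • a dynamic header
  • some data

The way I'm doing it right now is the only solution I could come up with:

  • store the data I need in a list of dictionaries
  • take the keys() of every dictionary in that list and add them to a set() (this will be the header)
  • write the data to the file with writer.writerows(data)

Basically, a simple MCVE might look like this: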

from csv import DictWriter

RESULT_FILE = 'test_result.csv'


def get_fieldnames(data):
    fieldnames = set()
    for item in data:
        fieldnames.update(item.keys())
    return fieldnames


def main(data):
    fieldnames = get_fieldnames(data)

    with open(RESULT_FILE, 'a', newline='', encoding='utf-8') as f:
        writer = DictWriter(f, fieldnames=fieldnames, delimiter=',')
        writer.writeheader()
        writer.writerows(data)


if __name__ == '__main__': 
    data_ = [
        {
            'a': '1',
            'b': '2',
            'c': '3',
        },
        {
            'a': '6',
            'd': '1',
            'b': '3',
        },
        {
            'c': '2',
            'e': '1',
            'f': '9',
        }
    ]
    main(data_)

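Now, what I don't like about this:

  • The list can become very large (about 100k dictionaries, each with roughly 10 fields).
  • If the program crashes while adding dictionary number 66666 to the list, everything is lost and there is no data in the csv either. Because I have to wait until all the data is in the list before I can collect every possible header, I can't avoid this.

How can I avoid exporting all the data to the csv in one go when the header is dynamic?


As requested, the real data looks like this: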

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Exclusive single-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 24.70',
 'Info': '',
 'Line art': '',
 'Name': '(5") Non-Vacuum Disc Pad Vinyl-Face',
 'Product number': '91456106T',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsGr/120/107/6/1201076/1419675_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '',
 'Each': '$ 8.19',
 'Info': '<p><strong>material: </strong>Cork</p>',
 'Line art': '',
 'Name': 'Replacement Plate for MKT9924DB Belt Sander',
 'Product number': 'MKT4230358',
 'Technical specifications': '<p><strong>brand: </strong>Makita</p>',
 'image_1': 'https://www.richelieu.com/documents/docsGr/116/631/4/1166314/1281513_700.jpg',
 '\xa0': '$ 257.80'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '',
 'Each': '$ 8.19',
 'Info': '<p><strong>material: </strong>Graphite</p>',
 'Line art': '',
 'Name': 'Replacement Plate for MKT9924DB Belt Sander',
 'Product number': 'MKT4230366',
 'Technical specifications': '<p><strong>brand: </strong>Makita</p>',
 'image_1': 'https://www.richelieu.com/documents/docsPr/MK/T4/23/03/66/MKT4230366/1281514_700.jpg',
 '\xa0': '$ 257.80'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '- Exclusive single-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 38.47',
 'Info': '',
 'Line art': '',
 'Name': 'Non-Grip Vacuum Pads',
 'Product number': '9154325',
 'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                             'in</p><p><strong>density: '
                             '</strong>Medium</p><p><strong>nap: '
                             '</strong>Short</p>',
 'image_1': 'https://www.richelieu.com/documents/docsPr/91/54/32/5/9154325/1213330_700.jpg',
 'image_2': 'https://www.richelieu.com/documents/docsPr/91/54/32/5/9154325/1213331_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '- Exclusive single-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 52.92',
 'Info': '',
 'Line art': '',
 'Name': 'Non-Grip Vacuum Pads',
 'Product number': '9154327',
 'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                             'in</p><p><strong>density: '
                             '</strong>Medium</p><p><strong>nap: '
                             '</strong>Short</p>',
 'image_1': 'https://www.richelieu.com/documents/docsGr/105/122/1/1051221/1213328_700.jpg',
 'image_2': 'https://www.richelieu.com/documents/docsPr/91/54/32/7/9154327/1213332_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '- Unique one-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 26.84',
 'Info': '',
 'Line art': '',
 'Name': 'Stick-on Non-Vacuum Pads',
 'Product number': '9156106',
 'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                             'in</p><p><strong>density: </strong>Medium</p>',
 'image_1': 'https://www.richelieu.com/documents/docsGr/105/122/4/1051224/1213343_700.jpg',
 'image_2': 'https://www.richelieu.com/documents/docsPr/91/56/10/6/9156106/1213345_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '- Unique one-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 51.70',
 'Info': '',
 'Line art': '',
 'Name': 'Stick-on Non-Vacuum Pads',
 'Product number': '9156107',
 'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                             'in</p><p><strong>density: </strong>Medium</p>',
 'image_1': 'https://www.richelieu.com/documents/docsPr/91/56/10/7/9156107/1213344_700.jpg',
 'image_2': 'https://www.richelieu.com/documents/docsPr/91/56/10/7/9156107/1213346_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Size: 2-1/2" x 14".',
 'Each': '$ 12.36',
 'Info': '',
 'Line art': '',
 'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
 'Product number': 'PC371K060',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/06/0/PC371K060/1263523_700.jpg',
 '\xa0': '$ 148.18'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Size: 2-1/2" x 14".',
 'Each': '$ 12.36',
 'Info': '',
 'Line art': '',
 'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
 'Product number': 'PC371K080',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/08/0/PC371K080/1263524_700.jpg',
 '\xa0': '$ 148.18'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Size: 2-1/2" x 14".',
 'Each': '$ 12.36',
 'Info': '',
 'Line art': '',
 'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
 'Product number': 'PC371K120',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/12/0/PC371K120/1263526_700.jpg',
 '\xa0': '$ 148.18'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Size: 2-1/2" x 14".',
 'Each': '$ 12.36',
 'Info': '',
 'Line art': '',
 'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
 'Product number': 'PC371K100',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/10/0/PC371K100/1263525_700.jpg',
 '\xa0': '$ 148.18'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': 'Exclusive single-piece hub design reduces pad vibration and '
                'ensures smooth performance.',
 'Each': '$ 25.22',
 'Info': '',
 'Line art': '',
 'Name': '5" Non-Vacuum Disc Pad Hook-Face',
 'Product number': '91454325T',
 'Technical specifications': '',
 'image_1': 'https://www.richelieu.com/documents/docsGr/120/107/7/1201077/1419678_700.jpg'}

{'Catalog link': '',
 'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
             'Accessories / Sander Accessories',
 'Description': '- Pads mount with screws.',
 'Each': '$ 31.80',
 'Info': '',
 'Line art': '',
 'Name': 'Plates for Non-Vacuum (Grip-On) Dynabug II Disc Pads - 7.62 cm x '
         '10.79 cm (3" x 4-1/4")',
 'Product number': '9156315',
 'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                             'in</p><p><strong>density: </strong>Medium</p>',
 'image_1': 'https://www.richelieu.com/documents/docsGr/116/625/4/1166254/1280825_700.jpg',
 '\xa0': '$ 179.95'}

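Best answer

Edit-1 26-Dec: Updated the code to generate the output from your data

Based on your requirements, I would suggest the following:

  • Write the header to a headers.csv file
  • Write the data to a data.csv file
  • When you want to read or send the result, simply merge the two files into one
  • At program start, read the existing headers.csv file and build a field-to-index map
  • When you encounter a new key in the data, add it to the header map with a new index and rewrite headers.csv
  • When writing a dictionary, use the header map to build the row data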
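
To make the field-to-index idea from the list above concrete, here is a minimal in-memory sketch run on the small sample from the question. The function name `rows_with_growing_header` and the use of `io.StringIO` in place of real files are assumptions made for illustration only; they are not part of the original answer.

```python
import csv
import io


def rows_with_growing_header(records):
    """Yield (header, row) pairs; the header grows as new keys appear."""
    index_of = {}  # field name -> column index, in first-seen order
    for record in records:
        for key in record:
            if key not in index_of:
                index_of[key] = len(index_of)
        # Build a row as wide as the header known so far.
        row = [""] * len(index_of)
        for key, value in record.items():
            row[index_of[key]] = value
        yield list(index_of), row


sample = [
    {'a': '1', 'b': '2', 'c': '3'},
    {'a': '6', 'd': '1', 'b': '3'},
    {'c': '2', 'e': '1', 'f': '9'},
]

buffer = io.StringIO()  # stands in for data.csv
writer = csv.writer(buffer, lineterminator="\n")
header = []
for header, row in rows_with_growing_header(sample):
    writer.writerow(row)  # each row is written as soon as it is seen

print(header)             # final header, known only after the last row
print(buffer.getvalue())
```

Because new fields are only ever appended to the end of the header, rows written earlier stay valid; they are simply shorter than the final header. The sketch relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7 onward.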

Below is a quick and dirty POC of this approach, which worked fine for me:

import csv

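# newline="" lets the csv module control line endings (avoids blank rows on Windows)
try:
    f = open("headers.csv", mode="r+", newline="", encoding="utf-8")
except FileNotFoundError:
    f = open("headers.csv", mode="w+", newline="", encoding="utf-8")

f2 = open("data.csv", mode="a+", newline="", encoding="utf-8")
f.seek(0)
# Parse the stored header with csv.reader so that quoted field names
# (for example ones containing a comma) are read back correctly.
# An empty or missing file yields an empty header list.
headers = next(csv.reader(f), [])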

headers_map = {}

for index, field in enumerate(headers):
    headers_map[field] = index


def update_header_dict(data):
    updated_headers = False
    for key in data.keys():
        if key not in headers_map:
            new_index = len(headers_map)
            headers_map[key] = new_index
            updated_headers = True

    if updated_headers:
        f.seek(0)
        csv.DictWriter(f, headers_map.keys()).writeheader()
        f.flush()


def get_row_data_dict(data):
    row_data = [""] * len(headers_map)

    for k, v in data.items():
        # if v and v[0] in ('=', '-'):
        #     # Mark the value as text, only needed if you want to display data in excel
        #     # else should be commented out
        #     v = "'" + v
        row_data[headers_map[k]] = v

    return row_data


def main(data):
    data_writer = csv.writer(f2)
    for row in data:
        update_header_dict(row)
        data_writer.writerow(get_row_data_dict(row))


data_ = [
    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Exclusive single-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 24.70',
     'Info': '',
     'Line art': '',
     'Name': '(5") Non-Vacuum Disc Pad Vinyl-Face',
     'Product number': '91456106T',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsGr/120/107/6/1201076/1419675_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '',
     'Each': '$ 8.19',
     'Info': '<p><strong>material: </strong>Cork</p>',
     'Line art': '',
     'Name': 'Replacement Plate for MKT9924DB Belt Sander',
     'Product number': 'MKT4230358',
     'Technical specifications': '<p><strong>brand: </strong>Makita</p>',
     'image_1': 'https://www.richelieu.com/documents/docsGr/116/631/4/1166314/1281513_700.jpg',
     '\xa0': '$ 257.80'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '',
     'Each': '$ 8.19',
     'Info': '<p><strong>material: </strong>Graphite</p>',
     'Line art': '',
     'Name': 'Replacement Plate for MKT9924DB Belt Sander',
     'Product number': 'MKT4230366',
     'Technical specifications': '<p><strong>brand: </strong>Makita</p>',
     'image_1': 'https://www.richelieu.com/documents/docsPr/MK/T4/23/03/66/MKT4230366/1281514_700.jpg',
     '\xa0': '$ 257.80'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '- Exclusive single-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 38.47',
     'Info': '',
     'Line art': '',
     'Name': 'Non-Grip Vacuum Pads',
     'Product number': '9154325',
     'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                                 'in</p><p><strong>density: '
                                 '</strong>Medium</p><p><strong>nap: '
                                 '</strong>Short</p>',
     'image_1': 'https://www.richelieu.com/documents/docsPr/91/54/32/5/9154325/1213330_700.jpg',
     'image_2': 'https://www.richelieu.com/documents/docsPr/91/54/32/5/9154325/1213331_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '- Exclusive single-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 52.92',
     'Info': '',
     'Line art': '',
     'Name': 'Non-Grip Vacuum Pads',
     'Product number': '9154327',
     'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                                 'in</p><p><strong>density: '
                                 '</strong>Medium</p><p><strong>nap: '
                                 '</strong>Short</p>',
     'image_1': 'https://www.richelieu.com/documents/docsGr/105/122/1/1051221/1213328_700.jpg',
     'image_2': 'https://www.richelieu.com/documents/docsPr/91/54/32/7/9154327/1213332_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '- Unique one-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 26.84',
     'Info': '',
     'Line art': '',
     'Name': 'Stick-on Non-Vacuum Pads',
     'Product number': '9156106',
     'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                                 'in</p><p><strong>density: </strong>Medium</p>',
     'image_1': 'https://www.richelieu.com/documents/docsGr/105/122/4/1051224/1213343_700.jpg',
     'image_2': 'https://www.richelieu.com/documents/docsPr/91/56/10/6/9156106/1213345_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '- Unique one-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 51.70',
     'Info': '',
     'Line art': '',
     'Name': 'Stick-on Non-Vacuum Pads',
     'Product number': '9156107',
     'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                                 'in</p><p><strong>density: </strong>Medium</p>',
     'image_1': 'https://www.richelieu.com/documents/docsPr/91/56/10/7/9156107/1213344_700.jpg',
     'image_2': 'https://www.richelieu.com/documents/docsPr/91/56/10/7/9156107/1213346_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Size: 2-1/2" x 14".',
     'Each': '$ 12.36',
     'Info': '',
     'Line art': '',
     'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
     'Product number': 'PC371K060',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/06/0/PC371K060/1263523_700.jpg',
     '\xa0': '$ 148.18'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Size: 2-1/2" x 14".',
     'Each': '$ 12.36',
     'Info': '',
     'Line art': '',
     'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
     'Product number': 'PC371K080',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/08/0/PC371K080/1263524_700.jpg',
     '\xa0': '$ 148.18'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Size: 2-1/2" x 14".',
     'Each': '$ 12.36',
     'Info': '',
     'Line art': '',
     'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
     'Product number': 'PC371K120',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/12/0/PC371K120/1263526_700.jpg',
     '\xa0': '$ 148.18'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Size: 2-1/2" x 14".',
     'Each': '$ 12.36',
     'Info': '',
     'Line art': '',
     'Name': 'Sandpaper Belt 2½ " x 14" for Compact Belt Sander PC371 or PC371K',
     'Product number': 'PC371K100',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsPr/PC/37/1K/10/0/PC371K100/1263525_700.jpg',
     '\xa0': '$ 148.18'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': 'Exclusive single-piece hub design reduces pad vibration and '
                    'ensures smooth performance.',
     'Each': '$ 25.22',
     'Info': '',
     'Line art': '',
     'Name': '5" Non-Vacuum Disc Pad Hook-Face',
     'Product number': '91454325T',
     'Technical specifications': '',
     'image_1': 'https://www.richelieu.com/documents/docsGr/120/107/7/1201077/1419678_700.jpg'},

    {'Catalog link': '',
     'Category': 'Tools and Shop Supplies / Workshop Accessories / Tool '
                 'Accessories / Sander Accessories',
     'Description': '- Pads mount with screws.',
     'Each': '$ 31.80',
     'Info': '',
     'Line art': '',
     'Name': 'Plates for Non-Vacuum (Grip-On) Dynabug II Disc Pads - 7.62 cm x '
             '10.79 cm (3" x 4-1/4")',
     'Product number': '9156315',
     'Technical specifications': '<p><strong>thickness: </strong>3/8 '
                                 'in</p><p><strong>density: </strong>Medium</p>',
     'image_1': 'https://www.richelieu.com/documents/docsGr/116/625/4/1166254/1280825_700.jpg',
     '\xa0': '$ 179.95'}
]

data2_ = [
    {
        'a': '2',
        'f': '1',
        'z': '9',
    },
]

main(data_)
# main(data2_)

f.close()
f2.close()

Running the above generates two files. I then ran the following in a terminal:

cat headers.csv data.csv > output.csv
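
Where `cat` is not available, the same merge could be done in Python. The following is a sketch under assumptions, not part of the original answer: the helper name `merge_csv` is hypothetical, and padding short rows to the header width is an optional extra that `cat` does not perform. The demo at the bottom uses temporary files with made-up contents so it can run on its own.

```python
import csv
import os
import tempfile


def merge_csv(header_path, data_path, output_path):
    """Write the header row, then every data row padded to the header width."""
    with open(header_path, newline="", encoding="utf-8") as hf:
        header = next(csv.reader(hf), [])
    with open(data_path, newline="", encoding="utf-8") as df, \
            open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        for row in csv.reader(df):
            # Earlier rows were written before later columns existed.
            writer.writerow(row + [""] * (len(header) - len(row)))


# Small self-contained demo using temporary files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    paths = [os.path.join(tmp, name)
             for name in ("headers.csv", "data.csv", "output.csv")]
    with open(paths[0], "w", newline="", encoding="utf-8") as fh:
        fh.write("a,b,c,d\r\n")
    with open(paths[1], "w", newline="", encoding="utf-8") as fh:
        fh.write("1,2,3\r\n6,3,,1\r\n")
    merge_csv(*paths)
    with open(paths[2], newline="", encoding="utf-8") as fh:
        merged = list(csv.reader(fh))

print(merged)
```

For the files produced by the POC, the call would be `merge_csv("headers.csv", "data.csv", "output.csv")`.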

and then opened output.csv in Excel.

[Screenshot: Excel Data]

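The only issue you may see is #NAME?, which appears because Excel tries to evaluate text that starts with -. If you need to handle such text, uncomment the following part of the code: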

    # if v and v[0] in ('=', '-'):
    #     # Mark the value as text, only needed if you want to display data in excel
    #     # else should be commented out
    #     v = "'" + v
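
The same check can also be pulled out into a small helper so it is easy to switch on and off. This is only a sketch: the function name `mark_as_text_for_excel` is hypothetical, and it deliberately mirrors the two characters tested in the commented-out snippet rather than covering every prefix Excel might interpret.

```python
def mark_as_text_for_excel(value):
    """Prefix a value with an apostrophe if it starts with '=' or '-'.

    Mirrors the commented-out check in the POC; only useful when the
    output is meant to be viewed in Excel.
    """
    if value and value[0] in ('=', '-'):
        return "'" + value
    return value


print(mark_as_text_for_excel('- Pads mount with screws.'))
print(mark_as_text_for_excel('$ 24.70'))
```

In the POC this would be applied to each value inside get_row_data_dict before it is placed in the row.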

Regarding "python - Avoid exporting data to csv in bulk when the header is dynamic", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47914844/
