python - How to convert a column-nested CSV file to a nested dictionary using Python?

I have a Google Sheet of categories.

[Google Sheet with nested categories][1]

[1]: /image/3OAi5.png

I exported it to a CSV file, which looks like this:

Substructure,,,
,Foundations,,
,,Standard Foundations,
,,,Wall Foundations   
,,,Column Foundations   
,,,Standard Foundation Supplementary Components   
,,Special Foundations,
,,,Driven Piles   
,,,Bored Piles   
,,,Caissons   
,,,Special Foundation Walls   
,,,Foundation Anchors   
,,,Underpinning   
,,,Raft Foundations   
,,,Pile Caps   
,,,Grade Beams

Using Python, I want to convert this CSV file into a nested dictionary with the following format:

categories = [
    {
      id: 0,
      title: 'parent'
    }, {
      id: 1,
      title: 'parent',
      subs: [
        {
          id: 10,
          title: 'child'
        }, {
          id: 11,
          title: 'child'
        }, {
          id: 12,
          title: 'child'
        }
      ]
    }, {
      id: 2,
      title: 'parent'
    },
    // more data here
];

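So, to be clear, each CSV row should become a dictionary of the form {id: x, title: y}, and if it has children it should look like this: {id: x, title: y, subs: [comma-separated child dictionaries]}.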
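For reference, the target shape written as a valid Python literal might look like the sketch below. The keys are quoted strings because Python dict keys cannot be bare identifiers as in the JavaScript-style snippet above; the ids and titles are the same placeholders used there.

```python
# Target shape as a Python literal (ids and titles are placeholders).
categories = [
    {'id': 0, 'title': 'parent'},
    {
        'id': 1,
        'title': 'parent',
        'subs': [
            {'id': 10, 'title': 'child'},
            {'id': 11, 'title': 'child'},
            {'id': 12, 'title': 'child'},
        ],
    },
    {'id': 2, 'title': 'parent'},
]

# Items without children simply omit the 'subs' key.
print([item['id'] for item in categories if 'subs' in item])
```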

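I have spent about a day and a half trying to adapt solutions to similar questions here, but they are all too different for me to make them work at my current skill level. I feel pretty bad about it and would really appreciate some help. If possible, I would also like to reuse the solution in other scenarios with a different number of child levels. This example has three levels of children; others have only two or one.

Thank you very much for your help.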

Best answer

Recursion!

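import csv
from pprint import pprint

filename = 'myfile.csv'
# newline='' is what the csv module recommends when opening files.
with open(filename, newline='') as f:
    matrix = list(csv.reader(f))

current_id = -1


def next_id():
    """Return the next sequential id, starting at 0."""
    global current_id
    current_id += 1
    return current_id


def group(column, rows):
    """Build the list of items whose titles sit in `column` of `rows`."""
    # Last column: these rows are leaves and cannot have children.
    if column == len(matrix[0]) - 1:
        return [
            {'id': next_id(), 'title': row[column].strip()}
            for row in rows
        ]

    result = []
    item = None
    sub = None
    for row in rows:
        title = row[column]
        if title:
            # A new item starts here; first close the previous one by
            # grouping the rows collected beneath it.
            if item:
                item['subs'] = group(column + 1, sub)
            item = {'id': next_id(), 'title': title.strip()}
            result.append(item)
            sub = []
        else:
            # Empty cell in this column: the row belongs to the current item.
            sub.append(row)
    # Close the last item. `item` is still None when `rows` is empty,
    # which happens for a parent that has no children.
    if item:
        item['subs'] = group(column + 1, sub)
    return result


pprint(group(0, matrix))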

Output:

[{'id': 0,
  'subs': [{'id': 1,
            'subs': [{'id': 2,
                      'subs': [{'id': 3, 'title': 'Wall Foundations'},
                               {'id': 4, 'title': 'Column Foundations'},
                               {'id': 5,
                                'title': 'Standard Foundation Supplementary Components'}],
                      'title': 'Standard Foundations'},
                     {'id': 6,
                      'subs': [{'id': 7, 'title': 'Driven Piles'},
                               {'id': 8, 'title': 'Bored Piles'},
                               {'id': 9, 'title': 'Caissons'},
                               {'id': 10,
                                'title': 'Special Foundation Walls'},
                               {'id': 11, 'title': 'Foundation Anchors'},
                               {'id': 12, 'title': 'Underpinning'},
                               {'id': 13, 'title': 'Raft Foundations'},
                               {'id': 14, 'title': 'Pile Caps'},
                               {'id': 15, 'title': 'Grade Beams'}],
                      'title': 'Special Foundations'}],
            'title': 'Foundations'}],
  'title': 'Substructure'}]
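
Since the question also asks about sheets with a different number of child levels, here is an alternative sketch that uses an explicit stack instead of recursion. It is not part of the original answer. It assumes that a row's depth is the index of its first non-empty cell, and it adds 'subs' only to items that actually have children, as the question's target format suggests. The function name `build_tree` and the extra top-level row `Shell` in the sample are hypothetical, added only for illustration.

```python
import csv
import io
from itertools import count
from pprint import pprint


def build_tree(rows):
    """Turn column-indented rows into a nested list of dicts.

    Assumption: a row's depth is the index of its first non-empty cell.
    """
    ids = count()   # sequential ids: 0, 1, 2, ...
    roots = []
    stack = []      # (depth, item) pairs for the currently open ancestors
    for row in rows:
        cells = [cell.strip() for cell in row]
        depth = next((i for i, cell in enumerate(cells) if cell), None)
        if depth is None:
            continue  # skip blank rows
        item = {'id': next(ids), 'title': cells[depth]}
        # Close every open item that is at the same depth or deeper.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            # setdefault creates 'subs' only when a child actually appears.
            stack[-1][1].setdefault('subs', []).append(item)
        else:
            roots.append(item)
        stack.append((depth, item))
    return roots


# Shortened sample; "Shell" is a hypothetical childless top-level row.
sample = """\
Substructure,,,
,Foundations,,
,,Standard Foundations,
,,,Wall Foundations
,,,Column Foundations
,,Special Foundations,
,,,Driven Piles
Shell,,,
"""

tree = build_tree(csv.reader(io.StringIO(sample)))
pprint(tree)
```

Because the depth is computed per row, the same function should work for sheets with one, two, or more levels without changing the code. To read from a file instead of a string, pass `csv.reader(f)` for an open file `f`.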

This Q&A on converting a column-nested CSV file to a nested dictionary with Python is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/58246880/
