python - Converting dot-notation strings into nested Python objects with dictionaries and arrays

Tags: python arrays python-3.x dictionary

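Background

For some background, I'm trying to build a tool that converts worksheets into API calls using Python 3.5.

To convert the table cells into the schema the API call requires, I started using a JavaScript-like syntax for the headers in the spreadsheet. For example:

Worksheet header (string)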

dict.list[0].id

Python dictionary

{
  "dict": {
    "list": [
      {"id": "my cell value"}
    ]
  }
}

A header pattern may also contain nested arrays/dictionaries:

one.two[0].three[0].four.five[0].six

I also need to be able to append to the object after it has been created, because I iterate over every header.
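One way to reason about these headers is to first split each one into a flat list of tokens, where a string means "dictionary key" and an integer means "list index". The sketch below is only an illustration and not part of the code in this post: `parse_header` and `_SEGMENT` are hypothetical names, and it assumes each dot-separated segment carries at most one `[n]` suffix.

```python
import re

# Hypothetical helper: one segment is a name optionally followed by a single [index].
_SEGMENT = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")


def parse_header(header):
    """Split 'a.b[0].c' into ['a', 'b', 0, 'c'] (str = dict key, int = list index)."""
    tokens = []
    for segment in header.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            # Empty segments or shapes like 'a[0][1]' are outside this sketch.
            raise ValueError("unsupported segment: %r" % segment)
        name, index = match.groups()
        tokens.append(name)
        if index is not None:
            tokens.append(int(index))
    return tokens


print(parse_header("dict.list[0].id"))
print(parse_header("one.two[0].three[0].four.five[0].six"))
```

Keeping the index as an integer token makes it explicit which level must be a list, which is the information the dictionary-only approach loses.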

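What I've tried

Adding branches

Based on https://stackoverflow.com/a/47276490/2903486 I can set nested dictionaries from values such as one.two.three.four, and I can append to the existing dictionary while iterating over the rows, but I haven't been able to add support for arrays: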

def add_branch(tree, vector, value):
    key = vector[0]
    tree[key] = value \
        if len(vector) == 1 \
        else add_branch(tree[key] if key in tree else {},
                        vector[1:],
                        value)
    return tree

file = Worksheet(filePath, sheet).readRow()
rowList = []
for row in file:
    rowObj = {}
    for colName, rowValue in row.items():
        rowObj.update(add_branch(rowObj, colName.split("."), rowValue))
    rowList.append(rowObj)
return rowList
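To make the limitation concrete, here is a small self-contained run of the same `add_branch` logic on an in-memory row. The `row` dictionary and its header names are made up for illustration (the `Worksheet` reader above is not reproduced), and `tree.get(key, {})` is used as an equivalent of the original conditional.

```python
def add_branch(tree, vector, value):
    # Same logic as the function above, written with dict.get.
    key = vector[0]
    if len(vector) == 1:
        tree[key] = value
    else:
        tree[key] = add_branch(tree.get(key, {}), vector[1:], value)
    return tree


# Hypothetical row: two plain dotted headers and one header with an index.
row = {
    "one.two.three": "a",
    "one.two.four": "b",
    "one.list[0].id": "c",
}

row_obj = {}
for col_name, cell in row.items():
    add_branch(row_obj, col_name.split("."), cell)

print(row_obj)
```

The two dotted headers merge into one nested dictionary as intended, but `list[0]` is stored as a literal dictionary key rather than as a list, which is the missing array support described above.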

My own version of add_branch

import re, json
def branch(tree, vector, value):
    """
    Used to convert JS style notation (e.g dict.another.array[0].id) to a python object
    Originally based on https://stackoverflow.com/a/47276490/2903486
    """

    # Convert Boolean
    if isinstance(value, str):
        value = value.strip()

        if value.lower() in ['true', 'false']:
            value = True if value.lower() == "true" else False

    # Convert JSON
    try:
        value = json.loads(value)
    except (TypeError, ValueError):
        # Not valid JSON (or not a string): keep the value unchanged
        pass

    key = vector[0]
    arr = re.search(r'\[([0-9]+)\]', key)
    if arr:
        arr = arr.group(0)
        key = key.replace(arr, '')
        arr = arr.replace('[', '').replace(']', '')

        newArray = False
        if key not in tree:
            tree[key] = []
            tree[key].append(value \
                                 if len(vector) == 1 \
                                 else branch({} if key in tree else {},
                                             vector[1:],
                                             value))
        else:
            isInArray = False
            for x in tree[key]:
                if x.get(vector[1:][0], False):
                    isInArray = x[vector[1:][0]]

            if isInArray:
                tree[key].append(value \
                                     if len(vector) == 1 \
                                     else branch({} if key in tree else {},
                                                 vector[1:],
                                                 value))
            else:

                tree[key].append(value \
                                     if len(vector) == 1 \
                                     else branch({} if key in tree else {},
                                                 vector[1:],
                                                 value))

        if len(vector) == 1 and len(tree[key]) == 1:
            tree[key] = value.split(",")
    else:
        tree[key] = value \
            if len(vector) == 1 \
            else branch(tree[key] if key in tree else {},
                        vector[1:],
                        value)
    return tree

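What I still need help with

After a few additions my branch solution actually works quite well now, but I'm wondering whether I'm doing something wrong or confused here, or whether there is a better way to handle the places where I'm editing nested arrays (my attempt starts in the `if isInArray` part of the code).

I expected these two headers to edit the same final array, but I end up creating a duplicate dictionary in the first array: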

file = [{
    "one.array[0].dict.arrOne[0]": "1,2,3",
    "one.array[0].dict.arrTwo[0]": "4,5,6"
}]
rowList = []
for row in file:
    rowObj = {}
    for colName, rowValue in row.items():
        rowObj.update(branch(rowObj, colName.split("."), rowValue))
    rowList.append(rowObj)
return rowList

Output:

[
    {
        "one": {
            "array": [
                {
                    "dict": {
                        "arrOne": [
                            "1",
                            "2",
                            "3"
                        ]
                    }
                },
                {
                    "dict": {
                        "arrTwo": [
                            "4",
                            "5",
                            "6"
                        ]
                    }
                }
            ]
        }
    }
]

Instead of:

[
    {
        "one": {
            "array": [
                {
                    "dict": {
                        "arrOne": [
                            "1",
                            "2",
                            "3"
                        ],
                        "arrTwo": [
                            "4",
                            "5",
                            "6"
                        ]
                    }
                }
            ]
        }
    }
]
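The duplicate appears because the existing-array branch always appends a new dictionary instead of looking up the element at the requested index. As a rough sketch of the index-aware alternative (not the code from this post), the hypothetical `set_path` below walks the path and reuses whatever container already exists at each step. It assumes, mirroring the behaviour above, that a trailing `[n]` on a string cell means "this cell is a comma-separated list".

```python
import re


def set_path(root, header, value):
    """Set value at a JS-style path, reusing existing dicts and lists."""
    # 'a.b[0].c' -> ['a', 'b', 0, 'c']
    tokens = [name if name else int(index)
              for name, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", header)]

    # Assumption: a trailing index on a string cell marks a comma-separated list.
    if isinstance(tokens[-1], int) and isinstance(value, str):
        tokens, value = tokens[:-1], value.split(",")

    node = root
    for token, nxt in zip(tokens, tokens[1:] + [None]):
        if isinstance(token, int):
            # Pad the list so the requested index exists.
            while len(node) <= token:
                node.append(None)
        if nxt is None:
            node[token] = value  # last token: store the cell value
            break
        # Create the child only if it is missing; its type depends on the next token.
        if isinstance(token, int):
            if node[token] is None:
                node[token] = [] if isinstance(nxt, int) else {}
        else:
            node.setdefault(token, [] if isinstance(nxt, int) else {})
        node = node[token]
    return root


row_obj = {}
set_path(row_obj, "one.array[0].dict.arrOne[0]", "1,2,3")
set_path(row_obj, "one.array[0].dict.arrTwo[0]", "4,5,6")
print(row_obj)
```

With these two headers the second call finds the dictionary already stored at `array[0]` and adds `arrTwo` next to `arrOne`. Skipped indexes are left as `None` placeholders in this sketch, which may or may not be what an API expects.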

Best answer

I'm not sure whether this solution has any caveats, but it seems to work for the use cases I've thrown at it:

import json, re
def build_job():

    def branch(tree, vector, value):
        
        # Originally based on https://stackoverflow.com/a/47276490/2903486

        # Convert Boolean
        if isinstance(value, str):
            value = value.strip()

            if value.lower() in ['true', 'false']:
                value = True if value.lower() == "true" else False

        # Convert JSON
        try:
            value = json.loads(value)
        except (TypeError, ValueError):
            # Not valid JSON (or not a string): keep the value unchanged
            pass
        
        key = vector[0]
        arr = re.search(r'\[([0-9]+)\]', key)
            
        if arr:
            
            # Get the index of the array, and remove it from the key name
            arr = arr.group(0)
            key = key.replace(arr,'')
            arr = int(arr.replace('[','').replace(']',''))
            
            if key not in tree:
                
                # If we don't have an array yet, create one and append the
                # value (or the result of the next recursion) to it
                tree[key] = []
                tree[key].append(value \
                    if len(vector) == 1 \
                    else branch({} if key in tree else {},
                                vector[1:],
                                value))
            else:
                
                # Check to see if we are inside of an existing array here
                isInArray = False
                for i in range(len(tree[key])):
                    if tree[key][i].get(vector[1:][0], False):
                        isInArray = tree[key][i][vector[1:][0]]
                        
                if isInArray and arr < len(tree[key]) \
                   and isinstance(tree[key][arr], list):
                    # Respond accordingly by appending or updating the value
                    tree[key][arr].append(value \
                        if len(vector) == 1 \
                        else branch(tree[key] if key in tree else {},
                                    vector[1:],
                                    value))
                else:
                    # Make sure we have an index to attach the requested array to
                    while arr >= len(tree[key]):
                        tree[key].append({})

                    # update the existing array with a dict
                    tree[key][arr].update(value \
                        if len(vector) == 1 \
                        else branch(tree[key][arr] if key in tree else {},
                                    vector[1:],
                                    value))
            
            # Turn comma-delimited values into lists
            if len(vector) == 1 and len(tree[key]) == 1:
                tree[key] = value.split(",")
        else:
            # Add dictionaries together
            tree.update({key: value \
                if len(vector) == 1 \
                else branch(tree[key] if key in tree else {},
                            vector[1:],
                            value)})
        return tree

    file = [{
        "one.array[0].dict.dont-worry-about-me": "some value",
        "one.array[0].dict.arrOne[0]": "1,2,3",
        "one.array[0].dict.arrTwo[1]": "4,5,6",
        "one.array[1].x.y[0].z[0].id": "789"
    }]
    rowList = []
    for row in file:
        rowObj = {}
        for colName, rowValue in row.items():
            rowObj.update(branch(rowObj, colName.split("."), rowValue))
        rowList.append(rowObj)
    return rowList
print(json.dumps(build_job(), indent=4))

Result:

[
    {
        "one": {
            "array": [
                {
                    "dict": {
                        "dont-worry-about-me": "some value",
                        "arrOne": [
                            "1",
                            "2",
                            "3"
                        ],
                        "arrTwo": [
                            "4",
                            "5",
                            "6"
                        ]
                    }
                },
                {
                    "x": {
                        "y": [
                            {
                                "z": [
                                    {
                                        "id": 789
                                    }
                                ]
                            }
                        ]
                    }
                }
            ]
        }
    }
]
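Note that `"id": 789` in the result is an integer even though the cell held the string "789": every cell goes through `json.loads`, so anything that parses as JSON changes type. The sketch below isolates that conversion step in a hypothetical `coerce_cell` helper (the name and the early return for booleans are assumptions of this sketch) so its effect on a few sample cells can be inspected on its own.

```python
import json


def coerce_cell(value):
    """Convert a worksheet cell the way the answer's conversion step does."""
    if isinstance(value, str):
        value = value.strip()
        if value.lower() in ("true", "false"):
            return value.lower() == "true"
    try:
        # Numbers, null, and JSON objects/arrays are parsed into Python values.
        return json.loads(value)
    except (TypeError, ValueError):
        # Anything that is not valid JSON stays as it was.
        return value


for cell in ["789", " TRUE ", "null", '{"a": 1}', "1,2,3", "some value"]:
    print(repr(cell), "->", repr(coerce_cell(cell)))
```

If a column such as an ID must stay a string, it may be worth skipping this conversion for that column rather than applying it to every cell.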

Regarding python - converting dot-notation strings into nested Python objects with dictionaries and arrays, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53270227/
