python - How do I add a new JSON node, nested at an unknown depth, to an existing JSON file?

Tags: python json python-3.x machine-learning

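I recently came across the diagram below. It describes a console application that tries to guess the animal the user is thinking of by asking a series of questions, and that updates the questions it asks whenever it guesses wrong: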

[Figure: flowchart of the animal-guessing algorithm]

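Although I know nothing about machine learning, I thought this would be a very simple program to replicate with a decision tree, so I put together the Python code below: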

import json

json_file = open("DecisionTree1.json", "r")
decision_tree = json.loads(json_file.read())
partial_decision_tree = decision_tree["start"]


def get_user_input(prompt, validation):
    if validation == "yes_no":
        print(prompt)
        while True:
            answer = input()
            if answer.lower() not in ('yes', 'no'):
                print("Please enter 'Yes' or 'No'")
            else:
                return answer.lower()
    elif validation == "not_empty":
        while True:
            answer = input(prompt + "\n")
            if answer != "":
                return answer.lower()


def create_new_node(guess):
    correct_answer = get_user_input("What animal were you thinking of?", "not_empty")
    new_question = get_user_input("Enter a question for which the answer is 'Yes' for " + correct_answer + " and 'No' for " + guess, "not_empty")
    new_node = json.loads('{"question": "' + new_question + '","children":{"yes": {"question": "Is it a ' + correct_answer + '?","children": null},"no": {"question": "Is it a rabbit?","children": null}}}')
    return json.dumps(new_node)


answer_array = list()


while partial_decision_tree["children"]:
    answer = get_user_input(partial_decision_tree["question"], "yes_no")
    answer_array.append(answer)
    partial_decision_tree = partial_decision_tree["children"][answer]

if get_user_input(partial_decision_tree["question"], "yes_no") == "no":
    select_conditions = '["start"]'
    for answer in answer_array:
        select_conditions += '["children"]["' + answer + '"]'
    query = "decision_tree" + select_conditions + " = '" + create_new_node(partial_decision_tree["question"].split(" ")[-1][0:len(partial_decision_tree["question"].split(" ")[-1])-1]) + "'"
    exec(query)

The JSON file DecisionTree1.json contains the following data, which is meant to represent a (very small) decision tree:

{
    "start":
    {
        "question": "Is it smaller than a bicycle?",
        "children": 
        {
            "yes": {
                "question": "Is it a rabbit?",
                "children": null
            },
            "no": {
                "question": "Is it an elephant?",
                "children": null
            }
        }
    }
}

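The idea is that if the program guesses incorrectly, the leaf node it was looking at when it made the guess should be replaced with a new internal node, which adds an extra level of filtering to the program's guesses.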

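In JSON terms, this means:

  1. Replacing the "question" property of the node that holds the current guess with the question specified by the user
  2. Updating the node's "children" property so that it contains two new nodes (instead of null), each of which is a guess (i.e. a leaf node)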

My question is: how do I update the JSON in the file in this way?

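Currently, the query variable in my Python code updates the JSON in such a way that the value of the "children" property becomes a string rather than two child nodes.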

Edit: following martineau's comment, here is an example of what the JSON should look like after an update:

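Suppose the user is thinking of a tortoise. As things stand, the program would incorrectly guess that their animal is a rabbit. When asked to "Enter a question for which the answer is 'Yes' for tortoise and 'No' for rabbit", they might enter the question "Does it have a shell?". The existing JSON (shown above) should then become: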

{
    "start":
    {
        "question": "Is it smaller than a bicycle?",
        "children": 
        {
            "yes": {
                "question": "Does it have a shell?",
                "children":
                {
                    "yes": {
                        "question": "Is it a tortoise?",
                        "children": null
                    },
                    "no": {
                        "question": "Is it a rabbit?",
                        "children": null
                    }
                }
            },
            "no": {
                "question": "Is it an elephant?",
                "children": null
            }
        }
    }
}

Best answer

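Your question is an interesting one. I had a go at it, but before explaining the solution I'd like to point out a few conceptual issues I noticed while working with your code. Here are the changes I made: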

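  1. I don't think your JSON structure is very easy to parse. I went for something more comfortable by removing the "children" node, since there are only ever two possible choices (the revised DecisionTree1.json is shown at the end of this answer).

  2. Don't get confused: once loaded, a JSON document like yours is nothing more than a plain dict with other dicts nested inside it. You don't need to write any JSON text yourself, because Python knows how to translate such structures in both directions.

  3. For walking a tree hierarchy, an explicit loop that records the path is usually the more awkward choice in terms of readability and ease of use. Recursion handles it very simply, and I strongly suggest you look at how it helps in cases like this one.

  4. The validation looked to me like a side effect of the iterative approach. In my opinion you don't need the extra validation parameter everywhere, but you can reintroduce it wherever you really need it.

  5. I "optimized" your code by adding a few things, such as if __name__ == "__main__", whose purpose is to check whether the Python file is being run directly as a script or imported as a module by other code. You also misused open for reading and writing the file. I fixed those issues for you.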
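To illustrate point 2 against the original file layout (the one with a "children" key): because the loaded tree is just nested dicts, the leaf can be reached by following the recorded answers and then modified in place, without exec or string building. This is only a minimal sketch. The function name replace_leaf and its parameters are hypothetical and not part of the question's code.

```python
import json

def replace_leaf(tree, answers, new_question, correct_animal):
    """Turn the leaf reached by following `answers` into an internal node."""
    node = tree["start"]
    for answer in answers:  # each answer is "yes" or "no"
        node = node["children"][answer]

    old_guess = node["question"]  # e.g. "Is it a rabbit?"
    # Mutating `node` also changes `tree`: both refer to the same dict object.
    node["question"] = new_question
    node["children"] = {
        "yes": {"question": "Is it a {}?".format(correct_animal), "children": None},
        "no": {"question": old_guess, "children": None},
    }

tree = json.loads("""
{"start": {"question": "Is it smaller than a bicycle?",
           "children": {
               "yes": {"question": "Is it a rabbit?", "children": null},
               "no": {"question": "Is it an elephant?", "children": null}}}}
""")
replace_leaf(tree, ["yes"], "Does it have a shell?", "tortoise")
print(json.dumps(tree["start"]["children"]["yes"], indent=4))
```

The answers list plays the role of answer_array in the question, so the nesting depth never has to be known in advance.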

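I hope you pick up a few tricks from reading my solution; that is at least why I wrote it. I don't claim to know everything, so there may be mistakes in it, but it should help you carry on with your project. As a general suggestion, I'd split the learning part of the script out into separate functions.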
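One possible way to pull the learning step into its own function is sketched here, assuming the simplified layout in which every node has "question", "yes" and "no" keys. The name learn is a hypothetical helper; it does not appear in the script that follows.

```python
def learn(leaf, new_question, correct_animal):
    """Split a leaf in place: it becomes a question with two guesses below it."""
    old_guess = dict(leaf)  # shallow copy of the old leaf; becomes the "no" branch
    leaf["question"] = new_question
    leaf["yes"] = {
        "question": "Is it a {}?".format(correct_animal),
        "yes": None,
        "no": None,
    }
    leaf["no"] = old_guess
    return leaf

# Example: the program guessed "rabbit" but the user was thinking of a tortoise.
leaf = {"question": "Is it a rabbit?", "yes": None, "no": None}
learn(leaf, "Does it have a shell?", "tortoise")
print(leaf["yes"]["question"])  # Is it a tortoise?
print(leaf["no"]["question"])   # Is it a rabbit?
```

Since the leaf is changed in place, any tree that contains it sees the update without further bookkeeping.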

Main script:

import json

def get_question_node(tree, question):
    """
    Finds the node which contains the given question, recursively.

    :param tree: the tree level from where to start
    :param question: the question to find
    """
    if 'question' in tree and tree['question'] == question:
        # If the current node contains the question, return it
        return tree
    if 'yes' in tree and tree['yes']:
        # If there is a 'yes' node, check its underlying question
        result = get_question_node(tree['yes'], question)
        if result:
            return result
    if 'no' in tree and tree['no']:
        # If there is a 'no' node, check its underlying question
        result = get_question_node(tree['no'], question)
        if result:
            return result

def guess(tree):
    """
    Guesses based on a user's input and the given decision tree.

    :param tree: the current node to go through for the user
    """
    # A question has been found
    question = tree['question']
    answer = input(question + '\n').lower()

    if answer == 'yes':
        if tree['yes']:
            # There are sub questions, ask them
            return guess(tree['yes'])
        else:
            # Final question answered correctly, so we won!
            print('Yay, I guessed correctly!')
            return True
    elif answer == 'no':
        if tree['no']:
            # There are sub questions, ask them
            return guess(tree['no'])
        else:
            # No more question, create a new one and "learn"
            correct_answer = input('What animal were you thinking of?\n')
            new_question = input('Enter a question for which the answer is "Yes" for {}\n'.format(correct_answer))

            # Return to the caller to fix the order of questions
            return {
                'old': tree['question'],
                'new': new_question,
                'correct_answer': correct_answer,
            }
    else:
        # Answer needs to be yes or no, any other answer loops back to the question
        print('Sorry, I didn\'t get that... let\'s try again!')
        return guess(tree)


if __name__ == '__main__':
    # Load and parse the decision tree
    with open("DecisionTree1.json", "r") as json_file:
        decision_tree = json.load(json_file)

    # Start guessing
    partial_decision_tree = decision_tree["start"]
    result = guess(partial_decision_tree)

    if isinstance(result, dict):
        # Ah! We learned something new, let's swap questions
        question_node = get_question_node(partial_decision_tree, result['old'])
        new_yes = {
            'question': 'Is it a {}?'.format(result['correct_answer']),
            'yes': None,
            'no': None,
        }
        new_no = {
            'question': question_node['question'],
            'yes': question_node['yes'],
            'no': question_node['no'],
        }

        question_node['no'] = new_no
        question_node['yes'] = new_yes
        question_node['question'] = result['new']

    # Persist changes to the decision tree file
    with open('DecisionTree1.json', 'w') as tree_file:
        json.dump(decision_tree, tree_file, indent=4)

And the improved DecisionTree1.json:

{
    "start":
    {
        "question": "Is it smaller than a bicycle?",
        "yes": {
            "question": "Is it a rabbit?",
            "yes": null,
            "no": null
        },
        "no": {
            "question": "Is it an elephant?",
            "yes": null,
            "no": null
        }
    }
}
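As a small illustration of how naturally this flattened layout lends itself to recursion, the sketch below collects every final guess (leaf) in the tree. The function leaf_questions is a hypothetical helper added only for demonstration, and the tree is inlined rather than read from DecisionTree1.json so the snippet runs on its own.

```python
import json

def leaf_questions(node):
    """Recursively collect the questions stored in the leaves of the tree."""
    if node["yes"] is None and node["no"] is None:
        return [node["question"]]  # a leaf: this is a final guess
    found = []
    for branch in ("yes", "no"):
        if node[branch] is not None:
            found.extend(leaf_questions(node[branch]))
    return found

tree = json.loads("""
{"start": {"question": "Is it smaller than a bicycle?",
           "yes": {"question": "Is it a rabbit?", "yes": null, "no": null},
           "no": {"question": "Is it an elephant?", "yes": null, "no": null}}}
""")
guesses = leaf_questions(tree["start"])
print(guesses)  # ['Is it a rabbit?', 'Is it an elephant?']
```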

I hope this answers your question.

Source: "python - How do I add a new JSON node, nested at an unknown depth, to an existing JSON file?" on Stack Overflow: https://stackoverflow.com/questions/60129207/
