python - Split a string value, then create a new value using pandas

Tags: python pandas

I have a JSON file with the following structure:

[
    {
        "name": "Collection 1",
        "details": [
            {
                "id": 302,
                "description": "Book destined for kids"
            },
            {
                "id": 304,
                "description": "Book destined for Teen"
            },
            {
                "id": 305,
                "description": "Only for teen under the age of 13"
            }
        ]
    },
    {
        "name": "Collection 1",
        "details": [
            {
                "id": 400,
                "description": "books for adults to read"
            }
        ]
    }
]

I need to add a new key/value pair to each entry, where the value is a substring of the description, for example one of [teen, child, adult].

Expected output:

[
    {
        "name": "Collection 1",
        "details": [
            {
                "id": 302,
                "description": "Book destined for kids",
                "age range": "kids"
            },
            {
                "id": 304,
                "description": "Book destined for Teen",
                "age range": "teen"
            },
            {
                "id": 305,
                "description": "Only for teen under the age of 13",
                "age range": "teen"
            }
        ]
    },
    {
        "name": "Collection 2",
        "details": [
            {
                "id": 400,
                "description": "books for adults to read",
                "age range": "adults"
            }
        ]
    }
]

Does anyone know how to do this efficiently with pandas? (I need to keep the same structure.)

Best answer

I would do it like this:

import json

age_keywords = ["kids", "adults", "teen"]  # extend if needed

# parse the JSON string into Python objects (a list of dicts)
json_string = '[{"name": "Collection 1","details": [{"id": 302,"description":"Book destined for kids"},{"id": 304,"description":"Book destined for Teen"},{"id": 305,"description":"Only for teen under the age of 13"}]},{"name": "Collection 1","details": [{"id": 400,"description":"books for adults to read"}]}]'
data = json.loads(json_string)  # named `data` so the json module is not shadowed

# iterate over the details of every collection
for collection in data:
    for detail in collection['details']:
        description = detail['description']  # get the description
        for keyword in age_keywords:  # try every keyword
            if keyword in description.lower():  # case-insensitive substring check
                detail['age_range'] = keyword  # keyword found --> add age_range to the entry
print(data)
# save the result to a file named data.json
with open('data.json', 'w') as f:
    json.dump(data, f)
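One caveat with the substring check `keyword in description.lower()`: it also matches inside longer words, so a description containing "thirteen" would be tagged "teen". A minimal sketch of a stricter variant using word boundaries follows. The names `AGE_PATTERN` and `find_age_range` are hypothetical helpers introduced here for illustration, not part of the answer above.

```python
import re

AGE_KEYWORDS = ["kids", "adults", "teen"]  # same keywords as in the answer

# \b restricts matches to whole words; re.IGNORECASE replaces the .lower() call
AGE_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, AGE_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def find_age_range(description):
    """Return the first keyword found as a whole word, or None if none matches."""
    match = AGE_PATTERN.search(description)
    return match.group(1).lower() if match else None

print(find_age_range("Book destined for Teen"))  # teen
print(find_age_range("Chapter thirteen"))        # None
```

The trade-off is that whole-word matching no longer catches variants such as "teens" or "kid", so the keyword list would need to spell those out if they can occur in the data.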

But I don't quite understand where you want to use pandas?

Here is my JSON output:

[
   {
      "name":"Collection 1",
      "details":[
         {
            "id":302,
            "description":"Book destined for kids",
            "age_range":"kids"
         },
         {
            "id":304,
            "description":"Book destined for Teen",
            "age_range":"teen"
         },
         {
            "id":305,
            "description":"Only for teen under the age of 13",
            "age_range":"teen"
         }
      ]
   },
   {
      "name":"Collection 1",
      "details":[
         {
            "id":400,
            "description":"books for adults to read",
            "age_range":"adults"
         }
      ]
   }
]
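If pandas is still wanted, one possible approach is to build a small DataFrame per collection, derive the new column with `Series.str.extract`, and write the rows back so that the nesting is preserved. This is only a sketch under a few assumptions: the data is already loaded into a Python list called `data`, every collection has at least one entry in `details`, and the variable names are illustrative. For data of this size the plain loop is likely just as fast.

```python
import re
import pandas as pd

# Sample data mirroring the structure from the question
data = [
    {"name": "Collection 1", "details": [
        {"id": 302, "description": "Book destined for kids"},
        {"id": 304, "description": "Book destined for Teen"},
        {"id": 305, "description": "Only for teen under the age of 13"},
    ]},
    {"name": "Collection 1", "details": [
        {"id": 400, "description": "books for adults to read"},
    ]},
]

age_keywords = ["kids", "adults", "teen"]
# One capture group that matches any of the keywords
pattern = "(" + "|".join(map(re.escape, age_keywords)) + ")"

for collection in data:
    details = pd.DataFrame(collection["details"])
    # Lowercase first, then pull out the first keyword that occurs;
    # rows without any keyword get NaN in the new column
    details["age_range"] = (
        details["description"].str.lower().str.extract(pattern, expand=False)
    )
    # Convert back to a list of dicts so the original nesting is kept
    collection["details"] = details.to_dict(orient="records")

print(data)
```

Rows that match no keyword end up with NaN, which is not valid in strict JSON, so such entries would need to be dropped or replaced before calling `json.dump`.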

This question, "python - Split a string value, then create a new value using pandas", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/71573805/
