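I have a DataFrame that looks like this
I want to parse the data in the ratings column so that it looks like
I tried exploding the data using } as the delimiter: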
#explode ratings by title
df['ratings'] = df['ratings'].str.split('}')
df_explode_ratings = df.explode('ratings').reset_index(drop=True)
cols = list(df_explode_ratings.columns)
cols.append(cols.pop(cols.index('title')))
df_explode_ratings = df_explode_ratings[cols]
df_explode_cols = ['title', 'ratings']
df_explode_ratings = df_explode_ratings.drop(columns=[col for col in df_explode_ratings if col not in df_explode_cols])
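This works, but I still need to parse the result further. I planned to split it again on another delimiter, but I end up with NaN values in the ratings column.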
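Best answer
Does your "Ratings" column hold strings or lists of dicts? If it holds strings, you can apply ast.literal_eval and then explode the column (if it already holds lists of dicts, you can skip the literal_eval step):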
from ast import literal_eval
df.Ratings = df.Ratings.apply(literal_eval)
df = df.explode("Ratings")
df["Rating"] = df.apply(lambda x: x["Ratings"]["name"], axis=1)
df["Count"] = df.apply(lambda x: x["Ratings"]["count"], axis=1)
df = df.drop(columns="Ratings")
print(df)
Prints:
                           Title        Rating  Count
0    Do schools kill creativity?         Funny  19645
0    Do schools kill creativity?     Beautiful   4573
0    Do schools kill creativity?     Ingenious   6073
0    Do schools kill creativity?    Courageous   3253
0    Do schools kill creativity?    Longwinded    387
0    Do schools kill creativity?     Confusing    242
0    Do schools kill creativity?   Informative   7346
0    Do schools kill creativity?   Fascinating  10581
0    Do schools kill creativity?  Unconvincing    300
0    Do schools kill creativity?    Persuasive  10704
0    Do schools kill creativity?  Jaw-dropping   4439
0    Do schools kill creativity?            OK   1174
0    Do schools kill creativity?     Obnoxious    209
0    Do schools kill creativity?     Inspiring  24924
1  Simple designs to save a life     Ingenious    269
1  Simple designs to save a life    Courageous     92
1  Simple designs to save a life         Funny    131
1  Simple designs to save a life     Confusing     42
1  Simple designs to save a life     Beautiful     91
1  Simple designs to save a life   Informative    446
1  Simple designs to save a life     Inspiring    397
1  Simple designs to save a life   Fascinating    515
1  Simple designs to save a life    Longwinded     45
1  Simple designs to save a life  Unconvincing     49
1  Simple designs to save a life    Persuasive   1234
1  Simple designs to save a life            OK     73
1  Simple designs to save a life  Jaw-dropping    139
1  Simple designs to save a life     Obnoxious     21
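For anyone who wants to try the approach without the original dataset, here is a minimal self-contained sketch. The sample strings are hypothetical: the question does not show the raw column contents, so the layout (a stringified list of dicts with "name" and "count" keys) is an assumption inferred from the answer's code. It also extracts the fields column-wise instead of with a row-wise df.apply(..., axis=1), which is a small variation on the answer rather than the answer itself.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample data; the stringified list-of-dicts layout is an assumption.
df = pd.DataFrame(
    {
        "Title": ["Do schools kill creativity?", "Simple designs to save a life"],
        "Ratings": [
            "[{'name': 'Funny', 'count': 19645}, {'name': 'Beautiful', 'count': 4573}]",
            "[{'name': 'Ingenious', 'count': 269}, {'name': 'Courageous', 'count': 92}]",
        ],
    }
)

# Convert each string into a real Python list of dicts.
df["Ratings"] = df["Ratings"].apply(literal_eval)

# Give every dict its own row; the Title value is repeated for each one.
df = df.explode("Ratings")

# Pull the two fields out of each dict, working on the column directly.
df["Rating"] = df["Ratings"].apply(lambda d: d["name"])
df["Count"] = df["Ratings"].apply(lambda d: d["count"])

# Drop the now-redundant dict column and renumber the rows.
df = df.drop(columns="Ratings").reset_index(drop=True)
print(df)
```

The reset_index(drop=True) call is optional; without it the exploded rows keep the repeated index of their source row, as in the output above.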
But as suggested in the comments, it is better to process/parse the data before creating the DataFrame.
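One way to read that suggestion is to flatten the raw records into one row per rating first, and only then build the DataFrame. The sketch below assumes the raw data is available as a list of dicts with "title" and "ratings" keys, where "ratings" is a stringified list of dicts. Those names, the sample values, and the flatten helper are all hypothetical, since the question does not show how the data is loaded.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical raw records, as they might look before any DataFrame exists.
raw_records = [
    {
        "title": "Do schools kill creativity?",
        "ratings": "[{'name': 'Funny', 'count': 19645}, {'name': 'Beautiful', 'count': 4573}]",
    },
    {
        "title": "Simple designs to save a life",
        "ratings": "[{'name': 'Ingenious', 'count': 269}, {'name': 'Courageous', 'count': 92}]",
    },
]


def flatten(records):
    """Yield one flat row per (title, rating) pair. Hypothetical helper."""
    for record in records:
        for rating in literal_eval(record["ratings"]):
            yield {
                "Title": record["title"],
                "Rating": rating["name"],
                "Count": rating["count"],
            }


# The DataFrame is created already in its final shape, so no explode step is needed.
flat_df = pd.DataFrame(list(flatten(raw_records)))
print(flat_df)
```

Doing the parsing in plain Python keeps the nested structure out of the DataFrame entirely, which avoids object columns holding lists or dicts.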
Regarding python - how to extract certain parts of a string from a column to create additional columns in Pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67115140/