(or a list of lists... I just edited this)
Is there an existing Python/pandas method to convert a structure like this
food2 = {}
food2["apple"] = ["fruit", "round"]
food2["bananna"] = ["fruit", "yellow", "long"]
food2["carrot"] = ["veg", "orange", "long"]
food2["raddish"] = ["veg", "red"]
into a pivot table like this?
+---------+-------+-----+-------+------+--------+--------+-----+
| | fruit | veg | round | long | yellow | orange | red |
+---------+-------+-----+-------+------+--------+--------+-----+
| apple | 1 | | 1 | | | | |
+---------+-------+-----+-------+------+--------+--------+-----+
| bananna | 1 | | | 1 | 1 | | |
+---------+-------+-----+-------+------+--------+--------+-----+
| carrot | | 1 | | 1 | | 1 | |
+---------+-------+-----+-------+------+--------+--------+-----+
| raddish | | 1 | | | | | 1 |
+---------+-------+-----+-------+------+--------+--------+-----+
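Naively, I could just loop over the dictionary. I know how to use map on each inner list, but I don't know how to join/stack the results across the dictionary. Once they are joined, I could use pandas.pivot_table.
import pandas as pd

for key in food2:
    attrlist = food2[key]
    # Pair the key with each of its attributes
    onefruit_pairs = list(map(lambda x: [key, x], attrlist))
    one_fruit_frame = pd.DataFrame(onefruit_pairs, columns=['fruit', 'attr'])
    print(one_fruit_frame)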
     fruit    attr
0  bananna   fruit
1  bananna  yellow
2  bananna    long
    fruit    attr
0  carrot     veg
1  carrot  orange
2  carrot    long
   fruit   attr
0  apple  fruit
1  apple  round
     fruit  attr
0  raddish   veg
1  raddish   red
Best answer
An answer using pandas.
import pandas as pd

# Test data
food2 = {}
food2["apple"] = ["fruit", "round"]
food2["bananna"] = ["fruit", "yellow", "long"]
food2["carrot"] = ["veg", "orange", "long"]
food2["raddish"] = ["veg", "red"]
# Wrap each list in a Series so lists of unequal length are padded with NaN
df = pd.DataFrame({k: pd.Series(v) for k, v in food2.items()})
# melting from wide to long format (one row per item/category pair)
df = pd.melt(df, var_name='item', value_name='categ')
# cross-tabulation
df = pd.crosstab(df['item'], df['categ'])
# sorting index, maybe not necessary
df.sort_index(inplace=True)
df
# categ    fruit  long  orange  red  round  veg  yellow
# item
# apple        1     0       0    0      1    0       0
# bananna      1     1       0    0      0    0       1
# carrot       0     1       1    0      0    1       0
# raddish      0     0       0    1      0    1       0
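A possible alternative, closer to the loop in the question: the join/stack step can be done by collecting every (item, category) pair into one list before building the frame, which skips the NaN padding and the melt. This is only a sketch and not part of the original answer; the names pairs, long_df and table are illustrative.

```python
import pandas as pd

food2 = {
    "apple": ["fruit", "round"],
    "bananna": ["fruit", "yellow", "long"],
    "carrot": ["veg", "orange", "long"],
    "raddish": ["veg", "red"],
}

# One (item, categ) pair per attribute, gathered across all keys at once
pairs = [(item, categ) for item, attrs in food2.items() for categ in attrs]
long_df = pd.DataFrame(pairs, columns=["item", "categ"])

# Count how often each (item, categ) combination occurs
table = pd.crosstab(long_df["item"], long_df["categ"])
print(table)
```

Since each attribute appears at most once per item here, the counts are all 0 or 1, the same as in the table above.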
Regarding python - converting a ragged dictionary of lists to a pandas DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34727716/