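To avoid repeating the same user, I want to use the pandas groupby function to neatly organize the data into a nested dictionary of the form {k: artist1, artist2, artist3, etc}. Here is some sample data. My instinct is to chain an aggregation function onto something like df.groupby('users')?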
users artist
0 00001411dc427966b17297bf4d69e7e193135d89 the most serene republic
1 00001411dc427966b17297bf4d69e7e193135d89 stars
2 00001411dc427966b17297bf4d69e7e193135d89 broken social scene
3 00001411dc427966b17297bf4d69e7e193135d89 have heart
4 00001411dc427966b17297bf4d69e7e193135d89 luminous orange
5 00001411dc427966b17297bf4d69e7e193135d89 boris
6 00001411dc427966b17297bf4d69e7e193135d89 arctic monkeys
7 00001411dc427966b17297bf4d69e7e193135d89 bright eyes
8 00001411dc427966b17297bf4d69e7e193135d89 coaltar of the deepers
9 00001411dc427966b17297bf4d69e7e193135d89 polar bear club
10 00001411dc427966b17297bf4d69e7e193135d89 the libertines
11 00001411dc427966b17297bf4d69e7e193135d89 death from above 1979
12 00001411dc427966b17297bf4d69e7e193135d89 owl city
13 00001411dc427966b17297bf4d69e7e193135d89 coldplay
14 00001411dc427966b17297bf4d69e7e193135d89 okkervil river
15 00001411dc427966b17297bf4d69e7e193135d89 jim sturgess
16 00001411dc427966b17297bf4d69e7e193135d89 deerhoof
17 00001411dc427966b17297bf4d69e7e193135d89 fear before the march of flames
18 00001411dc427966b17297bf4d69e7e193135d89 breathe carolina
19 00001411dc427966b17297bf4d69e7e193135d89 mstrkrft
Best answer
I believe you are looking for groupby followed by an aggregation such as apply(list) here:
df.groupby('users').artist.apply(list).to_dict()
{'00001411dc427966b17297bf4d69e7e193135d89': ['the most serene republic',
'stars',
'broken social scene',
'have heart',
'luminous orange',
'boris',
...
]
}
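For readers who want to try this without the original dataset, here is a minimal sketch. The user IDs "u1" and "u2" and the shortened artist lists are made up for illustration and are not from the question's data. It also shows that agg(list) should give the same mapping as apply(list) in this case.

```python
import pandas as pd

# Hypothetical miniature of the question's data: two users with made-up IDs.
df = pd.DataFrame({
    "users": ["u1", "u1", "u1", "u2", "u2"],
    "artist": ["stars", "boris", "coldplay", "deerhoof", "stars"],
})

# Group rows by user, collect each user's artists into a list,
# then turn the resulting Series (indexed by user) into a plain dict.
by_apply = df.groupby("users")["artist"].apply(list).to_dict()

# agg(list) is an alternative spelling of the same aggregation.
by_agg = df.groupby("users")["artist"].agg(list).to_dict()

print(by_apply)
# {'u1': ['stars', 'boris', 'coldplay'], 'u2': ['deerhoof', 'stars']}
```

Each user appears once as a key, and the artists keep the order they had in the DataFrame.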
Regarding "python - pandas dataframe - grouping artists by unique user", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48493370/