Python - Grouping rows into a list in a pandas dataframe

Tags: python list pandas group-by

I have a dataframe like this:

long      lat     Place
-6.779    61.9    Aarhus
-6.790    62.0    Aarhus
54.377    24.4    Dhabi
38.834    9.0     Addis
35.698    9.2     Addis

Is it possible to transform the dataframe into a format like the one below?

Office    long + lat
Aarhus    [[-6.779, 61.9], [-6.790, 62.0]]
Dhabi     [[54.377, 24.4]]
Addis     [[38.834, 9.0], [35.698, 9.2]]

I tried different methods but still couldn't work this out. This is what I tried to get a list for each distinct place value:

df2["index"] = df2.index
df2["long"]=df2.groupby('index')['long'].apply(list)
list1 = []
for values in ofce_list:
    if df['Office'].any() == values:
        list1.append(df.loc[df['Office'] == values, 'long'])

But this returned a list of Series instead, which is not what I want. Please help. Thank you so much.

Best answer

(df.groupby('Place')[['long', 'lat']]
   .apply(lambda x: x.values.tolist())
   .reset_index(name='long + lat'))
Out[1380]: 
    Place                       long + lat
0  Aarhus  [[-6.779, 61.9], [-6.79, 62.0]]
1   Addis   [[38.834, 9.0], [35.698, 9.2]]
2   Dhabi     [[54.376999999999995, 24.4]]
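
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same groupby approach. It rebuilds the sample data from the question by hand. The helper name `group_coordinates` is made up for illustration, and renaming `Place` to `Office` is an assumption based on the header shown in the desired output.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "long": [-6.779, -6.790, 54.377, 38.834, 35.698],
    "lat": [61.9, 62.0, 24.4, 9.0, 9.2],
    "Place": ["Aarhus", "Aarhus", "Dhabi", "Addis", "Addis"],
})


def group_coordinates(frame):
    """Collect the [long, lat] pairs of each place into one nested list.

    Hypothetical helper: the name and the final rename to "Office" are
    illustrative assumptions, not part of the original answer.
    """
    grouped = (
        frame.groupby("Place")[["long", "lat"]]
        # Each group is a two-column DataFrame; turn it into a list of
        # [long, lat] pairs made of plain Python floats.
        .apply(lambda g: g.to_numpy().tolist())
        # The result is a Series indexed by Place; make it a DataFrame.
        .reset_index(name="long + lat")
    )
    return grouped.rename(columns={"Place": "Office"})


result = group_coordinates(df)
print(result)
```

Note that groupby sorts the group keys by default, so the rows come back in alphabetical order of the place name rather than in the order they first appear. Passing `sort=False` to `groupby` should keep the original order if that matters.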

Regarding Python - grouping rows into a list in a pandas dataframe, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47162248/
