I have a DataFrame that I group with groupby, as shown below:
Name Nationality age
Peter UK 28
John US 29
Wiley UK 28
Aster US 29
grouped = self_ex_df.groupby(['Nationality', 'age'])
Now I want to attach a unique ID to each value.
I am trying this, but I am not sure whether it works:
uniqueID = 'ID_'+ grouped.groups.keys().astype(str)
uniqueID Name Nationality age
ID_UK28 Peter UK 28
ID_US29 John US 29
ID_UK28 Wiley UK 28
ID_US29 Aster US 29
I now want to combine this into a new DataFrame, like this:
uniqueID Nationality age Text
ID_UK28 UK 28 Peter and Wiley have a combined age of 56
ID_US29 US 29 John and Aster have a combined age of 58
How can I achieve this?
Best answer
Hopefully this is close enough; I couldn't get the average age:
import pandas as pd
#create dataframe
df = pd.DataFrame({'Name': ['Peter', 'John', 'Wiley', 'Aster'], 'Nationality': ['UK', 'US', 'UK', 'US'], 'age': [28, 29, 28, 29]})
#make uniqueID
df['uniqueID'] = 'ID_' + df['Nationality'] + df['age'].astype(str)
#groupby has an agg method that can take a dict and perform multiple aggregations
df = df.groupby(['uniqueID', 'Nationality']).agg({'age': 'sum', 'Name': lambda x: ' and '.join(x)})
#to get text you just combine new Name and sum of age
df['Text'] = df['Name'] + ' have a combined age of ' + df['age'].astype(str)
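Note that the code above replaces age with the group sum and leaves uniqueID and Nationality in the index. A possible variant that reproduces the exact table from the question is sketched below. It is only one way to do it, and it assumes the ID is always built from Nationality and age; the names summary, names, and combined_age are made up for illustration. It also sidesteps the attempt in the question: grouped.groups is a dict, so its keys() view has no astype method.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    'Name': ['Peter', 'John', 'Wiley', 'Aster'],
    'Nationality': ['UK', 'US', 'UK', 'US'],
    'age': [28, 29, 28, 29],
})

# Build the ID directly from the two grouping columns
df['uniqueID'] = 'ID_' + df['Nationality'] + df['age'].astype(str)

# Copy age into a helper column (hypothetical name) so it can be summed
# while the original age stays available as a grouping key
summary = (
    df.assign(combined_age=df['age'])
      .groupby(['uniqueID', 'Nationality', 'age'], as_index=False)
      .agg(names=('Name', ' and '.join),
           combined_age=('combined_age', 'sum'))
)

# Assemble the sentence, then keep only the columns the question asks for
summary['Text'] = (summary['names'] + ' have a combined age of '
                   + summary['combined_age'].astype(str))
summary = summary[['uniqueID', 'Nationality', 'age', 'Text']]

print(summary)
```

Passing as_index=False keeps uniqueID, Nationality, and age as ordinary columns, so no reset_index call is needed afterwards.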
Source: a similar question on Stack Overflow, "python - How to apply group by keys to the related groups": https://stackoverflow.com/questions/43719179/