I have a DataFrame like this:
     fromJobtitle      toJobtitle  size
0             CEO             CEO    65
1             CEO  Vice President    23
2             CEO        Employee    56
3  Vice President             CEO   112
4        Employee             CEO    20
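I want to count the co-occurrences so that both directions of a pair are combined, i.e. show only the total size between each two job titles regardless of which one is "from" and which is "to".
Example output: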
0  CEO  Vice President  135
1  CEO        Employee   76
2  CEO             CEO   65
Best answer
import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20]
})

# Build an order-independent key: sorting the two titles makes
# (A, B) and (B, A) map to the same tuple.
df['combination'] = df.apply(lambda row: tuple(sorted([
    row['fromJobtitle'],
    row['toJobtitle']
])), axis=1)
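Then sum the sizes per combination (selecting 'size' explicitly so that the string columns are not aggregated as well, which recent pandas versions would otherwise do):
df = df.groupby('combination')['size'].sum().reset_index()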
Result:
             combination  size
0             (CEO, CEO)    65
1        (CEO, Employee)    76
2  (CEO, Vice President)   135
Finally, split the tuple back into two columns:
df['from'] = df.apply(lambda row: row['combination'][0], axis=1)
df['to'] = df.apply(lambda row: row['combination'][1], axis=1)
df = df.drop('combination', axis=1)
df.head()
Result:
   size  from              to
0    65   CEO             CEO
1    76   CEO        Employee
2   135   CEO  Vice President
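The steps above can also be sketched without row-wise apply. This is only one possible variant, not the answer's own method: it assumes NumPy is available, sorts the two title columns within each row with np.sort, and then groups on the sorted pair. The names pairs and result are made up for illustration. The final sort by size is added only to mirror the ordering in the question's example output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20],
})

# Sort the two titles within each row so that (A, B) and (B, A)
# become the same ordered pair.
sorted_titles = np.sort(df[['fromJobtitle', 'toJobtitle']].to_numpy(), axis=1)

# 'pairs' is a hypothetical helper frame holding the normalized pair and its size.
pairs = pd.DataFrame(sorted_titles, columns=['from', 'to'])
pairs['size'] = df['size'].to_numpy()

# Sum the sizes per unordered pair, largest total first.
result = (
    pairs.groupby(['from', 'to'], as_index=False)['size']
         .sum()
         .sort_values('size', ascending=False)
         .reset_index(drop=True)
)
print(result)
```

For a frame with many rows this avoids calling a Python lambda once per row, although for five rows the difference is irrelevant.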
For more on python - extracting co-occurrence data from a data frame, see the similar question on Stack Overflow: https://stackoverflow.com/questions/67822416/