python - Pandas: get combinations of rows and group them

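I have a df

I need to find all combinations of the groups (say, pairs of 2) and then group them by unique ID

Output:

So far I have found a way to generate all the combinations, but I can't seem to group them by unique ID

I also referred to the following link: Pandas find all combinations of rows under a budget

Code to generate the pairs: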

from itertools import combinations

# Collect every unordered pair of distinct group labels
li_4 = []
for pair in combinations(df.group.unique(), 2):
    li_4.append([pair[0], pair[1]])
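
For illustration, here is a minimal runnable sketch of the pair-generation step. The frame below is hypothetical (the original df is not shown); only the column name `group` is taken from the snippet above. Note that the pairs are built from the whole column, so nothing in the result records which ID a pair came from, which is presumably why the grouping step is still missing.

```python
from itertools import combinations

import pandas as pd

# Hypothetical data: stands in for the df from the question, which is not shown
df = pd.DataFrame({"group": ["a", "b", "c", "b", "a"]})

# unique() keeps order of first appearance; combinations() yields each
# unordered pair of distinct labels exactly once
pairs = [list(pair) for pair in combinations(df["group"].unique(), 2)]
print(pairs)
```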

Best answer

We can first do a merge, then apply np.sort, and, after removing duplicates with drop_duplicates, pass the result to crosstab:

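import numpy as np
import pandas as pd

# Self-merge on Id: pairs up every two groups that share an Id
s = df.merge(df,on='Id')
# Sort each pair so that (a, b) and (b, a) get the same label
s['New'] = [''.join(x) for x in np.sort(s[['Group_x','Group_y']].values,axis=1)]
s = s.drop_duplicates(['Id','New'])
s = pd.crosstab(s.Id,s.New)
s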
Out[88]: 
New  aa  ab  ac  ad  af  bb  bc  bd  be  bf  cc  cd  dd  de  ee  ff
Id                                                                 
2     1   1   1   1   0   1   1   1   0   0   1   1   1   0   0   0
3     0   0   0   0   0   1   0   1   1   0   0   0   1   1   1   0
4     1   1   0   0   1   1   0   0   0   1   0   0   0   0   0   1
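
The steps in this answer can be tried end to end with the sketch below. The input frame is an assumption: the original df is not shown, so the `Id`/`Group` values here were chosen only to be consistent with the crosstab printed above. The `strict` variant at the end is also an addition, not part of the answer; it keeps only pairs of two different groups, which may be closer to what `itertools.combinations` produces.

```python
import numpy as np
import pandas as pd

# Hypothetical input, chosen only to be consistent with the output shown above
df = pd.DataFrame({
    "Id": [2, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "Group": list("abcd") + list("bde") + list("abf"),
})

# Self-merge on Id: every ordered pair of groups sharing an Id,
# including self-pairs such as (a, a)
s = df.merge(df, on="Id")

# Sort within each row so (b, a) and (a, b) collapse to the same label
sorted_pairs = np.sort(s[["Group_x", "Group_y"]].to_numpy(), axis=1)
s["New"] = ["".join(p) for p in sorted_pairs]

# Keep one row per (Id, pair label), then tabulate labels per Id
s = s.drop_duplicates(["Id", "New"])
result = pd.crosstab(s["Id"], s["New"])
print(result)

# Assumed variant (not in the answer): keep only pairs of distinct groups
m = df.merge(df, on="Id")
strict = m[m["Group_x"] < m["Group_y"]].copy()
strict["Pair"] = strict["Group_x"] + strict["Group_y"]
strict_tab = pd.crosstab(strict["Id"], strict["Pair"])
print(strict_tab)
```

Because the self-merge pairs each row with itself, the answer's table includes labels such as `aa` and `bb`; filtering with `Group_x < Group_y` drops those and also makes the sort step unnecessary.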

For more on "python - Pandas: get combinations of rows and group them", see the similar question on Stack Overflow: https://stackoverflow.com/questions/67595026/
