python - Is there a way to "unstack" a DataFrame and return the values as lists?

I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'type_a': [1,0,0,0,0,1,0,0,0,1],
                   'type_b': [0,1,0,0,0,0,0,0,1,1],
                   'type_c': [0,0,1,1,1,1,0,0,0,0],
                   'type_d': [1,0,0,0,0,1,1,0,1,0],
                  })

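I want to create a new column based on these 4 columns. If exactly one of the 4 columns equals 1 in a row, the new column should hold that column's name; if several columns equal 1 in the same row, it should hold a list of those column names; otherwise it should be NaN.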

The output DataFrame would look like this:

df = pd.DataFrame({'type_a': [1,0,0,0,0,1,0,0,0,1],
                   'type_b': [0,1,0,0,0,0,0,0,1,1],
                   'type_c': [0,0,1,1,1,1,0,0,0,0],
                   'type_d': [1,0,0,0,0,1,1,0,1,0],
                   'type':[['type_a','type_d'], 'type_b', 'type_c', 'type_c','type_c', ['type_a','type_c','type_d'], 'type_d', 'nan', ['type_b','type_d'],['type_a','type_b']]
                  })

Any help would be greatly appreciated. Thanks!

Best answer

Here is another way to do it:

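import pandas as pd

# Reshape to long form (one row per original row/column pair), keep only
# the cells equal to 1, then collect the column names per original row.
# Rows with no 1s form no group, so index alignment leaves them as NaN.
df['type'] = (pd.melt(df.reset_index(), id_vars='index')
 .query('value == 1')
 .groupby('index')['variable']
 .apply(list))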


   type_a  type_b  type_c  type_d                      type
0       1       0       0       1          [type_a, type_d]
1       0       1       0       0                  [type_b]
2       0       0       1       0                  [type_c]
3       0       0       1       0                  [type_c]
4       0       0       1       0                  [type_c]
5       1       0       1       1  [type_a, type_c, type_d]
6       0       0       0       1                  [type_d]
7       0       0       0       0                       NaN
8       0       1       0       1          [type_b, type_d]
9       1       1       0       0          [type_a, type_b]
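
Note that the result above holds a list in every matched row, including rows with a single match, whereas the question asks for a bare column name in that case. Below is a minimal sketch of one way to get that exact shape, assuming a row-wise apply is fast enough for the data size. The names type_cols and collapse are illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'type_a': [1,0,0,0,0,1,0,0,0,1],
                   'type_b': [0,1,0,0,0,0,0,0,1,1],
                   'type_c': [0,0,1,1,1,1,0,0,0,0],
                   'type_d': [1,0,0,0,0,1,1,0,1,0],
                  })

# Hypothetical helper list: the indicator columns to inspect.
type_cols = ['type_a', 'type_b', 'type_c', 'type_d']

def collapse(row):
    """Return NaN, a single column name, or a list of names for one row."""
    names = [col for col in type_cols if row[col] == 1]
    if not names:
        return np.nan          # no column is set
    if len(names) == 1:
        return names[0]        # exactly one match: bare string
    return names               # several matches: list of names

df['type'] = df[type_cols].apply(collapse, axis=1)
print(df)
```

The column ends up with object dtype because it mixes strings, lists, and NaN. If downstream code is simpler with one uniform type, keeping lists everywhere (as in the answer above) may be the better choice.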

For this topic, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/73887700/
