I have a DataFrame like the one below:
import pandas as pd
df = pd.DataFrame({'type_a': [1,0,0,0,0,1,0,0,0,1],
'type_b': [0,1,0,0,0,0,0,0,1,1],
'type_c': [0,0,1,1,1,1,0,0,0,0],
'type_d': [1,0,0,0,0,1,1,0,1,0],
})
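I want to create a new column based on these 4 columns. If exactly one of them equals 1, the new column should hold that column's name; if several of them equal 1 at the same time, it should hold a list of those column names; otherwise it should be NaN.
The output DataFrame would look like this: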
df = pd.DataFrame({'type_a': [1,0,0,0,0,1,0,0,0,1],
'type_b': [0,1,0,0,0,0,0,0,1,1],
'type_c': [0,0,1,1,1,1,0,0,0,0],
'type_d': [1,0,0,0,0,1,1,0,1,0],
'type':[['type_a','type_d'], 'type_b', 'type_c', 'type_c','type_c', ['type_a','type_c','type_d'], 'type_d', 'nan', ['type_b','type_d'],['type_a','type_b']]
})
Any help would be greatly appreciated. Thanks!
Best answer
Here is another way to do it:
import pandas as pd
df['type'] = (pd.melt(df.reset_index(), id_vars='index')
.query('value == 1')
.groupby('index')['variable']
.apply(list))
   type_a  type_b  type_c  type_d                      type
0       1       0       0       1          [type_a, type_d]
1       0       1       0       0                  [type_b]
2       0       0       1       0                  [type_c]
3       0       0       1       0                  [type_c]
4       0       0       1       0                  [type_c]
5       1       0       1       1  [type_a, type_c, type_d]
6       0       0       0       1                  [type_d]
7       0       0       0       0                       NaN
8       0       1       0       1          [type_b, type_d]
9       1       1       0       0          [type_a, type_b]
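Note that the output above wraps single matches in a one-element list (for example [type_b]), whereas the question asked for the bare column name in that case. If that distinction matters, one possible variation is sketched below. It is not part of the original answer: the names type_cols and collapse_names are made up for illustration, and it uses a plain Python loop over the rows rather than melt/groupby.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'type_a': [1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
    'type_b': [0, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    'type_c': [0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    'type_d': [1, 0, 0, 0, 0, 1, 1, 0, 1, 0],
})

# Hypothetical helper names, chosen for this sketch only.
type_cols = ['type_a', 'type_b', 'type_c', 'type_d']


def collapse_names(names):
    """Return NaN for no match, a bare string for one match, a list otherwise."""
    if not names:
        return np.nan
    if len(names) == 1:
        return names[0]
    return names


# Boolean matrix: True wherever a type column equals 1.
flags = df[type_cols].eq(1).to_numpy()

# For every row, collect the names of the flagged columns, then collapse them.
values = [
    collapse_names([col for col, hit in zip(type_cols, row) if hit])
    for row in flags
]

# dtype=object keeps the mix of strings, lists, and NaN in one column.
df['type'] = pd.Series(values, index=df.index, dtype=object)
print(df)
```

A column that mixes strings, lists, and NaN can be awkward to work with later, so keeping lists everywhere (as the melt approach does) may still be the more convenient choice, depending on how the column is used downstream.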
On the topic of python - is there a way to "unstack" a DataFrame and return the values as lists, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73887700/