My dataset looks like this:
Col1 Col2 Col3
A 10 x1
B 100 x2
C 1000 x3
This is the output I want:
Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9
A 10 x1 Empty Empty Empty Empty Empty Empty
B 100 x2 Empty Empty Empty Empty Empty Empty
C 1000 x3 Empty Empty Empty Empty Empty Empty
A 10 x1 B 100 x2 Empty Empty Empty
B 100 x2 C 1000 x3 Empty Empty Empty
A 10 x1 B 100 x2 C 1000 x3
I can use itertools.combinations to get the various combinations of A, B, and C, but how do I get this table?
Best answer
Use itertools.combinations and itertools.chain.from_iterable:
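import itertools
import pandas as pd

# df is the DataFrame from the question
arr = list(itertools.chain.from_iterable(
    # flatten each combination of r rows into a single list
    [[value for row in combo for value in row]
     for combo in itertools.combinations(df.values.tolist(), r)]
    for r in range(1, len(df) + 1)
))
pd.DataFrame(arr)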
0 1 2 3 4 5 6 7 8
0 A 10 x1 None NaN None None NaN None
1 B 100 x2 None NaN None None NaN None
2 C 1000 x3 None NaN None None NaN None
3 A 10 x1 B 100.0 x2 None NaN None
4 A 10 x1 C 1000.0 x3 None NaN None
5 B 100 x2 C 1000.0 x3 None NaN None
6 A 10 x1 B 100.0 x2 C 1000.0 x3
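The result above pads short rows with None/NaN (and upcasts some integers to float) rather than using the literal Empty shown in the question. A minimal sketch of one way to get closer to the requested table, assuming it is acceptable to pad each row in plain Python before building the DataFrame; the names rows, flat, padded, and result are illustrative and not part of the original answer:

```python
import itertools

import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame(
    {"Col1": ["A", "B", "C"], "Col2": [10, 100, 1000], "Col3": ["x1", "x2", "x3"]}
)

rows = df.values.tolist()

# For every combination size r, flatten each combination of rows into one list.
flat = [
    [value for row in combo for value in row]
    for r in range(1, len(rows) + 1)
    for combo in itertools.combinations(rows, r)
]

# Pad every row to the full width with the string "Empty" (assumed filler).
width = len(rows) * df.shape[1]
padded = [row + ["Empty"] * (width - len(row)) for row in flat]

# Generate Col1..Col9 to match the header in the question.
columns = [f"Col{k}" for k in range(1, width + 1)]
result = pd.DataFrame(padded, columns=columns)
print(result)
```

Padding before constructing the DataFrame avoids the NaN placeholders entirely, so the numeric values are never converted to float. Note that this produces all seven combinations, including A with C, which the table in the question leaves out.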
Another option, using concat:
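out = pd.concat(
    [pd.DataFrame(list(itertools.combinations(df.values.tolist(), i)))
     for i in range(1, len(df) + 1)]
)
# Each cell holds one original row as a list. Replace the NaN padding (floats)
# with empty lists, concatenate the lists across each row, then expand to columns.
# Note: applymap is deprecated since pandas 2.1 in favour of DataFrame.map.
out.applymap(lambda x: [] if isinstance(x, float) else x).sum(1).apply(pd.Series)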
0 1 2 3 4 5 6 7 8
0 A 10 x1 NaN NaN NaN NaN NaN NaN
1 B 100 x2 NaN NaN NaN NaN NaN NaN
2 C 1000 x3 NaN NaN NaN NaN NaN NaN
0 A 10 x1 B 100.0 x2 NaN NaN NaN
1 A 10 x1 C 1000.0 x3 NaN NaN NaN
2 B 100 x2 C 1000.0 x3 NaN NaN NaN
0 A 10 x1 B 100.0 x2 C 1000.0 x3
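The index in this output repeats (0, 1, 2, 0, 1, 2, 0) because each per-size frame keeps its own row labels. A minimal sketch, assuming a clean 0..6 index is wanted: passing ignore_index=True to pd.concat renumbers the rows. For simplicity the rows are flattened before concatenation here, which differs from the applymap/sum step above; the names rows, frames, and out are illustrative:

```python
import itertools

import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame(
    {"Col1": ["A", "B", "C"], "Col2": [10, 100, 1000], "Col3": ["x1", "x2", "x3"]}
)
rows = df.values.tolist()

# One frame per combination size, each row already flattened into scalars.
frames = [
    pd.DataFrame(
        [[value for row in combo for value in row]
         for combo in itertools.combinations(rows, r)]
    )
    for r in range(1, len(rows) + 1)
]

# ignore_index=True discards the per-frame labels and numbers rows 0..n-1.
out = pd.concat(frames, ignore_index=True)
print(out)
```

Columns missing from the narrower frames are still filled with NaN, as in the outputs above.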
Regarding "python - Pandas combinations with format", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51700452/