python - Combinations of rows in Pandas with a specific format

Tags: python python-3.x pandas csv python-itertools

My dataset looks like this:

Col1    Col2    Col3
A       10      x1
B       100     x2
C       1000    x3

This is the output I want:

Col1    Col2    Col3    Col4    Col5    Col6    Col7    Col8    Col9
A       10      x1      Empty   Empty   Empty   Empty   Empty   Empty
B       100     x2      Empty   Empty   Empty   Empty   Empty   Empty
C       1000    x3      Empty   Empty   Empty   Empty   Empty   Empty
A       10      x1      B       100     x2      Empty   Empty   Empty
B       100     x2      C       1000    x3      Empty   Empty   Empty
A       10      x1      B       100     x2      C       1000    x3

I can use itertools.combinations to get the various combinations of A, B, and C, but how do I get this table?

Best answer

Use itertools.combinations together with itertools.chain.from_iterable:

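import itertools
import pandas as pd

# Flatten each combination of rows into one list, for every size r = 1..len(df).
arr = list(itertools.chain.from_iterable(
    [[value for row in combo for value in row]
     for combo in itertools.combinations(df.values.tolist(), r)]
    for r in range(1, len(df) + 1)
))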

pd.DataFrame(arr)

   0     1   2     3       4     5     6       7     8
0  A    10  x1  None     NaN  None  None     NaN  None
1  B   100  x2  None     NaN  None  None     NaN  None
2  C  1000  x3  None     NaN  None  None     NaN  None
3  A    10  x1     B   100.0    x2  None     NaN  None
4  A    10  x1     C  1000.0    x3  None     NaN  None
5  B   100  x2     C  1000.0    x3  None     NaN  None
6  A    10  x1     B   100.0    x2     C  1000.0    x3
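
The output above uses integer column labels and None/NaN for the missing cells, and the padding turns column 4 into floats. As a minimal sketch of getting closer to the table in the question, the rows can be padded in plain Python before building the DataFrame. The names `rows`, `flat_rows`, `padded`, and `result` are made up for this illustration, and the `Col1`..`Col9` labels and the "Empty" placeholder are taken from the question's desired output.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": ["A", "B", "C"],
    "Col2": [10, 100, 1000],
    "Col3": ["x1", "x2", "x3"],
})

rows = df.values.tolist()

# For every combination size r, flatten each combination of rows into one list.
flat_rows = [
    list(itertools.chain.from_iterable(combo))
    for r in range(1, len(rows) + 1)
    for combo in itertools.combinations(rows, r)
]

# Pad the shorter rows with the placeholder so every row has the same width.
width = max(len(row) for row in flat_rows)
padded = [row + ["Empty"] * (width - len(row)) for row in flat_rows]

result = pd.DataFrame(padded, columns=[f"Col{k + 1}" for k in range(width)])
print(result)
```

Because the padding happens before pandas sees the data, no NaN is introduced and the integers are not converted to floats. Note that this produces all seven combinations, including the A/C pair.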

Another option, using concat:

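# One DataFrame per combination size; each cell holds one original row as a list.
out = pd.concat(
    [pd.DataFrame(list(itertools.combinations(df.values.tolist(), i)))
     for i in range(1, len(df) + 1)]
)

# Replace NaN cells with [] so the lists in each row can be joined by sum,
# then expand each joined list into columns.
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map.)
out.applymap(lambda x: [] if isinstance(x, float) else x).sum(1).apply(pd.Series)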

   0     1   2    3       4    5    6       7    8
0  A    10  x1  NaN     NaN  NaN  NaN     NaN  NaN
1  B   100  x2  NaN     NaN  NaN  NaN     NaN  NaN
2  C  1000  x3  NaN     NaN  NaN  NaN     NaN  NaN
0  A    10  x1    B   100.0   x2  NaN     NaN  NaN
1  A    10  x1    C  1000.0   x3  NaN     NaN  NaN
2  B   100  x2    C  1000.0   x3  NaN     NaN  NaN
0  A    10  x1    B   100.0   x2    C  1000.0   x3
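
The index in this output repeats (0, 1, 2, 0, 1, 2, 0) because each per-size frame keeps its own index. One possible variant, sketched here rather than taken from the answer, flattens each combination before concatenating and passes `ignore_index=True`, which avoids both the repeated index and the element-wise `applymap` step. The names `rows` and `parts` are hypothetical.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": ["A", "B", "C"],
    "Col2": [10, 100, 1000],
    "Col3": ["x1", "x2", "x3"],
})

rows = df.values.tolist()

# One DataFrame per combination size r, with 3 * r columns each.
parts = [
    pd.DataFrame(
        [list(itertools.chain.from_iterable(combo))
         for combo in itertools.combinations(rows, r)]
    )
    for r in range(1, len(rows) + 1)
]

# concat aligns on column labels and fills the missing cells with NaN;
# ignore_index=True renumbers the rows 0..n-1.
out = pd.concat(parts, ignore_index=True)
print(out)
```

The missing cells are still NaN here, so the numeric columns that contain gaps are stored as floats, as in the outputs above.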

This question and its answers come from Stack Overflow: https://stackoverflow.com/questions/51700452/
