python - Pandas to_sql index with MultiIndex columns

I am trying to write a DataFrame with MultiIndex columns to an MS SQL database. The index is written as NULL. With single-level columns, it works fine.

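import numpy as np
import pandas as pd

# conn is assumed to be an existing connection to the MS SQL database
l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])  # two-level column index
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
df.to_sql('test', conn, if_exists='replace')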

[Image: how it looks in SQL]

Is this a bug, or do I need to do something else to write the index correctly?

Best answer

You can concatenate the sub-frames selected by each first-level column label:

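import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
# Stack the rows of df['foo'] on top of the rows of df['bar']; columns become a, b, c
pd.concat([df['foo'], df['bar']]).to_sql('test', conn, if_exists='replace')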

This results in the following table:

index                a                      b                      c
-------------------- ---------------------- ---------------------- ----------------------
1                    0.803555407060559      0.0185295254735488     0.702949767792433
2                    0.257823384796912      0.985716269729717      0.749719964181681
3                    0.909115063376081      0.236242172285058      0.932813789580215
1                    0.898527697819921      0.874431627680823      0.805393798630385
2                    0.97537971906356       0.319221893730643      0.584449093938984
3                    0.678625747581189      0.606321574437647      0.437746301372623
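In the table above, nothing records whether a row came from 'foo' or 'bar'. One possible variation, offered as a minimal sketch rather than as part of the original answer, passes a dict to pd.concat so that the first-level label becomes an extra index level. The names 'group' and 'id' are hypothetical, and an in-memory SQLite connection stands in for the MS SQL connection so the example can run on its own.

```python
import sqlite3

import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product([['foo', 'bar'], ['a', 'b', 'c']])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)

# One sub-frame per first-level label; the dict keys become an outer index level.
# 'group' and 'id' are illustrative names for the two resulting index levels.
long_df = pd.concat(
    {key: df[key] for key in df.columns.get_level_values(0).unique()},
    names=['group', 'id'],
)

# Assumption: SQLite in memory replaces the MS SQL connection from the question.
conn = sqlite3.connect(':memory:')
long_df.to_sql('test', conn, if_exists='replace')

# Read the table back to inspect what was written.
result = pd.read_sql('SELECT * FROM test', conn)
print(result)
```

Both index levels are written as ordinary columns, so each row can still be traced back to its first-level label.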

If you want something closer to the SQL table example you linked to, you can use merge and add a suffix to each column:

import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
# Suffixes are given in (left, right) order. df.columns.levels[0] is sorted
# ('bar', 'foo'), so deriving suffixes from it would label the foo columns '_bar'.
pd.merge(df['foo'], df['bar'],
         right_index=True, left_index=True,
         suffixes=('_foo', '_bar')
         ).to_sql('test', conn, if_exists='replace')

This gives you:

index                a_foo                  b_foo                  c_foo                  a_bar                  b_bar                  c_bar
-------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ----------------------
1                    0.989229457189419      0.0759829132299624     0.172846406489083      0.154227020200058      0.386003904079867      0.733402063652856
2                    0.839971061213949      0.975761261358953      0.252917398323633      0.0881692963378311     0.560403977291031      0.806066332511174
3                    0.914544313717528      0.921965094934119      0.821869705625485      0.337292501691803      0.125899685577926      0.527830968883373
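The merge above handles exactly two first-level labels. For an arbitrary number of labels, one alternative (a sketch, not something the original answer proposes) is to flatten the MultiIndex columns into plain strings before calling to_sql. The table name 'test_flat' and the index name 'id' are hypothetical, and an in-memory SQLite connection is assumed in place of MS SQL so the example is runnable as written.

```python
import sqlite3

import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product([['foo', 'bar'], ['a', 'b', 'c']])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)

flat = df.copy()
# Iterating a MultiIndex yields tuples; join each one into a single label,
# e.g. ('foo', 'a') -> 'a_foo'.
flat.columns = [f'{second}_{first}' for first, second in df.columns]
flat.index.name = 'id'  # illustrative name for the index column in SQL

# Assumption: SQLite in memory replaces the MS SQL connection from the question.
conn = sqlite3.connect(':memory:')
flat.to_sql('test_flat', conn, if_exists='replace')

# Read the table back, restoring 'id' as the index.
stored = pd.read_sql('SELECT * FROM test_flat', conn, index_col='id')
print(stored)
```

Because the columns are single-level by the time to_sql runs, this follows the case the question reports as working.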

For more on pandas to_sql indexing with MultiIndex columns, see the similar question on Stack Overflow: https://stackoverflow.com/questions/43945361/
