python - Speeding up Pandas to_sql()?

I have a 1,000,000 x 50 Pandas DataFrame that I am currently writing to a SQL table with:

df.to_sql('my_table', con, index=False)

This takes a very long time. I have seen various explanations online of how to speed this process up, but none of them seem to work for MSSQL.

If I try the method from:

Bulk Insert A Pandas DataFrame Using SQLAlchemy

then I get a "no attribute copy_from" error.
If I try the multithreading method from:

http://techyoubaji.blogspot.com/2015/10/speed-up-pandas-tosql-with.html

then I get a "QueuePool limit of size 5 overflow 10 reached, connection timed out" error.

Is there any easy way to speed up to_sql() into an MSSQL table? Either via BULK COPY or some other method, but entirely from within Python code?

Best answer

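I use ctds to do the bulk insert, which is much faster with SQL Server. In the example below, df is the pandas DataFrame. The column order in the DataFrame must match the schema of the target table in mydb.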

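import ctds

# Connect to SQL Server; the host, credentials, and database name are placeholders.
conn = ctds.connect('server', user='user', password='password', database='mydb')
# Convert the DataFrame to a list of row tuples and bulk-insert them into the table.
conn.bulk_insert('table', df.to_records(index=False).tolist())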
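
For a DataFrame as large as the one in the question, building the whole list of rows at once can take a lot of memory. The following is a minimal sketch, not part of the original answer, of one way to feed the rows in smaller batches. The helper name iter_row_batches and the default batch size are hypothetical, and replacing NaN with None assumes missing values should be stored as NULL.

```python
import math
from itertools import islice


def iter_row_batches(rows, batch_size=10000):
    """Yield lists of row tuples of at most batch_size rows each.

    Float NaN values are replaced with None, on the assumption that
    missing values should end up as NULL in the target table.
    """
    it = iter(rows)
    while True:
        batch = [
            tuple(None if isinstance(v, float) and math.isnan(v) else v
                  for v in row)
            for row in islice(it, batch_size)
        ]
        if not batch:
            return
        yield batch


# Hypothetical usage with the df and conn objects from the answer above
# (not executed here, since it needs a live SQL Server connection):
#
# for batch in iter_row_batches(df.itertuples(index=False, name=None)):
#     conn.bulk_insert('table', batch)
```

Depending on the connection's autocommit setting, the inserted rows may need an explicit conn.commit() before they become visible to other sessions.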

This question, python - Speeding up Pandas to_sql()?, comes from Stack Overflow: https://stackoverflow.com/questions/41554963/
