python - How to speed up bulk insert to MS SQL Server using pyodbc

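Below is the code I need help with. I have to run it on more than 1,300,000 rows, and inserting roughly 300,000 rows takes 40 minutes.

I figure bulk insert is the way to speed this up? Or is it slow because I iterate over the rows in the for data in reader: part?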

#Opens the prepped csv file
with open (os.path.join(newpath,outfile), 'r') as f:
    #hooks csv reader to file
    reader = csv.reader(f)
    #pulls out the columns (which match the SQL table)
    columns = next(reader)
    #trims any extra spaces
    columns = [x.strip(' ') for x in columns]
    #starts SQL statement
    query = 'bulk insert into SpikeData123({0}) values ({1})'
    #puts column names in SQL query 'query'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))

    print('Query is: %s' % query)
    #starts cursor from cnxn (which works)
    cursor = cnxn.cursor()
    #uploads everything by row
    for data in reader:
        cursor.execute(query, data)
        cursor.commit()

I am picking the column headers dynamically on purpose (as I would like the code to be as Pythonic as possible).

SpikeData123 is the table name.

Best answer

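As noted in a comment on another answer, the T-SQL BULK INSERT command only works if the file to be imported is on the same machine as the SQL Server instance, or in an SMB/CIFS network location that the SQL Server instance can read. It may therefore not be applicable when the source file is on a remote client.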
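
For the case where the file is reachable by the server, the statement could be assembled along these lines. This is only a sketch: build_bulk_insert_sql and the UNC path are hypothetical names introduced for illustration, the table name is assumed to come from trusted code, and the plain FIELDTERMINATOR option used here does not handle quoted CSV fields.

```python
def build_bulk_insert_sql(table, server_side_path, first_row=2):
    """Return a T-SQL BULK INSERT statement for a comma-separated file.

    server_side_path must be readable by the SQL Server instance itself
    (a local path on the server or an SMB/CIFS share), not by the client.
    """
    # Single quotes inside a T-SQL string literal are escaped by doubling them.
    quoted_path = server_side_path.replace("'", "''")
    return (
        f"BULK INSERT {table} "
        f"FROM '{quoted_path}' "
        f"WITH (FIRSTROW = {first_row}, "  # FIRSTROW = 2 skips the header line
        "FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n')"
    )


# Hypothetical share name, purely for illustration.
sql = build_bulk_insert_sql("SpikeData123", r"\\fileserver\share\spikes.csv")
print(sql)
# With an open pyodbc connection this could then be run as:
#     cnxn.cursor().execute(sql)
```

The file is read by the server process, so the path has to make sense from the server's point of view rather than the client's.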

pyodbc 4.0.19 added a Cursor#fast_executemany feature that may help in that case. fast_executemany is "off" by default, and the following test code ...

import time

import pyodbc

# conn_str is assumed to hold a valid ODBC connection string
cnxn = pyodbc.connect(conn_str, autocommit=True)
crsr = cnxn.cursor()
crsr.execute("TRUNCATE TABLE fast_executemany_test")

sql = "INSERT INTO fast_executemany_test (txtcol) VALUES (?)"
params = [(f'txt{i:06d}',) for i in range(1000)]
t0 = time.time()
crsr.executemany(sql, params)
print(f'{time.time() - t0:.1f} seconds')

... took about 22 seconds to execute on my test machine. Simply adding crsr.fast_executemany = True ...

cnxn = pyodbc.connect(conn_str, autocommit=True)
crsr = cnxn.cursor()
crsr.execute("TRUNCATE TABLE fast_executemany_test")

crsr.fast_executemany = True  # new in pyodbc 4.0.19

sql = "INSERT INTO fast_executemany_test (txtcol) VALUES (?)"
params = [(f'txt{i:06d}',) for i in range(1000)]
t0 = time.time()
crsr.executemany(sql, params)
print(f'{time.time() - t0:.1f} seconds')

... reduced the execution time to just over 1 second.
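
Applied to the CSV loop from the question, the same idea might look roughly like the sketch below. It is not part of the original answer: insert_csv, build_insert_sql, the batch size, and the RecordingCursor stand-in (used only so the example runs without a database) are all assumptions made for illustration. Rows are sent in batches through executemany instead of one execute and one commit per row.

```python
import csv
import io
from itertools import islice


def build_insert_sql(table, columns):
    """Build a parameterized INSERT with one '?' placeholder per column."""
    placeholders = ", ".join("?" * len(columns))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def insert_csv(cursor, csv_file, table, batch_size=10_000):
    """Send the rows of csv_file to `table` in batches via executemany.

    `cursor` is expected to be a pyodbc cursor; the header row of the
    file supplies the column names. Returns the number of rows sent.
    """
    reader = csv.reader(csv_file)
    columns = [name.strip() for name in next(reader)]
    sql = build_insert_sql(table, columns)

    cursor.fast_executemany = True  # requires pyodbc 4.0.19 or later
    total = 0
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        cursor.executemany(sql, batch)
        total += len(batch)
    return total


class RecordingCursor:
    """Stand-in for a pyodbc cursor so the sketch runs without a database."""

    def __init__(self):
        self.calls = []

    def executemany(self, sql, params):
        self.calls.append((sql, list(params)))


demo = io.StringIO("a, b\n1,2\n3,4\n5,6\n")
fake = RecordingCursor()
sent = insert_csv(fake, demo, "SpikeData123", batch_size=2)
print(sent, "rows sent in", len(fake.calls), "batches")
```

With a real connection, the cursor from cnxn.cursor() would be passed in place of RecordingCursor. If the connection is not opened with autocommit=True, a single cnxn.commit() after the call would replace the per-row commit in the question.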

Regarding "python - How to speed up bulk insert to MS SQL Server using pyodbc", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29638136/
