我有一个 MS Access 表 (SearchAdsAccountLevel),需要通过 python 脚本经常更新。我已经设置了 pyodbc 连接,现在我想根据 Date_ AND CampaignId 字段是否与 df 数据匹配,将行从 pandas df 更新/插入到 MS Access 表。
查看前面的示例,我构建了 UPDATE 语句,该语句使用 iterrows 迭代 df 中的所有行并执行 SQL 代码,如下所示:
connection_string = (
r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
r"c:\AccessDatabases\Database2.accdb;"
)
cnxn = pyodbc.connect(connection_string, autocommit=True)
crsr = cnxn.cursor()
for index, row in df.iterrows():
crsr.execute("UPDATE SearchAdsAccountLevel SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?, [AppName]=?, [AppId]=?, [TotalBudgetAmount]=?, [TotalBudgetCurrency]=?, [DailyBudgetAmount]=?, [DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?, [ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?, [LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?, [Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?, [RowUpdatedTime]=? WHERE [Date_]=? AND [CampaignId]=?",
row['OrgId'],
row['CampaignName'],
row['CampaignStatus'],
row['Storefront'],
row['AppName'],
row['AppId'],
row['TotalBudgetAmount'],
row['TotalBudgetCurrency'],
row['DailyBudgetAmount'],
row['DailyBudgetCurrency'],
row['Impressions'],
row['Taps'],
row['Conversions'],
row['ConversionsNewDownloads'],
row['ConversionsRedownloads'],
row['Ttr'],
row['LocalSpendAmount'],
row['LocalSpendCurrency'],
row['ConversionRate'],
row['Week_'],
row['Month_'],
row['Year_'],
row['Quarter'],
row['FinancialYear'],
row['RowUpdatedTime'],
row['Date_'],
row['CampaignId'])
crsr.commit()
我想迭代 df 中的每一行(大约 3000),如果 ['Date_'] 和 ['CampaignId'] 匹配,我会更新所有其他字段。否则我想在我的 Access 表中插入整个 df 行(创建新行)。实现这一目标最高效、最有效的方法是什么?
最佳答案
考虑DataFrame.values
并将列表传递到 executemany
调用中,确保为 UPDATE
查询相应地排序列:
cols = ['OrgId', 'CampaignName', 'CampaignStatus', 'Storefront',
'AppName', 'AppId', 'TotalBudgetAmount', 'TotalBudgetCurrency',
'DailyBudgetAmount', 'DailyBudgetCurrency', 'Impressions',
'Taps', 'Conversions', 'ConversionsNewDownloads', 'ConversionsRedownloads',
'Ttr', 'LocalSpendAmount', 'LocalSpendCurrency', 'ConversionRate',
'Week_', 'Month_', 'Year_', 'Quarter', 'FinancialYear',
'RowUpdatedTime', 'Date_', 'CampaignId']
sql = '''UPDATE SearchAdsAccountLevel
SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?,
[AppName]=?, [AppId]=?, [TotalBudgetAmount]=?,
[TotalBudgetCurrency]=?, [DailyBudgetAmount]=?,
[DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?,
[ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?,
[LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?,
[Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?,
[RowUpdatedTime]=?
WHERE [Date_]=? AND [CampaignId]=?'''
crsr.executemany(sql, df[cols].values.tolist())
cnxn.commit()
对于插入,请使用具有精确结构的临时临时表作为最终表,您可以使用 make-table 查询创建该表:SELECT TOP 1 * INTO temp FROM Final
。该临时表将定期清理并插入所有数据框行。最终查询仅将新行从 temp 迁移到 Final 中 NOT EXISTS
, NOT IN
, or LEFT JOIN/NULL
。您可以随时运行此查询,而不必担心 Date_ 和 CampaignId 列出现重复。
# CLEAN OUT TEMP
sql = '''DELETE FROM SearchAdsAccountLevel_Temp'''
crsr.executemany(sql)
cnxn.commit()
# APPEND TO TEMP
sql = '''INSERT INTO SearchAdsAccountLevel_Temp (OrgId, CampaignName, CampaignStatus, Storefront,
AppName, AppId, TotalBudgetAmount, TotalBudgetCurrency,
DailyBudgetAmount, DailyBudgetCurrency, Impressions,
Taps, Conversions, ConversionsNewDownloads, ConversionsRedownloads,
Ttr, LocalSpendAmount, LocalSpendCurrency, ConversionRate,
Week_, Month_, Year_, Quarter, FinancialYear,
RowUpdatedTime, Date_, CampaignId)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?);'''
crsr.executemany(sql, df[cols].values.tolist())
cnxn.commit()
# MIGRATE TO FINAL
sql = '''INSERT INTO SearchAdsAccountLevel
SELECT t.*
FROM SearchAdsAccountLevel_Temp t
LEFT JOIN SearchAdsAccountLevel f
ON t.Date_ = f.Date_ AND t.CampaignId = f.CampaignId
WHERE f.OrgId IS NULL'''
crsr.executemany(sql)
cnxn.commit()
关于python - 使用 Python 在 MS Access 数据库中插入或更新行,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55955753/