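I have a web scraper that creates an Excel file for the current month's scrapes. Each time it runs, I want to add today's scrape to that file as a new sheet, alongside every other scrape from that month. My problem is that it overwrites the existing sheets with the new sheet instead of adding it as a separate new sheet. I have tried to do this with xlrd, xlwt, pandas, and openpyxl.
I am still brand new to Python, so simplicity is appreciated!
Below is just the code that writes the Excel file.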
# My relevant time variables
ts = time.time()
date_time = datetime.datetime.fromtimestamp(ts).strftime('%y-%m-%d %H_%M_%S')
HourMinuteSecond = datetime.datetime.fromtimestamp(ts).strftime('%H_%M_%S')
month = datetime.datetime.now().strftime('%m-%y')
# Creates a writer for this month and year
writer = pd.ExcelWriter(
    'C:\\Users\\G\\Desktop\\KickstarterLinks(%s).xlsx' % (month),
    engine='xlsxwriter')
# Creates dataframe from my data, d
df = pd.DataFrame(d)
# Writes to the excel file
df.to_excel(writer, sheet_name='%s' % (HourMinuteSecond))
writer.save()
Best answer
Update:
This feature was added in pandas 0.24.0:
ExcelWriter now accepts mode as a keyword argument, enabling append to existing workbooks when using the openpyxl engine (GH3441)
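As a rough illustration of the mode keyword described above, here is a minimal sketch. The helper name append_sheet is hypothetical and not part of pandas, and the sketch assumes pandas 0.24.0 or later with openpyxl installed. Because append mode needs the workbook to exist already, it falls back to write mode on the first run.

```python
import os

import pandas as pd


def append_sheet(filename, sheet_name, dataframe):
    """Write dataframe to sheet_name, keeping sheets already in filename.

    append_sheet is a hypothetical helper name used for illustration.
    """
    # mode='a' requires an existing workbook, so use 'w' on the first run
    mode = 'a' if os.path.exists(filename) else 'w'
    # the context manager saves and closes the workbook on exit
    with pd.ExcelWriter(filename, engine='openpyxl', mode=mode) as writer:
        dataframe.to_excel(writer, sheet_name=sheet_name)
```

With the variables from the question, each run could call append_sheet('KickstarterLinks(%s).xlsx' % month, HourMinuteSecond, df). What happens when a sheet with the same name already exists depends on the pandas version, so giving each run a unique sheet name is the safer choice.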
Previous version:
Pandas has an open feature request for this.
In the meantime, here is a function that adds a pandas.DataFrame to an existing workbook:
Code:
def add_frame_to_workbook(filename, tabname, dataframe, timestamp):
    """
    Save a dataframe to a workbook tab with the filename and tabname
    coded to timestamp
    :param filename: filename to create, can use strftime formatting
    :param tabname: tabname to create, can use strftime formatting
    :param dataframe: dataframe to save to workbook
    :param timestamp: timestamp associated with dataframe
    :return: None
    """
    filename = timestamp.strftime(filename)
    sheet_name = timestamp.strftime(tabname)
    # create a writer for this month and year
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        # copy existing sheets
        writer.sheets = dict(
            (ws.title, ws) for ws in writer.book.worksheets)
    except IOError:
        # file does not exist yet, we will create it
        pass
    # write out the new sheet
    dataframe.to_excel(writer, sheet_name=sheet_name)
    # save the workbook
    writer.save()
Test code:
import datetime as dt
import pandas as pd
from openpyxl import load_workbook
data = [x.strip().split() for x in """
Date Close
2016-10-18T13:44:59 2128.00
2016-10-18T13:59:59 2128.75
""".split('\n')[1:-1]]
df = pd.DataFrame(data=data[1:], columns=data[0])
name_template = './sample-%m-%y.xlsx'
tab_template = '%d_%H_%M'
now = dt.datetime.now()
in_an_hour = now + dt.timedelta(hours=1)
add_frame_to_workbook(name_template, tab_template, df, now)
add_frame_to_workbook(name_template, tab_template, df, in_an_hour)
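To check that both calls produced separate tabs rather than overwriting each other, the sheet names can be read back from the workbook. This is only a sketch: list_sheets is a hypothetical helper name, and it assumes pandas can open the file with an installed Excel reader such as openpyxl.

```python
import pandas as pd


def list_sheets(filename):
    """Return the sheet names stored in an existing Excel workbook.

    list_sheets is a hypothetical helper name used for illustration.
    """
    # ExcelFile opens the workbook without loading every sheet into a frame
    with pd.ExcelFile(filename) as workbook:
        return list(workbook.sheet_names)
```

For the test above, list_sheets(now.strftime(name_template)) should contain one tab name for now and one for in_an_hour.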
Original question on Stack Overflow, "python - Add a pandas.DataFrame to an existing Excel file": https://stackoverflow.com/questions/42589835/