python - Adding a pandas.DataFrame to an existing Excel file

Tags: python excel python-3.x pandas openpyxl

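I have a web scraper that creates an Excel file for the current month's scrapes. Each time it runs, I want to add that day's scrape to the month's file as a new sheet, alongside every other scrape from that month. My problem is that it overwrites the existing sheet with the new one instead of adding it as a separate sheet. I have tried to do this with xlrd, xlwt, pandas, and openpyxl.

I am still brand new to Python, so simplicity is appreciated!

Below is just the code that writes the Excel file.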

import datetime
import time

import pandas as pd

# My relevant time variables
ts = time.time()
date_time = datetime.datetime.fromtimestamp(ts).strftime('%y-%m-%d %H_%M_%S')
HourMinuteSecond = datetime.datetime.fromtimestamp(ts).strftime('%H_%M_%S')
month = datetime.datetime.now().strftime('%m-%y')

# Creates a writer for this month and year
writer = pd.ExcelWriter(
    'C:\\Users\\G\\Desktop\\KickstarterLinks(%s).xlsx' % (month), 
    engine='xlsxwriter')

# Creates dataframe from my data, d
df = pd.DataFrame(d)

# Writes to the excel file
df.to_excel(writer, sheet_name='%s' % (HourMinuteSecond))
writer.save()

Best answer

Update:

This functionality was added in pandas 0.24.0:

ExcelWriter now accepts mode as a keyword argument, enabling append to existing workbooks when using the openpyxl engine (GH3441)
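
As a rough illustration of the `mode` keyword described in that release note, the sketch below appends a sheet with the openpyxl engine. The helper name `append_sheet` is made up for this example, and it assumes pandas 0.24.0 or later with openpyxl installed. Append mode needs the file to exist already, so the sketch falls back to write mode on the first run.

```python
import os

import pandas as pd


def append_sheet(path, sheet_name, frame):
    """Write `frame` to a new sheet, keeping sheets already in the workbook.

    `append_sheet` is a hypothetical helper name, not part of pandas.
    """
    # Append mode requires an existing file, so use write mode the first time.
    mode = "a" if os.path.exists(path) else "w"
    with pd.ExcelWriter(path, engine="openpyxl", mode=mode) as writer:
        frame.to_excel(writer, sheet_name=sheet_name)
```

Calling this twice on the same path with different sheet names should leave both sheets in the workbook. What happens when the sheet name already exists depends on the pandas version, so unique names such as timestamps are the safer choice.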

Previous version:

Pandas has an open feature request for this.

In the meantime, here is a function that adds a pandas.DataFrame to an existing workbook:

Code:

def add_frame_to_workbook(filename, tabname, dataframe, timestamp):
    """
    Save a dataframe to a workbook tab with the filename and tabname
    coded to timestamp

    :param filename: filename to create, can use strftime formatting
    :param tabname: tabname to create, can use strftime formatting
    :param dataframe: dataframe to save to workbook
    :param timestamp: timestamp associated with dataframe
    :return: None
    """
    filename = timestamp.strftime(filename)
    sheet_name = timestamp.strftime(tabname)

    # create a writer for this month and year
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        
        # copy existing sheets
        writer.sheets = dict(
            (ws.title, ws) for ws in writer.book.worksheets)
    except IOError:
        # file does not exist yet, we will create it
        pass

    # write out the new sheet
    dataframe.to_excel(writer, sheet_name=sheet_name)
    
    # save the workbook
    writer.save()

Test code:

import datetime as dt
import pandas as pd
from openpyxl import load_workbook

data = [x.strip().split() for x in """
                   Date  Close
    2016-10-18T13:44:59  2128.00
    2016-10-18T13:59:59  2128.75
""".split('\n')[1:-1]]
df = pd.DataFrame(data=data[1:], columns=data[0])

name_template = './sample-%m-%y.xlsx'
tab_template = '%d_%H_%M'
now = dt.datetime.now()
in_an_hour = now + dt.timedelta(hours=1)
add_frame_to_workbook(name_template, tab_template, df, now)
add_frame_to_workbook(name_template, tab_template, df, in_an_hour)
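
To make the template mechanism in the test code easier to follow, here is a small standard-library sketch of how the two templates expand for a fixed timestamp. The helper `expand_templates` is a hypothetical name used only for illustration; it mirrors the two `strftime` calls at the top of `add_frame_to_workbook`.

```python
import datetime as dt


def expand_templates(name_template, tab_template, timestamp):
    """Return the (filename, sheet name) pair the templates produce."""
    # Both templates are ordinary strftime format strings.
    return timestamp.strftime(name_template), timestamp.strftime(tab_template)


# A fixed timestamp keeps the output reproducible.
stamp = dt.datetime(2016, 10, 18, 13, 44, 59)
filename, sheet_name = expand_templates('./sample-%m-%y.xlsx', '%d_%H_%M', stamp)
print(filename)    # ./sample-10-16.xlsx
print(sheet_name)  # 18_13_44
```

Every run within the same month maps to the same file, while the day, hour, and minute give each run its own sheet. Underscores are used in the sheet name because Excel does not allow colons in sheet names.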


Regarding "python - Adding a pandas.DataFrame to an existing Excel file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42589835/
