python - How do I use pandas.read_excel() to read an Excel file directly from Dropbox's API?

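I want to compare two small Excel files that are stored in Dropbox as separate versions.

Using the Python SDK, specifically the files_download() method, I get a requests.models.Response object, but I am having trouble getting pandas.read_excel() to consume it.

Here is the code snippet: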

with open(resp.content, "rb") as handle:
    df = pandas.read_excel(handle.read())

Error:

TypeError('file() argument 1 must be encoded string without null bytes, not str',)

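I know I am missing something fundamental, and possibly need to encode the file as binary. (I tried base64.b64encode and a few other things, without success so far.) I am hoping someone can point me in the right direction, possibly toward the io module?

I am using Python 2.7.15.

For the avoidance of doubt, I specifically want to avoid the step of first saving the Excel file to the filesystem. I am sure I could reach my broader goal that way, but to optimize I am trying to read the files from Dropbox directly into pandas DataFrames, and the fact that read_excel() accepts a file-like object means, I think, that I should be able to do so.

Basically, I think this sums up my pain at the moment. I need to get Dropbox's response into the form of a file-like object.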

Best answer

The code below does what you want.

# Imports and initialization of variables
from contextlib import closing  # ensures the HTTP response is closed
import io
import dropbox
import pandas as pd  # missing from the original snippet; needed for pd.read_excel

token = "YOURTOKEN"  # get a token at https://www.dropbox.com/developers/apps/
dbx = dropbox.Dropbox(token)
yourpath = "/somefile.xlsx"  # Dropbox API paths start with "/"; not limited to Excel files

# Relevant streamer
def stream_dropbox_file(path):
    # files_download returns a (metadata, response) tuple
    _, res = dbx.files_download(path)
    with closing(res) as result:
        byte_data = result.content
        return io.BytesIO(byte_data)

# Usage
file_stream = stream_dropbox_file(yourpath)
df = pd.read_excel(file_stream)

The advantage of this approach is that io.BytesIO turns the data into a generic file-like object, so you can also use it to read other formats, such as CSV files with pd.read_csv().
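
As a rough illustration of the CSV case, the sketch below skips the Dropbox call and uses a hard-coded byte string as a stand-in for what result.content would hold. The name csv_bytes and its contents are made up for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for the bytes a Dropbox download would return.
csv_bytes = b"name,value\nalpha,1\nbeta,2\n"

# Wrap the raw bytes in a file-like object, as stream_dropbox_file does.
csv_stream = io.BytesIO(csv_bytes)

# pandas reads from the in-memory stream just as it would from an open file.
df = pd.read_csv(csv_stream)
print(df)
```

With a real download, passing the object returned by stream_dropbox_file(yourpath) to pd.read_csv() should behave the same way, provided the path points at a CSV file.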

The code should also work for non-pandas I/O methods, such as loading images, but I have not tested that explicitly.
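
To show a non-pandas consumer without relying on an image library, the sketch below uses the standard-library zipfile module instead, which also accepts file-like objects. The archive is built in memory only to stand in for downloaded bytes; downloaded_bytes and hello.txt are hypothetical names.

```python
import io
import zipfile

# Build a small ZIP archive in memory to stand in for downloaded bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello from memory")
downloaded_bytes = buffer.getvalue()

# Wrap the raw bytes in io.BytesIO, then hand the stream to zipfile.
with zipfile.ZipFile(io.BytesIO(downloaded_bytes)) as archive:
    names = archive.namelist()
    text = archive.read("hello.txt").decode("utf-8")

print(names, text)
```

Any library that accepts an open binary file should, in principle, accept the io.BytesIO object in the same way.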

Regarding "python - How do I use pandas.read_excel() to read an Excel file directly from Dropbox's API?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53697160/
