python - Google Cloud BigQuery load_table_from_dataframe() Parquet AttributeError

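I am trying to use the BigQuery package to interact with Pandas DataFrames. In my scenario, I query a base table in BigQuery, call .to_dataframe() on the result, and then pass that DataFrame to load_table_from_dataframe() to load it into a new table in BigQuery.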

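My original problem was that str(uuid.uuid4()) (used for random IDs) was automatically being converted to bytes instead of a string, so I am forcing a schema instead of letting it auto-detect the types.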

Now, though, I pass job_config as a dict that contains the schema, and I get this error:

File "/usr/local/lib/python2.7/dist-packages/google/cloud/bigquery/client.py", line 903, in load_table_from_dataframe
    job_config.source_format = job.SourceFormat.PARQUET
AttributeError: 'dict' object has no attribute 'source_format'

I already have PyArrow installed, and I tried installing FastParquet as well, but it did not help. This also did not happen before I tried to force a schema.

Any ideas?

https://google-cloud-python.readthedocs.io/en/latest/bigquery/usage.html#using-bigquery-with-pandas

https://google-cloud-python.readthedocs.io/en/latest/_modules/google/cloud/bigquery/client.html#Client.load_table_from_dataframe

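Looking at the actual package, it seems to force the Parquet format, but as I said, I had no problem before. It only started now that I am trying to supply a table schema.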

EDIT: This only happens when I try to write to BigQuery.

Best answer

Figured it out. After scouring Google's documentation, I realized I had forgotten:

load_config = bigquery.LoadJobConfig()
load_config.schema = SCHEMA

Whoops. I never created the config object from the BigQuery package; I was passing a plain dict instead.
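
For anyone hitting the same error, here is a minimal sketch of why the dict fails. Note that FakeLoadJobConfig, force_parquet and the SCHEMA value below are hypothetical stand-ins written for illustration, not part of google-cloud-bigquery. The sketch only mimics the line from the traceback, which assigns an attribute on whatever was passed as job_config.

```python
# Stand-in for a load-job config object; NOT the real bigquery.LoadJobConfig.
class FakeLoadJobConfig(object):
    def __init__(self):
        self.schema = None
        self.source_format = None


def force_parquet(job_config):
    """Mimic the failing line from the traceback: assign an attribute on job_config."""
    job_config.source_format = "PARQUET"
    return job_config


SCHEMA = [("id", "STRING")]  # hypothetical schema, for illustration only

# A plain dict reproduces the error: dicts do not accept new attributes.
dict_config = {"schema": SCHEMA}
try:
    force_parquet(dict_config)
    dict_error = None
except AttributeError as exc:
    dict_error = str(exc)

# A config object works, because attributes can be assigned on it.
object_config = FakeLoadJobConfig()
object_config.schema = SCHEMA
force_parquet(object_config)

print(dict_error)
print(object_config.source_format)
```

With the real library, the equivalent of object_config is the bigquery.LoadJobConfig instance from the snippet above, passed as the job_config argument of load_table_from_dataframe().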

Regarding python - Google Cloud BigQuery load_table_from_dataframe() Parquet AttributeError, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51013943/
