I'm trying to load a file from Google Cloud Storage into Google BigQuery, but I'm running into a number of problems.
#standardSQL
import json
import argparse
import time
import uuid

from google.cloud import bigquery
from google.cloud import storage

dataset = 'dataworks-356fa'
source = 'gs://dataworks-356fa-backups/pullnupload.json'

def load_data_from_gcs(dataset, test10, source):
    bigquery_client = bigquery.Client(dataset)
    dataset = bigquery_client.dataset('FirebaseArchive')
    table = dataset.table(test10)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, "gs://dataworks-356fa-backups/pullnupload.json")
    job.source_format = 'NEWLINE_DELIMITED_JSON'
    job.begin()
    # wait_for_job(job)
    print("state of job is: " + job.state)
    # print("errors: " + job.errors)

load_data_from_gcs(dataset, 'test10', source)
When the wait_for_job(job) line is uncommented, I get this error:
Traceback (most recent call last):
  File "cloudtobq.py", line 42, in <module>
    load_data_from_gcs(dataset, 'test10', source)
  File "cloudtobq.py", line 38, in load_data_from_gcs
    wait_for_job(job)
NameError: global name 'wait_for_job' is not defined
When print("errors: " + job.errors) is uncommented, I get this error:
Traceback (most recent call last):
  File "cloudtobq.py", line 42, in <module>
    load_data_from_gcs(dataset, 'test10', source)
  File "cloudtobq.py", line 40, in load_data_from_gcs
    print("errors: " + job.errors)
TypeError: cannot concatenate 'str' and 'NoneType' objects
When both lines are commented out, this is the output before I'm returned to the shell prompt:
Wess-MacBook-Pro:desktop wesstephens$ python cloudtobq.py
state of job is: RUNNING
Wess-MacBook-Pro:desktop wesstephens$
Best answer
You need to include the definition of the wait_for_job function from the documentation sample code:
def wait_for_job(job):
    while True:
        job.reload()  # refresh the job state from the API
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.errors)
            return
        time.sleep(1)
You don't need to print job.errors, because wait_for_job will raise an exception if the job is unsuccessful.
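The polling logic can be exercised without a GCP project by running wait_for_job against a stand-in job object. FakeJob below is hypothetical, invented purely for illustration; it mimics the small part of the load-job interface that wait_for_job touches (reload(), state, error_result, errors):

```python
import time

def wait_for_job(job):
    # Poll until the load job reaches the DONE state, then surface any errors.
    while True:
        job.reload()
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.errors)
            return
        time.sleep(1)

class FakeJob:
    """Hypothetical stand-in for a BigQuery load job (illustration only)."""
    def __init__(self, states, error_result=None, errors=None):
        self._states = iter(states)
        self.state = None
        self.error_result = error_result
        self.errors = errors

    def reload(self):
        # Advance to the next state, mimicking job.reload() hitting the API.
        self.state = next(self._states)

# A job that runs and then finishes cleanly returns without raising.
wait_for_job(FakeJob(['RUNNING', 'DONE']))

# A job that finishes with an error_result raises RuntimeError(job.errors).
try:
    wait_for_job(FakeJob(['DONE'], error_result={'reason': 'invalid'},
                         errors=[{'message': 'bad JSON'}]))
except RuntimeError as exc:
    print('raised:', exc)  # prints: raised: [{'message': 'bad JSON'}]
```

This also shows why printing job.errors right after job.begin() fails: the attribute stays None until the job actually finishes, so the exception raised by wait_for_job is the reliable place to read the errors.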
For "python - Unable to load a JSON file from Google Cloud Storage into BigQuery using Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44918752/