python - Error: "Required Parameter is Missing" - While Getting Anonymous Table in BigQuery

Tags: python google-bigquery

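I am trying to query the anonymous table that is created as a result of querying a table in a BigQuery dataset. I am trying to find the anonymous table's name with Jobs.get(), following the example in Google BigQuery Analytics, but I am getting an error.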

Google BigQuery Analytics example (page 209):

Reference (Google BigQuery Analytics)

Query 1:

class QueryHandler(webapp2.RequestHandler):
    credentials = GoogleCredentials.get_application_default()
    service = discovery.build('bigquery', 'v2', credentials=credentials)

    def query1(self):
        myquery = {
            'configuration': {
                'query': {
                    'query': 'SELECT DISTINCT user_id FROM `app.mydataset.mytable`',
                    'destinationTable': {
                        'projectId': projectId,
                        'datasetId': datasetId,
                        'tableId': 'tableId'
                    },
                    'useLegacySql': False
                }
            }
        }

        response = service.jobs().query(projectId=projectId, body=myquery).execute()
        job = service.jobs().get(**response['jobReference']).execute()
        # both versions of this variable (destination_table) produce the same error message
        # destination_table = job['configuration']['query']['destinationTable']
        destination_table = job['destinationTable']

        table = service.jobs().get(projectId=destination_table['projectId'],
                                   datasetId=destination_table['datasetId'],
                                   tableId=destination_table['tableId']).execute()
        return table

Error:

Internal Server Error

The server has either erred or is incapable of performing the requested operation.

....

HttpError: https://www.googleapis.com/bigquery/v2/projects/app_id/queries?alt=json returned "Required parameter is missing">

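My questions:

  1. Why am I getting this error? (I followed the example and I can't see what I missed)
  2. How can I pass the anonymous table name from the first query into the second query using Python? For example: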

Query 2:

def query2(self):
....
query: SELECT * FROM [anonymous table from query 1]

Best answer

  1. Why am I getting this error? (I followed the example and I can't see what I missed)

Your request body is malformed for the jobs.query API call. You do not need the "configuration" or "query" objects wrapping what you have.

Try:

myquery = {
    'query': 'SELECT DISTINCT user_id FROM `app.mydataset.mytable`',
    'useLegacySql': False
}

response = service.jobs().query(projectId=projectId, body=myquery).execute()

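As a meta-comment, we (the BigQuery team) are aware that the "Required parameter is missing" error message is too vague to debug and leads to confusing situations like this one. In addition, unrecognized parameters (such as the "configuration" object) are simply ignored, so it is easy to hit the "Required parameter is missing" error if you misname a parameter in your request. We hope to address this in a future API update.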
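Since unrecognized keys are silently ignored, a small client-side check can flag a misnamed or misplaced key before the request is sent. The following is only a sketch: `KNOWN_QUERY_FIELDS` is a hand-picked, incomplete subset of the jobs.query request fields, and `unknown_query_fields` is a hypothetical helper, not part of the Google API client.

```python
# Hypothetical helper; not part of google-api-python-client.
# Partial list of top-level jobs.query request fields (an assumption:
# check the API reference for the full, current set).
KNOWN_QUERY_FIELDS = {
    'query', 'useLegacySql', 'maxResults', 'timeoutMs',
    'dryRun', 'useQueryCache', 'defaultDataset',
}


def unknown_query_fields(body):
    """Return the top-level keys of `body` that are not in the known set."""
    return sorted(set(body) - KNOWN_QUERY_FIELDS)


# Shape used in the question: everything nested under 'configuration'.
wrapped = {'configuration': {'query': {'query': 'SELECT 1', 'useLegacySql': False}}}
# Corrected shape: 'query' sits at the top level of the body.
flat = {'query': 'SELECT 1', 'useLegacySql': False}

print(unknown_query_fields(wrapped))  # ['configuration']
print(unknown_query_fields(flat))     # []
```

A non-empty result for the wrapped body points straight at the stray "configuration" key, which the server-side message does not do.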


  2. How can I pass an anonymous table name from the first query in the second query using Python?

You should be able to retrieve the destination table from the jobs.get response, assuming you pass in the expected jobReference.
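As a rough sketch of that lookup: in the Job resource returned by jobs.get, the destination table of a query job sits under `configuration.query.destinationTable`. The helper names below are hypothetical, `service` is assumed to be the discovery client built in the question, and the sample values are made-up placeholders rather than real output.

```python
def destination_table_of(job):
    """Pull the destination table reference out of a jobs.get response dict."""
    return job['configuration']['query']['destinationTable']


def fetch_destination_table(service, job_reference):
    """Call jobs.get for the given jobReference and return its destination table.

    `service` is assumed to be discovery.build('bigquery', 'v2', ...).
    """
    job = service.jobs().get(projectId=job_reference['projectId'],
                             jobId=job_reference['jobId']).execute()
    return destination_table_of(job)


# Placeholder job resource, trimmed to the fields used here (not real output).
sample_job = {
    'jobReference': {'projectId': 'my-project', 'jobId': 'job_placeholder'},
    'configuration': {
        'query': {
            'destinationTable': {
                'projectId': 'my-project',
                'datasetId': 'anon_dataset_placeholder',
                'tableId': 'anon_table_placeholder',
            }
        }
    },
}

print(destination_table_of(sample_job)['tableId'])  # anon_table_placeholder
```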

However, note that using this anonymous table in another query is an unsupported operation on anonymous results tables, and comes with no guarantees:

The query results from this method are saved to a temporary table that is deleted approximately 24 hours after the query is run. You can read this results table by calling either bigquery.tabledata.list(table_reference) or bigquery.jobs.getQueryResults(job_reference). The table and dataset name are non-standard, and cannot be used in any other APIs, as the behavior may be unpredictable.

Instead, you are better off passing an explicit destination table, which can only be done with jobs.insert rather than jobs.query. Look for the parameter configuration.query.destinationTable.
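A minimal sketch of that approach follows. With jobs.insert the body is a Job resource, so this is where the "configuration" and "query" wrappers belong. The function names and the project, dataset, and table IDs are hypothetical, and `service` is assumed to be the discovery client built in the question.

```python
def build_insert_body(sql, project_id, dataset_id, table_id):
    """Build a jobs.insert body that writes query results to an explicit table."""
    return {
        'configuration': {
            'query': {
                'query': sql,
                'useLegacySql': False,
                'destinationTable': {
                    'projectId': project_id,
                    'datasetId': dataset_id,
                    'tableId': table_id,
                },
                # Overwrite the table if it already exists.
                'writeDisposition': 'WRITE_TRUNCATE',
            }
        }
    }


def start_query_job(service, project_id, body):
    """Submit the job; `service` is assumed to be the BigQuery discovery client."""
    return service.jobs().insert(projectId=project_id, body=body).execute()


# Hypothetical names, for illustration only.
body = build_insert_body(
    'SELECT DISTINCT user_id FROM `app.mydataset.mytable`',
    'my-project', 'scratch', 'query1_results')

# A second query can then refer to the table by its explicit name.
query2 = 'SELECT * FROM `my-project.scratch.query1_results`'
print(body['configuration']['query']['destinationTable']['tableId'])
```

jobs.insert starts the job asynchronously, so the first job has to finish (for example, by polling jobs.get until its status reports DONE) before the second query reads the table.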

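If you are worried about keeping these destination tables around, you can put them in a dataset that sets up an expiration time for contained tables, so that they are deleted after some period (an hour, a day, or ...).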
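One way to set this up, sketched under assumptions, is to create the scratch dataset with the `defaultTableExpirationMs` field of the Dataset resource. The helper names and IDs below are hypothetical, and `service` is assumed to be the discovery client built in the question.

```python
def build_dataset_body(project_id, dataset_id, ttl_hours):
    """Build a datasets.insert body whose new tables expire after `ttl_hours`."""
    ttl_ms = ttl_hours * 60 * 60 * 1000
    return {
        'datasetReference': {
            'projectId': project_id,
            'datasetId': dataset_id,
        },
        # int64 fields are carried as strings in the JSON API.
        'defaultTableExpirationMs': str(ttl_ms),
    }


def create_scratch_dataset(service, project_id, body):
    """Create the dataset; `service` is assumed to be the BigQuery discovery client."""
    return service.datasets().insert(projectId=project_id, body=body).execute()


# Hypothetical IDs: tables created in 'scratch' expire after one day.
dataset_body = build_dataset_body('my-project', 'scratch', 24)
print(dataset_body['defaultTableExpirationMs'])  # 86400000
```

The default applies to tables created in the dataset after it is set, so the dataset should exist before the query job writes its destination table.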

Regarding python - Error: "Required Parameter is Missing" - While Getting Anonymous Table in BigQuery, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40351646/
