python - Trouble querying public BigQuery data from a local workstation

Tags: python google-bigquery google-colaboratory kaggle

I am trying to query public data (the Ethereum dataset) through the BigQuery API from my Colab notebook.

This is what I have tried:

from google.colab import auth
auth.authenticate_user()
from google.cloud import bigquery
eth_project_id = 'crypto_ethereum_classic'
client = bigquery.Client(project=eth_project_id)

and I received this error message:

WARNING:google.auth._default:No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable

I also tried the BigQueryHelper library and got a similar error message:

from bq_helper import BigQueryHelper
eth_dataset = BigQueryHelper(active_project="bigquery-public-data",dataset_name="crypto_ethereum_classic") 

Error:

WARNING:google.auth._default:No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-21-53ac8b2901e1> in <module>()
      1 from bq_helper import BigQueryHelper
----> 2 eth_dataset = BigQueryHelper(active_project="bigquery-public-data",dataset_name="crypto_ethereum_classic")

/content/src/bq-helper/bq_helper.py in __init__(self, active_project, dataset_name, max_wait_seconds)
     23         self.dataset_name = dataset_name
     24         self.max_wait_seconds = max_wait_seconds
---> 25         self.client = bigquery.Client()
     26         self.__dataset_ref = self.client.dataset(self.dataset_name, project=self.project_name)
     27         self.dataset = None

/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/client.py in __init__(self, project, credentials, _http, location, default_query_job_config)
    140     ):
    141         super(Client, self).__init__(
--> 142             project=project, credentials=credentials, _http=_http
    143         )
    144         self._connection = Connection(self)

/usr/local/lib/python3.6/dist-packages/google/cloud/client.py in __init__(self, project, credentials, _http)
    221 
    222     def __init__(self, project=None, credentials=None, _http=None):
--> 223         _ClientProjectMixin.__init__(self, project=project)
    224         Client.__init__(self, credentials=credentials, _http=_http)

/usr/local/lib/python3.6/dist-packages/google/cloud/client.py in __init__(self, project)
    176         if project is None:
    177             raise EnvironmentError(
--> 178                 "Project was not passed and could not be "
    179                 "determined from the environment."
    180             )

OSError: Project was not passed and could not be determined from the environment.

To reiterate, I am using Colab. I know how to query this data on Kaggle, but I need to do it in my Colab notebook.

Best answer

In Colab, you first need to authenticate.

from google.colab import auth
auth.authenticate_user()

This authenticates your user account against the project.
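Authentication alone does not give the client a project ID, which is what the error in the question complains about. The following is a minimal sketch, not a definitive fix. It assumes you have a Google Cloud project of your own (the hypothetical `my-billing-project` below) and that the dataset has a table named `blocks`. Neither of these comes from the question. The idea is that `bigquery.Client` receives your own project, while the public project `bigquery-public-data` only appears inside the table name in the SQL.

```python
import os

# Names taken from the question: the public project and the dataset it hosts.
PUBLIC_PROJECT = "bigquery-public-data"
DATASET = "crypto_ethereum_classic"


def build_query(table="blocks", limit=5):
    """Build a query against a fully qualified public table.

    The table name "blocks" is an assumption for illustration.
    """
    return (
        f"SELECT * FROM `{PUBLIC_PROJECT}.{DATASET}.{table}` "
        f"LIMIT {int(limit)}"
    )


def set_default_project(project_id):
    """Set the environment variable named in the warning message, so that
    clients created without an explicit project can find one."""
    os.environ["GOOGLE_CLOUD_PROJECT"] = project_id


def run_in_colab(billing_project_id, table="blocks", limit=5):
    """Authenticate in Colab and run the query under your own project.

    billing_project_id is a project you own or can use (hypothetical here).
    Imports are local so the helpers above also work outside Colab.
    """
    from google.colab import auth
    from google.cloud import bigquery

    auth.authenticate_user()
    # Pass your own project here, not the dataset name.
    client = bigquery.Client(project=billing_project_id)
    return client.query(build_query(table, limit)).to_dataframe()


# Example call inside a Colab cell (project ID is a placeholder):
# df = run_in_colab("my-billing-project")
```

For the BigQueryHelper case, the traceback shows `bigquery.Client()` being called without a project, so calling `set_default_project("my-billing-project")` before constructing the helper may be enough. This has not been verified against that library.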

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/54942607/
