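I am new to GCP and Terraform. I am developing Terraform scripts to provision around 50 BQ datasets, each with at least 10 tables. The tables do not all share the same schema.
I have written scripts that create the datasets and tables, but I am struggling to add schemas to the tables and need help. I am using Terraform variables to build the scripts.
Here is my code. I need to integrate logic that creates the schema for each table.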
var.tf
variable "test_bq_dataset" {
  type = list(object({
    id       = string
    location = string
  }))
}
variable "test_bq_table" {
  type = list(object({
    dataset_id = string
    table_id   = string
  }))
}
terraform.tfvars
test_bq_dataset = [
  {
    id       = "ds1"
    location = "US"
  },
  {
    id       = "ds2"
    location = "US"
  }
]
test_bq_table = [
  {
    dataset_id = "ds1"
    table_id   = "table1"
  },
  {
    dataset_id = "ds2"
    table_id   = "table2"
  },
  {
    dataset_id = "ds1"
    table_id   = "table3"
  }
]
main.tf
resource "google_bigquery_dataset" "dataset" {
  count      = length(var.test_bq_dataset)
  dataset_id = var.test_bq_dataset[count.index]["id"]
  location   = var.test_bq_dataset[count.index]["location"]
  labels = {
    "environment" = "development"
  }
}
resource "google_bigquery_table" "table" {
  count      = length(var.test_bq_table)
  dataset_id = var.test_bq_table[count.index]["dataset_id"]
  table_id   = var.test_bq_table[count.index]["table_id"]
  labels = {
    "environment" = "development"
  }
  depends_on = [
    google_bigquery_dataset.dataset,
  ]
}
I have tried everything I could think of to create schemas for the tables in the datasets, but nothing worked.
Best answer
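Presumably all of your tables should have the same schema...
I would try it this way.
In
resource "google_bigquery_table" "table"
you can add, for example after labels:
schema = file("${path.root}/subdirectories-path/table_schema.json")
where
${path.root} - is the directory where your Terraform files live (the root module)
subdirectories-path - zero or more subdirectories
table_schema.json - a JSON file with the schema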
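Put together with the table resource from the question, that suggestion might look like the sketch below. The directory name "bq-schema" is a hypothetical stand-in for "subdirectories-path"; everything else is taken from the question's main.tf.

```hcl
# Sketch: every table reads the same schema file.
# "bq-schema" is a hypothetical directory name standing in for
# "subdirectories-path" in the text above.
resource "google_bigquery_table" "table" {
  count      = length(var.test_bq_table)
  dataset_id = var.test_bq_table[count.index]["dataset_id"]
  table_id   = var.test_bq_table[count.index]["table_id"]

  labels = {
    "environment" = "development"
  }

  # file() returns the file contents as a string; the schema argument
  # takes a JSON string describing the table's columns.
  schema = file("${path.root}/bq-schema/table_schema.json")

  depends_on = [
    google_bigquery_dataset.dataset,
  ]
}
```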
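==> Update, 14 February 2021
As requested, here is an example where the table schemas differ, with minimal modifications to the original question.
variables.tf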
variable "project_id" {
  description = "The target project"
  type        = string
  default     = "ishim-sample"
}
variable "region" {
  description = "The region where resources are created => europe-west2"
  type        = string
  default     = "europe-west2"
}
variable "zone" {
  description = "The zone in the europe-west region for resources"
  type        = string
  default     = "europe-west2-b"
}
# ===========================
variable "test_bq_dataset" {
  type = list(object({
    id       = string
    location = string
  }))
}
variable "test_bq_table" {
  type = list(object({
    dataset_id = string
    table_id   = string
    schema_id  = string
  }))
}
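Because schema_id is later used to build a file path, it may help to reject obviously wrong values before anything is read from disk. The sketch below is an optional variant of the "test_bq_table" declaration, meant to replace it rather than sit beside it. The validation rule (file names must end in ".json") is an assumption for illustration, not part of the original answer, and alltrue() needs a reasonably recent Terraform release.

```hcl
# Optional variant of the declaration above, with a validation block.
# The ".json" naming rule is an assumption made for this example.
variable "test_bq_table" {
  type = list(object({
    dataset_id = string
    table_id   = string
    schema_id  = string
  }))

  validation {
    # regex() raises an error when there is no match, so can() turns
    # that into true/false for each element of the list.
    condition = alltrue([
      for t in var.test_bq_table : can(regex("\\.json$", t.schema_id))
    ])
    error_message = "Each schema_id must be a file name ending in .json."
  }
}
```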
terraform.tfvars
test_bq_dataset = [
  {
    id       = "ds1"
    location = "EU"
  },
  {
    id       = "ds2"
    location = "EU"
  }
]
test_bq_table = [
  {
    dataset_id = "ds1"
    table_id   = "table1"
    schema_id  = "table-schema-01.json"
  },
  {
    dataset_id = "ds2"
    table_id   = "table2"
    schema_id  = "table-schema-02.json"
  },
  {
    dataset_id = "ds1"
    table_id   = "table3"
    schema_id  = "table-schema-03.json"
  },
  {
    dataset_id = "ds2"
    table_id   = "table4"
    schema_id  = "table-schema-04.json"
  }
]
Example of a JSON schema file - table-schema-01.json
[
  {
    "name": "table_column_01",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": ""
  },
  {
    "name": "_gcs_file_path",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The GCS path to the file for loading."
  },
  {
    "name": "_src_file_ts",
    "mode": "REQUIRED",
    "type": "TIMESTAMP",
    "description": "The source file modification timestamp."
  },
  {
    "name": "_src_file_name",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The file name of the source file."
  },
  {
    "name": "_firestore_doc_id",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The hash code (based on the file name and its content, so each file has a unique hash) used as a Firestore document id."
  },
  {
    "name": "_ingested_ts",
    "mode": "REQUIRED",
    "type": "TIMESTAMP",
    "description": "The timestamp when this record was processed during ingestion into the BigQuery table."
  }
]
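With dozens of schema files, a malformed one is easier to catch if Terraform parses the files itself. The sketch below is an optional addition, not part of the original answer: it decodes each schema file with jsondecode() and exposes the column names as an output. The names "table_schemas" and "table_columns" are hypothetical, and the files are assumed to sit in the same "bq-schema" directory that main.tf reads from.

```hcl
locals {
  # jsondecode() fails during plan if a file is not valid JSON,
  # so a broken schema file is reported before anything is created.
  # Keys look like "ds1.table1".
  table_schemas = {
    for t in var.test_bq_table :
    "${t.dataset_id}.${t.table_id}" => jsondecode(file("${path.root}/bq-schema/${t.schema_id}"))
  }
}

# Lists the column names found in each table's schema file.
output "table_columns" {
  value = {
    for key, columns in local.table_schemas :
    key => [for c in columns : c.name]
  }
}
```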
main.tf
provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}
resource "google_bigquery_dataset" "test_dataset_set" {
  project    = var.project_id
  count      = length(var.test_bq_dataset)
  dataset_id = var.test_bq_dataset[count.index]["id"]
  location   = var.test_bq_dataset[count.index]["location"]
  labels = {
    "environment" = "development"
  }
}
resource "google_bigquery_table" "test_table_set" {
  project    = var.project_id
  count      = length(var.test_bq_table)
  dataset_id = var.test_bq_table[count.index]["dataset_id"]
  table_id   = var.test_bq_table[count.index]["table_id"]
  schema     = file("${path.root}/bq-schema/${var.test_bq_table[count.index]["schema_id"]}")
  labels = {
    "environment" = "development"
  }
  depends_on = [
    google_bigquery_dataset.test_dataset_set,
  ]
}
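Project directory structure - screenshot
Keep the subdirectory name "bq-schema" in mind, because it is used in the "schema" attribute of the "google_bigquery_table" resource in the "main.tf" file.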
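With around 50 datasets and several hundred tables, indexing resources by list position can become fragile: removing an entry from the middle of a list shifts every later index. One possible alternative, sketched below and not part of the original answer, keys the resources with for_each instead of count. These blocks would replace the two resources in main.tf, not be added next to them, and the key format "dataset.table" is an assumption chosen for this example.

```hcl
resource "google_bigquery_dataset" "test_dataset_set" {
  # One instance per dataset, keyed by its id (for example "ds1").
  for_each   = { for d in var.test_bq_dataset : d.id => d }
  project    = var.project_id
  dataset_id = each.value.id
  location   = each.value.location
  labels = {
    "environment" = "development"
  }
}

resource "google_bigquery_table" "test_table_set" {
  # One instance per table, keyed as "dataset.table" (for example "ds1.table1").
  for_each = { for t in var.test_bq_table : "${t.dataset_id}.${t.table_id}" => t }
  project  = var.project_id

  # Referencing the dataset resource creates an implicit dependency,
  # so an explicit depends_on is not needed here.
  dataset_id = google_bigquery_dataset.test_dataset_set[each.value.dataset_id].dataset_id
  table_id   = each.value.table_id
  schema     = file("${path.root}/bq-schema/${each.value.schema_id}")
  labels = {
    "environment" = "development"
  }
}
```

Note that switching an already applied configuration from count to for_each changes the resource addresses in state, so Terraform would plan to destroy and recreate the resources unless the state entries are moved first.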
BigQuery console - screenshot
The result of the "terraform apply" command.
Regarding google-cloud-platform - provisioning bigquery datasets using terraform, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66172075/