sql - BigQuery - Summing over an array of structs

Tags: sql google-bigquery

I have a table that looks like this:

word   nb_by_date.date    nb_by_date.nb
---------------------------------------
abc    2020-01-01         17
       2020-01-06         43
abc    2020-01-01         33
       2020-01-05         12
       2020-01-06         5
def    2020-01-02         11
       2020-01-05         8
def    2020-01-02         1

You can generate this sample with:

WITH t AS (
SELECT "abc" AS word, [STRUCT('2020-01-01' AS date, 17 AS nb), STRUCT('2020-01-06' AS date, 43 AS nb)] AS nb_by_date
UNION ALL SELECT "abc" AS word, [STRUCT('2020-01-01' AS date, 33 AS nb), STRUCT('2020-01-05' AS date, 12 AS nb), STRUCT('2020-01-06' AS date, 5 AS nb)]
UNION ALL SELECT "def" AS word, [STRUCT('2020-01-02' AS date, 11 AS nb), STRUCT('2020-01-05' AS date, 8 AS nb)]
UNION ALL SELECT "def" AS word, [STRUCT('2020-01-02' AS date, 1 AS nb)]
)

My goal is to get:

word   nb_by_date.date    nb_by_date.nb
---------------------------------------
abc    2020-01-01         50
       2020-01-05         12
       2020-01-06         48
def    2020-01-02         12
       2020-01-05         8

Here is my attempt:

SELECT
  word,
  ARRAY(
  SELECT STRUCT(date, SUM(nb) AS nb)
  FROM UNNEST(nb_by_date)
  GROUP BY date
  ORDER BY date) nb_by_date
FROM (
  SELECT word, ARRAY_CONCAT_AGG(nb_by_date) nb_by_date
  FROM t
  GROUP BY word
)

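It works on this toy example. However, I have a lot of data, and ARRAY_CONCAT_AGG(nb_by_date) creates a row that exceeds the 100MB limit (the error is "Cannot query rows larger than 100MB limit."). How can I adapt the query so that it still works on a large amount of data?

Best answer

You can use two levels of aggregation: first unnest the arrays and sum nb per (word, date), then aggregate the summed rows back into one array per word. This way the concatenated array that ARRAY_CONCAT_AGG builds is never materialized: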

WITH t AS (
      SELECT 'abc' AS word, [STRUCT('2020-01-01' AS date, 17 AS nb), STRUCT('2020-01-06' AS date, 43 AS nb)] AS nb_by_date UNION ALL
      SELECT 'abc' AS word, [STRUCT('2020-01-01' AS date, 33 AS nb), STRUCT('2020-01-05' AS date, 12 AS nb), STRUCT('2020-01-06' AS date, 5 AS nb)] UNION ALL
      SELECT 'def' AS word, [STRUCT('2020-01-02' AS date, 11 AS nb), STRUCT('2020-01-05' AS date, 8 AS nb)] UNION ALL
      SELECT 'def' AS word, [STRUCT('2020-01-02' AS date, 1 AS nb)]
     )
-- Outer level: rebuild one array per word from the already-summed rows.
SELECT s.word, ARRAY_AGG(STRUCT(s.date AS date, s.nb AS nb) ORDER BY s.date) AS nb_by_date
FROM (
      -- Inner level: flatten the arrays and sum nb per (word, date).
      SELECT t.word, el.date, SUM(el.nb) AS nb
      FROM t
      CROSS JOIN UNNEST(t.nb_by_date) AS el
      GROUP BY t.word, el.date
     ) AS s
GROUP BY s.word
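
To see why this stays under the row-size limit, it can help to run the inner aggregation on its own. The following is only a sketch on the toy data, assuming the array column is called nb_by_date as in the question; the ORDER BY is added here purely to make the output deterministic. The intermediate result is a flat table with one small row per (word, date) pair, so the array built afterwards holds one element per distinct date rather than one per input element.

```sql
-- Sketch: the first aggregation level only, on the toy data from the question.
WITH t AS (
  SELECT 'abc' AS word, [STRUCT('2020-01-01' AS date, 17 AS nb), STRUCT('2020-01-06' AS date, 43 AS nb)] AS nb_by_date UNION ALL
  SELECT 'abc' AS word, [STRUCT('2020-01-01' AS date, 33 AS nb), STRUCT('2020-01-05' AS date, 12 AS nb), STRUCT('2020-01-06' AS date, 5 AS nb)] UNION ALL
  SELECT 'def' AS word, [STRUCT('2020-01-02' AS date, 11 AS nb), STRUCT('2020-01-05' AS date, 8 AS nb)] UNION ALL
  SELECT 'def' AS word, [STRUCT('2020-01-02' AS date, 1 AS nb)]
)
SELECT t.word, el.date, SUM(el.nb) AS nb
FROM t
CROSS JOIN UNNEST(t.nb_by_date) AS el  -- one row per array element
GROUP BY t.word, el.date               -- collapse duplicates of (word, date)
ORDER BY t.word, el.date
-- Rows returned:
--   abc  2020-01-01  50
--   abc  2020-01-05  12
--   abc  2020-01-06  48
--   def  2020-01-02  12
--   def  2020-01-05   8
```

Note that the final ARRAY_AGG still produces one row per word, so if a single word has an extremely large number of distinct dates, the row-size limit could in principle still be reached.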

Regarding sql - BigQuery - Summing over an array of structs, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62998158/
