SQL: Unnest an array while keeping the same number of rows in AWS Athena

Tags: sql amazon-web-services amazon-athena presto unnest

Given the following query:

SELECT internal_transaction_id, tags FROM "bankstatements"."statements_transactions_sample_data"

I get the following table:

+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| internal_transaction_id     | tags                                                                                                                                                                                                                                                                                                                                                                   |
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2173059                     | [{category=null, creditdebit=credit, lendertype=null, pending=null, pre_authorisation=null, thirdparty=null}, {category=null, creditdebit=null, lendertype=null, pending=null, pre_authorisation=null, thirdparty=Internal Transfer Credit}, {category=Internal Transfer, creditdebit=null, lendertype=null, pending=null, pre_authorisation=null, thirdparty=null}] |
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2173061                     | [{category=null, creditdebit=credit, lendertype=null, pending=null, pre_authorisation=null, thirdparty=null}, {category=null, creditdebit=null, lendertype=null, pending=null, pre_authorisation=null, thirdparty=UBER}, {category=External Transfer, creditdebit=null, lendertype=null, pending=null, pre_authorisation=null, thirdparty=null}] |
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I want to unnest the "tags" column while keeping the same number of rows. My current query:

SELECT 
internal_transaction_id, t.category, t.creditdebit, t.lendertype, t.pending, t.pre_authorisation, t.thirdparty
FROM 
"bankstatements"."statements_transactions_sample_data"
CROSS JOIN UNNEST(tags) AS tag (t)  

Result:

+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| internal_transaction_id     | category            | creditdebit   | lendertype    | pending   | pre_authorisation     | thirdparty                |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173059                     |                     | credit        |               |           |                       |                           |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173059                     |                     |               |               |           |                       | Internal Transfer Credit  |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173059                     | Internal Transfer   |               |               |           |                       |                           |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173061                     |                     | credit        |               |           |                       |                           |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173061                     |                     |               |               |           |                       | UBER                      |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173061                     | External Transfer   |               |               |           |                       |                           |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+

I would like to know how to unnest the tags so that I end up with only 2 rows, like this:

+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| internal_transaction_id     | category            | creditdebit   | lendertype    | pending   | pre_authorisation     | thirdparty                |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173059                     | Internal Transfer   | credit        |               |           |                       | Internal Transfer Credit  |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+
| 2173061                     | External Transfer   |               |               |           |                       | UBER                      |
+-----------------------------+---------------------+---------------+---------------+-----------+-----------------------+---------------------------+

In the JSON, the tags look like this:

[
    {
        "thirdParty": "Other Credits"
    },
    {
        "category": "All Other Credits"
    },
    {
        "creditDebit": "credit"
    }
]

This is how I defined the column in the CREATE TABLE statement:

tags: array<
    struct<
        category: string,
        creditdebit: string,
        lendertype: string,
        pending: string,
        pre_authorisation: string,
        thirdparty: string                                            
    >
>

Best answer

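Aggregate with min or max, grouping by internal_transaction_id. Both functions ignore NULLs, so each column collapses to its single non-null value per transaction, or stays NULL when no array element sets it: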

SELECT
    internal_transaction_id,
    max(t.category) AS category,
    max(t.creditdebit) AS creditdebit,
    max(t.lendertype) AS lendertype,
    max(t.pending) AS pending,
    max(t.pre_authorisation) AS pre_authorisation,
    max(t.thirdparty) AS thirdparty
FROM "bankstatements"."statements_transactions_sample_data"
CROSS JOIN UNNEST(tags) AS tag (t)
GROUP BY internal_transaction_id
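
To see why the aggregation collapses the rows, here is a small sketch that can be run locally. It is not Athena: it uses Python's built-in sqlite3 module as a stand-in, and the table name "unnested" and the helper collapse() are made up for illustration. The rows are typed in by hand from the unnested result shown in the question, so the UNNEST step itself is assumed rather than reproduced. SQLite's max(), like Presto's, skips NULL inputs.

```python
import sqlite3

# Rows as they look AFTER the CROSS JOIN UNNEST step in the question:
# one row per array element, with None standing in for SQL NULL.
UNNESTED_ROWS = [
    (2173059, None, "credit", None, None, None, None),
    (2173059, None, None, None, None, None, "Internal Transfer Credit"),
    (2173059, "Internal Transfer", None, None, None, None, None),
    (2173061, None, "credit", None, None, None, None),
    (2173061, None, None, None, None, None, "UBER"),
    (2173061, "External Transfer", None, None, None, None, None),
]


def collapse(rows):
    """Group the unnested rows back to one row per transaction id.

    Hypothetical helper: loads the rows into an in-memory SQLite table
    and applies the same max(...) GROUP BY pattern as the answer.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE unnested (
            internal_transaction_id INTEGER,
            category TEXT,
            creditdebit TEXT,
            lendertype TEXT,
            pending TEXT,
            pre_authorisation TEXT,
            thirdparty TEXT
        )
        """
    )
    conn.executemany("INSERT INTO unnested VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
            internal_transaction_id,
            max(category),
            max(creditdebit),
            max(lendertype),
            max(pending),
            max(pre_authorisation),
            max(thirdparty)
        FROM unnested
        GROUP BY internal_transaction_id
        ORDER BY internal_transaction_id
        """
    ).fetchall()
    conn.close()
    return result


collapsed = collapse(UNNESTED_ROWS)
for row in collapsed:
    print(row)
```

One thing the sketch makes visible: for 2173061 the aggregated creditdebit comes out as credit, because that value is present in the source array, even though the desired table in the question leaves that cell blank. If two elements of the same array could set the same field to different values, max would silently keep only the alphabetically largest one, so this approach assumes at most one non-null value per field per transaction.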

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/66739291/
