I want to use a combination of aws s3api list-objects and jq to generate a manifest file for Redshift COPY, as follows:
aws s3api list-objects --bucket annalects3 --prefix "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression" --output json --query '{"entries": Contents[].{"url":"Key"}}' | jq '.entries[].mandatory = true'
This produces the following output:
{ "entries": [
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092507_20160926_002328_292527438.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092508_20160926_020131_292592736.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092509_20160926_030312_292502379.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092510_20160926_033656_292590227.csv.gz"
}
]
}
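However, the manifest file needs each URL to be a full object URL prefixed with the bucket name, and I haven't worked out how to do that. The output needs to look like this: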
{ "entries": [
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092507_20160926_002328_292527438.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092508_20160926_020131_292592736.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092509_20160926_030312_292502379.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092510_20160926_033656_292590227.csv.gz"
}
]
}
Best answer
The following will do what you want:
aws s3api list-objects \
--bucket <mybucket> \
--prefix "<myprefix>" \
--output json \
--query '{"entries": Contents[].{"url":"Key"}}' \
| jq '.entries[] |= (.url = "s3://<mybucket>/\(.url)" | .mandatory = true)'
This uses jq string interpolation to update each entries[].url value.
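To try the filter without touching S3, here is a minimal sketch that runs it against inline sample data. The bucket name `mybucket` and the two shortened keys are placeholders made up for illustration, and passing the bucket in with `--arg` is just one convenient way to avoid hard-coding it in the filter. The `|=` update operator rewrites each entry in place, so the enclosing `{"entries": [...]}` object is kept in the output.

```shell
#!/bin/sh
# Hypothetical bucket name; substitute your own.
bucket="mybucket"

# Sample input shaped like the output of the list-objects --query above
# (the keys are shortened placeholders, not real objects).
sample='{"entries":[{"url":"DFA/a.csv.gz"},{"url":"DFA/b.csv.gz"}]}'

# For every entry: prefix the key with s3://<bucket>/ via string
# interpolation, then add the mandatory flag. --arg exposes the shell
# variable to jq as $bucket.
manifest=$(printf '%s' "$sample" | jq -c --arg bucket "$bucket" \
  '.entries[] |= (.url = "s3://\($bucket)/\(.url)" | .mandatory = true)')

printf '%s\n' "$manifest"
```

In a real run, the `printf '%s' "$sample"` stage would be replaced by the `aws s3api list-objects ... --query ...` command, and the result redirected to a file for COPY to read.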
Source: "json - jq to prefix a string in a JSON object", a question on Stack Overflow: https://stackoverflow.com/questions/39785343/