sql - AWS Athena CTAS query fails and suggests emptying an already empty bucket

Tags: sql amazon-web-services amazon-s3 amazon-athena

I am running a "CREATE TABLE AS SELECT (CTAS)" query ( https://docs.aws.amazon.com/athena/latest/ug/ctas.html ); the query is copied at the bottom. I get the following error message:

HIVE_PATH_ALREADY_EXISTS: Target directory for table 'default.openaq_processed' already exists:
 s3://<processed-data-bucketname>/. You may need to manually clean the data at location 
's3://<athena-query-results-bucketname>/Unsaved/2021/04/29/tables/82025a35-8867-4865-8f42-f40adb6bee4c' 
before retrying. Athena will not delete data in your account.

This query ran against the "default" database, unless qualified by the query. Please post the
 error message on our forum or contact customer support with Query Id: 82025a35-8867-4865-8f42-f40adb6bee4c.

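The AWS Knowledge Center page for this error ( https://aws.amazon.com/premiumsupport/knowledge-center/athena-hive-path-already-exists/ ), which shows an error message similar to the one above, suggests that the fix is to make sure the location used to store the query results is empty.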

But it already is. In fact, there is no tables/ prefix or folder under s3://<athena-query-results-bucketname>/Unsaved/2021/04/29/, and the s3://<processed-data-bucketname>/ bucket is completely empty.

I have posted the question on the AWS forums but have not received a reply yet. How can I get this CTAS query to succeed?

Update

The query that throws the error:

CREATE TABLE openaq_processed
WITH (format='PARQUET', 
parquet_compression='SNAPPY', 
partitioned_by=array['country', 'parameter'], 
external_location = '<processed-data-bucketname>') 
AS
SELECT date_utc as date_utc_str,
date_local as date_local_str,
CAST(from_iso8601_timestamp(date_utc) as timestamp) as timestamp_utc,
CAST(from_iso8601_timestamp(date_local) as timestamp) as timestamp_local,
"location",  -- location is a reserved word for Athena, needs quotes
value,
unit,
city,
attribution,
averagingperiod,
coordinates."latitude" as latitude,
coordinates."longitude" as longitude,
sourcename,
sourcetype,
mobile,
country,
parameter
FROM openaq_pq2_tables

Best answer

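So I turned to AWS Developer Support and raised this issue. The response I got, which did fix the error, was to create a folder in my external_location bucket. I don't know why this is necessary, but apparently it is.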

So, from the shell: $ aws s3 mb s3://<processed-data-bucketname>/processed_data/

(The mb above stands for "make bucket".)

Then change external_location = 's3://<processed-data-bucketname>' in the query above to external_location = 's3://<processed-data-bucketname>/processed_data/'.
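For reference, here is a minimal sketch of what the adjusted CTAS statement might look like after that change. It assumes the processed_data/ prefix from the answer and shortens the SELECT list to a few columns for illustration only; the full column list from the question would go in its place. The partition columns (country, parameter) stay last in the SELECT list, as Athena requires for partitioned CTAS tables.

```sql
-- Sketch only: same table and options as in the question, but with
-- external_location pointing at a prefix inside the bucket rather than
-- at the bucket root. The SELECT list is shortened for illustration.
CREATE TABLE openaq_processed
WITH (
  format = 'PARQUET',
  parquet_compression = 'SNAPPY',
  partitioned_by = ARRAY['country', 'parameter'],
  external_location = 's3://<processed-data-bucketname>/processed_data/'
)
AS
SELECT
  value,
  unit,
  city,
  country,    -- partition column, must come at the end
  parameter   -- partition column, must come at the end
FROM openaq_pq2_tables
```

If an earlier attempt wrote any files under that prefix, it would still need to be emptied before retrying, since Athena does not delete data at the target location.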

Source: a similar question on Stack Overflow, "sql - AWS Athena CTAS query fails and suggests emptying an already empty bucket": https://stackoverflow.com/questions/67373870/
