python - Using Lambda Python to write JSON to a Parquet object to put into S3

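I want to use AWS Lambda (Python) to write a JSON object to S3 in Parquet format.

However, I cannot connect the fastparquet library with boto3: fastparquet's write method writes to a file, while boto3 expects an object to put into an S3 bucket.

Any suggestions?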

fastparquet example

import fastparquet
fastparquet.write('test.parquet', df, compression='GZIP', file_scheme='hive')

boto3 example

client = authenticate_s3()
response = client.put_object(Body=Body, Bucket=Bucket, Key=Key)

Here Body would be the Parquet content, which is what would allow it to be written to S3.

Best answer

You can write any DataFrame to S3 by using the open_with parameter of the write method (see fastparquet's documentation).

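import s3fs
from fastparquet import write

# S3FileSystem resolves AWS credentials from the environment (e.g. the Lambda role)
s3 = s3fs.S3FileSystem()
myopen = s3.open  # fastparquet calls this instead of the built-in open()
write(
    'bucket-name/filename.parq.gzip',
    frame,  # the pandas DataFrame to write
    compression='GZIP',
    open_with=myopen
)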
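If you would rather keep the boto3 put_object call from the question, one option is to serialize the DataFrame to Parquet in memory and pass the bytes as Body. The following is only a minimal sketch under some assumptions: it uses pandas with the pyarrow engine instead of fastparquet, and the names records_to_parquet_bytes, put_json_as_parquet and the serialize parameter are hypothetical helpers introduced for illustration.

```python
import io
import json


def records_to_parquet_bytes(records):
    """Serialize a list of dicts to gzip-compressed Parquet bytes in memory."""
    # Imported lazily so the module still loads where pandas is not installed.
    import pandas as pd

    frame = pd.DataFrame(records)
    buffer = io.BytesIO()
    # Assumption: pyarrow is available; it is used here in place of fastparquet.
    frame.to_parquet(buffer, engine="pyarrow", compression="gzip", index=False)
    return buffer.getvalue()


def put_json_as_parquet(payload, bucket, key, client,
                        serialize=records_to_parquet_bytes):
    """Convert a JSON payload to Parquet bytes and upload it with put_object."""
    # Accept either a JSON string or an already-decoded object.
    records = json.loads(payload) if isinstance(payload, str) else payload
    # A single JSON object becomes a one-row table.
    if isinstance(records, dict):
        records = [records]
    body = serialize(records)
    # client is expected to behave like boto3.client("s3").
    return client.put_object(Body=body, Bucket=bucket, Key=key)
```

Inside a Lambda handler this could be called with client = boto3.client('s3') and the JSON taken from the event. Passing the client and the serializer in as arguments keeps the function easy to exercise without touching S3.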

This question, python - Using Lambda Python to write JSON to a Parquet object to put into S3, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/42721341/
