I am trying to run serverless LibreOffice based on this tutorial. Here is the complete Python lambda function:
import os

import boto3

s3_bucket = boto3.resource("s3").Bucket("lambda-libreoffice-demo")

# Runs once per cold start: download the LibreOffice archive and unpack it into /tmp
os.system("curl https://s3.amazonaws.com/lambda-libreoffice-demo/lo.tar.gz -o /tmp/lo.tar.gz && cd /tmp && tar -xf /tmp/lo.tar.gz")

convertCommand = "instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp"

def lambda_handler(event, context):
    inputFileName = event['filename']
    # Download the object to be converted from S3
    with open(f'/tmp/{inputFileName}', 'wb') as data:
        s3_bucket.download_fileobj(inputFileName, data)
    # Execute LibreOffice to convert the input file
    os.system(f"cd /tmp && {convertCommand} {inputFileName}")
    # Save the converted object in S3
    outputFileName, _ = os.path.splitext(inputFileName)
    outputFileName = outputFileName + ".pdf"
    with open(f"/tmp/{outputFileName}", "rb") as f:
        s3_bucket.put_object(Key=outputFileName, Body=f, ACL="public-read")
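The response when running the complete script is:
"errorMessage": "ENOENT: no such file or directory, open '/tmp/example.pdf'",
So I started debugging it line by line.
According to my debug prints, it seems to fail right at the start, when trying to extract the binary on the second line: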
os.path.exists('/tmp/lo.tar.gz') // => true
os.path.exists('/tmp/instdir/program/soffice.bin') // => false
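So it looks like tar is the problematic part.
If I download the file from S3 and run the tar command locally, it seems to extract the files just fine.
I tried Node, Python 3.8, and Python 3.6.
I also tried with and without layers (and the /opt/lo.tar.br path) as described here.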
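One way to narrow down a failure like this is to stop using os.system, which discards the command's error output, and capture the exit code and stderr instead. The following is a minimal sketch, not part of the original function; the helper name run_checked is hypothetical, and it only relies on the standard library subprocess module.

```python
import subprocess


def run_checked(command, cwd=None):
    """Run a command (given as a list of arguments) and return
    (exit code, stdout, stderr) so that failures are visible in the logs."""
    result = subprocess.run(
        command,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text; works on Python 3.6 too
    )
    return result.returncode, result.stdout, result.stderr
```

For example, calling run_checked(["tar", "-xf", "/tmp/lo.tar.gz"], cwd="/tmp") and printing the three returned values should show whether tar exited with a non-zero code and what it complained about.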
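Best answer
I ran into the same problem.
I suspect the problem is wrong permissions for executing files in /tmp.
Try copying instdir/ to your home folder and running it from there.
Please write back to confirm whether you have tested this!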
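A quick way to test the permissions suspicion is to check whether the extracted binary exists and whether the current process is allowed to execute it. This is only a sketch under that assumption; the helper name describe_executable is hypothetical, and it uses nothing beyond the standard library.

```python
import os
import stat


def describe_executable(path):
    """Report whether path exists, whether this process may execute it,
    and its permission string (for example '-rwxr-xr-x')."""
    if not os.path.exists(path):
        return {"exists": False, "executable": False, "mode": None}
    mode = stat.filemode(os.stat(path).st_mode)
    return {"exists": True, "executable": os.access(path, os.X_OK), "mode": mode}
```

Printing describe_executable("/tmp/instdir/program/soffice.bin") inside the handler would distinguish a file that was never extracted from one that is present but not executable.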
I ended up creating a Docker container that installs LibreOffice properly, for example:
# Use Amazon Linux 2 (It's based on CentOS) as base image
FROM amazon/aws-lambda-provided:al2
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Download and install LibreOffice (and deps)
RUN yum update -y \
&& yum clean all \
&& yum install -y wget tar gzip
RUN cd /tmp \
&& wget http://download.documentfoundation.org/libreoffice/stable/7.0.4/rpm/x86_64/LibreOffice_7.0.4_Linux_x86-64_rpm.tar.gz \
&& tar -xvf LibreOffice_7.0.4_Linux_x86-64_rpm.tar.gz
# For some reason we need to "clean all"
RUN cd /tmp/LibreOffice_7.0.4.2_Linux_x86-64_rpm/RPMS \
&& yum clean all \
&& yum -y localinstall *.rpm
# Required deps for soffice
RUN yum -y install \
fontconfig libXinerama.x86_64 cups-libs dbus-glib cairo libXext libSM libXrender
# NOTE: Should we install libreoffice-writer? (doesn't seem to be required)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# We need to read/write to S3 bucket
RUN yum -y install \
awscli \
jq
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# We test with this file
COPY test-template.docx /home/test-template.docx
# This code derives from Ari's original article
COPY process_doc.sh /home/process_doc.sh
COPY bootstrap /var/runtime/bootstrap
COPY function.sh /var/task/function.sh
RUN chmod u+rx \
/home/process_doc.sh \
/var/runtime/bootstrap \
/var/task/function.sh
CMD [ "function.sh.handler" ]
# ^ Why CMD not ENTRYPOINT
... and running a containerized lambda: https://github.com/p-i-/lambda-container-image-with-custom-runtime-example
Regarding amazon-s3 - AWS lambda tar file extraction doesn't seem to work, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65884502/