python - Reading a file triggered by an S3 event

Tags: python csv amazon-s3 aws-lambda serverless-framework

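Here is what I want to do:

  1. A user uploads a csv file to an AWS S3 bucket.
  2. Once the file is uploaded, the S3 bucket invokes the lambda function I created.
  3. My lambda function reads the contents of the csv file, then sends an email containing the file contents and information about the file.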

Local environment

Serverless Framework version 1.22.0

Python 2.7

Here is my serverless.yml file:

service: aws-python # NOTE: update this with your service name

provider:
  name: aws
  runtime: python2.7
  stage: dev
  region: us-east-1
  iamRoleStatements:
        - Effect: "Allow"
          Action:
              - s3:*
              - "ses:SendEmail"
              - "ses:SendRawEmail"
              - "s3:PutBucketNotification"
          Resource: "*"

functions:
  csvfile:
    handler: handler.csvfile
    description: send mail whenever a csv file is uploaded on S3 
    events:
      - s3:
          bucket: mine2
          event: s3:ObjectCreated:*
          rules:
            - suffix: .csv

Here is my lambda function:

import json
import boto3
import botocore
import logging
import sys
import traceback
import csv

from botocore.exceptions import ClientError
from pprint import pprint
from time import strftime, gmtime
from json import dumps, loads, JSONEncoder, JSONDecoder


#setup simple logging for INFO
logger = logging.getLogger()
logger.setLevel(logging.INFO)

from botocore.exceptions import ClientError

def csvfile(event, context):
    """Send email whenever a csvfile is uploaded to S3 """
    body = {}
    emailcontent = ''
    status_code = 200
    #set email information
    email_from = '****@*****.com'
    email_to = '****@****.com'
    email_subject = 'new file is uploaded'
    try:
        s3 = boto3.resource(u's3')
        s3 = boto3.client('s3')
        for record in event['Records']:
            filename = record['s3']['object']['key']
            filesize = record['s3']['object']['size']
            source = record['requestParameters']['sourceIPAddress']
            eventTime = record['eventTime']
        # get a handle on the bucket that holds your file
        bucket = s3.Bucket(u'mine2')
        # get a handle on the object you want (i.e. your file)
        obj = bucket.Object(key= event[u'Records'][0][u's3'][u'object'][u'key'])
        # get the object
        response = obj.get()
        # read the contents of the file and split it into a list of lines
        lines = response[u'Body'].read().split()
        # now iterate over those lines
        for row in csv.DictReader(lines):    
            print(row)
            emailcontent = emailcontent + '\n' + row 
    except Exception as e:
        print(traceback.format_exc())
        status_code = 500
        body["message"] = json.dumps(e)

    email_body = "File Name: " + filename + "\n" + "File Size: " + str(filesize) + "\n" +  "Upload Time: " + eventTime + "\n" + "User Details: " + source + "\n" + "content of the csv file :" + emailcontent
    ses = boto3.client('ses')
    ses.send_email(Source = email_from,
        Destination = {'ToAddresses': [email_to,],}, 
            Message = {'Subject': {'Data': email_subject}, 'Body':{'Text' : {'Data': email_body}}}
            )
    print('Function execution Completed')

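I don't know what I am doing wrong. When I only fetch the information about the file, that part works fine, but once I add the part that reads the file, the lambda function returns nothing.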

Best answer

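I suggest also adding CloudWatch access to your IAM policy. Your lambda function does not actually return anything, but you can see its log output in CloudWatch. Since you have already set up a logger, I strongly recommend using logger.info(message) instead of print.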

I hope this helps you debug your function.
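The logging advice above can be sketched roughly as follows. This is a minimal illustration, not part of the original answer: `describe_record` and `log_event` are hypothetical helper names, and the record layout is assumed to match the fields the question already reads from the S3 event.

```python
import logging

# Root logger at INFO level, as in the question; in Lambda, messages sent
# through it end up in the function's CloudWatch log stream.
logger = logging.getLogger()
logger.setLevel(logging.INFO)


def describe_record(record):
    """Build a one-line summary of one S3 event record (hypothetical helper)."""
    s3_info = record['s3']
    return 'bucket={} key={} size={}'.format(
        s3_info['bucket']['name'],
        s3_info['object']['key'],
        s3_info['object'].get('size'),
    )


def log_event(event):
    """Log a summary for every record in the event and return the summaries."""
    summaries = [describe_record(r) for r in event.get('Records', [])]
    for summary in summaries:
        logger.info(summary)
    return summaries
```

Calling `log_event(event)` at the top of the handler should make it easier to confirm in CloudWatch which bucket and key actually triggered the function.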

Apart from the sending part, this is how I would rewrite it (just tested in the AWS console):

import logging
import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client('s3')

def lambda_handler(event, context):
    email_content = ''

    # retrieve bucket name and file_key from the S3 event
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    file_key = event['Records'][0]['s3']['object']['key']
    logger.info('Reading {} from {}'.format(file_key, bucket_name))
    # get the object
    obj = s3.get_object(Bucket=bucket_name, Key=file_key)
    # get lines inside the csv
    lines = obj['Body'].read().split(b'\n')
    for r in lines:
        logger.info(r.decode())
        email_content = email_content + '\n' + r.decode()
    logger.info(email_content)
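
Two details in the question's code are worth spelling out. First, `s3` is rebound from a resource to a client, and a client has no `Bucket` method, so `s3.Bucket(...)` raises an AttributeError. Second, each `row` produced by `csv.DictReader` is a dict, so `emailcontent + '\n' + row` raises a TypeError. The sketch below shows one way to turn the downloaded bytes into email text while still using the csv module. It assumes Python 3 and UTF-8 encoded files; `object_key_from_record` and `csv_rows_to_text` are hypothetical helper names, not part of the original answer.

```python
import csv
import io
from urllib.parse import unquote_plus  # Python 3; Python 2.7 has urllib.unquote_plus


def object_key_from_record(record):
    """Return the decoded object key from one S3 event record."""
    # Keys arrive URL-encoded in S3 event notifications (a space shows up as '+').
    return unquote_plus(record['s3']['object']['key'])


def csv_rows_to_text(raw_bytes, encoding='utf-8'):
    """Parse CSV bytes and render each data row as 'column=value, ...'."""
    reader = csv.DictReader(io.StringIO(raw_bytes.decode(encoding)))
    lines = []
    for row in reader:
        # row is a dict, so format it explicitly instead of adding it to a str
        lines.append(', '.join(
            '{}={}'.format(name, row[name]) for name in reader.fieldnames
        ))
    return '\n'.join(lines)
```

Inside the handler above, something like `email_content = csv_rows_to_text(obj['Body'].read())` could replace the manual split and decode loop, with `file_key` taken from `object_key_from_record(event['Records'][0])`.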

For more on reading a file triggered by an S3 event in Python, see the similar question on Stack Overflow: https://stackoverflow.com/questions/46928105/
