python - Inserting data into AWS Redshift via AWS Lambda

Tags: python amazon-web-services amazon-s3 aws-lambda

I am trying to do the following:

When I upload a CSV file to AWS S3, AWS Lambda should detect it, create a table in AWS Redshift, and store the data there. This process already works without Lambda, but I want to automate it.

So I created a Lambda function that detects the upload, checks for a CSV file, and so on; the trigger wiring is sketched below.
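A sketch of the bucket notification that invokes the function, assuming it is set up with boto3 (the bucket name, function ARN, and Id are placeholders):

import boto3

# Invoke the Lambda function whenever an object whose key ends in .csv is
# created in the bucket. All names and the ARN below are placeholders.
s3 = boto3.client('s3')
s3.put_bucket_notification_configuration(
    Bucket='my-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'Id': 'csv-upload-trigger',
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:my-function',
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {'Key': {'FilterRules': [{'Name': 'suffix', 'Value': '.csv'}]}},
        }]
    },
)

The same wiring can also be done from the S3 console; either way, S3 must be granted permission to invoke the function.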

Now, after fixing a few errors, I am getting an "error" that tells me nothing:

Loading function
START RequestId: e8baee71-c36b-11e5-b1cb-87083ac95a25 Version: $LATEST
END RequestId: e8baee71-c36b-11e5-b1cb-87083ac95a25
REPORT RequestId: e8baee71-c36b-11e5-b1cb-87083ac95a25  Duration: 67.04 ms  Billed Duration: 100 ms     Memory Size: 512 MB Max Memory Used: 44 MB  

Here is my Lambda Python file. It sits at the root of my zip file, and the zip also contains a folder named "psycopg2".
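For reference, a sketch of that layout (the handler filename here is hypothetical):

lambda_upload.zip
    lambda_function.py        (handler module shown below, at the zip root)
    psycopg2/                 (vendored psycopg2 package)

One caveat worth noting: psycopg2 ships C extensions, so the copy bundled into the zip has to be built for the Lambda runtime (Amazon Linux), not for the local development machine.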

from __future__ import print_function

import json
import urllib
import boto3
import psycopg2
import linecache

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')
    try:
        response = s3.get_object(Bucket=bucket, Key=key)

        #SEND MAIL WHEN CREATED

        #from = "email@gmail.com"
        #password = "password.email"
        #mail = smtplib.SMTP("smtp.gmail.com",587)
        #mail.ehlo()
        #mail.starttls()
        #mail.login(from,password)

        #recipient = "recipient.email"
        #mail.sendmail(from,recipient,key)


        #CREATE REDSHIFT TABLE WHEN CSV FILE UPLOADED
        if(key == "*.csv"):
            conn_string = "dbname=" + "xxxx" + " port=" + "5439" + " user=" + "yyyyy" + " password=" + "xxxxx*" + " host=" + "xxxxxxx.amazonaws.com";
            connection = psycopg2.connect(conn_string)
            cursor = connection.cursor();

            cursor.execute("select exists(select * from information_schema.tables where table_name=%s)", (key,))
            if(cursor.fetchone()[0]):
                return
            else:
                sqlcommand = 'create table ' + key + '('

                line = linecache.getline(key,1)
                line = line.replace(' ', '')
                line = line.replace('/', '')
                line = line.replace(':', '')
                line2 = linecache.getline(key,2)
                df1 = line
                df2 = line2
                output = ''
                output2 = ''
                for row1 in df1:
                    output = output + row1

                for row2 in df2:
                    output2 = output2 + row2

                new = output.split(',')
                new2 = output2.split(',')
                i = 0;
                for var in new:
                    new2[i] = new2[i].replace(' ', '')
                    sqlcommand = sqlcommand + var + ' ' + self._strType(new2[i])
                    i = i + 1;
                sqlcommand = sqlcommand[:-1]
                sqlcommand = sqlcommand + ');'

                cursor.execute(sqlcommand)
                connection.commit();

                print("CONTENT TYPE: " + response['ContentType'])
                return response['ContentType']
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
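For context, the handler pulls the bucket and key out of the S3 event it receives; abridged, that event has the following shape (values are placeholders):

# Abridged shape of the S3 event passed to lambda_handler (placeholder values).
event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'my-bucket'},
            'object': {'key': 'folder/my+file.csv'},  # URL-encoded, hence unquote_plus
        }
    }]
}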

Best Answer

This isn't an error. This is what success looks like: START, END, and REPORT with no traceback mean the invocation completed normally; the function simply didn't print anything.
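The log is silent most likely because key == "*.csv" is a literal string comparison, not a glob, so the branch that talks to Redshift never runs unless an object is literally named "*.csv". A hedged sketch of the fix (untested): match on the suffix instead, and read the header lines from the S3 response body, since linecache.getline(key, 1) looks for a file on the local filesystem, where the object does not exist:

import fnmatch

if key.endswith('.csv'):  # or: fnmatch.fnmatch(key, '*.csv')
    # The object lives in S3, not on local disk, so take the first two
    # lines from the response body instead of using linecache.
    body = response['Body'].read().decode('utf-8')
    lines = body.splitlines()
    line, line2 = lines[0], lines[1]

Note also that self._strType(...) will raise a NameError once this branch does run, since lambda_handler is a module-level function, not a method.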

Regarding "python - Inserting data into AWS Redshift via AWS Lambda", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34994594/
