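python - PyMySQL INSERT INTO query is ignored/cut off and does not run

Tags: python mysql amazon-web-services lambda pymysql

I have two INSERT INTO queries, each with its own conn and cursor. The sql_into_scores query runs and adds values to the database (the scores and statistics output), but sql_into_metrics does not. Both should add a list of values to the database. The error tells me there is a problem in the sql_into_metrics query (I also checked the other query separately). All of this runs in a Lambda function and stores the results in an RDS database. The MySQL version is 5.7.

The connect(user, etc.) call works: I also removed the sql_into_metrics query entirely, and the values from the sql_into_scores query were then stored in AWS RDS.

The data is pulled from S3 (in a .txt file) and read into two NumPy arrays by the read_files function. Example of the data in the text file: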

{"entity_id": [42, 11, 12], "score": [95, 95, 30], "TruePos": [0], "FalsePos": [1], "FalseNeg": [0], "TrueNeg": [0], "score_overall": [0.0], "precision": [0.0], "recall": [0.0], "fscore": [0.0], "support": [0.0]}

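I have tried the following to work out why the metrics query refuses to execute:

1) Changed the list the values are added from (metric_list) to the following formats: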

- [ , , , ]
- ( , , , )
- [ [ , , , ] ]

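2) Changed all values inside the list/tuple to float (and changed the placeholders to VALUES (%f, ...) accordingly).

3) Tried both inscurs.execute and inscurs.executemany.

4) Gave each query its own connection and cursor: conn and inscurs for sql_into_scores, and a second conn_2 and inscurs_2 used only for sql_into_metrics.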
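For reference on the formats tried above: executemany takes a sequence of rows, each row being a list or tuple with one value per %s placeholder, and PyMySQL uses %s for every value type. The following is a minimal sketch of that row shape, assuming the JSON layout from the sample file. The names to_rows and METRIC_KEYS are hypothetical and not part of the original code, and plain zip stands in for np.column_stack.

```python
import json

# Sample payload in the same shape as the .txt file shown in the question.
payload = json.loads(
    '{"entity_id": [42, 11, 12], "score": [95, 95, 30], '
    '"TruePos": [0], "FalsePos": [1], "FalseNeg": [0], "TrueNeg": [0], '
    '"score_overall": [0.0], "precision": [0.0], "recall": [0.0], '
    '"fscore": [0.0], "support": [0.0]}'
)

# Hypothetical constant: JSON keys in the column order of the metrics INSERT.
METRIC_KEYS = ["score_overall", "precision", "recall", "fscore", "support",
               "TruePos", "FalsePos", "FalseNeg", "TrueNeg"]


def to_rows(data, keys):
    """Zip the per-column lists into one tuple per database row."""
    return list(zip(*(data[k] for k in keys)))


score_rows = to_rows(payload, ["entity_id", "score"])
metric_rows = to_rows(payload, METRIC_KEYS)

print(score_rows)   # three rows of two values each
print(metric_rows)  # one row of nine values
```

Rows in this shape match the nine placeholders in sql_into_metrics, which suggests the shape of metric_list was not what made the query fail.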

import json
import traceback

import boto3
import numpy as np
import pymysql

def lambda_handler(event, context):

    '''
    Take the metric data collected after each run and
    store in RDS for performance tracking internally.
    '''

    # Insert in database
    conn = pymysql.connect(user=DB_USERNAME, password=DB_PASSWORD, host=DB_HOST, database=DB_DATABASE)
    conn_2 = pymysql.connect(user=DB_USERNAME, password=DB_PASSWORD, host=DB_HOST, database=DB_DATABASE)

    sql_into_metrics = "INSERT INTO dealscore_metrics_main (score_overall, precision, recall, fscore, support, true_pos, false_pos, false_neg, true_neg) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)"
    sql_into_scores = "INSERT INTO dealscore_scores (entity_id, score) VALUES (%s,%s)"

    inscurs = pymysql.cursors.Cursor(conn)
    inscurs_2 = pymysql.cursors.Cursor(conn_2)
    try:
        print("Adding the latest metric data into RDS...")
        score_data_s3,metric_data_s3 = read_files(prefix=prefix_to_metrics, bucket=BUCKET)
        metric_list = metric_data_s3.tolist()
        # Commented out: the last modifications tried to fix the problem (they didn't work;
        # this was the last attempt before reverting to the current output)
        # metric_list = [item for t in metric_list for item in t]
        # metric_list = [float(x) for x in metric_list]
        # print(type(metric_list[0]))

        score_data_s3 = score_data_s3.tolist()

        print("This is metric_list:", metric_list)
        print("This is score_data_s3:", score_data_s3)
        print(conn)
        try:
            inscurs_2.executemany(sql_into_metrics, metric_list)
            #inscurs_1.execute(sql_into_metrics, metric_list)
        except:
            print("did not upload")
            raise 

        inscurs.executemany(sql_into_scores, score_data_s3)
        print(inscurs)
        print(inscurs_2)

        conn.commit()
        conn_2.commit()

        print("Now deleting the file from s3")
        delete_s3_file(prefix=prefix_to_metrics)
        print("Delete finished, and")
    except Exception as e:
        print('Unable to add latest metrics to RDS, Error: ', e)
        print(traceback.format_exc())
        conn.rollback()
        conn_2.rollback()
    inscurs.close()
    inscurs_2.close()

    return print("The upload was successful, shutting down...")

# Where it pulls the data from (S3, .txt file): transforms it into NumPy arrays for
# scores and metrics and returns the two arrays, which are then converted to lists
# before executing the query. (This read_files function runs normally.)

def read_files(prefix, bucket):
    metric_bucket = bucket
    session = boto3.session.Session()
    REGION = session.region_name

    s3 = boto3.client('s3', region_name=REGION)
    txt_file = s3.get_object(Bucket=metric_bucket, Key=prefix)

    metrics = txt_file['Body'].read()
    df = json.loads(metrics)        

    df['entity_id'] = [int(x) for x in df['entity_id']]
    df['score'] = [int(x) for x in df['score']]
    df['TruePos'] = [int(x) for x in df['TruePos']]
    df['FalsePos'] = [int(x) for x in df['FalsePos']]
    df['FalseNeg'] = [int(x) for x in df['FalseNeg']]
    df['TrueNeg'] = [int(x) for x in df['TrueNeg']]
    df['score_overall'] = [float(x) for x in df['score_overall']]
    df['precision'] = [float(x) for x in df['precision']]
    df['recall'] = [float(x) for x in df['recall']]
    df['fscore'] = [float(x) for x in df['fscore']]
    df['support'] = [float(x) for x in df['support']]

    print(df)

    numpy_array_scores = np.column_stack(
        (
        df['entity_id'],
        df['score']
        )
        )
    numpy_array_metrics = np.column_stack(
        (
        df['score_overall'],
        df['precision'],
        df['recall'],
        df['fscore'],
        df['support'],
        df['TruePos'],
        df['FalsePos'],
        df['FalseNeg'],
        df['TrueNeg']


        )
        )
    print("The numpy array_scores:", numpy_array_scores)
    print("The numpy array_metrics:", numpy_array_metrics)
    return numpy_array_scores, numpy_array_metrics

This is the log output/error message from AWS Lambda. I expected the metric_list and score_data_s3 values (inside the lists) to be added to RDS.

START RequestId: .....
Connecting to the Dealscore RDS DB:
Adding the latest metric data into RDS...

{'entity_id': [42, 11, 12], 'score': [95, 95, 30], 'TruePos': [0], 'FalsePos': [1], 'FalseNeg': [0], 'TrueNeg': [0], 'score_overall': [0.0], 'precision': [0.0], 'recall': [0.0], 'fscore': [0.0], 'support': [0.0]}

The numpy array_scores: [[42 95]
 [11 95]
 [12 30]]

The numpy array_metrics: [[0. 0. 0. 0. 0. 0. 1. 0. 0.]]

This is metric_list: [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]

This is score_data_s3: [[42, 95], [11, 95], [12, 30]]
<pymysql.connections.Connection object at 0x7fa52ff33f28>

did not upload

Unable to add latest metrics to RDS, Error:  (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'precision, recall, fscore, support, true_pos, false_pos, false_neg, true_neg) VA' at line 1")

Traceback (most recent call last):

  File "/var/task/upload_to_rds.py", line 57, in lambda_handler
    inscurs_2.executemany(sql_into_metrics, metric_list)
  File "/var/task/pymysql/cursors.py", line 197, in executemany
    self._get_db().encoding)
  File "/var/task/pymysql/cursors.py", line 234, in _do_execute_many
    rows += self.execute(sql + postfix)
  File "/var/task/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/var/task/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/var/task/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/var/task/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/var/task/pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "/var/task/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/var/task/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/var/task/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)

pymysql.err.ProgrammingError: 

(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'precision, recall, fscore, support, true_pos, false_pos, false_neg, true_neg) VA' at line 1")

The upload was successful, shutting down...


END RequestId: ...
REPORT RequestId: ...   
Duration: 813.15 ms Billed Duration: 900 ms     Memory Size: 256 MB Max Memory Used: 103 MB

Best answer

The problem is that precision is a MySQL reserved word and you have used it as a column name, so you should wrap it in backticks and change the query to

sql_into_metrics = (
    "INSERT INTO dealscore_metrics_main "
    "(score_overall, `precision`, recall, fscore, support, "
    "true_pos, false_pos, false_neg, true_neg) "
    "VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)"
)
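One way to avoid this class of error altogether is to backtick-quote every identifier when the statement is built, so a column that happens to be a reserved word cannot break the syntax. Below is a minimal sketch of that idea. The helpers quote_identifier and build_insert are hypothetical names, not part of PyMySQL, and the sketch assumes table and column names come from trusted code rather than user input.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_insert(table, columns):
    """Build a parameterized INSERT statement with every identifier quoted."""
    cols = ", ".join(quote_identifier(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {quote_identifier(table)} ({cols}) VALUES ({placeholders})"


sql_into_metrics = build_insert(
    "dealscore_metrics_main",
    ["score_overall", "precision", "recall", "fscore", "support",
     "true_pos", "false_pos", "false_neg", "true_neg"],
)
print(sql_into_metrics)
```

The resulting string can be passed to executemany exactly as before. Only the identifiers are quoted here; the values still go through the %s placeholders.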

This question, "python - PyMySQL INSERT INTO query is ignored/cut off and does not run", comes from Stack Overflow: https://stackoverflow.com/questions/57689815/
