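I am trying to measure the processing time of each SQL query. I need to run some SQL queries several times, each time with a randomly generated date range, so I need to save all the results produced inside the query loop, each in a separate DataFrame.
I tried using globals(), but the problem is that I cannot get the shape of the results stored in these lists.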
import MySQLdb
import random
from random import randint
import datetime
from datetime import timedelta
import time
import numpy as np
import pandas as pd
db_connection = MySQLdb.connect(host="localhost", user="root", passwd="050194.Piku", db = "lineitem")
cursor = db_connection.cursor()
for x in range(2):
    date_range1 = datetime.date(randint(1992, 1995), randint(1, 12), randint(1, 30))
    date_range2 = datetime.date(randint(1996, 1998), randint(1, 12), randint(1, 30))
    mdate1 = str(date_range1.year) + "-" + str(date_range1.month) + "-" + str(date_range1.day)
    mdate2 = str(date_range2.year) + "-" + str(date_range2.month) + "-" + str(date_range2.day)
    orderkey = str(randint(1, 6000000))
    lineitem_extended_price_range1 = round(random.uniform(900, 90000), 5)
    lineitem_extended_price_range2 = round(random.uniform(90001, 110000), 5)
    lineitem_ext_price1 = str(lineitem_extended_price_range1)
    lineitem_ext_price2 = str(lineitem_extended_price_range2)
    order_total_price_range1 = round(random.uniform(850, 85000), 5)
    order_total_price_range2 = round(random.uniform(85001, 560000), 5)
    order_total_price1 = str(order_total_price_range1)
    order_total_price2 = str(order_total_price_range2)
    sql_query_lineitem1 = "SELECT * FROM lineitem_table WHERE L_SHIPDATE BETWEEN '" + mdate1 + "' AND '" + mdate2 + "' LIMIT 10;"
    # sql_query_lineitem2 = "SELECT * FROM lineitem_table WHERE L_EXTENDEDPRICE BETWEEN '" + lineitem_ext_price1 + "' AND '" + lineitem_ext_price2 + "';"
    # sql_query_lineitem3 = "SELECT * FROM lineitem_table WHERE L_ORDERKEY = '" + orderkey + "';"
    # sql_query_order4 = "SELECT * FROM order_table WHERE O_ORDERKEY = '" + orderkey + "';"
    # sql_query_order5 = "SELECT * FROM order_table WHERE O_ORDERDATE BETWEEN '" + mdate1 + "' AND '" + mdate2 + "';"
    # sql_query_order6 = "SELECT * FROM order_table WHERE O_TOTALPRICE BETWEEN '" + order_total_price1 + "' AND '" + order_total_price2 + "';"
    # sql_query_join = "SELECT * FROM lineitem_table INNER JOIN order_table ON lineitem_table.L_ORDERKEY = order_table.O_ORDERKEY;"
    globals()["mdate1" + str(x)] = mdate1
    globals()["mdate2" + str(x)] = mdate2
    globals()["ext_price1" + str(x)] = lineitem_ext_price1
    globals()["ext_price2" + str(x)] = lineitem_ext_price2
    globals()["orderkey" + str(x)] = orderkey
    globals()["total_price1" + str(x)] = order_total_price1
    globals()["total_price2" + str(x)] = order_total_price2
    #average_execution_sum = 0
    #initial_time1 = time.time()
    cursor.execute(sql_query_lineitem1)
    d = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #time_taken1 = time.time() - initial_time
    # cursor.execute(sql_query_lineitem2)
    # globals()["df_02" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
    #
    # cursor.execute(sql_query_lineitem3)
    # globals()["df_03" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
    #
    # cursor.execute(sql_query_order4)
    # globals()["df_04" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
    #
    # cursor.execute(sql_query_order5)
    # globals()["df_05" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
    #
    # cursor.execute(sql_query_order6)
    # globals()["df_06" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
    #
    # cursor.execute(sql_query_join)
    # globals()["df_03" + str(x)] = pd.DataFrame.from_records(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
    #
cursor.close()
db_connection.close()
print(df_010.shape(0))
TypeError: 'tuple' object is not callable
Best answer
Maybe you can try using
df = pd.read_sql_query("SELECT * FROM table_name", db_connection)
to store the table selected by the SQL query in a pandas DataFrame df.
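To connect this back to the question, here is a minimal sketch of how the per-iteration results could be kept in a dictionary of DataFrames instead of globals(), with each query timed separately. It is an illustration under stated assumptions, not the asker's setup: an in-memory SQLite table with made-up rows stands in for the MySQL database, random_date is a hypothetical helper, and the `?` placeholders are SQLite's style (MySQLdb uses `%s`). It also shows that shape is a tuple attribute, which is why df.shape(0) raises the TypeError and df.shape[0] does not.

```python
import random
import sqlite3
import time
from datetime import date, timedelta

import pandas as pd


def random_date(start: date, end: date) -> date:
    """Hypothetical helper: a random valid date between start and end, inclusive."""
    return start + timedelta(days=random.randint(0, (end - start).days))


# Assumption: an in-memory SQLite table with made-up rows replaces the
# MySQL connection so that the sketch runs without a database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lineitem_table (L_ORDERKEY INTEGER, L_SHIPDATE TEXT)")
conn.executemany(
    "INSERT INTO lineitem_table VALUES (?, ?)",
    [(i, f"199{2 + i % 7}-06-15") for i in range(1, 51)],
)

# Parameter placeholders avoid building the SQL string by concatenation.
query = "SELECT * FROM lineitem_table WHERE L_SHIPDATE BETWEEN ? AND ? LIMIT 10"

frames = {}   # one DataFrame per iteration, keyed by the loop index
timings = {}  # elapsed seconds per iteration

for x in range(2):
    start = random_date(date(1992, 1, 1), date(1995, 12, 31)).isoformat()
    end = random_date(date(1996, 1, 1), date(1998, 12, 31)).isoformat()

    t0 = time.perf_counter()
    frames[x] = pd.read_sql_query(query, conn, params=(start, end))
    timings[x] = time.perf_counter() - t0

conn.close()

for x, df in frames.items():
    # shape is a tuple, so it is indexed with [0], not called with (0)
    print(x, df.shape[0], f"{timings[x]:.6f}s")
```

Keying the results by the loop index means frames[0] and frames[1] can be looked up or iterated over later without generating variable names at runtime. Drawing the dates from a day offset also avoids invalid combinations such as February 30, which datetime.date would reject.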
Regarding "mysql - How to store each SQL result in a DataFrame generated in a loop?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57236558/