I am trying to use ScyllaDB with Python, but it is very slow. When I run the example code shown at the bottom, I get:
26:23:109998
26:23:112695
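I care about getting the best possible performance, and unfortunately adding data to the database takes far too long here. Is there any way to speed this up?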
print(datetime.now().strftime("%M:%S:%f"))
session.execute(
    """
    INSERT INTO log (id, date, message)
    VALUES (now(), %s, %s)
    """,
    (date, message)
)
print(datetime.now().strftime("%M:%S:%f"))
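Update
Following the recommendations in this thread, I decided to use prepared statements and batches, as described in the official documentation, to improve the performance of adding data to ScyllaDB. My code currently looks like the following, but efficiency has not changed significantly. Any other ideas?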
print("time 0: " + str(datetime.now()))
query = "INSERT INTO message (id, message) VALUES (uuid(), ?)"
prepared = session.prepare(query)
for key in range(100):
    print(key)
    try:
        batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
        for key in range(100):
            batch.add(prepared, ("example message",))
        session.execute(batch)
    except Exception as e:
        print("An error occured : " + str(e))
        pass
print("time 1: " + str(datetime.now()))
After running this source code, the result looks like this:
test 0: 2018-06-19 11:10:13.990691
0
1
...
41
An error occured : Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out for messages.message - received only 1 responses from 2 CL=QUORUM." info={'write_type': 'BATCH', 'required_responses': 2, 'consistency': 'QUORUM', 'received_responses': 1}
42
...
52
An error occured : errors={'....0.3': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=.....0.3
53
An error occured : Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out for messages.message - received only 1 responses from 2 CL=QUORUM." info={'write_type': 'BATCH', 'required_responses': 2, 'consistency': 'QUORUM', 'received_responses': 1}
54
...
59
An error occured : Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out for messages.message - received only 1 responses from 2 CL=QUORUM." info={'write_type': 'BATCH', 'required_responses': 2, 'consistency': 'QUORUM', 'received_responses': 1}
60
61
62
...
69
70
71
An error occured : errors={'.....0.2': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=.....0.2
72
An error occured : errors={'....0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=....0.1
73
74
...
98
99
test 1: 2018-06-19 11:11:03.494957
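Best answer
Several factors can limit your performance. Start with the Scylla server configuration: for example, whether you created the cluster on very small instances with slow networking. Next, look at the client hardware and the workload on the client instance itself, along with the number of connections per host, the number of threads per connection, and other tunables on the driver/connector side. Finally, use prepared statements to write data to Scylla more efficiently.
Knowing more about the environment you are using and the purpose of the workload would help in recommending a more specific course of action.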
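As one illustration of combining prepared statements with client-side concurrency, here is a minimal sketch rather than a definitive implementation. It assumes `session` behaves like the Python driver's `cassandra.cluster.Session` (`prepare()` and `execute_async()`, with futures exposing `result()`). The helper name `insert_concurrently` and the `window` parameter are hypothetical, and a good window size depends on your cluster and client.

```python
from itertools import islice

# Same statement as in the question; the table and columns come from there.
INSERT_CQL = "INSERT INTO message (id, message) VALUES (uuid(), ?)"


def insert_concurrently(session, messages, window=100):
    """Insert messages keeping up to `window` requests in flight at once.

    `session` is assumed to behave like cassandra.cluster.Session:
    prepare(cql) returns a statement, and execute_async(stmt, params)
    returns a future whose result() blocks and raises on failure.
    Returns a (succeeded, failed) tuple of counts.
    """
    prepared = session.prepare(INSERT_CQL)  # parse the CQL once, then reuse it
    remaining = iter(messages)
    succeeded = failed = 0
    while True:
        chunk = list(islice(remaining, window))
        if not chunk:
            break
        # Send the whole window without waiting, then collect the results.
        futures = [session.execute_async(prepared, (msg,)) for msg in chunk]
        for future in futures:
            try:
                future.result()
                succeeded += 1
            except Exception:
                # Timeouts and unavailable errors surface here; count them
                # so the caller can decide whether to retry.
                failed += 1
    return succeeded, failed
```

With a real connection, `session` would come from `Cluster([...]).connect(keyspace)`, and a call such as `insert_concurrently(session, ["example message"] * 10000)` would replace the batch loop. The driver also ships `cassandra.concurrent.execute_concurrent_with_args`, which serves a similar purpose. Unlike a batch, each row here is an independent write.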
Regarding python - Is it possible to add data more efficiently using ScyllaDB for Python?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50719426/