database - Neo4j is slower than MySQL when executing recursive queries

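I wanted to compare Neo4j (ver. 3.1) and MySQL on recursive queries. To do so, I created two tables in a MySQL database: Customer and CustomerFriend.

The second table consists of the columns CustomerID and FriendID, both of which reference the CustomerID column of the Customer table. I created the corresponding entities in Neo4j:

Customer nodes and FRIEND_OF relationships, (c:Customer)-[f:FRIEND_OF]->(cc:Customer). Both databases were populated with the same data: 100000 customers, each with 100 relationships. I ran the following queries: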

MySQL (60s)

SELECT distinct cf4.FriendID FROM customerfriend cf1
join customerfriend cf2 on cf1.FriendID = cf2.CustomerID
join customerfriend cf3 on cf2.FriendID = cf3.CustomerID
join customerfriend cf4 on cf3.FriendID = cf4.CustomerID
where cf1.CustomerID =99;

Neo4j (240s)

match (c:Customer{CustomerID:99})-[:FRIEND_OF*4]->(cc:Customer)
return distinct cc.CustomerID;

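The queries were run from a simple Java application that just connects to the database (using the available connector), runs the query, and measures the execution time.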
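A minimal sketch of such a timing harness, for illustration only. The class name QueryTimer, the method medianMillis, and the placeholder workload are all hypothetical; in a real test the Callable would wrap the actual JDBC or Neo4j driver call. Running a few untimed warm-up iterations first lets each database fill its caches, so that a cold first run does not dominate the comparison.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

public class QueryTimer {

    /**
     * Runs the task `warmups` times without timing it, then `runs` times
     * with timing, and returns the median elapsed time in milliseconds
     * (the upper middle value when `runs` is even).
     */
    static long medianMillis(Callable<?> task, int warmups, int runs) throws Exception {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Warm-up iterations: results and timings are discarded.
        for (int i = 0; i < warmups; i++) {
            task.call();
        }
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(elapsed);
        return elapsed[runs / 2];
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload (an assumption): replace with code that
        // executes the query and fully consumes its result set.
        Callable<Integer> fakeQuery = () -> {
            Thread.sleep(20);
            return 0;
        };
        System.out.println("median ms: " + medianMillis(fakeQuery, 2, 5));
    }
}
```

Consuming the whole result inside the timed task matters, because both drivers may stream rows lazily and an unconsumed result would make the query look faster than it is.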

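The measured times clearly show that Neo4j is slower than MySQL on the queries above (MySQL 60s, Neo4j 240s). I also tested the same queries with 50 relationships per customer and got the same outcome (MySQL at 7s is faster than Neo4j at 17s).

I have read several articles about recursive queries in Neo4j, and they suggest that Neo4j should handle this kind of query better than MySQL. That is why I started to wonder whether I am doing something wrong or whether these execution times are to be expected.

I would like to know whether Neo4j offers any options for tuning performance. For MySQL, I set innodb_buffer_pool_size to 3g, which improved query performance (shorter execution time).

--------------------------------EDIT----------------------------

I followed the suggestion to rewrite my Neo4j query in a new form: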

match (c:Customer{CustomerID:99})-[:FRIEND_OF]->(c1)-[:FRIEND_OF]->(c2)
with distinct c2
match (c2)-[:FRIEND_OF]->(c3)
with distinct c3
match (c3)-[:FRIEND_OF]->(cc:Customer)
with distinct cc
return cc.CustomerID;

and got a better query time: 40s
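One way to see why deduplicating between hops helps: with 100 friends per customer, a 4-hop pattern can produce up to 100^4 = 100,000,000 paths, yet there are only 100000 customers, so most of those paths end at nodes that were already reached. The sketch below is a hypothetical in-memory model, not a description of Neo4j or MySQL internals. The class FrontierDemo, its methods, and the small random graph are illustrative assumptions. It compares the number of paths a plain 4-hop expansion enumerates with the number of edges followed when the frontier is deduplicated after every hop (slightly more aggressive than the query above, which first deduplicates after the second hop).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class FrontierDemo {

    /** Counts every path of exactly `depth` hops starting at `start`. */
    static long countPaths(List<List<Integer>> adj, int start, int depth) {
        if (depth == 0) {
            return 1;
        }
        long total = 0;
        for (int next : adj.get(start)) {
            total += countPaths(adj, next, depth - 1);
        }
        return total;
    }

    /**
     * Expands hop by hop, keeping only distinct nodes after each hop.
     * Returns {edges followed in total, distinct nodes after the last hop}.
     */
    static long[] expandWithDedup(List<List<Integer>> adj, int start, int depth) {
        Set<Integer> frontier = new HashSet<>();
        frontier.add(start);
        long edgesFollowed = 0;
        for (int hop = 0; hop < depth; hop++) {
            Set<Integer> next = new HashSet<>();
            for (int node : frontier) {
                for (int friend : adj.get(node)) {
                    next.add(friend); // the set drops duplicates
                    edgesFollowed++;
                }
            }
            frontier = next;
        }
        return new long[] {edgesFollowed, frontier.size()};
    }

    /** Builds a random graph where every node has `degree` distinct friends (requires degree < nodes). */
    static List<List<Integer>> randomGraph(int nodes, int degree, long seed) {
        Random rnd = new Random(seed);
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            Set<Integer> friends = new HashSet<>();
            while (friends.size() < degree) {
                int f = rnd.nextInt(nodes);
                if (f != i) {
                    friends.add(f);
                }
            }
            adj.add(new ArrayList<>(friends));
        }
        return adj;
    }

    public static void main(String[] args) {
        // Scaled-down stand-in for the 100000 x 100 data set (an assumption).
        List<List<Integer>> adj = randomGraph(1000, 10, 42L);
        long[] dedup = expandWithDedup(adj, 0, 4);
        System.out.println("paths enumerated without dedup: " + countPaths(adj, 0, 4));
        System.out.println("edges followed with dedup:      " + dedup[0]);
        System.out.println("distinct nodes at depth 4:      " + dedup[1]);
    }
}
```

In this model the deduplicated expansion can never follow more edges per hop than the number of nodes times the degree, whereas the path count grows by a factor of the degree at every hop.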

For MySQL, I also found a way to optimize the earlier query, following the same idea as the Neo4j rewrite:

select distinct FriendID as depth4
from customerfriend
where CustomerID in
(select distinct FriendID as depth3
from customerfriend
where CustomerID in
(select distinct FriendID as depth2
from customerfriend
where CustomerID in
(select distinct FriendID as depth
from customerfriend
where CustomerID =99
)));

This query took 24 seconds to execute.

Neo4j is still worse than MySQL...

Best answer

Could you try:

match (c:Customer{CustomerID:99})-[:FRIEND_OF]->(c1)-[:FRIEND_OF]->(c2)
with distinct c2
match (c2)-[:FRIEND_OF]->(c3)
with distinct c3
match (c3)-[:FRIEND_OF]->(cc)
with distinct cc
return cc.CustomerID;

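and share the query plans for both your original query and this one?

Update

To measure only the query time, without transferring the results over the wire, you can try running this: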

match (c:Customer{CustomerID:99})-[:FRIEND_OF]->(c1)-[:FRIEND_OF]->(c2)
with distinct c2
match (c2)-[:FRIEND_OF]->(c3)
with distinct c3
match (c3)-[:FRIEND_OF]->(cc)
with distinct cc
with cc.CustomerID as id
return count(*);

Regarding database - Neo4j is slower than MySQL when executing recursive queries, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41943800/
