graph - How to improve the performance of a count query

The goal of the query is to find the number of vertices and edges returned. The query is:

g.inject(1).union(V().has('property1', 'A').aggregate('v').outE().has('property1', 'E').aggregate('e').inV().has('property1', 'B').aggregate('v')).select('v').dedup().as('vertexCount').select('e').dedup().as('edgeCount').select('vertexCount','edgeCount').by(unfold().count())

Output: vertexCount: 200k, edgeCount: 250k. Time taken: 1.5 minutes.

I tried to optimize the query with the following:

g.inject(1).union(V().has('property1', 'A').as('v1').outE().has('property1', 'E').as('e').inV().has('property1', 'B').as('v2')).select('v1','e','v2').by(valueMap().by(unfold())).count()

Output: 250k. Time taken: 30 seconds. It returns only the edge count.

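How can we optimize the query so that it returns both the vertex count and the edge count, and limits the vertices or edges when needed?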

Best answer

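I'm not sure I can offer anything groundbreaking, but it seems your second query gets faster simply by removing the processing that the count does not need: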

g.V().has('property1', 'A').
  outE().has('property1', 'E').
  inV().has('property1', 'B').
  count()

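I can imagine that if "property1" (for "A") is indexed, removing the inject()/union() will allow that index to be hit. I'm not sure whether JanusGraph will optimize the query as it stands with inject()/union(), and neither step seems to serve a purpose. Depending on the nature of "property1" for "E", a vertex-centric index might help as well. The select().by() seems unnecessary and potentially costly, because it enables path tracking and forces a Map transformation that you then throw away at count().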
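One way to check the index guess is to profile both forms and compare the output. The sketch below is an assumption-laden illustration, not a confirmed diagnosis: it reuses the property names from the question, drops the as()/select() parts of the second query, and relies on the standard TinkerPop profile() step. What the output reports (for example, whether an index was used) depends on the graph provider and version.

```groovy
// Simplified count traversal, profiled. Look at the first step's timing and
// any provider annotations to see whether the lookup on 'property1' hit an index.
g.V().has('property1', 'A').
  outE().has('property1', 'E').
  inV().has('property1', 'B').
  count().
  profile()

// The same traversal wrapped in inject()/union(), as in the question, for comparison.
g.inject(1).
  union(V().has('property1', 'A').
        outE().has('property1', 'E').
        inV().has('property1', 'B')).
  count().
  profile()
```

If the first profile is clearly cheaper at the initial vertex lookup, that supports dropping inject()/union(); if both look the same, the cost is more likely in the edge and target filters.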

Your comment indicates that you need the count of the source vertices and the edges. Perhaps something like this would work:

gremlin> g.V(1).aggregate('e').by(constant(1)).
......1>   outE().
......2>   inV().count().
......3>   math("(2 * _) + x").
......4>     by().
......5>     by(select('e').unfold().sum()) 
==>7.0

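The aggregate() simply holds a "1" for each source vertex in a list, which is then summed with sum() later in the math() step. Since the number of edges should equal the number of inV() results, you just multiply that by "2" and then add the sum to get the count of what you're looking for.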
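The arithmetic can be checked piece by piece. The snippet below assumes the examples run against the TinkerPop "modern" toy graph; the answer does not say so, but g.V(1), the 'weight' property, and the 'person' and 'software' labels used further down are consistent with it.

```groovy
// Assumption: the TinkerPop "modern" toy graph, loaded in the Gremlin Console.
graph = TinkerFactory.createModern()
g = graph.traversal()

g.V(1).count()               // 1 source vertex (marko)
g.V(1).outE().count()        // 3 outgoing edges (two 'knows', one 'created')
g.V(1).outE().inV().count()  // 3 target vertices, one per edge

// math("(2 * _) + x") therefore evaluates (2 * 3) + 1 = 7,
// which matches the ==>7.0 shown above.
```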

Alternatively, if edges can point at the same target vertex, just extend the aggregate pattern to the edges and dedup() the inV():

gremlin> g.V(1).aggregate('s').by(constant(1)).
......1>   outE().aggregate('e').by(constant(1)).
......2>   inV().dedup().count().
......3>   math("_ + source + edge").
......4>     by().
......5>     by(select('s').unfold().sum()).
......6>     by(select('e').unfold().sum())  
==>7.0

If you don't want to count any source vertices that do not match the full path to a target, you can also add filtering:

gremlin> g.V(1).filter(outE().has('weight',gt(0)).inV().hasLabel('person','software')).
......1>   aggregate('s').by(constant(1)).
......2>   outE().has('weight',gt(0)).
......3>   aggregate('e').by(constant(1)).
......4>   inV().hasLabel('person','software').dedup().count().
......5>   math("_ + source + edge").
......6>     by().
......7>     by(select('s').unfold().sum()).
......8>     by(select('e').unfold().sum()) 
==>7.0
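Applied to the filters from the question, the same pattern might look like the sketch below. It only substitutes the question's has('property1', ...) predicates into the traversal above and has not been run against the asker's data, so treat it as a starting point rather than a tested query.

```groovy
// Sketch: the filtered aggregate/math pattern with the question's predicates.
g.V().has('property1', 'A').
  // keep only source vertices that reach a 'B' vertex over an 'E' edge
  filter(outE().has('property1', 'E').inV().has('property1', 'B')).
  aggregate('s').by(constant(1)).           // one "1" per source vertex
  outE().has('property1', 'E').
  aggregate('e').by(constant(1)).           // one "1" per matching edge
  inV().has('property1', 'B').dedup().count().
  math('_ + source + edge').
    by().                                   // '_' is the deduplicated target count
    by(select('s').unfold().sum()).         // number of source vertices
    by(select('e').unfold().sum())          // number of edges
```

Three caveats follow from the shape of the traversal. The result is a single combined total, not separate vertexCount and edgeCount values. Edges are aggregated before the inV() filter, so an 'E' edge leading to a non-'B' vertex is still counted, as it is in the original query. And dedup() only covers target vertices, so the total assumes the 'A' and 'B' vertex sets do not overlap.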

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/67908881/
