performance - How do I filter the "Hits" results returned by indexsearcher.search()?

Tags: performance lucene indexing confluence

How can I reduce the size of the "Hits" object returned by the indexsearcher.search() function?

Currently I am doing something like this:

Hits hits = indexSearch.search(query,filter,...);

Iterator hitsIt = hits.iterator();
int newSize=0;
while (hitsIt.hasNext()){
   Hit currHit = (Hit)hitsIt.next();

   if (hasPermission(currHit)){
      newSize++;
   }
}

However, when the number of hits is large (say 500 or more), this loop becomes a serious performance problem.

I have heard of something called a "HitsCollector" or "Collector" that is supposed to help with performance, but I don't know how to use it.

I would appreciate it if someone could point me in the right direction.

We use Apache Lucene for indexing within the Atlassian Confluence web application.

Best answer

A Collector is just a simple callback mechanism that is invoked once for every document that matches. You can use a collector like this:

public class MyCollector extends HitCollector {

// this is called back for every document that 
// matches, with the docid and the score

public void collect(int doc, float score){

    // do whatever you have to in here

}
}

..

HitCollector collector = new MyCollector();

indexSearch.search(query, filter, collector);
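
To tie this back to the permission check in the question, here is a minimal sketch of a collector that counts only the permitted hits. It is an illustration under assumptions, not Confluence's or Lucene's own code: the `HitCollector` class below is a stand-in with the same shape as the Lucene one so the sketch compiles without Lucene on the classpath, and `PermissionCountingCollector`, `PermissionCountDemo` and the `IntPredicate` permission check are hypothetical names. The loop in `countPermitted` only simulates the searcher calling `collect` for each match.

```java
import java.util.function.IntPredicate;

// Stand-in with the same shape as Lucene's HitCollector (assumption:
// used here only so the sketch compiles without the Lucene jar).
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical collector: counts only the matching documents that
// pass a permission check keyed by document id.
class PermissionCountingCollector extends HitCollector {
    private final IntPredicate hasPermission; // hypothetical check
    private int permitted = 0;

    PermissionCountingCollector(IntPredicate hasPermission) {
        this.hasPermission = hasPermission;
    }

    @Override
    public void collect(int doc, float score) {
        // Called once per matching document; no Hits object is built.
        if (hasPermission.test(doc)) {
            permitted++;
        }
    }

    int getPermittedCount() {
        return permitted;
    }
}

class PermissionCountDemo {
    // Simulates the searcher invoking collect() for each matching doc id.
    static int countPermitted(int[] matchingDocIds, IntPredicate hasPermission) {
        PermissionCountingCollector collector =
                new PermissionCountingCollector(hasPermission);
        for (int doc : matchingDocIds) {
            collector.collect(doc, 1.0f);
        }
        return collector.getPermittedCount();
    }

    public static void main(String[] args) {
        // Toy rule: only even document ids are "permitted".
        int count = countPermitted(new int[] {3, 8, 15, 22}, doc -> doc % 2 == 0);
        System.out.println(count); // prints 2
    }
}
```

With real Lucene you would drop the stand-in class, extend the library's HitCollector instead, and let indexSearch.search(query, filter, collector) drive the callbacks. Keep the work inside collect cheap: it runs for every match, so loading the full document for each hit just to check permissions would likely bring the original slowdown back.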

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/9673349/
