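php - Elasticsearch scroll API search "from"

Tags: php xml elasticsearch

I have a script that generates a sitemap based on an indexed URL, http://example.com/sitemap.index.xml, where index is a number > 0 that defines which results should be included in each chunk.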

$chunk = 10000;
$counter = 0;

$scroll = $es->search(array(
    "index" => "index",
    "type" => "type",
    "scroll" => "1m",
    "search_type" => "scan",
    "size" => 10,
    "from" => $chunk * ($index - 1)
));
$sid = $scroll['_scroll_id'];

while($counter < $chunk){
    $docs = $es->scroll(array(
        "scroll_id" => $sid,
        "scroll" => "1m"
    ));
    $sid = $docs['_scroll_id'];
    $counter += count($docs['hits']['hits']);
}

// ...

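Now, every time I visit http://example.com/sitemap.1.xml or http://example.com/sitemap.2.xml, the results returned from ES are exactly the same. It returns 50 results (10 per shard), but it does not seem to take from = 0 or from = 10000 into account.

I am using elasticsearch-php as the ES library.

Any ideas?

Best answer

In Java, it can be done as follows: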

QueryBuilder query = QueryBuilders.matchAllQuery();
SearchResponse scrollResp = Constants.client.prepareSearch(index)
        .setTypes(type).setSearchType(SearchType.SCAN)
        .setScroll(new TimeValue(600000)).setQuery(query)
        .setSize(500).execute().actionGet();
int total = 0; // running count of hits fetched so far
while (true) {
    scrollResp = Constants.client
            .prepareSearchScroll(scrollResp.getScrollId())
            .setScroll(new TimeValue(600000)).execute().actionGet();
    System.out.println("Record count :"
            + scrollResp.getHits().getHits().length);
    total = total + scrollResp.getHits().getHits().length;
    System.out.println("Total record count: " + total);
    for (SearchHit hit : scrollResp.getHits()) {
    //handle the hit
    }
    // Break condition: No hits are returned
    if (scrollResp.getHits().getHits().length == 0) {
        System.out.println("All records are fetched");
        break;
    }
}
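
As far as I know, a scroll does not honor "from", so one way to get the N-th chunk is to walk the scroll from the beginning and skip the hits that belong to earlier chunks. Below is a minimal sketch of that skip logic only, not a definitive implementation. The class ScrollChunk, the method collectChunk, and the in-memory pages are hypothetical stand-ins for the successive scroll responses; a real client would fetch each page with prepareSearchScroll as in the loop above. It also assumes the hits arrive in the same order on every run, which scan may not guarantee if the index changes between requests.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ScrollChunk {

    /**
     * Collects the hits for one sitemap chunk by walking scroll pages in order.
     * pages : stand-in for successive scroll responses (hypothetical)
     * index : 1-based sitemap number, as in sitemap.index.xml
     * chunk : number of hits per sitemap
     */
    static <T> List<T> collectChunk(Iterator<List<T>> pages, int index, int chunk) {
        long skip = (long) chunk * (index - 1); // hits that belong to earlier sitemaps
        List<T> result = new ArrayList<>();
        while (pages.hasNext() && result.size() < chunk) {
            List<T> page = pages.next();
            if (page.isEmpty()) {
                break; // an empty page means the scroll is exhausted
            }
            for (T hit : page) {
                if (skip > 0) {
                    skip--;          // still inside an earlier chunk
                } else if (result.size() < chunk) {
                    result.add(hit); // belongs to the requested chunk
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Simulate 25 documents delivered in scroll pages of 4 hits each.
        List<List<Integer>> pages = new ArrayList<>();
        for (int start = 0; start < 25; start += 4) {
            List<Integer> page = new ArrayList<>();
            for (int i = start; i < Math.min(start + 4, 25); i++) {
                page.add(i);
            }
            pages.add(page);
        }
        System.out.println(collectChunk(pages.iterator(), 1, 10)); // first chunk
        System.out.println(collectChunk(pages.iterator(), 2, 10)); // second chunk
        System.out.println(collectChunk(pages.iterator(), 3, 10)); // last, partial chunk
    }
}
```

The trade-off is that chunk N still has to read through all earlier hits, so later sitemaps get slower to build. If that becomes a problem, caching the generated sitemaps may be worth considering.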

Hope this helps.

Regarding php - Elasticsearch scroll API search "from", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26058551/
