php - Zend_Search_Lucene help

Tags: php zend-framework lucene search-engine

Edit:

I managed to solve this by using:

+"lorem ipsum" +type:photo
+"lorem ipsum" +type:video

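Another problem: the index returns the correct results, but with the wrong id (id is the primary key). More specifically, the returned id field is one less than the real id in the database I used to build the index (id - 1).

That is weird.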
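One possible explanation, which nothing in this thread confirms: in Zend Framework 1 the hit object (Zend_Search_Lucene_Search_QueryHit) declares its own public `id` and `score` properties, where `id` is the document's number inside the index, and it resolves stored fields through the magic `__get()` method. PHP only calls `__get()` for properties that are not accessible, so a stored field named `id` would be shadowed by the internal document number, which starts at 0. The sketch below reproduces that shadowing in plain PHP. `HypotheticalHit` is a stand-in written for illustration, not the library class.

```php
<?php
// HypotheticalHit is a stand-in for illustration, not the real QueryHit class.
class HypotheticalHit
{
    public $id;      // internal document number, assigned by the index
    private $stored; // stored field values, as they came from the database

    public function __construct($documentNumber, array $storedFields)
    {
        $this->id = $documentNumber;
        $this->stored = $storedFields;
    }

    // PHP calls __get() only for properties that are not accessible,
    // so it is never reached for the declared public $id.
    public function __get($name)
    {
        return $this->stored[$name];
    }

    // Explicit lookup that always reads the stored field.
    public function getFieldValue($name)
    {
        return $this->stored[$name];
    }
}

// First row of the table: primary key 1, internal document number 0.
$hit = new HypotheticalHit(0, array('id' => 1, 'title' => 'lorem ipsum'));

var_dump($hit->id);                  // the document number, not the primary key
var_dump($hit->title);               // resolved through __get()
var_dump($hit->getFieldValue('id')); // the stored primary key
```

If that is what is happening here, reading the field with `$hit->getDocument()->getFieldValue('id')` should return the stored primary key, and storing it under a different name (for example a hypothetical `media_id`) would avoid the clash altogether.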


What is wrong with these search queries:

"lorem ipsum" AND +type:photo
"lorem ipsum" AND +type:video

The first query should only find results where type = photo, and the second should only find videos. But both of them return photos and videos.

Here is how I build the index:

    // create media index
    $index = Zend_Search_Lucene::create('/data/media_index');
    // get all media
    $media = $this->_getTable('Media')->get();
    // iterate through media and build index
    foreach ($media as $m) {

        $doc = new Zend_Search_Lucene_Document();

        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('id',
                                                           $m->id));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('thumb_path',
                                                           $m->thumb_path));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('title',
                                                         $m->title));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('description',
                                                          $m->description));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('type',
                                                         $m->type));

        $index->addDocument($doc);

    }
    // commit the index
    $index->commit();

Here is how I search:

    $index = Zend_Search_Lucene::open('/data/media_index');
    $this->view->photos = $index->find('"lorem ipsum" AND +type:photo');
    $this->view->videos = $index->find('"lorem ipsum" AND +type:video');

Any ideas?

Best answer

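I just ran some tests against my own search index, and the problem seems to be in the query itself rather than in your code. In the query, "AND" is an operator and "+" is also an operator. The query parser seems to get confused by two operators in a row with no term between them. Here is a quote I found in their documentation: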

If the AND/OR/NOT style is used, then an AND or OR operator must be present between all query terms. Each term may also be preceded by NOT operator. The AND operator has higher precedence than the OR operator. This differs from Java Lucene behavior.

Now, running your query through the parser, this is the resulting Search_Query object:

string '"lorem ipsum" AND +type:photo' (length=29)

object(Zend_Search_Lucene_Search_Query_MultiTerm)[230]
  private '_terms' => 
    array
      0 => 
        object(Zend_Search_Lucene_Index_Term)[236]
          public 'field' => null
          public 'text' => string 'lorem' (length=5)
      1 => 
        object(Zend_Search_Lucene_Index_Term)[237]
          public 'field' => null
          public 'text' => string 'ipsum' (length=5)
      2 => 
        object(Zend_Search_Lucene_Index_Term)[238]
          public 'field' => null
          public 'text' => string 'and' (length=3)
      3 => 
        object(Zend_Search_Lucene_Index_Term)[239]
          public 'field' => null
          public 'text' => string 'type' (length=4)
      4 => 
        object(Zend_Search_Lucene_Index_Term)[240]
          public 'field' => null
          public 'text' => string 'photo' (length=5)

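Change the query slightly by removing either the AND or the +, so that only one of them is used. Both of the following parse to the same query structure: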

string '"lorem ipsum" +type:photo' (length=25)
string '"lorem ipsum" AND type:photo' (length=28)

object(Zend_Search_Lucene_Search_Query_Boolean)[227]
  private '_subqueries' => 
    array
      0 => 
        object(Zend_Search_Lucene_Search_Query_Phrase)[230]
          private '_terms' => 
            array
              0 => 
                object(Zend_Search_Lucene_Index_Term)[233]
                  public 'field' => null
                  public 'text' => string 'lorem' (length=5)
              1 => 
                object(Zend_Search_Lucene_Index_Term)[234]
                  public 'field' => null
                  public 'text' => string 'ipsum' (length=5)
      1 => 
        object(Zend_Search_Lucene_Search_Query_Term)[235]
          private '_term' => 
            object(Zend_Search_Lucene_Index_Term)[232]
              public 'field' => string 'type' (length=4)
              public 'text' => string 'photo' (length=5)

The only difference between the two is in the signs. With AND:

  private '_signs' => 
    array
      0 => boolean true
      1 => boolean true

With +:

  private '_signs' => 
    array
      0 => null
      1 => boolean true

The AND operator requires both subqueries to match a result, whereas + only requires the term on its right.
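To make the effect of those signs concrete, here is a minimal sketch in plain PHP. It is a simplified model, not library code: it assumes the usual Lucene convention that `true` marks a required subquery, `false` a prohibited one and `null` an optional one, and that a query with no required subquery needs at least one optional match. The function name `documentMatches` is made up for this example.

```php
<?php
// Simplified model of how a boolean query's signs decide a match.
// true = required, false = prohibited, null = optional.
// $signs[$i] applies to $hits[$i], which says whether subquery $i matched.
function documentMatches(array $signs, array $hits)
{
    $hasRequired = false;
    $optionalMatched = false;

    foreach ($signs as $i => $sign) {
        if ($sign === true) {
            $hasRequired = true;
            if (!$hits[$i]) {
                return false; // a required subquery is missing
            }
        } elseif ($sign === false) {
            if ($hits[$i]) {
                return false; // a prohibited subquery is present
            }
        } elseif ($hits[$i]) {
            $optionalMatched = true;
        }
    }

    // With no required subquery, at least one optional one must match.
    return $hasRequired || $optionalMatched;
}

// A photo whose text does not contain the phrase "lorem ipsum":
// subquery 0 (the phrase) misses, subquery 1 (type:photo) hits.
$hits = array(false, true);

var_dump(documentMatches(array(true, true), $hits)); // AND form: bool(false)
var_dump(documentMatches(array(null, true), $hits)); // + form:   bool(true)
```

Under this model, `"lorem ipsum" +type:photo` would also return photos that do not contain the phrase, which would explain why the edit in the question puts a + in front of both parts.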

So just change the query to

"lorem ipsum" AND type:photo

and you should get the results you are looking for.
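If the phrase or the type comes from user input, it may help to build the query string in one place so that the AND style is never mixed with the + sign. The helper below is only a sketch: `buildTypedQuery` is a hypothetical name, not part of Zend_Search_Lucene, and the sanitising rules are assumptions that you may need to adapt.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): builds a query string
// that uses only the AND style, never mixing it with the + sign.
function buildTypedQuery($phrase, $type)
{
    // Drop double quotes so the phrase cannot terminate its own quoting.
    $phrase = str_replace('"', '', $phrase);
    // Keep only characters that are safe in a bare keyword term.
    $type = preg_replace('/[^A-Za-z0-9_]/', '', $type);

    return sprintf('"%s" AND type:%s', $phrase, $type);
}

echo buildTypedQuery('lorem ipsum', 'photo'), "\n";
echo buildTypedQuery('lorem ipsum', 'video'), "\n";
```

The resulting strings can be passed to `$index->find()` exactly as in the search code from the question.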

Regarding php - Zend_Search_Lucene help, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/1247728/
