solr - Delta-import handler not working

Tags: solr dih

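I followed the steps described at http://wiki.apache.org/solr/DataImportHandler
I also tried other solutions from Stack Overflow, but it still does not work.

The problem:
Even though I have configured the delta-import handler, every run indexes all records in the database.
I have 30 records in the database. Every time I run a delta import, it indexes all 30 records. I only want to index the records that were changed or deleted.

Any quick help, pointers, or solutions would be appreciated.

data-config.xml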

<dataConfig>
    <dataSource type="JdbcDataSource" name="ds-books" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/test" user="root" password=""/>
    <document name="books">
        <entity name="books" pk="id" query="select * from books" deltaImportQuery="SELECT * FROM books WHERE id = '${dataimporter.delta.id}'"  deltaQuery="SELECT  id FROM books WHERE last_modified &gt;  '${dataimporter.last_index_time}'" >
            <field column="id" name="id"  indexed="true" stored="true"/>
            <field column="NAME" name="name" />
            <field column="PRICE" name="price" />
            <field column="last_modified" name="last_modified" />
        </entity>
    </document>
</dataConfig>

The command I use to run it is:
http://localhost:8983/solr/dataimport?command=delta-import

The dataimport.properties file:

#Fri May 10 17:13:18 IST 2013

last_index_time=2013-05-10 17\:13\:18

book.last_index_time=2013-05-10 17\:13\:18

dataimporter.last_index_time=2013-05-10 17\:11\:42

The XML response I get is:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="command">delta-import</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">30</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Delta Dump started">2013-05-10 17:13:17</str>
    <str name="Identifying Delta">2013-05-10 17:13:17</str>
    <str name="Deltas Obtained">2013-05-10 17:13:17</str>
    <str name="Building documents">2013-05-10 17:13:17</str>
    <str name="Total Changed Documents">30</str>
    <str name="">Indexing completed. Added/Updated: 30 documents. Deleted 0 documents.</str>
    <str name="Committed">2013-05-10 17:13:17</str>
    <str name="Total Documents Processed">30</str>
    <str name="Time taken">0:0:0.303</str>
  </lst>
  <str name="WARNING">This response format is experimental.  It is likely to change in the future.</str>
</response>

In the log file I get the following:
INFO: Read dataimport.properties
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: books
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity books with URL: jdbc:mysql://localhost/test
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 9
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: books rows obtained : 30
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: books rows obtained : 0
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: books
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
May 10, 2013 5:13:18 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
May 10, 2013 5:13:18 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2

Best answer

Changing the following values in data-config.xml resolved the issue:

${dih.last_index_time} instead of ${dataimporter.last_index_time}

${dih.delta.id} instead of ${dataimporter.delta.id}

I am using Solr 4.0.
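
For reference, here is a sketch of how the entity from the question might look with those two substitutions applied. Everything else (data source, table, column names) is copied from the question's data-config.xml; only the variable namespace changes from dataimporter to dih, and the XML comment is added for explanation. This has not been tested against a running Solr instance, so treat it as an illustration of the answer rather than a verified configuration.

```xml
<dataConfig>
    <dataSource type="JdbcDataSource" name="ds-books" driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/test" user="root" password=""/>
    <document name="books">
        <!-- deltaQuery selects the ids of rows modified since the last import;
             deltaImportQuery then fetches each of those rows by id.
             Both use the dih.* variables suggested in the answer. -->
        <entity name="books" pk="id"
                query="select * from books"
                deltaQuery="SELECT id FROM books WHERE last_modified &gt; '${dih.last_index_time}'"
                deltaImportQuery="SELECT * FROM books WHERE id = '${dih.delta.id}'">
            <field column="id" name="id" indexed="true" stored="true"/>
            <field column="NAME" name="name" />
            <field column="PRICE" name="price" />
            <field column="last_modified" name="last_modified" />
        </entity>
    </document>
</dataConfig>
```

If the change takes effect, a delta-import run with no modified rows should report fewer than 30 under "Total Changed Documents" in the status response, instead of re-indexing every record.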

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/16481816/
