ElasticSearch: inner_hits and highlight_query

Tags: elasticsearch, elasticsearch-2.0

Short version: Is it possible to somehow get the functionality that highlight_query provides for highlighting inner_hits results?

Long version: Please consider the following mapping:

{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "docs": {
      "properties": {
        "doctext": {
          "type": "string",
          "store": "yes"
        },
        "sentences": {
          "type": "nested",
          "properties": {
            "text": {
              "type": "string",
              "store": "yes"
            }
          }
        }
      }
    }
  }
}

As you can see, there are two fields, doctext and sentences. The idea is to split the document text into sentences to allow sentence-based searching.

Let this be an example document:

{
  "doctext": "I will do a presentation. I talk about lions and show images of zebras. I hope it will be fun.",
  "sentences": [
    {
      "text": "I will do a presentation."
    },
    {
      "text": "I talk about lions and show images of zebras."
    },
    {
      "text": "I hope it will be fun."
    }
  ]
}
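
How the text gets split is not part of the question. As a minimal sketch, assuming a naive regex-based splitter is good enough, a document of this shape could be built in Python as follows. The helper name build_document is hypothetical, and a real pipeline would likely need a proper sentence tokenizer.

```python
import json
import re


def build_document(doctext):
    """Build a document body with the full text plus one nested entry per sentence."""
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", doctext.strip())
    return {
        "doctext": doctext,
        "sentences": [{"text": part} for part in parts if part],
    }


doc = build_document(
    "I will do a presentation. "
    "I talk about lions and show images of zebras. "
    "I hope it will be fun."
)
print(json.dumps(doc, indent=2))
```

The printed JSON can be used as the request body when indexing the document.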

Now I can search both the whole text and the individual sentences, and I can even highlight both:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "doctext": "zebras"
          }
        },
        {
          "nested": {
            "path": "sentences",
            "query": {
              "match": {
                "sentences.text": "zebras"
              }
            },
            "inner_hits": {
              "highlight": {
                "fields": {
                  "sentences.text": {}
                }
              }
            }
          }
        }
      ]
    }
  },
  "_source": false,
  "highlight": {
    "fields": {
      "doctext": {
        "highlight_query": {
          "match": {
            "doctext": "lions"
          }
        }
      }
    }
  }
}

Please note the following:

  1. The nested query on the sentences
  2. The inner_hits part of that query
  3. The highlight_query part of the second highlight, not the one inside inner_hits

Issuing this query results in this response:

"hits": [
      {
        "_index": "documents",
        "_type": "docs",
        "_id": "123456",
        "_score": 0.6360315,
        "highlight": {
          "doctext": [
            "I will do a presentation. I talk about <em>lions</em> and show images of zebras. I hope it will be fun."
          ]
        },
        "inner_hits": {
          "sentences": {
            "hits": {
              "total": 1,
              "max_score": 0.5291085,
              "hits": [
                {
                  "_index": "documents",
                  "_type": "docs",
                  "_id": "123456",
                  "_nested": {
                    "field": "sentences",
                    "offset": 1
                  },
                  "_score": 0.5291085,
                  "_source": {
                    "text": "I talk about lions and show images of zebras."
                  },
                  "highlight": {
                    "sentences.text": [
                      "I talk about lions and show images of <em>zebras</em>."
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]

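Note how lions is highlighted for the doctext field even though we searched for zebras. The inner_hits highlight zebras, since we didn't specify anything else. But I would like the inner hits to highlight lions as well, just as the doctext highlight does.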
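
To make the difference easier to see, the two kinds of highlights can be pulled out of a hit side by side. This is only a sketch: the function name collect_highlights is hypothetical, and the hit below is a trimmed copy of the response above rather than live output.

```python
def collect_highlights(hit, path="sentences"):
    """Return (top-level highlights, inner-hit highlights) for a single search hit."""
    top = hit.get("highlight", {})
    inner = []
    nested_hits = (
        hit.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])
    )
    for nested in nested_hits:
        # "_nested.offset" is the position of the sentence in the nested array.
        offset = nested.get("_nested", {}).get("offset")
        for field, fragments in nested.get("highlight", {}).items():
            inner.append((offset, field, fragments))
    return top, inner


# Trimmed copy of the hit shown in the response above.
hit = {
    "highlight": {
        "doctext": [
            "I will do a presentation. I talk about <em>lions</em> and show "
            "images of zebras. I hope it will be fun."
        ]
    },
    "inner_hits": {"sentences": {"hits": {"hits": [{
        "_nested": {"field": "sentences", "offset": 1},
        "highlight": {
            "sentences.text": [
                "I talk about lions and show images of <em>zebras</em>."
            ]
        },
    }]}}},
}

top, inner = collect_highlights(hit)
print(top["doctext"][0])
for offset, field, fragments in inner:
    print(offset, field, fragments)
```

The top-level fragment marks lions while the inner hit marks zebras, which is exactly the mismatch described here.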

I tried to change the inner_hits part of the query to

 "inner_hits": {
          "highlight": {
            "fields": {
              "text": {
                "highlight_query": {
                  "match": {
                    "sentences.text": "lions"
                  }
                }
              }
            }
          }
        }

But this results in the following exception:

Failed to execute phase [query_fetch], all shards failed; shardFailures {[9-pMHRPsRiyITgsRNFnkEA][documents][0]: RemoteTransportException[[Fafnir][127.0.0.1:9300][indices:data/read/search[phase/query+fetch]]]; nested:         SearchParseException[failed to parse search source [{
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "doctext": "fun"
              }
            },
            {
              "nested": {
                "path": "sentences",
                "query": {
                  "match": {
                    "sentences.text": "zebras"
                  }
                },
                "inner_hits": {
                  "highlight": {
                    "fields": {
                      "sentences.text": {
                        "highlight_query": {
                          "match": {
                            "doctext": "lions"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          ]
        }
      },
      "_source": false,
      "highlight": {
        "fields": {
          "doctext": {
            "highlight_query": {
              "match": {
                "doctext": "lions"
              }
            }
          }
        }
      }
    }]]; nested: NullPointerException; }
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:228)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:174)
        at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:46)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:821)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:799)
        at org.elasticsearch.transport.TransportService$4.onFailure(TransportService.java:361)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: ; nested: NullPointerException;
        at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:382)
        at org.elasticsearch.action.search.SearchPhaseExecutionException.guessRootCauses(SearchPhaseExecutionException.java:152)
        at org.elasticsearch.action.search.SearchPhaseExecutionException.getCause(SearchPhaseExecutionException.java:99)
        at java.lang.Throwable.printStackTrace(Throwable.java:665)
        at java.lang.Throwable.printStackTrace(Throwable.java:721)
        at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)
        at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
        at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.elasticsearch.common.logging.log4j.Log4jESLogger.internalInfo(Log4jESLogger.java:125)
        at org.elasticsearch.common.logging.support.AbstractESLogger.info(AbstractESLogger.java:90)
        at org.elasticsearch.rest.BytesRestResponse.convert(BytesRestResponse.java:131)
        at org.elasticsearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:96)
        at org.elasticsearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:87)
        at org.elasticsearch.rest.action.support.RestActionListener.onFailure(RestActionListener.java:60)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.raiseEarlyFailure(TransportSearchTypeAction.java:316)
        ... 10 more
    Caused by: java.lang.NullPointerException
        at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:258)
        at org.elasticsearch.index.query.BoolQueryParser.parse(BoolQueryParser.java:116)
        at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:257)
        at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:303)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:206)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:201)
        at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:831)
        at org.elasticsearch.search.SearchService.createContext(SearchService.java:651)
        at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:617)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:460)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:392)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:389)
        at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:350)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        ... 3 more

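Is there any way to achieve this? Did I get the DSL wrong here? The documentation on inner_hits only states that highlighting works (https://www.elastic.co/guide/en/elasticsearch/reference/2.1/search-request-inner-hits.html), but doesn't go into any detail.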

Thanks a lot for reading and for any hints!

Best answer

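A good question. It seems that inner_hits does not support highlight_query when it is used in the query context. I'm not sure why; this is most likely a bug.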

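A workaround is to wrap the nested query you want highlighted inside the final (top-level) highlight_query. It does, however, need the original nested query as a filter. An example workaround for the highlighting in the OP looks like this: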

PUT test/docs/1
{
  "doctext": "I will do a presentation. I talk about lions and show images of zebras. I hope it will be fun.",
  "sentences": [
    {
      "text": "I will do a presentation."
    },
    {
      "text": "I talk about lions and show images of zebras."
    },
    {
      "text": "I hope it will be fun for lions"
    }
  ]
}


POST test/_search
{
   "query": {
      "bool": {
         "should": [
            {
               "match": {
                  "doctext": "zebras"
               }
            },
            {
               "nested": {
                  "path": "sentences",
                  "query": {
                     "match": {
                        "sentences.text": "zebras"
                     }
                  },
                  "inner_hits": {
                     "highlight": {
                        "fields": {
                           "sentences.text": {}
                        }
                     }
                  }
               }
            }
         ]
      }
   },
   "_source": false,
   "highlight": {
      "fields": {
         "*": {
            "highlight_query": {
               "bool": {
                  "should": [
                     {
                        "match": {
                           "doctext": "lions"
                        }
                     },
                     {
                        "nested": {
                           "path": "sentences",
                           "query": {
                              "filtered": {
                                 "query": {
                                    "match": {
                                       "sentences.text": "lions"
                                    }
                                 },
                                 "filter": {
                                    "query": {
                                       "match": {
                                          "sentences.text": "zebras"
                                       }
                                    }
                                 }
                              }
                           },
                           "inner_hits": {
                              "highlight": {
                                 "fields": {
                                    "sentences.text": {}
                                 }
                              }
                           }
                        }
                     }
                  ]
               }
            }
         }
      }
   }
}

Result:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.6360315,
      "hits": [
         {
            "_index": "test",
            "_type": "docs",
            "_id": "1",
            "_score": 0.6360315,
            "highlight": {
               "doctext": [
                  "I will do a presentation. I talk about <em>lions</em> and show images of zebras. I hope it will be fun."
               ]
            },
            "inner_hits": {
               "sentences": {
                  "hits": {
                     "total": 1,
                     "max_score": 0.40240064,
                     "hits": [
                        {
                           "_index": "test",
                           "_type": "docs",
                           "_id": "1",
                           "_nested": {
                              "field": "sentences",
                              "offset": 1
                           },
                           "_score": 0.40240064,
                           "_source": {
                              "text": "I talk about lions and show images of zebras."
                           },
                           "highlight": {
                              "sentences.text": [
                                 "I talk about <em>lions</em> and show images of zebras."
                              ]
                           }
                        }
                     ]
                  }
               }
            }
         }
      ]
   }
}
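
If the search term and the highlight term vary per request, the workaround body above can be generated instead of written by hand. The following Python sketch only reproduces the JSON structure from this answer; the function name build_workaround_search is hypothetical, and the filtered query is kept exactly as in the answer, which targets Elasticsearch 2.x.

```python
import json


def build_workaround_search(search_term, highlight_term):
    """Build the workaround search body: search one term, highlight another."""

    def match(field, term):
        return {"match": {field: term}}

    def nested(inner_query):
        # Nested clause on "sentences" with inner_hits highlighting enabled.
        return {
            "nested": {
                "path": "sentences",
                "query": inner_query,
                "inner_hits": {"highlight": {"fields": {"sentences.text": {}}}},
            }
        }

    highlight_query = {"bool": {"should": [
        match("doctext", highlight_term),
        # Highlight highlight_term, but only in sentences matching search_term.
        nested({"filtered": {
            "query": match("sentences.text", highlight_term),
            "filter": {"query": match("sentences.text", search_term)},
        }}),
    ]}}

    return {
        "query": {"bool": {"should": [
            match("doctext", search_term),
            nested(match("sentences.text", search_term)),
        ]}},
        "_source": False,
        "highlight": {"fields": {"*": {"highlight_query": highlight_query}}},
    }


body = build_workaround_search("zebras", "lions")
print(json.dumps(body, indent=2))
```

Calling it with "zebras" and "lions" yields the same body as the POST request shown in this answer.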

This question and answer on ElasticSearch: inner_hits and highlight_query comes from Stack Overflow: https://stackoverflow.com/questions/34221764/
