full-text-search - Multilingual queries in ElasticSearch

Tags: full-text-search multilingual elasticsearch

Suppose we have the following mapping in ElasticSearch.

{
  "content": {
    "properties": {
      "id": {
        "type": "string",
        "index": "not_analyzed",
        "store": "yes"
      },
      "locale_container": {
        "type": "object",
        "properties": {
          "english": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "english",
                "search_analyzer": "english",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "english",
                "search_analyzer": "english",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "german": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "german",
                "search_analyzer": "german",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "german",
                "search_analyzer": "german",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "russian": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "russian",
                "search_analyzer": "russian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "russian",
                "search_analyzer": "russian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "italian": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "italian",
                "search_analyzer": "italian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "italian",
                "search_analyzer": "italian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          }
        }
      }
    }
  }
}
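
The mapping above repeats the same title/text structure for every language. As a minimal sketch (the helper names `localized_field` and `build_mapping` are hypothetical and not part of the original question), the same structure could be generated from a list of language names, assuming each language uses an analyzer of the same name:

```python
import json

def localized_field(analyzer):
    """One analyzed string field using the same analyzer at index and search time."""
    return {
        "type": "string",
        "index_analyzer": analyzer,
        "search_analyzer": analyzer,
        "index": "analyzed",
        "term_vector": "with_positions_offsets",
        "store": "yes",
    }

def build_mapping(languages):
    """Build the "content" mapping with one title/text object per language."""
    locales = {
        lang: {
            "type": "object",
            "properties": {
                "title": localized_field(lang),
                "text": localized_field(lang),
            },
        }
        for lang in languages
    }
    return {
        "content": {
            "properties": {
                "id": {"type": "string", "index": "not_analyzed", "store": "yes"},
                "locale_container": {"type": "object", "properties": locales},
            }
        }
    }

mapping = build_mapping(["english", "german", "russian", "italian"])
print(json.dumps(mapping, indent=2))
```

Adding another language then only means adding one more name to the list.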

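When a particular user queries the index, we can get her culture from her settings, i.e. we know which analyzer to use. How can we formulate a query that searches only the "title" and "text" fields in her own language (say, German) and tokenizes the search query with the German analyzer?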

Best answer

I simplified the example, using the standard analyzer for "english" and the simple analyzer (no stop words) for "french". For a document like this:

{
  "id": "abc",
  "locale_container": {
    "english": {
      "title": "abc to ABC",
      "text": ""
    },
    "french": {
      "title": "def to DEF",
      "text": ""
    }
  }
}
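
To illustrate how the two analyzers treat the title values in this document, here is a toy sketch in plain Python. It is not ElasticSearch code: `simple_tokens` and `standard_tokens` are hypothetical stand-ins, the stop-word set is a small illustrative subset, and it assumes the standard analyzer applies an English stop-word list, as this answer does.

```python
import re

# Assumption: a small illustrative subset of an English stop-word list.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

def simple_tokens(text):
    """Toy stand-in for the simple analyzer: split on non-letters, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]

def standard_tokens(text):
    """Toy stand-in for a standard analyzer with stop words: lowercase, then drop stop words."""
    return [token for token in simple_tokens(text) if token not in STOP_WORDS]

english_terms = standard_tokens("abc to ABC")
french_terms = simple_tokens("def to DEF")
print(english_terms)  # ['abc', 'abc']
print(french_terms)   # ['def', 'to', 'def']
```

Under these assumptions the term "to" is indexed for the french title but not for the english one.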

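The following queries do the trick:
  • locale_container.english.title:abc -> returns the document
  • locale_container.french.title:def -> also returns the document
  • locale_container.english.title:to -> returns nothing, because "to" is a stop word
  • locale_container.french.title:to -> returns the document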

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/7406692/
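
The field-qualified queries in this answer each target a single field. To search both "title" and "text" in the user's language, one option is a `query_string` query restricted with its `fields` parameter. The sketch below only builds the request body; the helper name `build_locale_query` and the sample query "Haus" are hypothetical. It assumes that, because the mapping sets a `search_analyzer` on each field, the query text is analyzed with that field's analyzer, so no analyzer is named in the query itself.

```python
import json

def build_locale_query(user_query, language, fields=("title", "text")):
    """Build a search body that only targets the given language's fields."""
    paths = ["locale_container.%s.%s" % (language, field) for field in fields]
    return {
        "query": {
            "query_string": {
                # Restrict the search to the user's language fields only.
                "fields": paths,
                "query": user_query,
            }
        }
    }

# "german" would come from the user's settings; "Haus" is a sample query.
body = build_locale_query("Haus", "german")
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the index's `_search` endpoint. If the analyzer ever needs to be forced explicitly, `query_string` also accepts an `analyzer` parameter.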
