elasticsearch - Elasticsearch does not return results that share the same token?

Tags: elasticsearch, search, token

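The data I insert into Elasticsearch is Korean, so I can't give the exact case, but suppose
I have a word ABBCC that is tokenized as ["A","BBCC"] and another word AZZXXX that is tokenized as ["A","ZZXXX"].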

If I search for ABBCC, shouldn't AZZXXX show up as well, since the two share the token "A"? Or is that not how Elasticsearch works?

This is how I check how a word is analyzed:

GET recpost_test/_analyze
{
  "analyzer": "my_analyzer",
  "text":"my query String!" 
}

This is how I created the index:
PUT recpost
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "user_dictionary": "userdict_ko.txt"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        },
        "filter": {
          "substring": {
            "type": "edgeNGram",
            "min_gram": 1,
            "max_gram": 10
          }
        }
      }
    }
  }
}

This is how I search:
GET recpost/_search
{
  "_source": [""],
  "from": 0,
  "size": 2,
  "query":{
    "multi_match": {
      "query" : "my query String!",
      "type": "best_fields", 
      "fields" : [
        "brandkor",
        "content",
        "itemname",
        "name",
        "review",
        "shortreview^2",
        "title^3"]
    }
  }
}


Edit:
I tried adding the "analyzer" field to the search, but it still doesn't work:
GET recpost/_search
{
  "_source": [""],
  "from": 0,
  "size": 2,
  "query":{
    "multi_match": {
      "query" : "깡스",
      "analyzer": "my_analyzer", 
      "type": "best_fields", 
      "fields" : [
        "brandkor",
        "content",
        "itemname",
        "name",
        "review",
        "shortreview^2",
        "title^3"]
    }
  }
}

EDIT 2: This is my mapping:
{
  "recpost_test" : {
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "brandkor" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "content" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "field_statistics" : {
          "type" : "boolean"
        },
        "fields" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "itemname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "offsets" : {
          "type" : "boolean"
        },
        "payloads" : {
          "type" : "boolean"
        },
        "positions" : {
          "type" : "boolean"
        },
        "review" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "shortreview" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "term_statistics" : {
          "type" : "boolean"
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "type" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Best answer

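I don't see where your mapping attaches the analyzer to any field.
As far as I can tell, all of your fields (brandkor, content, and so on) are indexed as plain text with no analyzer specified, so they are analyzed with the default standard analyzer rather than my_analyzer, and a query only matches when its terms exactly equal the terms that were indexed.

That stays true unless you associate each field with its analyzer in the mapping.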
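
One way to act on this is sketched below; it is an assumption about the fix, not something the asker confirmed. It creates a new index whose text fields name `my_analyzer` explicitly and then copies the documents across. The index name `recpost_v2` is hypothetical, only two of the searched fields are shown (the others would follow the same pattern), and the analysis settings are copied from the question. The analyzer of an existing field cannot be changed in place, which is why the sketch goes through the `_reindex` API.

```json
# Hypothetical new index; the analysis settings are copied from the question.
PUT recpost_v2
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "user_dictionary": "userdict_ko.txt"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "my_analyzer" },
      "shortreview": { "type": "text", "analyzer": "my_analyzer" }
    }
  }
}

# Copy the existing documents so they are re-analyzed with my_analyzer.
POST _reindex
{
  "source": { "index": "recpost" },
  "dest": { "index": "recpost_v2" }
}

# Check which tokens a specific field produces, using the field's own analyzer.
GET recpost_v2/_analyze
{
  "field": "title",
  "text": "my query String!"
}
```

Once the analyzer is set in the mapping, it is applied at index time and, by default, at search time as well, so the `analyzer` parameter in the `multi_match` query should no longer be needed. Passing `field` to `_analyze` instead of `analyzer` is a quick way to confirm which analyzer a field really uses.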

This question, "elasticsearch - Elasticsearch does not return results that share the same token?", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/59352302/
