I am trying to sort the following items with Elasticsearch:
[
{name: 'Company 1'},
{name: 'Company 2'},
{name: 'aa 01'},
{name: 'aabb'}
]
If I sort by name, I get (-> ... is the sort value returned by ES):
aa 01 -> 01
Company 1 -> 1
Company 2 -> 2
aabb -> aabb
I would like to get:
aa 01
aabb
Company 1
Company 2
I tried changing the mapping to
type: 'keyword'
which gives (-> ... is the sort value returned by ES):
Company 1 -> Company 1
Company 2 -> Company 2
aa 01 -> aa 01
aabb -> aabb
I looked for other solutions, but they seem to target older ES versions, e.g. "Elastic search alphabetical sorting based on first character", which uses
index_analyzer
or index
and neither of them works.
Best answer
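You are getting the results in lexicographic order, which is perfectly fine for a computer but does not make much sense to a human, who expects the results in alphabetical order.
The bytes used to represent uppercase letters have lower values than the bytes used to represent lowercase letters, so names starting with the lowest bytes come first (see an ASCII table).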
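The byte-order effect can be reproduced outside Elasticsearch. The following is only a minimal Python sketch, assuming the four names from the question; Python compares strings by code point, which for ASCII text gives the same order as comparing bytes.

```python
# Names taken from the question.
names = ["Company 1", "Company 2", "aa 01", "aabb"]

# Python compares str values code point by code point; for pure ASCII
# text this is the same as the byte-wise order described above.
byte_order = sorted(names)

# 'C' is 0x43 and 'a' is 0x61, so capitalized names sort first.
print(hex(ord("C")), hex(ord("a")))
print(byte_order)
```

The resulting order matches what the question reports for the plain keyword mapping.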
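To fix this, you need to index each name in such a way that its byte order corresponds to the sort order you want. In other words, you need an analyzer that emits a single lowercase token.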
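As a rough illustration of the idea, the sketch below sorts the same names by a key that keeps the whole string as one token and lowercases it. The function `sort_key` is a hypothetical stand-in for the analyzer, not Elasticsearch code.

```python
def sort_key(name: str) -> str:
    """Hypothetical stand-in for a keyword tokenizer plus lowercase filter:
    the whole string stays one token and is lowercased."""
    return name.lower()


# Names taken from the question.
names = ["Company 1", "Company 2", "aa 01", "aabb"]

# Sort on the lowercased token while keeping the original strings.
ordered = sorted(names, key=sort_key)
print(ordered)
```

The original values are untouched; only the value used for comparison is lowercased, which is the role the sort field plays in the index.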
Create a custom keyword analyzer for the field you want to sort on:
PUT /my_index
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "custom_keyword_analyzer" : {
          "tokenizer" : "keyword",
          "filter" : ["lowercase"]
        }
      }
    }
  },
  "mappings" : {
    "_doc" : {
      "properties" : {
        "name" : {
          "type" : "text",
          "fields" : {
            "raw" : {
              "type" : "text",
              "analyzer" : "custom_keyword_analyzer",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}
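On more recent Elasticsearch versions, a keyword sub-field with a lowercase normalizer is a commonly used alternative to enabling fielddata on a text field. The sketch below only builds such a request body in Python and prints it as JSON. It is an assumption rather than part of the original answer: the name `lowercase_normalizer` is illustrative, and the typeless `mappings` layout assumes Elasticsearch 7 or later.

```python
import json

# Assumed alternative to the text + fielddata sub-field: a keyword
# sub-field with a lowercase normalizer. The normalizer name is
# illustrative; the typeless "mappings" layout assumes Elasticsearch 7+.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "raw": {
                        "type": "keyword",
                        "normalizer": "lowercase_normalizer",
                    }
                },
            }
        }
    },
}

# JSON body that would be sent with PUT /my_index.
print(json.dumps(index_body, indent=2))
```

With this variant the sort would still target name.raw, so the search request would not need to change.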
Index your data:
POST my_index/_doc/1
{
  "name" : "Company 01"
}
POST my_index/_doc/2
{
  "name" : "Company 02"
}
POST my_index/_doc/3
{
  "name" : "aa 01"
}
POST my_index/_doc/4
{
  "name" : "aabb"
}
Run the sorted search:
POST /my_index/_doc/_search
{
  "sort": "name.raw"
}
Response:
[
  {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "3",
    "_score": null,
    "_source": {
      "name": "aa 01"
    },
    "sort": [
      "aa 01"
    ]
  },
  {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "4",
    "_score": null,
    "_source": {
      "name": "aabb"
    },
    "sort": [
      "aabb"
    ]
  },
  {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "1",
    "_score": null,
    "_source": {
      "name": "Company 01"
    },
    "sort": [
      "company 01"
    ]
  },
  {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "2",
    "_score": null,
    "_source": {
      "name": "Company 02"
    },
    "sort": [
      "company 02"
    ]
  }
]
Reference: Sorting and Collations
Source: "elasticsearch - Sorting on full strings with Elasticsearch" on Stack Overflow: https://stackoverflow.com/questions/52827860/