elasticsearch - Elasticsearch terms filter with multiple values

Tags: elasticsearch

I have the following mapping:

PUT /files
{
   "mappings": {
      "file": {
         "properties": {
            "FileID": {
               "type": "integer"
            },
            "FolderID": {
               "type": "integer"
            }
         }
      }
   }
}

My data is:
PUT /clients/client/1
{
    "id":"1",
    "name":"Joe Doe", 
    "FolderIDs":["577173","245340","777035"],
    "Emails" : ["some@email.com", "other@email.com"]
}



PUT /files/file/1
{
    "FileID": "10550",
    "FolderID" : "577173"
}

My query is:
GET /_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "terms": {
               "FolderID": {
                  "index": "clients",
                  "type": "client",
                  "id": "1",
                  "path": "FolderIDs"
               }
            }
         }
      }
   }
}

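This returns the file with ID 10550, which is what I want. My question is how to do the same thing on an email field that holds a list of email addresses.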
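To make explicit what the terms lookup in the query above does, here is a minimal plain-Python sketch. It does not call Elasticsearch: the dictionaries and the helper `terms_lookup_filter` are hypothetical stand-ins for the two indices and the filter, and the `int()` coercion is an assumption that imitates the integer mapping of FolderID.

```python
# Plain-Python model of a terms lookup filter; no Elasticsearch involved.
# The data mirrors the documents in the question.

clients = {
    "1": {
        "id": "1",
        "name": "Joe Doe",
        "FolderIDs": ["577173", "245340", "777035"],
        "Emails": ["some@email.com", "other@email.com"],
    }
}

files = {
    "1": {"FileID": "10550", "FolderID": "577173"},
}


def terms_lookup_filter(docs, field, lookup_index, lookup_id, path):
    """Return ids of docs whose `field` equals any value stored at `path`
    in the looked-up document (hypothetical helper, not an ES API)."""
    # Step 1: fetch the lookup document and read the list of terms from it.
    wanted = {int(v) for v in lookup_index[lookup_id][path]}
    # Step 2: keep every document whose field value is one of those terms.
    # int() imitates the integer mapping of FolderID (an assumption).
    return [doc_id for doc_id, doc in docs.items() if int(doc[field]) in wanted]


matching = terms_lookup_filter(files, "FolderID", clients, "1", "FolderIDs")
print(matching)
```

The lookup document supplies the list of terms, and the filter then behaves like an ordinary terms filter over that list.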

Mapping:
PUT /emails
{
   "mappings": {
      "email": {
         "properties": {
            "EmailID": {
               "type": "integer"
            },
            "ADDRESS_FROM": {
               "type": "string",
               "index" : "not_analyzed"
            }
         }
      }
   }
}

Data:
PUT /emails/email/1
{
    "EmailID": "8335",
    "ADDRESS_FROM" : "random@email.com user@email.com"
}

How can I build a query that returns the emails whose ADDRESS_FROM does not contain any of the addresses in the client's Emails field?



Client 1 has ["some@email.com", "other@email.com"],
so email 1 should be returned, because its ADDRESS_FROM ("random@email.com user@email.com") contains none of client 1's addresses.

I have tried something like this (it does not work):
GET /_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "bool": {
               "must_not": [
                  {
                     "terms": {
                        "ADDRESS_FROM": {
                           "index": "clients",
                           "type": "client",
                           "id": "1",
                           "path": "Emails"
                        }
                     }
                  }
               ]
            }
         }
      }
   }
}

Best answer

Make sure the Emails field in the clients index is mapped as not_analyzed.

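Ideally, the ADDRESS_FROM field should be an array of email addresses:
"ADDRESS_FROM" : ["random@email.com", "user@email.com"]
instead of
"ADDRESS_FROM" : "random@email.com user@email.com"
If you cannot change the document structure, you need to use the whitespace analyzer for ADDRESS_FROM, so that each address is indexed as its own term. With not_analyzed, the whole string is indexed as a single term, which never equals an individual address.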
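The difference between these options comes down to which terms end up in the index. The sketch below imitates that in plain Python; `index_terms` is a hypothetical helper, and `str.split()` is only an approximation of the whitespace analyzer.

```python
def index_terms(value, analyzer):
    """Approximate the terms indexed for a string field (hypothetical helper).

    "not_analyzed" keeps each value as a single term; "whitespace" splits
    each value on whitespace. Arrays are handled value by value.
    """
    values = value if isinstance(value, list) else [value]
    terms = []
    for v in values:
        if analyzer == "not_analyzed":
            terms.append(v)
        elif analyzer == "whitespace":
            terms.extend(v.split())
        else:
            raise ValueError("unsupported analyzer: " + analyzer)
    return terms


single_string = "random@email.com user@email.com"
as_array = ["random@email.com", "user@email.com"]

one_term = index_terms(single_string, "not_analyzed")
split_terms = index_terms(single_string, "whitespace")
array_terms = index_terms(as_array, "not_analyzed")

print(one_term)
print(split_terms)
print(array_terms)
```

In this model, the array form with not_analyzed and the single string with the whitespace analyzer produce the same terms, which is why either one lets the terms filter match individual addresses.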

The following Elasticsearch example demonstrates the whitespace analyzer approach:

PUT /clients
{
   "mappings": {
      "client": {
         "properties": {
            "FolderIDs": {
               "type": "integer"
            },
           "Emails" : {
               "type": "string",
               "index" : "not_analyzed"
            }
         }
      }
   }
}
PUT /clients/client/1
{
    "id":"1",
    "name":"Joe Doe", 
    "FolderIDs":["577173","245340","777035"],
    "Emails" : ["some@email.com", "other@email.com"]
}

PUT /clients/client/2
{
    "id":"2",
    "name":"Joe Doe", 
    "FolderIDs":["577173","245340","777035"],
    "Emails" : ["random@email.com", "other@email.com"]
}

PUT /emails
{
   "mappings": {
      "email": {
         "properties": {
            "EmailID": {
               "type": "integer"
            },
            "ADDRESS_FROM": {
               "type": "string",
               "analyzer": "whitespace"
            }
         }
      }
   }
}

PUT /emails/email/1
{
    "EmailID": "8335",
    "ADDRESS_FROM" : "random@email.com user@email.com"
}

Example query 1:
POST emails/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "bool": {
               "must_not": [
                  {
                     "terms": {
                        "ADDRESS_FROM": {
                           "index": "clients",
                           "type": "client",
                           "id": "1",
                           "path": "Emails"
                        }
                     }
                  }
               ]
            }
         }
      }
   }
}

 "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "emails",
            "_type": "email",
            "_id": "1",
            "_score": 1,
            "_source": {
               "EmailID": "8335",
               "ADDRESS_FROM": "random@email.com user@email.com"
            }
         }
      ]
   }

Example query 2 (0 hits):
POST emails/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "bool": {
               "must_not": [
                  {
                     "terms": {
                        "ADDRESS_FROM": {
                           "index": "clients",
                           "type": "client",
                           "id": "2",
                           "path": "Emails"
                        }
                     }
                  }
               ]
            }
         }
      }
   }
}

  "hits": {
      "total": 0,
      "max_score": null,
      "hits": []
   }
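The two results above can be checked with a small plain-Python model of the must_not terms lookup. Nothing here talks to Elasticsearch: `must_not_terms_lookup` is a hypothetical helper, and splitting on whitespace stands in for the whitespace analyzer on ADDRESS_FROM.

```python
# Documents from the answer, reduced to the fields the filter touches.
clients = {
    "1": {"Emails": ["some@email.com", "other@email.com"]},
    "2": {"Emails": ["random@email.com", "other@email.com"]},
}

emails = {
    "1": {"EmailID": "8335", "ADDRESS_FROM": "random@email.com user@email.com"},
}


def must_not_terms_lookup(client_id):
    """Return ids of emails sharing no ADDRESS_FROM term with the client's Emails."""
    # Emails is not_analyzed, so each address is looked up as an exact term.
    excluded = set(clients[client_id]["Emails"])
    hits = []
    for email_id, doc in emails.items():
        # Approximation of the whitespace analyzer: one term per address.
        terms = set(doc["ADDRESS_FROM"].split())
        # must_not: keep the email only if no term is in the excluded set.
        if terms.isdisjoint(excluded):
            hits.append(email_id)
    return hits


hits_client_1 = must_not_terms_lookup("1")
hits_client_2 = must_not_terms_lookup("2")
print(hits_client_1)
print(hits_client_2)
```

Client 2 lists random@email.com, which is one of the terms of email 1, so the must_not clause excludes that email and the second query has no hits.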

A similar question about "elasticsearch - Elasticsearch terms filter with multiple values" can be found on Stack Overflow: https://stackoverflow.com/questions/36613112/
