php - ElasticSearch: how to search a comma-separated string in a comma-separated field?

Tags: php search indexing elasticsearch

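In this scenario I need to search a comma-separated string against a comma-separated field, so I did the following in my mapping, but it fails with the error MapperParsingException[Analyzer [comma] not found for field [conduct_days]].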

            $course = new Course();
            $course->no = '1231321';
            .......
            .......
            $course->save();

            // Now index the new created course

            $client = \Elasticsearch\ClientBuilder::create()->build();




            $params = [
                'index' => 'my_index',
                'type' => 'my_resources',
                'body' => [
                    'my_resources' => [
                        '_source' => [
                            'enabled' => true
                        ],
                        'settings' => [
                            "analysis" => [
                                "tokenizer" => [
                                    "comma" => [
                                        "type" => "pattern",
                                        "pattern" => ","
                                    ]
                                ],
                                "analyzer" => [
                                    "comma" => [
                                        "type" => "custom",
                                        "tokenizer" => "comma"
                                    ]
                                ]
                            ]
                        ],
                        'properties' => [
                            'conduct_days' => array(
                                'type' => 'string',
                                'analyzer' => 'comma'
                            ),
                            'no' => array(
                                'type' => 'string',
                                'analyzer' => 'standard'
                            ),
                            'created_at' => array(
                                'type' => 'date_time',
                                "format"=>"YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                            ),
                            'updated_at' => array(
                                'type' => 'date_time',
                                "format" => "YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                            ),
                            'deleted_at' => array(
                                'type' => 'date_time',
                                "format" => "YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                            ),
                            'created_by' => array(
                                'type' => 'string',
                                'analyzer' => 'standard'
                            ),
                            'updated_by' => array(
                                'type' => 'string',
                                'analyzer' => 'standard'
                            ),
                            'deleted_by' => array(
                                'type' => 'string',
                                'analyzer' => 'standard'
                            )
                        ]
                    ]
                ]
            ];

            // Update the index mapping
            $client->indices()->putMapping($params);

            $params = [
                'index' => 'promote_kmp',
                'type' => 'courses',
                'id' => uniqid(),
                'body' => [
                    'id'                      => $course->id,
                    'conduct_days'            => $course->conduct_days,
                    'no'                      => $course->no,
                    'created_at'              => $course->created_at,
                    'created_by'              => $loggedInUser,
                ]
            ];
            $client->index($params);

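Suppose I have to search for 1,3,5,7 in the conduct_days field, which can contain values such as "1,2", "1,2,3" or "1,3,5,6". For searching, I think I should explode the search term: for example, if the search term is 1,2, I would search twice, first for 1 and then for 2. Is there any other solution for searching?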

Best answer

You cannot pass settings inside the putMapping call; they will be ignored. settings are meant to be passed to the create call that creates the index:

    $params = [
        'index' => 'my_index',
        'body' => [
            'settings' => [
                "analysis" => [
                    "tokenizer" => [
                        "comma" => [
                            "type" => "pattern",
                            "pattern" => ","
                        ]
                    ],
                    "analyzer" => [
                        "comma" => [
                            "type" => "custom",
                            "tokenizer" => "comma"
                        ]
                    ]
                ]
            ]
        ]
    ];

    $response = $client->indices()->create($params);

Then you can call putMapping with your mapping type definition, but without the settings:

        $params = [
            'index' => 'my_index',
            'type' => 'my_resources',
            'body' => [
                'my_resources' => [
                    '_source' => [
                        'enabled' => true
                    ],
                    'properties' => [
                        'conduct_days' => array(
                            'type' => 'string',
                            'analyzer' => 'comma'
                        ),
                        'no' => array(
                            'type' => 'string',
                            'analyzer' => 'standard'
                        ),
                        // The field type is 'date'; 'date_time' is a format name, not a type
                        'created_at' => array(
                            'type' => 'date',
                            "format" => "YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                        ),
                        'updated_at' => array(
                            'type' => 'date',
                            "format" => "YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                        ),
                        'deleted_at' => array(
                            'type' => 'date',
                            "format" => "YYYY-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd"
                        ),
                        'created_by' => array(
                            'type' => 'string',
                            'analyzer' => 'standard'
                        ),
                        'updated_by' => array(
                            'type' => 'string',
                            'analyzer' => 'standard'
                        ),
                        'deleted_by' => array(
                            'type' => 'string',
                            'analyzer' => 'standard'
                        )
                    ]
                ]
            ]
        ];

        // Update the index mapping
        $client->indices()->putMapping($params);
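
On the search side of the question: once conduct_days is indexed with the comma analyzer, a match query on that field analyzes the search string with the same analyzer by default, so "1,3,5,7" is split into the terms 1, 3, 5 and 7 without exploding it in PHP. The following is only a sketch under assumptions: buildConductDaysSearch is a hypothetical helper, and the index and type names are borrowed from the mapping example above.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds the parameter
// array for a match query on the comma-analyzed conduct_days field.
function buildConductDaysSearch(string $days, bool $requireAll = false): array
{
    return [
        'index' => 'my_index',      // assumed: same index as the mapping above
        'type'  => 'my_resources',  // assumed: same type as the mapping above
        'body'  => [
            'query' => [
                'match' => [
                    'conduct_days' => [
                        // The raw comma-separated string; Elasticsearch
                        // tokenizes it with the field's analyzer.
                        'query'    => $days,
                        // 'or' matches documents containing any of the days,
                        // 'and' requires every day to be present.
                        'operator' => $requireAll ? 'and' : 'or',
                    ],
                ],
            ],
        ],
    ];
}

$params = buildConductDaysSearch('1,3,5,7');
// With a client built as in the question:
// $response = $client->search($params);
```

With the default 'or' operator a document containing "1,3,5,6" would match because it shares the days 1, 3 and 5; passing true as the second argument restricts results to documents that contain all requested days.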

Update

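However, in your case I think the best approach is to create an index template that contains both the settings (i.e. the analyzers) and the mappings. Then the only thing your application needs to care about is calling index() to index new Course documents. ES will create the index and the mapping at the right time (i.e. the first time you index a Course document).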

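Note that for this to work, you need to
  • delete your current index, as well as the indices->create() and indices->putMapping() calls in your code
  • create the index template using the /head/ plugin, Sense, or simply curl
  • only call index() in your code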
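
As a rough illustration of what such an index template could contain, the sketch below builds the template body in PHP and prints it as JSON, which could then be sent with curl or Sense. Several things here are assumptions: the template name courses_template and the helper buildCourseTemplate are made up, only the conduct_days field is shown, and the 'template' key for the index name pattern matches the older Elasticsearch versions implied by the 'string' field type (newer versions use 'index_patterns' instead).

```php
<?php
// Hypothetical helper: assembles an index template that carries both the
// analysis settings and the mapping, so neither has to be created from code.
function buildCourseTemplate(): array
{
    return [
        'name' => 'courses_template',        // assumed template name
        'body' => [
            'template' => 'my_index*',       // index name pattern (older ES syntax)
            'settings' => [
                'analysis' => [
                    'tokenizer' => [
                        'comma' => ['type' => 'pattern', 'pattern' => ','],
                    ],
                    'analyzer' => [
                        'comma' => ['type' => 'custom', 'tokenizer' => 'comma'],
                    ],
                ],
            ],
            'mappings' => [
                'my_resources' => [
                    'properties' => [
                        // Remaining fields from the mapping above are omitted for brevity.
                        'conduct_days' => ['type' => 'string', 'analyzer' => 'comma'],
                    ],
                ],
            ],
        ],
    ];
}

$template = buildCourseTemplate();

// Print the JSON body, e.g. to paste into Sense or a curl request.
echo json_encode($template['body'], JSON_PRETTY_PRINT), PHP_EOL;

// Alternatively, assuming the installed elasticsearch-php version exposes it:
// $client->indices()->putTemplate($template);
```

Keeping the analyzer and the mapping in the same template means the comma analyzer always exists by the time the conduct_days field is mapped, which is exactly what the original putMapping call was missing.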

Regarding "php - ElasticSearch: how to search a comma-separated string in a comma-separated field?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34870108/
