postgresql - How to set up a full-text search query in PostgreSQL

Tags: postgresql full-text-search with-statement

I am new to PostgreSQL and have run into problems implementing full-text search. This is my current setup:

CREATE DATABASE test;

CREATE TABLE data_table (
   id BIGSERIAL PRIMARY KEY,
   name VARCHAR(160) NOT NULL,
   description VARCHAR NOT NULL
);

CREATE INDEX data_table_idx ON data_table 
USING gin(to_tsvector('English', name || ' ' || description)); 

INSERT INTO data_table (name, description) VALUES 
    ('Penguin', 'This is the Linux penguin.'), 
    ('Gnu', 'This is the GNU gnu.'), 
    ('Elephant', 'This is the PHP elephant.'), 
    ('Elephant', 'This is the postgres elephant.'), 
    ('Duck', 'This is the duckduckgo duck.'), 
    ('Cat', 'This is the GitHub cat.'), 
    ('Bird', 'This is the Twitter bird.'), 
    ('Lion', 'This is the Leo lion.');

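Now I am trying to search the table for a given user input and return each matching data row in full, with the matches highlighted. My attempt looks like this: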

WITH 
    q AS ( SELECT plainto_tsquery('English', 'elephants php') AS query ),
    d AS ( SELECT (name || ' ' || description) AS document FROM data_table ),
    t AS ( SELECT to_tsvector('English', d.document) AS textsearch FROM d ),
    r AS ( SELECT ts_rank_cd(t.textsearch, q.query) AS rank FROM t, q )
SELECT data_table.*, ts_headline('german', d.document, q.query) AS matches
FROM data_table, q, d, t , r
WHERE q.query @@ t.textsearch 
ORDER BY r.rank DESC 
LIMIT 10;

This gives me the following output:

 id |   name   |          description           |              matches               
----+----------+--------------------------------+------------------------------------
  5 | duck     | This is the duckduckgo duck.   | Penguin This is the Linux penguin.
  2 | Gnu      | This is the GNU gnu.           | Gnu This is the GNU gnu.
  3 | Elephant | This is the PHP elephant.      | Penguin This is the Linux penguin.
  4 | elephant | This is the postgres elephant. | Penguin This is the Linux penguin.
  6 | Cat      | This is the GitHub cat.        | Penguin This is the Linux penguin.
  1 | Penguin  | This is the Linux penguin.     | Gnu This is the GNU gnu.
  1 | Penguin  | This is the Linux penguin.     | Penguin This is the Linux penguin.
  2 | Gnu      | This is the GNU gnu.           | Penguin This is the Linux penguin.
  4 | elephant | This is the postgres elephant. | Gnu This is the GNU gnu.
  3 | Elephant | This is the PHP elephant.      | Gnu This is the GNU gnu.
(10 rows)

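So the query does return something, but the result is not sorted by rank, every document is combined with every name/description pair, and the only thing that works is the correct highlighting of search results in the document. What am I doing wrong, and how can I fix it?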
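
The row multiplication described above is what a comma-separated FROM list produces when no join condition links its members. A minimal sketch of the effect, assuming the data_table and the eight sample rows from the setup above (the alias combined_rows is only illustrative):

```sql
-- d holds one document per row of data_table, but it carries no id,
-- so nothing ties a document back to the row it was built from.
WITH d AS (
    SELECT (name || ' ' || description) AS document FROM data_table
)
-- A comma in FROM is a cross join: every row of data_table is paired
-- with every row of d, giving 8 * 8 = 64 rows for the sample data.
SELECT count(*) AS combined_rows
FROM data_table, d;
```

Carrying id through each CTE, or computing everything in a single pass over data_table, keeps each document attached to its own row.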

Best answer

I finally got it working. My solution is below; I hope it helps someone. If anyone knows a better solution, with better or faster indexing, I would be glad to hear it.

Query:

WITH 
    q AS ( SELECT to_tsquery('german', 'elephant | php') AS query ),
    d AS ( SELECT id, (name || ' ' || description) AS doc FROM data_table ),
    t AS ( SELECT id, doc, to_tsvector('german', doc) AS vector FROM d ),
    r AS ( 
        SELECT id, doc, ts_rank_cd(vector, query) AS rank 
        FROM t, q
        WHERE q.query @@ vector
        ORDER BY rank DESC 
    )
SELECT id, ts_headline('german', doc, q.query) AS matches, rank
FROM r, q
ORDER BY rank DESC;

Result:

 id |                         matches                         | rank 
----+---------------------------------------------------------+------
  3 | <b>Elephant</b> This is the <b>PHP</b> <b>elephant</b>. |  0.3
  4 | <b>elephant</b> This is the postgres <b>elephant</b>.   |  0.2
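
On the open point about indexing: the query above builds its tsvector with the 'german' configuration, while data_table_idx was created with the English one, so the planner cannot use that index for it. A possible variant, offered as a sketch rather than a definitive solution, repeats the indexed expression in the WHERE clause and does all the work in one pass over the table. It assumes the table and index from the question; the aliases dt and q are only illustrative.

```sql
-- Sketch only: assumes data_table and data_table_idx from the question.
SELECT
    dt.id,
    -- Highlight the matches in the same text that is being searched.
    ts_headline('english', dt.name || ' ' || dt.description, q.query) AS matches,
    ts_rank_cd(
        to_tsvector('english', dt.name || ' ' || dt.description),
        q.query
    ) AS rank
FROM data_table AS dt,
     to_tsquery('english', 'elephant | php') AS q(query)
-- Same expression and configuration as the GIN index definition,
-- which is what lets the planner consider data_table_idx.
WHERE to_tsvector('english', dt.name || ' ' || dt.description) @@ q.query
ORDER BY rank DESC
LIMIT 10;
```

With only a handful of rows the planner will most likely prefer a sequential scan anyway, so whether the index is used is best checked with EXPLAIN on a realistically sized table.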

Regarding postgresql - How to set up a full-text search query in PostgreSQL, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/22732507/
