sql - How do I get the words that are not in the dictionary?

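I'm having trouble getting the words that are not in the dictionary (I use full-text search, specifically an ispell dictionary) but that do appear in the title column of the article table.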

The article table:

+----+-------------+
| id | title       |
+----+-------------+
| 1  | Lorem ipsum |
| 2  | Text example|
+----+-------------+

For example, the code below gives me the words of a sentence that are not in the dictionary.

SELECT token
FROM ts_debug('polish', 'Text lorem ipsum lala')
WHERE lexemes is null and alias != 'blank'

The database returns:

+-----------+
| token     |
+-----------+
| lorem     |
| ipsum     |
+-----------+

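How do I write SQL that lists every word from the article table's titles that is not in the dictionary? Do I have to use a for loop or something similar?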

Pseudocode:

for i = 0; i < count(*) from article; i++
    SELECT token
    FROM ts_debug('polish', article[i].title)
    WHERE lexemes is null and alias != 'blank'
end

Thanks in advance!

Best answer

Just get the unmatched words for each article, then use DISTINCT to filter out the duplicates.

SELECT DISTINCT token
FROM article,
LATERAL ts_debug('polish', article.title)
WHERE lexemes is null and alias != 'blank'
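As a variation on the query above, here is a minimal sketch that also reports how often each unknown word occurs and in which articles. It assumes the article table and the user-defined 'polish' text search configuration from the question, and PostgreSQL 9.3 or later for LATERAL; the column aliases occurrences and article_ids are names chosen for this illustration.

```sql
-- Sketch only: assumes the article table and the 'polish' configuration
-- from the question exist. LATERAL requires PostgreSQL 9.3 or later.
SELECT d.token,
       count(*)                 AS occurrences,  -- how many times the token appears
       array_agg(DISTINCT a.id) AS article_ids   -- which articles contain it
FROM article AS a
CROSS JOIN LATERAL ts_debug('polish', a.title) AS d
WHERE d.lexemes IS NULL        -- no dictionary recognized the token
  AND d.alias <> 'blank'       -- skip whitespace and punctuation tokens
GROUP BY d.token
ORDER BY occurrences DESC, d.token;
```

ts_debug returns each token as it is written in the title, so 'Lorem' and 'lorem' would be counted separately; grouping by lower(d.token) instead is one way to merge them, if that is what you want.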

However, your query does not seem to work for the english dictionary on PostgreSQL 9.3 anyway; no row in the output below has NULL lexemes:

regress=> SELECT * FROM ts_debug('english', 'sdfsASDADSsfdsfsdf fred to alan word another word') where alias != 'blank' ;
   alias   |   description   |       token        |  dictionaries  |  dictionary  |       lexemes        
-----------+-----------------+--------------------+----------------+--------------+----------------------
 asciiword | Word, all ASCII | sdfsASDADSsfdsfsdf | {english_stem} | english_stem | {sdfsasdadssfdsfsdf}
 asciiword | Word, all ASCII | fred               | {english_stem} | english_stem | {fred}
 asciiword | Word, all ASCII | to                 | {english_stem} | english_stem | {}
 asciiword | Word, all ASCII | alan               | {english_stem} | english_stem | {alan}
 asciiword | Word, all ASCII | word               | {english_stem} | english_stem | {word}
 asciiword | Word, all ASCII | another            | {english_stem} | english_stem | {anoth}
 asciiword | Word, all ASCII | word               | {english_stem} | english_stem | {word}
(7 rows)
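The output above hints at why the filter finds nothing here. In ts_debug, lexemes is NULL only when no dictionary recognized the token, while an empty array ({}) marks a stop word. The Snowball stemmer english_stem recognizes every word, so it never leaves lexemes NULL. A small sketch that makes the three cases explicit (the status column and its labels are names invented for this illustration):

```sql
-- Sketch: classify each token by what its lexemes value means.
-- Uses the built-in 'english' configuration, as in the output above.
SELECT token,
       CASE
           WHEN lexemes IS NULL THEN 'not recognized by any dictionary'
           WHEN lexemes = '{}'  THEN 'stop word'
           ELSE 'recognized'
       END AS status
FROM ts_debug('english', 'fred to alan')
WHERE alias <> 'blank';
```

With an ispell dictionary, unknown words show up as NULL only if no later dictionary in the configuration's list (such as a Snowball stemmer used as a fallback) picks them up; whether that holds for the 'polish' configuration in the question depends on how it was set up.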

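Also, LATERAL is only supported in PostgreSQL 9.3 and later. If you are on an older version, you will need a more complicated construct that calls ts_debug in the SELECT list of a subquery, for example: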

SELECT DISTINCT (x.ld).token
FROM (
   SELECT ts_debug('polish', article.title)
   FROM article
) x(ld)
WHERE (x.ld).lexemes is null and (x.ld).alias != 'blank';

This question and its answer are based on a similar question on Stack Overflow: https://stackoverflow.com/questions/22487888/
