I have a table like this:
TITLE | DESCRIPTION
------------------------------------------------
test1 | value blah blah value
test2 | value test
test3 | test test test
test4 | value test value test
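How can I select only the rows in which a word is immediately followed by the same word (e.g. "blah blah", but not "blah bleh blah")?
The desired output should be just: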
TITLE | DESCRIPTION
------------------------------------------------
test1 | value blah blah value
test3 | test test test
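Best answer
For this problem (and many others) you can create a helper table of natural numbers. You only need to do this once, and the table can be reused for many purposes: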
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
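As a quick check of how far the doubling gets you, the following Python sketch mimics the five insert statements above on an in-memory list. The list name `seq` simply mirrors the table name; this is an illustration of the arithmetic, not part of the original answer.

```python
# Mimic: insert into seq values (1),(2),...,(8);
seq = list(range(1, 9))

# Mimic each "insert into seq select num+N from seq" statement.
# The right-hand side is built from the current contents first,
# then appended, so every step doubles the number of rows.
for offset in (8, 16, 32, 64):
    seq += [num + offset for num in seq]

print(len(seq), min(seq), max(seq))
```

After these four doubling steps the helper table holds the numbers 1 to 128, which is enough for descriptions containing up to 128 spaces.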
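You can then join that table in your query, with each number serving as the position of a word within the phrase. That lets you extract each word and compare it with the next one: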
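select title, description
from phrases
where description not in (
    /* descriptions in which some word equals the word right after it */
    select description
    from phrases p
    inner join seq
        /* one seq row per space, i.e. per pair of adjacent words */
        on seq.num <= length(p.description)
                    - length(replace(p.description, ' ', ''))
        /* word number num equals word number num+1 */
        and substring_index(substring_index(
                description, ' ', num), ' ', -1)
          = substring_index(substring_index(
                description, ' ', num + 1), ' ', -1)
)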
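Note that the outer query uses `not in`, so it returns the rows that do not contain an adjacent repeated word; change it to `in` to get the rows that do (test1 and test3). For the sample data, the query as written outputs: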
| title | description |
|-------|-----------------------|
| test2 | value test |
| test4 | value test value test |
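To make the nested substring_index calls easier to follow, here is a minimal Python sketch that emulates the subquery's logic on the sample rows. The helper names `substring_index` and `has_adjacent_repeat` are introduced here for illustration only and are not part of the original answer. The Python comparison is case-sensitive, whereas MySQL's default collations typically are not.

```python
def substring_index(s, delim, count):
    """Rough emulation of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        # Everything before the count-th delimiter, counted from the left.
        return delim.join(parts[:count])
    if count < 0:
        # Everything after the count-th delimiter, counted from the right.
        return delim.join(parts[count:])
    return ""


def has_adjacent_repeat(description):
    """True if some word equals the word immediately after it."""
    # Number of spaces = number of adjacent word pairs (same as the join condition).
    spaces = len(description) - len(description.replace(" ", ""))
    for num in range(1, spaces + 1):
        word = substring_index(substring_index(description, " ", num), " ", -1)
        next_word = substring_index(substring_index(description, " ", num + 1), " ", -1)
        if word == next_word:
            return True
    return False


rows = [
    ("test1", "value blah blah value"),
    ("test2", "value test"),
    ("test3", "test test test"),
    ("test4", "value test value test"),
]

with_repeat = [title for title, desc in rows if has_adjacent_repeat(desc)]
without_repeat = [title for title, desc in rows if not has_adjacent_repeat(desc)]

print(with_repeat)     # ['test1', 'test3']
print(without_repeat)  # ['test2', 'test4']
```

The inner call keeps the first `num` words and the outer call with -1 keeps only the last of those, which is how each number in the helper table picks out one word.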
Regarding "mySQL: Find duplicates of strings in a VARCHAR field?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37553151/