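I am trying to get every distinct value of the strings between delimiters in MySQL. I tried the SUBSTRING_INDEX function. It works for the first string and for continuations of the first string, but not for the second string. Here is what I mean:
Table x and the result: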
SELECT SUBSTRING_INDEX(path, ':', 2) as p, sum(count) as N From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 3) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 4) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 5) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 6) as p, sum(count) From x Group by p;
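I tried adding these queries:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 2), ':', 2) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 4), ':', 2) as p, sum(count) From x Group by p
but the result is still the same. I want results not only for the combinations that start with A1, A2, A3, but also for strings whose first extracted node is B2, C2, D2 and so on, as in the table below: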
+--------+-----+
| p      | N   |
+--------+-----+
| :A1    | 4   |
| ...    | ... |
| :B1    | 3   |
| :B1:C2 | 2   |
| ...    | ... |
+--------+-----+
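What is the right function to get a result like this? Any help is appreciated, thanks.
Best answer
This assumes that every string node on the path is two characters long and that all paths have the same length (15 characters in the sample data, which is the constant used in the query below).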
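Because the answer depends on that assumption, it may be worth checking the data first. The following is only a sketch: it assumes the table x and column path from the question, and it hard-codes five chunks per path to match the sample data. The pattern is my own illustration, not part of the original answer.

```sql
-- Sketch: list paths that do NOT consist of exactly five ':XX' chunks.
-- The count of 5 is an assumption taken from the sample data (15 characters).
select path
from x
where path not regexp '^(:[^:]{2}){5}$';
```

If this returns no rows, every path has the fixed layout the query relies on.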
Plan
- create a sequence of valid substrings running from some start position to the end of the path, using the fixed chunk length of 3 (the ':' delimiter plus a two-character node)
- join that sequence to itself to also get substrings that do not run to the end of the path
- take substrings of x.path using the start positions and lengths computed above
- aggregate the sum of count over those x.path substrings
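For context, the nested SUBSTRING_INDEX calls in the question return the same values because a positive count always keeps text from the left end of the string, so nesting two positive counts only shortens a prefix. The literal below is one path from the sample data; the statements are a small illustration rather than part of the original answer.

```sql
-- A positive count keeps everything left of the 3rd ':'.
select substring_index(':A1:B2:C1:D1:G1', ':', 3);                          -- ':A1:B2'

-- Nesting two positive counts is still a prefix, as in the question's attempt.
select substring_index(substring_index(':A1:B2:C1:D1:G1', ':', 4), ':', 2); -- ':A1'

-- A negative count keeps text from the right, which isolates an inner node.
select substring_index(substring_index(':A1:B2:C1:D1:G1', ':', 3), ':', -1); -- 'B2'
```

Position-based SUBSTRING, as used in the plan above, avoids the problem because the start position can be anywhere in the path.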
Setup
create table x
(
    path varchar(23) primary key not null,
    count integer not null
);
insert into x
( path, count )
values
( ':A1:B2:C1:D1:G1' , 3 ),
( ':A1:B2:C1:D1:G4' , 1 ),
( ':A2:B1:C2:D2:G4' , 2 )
;
drop view if exists digits_v;
create view digits_v
as
select 0 as n
union all
select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
;
Query
select substring(x.path, `start`, `len`) as chunk, sum(x.count)
from x
cross join
(
    -- every (start, len) pair that stays inside the 15-character path
    select o1.`start`, o2.`len`
    from
    (
        -- chunk starts 1, 4, 7, 10, 13, each paired with the length that reaches the end of the path
        select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
        from digits_v seq
        where 1 + 3 * seq.n between 1 and 15
        and 15 - 3 * seq.n between 1 and 15
    ) o1
    inner join
    (
        -- the same sequence again; its shorter lengths end the substring before the end of the path
        select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
        from digits_v seq
        where 1 + 3 * seq.n between 1 and 15
        and 15 - 3 * seq.n between 1 and 15
    ) o2
    on o2.`start` >= o1.`start`
) splices
where substring(x.path, `start`, `len`) <> ''
group by substring(x.path, `start`, `len`)
order by length(substring(x.path, `start`, `len`)), substring(x.path, `start`, `len`)
;
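On MySQL 8.0 or later, the digits view could be replaced by a recursive common table expression. The version below is a sketch under the same assumptions (five ':XX' chunks per path). The names seq, splices, pieces, cnt and N are introduced here for illustration and do not appear in the original answer.

```sql
-- Sketch for MySQL 8.0+: the same idea without the digits_v view.
with recursive seq (n) as (
    select 0
    union all
    select n + 1 from seq where n < 4        -- n = 0..4, one value per chunk position
),
splices as (
    -- s picks the starting chunk, l picks how many extra chunks to include;
    -- s.n + l.n <= 4 keeps start + len - 1 within the 15-character path
    select 1 + 3 * s.n as `start`, 3 * (l.n + 1) as `len`
    from seq s
    join seq l on s.n + l.n <= 4
)
select chunk, sum(cnt) as N
from (
    select substring(x.path, sp.`start`, sp.`len`) as chunk, x.count as cnt
    from x
    cross join splices sp
) pieces
group by chunk
order by char_length(chunk), chunk;
```

Wrapping the substring in a derived table lets the outer query group and order by the chunk alias instead of repeating the expression.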
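Output (abridged; it also lists paths such as :A3:B3:C4:D3:G7 that are not among the three rows inserted in the setup above)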
+-----------------+--------------+
| chunk | sum(x.count) |
+-----------------+--------------+
| :A1 | 4 |
| :A2 | 3 |
| :A3 | 3 |
| ... | ... |
| :A1:B2 | 4 |
| :A2:B1 | 3 |
| :A3:B3 | 2 |
| :A3:B4 | 1 |
| ... | ... |
| :A1:B2:C1 | 4 |
| :A2:B1:C2 | 2 |
| :A2:B1:D2 | 3 |
| :A3:B3:C4 | 2 |
| :A3:B4:C2 | 1 |
| ... | ... |
| :A1:B2:C1:D1 | 4 |
| :A2:B1:C2:D2 | 2 |
| :A3:B3:C4:D3 | 2 |
| :A3:B4:C2:D3 | 1 |
| ... | ... |
| :A1:B2:C1:D1:G1 | 3 |
| :A1:B2:C1:D1:G4 | 1 |
| :A2:B1:C2:D2:G4 | 2 |
| :A3:B3:C4:D3:G7 | 2 |
| :A3:B4:C2:D3:G7 | 1 |
+-----------------+--------------+
This question about using MySQL SUBSTRING_INDEX to extract every distinct string from a column comes from Stack Overflow: https://stackoverflow.com/questions/34578043/