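MySQL SUBSTRING_INDEX: extract every distinct substring of a column

Tags: mysql split substring

I am trying to get every distinct value of the strings between delimiters in MySQL. I tried the SUBSTRING_INDEX function; it works for the first string and for continuations of the first string, but not for the second string. This is what I mean: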

[Images not available: table x and the result of the query below]

SELECT SUBSTRING_INDEX(path, ':', 2) as p, sum(count) as N From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 3) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 4) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 5) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 6) as p, sum(count) From x Group by p;

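I tried adding the queries SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 2), ':', 2) as p, sum(count) From x Group by p UNION SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 4), ':', 2) as p, sum(count) From x Group by p, but the result is still the same. What I want is not only the results for combinations that start with A1, A2, A3, but also the strings whose first extracted node is B2, C2, D2, as in the table below: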

+---------------+----+
|   p           |  N |
+---------------+----+
| :A1           | 4  |
| ...           | ...|
| :B1           | 3  |
| :B1:C2        | 2  |
|...            | ...|
+---------------+----+

What is the right function to get a result like this? Any help is appreciated, thanks.
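
A side note on why the nested attempt changes nothing: with a positive count, SUBSTRING_INDEX keeps the text to the left of the count-th delimiter, so nesting two positive counts can only yield another prefix of the path. A minimal check on a literal that matches the sample path format (the column aliases are illustrative only):

```sql
select
  -- everything left of the 2nd ':' is the first node
  substring_index(':A1:B2:C1:D1:G1', ':', 2) as plain_prefix,   -- ':A1'
  -- the inner call returns ':A1:B2:C1'; the outer call cuts it back to the same prefix
  substring_index(substring_index(':A1:B2:C1:D1:G1', ':', 4), ':', 2) as nested_prefix,   -- ':A1'
  -- a negative count works from the right, so the chunk can start at a later node
  substring_index(substring_index(':A1:B2:C1:D1:G1', ':', 4), ':', -2) as inner_nodes;    -- 'B2:C1'
```

Only the negative count moves the starting point away from the beginning of the path; note that it also drops the leading ':'.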

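Best answer

This assumes that every node string on the path is two characters long and that all paths have the same length.

Plan

  • create a sequence of valid (start, length) substring windows that run from some start position to the end of the path, using the fixed width of 2 characters per node (3 including the ':' delimiter)
  • join that sequence to itself to also get windows that stop before the end of the path
  • take substrings of x.path using the start and length values computed above
  • aggregate sum(x.count) over the resulting x.path substrings

Setup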

create table x
(
  path varchar(23) primary key not null,
  count integer not null
);

insert into x
( path, count )
values
( ':A1:B2:C1:D1:G1' , 3 ),
( ':A1:B2:C1:D1:G4' , 1 ),
( ':A2:B1:C2:D2:G4' , 2 )
;

drop view if exists digits_v;
create view digits_v
as
select 0 as n
union all
select 1 union all select 2 union all select 3 union all 
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
;
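
If the server is MySQL 8.0 or later (an assumption; the original answer does not state a version), the same 0 to 9 sequence could be generated without a helper view by a recursive common table expression. This is only a sketch of an alternative, and seq is an arbitrary name:

```sql
with recursive seq (n) as (
  select 0                             -- anchor row
  union all
  select n + 1 from seq where n < 9    -- stop after n = 9
)
select n from seq;
```

The view stays the more portable choice, since it also works on older versions that lack WITH RECURSIVE.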

Query

-- sum the counts for every fixed-width chunk of every path
select substring(x.path, `start`, `len`) as chunk, sum(x.count)
from x
cross join
(
  -- all (start, len) windows; each node is ':' plus 2 characters, so 3 wide
  select o1.`start`, o2.`len`
  from
  (
    -- candidate start positions: 1, 4, 7, 10, 13
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
    from digits_v seq
    where 1 + 3 * seq.n between 1 and 15
    and   15 - 3 * seq.n  between 1 and 15
  ) o1
  inner join
  (
    -- candidate lengths: 15, 12, 9, 6, 3
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
    from digits_v seq
    where 1 + 3 * seq.n between 1 and 15
    and   15 - 3 * seq.n  between 1 and 15
  ) o2
  -- keep only lengths that fit between o1.start and the end of the path
  on  o2.`start` >= o1.`start`
) splices
where substring(x.path, `start`, `len`) <> ''
group by substring(x.path, `start`, `len`)
order by length(substring(x.path, `start`, `len`)), substring(x.path, `start`, `len`)
;
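
If the two-character assumption does not hold, the same aggregation might be expressed with nested SUBSTRING_INDEX calls that address nodes by position instead of by character offset. The following is a sketch only, not part of the original answer. It assumes every path starts with ':' and has at most 10 nodes (the range of digits_v); first_node, last_node, cnt, w and t are hypothetical names.

```sql
select t.chunk, sum(t.cnt) as N
from
(
  select
    -- the inner call keeps nodes 1..last_node; the outer call keeps the last
    -- (last_node - first_node + 1) of them; concat restores the leading ':'
    concat(':', substring_index(
                  substring_index(x.path, ':', w.last_node + 1),
                  ':', -(w.last_node - w.first_node + 1))) as chunk,
    x.count as cnt
  from x
  cross join
  (
    -- every pair of 1-based node positions with first_node <= last_node
    select a.n + 1 as first_node, b.n + 1 as last_node
    from digits_v a
    inner join digits_v b on b.n >= a.n
  ) w
  -- number of nodes in a path = number of ':' characters it contains
  where w.last_node <= length(x.path) - length(replace(x.path, ':', ''))
) t
group by t.chunk
order by length(t.chunk), t.chunk;
```

The derived table is used so that the outer GROUP BY and ORDER BY refer to a real column and not to a select-list alias.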

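Output (abridged; the :A3 rows are not among the three rows inserted in the setup, so they presumably come from the full table shown in the question)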

+-----------------+--------------+
|      chunk      | sum(x.count) |
+-----------------+--------------+
| :A1             |            4 |
| :A2             |            3 |
| :A3             |            3 |
| ...             |          ... |
| :A1:B2          |            4 |
| :A2:B1          |            3 |
| :A3:B3          |            2 |
| :A3:B4          |            1 |
| ...             |          ... |
| :A1:B2:C1       |            4 |
| :A2:B1:C2       |            2 |
| :A2:B1:D2       |            3 |
| :A3:B3:C4       |            2 |
| :A3:B4:C2       |            1 |
| ...             |          ... |
| :A1:B2:C1:D1    |            4 |
| :A2:B1:C2:D2    |            2 |
| :A3:B3:C4:D3    |            2 |
| :A3:B4:C2:D3    |            1 |
| ...             |          ... |
| :A1:B2:C1:D1:G1 |            3 |
| :A1:B2:C1:D1:G4 |            1 |
| :A2:B1:C2:D2:G4 |            2 |
| :A3:B3:C4:D3:G7 |            2 |
| :A3:B4:C2:D3:G7 |            1 |
+-----------------+--------------+

Regarding "MySQL SUBSTRING_INDEX: extract every distinct substring of a column", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34578043/