mysql - 在 MySql 中提取具有特定模式的子字符串

标签 mysql sql

我有一个如下所示的文本字段:

option[A]sum[A]g3et[B]

我想获取 [ ] 中的文本,不要重复。 意思是得到:

A
B

不可能有像 [ [ ] ] 这样的 double 情况。

我知道这是一种在数据库中保存数据的可怕方式。我无法更改数据的保存方式。我只需要从此专栏中获得非常具体(一次性)的信息。

我尝试过:

SELECT substring_index(substring_index(sentence, '[', -1),']', 1)
FROM (SELECT 'THIS[A] IS A TEST' AS sentence) temp;

这给了我 A,但它不适用于许多 []

我想过使用正则表达式,但是我不知道我有多少个 [ ]

我该怎么做?

最佳答案

这不是 DB 的工作,但它是可能的:

CREATE TABLE tab(id INT, col VARCHAR(100));           
INSERT INTO tab(id, col) 
VALUES (1, 'option[A]sum[A]g3et[B]'), (2, '[Cosi]sum[A]g3et[ZZZZ]');      

SELECT DISTINCT *
FROM (
  SELECT id, RIGHT(val, LENGTH(val) - LOCATE('[', val)) AS val
  FROM
  (
    SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ']', n.n), ']', -1) AS val
    FROM tab t 
    CROSS JOIN 
    (
     SELECT a.N + b.N * 10 + 1 n
       FROM 
      (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
      ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
    ) n
    WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
  ) sub
) s
WHERE val <> ''
ORDER BY ID;

SqlFiddleDemo

注意:

根据 col 最大长度,您可能需要在 CROSS JOIN 部分生成更多数字。目前最多为 100。

输出:

enter image description here

工作原理:

  1. CROSS JOIN生成数表
  2. 根据]作为分隔符拆分字符串
  3. RIGHT(val, LENGTH(val) - LOCATE('[', val)) 删除直到 [
  4. 的部分
  5. 过滤掉空记录
  6. 仅获取 DISTINCT

最内层查询:

╔════╦══════════╗
║ id ║   val    ║
╠════╬══════════╣
║  1 ║ option[A ║
║  1 ║ sum[A    ║
║  1 ║ g3et[B   ║
║  1 ║          ║
╚════╩══════════╝

第二个子查询:

╔════╦═════╗
║ id ║ val ║
╠════╬═════╣
║  1 ║ A   ║
║  1 ║ A   ║
║  1 ║ B   ║
║  1 ║     ║
╚════╩═════╝

最外层查询:

╔════╦═════╗
║ id ║ val ║
╠════╬═════╣
║  1 ║ A   ║
║  1 ║ B   ║
╚════╩═════╝

I need the result of query per row.. not combined

所以添加简单的:

WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
  AND t.id = ?

编辑 2:

see http://sqlfiddle.com/#!9/8ee95/1 your query works partially for my data. I also changed the type to longtext.

您想在 MySQL 中解析 JSON。 正如我之前所说,在应用层解析并获取值(value)。这个答案仅用于演示/玩具目的,性能非常低。

如果你还是坚持SQL解决方案:

SELECT id, val,s.n
FROM (
  SELECT id, RIGHT(val, LENGTH(val) - LOCATE('[', val)) AS val,n
  FROM
  (
    SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ']', n.n), ']', -1) AS val, n.n
    FROM (SELECT id, REPLACE(col, '[]','') as col FROM tab) t
    CROSS JOIN 
    (
     SELECT e.N * 10000 + d.N * 1000 + c.N * 100 + a.N + b.N * 10 + 1 n
       FROM 
      (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
      ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
      ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
      ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
      ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e

    ) n
    WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
  ) sub
) s
WHERE val <> ''
GROUP BY id, val
HAVING n <> MAX(n)
ORDER BY id,n;

SqlFiddleDemo

输出:

╔═════╦═════════════╦════╗
║ id  ║    val      ║ n  ║
╠═════╬═════════════╬════╣
║  1  ║ CE31285LV4  ║  1 ║
║  1  ║ D32E        ║  3 ║
║  1  ║ GTX750      ║  5 ║
║  1  ║ M256S       ║  7 ║
║  1  ║ H2X1T       ║  9 ║
║  1  ║ FMLANE4U4   ║ 11 ║
╚═════╩═════════════╩════╝

编辑 3:

What exactly is done there? Why do you need n

CROSS JOIN 整个子查询只是一个理货表。就这些。如果 MySQL 具有生成数字序列的功能(如 generate_series 或预填充的数字表,则不需要 CROSS JOIN

SUBSTRING_INDEX 需要数表:

SUBSTRING_INDEX(str,delim,count)

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

关于mysql - 在 MySql 中提取具有特定模式的子字符串,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/37027037/

相关文章:

Sql Server 相当于 COUNTIF 聚合函数

mysql - 更新来自 SELECT 的数据

mysql - sudo yum install postfix mysql-libs 报错

php - mysql 日期格式与 php

mysql 如何使用给定的值集更新每一行的列

sql - ORACLE VARRAY 选择输出

php - mysql连接表并接收一行

sql - 查找所有要查看的引用文献

javascript - 使用php没有记录列时如何禁用按钮?

PHP:一种存储登录尝试次数的方法?