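mysql - How to use unique column values as input to another SELECT statement

Tags: mysql stored-procedures query-optimization

I have a table (MySQL) with a column named binID. The values in this column range from 1 to 70.

What I want to do is select the unique values of that column (which should be the numbers 1 to 70), then iterate over them, using each value (call it theBinID) as a parameter to another SELECT statement, such as: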

SELECT * FROM MyTable WHERE binID = theBinID ORDER BY createdDate DESC LIMIT 10

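Basically, I want to get the 10 most recent rows for each binID.

I don't believe there is a way to do this with a basic SQL statement, although I would love for that to be the answer, so I wrote a stored procedure that does a SELECT DISTINCT of the binIDs, then iterates over them and populates a temporary table.

My question is about optimization. If I fetch 100K rows, it takes 1.7 seconds on average. Executing my stored procedure to fetch 700 rows (70 bins, 10 records each) takes 1.4 seconds. I realize 0.3 seconds could be considered a decent improvement, but I was hoping to get this under one second on the 100K-row table.

Is there a better way?

The complete stored procedure is this: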

BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE binID INT;
DECLARE cur1 CURSOR FOR SELECT DISTINCT heatmapBinID from MEStressTest ORDER BY heatmapBinID ASC;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

DROP TEMPORARY TABLE IF EXISTS TempResults;

CREATE TEMPORARY TABLE TempResults (
    `recordID` text NOT NULL,
    `queryTerm` text NOT NULL,
    `recordCreated` double(11,0) NOT NULL,
    `recordByID` text NOT NULL,
    `recordByName` text NOT NULL,
    `recordText` text NOT NULL,
    `recordSource` text NOT NULL,
    `rerecordCount` int(11) NOT NULL DEFAULT '0',
    `timecodeOffset` int(11) NOT NULL DEFAULT '-1',
    `recordByImageURL` text NOT NULL,
    `canDelete` int(11) NOT NULL DEFAULT '1',
    `heatmapBinID` int(11) DEFAULT NULL,
    `timelineBinID` int(11) DEFAULT NULL,
    PRIMARY KEY (`recordID`(20))
);

OPEN cur1;

read_loop: LOOP
    FETCH cur1 INTO binID;

    IF done THEN
        LEAVE read_loop;
    END IF;

    INSERT INTO TempResults (recordID, queryTerm, recordCreated, recordByID, recordByName, recordText, recordSource, rerecordCount, timecodeOffset, recordByImageURL, canDelete, heatmapBinID, timelineBinID)
    SELECT * FROM MEStressTest WHERE heatmapBinID = binID ORDER BY recordCreated DESC LIMIT numRecordsPerBin;
END LOOP;

CLOSE cur1;

SELECT * FROM TempResults ORDER BY heatmapBinID ASC, recordCreated DESC;

END

Best answer

Try simulating ROW_NUMBER OVER PARTITION in MySQL: http://www.sqlfiddle.com/#!2/fd8b5/4

Given this data:

create table sentai(
  band varchar(50),
  member_name varchar(50),
  member_year int not null
);

insert into sentai(band, member_name, member_year) values
('BEATLES','JOHN',1960),
('BEATLES','PAUL',1961),
('BEATLES','GEORGE',1962),
('BEATLES','RINGO',1963),
('VOLTES V','STEVE',1970),
('VOLTES V','MARK',1971),
('VOLTES V','BIG BERT',1972),
('VOLTES V','LITTLE JOHN',1973),
('VOLTES V','JAMIE',1964),
('ERASERHEADS','ELY',1990),
('ERASERHEADS','RAYMUND',1991),
('ERASERHEADS','BUDDY',1992),
('ERASERHEADS','MARCUS',1993);

Objective: find the three most recent members of each band.

First, we have to assign a row_number to each member, ordered by most recent year (descending):

select *,

  @rn := @rn + 1 as rn
from (sentai s, (select @rn := 0) as vars)
order by s.band, s.member_year desc;

Output:

|        BAND | MEMBER_NAME | MEMBER_YEAR | @RN := 0 | RN |
|-------------|-------------|-------------|----------|----|
|     BEATLES |       RINGO |        1963 |        0 |  1 |
|     BEATLES |      GEORGE |        1962 |        0 |  2 |
|     BEATLES |        PAUL |        1961 |        0 |  3 |
|     BEATLES |        JOHN |        1960 |        0 |  4 |
| ERASERHEADS |      MARCUS |        1993 |        0 |  5 |
| ERASERHEADS |       BUDDY |        1992 |        0 |  6 |
| ERASERHEADS |     RAYMUND |        1991 |        0 |  7 |
| ERASERHEADS |         ELY |        1990 |        0 |  8 |
|    VOLTES V | LITTLE JOHN |        1973 |        0 |  9 |
|    VOLTES V |    BIG BERT |        1972 |        0 | 10 |
|    VOLTES V |        MARK |        1971 |        0 | 11 |
|    VOLTES V |       STEVE |        1970 |        0 | 12 |
|    VOLTES V |       JAMIE |        1964 |        0 | 13 |

Then we reset the row number whenever a member belongs to a different band than the previous row:

select *,

  @rn := IF(@pg = s.band, @rn + 1, 1) as rn,
  @pg := s.band
from (sentai s, (select @pg := null, @rn := 0) as vars)
order by s.band, s.member_year desc;

Output:

|        BAND | MEMBER_NAME | MEMBER_YEAR | @PG := NULL | @RN := 0 | RN | @PG := S.BAND |
|-------------|-------------|-------------|-------------|----------|----|---------------|
|     BEATLES |       RINGO |        1963 |      (null) |        0 |  1 |       BEATLES |
|     BEATLES |      GEORGE |        1962 |      (null) |        0 |  2 |       BEATLES |
|     BEATLES |        PAUL |        1961 |      (null) |        0 |  3 |       BEATLES |
|     BEATLES |        JOHN |        1960 |      (null) |        0 |  4 |       BEATLES |
| ERASERHEADS |      MARCUS |        1993 |      (null) |        0 |  1 |   ERASERHEADS |
| ERASERHEADS |       BUDDY |        1992 |      (null) |        0 |  2 |   ERASERHEADS |
| ERASERHEADS |     RAYMUND |        1991 |      (null) |        0 |  3 |   ERASERHEADS |
| ERASERHEADS |         ELY |        1990 |      (null) |        0 |  4 |   ERASERHEADS |
|    VOLTES V | LITTLE JOHN |        1973 |      (null) |        0 |  1 |      VOLTES V |
|    VOLTES V |    BIG BERT |        1972 |      (null) |        0 |  2 |      VOLTES V |
|    VOLTES V |        MARK |        1971 |      (null) |        0 |  3 |      VOLTES V |
|    VOLTES V |       STEVE |        1970 |      (null) |        0 |  4 |      VOLTES V |
|    VOLTES V |       JAMIE |        1964 |      (null) |        0 |  5 |      VOLTES V |
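
The reset logic that @pg and @rn carry out can be sketched outside SQL as a single pass over the sorted rows. This is only an illustration of the idea, not how MySQL executes the query; the function name latest_per_band and its limit parameter are hypothetical, and the rows are the sample data from above.

```python
# Sample rows from the sentai table: (band, member_name, member_year).
SENTAI = [
    ("BEATLES", "JOHN", 1960), ("BEATLES", "PAUL", 1961),
    ("BEATLES", "GEORGE", 1962), ("BEATLES", "RINGO", 1963),
    ("VOLTES V", "STEVE", 1970), ("VOLTES V", "MARK", 1971),
    ("VOLTES V", "BIG BERT", 1972), ("VOLTES V", "LITTLE JOHN", 1973),
    ("VOLTES V", "JAMIE", 1964),
    ("ERASERHEADS", "ELY", 1990), ("ERASERHEADS", "RAYMUND", 1991),
    ("ERASERHEADS", "BUDDY", 1992), ("ERASERHEADS", "MARCUS", 1993),
]


def latest_per_band(rows, limit):
    """Number rows within each band, newest first, and keep those with rn <= limit."""
    # Same ordering as "order by s.band, s.member_year desc".
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    prev_band = None  # plays the role of @pg
    rn = 0            # plays the role of @rn
    kept = []
    for band, name, year in ordered:
        # Continue counting inside the same band, restart at 1 on a new band.
        rn = rn + 1 if band == prev_band else 1
        prev_band = band
        if rn <= limit:
            kept.append((band, name, year))
    return kept


if __name__ == "__main__":
    for row in latest_per_band(SENTAI, 3):
        print(row)
```

The sketch makes one dependency explicit: the counter is only meaningful if rows arrive already sorted by band and year. The MySQL manual warns that the evaluation order of expressions involving user variables within one statement is undefined, so the SQL version relies on the server applying the ORDER BY before the variables are evaluated.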

Then we select only the three most recent members of each band:

select x.band, x.member_name, x.member_year
from
(
  select *,
    @rn := IF(@pg = s.band, @rn + 1, 1) as rn,
    @pg := s.band
  from (sentai s, (select @pg := null, @rn := 0) as vars)
  order by s.band, s.member_year desc
) as x
where x.rn <= 3
order by x.band, x.member_year desc;

Output:

|        BAND | MEMBER_NAME | MEMBER_YEAR |
|-------------|-------------|-------------|
|     BEATLES |       RINGO |        1963 |
|     BEATLES |      GEORGE |        1962 |
|     BEATLES |        PAUL |        1961 |
| ERASERHEADS |      MARCUS |        1993 |
| ERASERHEADS |       BUDDY |        1992 |
| ERASERHEADS |     RAYMUND |        1991 |
|    VOLTES V | LITTLE JOHN |        1973 |
|    VOLTES V |    BIG BERT |        1972 |
|    VOLTES V |        MARK |        1971 |

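MySQL had no window functions (such as ROW_NUMBER OVER PARTITION) when this was written; they arrived in MySQL 8.0. On older versions, you can simulate them with variables. Please let us know whether this is faster than the cursor approach.

How it looks on an RDBMS that supports windowing: http://www.sqlfiddle.com/#!1/fd8b5/6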

with member_recentness as
(
  select row_number() over each_band as recent, *
  from sentai
  window each_band as (partition by band order by member_year desc)
)
select * 
from member_recentness
where recent <= 3;

Output:

| RECENT |        BAND | MEMBER_NAME | MEMBER_YEAR |
|--------|-------------|-------------|-------------|
|      1 |     BEATLES |       RINGO |        1963 |
|      2 |     BEATLES |      GEORGE |        1962 |
|      3 |     BEATLES |        PAUL |        1961 |
|      1 | ERASERHEADS |      MARCUS |        1993 |
|      2 | ERASERHEADS |       BUDDY |        1992 |
|      3 | ERASERHEADS |     RAYMUND |        1991 |
|      1 |    VOLTES V | LITTLE JOHN |        1973 |
|      2 |    VOLTES V |    BIG BERT |        1972 |
|      3 |    VOLTES V |        MARK |        1971 |
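
To try the window-function form without a database server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is swapped in purely as a stand-in engine (an assumption: the bundled SQLite is 3.25 or later, the first release with ROW_NUMBER()); the helper name top_recent and its limit parameter are hypothetical, and the window is written inline rather than with a named WINDOW clause.

```python
import sqlite3

# Sample rows from the sentai table: (band, member_name, member_year).
ROWS = [
    ("BEATLES", "JOHN", 1960), ("BEATLES", "PAUL", 1961),
    ("BEATLES", "GEORGE", 1962), ("BEATLES", "RINGO", 1963),
    ("VOLTES V", "STEVE", 1970), ("VOLTES V", "MARK", 1971),
    ("VOLTES V", "BIG BERT", 1972), ("VOLTES V", "LITTLE JOHN", 1973),
    ("VOLTES V", "JAMIE", 1964),
    ("ERASERHEADS", "ELY", 1990), ("ERASERHEADS", "RAYMUND", 1991),
    ("ERASERHEADS", "BUDDY", 1992), ("ERASERHEADS", "MARCUS", 1993),
]

QUERY = """
    select band, member_name, member_year
    from (
        select s.*,
               row_number() over (
                   partition by band order by member_year desc
               ) as recent
        from sentai s
    )
    where recent <= ?
    order by band, member_year desc
"""


def top_recent(limit):
    """Return the `limit` most recent members of each band."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "create table sentai("
            "band text, member_name text, member_year int not null)"
        )
        conn.executemany("insert into sentai values (?, ?, ?)", ROWS)
        return conn.execute(QUERY, (limit,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_recent(3):
        print(row)
```

The same shape should carry over to the question's table by partitioning on heatmapBinID and ordering by recordCreated desc, with the limit set to the number of records wanted per bin.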

Regarding mysql - How to use unique column values as input to another SELECT statement, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/10762598/
