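I have multiple thresholds (usually 2, but the number may vary). For each threshold, I want to find, in each category, the row with the largest value that is less than or equal to that threshold.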
For example, given two thresholds, 5 and 10:
mytable:
category | val | data
---------+-----+---------
1 | 1 | 'foo'
1 | 3 | 'bar'
1 | 4 | 'baz'
2 | 2 | 'quz'
2 | 5 | 'wibble'
2 | 6 | 'wobble'
2 | 8 | 'ham'
2 | 12 | 'spam'
3 | 1 | 'eggs'
So the result should be:
category | val | data
---------+-----+---------
1 | 4 | 'baz' \
2 | 5 | 'wibble' | These are <= threshold 5
3 | 1 | 'eggs' /
1 | 4 | 'baz' \
2 | 8 | 'ham' | These are <= threshold 10
3 | 1 | 'eggs' /
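Note: it's fine if the rows are distinct, but that isn't required.
So far I can only write the query for a single threshold (basically the standard greatest-n-per-group query):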
SELECT t1.category, t1.val, t1.data
FROM mytable t1
JOIN (
    -- Largest val per category that does not exceed the threshold
    SELECT category, MAX(val) AS val
    FROM mytable
    WHERE val <= @threshold
    GROUP BY category
) AS t2
ON t1.category = t2.category AND t1.val = t2.val
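How can I handle multiple thresholds?
If it matters, I'm using T-SQL. A generic SQL query would be nice, but isn't required.
Best answer
I would simply specify the thresholds as rows so that they are easier to join against.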
DECLARE @t TABLE (category int, val int, data varchar(10));
INSERT INTO @t VALUES
(1, 1, 'foo'),
(1, 3, 'bar'),
(1, 4, 'baz'),
(2, 2, 'quz'),
(2, 5, 'wibble'),
(2, 6, 'wobble'),
(2, 8, 'ham'),
(2, 12, 'spam'),
(3, 1, 'eggs');
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY threshold, category ORDER BY val DESC) AS rn
FROM (VALUES
(5),
(10)
) thresholds(threshold)
JOIN @t AS t ON val <= threshold
) AS x
WHERE rn = 1
ORDER BY threshold, category
If the (VALUES ...) clause is not available, you can simply use FROM (SELECT 5 AS Threshold UNION ALL SELECT 10) thresholds instead.
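For reference, here is a minimal sketch of what the full query might look like with that UNION ALL substitution. It assumes the question's table mytable (category, val, data) already exists; the aliases th, t, and x are only illustrative names, not part of the original answer.

```sql
-- Sketch only: assumes mytable(category, val, data) from the question exists.
SELECT threshold, category, val, data
FROM (
    SELECT th.threshold, t.category, t.val, t.data,
           -- Rank rows within each (threshold, category) pair, largest val first
           ROW_NUMBER() OVER (PARTITION BY th.threshold, t.category
                              ORDER BY t.val DESC) AS rn
    FROM (
        -- Thresholds expressed as rows, without the VALUES table constructor
        SELECT 5 AS threshold
        UNION ALL
        SELECT 10
    ) AS th
    JOIN mytable AS t
      ON t.val <= th.threshold   -- keep only rows at or below the threshold
) AS x
WHERE rn = 1                     -- the largest qualifying val per pair
ORDER BY threshold, category;
```

Because ROW_NUMBER assigns a unique number to every row in a partition, this returns exactly one row per threshold and category even when two rows tie on val. Adding more thresholds only requires another UNION ALL SELECT line.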
Regarding "SQL given multiple thresholds, get the greatest value below each threshold", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54475519/