I seem to be somewhat stuck on how best to solve this requirement. I realize this question is closely related to these other questions:
- Problem with Full Outer Join not working as expected
- What is the difference when adding a filter criteria to an outer join instead of a where clause?
- Oracle outer join not working as expected
- and probably more...
The additional question here is what the general consensus is on how to solve this problem.
IF OBJECT_ID('tempdb..#A') IS NOT NULL DROP TABLE #A
IF OBJECT_ID('tempdb..#B') IS NOT NULL DROP TABLE #B
GO
CREATE TABLE #A (key1    int NOT NULL PRIMARY KEY,
                 value1  int NOT NULL,
                 value2  int NOT NULL,
                 is_even AS (CASE WHEN key1 % 2 = 0 THEN 1 ELSE 0 END))
CREATE TABLE #B (key1    int NOT NULL PRIMARY KEY,
                 value1  int NOT NULL,
                 value2  int NOT NULL,
                 is_even AS (CASE WHEN key1 % 2 = 0 THEN 1 ELSE 0 END))
GO
-- dummy data
INSERT #A (key1, value1, value2)
SELECT TOP 10 key1   = ROW_NUMBER() OVER (ORDER BY x1.object_id),
              value1 = ROW_NUMBER() OVER (ORDER BY x1.object_id) % 7,
              value2 = ROW_NUMBER() OVER (ORDER BY x1.object_id) % 5
  FROM master.sys.objects x1, master.sys.objects x2, master.sys.objects x3
INSERT #B (key1, value1, value2)
SELECT key1, value1, value2
  FROM #A
GO
-- create holes but keep SOME overlap
DELETE #A WHERE value1 > value2 -- removes 3 records
DELETE #B WHERE value1 < value2 -- removes 3 records
GO
-- show effect on tables
--SELECT * FROM #A ORDER BY key1
--SELECT * FROM #B ORDER BY key1
GO
-- create complete overview
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM #A a
  FULL OUTER JOIN #B b
               ON b.key1 = a.key1
 ORDER BY 1
GO
-- what if we only want the even records
-- THIS DOES NOT WORK !
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM #A a
  FULL OUTER JOIN #B b
               ON b.key1 = a.key1
              AND b.is_even = 1
 WHERE a.is_even = 1
 ORDER BY 1
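I know why it does not work; I'm just wondering what the clearest way is to make it work while keeping it readable for others. Bonus points if it also works on systems other than MSSQL.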
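To make the failure concrete: the WHERE a.is_even = 1 filter is evaluated after the join, so every row that exists only in #B (where all columns of a are NULL) is discarded, and the FULL OUTER JOIN effectively degrades to a left join. The following is a minimal sketch that lists the rows lost this way. It assumes the temp tables from the script above still exist and that the dummy data ended up with key1 values 1 through 10.

```sql
-- Even keys that exist only in #B: these are the rows
-- the "THIS DOES NOT WORK" query silently drops.
SELECT b.key1
  FROM #B b
 WHERE b.is_even = 1
   AND NOT EXISTS (SELECT 1
                     FROM #A a
                    WHERE a.key1 = b.key1)
 ORDER BY b.key1
-- Assuming key1 runs 1..10, this lists 6 and 10,
-- while the failing query only returns keys 2, 4 and 8.
```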
"My" solutions so far are:
By catching the NULLs caused by the OUTER effect:
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM #A a
  FULL OUTER JOIN #B b
               ON b.key1 = a.key1
 WHERE ISNULL(a.is_even, b.is_even) = 1
 ORDER BY 1
By means of a CTE:
;WITH a (key1, value1, value2)
   AS (SELECT key1, value1, value2
         FROM #A
        WHERE is_even = 1),
      b (key1, value1, value2)
   AS (SELECT key1, value1, value2
         FROM #B
        WHERE is_even = 1)
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM a
  FULL OUTER JOIN b
               ON b.key1 = a.key1
 ORDER BY 1
By means of subqueries:
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM (SELECT key1, value1, value2
          FROM #A
         WHERE is_even = 1) a
  FULL OUTER JOIN (SELECT key1, value1, value2
                     FROM #B
                    WHERE is_even = 1) b
               ON b.key1 = a.key1
 ORDER BY 1
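Although I prefer the first solution, the CTE and/or subquery solutions look more obvious, even if they add a lot of bulk to the code. (And I'm not much of a fan of CTEs =)
Any opinions? Other solutions? Remarks (e.g. regarding performance on "real" data)?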
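Regarding the wish for something that works outside MSSQL, here is a hedged sketch of the subquery variant using only widely supported syntax: COALESCE instead of ISNULL, and "expression AS alias" instead of "alias = expression". The table names A and B are hypothetical permanent tables standing in for #A and #B, each assumed to have an is_even column. Engines without FULL OUTER JOIN support (MySQL, for example) would still need a different approach.

```sql
-- Portable sketch: A and B are hypothetical stand-ins for #A and #B,
-- each assumed to have key1, value1, value2 and is_even columns.
SELECT COALESCE(a.key1, b.key1) AS key1,
       a.value1 AS value1a, a.value2 AS value2a,
       b.value1 AS value1b, b.value2 AS value2b
  FROM (SELECT key1, value1, value2
          FROM A
         WHERE is_even = 1) a
  FULL OUTER JOIN (SELECT key1, value1, value2
                     FROM B
                    WHERE is_even = 1) b
               ON b.key1 = a.key1
 ORDER BY 1
```

The table aliases are written without AS on purpose, since some engines (Oracle among them) do not accept AS in front of a table alias.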
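Best answer
Your two approaches "by means of a CTE" and "by means of subqueries" are exactly the same; which one you use is purely a matter of personal preference.
All 3 queries have the same estimated cost and the same I/O: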
Table '#B'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#A'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
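However, the first one has an extra Filter step, whereas the subquery/CTE approaches are able to apply the predicate is_even = 1 at the same time as the clustered index scan.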
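For anyone who wants to reproduce this comparison, the sketch below shows one possible way to do it. It is not necessarily how the numbers above were obtained. It assumes #A and #B still exist in the current session. SET STATISTICS IO prints the per-table reads, and SET STATISTICS PROFILE returns the executed plan operators as an extra result set, where the additional Filter row should be visible for the first variant.

```sql
SET STATISTICS IO ON       -- prints scan count and logical/physical reads per table
SET STATISTICS PROFILE ON  -- returns the executed plan operators as an extra result set
GO
-- variant 1: filter applied after the join
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM #A a
  FULL OUTER JOIN #B b
               ON b.key1 = a.key1
 WHERE ISNULL(a.is_even, b.is_even) = 1
GO
-- variant 3: filter applied inside the subqueries
SELECT key1 = ISNULL(a.key1, b.key1),
       value1a = a.value1, value2a = a.value2,
       value1b = b.value1, value2b = b.value2
  FROM (SELECT key1, value1, value2 FROM #A WHERE is_even = 1) a
  FULL OUTER JOIN (SELECT key1, value1, value2 FROM #B WHERE is_even = 1) b
               ON b.key1 = a.key1
GO
SET STATISTICS PROFILE OFF
SET STATISTICS IO OFF
GO
```

On ten rows the difference is negligible; the comparison only becomes meaningful against realistically sized tables.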
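So I would go with either the subquery approach or the CTE approach, whichever you find more visually appealing. When it comes to SQL, don't be fooled into thinking that less is more; writing a more verbose query can be more efficient.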
Source: a similar question on Stack Overflow about SQL full outer joins with filtered data: https://stackoverflow.com/questions/20287475/