我如何着手在具有以下签名的列表上编写类似 SQL 的完全连接操作?
fullJoin :: [a] -> [b] -> (a -> b -> Bool) -> [(Maybe a, Maybe b)]
fullJoin xs [] _ = map (\x -> (Just x, Nothing)) xs
fullJoin [] ys _ = map (\y -> (Nothing, Just y)) ys
fullJoin xs@(xh:xt) ys@(yh:yt) on =
if on xh yh
then (Just xh, Just yh) : fullJoin xt yt
else ???
所写条件由 a -> b -> Bool
提供。在示例中,它被设置为 (\n i -> length n == i)
,这意味着来自 names
的记录与 中的数字具有相同的长度数
。
例子:
names = ["Foo","Bar", "John" , "Emily", "Connor"]
nums = [1,2,3,4,5]
fullJoin names nums (\n i -> length n == i)
== [ (Just "Foo", Just 3)
, (Just "Bar", Just 3)
, (Just "John", Just 4)
, (Just "Emily", Just 5)
, (Just "Connor", Nothing)
, (Nothing, Just 1)
, (Nothing, Just 2)
]
为了阐明上述函数在 SQL 中的确切等价物,下面是它在 PostgreSQL 中的写法:
create table names as
select * from (values
('Foo'),
('Bar'),
('John'),
('Emily'),
('Connor')
)
as z(name);
create table nums as
select * from (values
(1),
(2),
(3),
(4),
(5)
)
as z(num);
select * from names
full join nums
on char_length(name) = num
运行它会产生:
name num
(null) 1
(null) 2
Foo 3
Bar 3
John 4
Emily 5
Connor (null)
fiddle 链接:sqlfiddle
最佳答案
表 X
和 Y
在条件 rel(x, y)
之间的完全外部连接可以被认为是联合三个(不相交的)部分:
- 对
(x, y)
的集合,其中rel(x,y)
(x0, null)
对的集合,其中对于 Y 中的所有y
,不是 (rel(x0, y))
(null, y0)
对的集合,其中对于 X 中的所有x
,不是 (rel(x, y0))
我们可以用类似的方式构造我们的 Haskell 程序:
fullJoin :: [a] -> [b] -> (a -> b -> Bool) -> [(Maybe a, Maybe b)]
fullJoin xs ys rel = concat $
[[(Just x, Just y) | y <- ys, x `rel` y] | x <- xs] -- for each x, find all the related ys
<> [[(Just x, Nothing)] | x <- xs, all (\y -> not (x `rel` y)) ys] -- find all xs that have no related ys
<> [[(Nothing, Just y)] | y <- ys, all (\x -> not (x `rel` y)) xs] -- find all ys that have no related xs
问题提出的方式,你不能比 O(n^2)
更快。也就是说,我的解决方案虽然是 O(n^2)
,但它不是最优的:它遍历 xs
两次。你可以想想你必须做什么才能只遍历 xs
一次。它与找到一种方法来跟踪您已经看到的 xs
有关。
关于sql - 列表的内存中完全外部连接,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/60568168/