刚刚接触 Haskell 的排序算法。我已经实现了插入排序和合并排序
insert_sort :: (Ord a, Show a) => [a] -> [a]
insert_sort keys = foldr f [] keys
where f key [] = [key]
f key acc = insert key acc
insert y [] = [y]
insert y (x:xs)
| x < y = x : insert y xs
| otherwise = y : x : xs
merge_sort :: (Ord a, Show a) => [a] -> [a]
merge_sort (x:[]) = [x]
merge_sort keys = merge (merge_sort (take len keys)) (merge_sort (drop len keys))
where len = length keys `div` 2
merge :: [a] -> [a] -> [a]
merge (x:xs) [] = (x:xs)
merge [] (y:ys) = (y:ys)
merge (x:xs) (y:ys) = if x <= y
then x : merge (xs) (y:ys)
else y : merge (x:xs) ys
以下是我比较它们的效率的方式:
insert_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
它们都在短暂延迟后开始打印结果,但合并排序打印速度更快。正如我们所知,对于大型数据集,归并排序比插入排序快得多。我认为这将通过他们如何给出结果(比如长延迟与短延迟)而不是他们如何打印结果来显示。是因为我在插入排序中使用了 foldr
吗?幕后是什么?
编辑:谢谢大家。自从我开始学习 Haskell 以来,我就听说过惰性求值,但还是掌握了窍门。有人会用一个小数据集来说明更多吗,比如 [5,2,6,3,1,4]?由于第一个元素最后出现,如何在使用 foldr
完成排序之前输出结果?
最佳答案
幕后是懒惰的评估。排序列表的开始是在排序完成之前确定的,因此可以在工作完成之前输出。由于归并排序速度更快,因此归并排序列表的打印速度更快。
根据要求:排序 [5,2,6,3,1,4]
是如何进行的。为了简洁起见,我使用 insert_sort = foldr ins []
。
insert_sort [5,2,6,3,1,4]
= foldr ins [] [5,2,6,3,1,4]
= 5 `ins` foldr ins [] [2,6,3,1,4]
= 5 `ins` 2 `ins` [6,3,1,4] ...
= 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` 4 `ins` []
= 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` (4:[])
= 5 `ins` 2 `ins` 6 `ins` 3 `ins` (1:4:[])
= 5 `ins` 2 `ins` 6 `ins` (1 : (3 `ins` (4:[])))
= 5 `ins` 2 `ins` (1 : (6 `ins` (3 `ins` (4:[]))))
= 5 `ins` (1 : (2 `ins` (6 `ins` (3 `ins` (4:[])))))
= 1 : (5 `ins` (2 `ins` (6 `ins` (3 `ins` (4:[]))))) -- now 1 can be output
= 1 : (5 `ins` (2 `ins` (6 `ins` (3:4:[]))))
= 1 : (5 `ins` (2 `ins` (3 : (6 `ins` (4:[])))))
= 1 : (5 `ins` (2 : (3 : (6 `ins` (4:[])))))
= 1 : 2 : (5 `ins` (3 : (6 `ins` (4:[])))) -- now 2 can be output
= 1 : 2 : 3 : (5 `ins` (6 `ins` (4:[]))) -- now 3
= 1 : 2 : 3 : (5 `ins` (4:6:[]))
= 1 : 2 : 3 : 4 : (5 `ins` (6:[])) -- now 4
= 1 : 2 : 3 : 4 : 5 : 6 : [] -- done
和归并排序(缩写:merge = mg
, merge_sort = ms
):
merge_sort [5,2,6,3,1,4]
= mg (ms [5,2,6]) (ms [3,1,4])
= mg (mg (ms [5]) (ms [2,6])) (mg (ms [3]) (ms [1,4]))
= mg (mg [5] (mg [2] [6])) (mg [3] (mg [1] [4]))
= mg (mg [5] [2,6]) (mg [3] [1,4])
= mg (2 : mg [5] [6]) (1 : mg [3] [4])
= 1 : mg (2 : mg [5] [6]) (mg [3] [4]) -- now 1 can be output
= 1 : mg (2 : mg [5] [6]) [3,4]
= 1 : 2 : mg (mg [5] [6]) [3,4] -- now 2 can be output
= 1 : 2 : mg [5,6] [3,4]
= 1 : 2 : 3 : mg [5,6] [4] -- now 3
= 1 : 2 : 3 : 4 : mg [5,6] [] -- now 4
= 1 : 2 : 3 : 4 : 5 : 6 : [] -- now 5 and 6
诚然,我走了一些捷径,但 Haskell 并不是唯一懒惰的。
关于algorithm - 合并排序比插入排序更快的方式让我困惑,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/8919785/