performance - Haskell:生成组合的技术比较

标签 performance haskell lazy-evaluation

我在做的几个99 Haskell Problems早些时候,我认为练习 27(“编写一个函数来枚举可能的组合”)很有趣,因为它是一个简单的概念,并且适用于多种实现。

我对相对效率很好奇,所以我决定运行几个不同的实现 - 结果在下表中。 (引用:在 VirtualBox 上运行的 LXDE (Ubuntu 14.04) 中的 Emacs bash ansi-term;Thinkpad X220;8gb RAM,i5 64bit 2.4ghz。)

TL;博士:

(i) 为什么组合生成技术 #7 和 #8(来自下表;代码包含在帖子底部)比其他技术快得多?
(ii) 另外,Bytes 中的数字是什么?列实际上代表什么?

(i) 这很奇怪,因为函数 #7 通过过滤 powerset(比组合列表大 waaaay)来工作;我怀疑这是工作中的懒惰,即这是最有效地利用我们只要求列表长度而不是列表本身这一事实的函数。 (此外,它的“内存使用率”低于其他函数,但话说回来,我不确定显示的是什么与内存相关的统计数据。)

关于功能#8:感谢Bergi 的快速实现,并感谢user5402 建议添加。仍然试图将我的领先优势围绕在这个速度差异上。

(ii) Bytes中的数字运行 :set +s 后由 GHCi 报告列命令;它们显然不代表最大内存使用量,因为我只有 ~25gb 的 RAM + 可用 HD 空间。)?

enter image description here

代码:

import Data.List

--algorithms to generate combinations
--time required to compute the following: length $ 13 "abcdefghijklmnopqrstuvwxyz"

--(90.14 secs, 33598933424 bytes)
combDC1 :: (Eq a) => Int -> [a] -> [[a]]
combDC1 n xs = filter (/= []) $ combHelper n n xs []

combHelper :: Int -> Int -> [a] -> [a] -> [[a]]
combHelper n _ []        chosen           = if length chosen == n
                                               then [chosen]
                                               else [[]]
combHelper n i remaining chosen
  | length chosen == n = [chosen]
  | n - length chosen > length remaining = [[]]                     
  | otherwise = combHelper n (i-1) (tail remaining) ((head remaining):chosen) ++
                combHelper n i (tail remaining) chosen

--(167.63 secs, 62756587760 bytes)
combSoln1 :: Int -> [a] -> [([a],[a])]
combSoln1 0 xs = [([],xs)]
combSoln1 n [] = []
combSoln1 n (x:xs) = ts ++ ds
  where
    ts = [ (x:ys,zs) | (ys,zs) <- combSoln1 (n-1) xs ]
    ds = [ (ys,x:zs) | (ys,zs) <- combSoln1 n     xs ]

--(71.40 secs,  30480652480 bytes)
combSoln2 :: Int -> [a] -> [[a]]
combSoln2 0 _  = [ [] ]
combSoln2 n xs = [ y:ys | y:xs' <- tails xs
                           , ys <- combSoln2 (n-1) xs']

--(83.75 secs, 46168207528 bytes)
combSoln3 :: Int -> [a] -> [[a]]
combSoln3 0 _  = return []
combSoln3 n xs = do
  y:xs' <- tails xs
  ys <- combSoln3 (n-1) xs'
  return (y:ys)

--(92.34 secs, 40541644232 bytes)
combSoln4 :: Int -> [a] -> [[a]]
combSoln4 0 _ = [[]]
combSoln4 n xs = [ xs !! i : x | i <- [0..(length xs)-1] 
                               , x <- combSoln4 (n-1) (drop (i+1) xs) ]

--(90.63 secs, 33058536696 bytes)
combSoln5 :: Int -> [a] -> [[a]]
combSoln5 _ [] = [[]]
combSoln5 0 _  = [[]]
combSoln5 k (x:xs) = x_start ++ others
    where x_start = [ x : rest | rest <- combSoln5 (k-1) xs ]
          others  = if k <= length xs then combSoln5 k xs else []


--(61.74 secs, 33053297832 bytes)                                                                         
combSoln6 :: Int -> [a] -> [[a]]
combSoln6 0 _ = [[]]
combSoln6 _ [] = []
combSoln6 n (x:xs) = (map (x:) (combSoln6 (n-1) xs)) ++ (combSoln6 n xs)


--(8.41 secs, 10785499208 bytes)
combSoln7 k ns = filter ((k==).length) (subsequences ns)


--(3.15 secs, 2889815872 bytes)
subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if n>l then [] else subsequencesBySize xs !! (l-n)
 where
   subsequencesBySize [] = [[[]]]
   subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                               in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])                 

最佳答案

您还应该测试在这个 SO 答案中找到的算法:

subsequences of length n from list performance

subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if n>l then [] else subsequencesBySize xs !! (l-n)
 where
   subsequencesBySize [] = [[[]]]
   subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                             in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])

在我的机器上,我从 ghci 获得以下时间和内存使用情况:
ghci> length $ combSoln7   13 "abcdefghijklmnopqrstuvwxyz"
10400600
(13.42 secs, 10783921008 bytes)

ghci> length $ subsequencesOfSize  13 "abcdefghijklmnopqrstuvwxyz"
10400600
(6.52 secs, 2889807480 bytes)

关于performance - Haskell:生成组合的技术比较,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26727673/

相关文章:

java - Lucene性能: Retrieve all document from Searcher

haskell - 在GHC中自动分配成本中心

c++ - 具有表达式模板的多维数组模板类

spring - 如何不允许从跨国方法之外延迟加载?

android - 如何清除缓存Android

javascript - 提高 Javascript 性能的技巧

javascript - 函数范围内与函数外声明函数的性能

mysql - 使用更新增加列或创建更新触发器 - mysql

haskell - 如何在 GHCI 中查找多个导入方法的类型签名

haskell - 使用函数式语言(例如 F#)的函数的参数名称