haskell - Haskell 特里树中的太空泄漏

标签 haskell profiling trie

我用 Haskell 编写了一个 trie 实现,但遇到了一些性能问题。我根据 http://book.realworldhaskell.org/read/profiling-and-optimization.html 相应地分析了我的代码并可以解决这个问题。这是我的 trie 的 insertWith 方法:

import qualified Data.Map.Strict as M
-- i have also tried Data.Map.Lazy

data Trie k a = Trie { value  :: !(Maybe a)
                     , childs :: !(Map.Map k (Trie k a))
                     }

empty :: Trie a k
empty = Trie Nothing Map.empty

insertWith :: Ord k => (a -> a -> a) -> [k] -> a -> Trie k a -> Trie k a
insertWith f [] a t@(Trie Nothing  _) = t { value = Just a }
insertWith f [] a t@(Trie (Just b) _) = t { value = Just $ f a b }
insertWith f (k:ks) a t = t { childs = m }
    where
        recurse = insertWith f ks a
        m = Map.insertWith (\_ child -> recurse child) k (recurse empty) (childs t)

我的分析代码:

import qualified MyTrie as T
main = do
    let nums = zipWith (\a b -> (show a, b)) [0..100000] [0..]
        trie = foldl' (flip $ uncurry $ T.insertWith (+)) T.empty nums
    putStrLn $ show $ T.lookup "100" trie

分析结果:

 CAF:main4                Main                    113           0    0.0    0.0    55.2   38.3
  main                    Main                    126           0    2.6    0.0    55.2   38.3
   main.trie              Main                    127           0   12.6    2.7    52.6   38.3
    childs                Text.NGram.Trie         137     1000001    0.0    0.0     0.0    0.0
    insertWith            Text.NGram.Trie         135     1123337    3.2    6.2    40.1   35.6
     insertWith.recurse   Text.NGram.Trie         140      123336    0.4    0.0     0.4    0.0
     insertWith.m         Text.NGram.Trie         138     1123334   31.4   29.1    36.4   29.4
      insertWith.recurse  Text.NGram.Trie         142           0    0.0    0.0     0.0    0.0
      insertWith.m.\      Text.NGram.Trie         139      123333    1.5    0.0     5.0    0.3
       insertWith.recurse Text.NGram.Trie         141           0    3.5    0.3     3.5    0.3
        childs            Text.NGram.Trie         143      123333    0.0    0.0     0.0    0.0

   1,403,625,328 bytes allocated in the heap
   1,699,102,256 bytes copied during GC
     393,716,888 bytes maximum residency (11 sample(s))
       4,565,856 bytes maximum slop
             771 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2673 colls,     0 par    0.97s    0.98s     0.0004s    0.0027s
  Gen  1        11 colls,     0 par    1.41s    1.41s     0.1285s    0.6767s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.69s  (  0.70s elapsed)
  GC      time    2.38s  (  2.39s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.06s  (  0.06s elapsed)
  Total   time    3.14s  (  3.15s elapsed)

  %GC     time      75.8%  (75.9% elapsed)

  Alloc rate    2,021,450,981 bytes per MUT second

  Productivity  24.2% of total user, 24.2% of total elapsed

ghc版本:7.6.2

有人知道如何使 insertWith 方法执行得更好吗?

谢谢

最佳答案

insertWith f [] a t@(Trie Nothing  _) = t { value = Just a }
insertWith f [] a t@(Trie (Just b) _) = t { value = Just $ f a b }

af a b 未计算而存储。尝试一下

insertWith f [] a t@(Trie Nothing  _) = a `seq` t { value = Just a }
insertWith f [] a t@(Trie (Just b) _) = t { value = Just $! f a b }

关于haskell - Haskell 特里树中的太空泄漏,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16176372/

相关文章:

c++ - 打印出用 map 实现的 Trie 中的所有单词

java - 操作 trie 实现以获取数组中找到的项目

c - 阅读 dict 文件找到 "words"并添加到 trie 中

haskell - 访问列表中元素的两个邻居

linux - 如何测量与文件相关的系统调用在性能中花费了多长时间?

java - 如何在非常大的java项目中检查动态类的使用

python - 是否有 Python 的可视化分析器?

multithreading - Haskell - 线程时缓慢的套接字连接

haskell - 为什么 GHC 认为这个类型变量不是单射的?

exception - 我的并发 monad 是 MonadThrow 的有效实例吗?