haskell - 图形简化/棘手的空间泄漏示例

标签 haskell memory memory-leaks graph-reduction

我想知道我在本页上读到的空间泄漏示例(遗憾的是,那里没有解释): https://en.wikibooks.org/wiki/Haskell/Graph_reduction

Tricky space leak example:

(\xs -> head xs + last xs) [1..n]

(\xs -> last xs + head xs) [1..n]

The first version runs on O(1) space. The second in O(n).

不知道我理解是否正确(希望对你有帮助)。据我所知,惰性求值是指最左边最外面的求值。因此,在这些示例中,我们将减少这些 redexe,如下所示:

(\xs -> head xs + last xs) [1..200]

=> ([1..200] -> head xs + last xs)

=> head [1..200] + last [1..200]

=> 1 + last [1..200]

=> 1 + last [2..200]

=> ... ...

=> 1 + last [199..200]

=> 1 + last [200]

=> 1 + 200

=> 201

还有

(\xs -> last xs + head xs) [1..200]

([1..200] -> last xs + head xs)

last [1..200] + head [1..200]

last [2..200] + head [1..200]

... ...

last [199..200] + head [1..200]

last [200] + head [1..200]

200 + head [1..200]

200 + 1

201

也许我以错误的方式减少了它(如果我错了,请纠正我),但在这里我看不到可能的空间泄漏。因此,我首先使用 ghci 测试了运行时(而不是空间):

1>> (\xs -> head xs + last xs) [1..50000000]
50000001
(10.05 secs, 4,003,086,632 bytes)
2>> (\xs -> last xs + head xs) [1..50000000]
50000001
(2.26 secs, 3,927,748,176 bytes)

根据维基教科书,第二个版本中应该存在空间泄漏,但运行时速度要快得多(这可能是可能的,这里没什么奇怪的)。

我有以下源代码:

module Main where

main = do
--  let a = (\xs -> head xs + last xs) [1..20000000]    -- space leak
let b = (\xs -> last xs + head xs) [1..20000000]    -- no space leak
--  putStrLn ("result for a - (\\xs -> head xs + last xs): " ++ show a)
putStrLn ("result for b - (\\xs -> last xs + head xs): " ++ show b)

...我编译了它,没有优化,然后我调用了该程序:

$ ghc -O0 --make -rtsopts -fforce-recomp Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...

$ ./Main +RTS -sstderr
result for b - (\xs -> last xs + head xs): 20000001
   1,600,057,352 bytes allocated in the heap
         211,000 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3062 colls,     0 par    0.012s   0.012s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0002s    0.0002s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.518s  (  0.519s elapsed)
  GC      time    0.012s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.534s  (  0.532s elapsed)

  %GC     time       2.3%  (2.3% elapsed)

  Alloc rate    3,088,101,743 bytes per MUT second

  Productivity  97.6% of total user, 98.0% of total elapsed

这是一个很好的结果,我们有 2.3% 的垃圾回收,并且使用的内存约为 1 MB。然后我编译了另一个没有优化的案例,得到了以下结果:

module Main where

main = do
  let a = (\xs -> head xs + last xs) [1..20000000]    -- space leak
--  let b = (\xs -> last xs + head xs) [1..20000000]    -- no space leak
  putStrLn ("result for a - (\\xs -> head xs + last xs): " ++ show a)
--  putStrLn ("result for b - (\\xs -> last xs + head xs): " ++ show b)
<小时/>
$ ghc -O0 --make -rtsopts -fforce-recomp Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...

$ ./Main +RTS -sstderr
result for a - (\xs -> head xs + last xs): 20000001
   1,600,057,352 bytes allocated in the heap
   2,088,615,552 bytes copied during GC
     540,017,504 bytes maximum residency (13 sample(s))
     135,620,768 bytes maximum slop
            1225 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3051 colls,     0 par    0.911s   0.915s     0.0003s    0.0016s
  Gen  1        13 colls,     0 par    2.357s   2.375s     0.1827s    0.9545s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.434s  (  0.430s elapsed)
  GC      time    3.268s  (  3.290s elapsed)
  EXIT    time    0.094s  (  0.099s elapsed)
  Total   time    3.799s  (  3.820s elapsed)

  %GC     time      86.0%  (86.1% elapsed)

  Alloc rate    3,687,222,801 bytes per MUT second

  Productivity  14.0% of total user, 13.9% of total elapsed

这更糟糕,正在进行大量垃圾收集,并且内存使用率更高。

这里到底发生了什么?我不明白为什么会出现空间泄漏。

<小时/>

附注如果您使用完全优化 (O2) 来编译这两种情况,那么这两个程序的效率几乎相同。

最佳答案

xs是一样的。当你写出[1..200]时两次,你就迷惑了自己。最好在表达式求值过程中显式命名所有临时实体:

(\xs -> head xs + last xs) [1,2,3]
head xs + last xs        { xs = [1,2,3] }
(+) (head xs) (last xs)
(+) 1         (last xs)  { xs = 1:xs1 ; xs1 = [2,3] }
              (last xs1) { xs1 = 2:xs2 ; xs2 = [3] }
              (last xs2)
              3
4

在这里我们看到 xs 的绑定(bind)(和 xs1 )在我们继续前进时可以安全地被遗忘,因为没有更多对 xs 的引用。任何地方。

所以你确实正确地减少了它,除了第二个 [1..200]与第一个相同,因此我们必须在找到 last 时捕获它。首先(在第二个变体中),因此泄漏。

当然,编译器可以对此进行优化,由于引用透明原则,用 equals 代替 equals,并执行枚举 [1..200] 两次,因此第二个变体也在 O(1) 空间中运行。

所以说到底,这是一个编译器的事情。空间泄漏可能发生(不应该)。

关于haskell - 图形简化/棘手的空间泄漏示例,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/37724003/

相关文章:

excel - JAVA - 写入 Excel 文件时出现 Apache POI OutOfMemoryError

python - Scrapy 隐藏的内存泄漏

scala - 在 Mem (Chisel) 中初始化数据

haskell - 如何合并 Haskell 中的所有相交(Set a)?

haskell - 隐藏 State monad 的类型参数

c++ - 如何释放 C++ 中的结构和对象数组?

c - valgrind 错误 : invalid read

ios - 调整 UIImage 大小时内存泄漏

Haskell 计算器 - 运算顺序

haskell - 将两个 Int 值相除以获得 Float 的正确方法是什么?