haskell - 《Learn You a Haskell for Great Good》一书中的懒惰示例

标签 haskell

我正在阅读 the introduction of Learn You a Haskell for Great Good我无法理解以下对“懒惰”一词的解释。

Say you have an immutable list of numbers xs = [1,2,3,4,5,6,7,8] and a function doubleMe which multiplies every element by 2 and then returns a new list. If we wanted to multiply our list by 8 in an imperative language and did doubleMe(doubleMe(doubleMe(xs))), it would probably pass through the list once and make a copy and then return it. Then it would pass through the list another two times and return the result. In a lazy language, calling doubleMe on a list without forcing it to show you the result ends up in the program sort of telling you "Yeah yeah, I'll do it later!". But once you want to see the result, the first doubleMe tells the second one it wants the result, now! The second one says that to the third one and the third one reluctantly gives back a doubled 1, which is a 2. The second one receives that and gives back 4 to the first one. The first one sees that and tells you the first element is 8. So it only does one pass through the list and only when you really need it.

doubleMe 函数背后的魔力是什么?命令式和函数式有什么区别?

最佳答案

好吧,让我们来说明一下。首先,这里是doubleMe的简单定义:

doubleMe [] = []
doubleMe (x:xs) = x*2 : doubleMe xs

现在,让我们考虑一下严格/急切的语言(不是 Haskell)如何评估 doubleMe (doubleMe (doubleMe [1, 2, 3]))。这种语言的规则是:评估函数调用,完整地评估所有参数,然后将它们传递给函数。所以我们得到了这个:

doubleMe (doubleMe (doubleMe [1, 2, 3]))

  -- Expand the list literal into its structure
  == doubleMe (doubleMe (doubleMe (1 : 2 : 3 : [])))

  -- Eager evaluation requires that we start at the innermost use of doubleMe
  -- and work there until we produce the whole list.
  == doubleMe (doubleMe (2 : doubleMe (2 : 3 : [])))
  == doubleMe (doubleMe (2 : 4 : doubleMe (3 : [])))
  == doubleMe (doubleMe (2 : 4 : 6 : doubleMe []))
  == doubleMe (doubleMe (2 : 4 : 6 : []))

  -- Only now we can move on to the middle use of doubleMe:
  == doubleMe (4 : doubleMe (4 : 6 : []))
  == doubleMe (4 : 8 : doubleMe (6 : []))
  == doubleMe (4 : 8 : 12 : doubleMe [])
  == doubleMe (4 : 8 : 12 : [])
  == 8 : doubleMe (8 : 12 : [])
  == 8 : 16 : doubleMe (12 : [])
  == 8 : 16 : 24 : doubleMe []
  == 8 : 16 : 24 : []
  == [8, 16, 24]

在 Haskell 中,规则更像这样(但不完全是这样):

  1. 要评估函数应用程序,请先应用函数,然后再评估其参数。
  2. 如果外部函数有多种情况(就像我们的 doubleMe 定义那样),那么我们只评估它的参数来确定使用哪种情况。

所以我们得到类似的东西:

doubleMe (doubleMe (doubleMe [1, 2, 3]))
  -- Here we only "pull out" the 1 from the list, because it's all we need to
  -- pick which case we want for doubleMe.
  == doubleMe (doubleMe (doubleMe (1 : [2, 3])))
  == doubleMe (doubleMe (1*2 : doubleMe [2, 3]))

  -- Now instead of continuing with the inner doubleMe, we move on immediately
  -- to the middle one:
  == doubleMe ((1*2)*2 : doubleMe (doubleMe [2, 3]))

  -- And now, since we know which case to use for the outer doubleMe, we expand 
  -- that one:
  == (1*2)*2)*2 : doubleMe (doubleMe (doubleMe [2, 3]))

在 Haskell 中,除非有另一个调用者需要列表头部或尾部的值,否则评估将停止。 (请注意,我什至没有进行乘法运算。)例如,head 是返回列表第一个元素的函数:

head (x:xs) = x

假设我们正在评估 head (doubleMe (doubleMe (doubleMe [1, 2, 3])))。事情是这样的:

head (doubleMe (doubleMe (doubleMe [1, 2, 3])))
  -- repeat the steps from above for the doubleMe part
  head (((1*2)*2)*2 : doubleMe (doubleMe (doubleMe [2, 3]))

  -- By the definition of head:
  ((1*2)*2)*2

所以 doubleMe (doubleMe (doubleMe [2, 3])) 部分在这种情况下被丢弃,因为 head 不需要它来产生结果。如果 Haskell 不是懒惰的,它会计算整个 [8, 12, 24] 列表,然后将 8 放在前面。

GHC 比这更聪明。我们可以使用map函数来写doubleMe:

doubleMe = map (*2)

GHC,当您使用 -O 选项优化编译程序时,已将这条规则融入其中:

map f (map g xs) = map (f . g) xs

这意味着如果它看到 map 的嵌套使用,它可以将它们减少到遍历列表的一次。使用它:

head (doubleMe (doubleMe (doubleMe [1, 2, 3])))
  == head (map (*2) (map (*2) (map (*2) [1, 2, 3])))
  == head (map ((*2) . (*2) . (*2)) [1, 2, 3])
  == head (map (\x -> ((x*2)*2)*2) [1, 2, 3])
  == head (map (\x -> ((x*2)*2)*2) (1 : [2, 3]))
  == head (((1*2)*2)*2 : map (\x -> ((x*2)*2)*2) [2, 3])
  == ((1*2)*2)*2

编辑:在这个问题的答案中,关于一次通过与三次通过的主题显然存在很多混淆。我会全力以赴。

简单的惰性求值(如我的第二个示例求值所示)不会改变通过次数。如果我们评估类似 print (doubleMe (doubleMe (doubleMe [1, 2, 3]))) 的东西,惰性和急切的评估将做相同数量的工作,但顺序不同。让我们编写嵌套表达式的值并排列列表元素,如下所示:

                      doubleMe [1, 2, 3] = [2,  4,  6]
           doubleMe (doubleMe [1, 2, 3]) = [4,  8, 12]
doubleMe (doubleMe (doubleMe [1, 2, 3])) = [8, 16, 24]

现在,如果我们执行类似 print (doubleMe (doubleMe (doubleMe [1, 2, 3]))) 的操作:

  1. 急切的评估将逐行进行攻击。首先它将计算整个列表 [2, 4, 6],然后是列表 [4, 8, 12],然后是列表 [8, 16 , 24],最后它会打印最后一个列表。
  2. 惰性求值逐列攻击。首先它将计算第一个列表中的 2,然后计算第二个列表中的 4,然后是第一个列表中的 8,然后打印8;然后计算第一个列表中的 4,第二个列表中的 8,依此类推。

如果您要打印列表,因为这需要计算所有元素,所以在任何一种情况下,它(或多或少)的工作量和内存量都是相同的。然而,在 head (doubleMe (doubleMe (doubleMe [1, 2, 3]))) 示例中,惰性求值的工作量较少。

最后一个“融合”示例(使用 map f .map g == map (f .g) 的示例)的工作量比其他两个少,因为它确实是一次通过。但这不是因为懒惰,而是因为纯度允许更积极的优化。

关于haskell - 《Learn You a Haskell for Great Good》一书中的懒惰示例,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16126104/

相关文章:

haskell - 在列表中使用 where

haskell - 为什么 `nix flake show` 构建 ghc?

haskell - 如何使用 Haskell 库 Linear 缩放向量?

haskell - F# 类型声明可能是 Haskell 吗?

Haskell:为什么我不能在另一个函数中使用 id ,其中 id 的域显然是函数中所需类型的超集?

haskell - 部分应用类型与种类 (* -> 约束)

haskell - 类型导向的测试生成

haskell - 在所有可能的 "Subwords"中拆分单词 - 所有可能的组合 W/O 导入

haskell - 对于 Monad 理解来说,Applicative 还不够强大吗?为什么不?

haskell - 如何在 Haskell 中搜索库版本?