clojure - Efficient, side-effect-only analogue of Clojure's map function

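What if map and doseq had a baby? I'm trying to write a function or macro like Common Lisp's mapc, but in Clojure. It does essentially what map does, but only for side effects, so it doesn't need to build a sequence of results and wouldn't be lazy. I know that one can iterate over a single sequence using doseq, but map can iterate over multiple sequences, applying a function to the corresponding elements of all of the sequences in turn. I also know that one can wrap map in dorun.

(Note: this question has been extensively edited after many comments and a very thorough answer. The original question focused on macros, but those macro issues turned out to be peripheral.)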

This is fast (according to Criterium):

(defn domap2
  [f coll]
  (dotimes [i (count coll)]
    (f (nth coll i))))

But it only accepts one collection. This one accepts any number of collections:

(defn domap3
  [f & colls]
  (dotimes [i (apply min (map count colls))]
    (apply f (map #(nth % i) colls))))

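But it is very slow by comparison. I could also write a version like the first one with separate parameter cases [f c1 c2], [f c1 c2 c3], and so on, but in the end I would still need a case that handles an arbitrary number of collections, like the last example, which is simpler anyway. I have tried many other solutions as well.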

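Since the second example is very much like the first, except for the use of apply and map inside the loop, I suspect that getting rid of them would speed things up a lot. I tried to do this by writing domap2 as a macro, but the way the catch-all variable after & is handled kept tripping me up.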
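The per-arity idea mentioned above could look roughly like the following. This is only a sketch: `domap-arity` is a hypothetical name, the collections are assumed to be counted and indexed (such as vectors), and it has not been benchmarked against the versions in this post. The fixed arities avoid apply and map inside the loop, while the variadic case falls back to the domap3 approach.

```clojure
;; `domap-arity` is a hypothetical name, not taken from the original post.
(defn domap-arity
  "Calls f for side effects on successive elements of the collections.
  Assumes counted, indexed collections such as vectors. Returns nil."
  ([f c1]
   (dotimes [i (count c1)]
     (f (nth c1 i))))
  ([f c1 c2]
   (dotimes [i (min (count c1) (count c2))]
     (f (nth c1 i) (nth c2 i))))
  ([f c1 c2 & colls]
   ;; General case: falls back to apply and map inside the loop, like domap3.
   (let [all (list* c1 c2 colls)]
     (dotimes [i (apply min (map count all))]
       (apply f (map #(nth % i) all))))))
```

Calls with one or two collections dispatch to the specialized bodies; only calls with three or more collections pay for the generic path.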

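Other examples (out of 15 or 20 different versions), benchmark code, and timings on a MacBook Pro that is a few years old (the full source was linked from the original post):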

(defn domap1
  [f coll]
  (doseq [e coll] 
    (f e)))

(defn domap7
  [f coll]
  (dorun (map f coll)))

(defn domap18
  [f & colls]
  (dorun (apply map f colls)))

(defn domap15
  [f coll] 
  (when (seq coll)
    (f (first coll))
    (recur f (rest coll))))

(defn domap17
  [f & colls]
  (let [argvecs (apply (partial map vector) colls)] ; seq of ntuples of interleaved vals
    (doseq [args argvecs]
      (apply f args))))

I'm working on an application that uses core.matrix matrices and vectors, but feel free to substitute your own side-effecting function below.

(ns tst
  (:use criterium.core
        [clojure.core.matrix :as mx]))

(def howmany 1000)
(def a-coll (vec (range howmany)))
(def maskvec (zero-vector :vectorz howmany))

(defn unmaskit!
  [idx]
  (mx/mset! maskvec idx 1.0)) ; sets element idx of maskvec to 1.0

(defn runbench
  [domapfn label]
  (print (str "\n" label ":\n"))
  (bench (def _ (domapfn unmaskit! a-coll))))
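For readers without core.matrix on the classpath, a substitute side-effecting function might look like the sketch below. It assumes a plain Java double array in place of the vectorz vector; `mask-array` and `unmask-array!` are hypothetical names, and timings obtained this way would not be comparable with the numbers reported in this post.

```clojure
;; Assumption: a plain Java double array stands in for the core.matrix vector.
;; `mask-array` and `unmask-array!` are hypothetical names.
(def howmany 1000)
(def mask-array (double-array howmany)) ; all elements start at 0.0

(defn unmask-array!
  [idx]
  (aset ^doubles mask-array (int idx) 1.0)) ; sets element idx to 1.0

;; Any of the domap variants can be driven the same way, e.g. with doseq:
(doseq [idx (range howmany)]
  (unmask-array! idx))
```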

Mean execution times according to Criterium (in microseconds):

domap1: 12.317551 [doseq]
domap2: 19.065317 [dotimes]
domap3: 265.983779 [dotimes with apply, map]
domap7: 53.263230 [map with dorun]
domap18: 54.456801 [map with dorun, multiple collections]
domap15: 32.034993
domap17: 95.259984 [doseq, multiple collections interleaved using map]

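EDIT: dorun + map may be the best way to implement domap for multiple large lazy sequence arguments, but doseq is still king when it comes to a single lazy sequence. Performing the same operation as unmaskit! above, but running the index through (mod idx 1000) and iterating over (range 100000000), doseq was about twice as fast as dorun + map in my tests (i.e. (def domap25 (comp dorun map))).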
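To make the forms being compared concrete, here is a small sketch that runs the same side effect through doseq and through the domap25 definition quoted above. It also shows clojure.core/run!, which is available in Clojure 1.7 and later and handles a single collection only; run! is not part of the original benchmarks, so no timing claim is made for it. `hits` and `hit!` are hypothetical names.

```clojure
;; `hits` and `hit!` are hypothetical names used only for illustration.
(def hits (atom 0))
(defn hit! [_] (swap! hits inc))

(def domap25 (comp dorun map)) ; the definition quoted in the post

(doseq [x (range 5)] (hit! x)) ; doseq over a single sequence
(domap25 hit! (range 5))       ; dorun + map
(run! hit! (range 5))          ; clojure.core/run!, one collection only

@hits ; 15 after the three passes above
```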

Best answer

You don't need a macro, and I don't see why a macro would be helpful here.

user> (defn do-map [f & lists] (apply mapv f lists) nil)
#'user/do-map
user> (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil

Note that do-map here is eager (thanks to mapv) and is called only for its side effects; it always returns nil.
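The difference eagerness makes can be seen in a short sketch that counts calls with an atom. `calls`, `tick!`, and the other names below are hypothetical and exist only for this illustration.

```clojure
;; All names here are hypothetical, chosen for this illustration.
(def calls (atom 0))
(defn tick! [& _] (swap! calls inc))

(def pending (map tick! [1 2 3] [4 5 6])) ; lazy: nothing has run yet
(def calls-before-realizing @calls)      ; 0

(dorun pending)                          ; forces the side effects
(def calls-after-dorun @calls)           ; 3

(mapv tick! [1 2 3] [4 5 6])             ; eager: runs immediately
(def calls-after-mapv @calls)            ; 6
```

With plain map the side effects only happen once something realizes the sequence, which is why the question wraps it in dorun and the answer reaches for mapv.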

Macros can take variable argument lists, as this (useless!) macro version of do-map demonstrates:

user> (defmacro do-map-macro [f & lists] `(do (mapv ~f ~@lists) nil))
#'user/do-map-macro
user> (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
user> (macroexpand-1 '(do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40)))
(do (clojure.core/mapv (comp println +) (range 2 6) (range 8 11) (range 22 40)) nil)

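Addendum: addressing the efficiency / garbage-creation concern.
Note that I have truncated the output of the Criterium bench function below for brevity: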

(defn do-map-loop
  [f & lists]
  (loop [heads lists]
    (when (every? seq heads)
      (apply f (map first heads))
      (recur (map rest heads)))))


user> (crit/bench (with-out-str (do-map-loop (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
            Execution time mean : 11.367804 µs
...

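This looked promising because it doesn't create a data structure that we never use (unlike the mapv version above). But it turns out to be slower than the earlier versions (perhaps because of the two map calls?).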

user> (crit/bench (with-out-str (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
             Execution time mean : 7.427182 µs
...
user> (crit/bench (with-out-str (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
             Execution time mean : 8.355587 µs
...

Since the loop still wasn't faster, let's try a version specialized on arity, so that we don't need to call map twice on every iteration:

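(defn do-map-loop-3
  [f a b c]
  ;; Destructuring binds a, b, c to the first elements and as, bs, cs to the rest.
  (loop [[a & as] a
         [b & bs] b
         [c & cs] c]
    ;; Caveat: this tests the elements themselves, so the loop also stops early
    ;; at a nil or false element, not only when a collection is exhausted.
    (when (and a b c)
      (f a b c)
      (recur as bs cs))))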

Remarkably, although this is faster, it is still slower than the version that just uses mapv:

user> (crit/bench (with-out-str (do-map-loop-3 (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
             Execution time mean : 9.450108 µs
...
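One caveat about do-map-loop-3 as written: it tests the destructured elements for truthiness, so it would stop early at a nil or false element. A variant that tests for exhaustion instead might look like the sketch below. `do-map-loop-3-seq` is a hypothetical name, and this variant was not part of the timings in this answer.

```clojure
;; `do-map-loop-3-seq` is a hypothetical name for this untimed variant.
(defn do-map-loop-3-seq
  [f a b c]
  (loop [as (seq a)
         bs (seq b)
         cs (seq c)]
    ;; seq returns nil for an empty collection, so this tests for exhaustion
    ;; rather than for the truthiness of the elements themselves.
    (when (and as bs cs)
      (f (first as) (first bs) (first cs))
      (recur (next as) (next bs) (next cs)))))
```

For the range inputs used in the benchmarks the two versions visit the same elements; they differ only when a collection contains nil or false.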

Next I wondered whether the size of the input was a factor. With larger inputs...

user> (def test-input (repeatedly 3 #(range (rand-int 100) (rand-int 1000))))
#'user/test-input
user> (map count test-input)
(475 531 511)
user> (crit/bench (with-out-str (apply do-map-loop-3 (comp println +) test-input)))
...
            Execution time mean : 1.005073 ms
...
user> (crit/bench (with-out-str (apply do-map (comp println +) test-input)))
...
             Execution time mean : 756.955238 µs
...

Finally, for completeness, the timing of do-map-loop (which, as expected, is slower than do-map-loop-3):

user> (crit/bench (with-out-str (apply do-map-loop (comp println +) test-input)))
...
             Execution time mean : 1.553932 ms

So we see that mapv is faster even with larger inputs.

(For completeness, I should note here that map is slightly faster than mapv, but not by much.)

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/21449549/
