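list - Reading a file into a list of pairs in elisp

I am trying to write an elisp function that reads each word in a file into a pair. I want the first item of the pair to be the word with its characters sorted lexicographically, and the second item to be the word unchanged.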

Given the example file:

cat
cow
dog

I would like the list to look like:

(act cat)
(cow cow)
(dgo dog)

The best I have come up with is:

(defun get-file (filename)
  (with-open-file (stream filename)
    (loop for word = (read-line stream nil)
          while word
          collect ((sort word #'char-lessp) word))))

It compiles correctly in Emacs Lisp interaction mode. However, when I try to run it by evaluating

(get-file "~/test.txt")

I end up in the Emacs debugger, but it doesn't tell me anything useful...

Debugger entered--Lisp error: (void-function get-file)
  (get-file "~/test.txt")
  eval((get-file "~/test.txt") nil)
  eval-last-sexp-1(t)
  eval-last-sexp(t)
  eval-print-last-sexp(nil)
  call-interactively(eval-print-last-sexp nil nil)
  command-execute(eval-print-last-sexp)

I am a Lisp beginner and have no idea what went wrong.

Thanks,

Justin

Best answer

Vanilla Emacs

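First, let's use only Emacs's built-in functions. There is no built-in function for sorting a string in Emacs, so you first convert the string to a list, sort it, and then convert the sorted list back into a string. This is how you convert a string to a list: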

(append "cat" nil) ; => (99 97 116)

A string converted to a list becomes a list of characters, and characters are represented as numbers in Elisp. Then you sort the list and convert it back to a string:

(concat (sort (append "cat" nil) '<)) ; => "act"

There is no built-in function that loads a file's contents directly into a variable, but you can load them into a temporary buffer. Then you can return the entire temporary buffer as a string:

(with-temp-buffer
  (insert-file-contents-literally "file.txt")
  (buffer-substring-no-properties (point-min) (point-max)))

This returns the string "cat\ncow\ndog\n", so you need to split it:

(split-string "cat\ncow\ndog\n") ; => ("cat" "cow" "dog")

Now you need to traverse this list and convert each item into a pair of the sorted item and the original item:

(mapcar (lambda (animal)
          (list (concat (sort (append animal nil) '<)) animal))
        '("cat" "cow" "dog"))
;; returns
;; (("act" "cat")
;;  ("cow" "cow")
;;  ("dgo" "dog"))

Full code:

(mapcar
 (lambda (animal)
   (list (concat (sort (append animal nil) '<)) animal))
 (split-string
  (with-temp-buffer
    (insert-file-contents-literally "file.txt")
    (buffer-substring-no-properties (point-min) (point-max)))))
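
The steps above can also be wrapped in a function, which is what the question was after. This is only a minimal sketch: the names my-sort-chars and my-read-word-pairs are made up for illustration and do not come from Emacs or from the original answer.

```elisp
;; Hypothetical helper names, chosen for this sketch only.
(defun my-sort-chars (word)
  "Return a new string with the characters of WORD in ascending order."
  ;; `append' builds a fresh list of characters, so the destructive
  ;; `sort' never touches WORD itself.
  (concat (sort (append word nil) #'<)))

(defun my-read-word-pairs (filename)
  "Return a list of (SORTED-WORD WORD) pairs, one per word in FILENAME."
  (mapcar (lambda (word)
            (list (my-sort-chars word) word))
          (split-string
           (with-temp-buffer
             (insert-file-contents-literally filename)
             (buffer-substring-no-properties (point-min) (point-max))))))
```

Evaluating both definitions and then (my-read-word-pairs "~/test.txt") on the sample file from the question should give the three pairs shown earlier.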

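Common Lisp emulation

One of Emacs's built-in packages is cl.el (in current Emacs the cl- prefixed functions live in cl-lib), and there is no reason not to use it in your code. So I lied when I said that there is no built-in function for sorting strings and that the above is the only way to do the task with built-in functions. So let's use cl.el.

cl-sort sorts a string (or any sequence):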

(cl-sort "cat" '<) ; => "act"

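cl-mapcar is more general than Emacs's built-in mapcar, but here you can use either of them.

There is one catch with cl-sort: it is destructive, meaning it modifies its argument in place. We use the local variable animal twice in the anonymous function, and we don't want to clobber the original animal. So we should pass it a copy of the sequence: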

(lambda (animal)
  (list (cl-sort (copy-sequence animal) '<) animal))

The resulting code becomes:

(cl-mapcar
 (lambda (animal)
   (list (cl-sort (copy-sequence animal) '<) animal))
 (split-string
  (with-temp-buffer
    (insert-file-contents-literally "file.txt")
    (buffer-substring-no-properties (point-min) (point-max)))))

seq.el

Emacs 25 added a new sequence manipulation library, seq.el. The alternative to mapcar is seq-map, and the alternative to CL's cl-sort is seq-sort. The full code becomes:

(seq-map
 (lambda (animal)
   (list (seq-sort '< animal) animal))
 (split-string
  (with-temp-buffer
    (insert-file-contents-literally "file.txt")
    (buffer-substring-no-properties (point-min) (point-max)))))
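
Two details of seq-sort are easy to miss: as far as I can tell from seq.el, it takes the predicate as its first argument and returns a new sequence instead of modifying its input, so no copy-sequence is needed. A small sketch, where my-sorted-pair is a hypothetical name used only for illustration:

```elisp
(require 'seq)

;; Hypothetical helper: builds one (SORTED-WORD WORD) pair.
(defun my-sorted-pair (word)
  "Return a list of WORD with its characters sorted, and WORD itself."
  ;; Predicate first, sequence second; WORD is left untouched.
  (list (seq-sort #'< word) word))

(my-sorted-pair "dog") ; => ("dgo" "dog")
```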

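dash, s, f

Usually the best solution for working with sequences and files is to reach straight for these 3 third-party libraries:

dash for list manipulation
s for string manipulation
f for file manipulation.

Their GitHub pages explain how to install them (installation is very simple). For this particular problem, however, they are somewhat suboptimal. For example, -sort from dash only sorts lists, so we have to go back to our string->list->string conversion: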

(concat (-sort '< (append "cat" nil))) ; => "act"

s-lines from s leaves empty strings in its result. On GNU/Linux, text files usually end with a newline, so splitting a file looks like:

(s-lines "cat\ncow\ndog\n") ; => ("cat" "cow" "dog" "")

s-split supports an optional argument to omit empty strings, but its separator argument is a regex (note that for portability you need both \n and \r):

(s-split "[\n\r]" "cat\ncow\ndog\n" t) ; => ("cat" "cow" "dog")
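
For comparison, the built-in split-string accepts the same kind of regex separator and the same omit-nulls flag, so line splitting does not strictly require s. A short sketch with made-up sample strings:

```elisp
;; Split on runs of newline / carriage-return characters and drop
;; empty strings, using only built-in functions.
(split-string "cat\ncow\ndog\n" "[\n\r]+" t)     ; => ("cat" "cow" "dog")

;; The same call also copes with Windows-style line endings.
(split-string "cat\r\ncow\r\ndog\r\n" "[\n\r]+" t) ; => ("cat" "cow" "dog")
```

Unlike the default separators of split-string, this variant keeps a line such as "black cat" together, because it splits only on line breaks and not on spaces.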

But there are two functions that can simplify our code. -map is similar to mapcar:

(-map
  (lambda (animal)
    (list (cl-sort (copy-sequence animal) '<) animal))
  '("cat" "cow" "dog"))
;; return
;; (("act" "cat")
;;  ("cow" "cow")
;;  ("dgo" "dog"))

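But dash also has anaphoric versions of the functions that take a function as an argument, such as -map. The anaphoric versions allow a shorter syntax by exposing the local variable as it, and their names start with 2 dashes. For example, the following are equivalent:

(-map (lambda (x) (+ x 1)) '(1 2 3)) ; => (2 3 4)
(--map (+ it 1) '(1 2 3)) ; => (2 3 4)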

Another improvement is f-read-text from f, which simply returns the contents of a file as a string:

(f-read-text "file.txt") ; => "cat\ncow\ndog\n"

Combining the best of all worlds:

(--map (list (cl-sort (copy-sequence it) '<) it)
       (split-string (f-read-text "file.txt")))

Regarding list - Reading a file into a list of pairs in elisp, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33158468/