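I know that regular expressions can be used to write checkers for matching pairs of opening and closing brackets. For example, a.[b.[c.d]].e yields the values a, [b.[c.d]] and e.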
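How would I write a regular expression that can match opening and closing brackets when both are the same symbol? For example, a.|b.|c.d||.e would yield the values a, |b.|c.d|| and e.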
Update
Thanks for all the comments. I need to give some context for this question. I basically want to mimic JavaScript syntax:
a.hello is a["hello"] or a.hello
a.|hello| is a[hello]
a.|b.c.|d.e||.f.|g| is a[b.c[d.e]].f[g]
So what I want to do is break the symbol up into:
[`a`, `|b.c.|d.e||`, `f`, `|g|`]
and then recur on the pieces that are pipe-quoted.
I have an implementation of the syntax without the pipes here:
https://github.com/zcaudate/purnam
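I'm really hoping not to use a parser, mainly because I don't know how to and I don't think the problem justifies the complexity. But if a regex can't cut it, I may have to.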
Best answer
Thanks to @m.buettner and @rafal, this is my code in Clojure. There is a normal mode and a pipe mode, following what m.buettner described.
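To see why a single flat pattern is not enough here, consider the following sketch. It assumes Clojure's regex literals, which compile to java.util.regex patterns (a flavor without recursive patterns), and the name flat-split is hypothetical, used only for this illustration. The pattern copes with one level of pipes but breaks a nested segment apart, which is what the level counter in pipe mode avoids.

```clojure
;; `flat-split` is a hypothetical name for this illustration only.
;; It matches either a pipe-quoted run with no inner pipes, or a plain segment.
(def flat-split #"\|[^|]*\||[^.|]+")

;; One level of pipes: splits as intended.
(re-seq flat-split "a.|b.c|.d")
;; => ("a" "|b.c|" "d")

;; Nested pipes: the first inner pipe is taken as the closing one.
(re-seq flat-split "a.|b.|c||.d")
;; => ("a" "|b.|" "c" "||" "d")
```

Because the opening and closing symbols are identical, only context (a pipe directly after a dot opens, any other pipe closes) tells them apart, and tracking that takes a counter rather than a flat pattern.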
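Helpers:

    ;; Append s to arr unless s is empty, so empty segments are never emitted.
    (defn conj-if-str [arr s]
      (if (empty? s)
        arr
        (conj arr s)))

    ;; Bind a value to var, then dispatch on it with `case`.
    (defmacro case-let [[var bound] & body]
      `(let [~var ~bound]
         (case ~var ~@body)))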
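As a quick illustration of how the two helpers behave, here is a small sketch. The helpers are restated so the snippet runs on its own, and classify is a hypothetical function that exists only for this example.

```clojure
;; Helpers restated from the answer so this snippet is self-contained.
(defn conj-if-str [arr s]
  (if (empty? s) arr (conj arr s)))

(defmacro case-let [[var bound] & body]
  `(let [~var ~bound]
     (case ~var ~@body)))

;; `classify` is a hypothetical name used only for this illustration.
;; It dispatches on the first character, much as both modes do.
(defn classify [s]
  (case-let [ch (first s)]
    nil :end
    \.  :dot
    \|  :pipe
    [:char ch]))

(conj-if-str ["a"] "")   ;; => ["a"]      (empty segment dropped)
(conj-if-str ["a"] "b")  ;; => ["a" "b"]
(classify "")            ;; => :end       (first of an empty string is nil)
(classify ".b")          ;; => :dot
(classify "xyz")         ;; => [:char \x] (default branch can use the binding)
```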
Pipe mode:

    (declare split-dotted) ;; normal mode, defined below

    (defn split-dotted-pipe ;; pipe mode
      ([output current ss] (split-dotted-pipe output current ss 0))
      ([output current ss level]
       (case-let [ch (first ss)]
         nil (throw (Exception. "Cannot have an unpaired pipe"))
         ;; A bare pipe closes a level; at level 0 it ends the quoted segment.
         \|  (case level
               0 (trampoline split-dotted
                             (conj output (str current "|"))
                             "" (next ss))
               (recur output (str current "|") (next ss) (dec level)))
         ;; A dot followed by a pipe opens a nested level.
         \.  (case-let [nch (second ss)]
               nil (throw (Exception. "Incomplete dotted symbol"))
               \|  (recur output (str current ".|") (nnext ss) (inc level))
               (recur output (str current "." nch) (nnext ss) level))
         (recur output (str current ch) (next ss) level))))
Normal mode:

    (defn split-dotted
      ([ss]
       (split-dotted [] "" ss))
      ([output current ss]
       (case-let [ch (first ss)]
         nil (conj-if-str output current)
         ;; A dot ends the current segment; ".|" switches to pipe mode.
         \.  (case-let [nch (second ss)]
               nil (throw (Exception. "Cannot have . at the end of a dotted symbol"))
               \|  (trampoline split-dotted-pipe
                               (conj-if-str output current) "|" (nnext ss))
               (recur (conj-if-str output current) (str nch) (nnext ss)))
         \|  (throw (Exception. "Cannot have | during split mode"))
         (recur output (str current ch) (next ss)))))
Tests:

    (fact "split-dotted"
      (js/split-dotted "a") => ["a"]
      (js/split-dotted "a.b") => ["a" "b"]
      (js/split-dotted "a.b.c") => ["a" "b" "c"]
      (js/split-dotted "a.||") => ["a" "||"]
      (js/split-dotted "a.|b|.c") => ["a" "|b|" "c"]
      (js/split-dotted "a.|b|.|c|") => ["a" "|b|" "|c|"]
      (js/split-dotted "a.|b.c|.|d|") => ["a" "|b.c|" "|d|"]
      (js/split-dotted "a.|b.|c||.|d|") => ["a" "|b.|c||" "|d|"]
      (js/split-dotted "a.|b.|c.d.|e|||.|d|") => ["a" "|b.|c.d.|e|||" "|d|"])

    (fact "split-dotted exceptions"
      (js/split-dotted "|a|") => (throws Exception)
      (js/split-dotted "a.") => (throws Exception)
      (js/split-dotted "a.|||") => (throws Exception)
      (js/split-dotted "a.|b.||") => (throws Exception))
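The question also asks to recur on the pipe-quoted pieces to reach the bracket form. A minimal sketch of that step follows. It is not the code from purnam: split-top-level, piped? and render are hypothetical names, the splitter is a compact stand-in that skips the error checks above, and it assumes the same rule as pipe mode (a pipe at the start of a segment or directly after a dot opens a level, any other pipe closes one).

```clojure
(require '[clojure.string :as str])

;; Hypothetical compact splitter: splits on dots at pipe depth 0.
;; Unlike split-dotted above, it does not validate its input.
(defn split-top-level [s]
  (loop [out [] cur "" depth 0 cs (seq s)]
    (if-let [ch (first cs)]
      (cond
        ;; A dot outside any pipes ends the current segment.
        (and (= ch \.) (zero? depth))
        (recur (conj out cur) "" 0 (next cs))

        ;; A pipe opens a level at segment start or after a dot, else closes one.
        (= ch \|)
        (if (or (empty? cur) (= \. (last cur)))
          (recur out (str cur ch) (inc depth) (next cs))
          (recur out (str cur ch) (dec depth) (next cs)))

        :else
        (recur out (str cur ch) depth (next cs)))
      (conj out cur))))

;; True when a segment is wrapped in a pair of pipes.
(defn piped? [seg]
  (and (>= (count seg) 2)
       (str/starts-with? seg "|")
       (str/ends-with? seg "|")))

;; Render to bracket form, recurring into each pipe-quoted segment.
(defn render [s]
  (let [[head & more] (split-top-level s)]
    (apply str head
           (map (fn [seg]
                  (if (piped? seg)
                    (str "[" (render (subs seg 1 (dec (count seg)))) "]")
                    (str "." seg)))
                more))))

(split-top-level "a.|b.c.|d.e||.f.|g|")
;; => ["a" "|b.c.|d.e||" "f" "|g|"]
(render "a.|b.c.|d.e||.f.|g|")
;; => "a[b.c[d.e]].f[g]"
```

This renders to a string only to make the recursion visible; a macro would build forms instead.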
This Q&A about a regex that can split strings with same-symbol nested brackets comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/16496533/