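I have run into a parser recursion problem that I cannot solve. Any suggestions about what is causing it would be greatly appreciated.
The following code works fine when the function rawData is defined with a fixed number of elements (as in the commented-out code below). But when it is defined with Parser.loop, as shown in the code, it never stops (until the stack overflows). The same loop construct works fine with all the other functions (for example files and directories).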

module Reader exposing (..)

import Parser exposing (..)


type TermCmd
    = CD Argument
    | LS


type Argument
    = Home
    | UpOne
    | DownOne String


type Content
    = Dir String (List Content)
    | File Int String String


type alias RawData =
    List ( List TermCmd, List Content )


rawData : Parser RawData
rawData =
    loop [] <| loopHelper dataChunk -- This never ends...



-- succeed (\a b c d -> [ a, b, c, d ]) -- but this works
--     |= dataChunk
--     |= dataChunk
--     |= dataChunk
--     |= dataChunk


dataChunk : Parser ( List TermCmd, List Content )
dataChunk =
    succeed (\cmds ctnt -> ( cmds, ctnt ))
        |= commands
        |= contents


directory : Parser Content
directory =
    succeed Dir
        |. symbol "dir"
        |. spaces
        |= (chompUntilEndOr "\n"
                |> getChompedString
           )
        |= succeed []
        |. spaces


file : Parser Content
file =
    succeed File
        |= int
        |. spaces
        |= (chompWhile (\c -> c /= '.' && c /= '\n')
                |> getChompedString
           )
        |= (chompUntilEndOr "\n"
                |> getChompedString
                |> Parser.map (String.dropLeft 1)
           )
        |. spaces


command : Parser TermCmd
command =
    succeed identity
        |. symbol "$"
        |. spaces
        |= oneOf
            [ succeed CD
                |. symbol "cd"
                |. spaces
                |= argument
            , succeed LS
                |. symbol "ls"
            ]
        |. spaces


argument : Parser Argument
argument =
    oneOf
        [ succeed UpOne |. symbol ".."
        , succeed Home |. symbol "/"
        , succeed DownOne |= (chompUntilEndOr "\n" |> getChompedString)
        , problem "Bad argument"
        ]
        |. spaces


contents : Parser (List Content)
contents =
    let
        contentHelper revContent =
            oneOf
                [ succeed (\ctnt -> Loop (ctnt :: revContent))
                    |= file
                , succeed (\ctnt -> Loop (ctnt :: revContent))
                    |= directory
                , succeed ()
                    |> map (\_ -> Done (List.reverse revContent))
                ]
    in
    loop [] contentHelper


commands : Parser (List TermCmd)
commands =
    loop [] <| loopHelper command


directories : Parser (List Content)
directories =
    loop [] <| loopHelper directory


files : Parser (List Content)
files =
    loop [] <| loopHelper file


loopHelper : Parser a -> List a -> Parser (Step (List a) (List a))
loopHelper parser revContent =
    oneOf
        [ succeed (\ctnt -> Loop (ctnt :: revContent))
            |= parser
        , succeed ()
            |> map (\_ -> Done (List.reverse revContent))
        ]


sampleInput =
    "$ cd /\n$ ls\ndir a\n14848514 b.txt\n8504156 c.dat\ndir d\n$ cd a\n$ ls\ndir e\n29116 f\n2557 g\n62596 h.lst\n$ cd e\n$ ls\n584 i\n$ cd ..\n$ cd ..\n$ cd d\n$ ls\n4060174 j\n8033020 d.log\n5626152 d.ext\n7214296 k"

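The rawData function goes into an infinite loop, but the same construct (loop [] <| loopHelper parser) works fine everywhere else.

Best answer

You can probably see the problem by running the four-step parser (the one that begins succeed (\a b c d -> [ a, b, c, d ])) on the empty string. If you do, you get the following result: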
Ok [([],[]),([],[]),([],[]),([],[])]
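Take a moment to think about what a five-step parser, a ten-step parser, or even a 100-step parser would give you. loop gives you a parser that can run any number of steps.
The Elm documentation for the loop function hints at your problem:

    "Parsers like succeed () and chompWhile Char.isAlpha can succeed without consuming any characters. So in some cases you may want to use getOffset to ensure that each step actually consumed characters. Otherwise you could end up in an infinite loop!"

Your parser hits an infinite loop because it is producing an infinitely long list of tuples, each with an empty list of commands. It consumes no characters while producing each of these tuples, so nothing ever makes the loop stop.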
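The same behaviour can be reproduced in isolation. The following is a minimal sketch, not code from the question: the module name LoopDemo and the names chunk, fourSteps, neverEnds and step are hypothetical. chunk stands in for dataChunk, since both can succeed without consuming any input.

```elm
module LoopDemo exposing (fourSteps, neverEnds)

import Parser exposing (..)


-- Hypothetical stand-in for dataChunk: zero or more 'x' characters.
-- Like dataChunk, it succeeds even when nothing matches, consuming nothing.
chunk : Parser String
chunk =
    chompWhile (\c -> c == 'x')
        |> getChompedString


-- A fixed number of steps always terminates. On the empty string each
-- step produces "", just as each dataChunk produced ( [], [] ).
fourSteps : Parser (List String)
fourSteps =
    succeed (\a b c d -> [ a, b, c, d ])
        |= chunk
        |= chunk
        |= chunk
        |= chunk


-- The same step under loop: the first branch of oneOf never fails,
-- so the Done branch is never tried and the list grows without bound.
neverEnds : Parser (List String)
neverEnds =
    loop [] step


step : List String -> Parser (Step (List String) (List String))
step revChunks =
    oneOf
        [ succeed (\c -> Loop (c :: revChunks))
            |= chunk
        , succeed ()
            |> map (\_ -> Done (List.reverse revChunks))
        ]
```

Running Parser.run fourSteps "" should give a list of four empty strings, while Parser.run neverEnds "" is not expected to return.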
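In your case an empty list of commands does not seem to be meaningful, so we need to make sure that an empty list of commands causes the parse to fail.
One way to do this is to write a variant of loopHelper that fails if the list is empty: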

checkNonEmpty : List a -> Parser ()
checkNonEmpty list =
    if List.isEmpty list then
        problem "List is empty"

    else
        succeed ()


loopHelperNonEmpty : Parser a -> List a -> Parser (Step (List a) (List a))
loopHelperNonEmpty parser revContent =
    oneOf
        [ succeed (\ctnt -> Loop (ctnt :: revContent))
            |= parser
        , checkNonEmpty revContent
            |> map (\_ -> Done (List.reverse revContent))
        ]

(I couldn't find a simple way to bring getOffset in here, so I did something different.)
You can then change the definition of commands to use loopHelperNonEmpty instead of loopHelper:

commands : Parser (List TermCmd)
commands =
    loop [] <| loopHelperNonEmpty command

I made this change to your code, and it produced the following output:

Ok
    [ ( [ CD Home, LS ]
      , [ Dir "a" [], File 14848514 "b" "txt", File 8504156 "c" "dat", Dir "d" [] ]
      )
    , ( [ CD (DownOne "a"), LS ]
      , [ Dir "e" [], File 29116 "f" "", File 2557 "g" "", File 62596 "h" "lst" ]
      )
    , ( [ CD (DownOne "e"), LS ]
      , [ File 584 "i" "" ]
      )
    , ( [ CD UpOne, CD UpOne, CD (DownOne "d"), LS ]
      , [ File 4060174 "j" "", File 8033020 "d" "log", File 5626152 "d" "ext", File 7214296 "k" "" ]
      )
    ]

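(I have formatted this for clarity. While working on your code I simply used Debug.toString to write the parser result to the browser window, but that comes out as one long line. I pasted it into VS Code, added a few line breaks, and ran elm-format over it to tidy it up.)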
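For comparison, here is one way the getOffset hint from the documentation might be applied. This is only a sketch and is not part of the original answer: the module name LoopProgress and the helper loopHelperProgress are hypothetical, and I have not verified it against the sample input. It records the offset before and after each item and ends the loop as soon as an item succeeds without consuming anything.

```elm
module LoopProgress exposing (loopHelperProgress)

import Parser exposing (..)


-- Hypothetical variant of loopHelper (an assumption, not the answer's code).
-- It compares the offset before and after the item parser and only
-- keeps looping when the item actually consumed input.
loopHelperProgress : Parser a -> List a -> Parser (Step (List a) (List a))
loopHelperProgress parser revItems =
    oneOf
        [ succeed
            (\before item after ->
                if after > before then
                    Loop (item :: revItems)

                else
                    -- No progress: discard the empty item and stop.
                    Done (List.reverse revItems)
            )
            |= getOffset
            |= parser
            |= getOffset
        , succeed ()
            |> map (\_ -> Done (List.reverse revItems))
        ]
```

Under this approach rawData would become loop [] <| loopHelperProgress dataChunk, with commands left as it was. One difference from loopHelperNonEmpty is worth noting: a chunk with no commands but some content would still be accepted here, whereas the variant above makes an empty list of commands a parse error.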
Regarding parsing - Elm parser loop does not terminate, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/75202570/