我正在尝试使用 parsec 编写以下解析器:
manyLength
:: forall s u m a.
Monad m
=> ParsecT s u m a -> ParsecT s u m Int
manyLength p = go 0
where
go :: Int -> ParsecT s u m Int
go !i = (p *> go (i + 1)) <|> pure i
这就像 many
函数,但不是返回 [a]
,而是
返回 Parser a
成功的次数。
这可行,但我似乎无法让它在恒定的堆空间中运行。这使得
从某种意义上说,因为对 go
的递归调用不在尾调用位置。
如果parsec将构造函数导出到ParsecT
,有可能
以 CPS 形式重写manyLength
。这与manyAccum
非常相似。
功能:
manyLengthCPS :: forall s u m a. ParsecT s u m a -> ParsecT s u m Int
manyLengthCPS p = ParsecT f
where
f
:: forall b.
State s u
-> (Int -> State s u -> ParseError -> m b) -- consumed ok
-> (ParseError -> m b) -- consumed err
-> (Int -> State s u -> ParseError -> m b) -- empty ok
-> (ParseError -> m b) -- empty err
-> m b
f s cok cerr eok _ =
let walk :: Int -> a -> State s u -> ParseError -> m b
walk !i _ s' _ =
unParser p s'
(walk $ i + 1) -- consumed-ok
cerr -- consumed-err
manyLengthCPSErr -- empty-ok
(\e -> cok (i + 1) s' e) -- empty-err
in unParser p s (walk 0) cerr manyLengthCPSErr (\e -> eok 0 s e)
{-# INLINE f #-}
manyLengthCPSErr :: Monad m => m a
manyLengthCPSErr =
fail "manyLengthCPS can't be used on parser that accepts empty input"
这个manyLengthCPS
函数确实在常量堆空间中运行。
为了完整性,这里是 ParsecT
构造函数:
newtype ParsecT s u m a = ParsecT
{ unParser
:: forall b .
State s u
-> (a -> State s u -> ParseError -> m b) -- consumed ok
-> (ParseError -> m b) -- consumed err
-> (a -> State s u -> ParseError -> m b) -- empty ok
-> (ParseError -> m b) -- empty err
-> m b
}
我还尝试使用以下方法将 manyLengthCPS
直接转换为非 CPS 函数
低级mkPT
功能:
manyLengthLowLevel
:: forall s u m a.
Monad m
=> ParsecT s u m a -> ParsecT s u m Int
manyLengthLowLevel p = mkPT f
where
f :: State s u -> m (Consumed (m (Reply s u Int)))
f parseState = do
consumed <- runParsecT p parseState
case consumed of
Empty mReply -> do
reply <- mReply
case reply of
Ok _ _ _ -> manyLengthErr
Error parseErr -> pure . Empty . pure $ Ok 0 parseState parseErr
Consumed mReply -> do
reply <- mReply
case reply of
Ok a newState parseErr -> walk 0 a newState parseErr
Error parseErr -> pure . Consumed . pure $ Error parseErr
where
walk
:: Int
-> a
-> State s u
-> ParseError
-> m (Consumed (m (Reply s u Int)))
walk !i _ parseState' _ = do
consumed <- runParsecT p parseState'
case consumed of
Empty mReply -> do
reply <- mReply
case reply of
Ok _ _ _ -> manyLengthErr
Error parseErr ->
pure . Consumed . pure $ Ok (i + 1) parseState' parseErr
Consumed mReply -> do
reply <- mReply
case reply of
Ok a newState parseErr -> walk (i + 1) a newState parseErr
Error parseErr -> pure . Consumed . pure $ Error parseErr
manyLengthErr :: Monad m => m a
manyLengthErr =
fail "manyLengthLowLevel can't be used on parser that accepts empty input"
就像 manyLength
一样,manyLengthLowLevel
不在常量堆空间中运行。
是否可以编写manyLength
,以便它甚至可以在恒定的堆空间中运行
不以 CPS 风格编写?如果没有,为什么不呢?有没有一些基本的
为什么在 CPS 风格中可以,但在非 CPS 风格中不行?
最佳答案
它在恒定的堆空间中运行。思路是首先尝试p
,并对其成功的结果进行显式案例分析,以决定是否运行go
,从而使go
以尾部调用位置结束。
manyLength
:: Monad m
=> ParsecT s u m a -> ParsecT s u m Int
manyLength p = go 0
where
go :: Int -> ParsecT s u m Int
go !i = do
success <- (p *> pure True) <|> pure False
if success then go (i+1) else pure i
关于parsing - Haskell 的 parsec 中 CPS 与非 CPS 解析器的堆使用情况,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43092461/