I am using Text.ParserCombinators.Parsec and Text.XHtml to parse input like this:
This is the first paragraph example\n with two lines\n \n And this is the second paragraph\n
And my output should be:
<p>This is the first paragraph example\n
with two lines\n</p>
<p>And this is the second paragraph\n</p>
I defined:
line = do
  t <- manyTill anyChar newline
  return t

paragraph = do
  t <- many1 line
  return (p << t)
But it returns:
<p>This is the first paragraph example\n
with two lines\n\n And this is the second paragraph\n</p>
What is wrong? Any ideas?
Thanks!
Best answer
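According to the documentation for manyTill, it runs its first argument zero or more times, so two newlines in a row are still valid input and your line parser does not fail on the blank line.
You are probably looking for something like many1Till (by analogy with many1 versus many), but it does not seem to exist in the Parsec library, so you may need to roll your own. (Warning: I don't have ghc on this machine, so this is completely untested.)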
many1Till p end = do
  first <- p
  rest <- p `manyTill` end
  return (first : rest)
Or, more concisely (liftM2 comes from Control.Monad):
many1Till p end = liftM2 (:) p (p `manyTill` end)
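To see how many1Till might fit into the paragraph parser, here is a minimal sketch. It rests on a few assumptions that are not in the question: the line parser uses noneOf "\n" rather than anyChar (anyChar would accept the newline of a blank line as the first character, so the line parser would still not fail there), paragraphs are separated by one or more blank lines, and the names document and renderParagraph are made up for illustration. To keep it short it builds the <p> markup as a plain String with no HTML escaping instead of going through Text.XHtml.

```haskell
import Text.ParserCombinators.Parsec

-- Like manyTill, but p must succeed at least once before end.
many1Till :: Parser a -> Parser end -> Parser [a]
many1Till p end = do
  first <- p
  rest  <- p `manyTill` end
  return (first : rest)

-- A non-empty line. noneOf "\n" is an assumption: it makes the parser
-- fail without consuming input on a blank line.
line :: Parser String
line = many1Till (noneOf "\n") newline

-- A paragraph is one or more non-empty lines.
paragraph :: Parser [String]
paragraph = many1 line

-- Hypothetical top-level parser: paragraphs separated by blank lines.
document :: Parser [[String]]
document = paragraph `sepEndBy` many1 newline <* eof

-- Hypothetical renderer: plain String output, no HTML escaping.
renderParagraph :: [String] -> String
renderParagraph ls = "<p>" ++ concatMap (++ "\n") ls ++ "</p>"

main :: IO ()
main =
  case parse document "" input of
    Left err -> print err
    Right ps -> mapM_ (putStrLn . renderParagraph) ps
  where
    input = "first line\nsecond line\n\nsecond paragraph\n"
```

Because line now fails on an empty line without consuming anything, many1 line stops at the blank line and the next paragraph starts a fresh <p> element.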