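I'm trying to use PARSE to turn a CSV line into a Rebol block. It's easy enough to write in open code, but as with my other questions, I'm trying to learn what the dialect can do on its own.
So if a line says: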
"Look, that's ""MR. Fork"" to you!",Hostile Fork,,http://hostilefork.com
Then I want the block:
[{Look, that's "MR. Fork" to you!} {Hostile Fork} none {http://hostilefork.com}]
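Issues to notice:
Embedded quotes in CSV strings are indicated with ""
For the moment, things like http://rebol.com can stay as STRING! instead of being LOADed into types such as URL!
To make it more uniform, the first thing I do is append a comma to the input line. Then I have a column-rule that captures a single column terminated by a comma, which may or may not be in quotes. I know how many columns there should be from the header line, so the code then says: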
unless parse line compose [(column-count) column-rule] [
    print rejoin [{Expected } column-count { columns.}]
]
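But I'm a bit stuck on writing column-rule. I need a way in the dialect to express: "Once you find a quote, keep skipping quote pairs until you find a quote standing on its own." What's a good way to do that?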
Best answer
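As with most parsing problems, I try to build a grammar that best describes the elements of the input format.
In this case, we have nouns:
[comma ending value-chars qmark quoted-chars value header row]
Some verbs:
[row-feed emit-value]
And the operative nouns:
[current chunk current-row width]
I suppose I could break it down a little more, but this is enough to work with. First, the foundation: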
comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]
quoted-chars: complement charset reduce [qmark]
Now the value structure. Quoted values are built up from chunks of valid characters or quotes as we find them:
current: chunk: none
quoted-value: [
    qmark (current: copy "")
    any [
        copy chunk some quoted-chars (append current chunk)
        |
        qmark qmark (append current qmark)
    ]
    qmark
]
value: [
    copy current some value-chars
    | quoted-value
]
emit-value: [
    (
        delimiter: comma
        append current-row current
    )
]
emit-none: [
    (
        delimiter: comma
        append current-row none
    )
]
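To see the "skip quote pairs until a lone quote" behaviour in isolation, here is a minimal sketch that applies the quoted-value rule to the first field of the question's sample line. It assumes Rebol 2 (hence parse/all), and the word field is a hypothetical name introduced only for this illustration.

```rebol
REBOL [Title: "quoted-value sketch"]

qmark: {"}
quoted-chars: complement charset reduce [qmark]
current: chunk: none

quoted-value: [
    qmark (current: copy "")
    any [
        ; a run of ordinary characters is copied through as-is
        copy chunk some quoted-chars (append current chunk)
        |
        ; a doubled quote stands for one literal quote
        qmark qmark (append current qmark)
    ]
    ; a quote that is not part of a pair closes the value
    qmark
]

; hypothetical test input: the first column of the sample line
field: {"Look, that's ""MR. Fork"" to you!"}

if parse/all field quoted-value [print current]
; prints: Look, that's "MR. Fork" to you!
```

When the second alternative reaches the closing quote, qmark qmark fails on its second match, PARSE backtracks to before that quote, the any loop ends, and the final qmark consumes it.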
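Note that delimiter is set to ending at the beginning of each row, then changed to comma as soon as we pass through a value. Thus, an input row is defined as [ending value any [comma value]].
All that remains is to define the document structure: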
current-row: none
row-feed: [
    (
        delimiter: ending
        append/only out current-row: copy []
    )
]
width: none
header: [
    (out: copy [])
    row-feed any [
        value comma
        emit-value
    ]
    value body: ending :body
    emit-value
    (width: length? current-row)
]
row: [
    row-feed width [
        delimiter [
            value emit-value
            | emit-none
        ]
    ]
]
if parse/all stream [header some row opt ending][out]
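The delimiter switch described above relies on PARSE looking up a word's value each time the word is reached in a rule, so a paren action can change what the next match expects. The following is a reduced sketch of just that idea, assuming Rebol 2; the rule name field and the literal input string are hypothetical and not part of the parser itself.

```rebol
REBOL [Title: "delimiter switch sketch"]

comma: ","
ending: "^/"
value-chars: complement charset reduce [comma ending]

out: copy []
current: none
delimiter: ending   ; the first field of a row follows a line ending

field: [
    delimiter                        ; looked up afresh on every pass
    copy current some value-chars
    (
        append out current
        delimiter: comma             ; later fields follow a comma
    )
]

parse/all "^/a,b,c" [some field]
probe out
; == ["a" "b" "c"]
```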
Wrap it up to shield all those words, and you have:
REBOL [
    Title: "CSV Parser"
    Date: 19-Nov-2012
    Author: "Christopher Ross-Gill"
]
parse-csv: use [
    comma ending delimiter value-chars qmark quoted-chars
    value quoted-value header row
    row-feed emit-value emit-none
    out current current-row width
][
    comma: ","
    ending: "^/"
    qmark: {"}
    value-chars: complement charset reduce [qmark comma ending]
    quoted-chars: complement charset reduce [qmark]
    current: none
    quoted-value: use [chunk][
        [
            qmark (current: copy "")
            any [
                copy chunk some quoted-chars (append current chunk)
                |
                qmark qmark (append current qmark)
            ]
            qmark
        ]
    ]
    value: [
        copy current some value-chars
        | quoted-value
    ]
    current-row: none
    row-feed: [
        (
            delimiter: ending
            append/only out current-row: copy []
        )
    ]
    emit-value: [
        (
            delimiter: comma
            append current-row current
        )
    ]
    emit-none: [
        (
            delimiter: comma
            append current-row none
        )
    ]
    width: none
    header: [
        (out: copy [])
        row-feed any [
            value comma
            emit-value
        ]
        value body: ending :body
        emit-value
        (width: length? current-row)
    ]
    row: [
        opt ending end break
        |
        row-feed width [
            delimiter [
                value emit-value
                | emit-none
            ]
        ]
    ]
    func [stream [string!]][
        if parse/all stream [header some row][out]
    ]
]
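For readers unfamiliar with the wrapping technique, here is a tiny sketch of how USE shields words while still letting the returned function reach them. It is unrelated to CSV; the names next-id and count are hypothetical and chosen only for illustration.

```rebol
REBOL [Title: "use sketch"]

next-id: use [count] [
    count: 0                     ; bound to the private context made by USE
    func [] [count: count + 1]   ; the function body keeps that binding
]

print next-id   ; prints 1
print next-id   ; prints 2
```

In the same way, parse-csv keeps comma, value, out and the rest in a private context, so a script that defines its own words with those names should not collide with the parser.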
Source: "parsing - How to use the PARSE dialect to read in a line from a CSV?" on Stack Overflow: https://stackoverflow.com/questions/13451026/