ruby - Reading a fixed number of pipe-delimited fields per line?

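I have a bunch of pipe-delimited files that were generated without properly escaping carriage returns, so I can't use CR or newline characters to delimit rows. I do know, however, that each record must have exactly 7 fields.

Splitting the fields is easy with the CSV library in Ruby 1.9 by setting the "col_sep" parameter, but the "row_sep" parameter can't be set because the fields contain newlines.

Is there a way to parse a pipe-delimited file using a fixed number of fields as the row delimiter?

Thanks!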

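Best answer

Here's one way to do it:

Build a sample string of seven words with a newline embedded in the middle of the string, repeated to give three rows' worth: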

text = (["now is the\ntime for all good"] * 3).join(' ').gsub(' ', '|')
puts text
# >> now|is|the
# >> time|for|all|good|now|is|the
# >> time|for|all|good|now|is|the
# >> time|for|all|good

The process looks like this:

lines = []
chunks = text.gsub("\n", '|').split('|')
while (chunks.any?)
  lines << chunks.slice!(0, 7).join(' ')
end

puts lines
# >> now is the time for all good
# >> now is the time for all good
# >> now is the time for all good

So, this shows that we can rebuild the rows.
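As a possible alternative to the destructive slice! loop, the same grouping can be sketched with Enumerable#each_slice. This is only a sketch on the same sample data: the variable name records is illustrative, and the chomp and the -1 split limit are additions not in the original answer, there to cope with a trailing newline and with empty trailing fields.

```ruby
# Same sample data as above: three 7-field records, each with a newline
# standing in for the pipe after the third field.
text = (["now is the\ntime for all good"] * 3).join(' ').gsub(' ', '|')

# chomp drops a trailing newline (if any) so it does not become an extra field.
# The -1 limit makes split keep trailing empty fields instead of discarding them.
fields = text.chomp.gsub("\n", '|').split('|', -1)

# Group every 7 consecutive fields into one record without mutating fields.
records = fields.each_slice(7).to_a

records.each { |record| puts record.join(' ') }
```

Unlike slice!, each_slice leaves the flat field array untouched, so it can be reused afterwards.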

Pretending the words are really columns from the pipe-delimited file, we can make the code do the real thing by taking out the .join(' '):

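# slice! emptied chunks in the loop above, so rebuild both arrays first
lines = []
chunks = text.gsub("\n", '|').split('|')
while (chunks.any?)
  lines << chunks.slice!(0, 7)
end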

require 'awesome_print' # ap comes from the awesome_print gem
ap lines
# >> [
# >>     [0] [
# >>         [0] "now",
# >>         [1] "is",
# >>         [2] "the",
# >>         [3] "time",
# >>         [4] "for",
# >>         [5] "all",
# >>         [6] "good"
# >>     ],
# >>     [1] [
# >>         [0] "now",
# >>         [1] "is",
# >>         [2] "the",
# >>         [3] "time",
# >>         [4] "for",
# >>         [5] "all",
# >>         [6] "good"
# >>     ],
# >>     [2] [
# >>         [0] "now",
# >>         [1] "is",
# >>         [2] "the",
# >>         [3] "time",
# >>         [4] "for",
# >>         [5] "all",
# >>         [6] "good"
# >>     ]
# >> ]
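In the sample data every newline sits between two fields, so turning it into a pipe is harmless. The question, though, describes newlines inside fields, where that substitution would split one field into two. A rough sketch for that case follows: it accumulates physical lines until the buffer holds six pipes, then splits. The method name parse_fixed_field_records and the sample string raw are hypothetical, and the sketch assumes that the last field of a record never contains a newline and that no field contains a literal pipe.

```ruby
# Hypothetical helper, not part of the original answer.
# Assumes: the last field of each record has no embedded newline,
# and no field contains a literal '|'.
def parse_fixed_field_records(text, field_count = 7)
  records = []
  buffer = nil

  text.each_line do |line|
    line = line.chomp
    # Re-join physical lines that belong to the same logical record.
    buffer = buffer.nil? ? line : "#{buffer}\n#{line}"

    # A complete record has field_count - 1 separators.
    next if buffer.count('|') < field_count - 1

    # The -1 limit keeps trailing empty fields.
    records << buffer.split('|', -1)
    buffer = nil
  end

  # Keep an incomplete trailing record rather than dropping it silently.
  records << buffer.split('|', -1) unless buffer.nil?
  records
end

raw = "1|Alice|line one\nline two|x|y|z|end\n2|Bob|single|x|y|z|\n"
records = parse_fixed_field_records(raw)
records.each { |record| p record }
```

Here the embedded newline stays inside the third field of the first record, and the empty seventh field of the second record is kept.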

Regarding "ruby - Reading a fixed number of pipe-delimited fields per line?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4083690/
