ruby - Why does CSV::HeaderConverters stop processing when a non-String is returned?

Why does header converter processing stop at the first non-String value returned by a header converter?

Details

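Once the built-in :symbol header converter fires, no further converters are processed. Header converter processing appears to stop at the first converter that returns anything other than a String (i.e., the behavior is the same if you write a custom header converter that returns a Fixnum, or anything else).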

This code works as expected, raising the exception in :throw_an_exception:

require 'csv'

CSV::HeaderConverters[:throw_an_exception] = lambda do |header|
  raise 'Exception triggered.'
end

csv_str = "Numbers\n" +
          "1\n" +
          "4\n" +
          "7"

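# Options are passed as keyword arguments so this also runs on Ruby 3.x,
# where a positional options hash is rejected.
puts CSV.parse(
  csv_str,
  headers: true,
  header_converters: [
    :throw_an_exception,
    :symbol
  ]
)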

However, if you switch the order of the header converters so that :symbol comes first, the :throw_an_exception lambda is never called.

...

header_converters: [
  :symbol,
  :throw_an_exception
]

...

Best answer

So I contacted JEG2.

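I had assumed that converters were a chain of steps that every element passes through in turn. In fact, that is not the best way to use the CSV library, especially when you have a lot of data.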

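The way it is meant to be used (this answers the "why" and explains why it is better for performance) is for converters to act like a series of matchers: the first converter that matches returns a non-String, which tells the CSV library that the current value has been converted successfully. The parser can then stop as soon as the value is a non-String and move on to the next header or cell value.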

This removes a good deal of overhead when parsing CSV data. The larger the file you process, the more overhead is eliminated.
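The short-circuit described above can be observed with a small sketch. The two lambdas and the `calls` array are names made up for this illustration; they are not part of the CSV library. The first converter returns a Symbol, so the second one should never run:

```ruby
require 'csv'

calls = []  # hypothetical log of which converters actually ran

# Returns a Symbol (a non-String), which should end the chain for this header.
to_symbol = lambda do |header|
  calls << :to_symbol
  header.to_sym
end

# Placed second, so it only runs if the chain does not stop early.
never_reached = lambda do |header|
  calls << :never_reached
  header
end

table = CSV.parse(
  "Numbers\n1\n4\n7",
  headers: true,
  header_converters: [to_symbol, never_reached]
)

p table.headers  # => [:Numbers]
p calls          # does not include :never_reached
```

It follows that if several header conversions must all run, the ones that return a String have to come before the one that returns a non-String, as in the first example in the question.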

Here is the email reply I received:

...

The converters are basically a pipeline of conversions to try. Let's say you're using two converters, one for dates and one for numbers. Without a linked line, we would try both for every field. However, we know a couple of things:

An unconverted CSV field is a String, because that's how we read it in

A field that is now a non-String, has been converted, so we can stop searching for a converter that matches.

Given that, the optimization helps our example skip checking the number converter if we already have a Date object.

...
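To make the date/number scenario from the email concrete, here is a rough sketch using two hand-written field converters. These are illustrative stand-ins, not the library's built-in :date or :numeric converters, and `number_calls` is a hypothetical log added only to show which fields reach the second converter:

```ruby
require 'csv'
require 'date'

number_calls = []  # hypothetical log of fields seen by the number converter

# Converts ISO-style dates; anything else is returned unchanged as a String.
date_converter = lambda do |field|
  field =~ /\A\d{4}-\d{2}-\d{2}\z/ ? Date.parse(field) : field
end

# Converts plain integers; anything else is returned unchanged as a String.
number_converter = lambda do |field|
  number_calls << field
  field =~ /\A\d+\z/ ? field.to_i : field
end

rows = CSV.parse(
  "2016-06-23,42,hello\n",
  converters: [date_converter, number_converter]
)

p rows.first.map(&:class)  # => [Date, Integer, String]
p number_calls             # => ["42", "hello"]
```

The date field never reaches the number converter because it is already a Date by then, which is the optimization the email describes.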

This question, "ruby - Why does CSV::HeaderConverters stop processing when a non-String is returned?", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/37999758/
