ruby-on-rails - MalformedCSVError with Rails CSV (FasterCSV)

Tags: ruby-on-rails ruby parsing csv fastercsv

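I'm having serious trouble parsing some CSV in Rails. Basically, my app lets users upload a CSV file. The app then converts the file to make sure it is UTF-8, and then tries to parse and process it. However, whenever the app tries to parse it, I get a MalformedCSVError saying "Illegal quoting on line 1".

What I don't understand is that if I copy the contents of the original file into a new document and save it, I can parse that copy in the Rails console without any problem.

If I try to parse the original file, it complains about an invalid byte sequence in UTF-8 (the file is not UTF-8, which is why the app converts it).

If I try to parse the file that the app has converted to UTF-8 and whose line endings it has changed to LF, parsing fails.

If I diff the version generated by the app against the copy/paste version I made (which works), there are 0 differences, so I really can't figure out why one parses and the other doesn't.

Any suggestions? My app processes the file as follows: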

def create
  @survey = Survey.new(params[:survey])

  # Now we need to try and convert this to UTF-8 if it isn't already
  encoded = File.read(@survey.survey_data.current_path)
  encoding = CharlockHolmes::EncodingDetector.detect(encoded)

  # We've got a guess at the encoding,
  # so we can try and convert it but it
  # may still fail so we need to handle
  # that
  begin
    re_encoded = CharlockHolmes::Converter.convert(encoded, encoding[:encoding], 'UTF-8')
    re_encoded = re_encoded.gsub(/\r\n?/, "\n")

    # Now replace the uploaded file
    File.open(@survey.survey_data.current_path, 'w') { |f|
      f.write(re_encoded)
    }
  rescue ArgumentError
    puts "UH OH!!!!!"
  end

  puts "#{@survey.survey_data.current_path}"
  @parsed = CSV.read(@survey.survey_data.current_path)
end

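The file upload gem is CarrierWave, if that makes any difference.

Please can someone help me, because this is driving me crazy!

Edit

The error says it is on line 1. Line 1 (assuming it isn't indexed from 0) is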

"Survey","RD","GarrysMDs","NigelsMDs","PaulsMDs","StephensMDs","BrinleyJ","CarolineP","DaveL","GrantR","GregS","Kent","NeilC","NicolaP","AndyC","DarrenS","DeanB","KarenF","PaulR","RichardF","SteveG","BrianG","GordonA","NickD","NickR","NickT","RayL","SimonH","EdmondH","JasonF","MikeS","SamanthaN","TimB","TravisF","AlanS","Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8PM","Q8N","Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16PM","Q16N","Q17PM","Q17N","Q18PM","Q18N","Q19","Q20","Q21","Q22","comment","Q23.1","Q23.2","Q23.3","TQ23.1","TQ23.2","VPM","VN","VQ1","VQ2","VQ3","VQ4","VQ5","VQ6","VQ7","VQ8N","VQ8PM","VQ9","VQ10","VQ11","VQ12","VQ13","VQ14","VQ15","VQ16","VQ16N","VQ16PM","VQ17","VQ17N","VQ17PM","VQ18","VQ18N","VQ18PM","VQ19","VQ20","VQ21","VQ22","VQ23.1","VQ23.2","VQ23.3","VRD","XQ16","XQ17","XQ18"

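Best answer

This was so infuriating!

It turned out the file had a BOM (byte order mark) that was breaking the CSV parser. Loading the file with

CSV.open("path/to/file.csv", "rb:bom|encoding")

got it to parse perfectly! Here encoding stands for the file's actual encoding, for example "rb:bom|utf-8". I'm annoyed at how long it took to track down, but it works now, and the file no longer needs to be converted to UTF-8 either!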
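To make the fix above concrete, here is a minimal sketch, assuming a UTF-8 file and Ruby's standard csv library. The helper names (utf8_bom?, read_csv_skipping_bom, write_bom_sample) and the two-row sample are made up for illustration and are not from the original post. The sample mimics the upload: a BOM followed by a quoted header row, so the parser sees three stray bytes before the opening quote. A BOM is invisible in most editors, which is probably why the copy/pasted file looked identical yet parsed fine.

```ruby
require "csv"
require "tempfile"

# UTF-8 byte order mark as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: does the file start with a UTF-8 BOM?
def utf8_bom?(path)
  File.binread(path, 3) == UTF8_BOM
end

# Hypothetical helper: read a CSV and let Ruby skip a leading BOM if present.
# A file without a BOM is simply read as UTF-8.
def read_csv_skipping_bom(path)
  CSV.read(path, encoding: "bom|utf-8")
end

# Hypothetical helper: write a throwaway file that mimics the upload,
# i.e. a BOM followed by a quoted header row.
def write_bom_sample
  file = Tempfile.new(["bom_sample", ".csv"])
  file.binmode
  file.write(UTF8_BOM + %("Survey","RD"\n"1","2"\n).b)
  file.close
  file
end

sample = write_bom_sample
puts "starts with BOM: #{utf8_bom?(sample.path)}"

begin
  # With a plain UTF-8 read the BOM stays in front of the opening quote,
  # so the first field no longer starts with a quote character.
  CSV.read(sample.path, encoding: "utf-8")
rescue CSV::MalformedCSVError => e
  puts "plain read failed: #{e.message}"
end

# The BOM-aware read drops the mark before the parser sees the data.
p read_csv_skipping_bom(sample.path)

sample.unlink
```

Checking the first three bytes with something like utf8_bom? is also a quick way to confirm this diagnosis before changing how the upload is read.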

Regarding ruby-on-rails - MalformedCSVError with Rails CSV (FasterCSV), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14252340/
