I have a CSV file that I want to load into a MySQL table with the following command:
LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE items
CHARACTER SET utf8
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(field1, field2, field3, field4, field5);
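The problem is that the CSV file is malformed: some fields are not enclosed in double quotes ("") and also contain line breaks. For example (see the third record):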
"field1","field2","field3","field4","field5"
"aaaaa","bbbbb","ccccc","ddddd","eeeeee"
aaaa
aaaa,bbbbbbbb
bbbbb,"ccccc","dddddd","eeeee"
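When I import the CSV file into MySQL, the line breaks inside field values are interpreted as line terminators.
So, how can I fix this? Regular expressions? Some CSV editor (I tried CSVed, without success)? Thanks.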
Best answer
A quick and dirty attempt at a fix:
// Read the whole file into one string (path taken from the question)
$csv = file_get_contents('/path/to/file.csv');
$csv = str_replace("\r", "", $csv);
$data = array(array());
// Compare against '' rather than using empty(), which is also true for "0"
while ($csv !== '') {
    // Quoted field: read up to the next quote not preceded by a backslash.
    // Note: an empty quoted field ("") is not matched by this pattern.
    if (substr($csv, 0, 1) == '"') {
        $found = preg_match('~[^\\\\]"~', $csv, $matches, PREG_OFFSET_CAPTURE, 1);
        if (!$found) {
            die("No closing quote found");
        }
        $data[count($data)-1][] = substr($csv, 1, $matches[0][1]);
        $csv = substr($csv, $matches[0][1] + 2);
    // Unquoted field: read up to the next comma, keeping any newlines
    } else {
        $pos = strpos($csv, ',');
        if ($pos === FALSE) {
            $data[count($data)-1][] = $csv;
            $csv = "";
        } else {
            $data[count($data)-1][] = substr($csv, 0, $pos);
            $csv = substr($csv, $pos);
        }
    }
    // comma => not the end of the row
    if (substr($csv, 0, 1) == ',') {
        $csv = substr($csv, 1);
    // newline => end of the row
    } else if (substr($csv, 0, 1) == "\n") {
        $csv = ltrim($csv);
        if ($csv !== '') {
            $data[] = array(); // new row, unless the input ended here
        }
    } else if ($csv !== '') {
        die("unexpected error in csv");
    }
}
print_r($data);
Applied to your data snippet, this outputs:
Array
(
    [0] => Array
        (
            [0] => field1
            [1] => field2
            [2] => field3
            [3] => field4
            [4] => field5
        )

    [1] => Array
        (
            [0] => aaaaa
            [1] => bbbbb
            [2] => ccccc
            [3] => ddddd
            [4] => eeeeee
        )

    [2] => Array
        (
            [0] => aaaa
aaaa
            [1] => bbbbbbbb
bbbbb
            [2] => ccccc
            [3] => dddddd
            [4] => eeeee
        )

)
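To get from the parsed array back to something LOAD DATA can read, one option is to write the rows out again with every field enclosed in double quotes. The following is only a minimal sketch and not part of the original answer: the function name `write_clean_csv` and the output path are hypothetical, and it assumes the LOAD DATA statement keeps MySQL's default escape character (backslash), so backslashes and double quotes inside a field are written backslash-escaped.

```php
<?php
// Hypothetical helper (not from the original answer): writes parsed rows
// to a CSV file in which every field is enclosed in double quotes.
// Assumes LOAD DATA uses its default ESCAPED BY '\\'.
function write_clean_csv(array $rows, string $path): int
{
    $out = fopen($path, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $written = 0;
    foreach ($rows as $row) {
        if ($row === array()) {
            continue; // skip empty rows
        }
        $quoted = array();
        foreach ($row as $field) {
            // Backslash-escape backslashes and double quotes, then enclose.
            $escaped = strtr((string) $field, array('\\' => '\\\\', '"' => '\\"'));
            $quoted[] = '"' . $escaped . '"';
        }
        fwrite($out, implode(',', $quoted) . "\n");
        $written++;
    }
    fclose($out);
    return $written; // number of rows actually written
}
```

Calling `write_clean_csv($data, '/path/to/clean.csv')` and pointing the LOAD DATA statement from the question at the new file should then keep the embedded newlines inside their fields, because each field is now enclosed. The header row is still present in the output; adding `IGNORE 1 LINES` before the column list skips it.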
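Regarding mysql - loading a CSV file into a MySQL table when some fields are not enclosed in double quotes and contain newlines, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5352881/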