MySQL load ignores some records

Tags: mysql csv mysql-loadfile

I have this CSV file with approximately 16,916 records. When I load it into MySQL, it only detects 15,945 records. This is what MySQL reports:

Records: 15945  Deleted: 0  Skipped: 0  Warnings: 0

Can someone tell me why MySQL ignores some of the records, and how to fix it?

I load the file with a LOAD DATA statement, as follows:

LOAD DATA LOCAL INFILE 'germany-filtered.csv'
INTO TABLE point_of_interest
FIELDS TERMINATED BY ','
    ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(osm_id,lat,lng,access,addr_housename,addr_housenumber,addr_interpolation,admin_level,aerialway,aeroway,amenity,area,barrier,bicycle,brand,bridge,boundary,building,capital,construction,covered,culvert,cutting,denomination,disused,ele,embankment,foot,generator_source,harbour,highway,historic,horse,intermittent,junction,landuse,layer,leisure,ship_lock,man_made,military,motorcar,name,osm_natural,office,oneway,operator,place,poi,population,power,power_source,public_transport,railway,ref,religion,route,service,shop,sport,surface,toll,tourism,tower_type,tunnel,water,waterway,wetland,width,wood);

This is the database schema I use:

CREATE TABLE point_of_interest (
    `poi_id` int(10) unsigned NOT NULL auto_increment,
    `lat` DECIMAL(10, 8) default NULL,
    `lng` DECIMAL(11, 8) default NULL,
    PRIMARY KEY  (`poi_id`),
    KEY `lat` (`lat`),
    KEY `lng` (`lng`),
    osm_id BIGINT,
    access TEXT,
    addr_housename TEXT,
    addr_housenumber TEXT,
    addr_interpolation TEXT,
    admin_level TEXT,
    aerialway TEXT,
    aeroway TEXT,
    amenity TEXT,
    area TEXT,
    barrier TEXT,
    bicycle TEXT,
    brand TEXT,
    bridge TEXT,
    boundary TEXT,
    building TEXT,
    capital TEXT,
    construction TEXT,
    covered TEXT,
    culvert TEXT,
    cutting TEXT,
    denomination TEXT,
    disused TEXT,
    ele TEXT,
    embankment TEXT,
    foot TEXT,
    generator_source TEXT,
    harbour TEXT,
    highway TEXT,
    historic TEXT,
    horse TEXT,
    intermittent TEXT,
    junction TEXT,
    landuse TEXT,
    layer TEXT,
    leisure TEXT,
    ship_lock TEXT,
    man_made TEXT,
    military TEXT,
    motorcar TEXT,
    name TEXT,
    osm_natural TEXT,
    office TEXT,
    oneway TEXT,
    operator TEXT,
    place TEXT,
    poi TEXT,
    population TEXT,
    power TEXT,
    power_source TEXT,
    public_transport TEXT,
    railway TEXT,
    ref TEXT,
    religion TEXT,
    route TEXT,
    service TEXT,
    shop TEXT,
    sport TEXT,
    surface TEXT,
    toll TEXT,
    tourism TEXT,
    tower_type TEXT,
    tunnel TEXT,
    water TEXT,
    waterway TEXT,
    wetland TEXT,
    width TEXT,
    wood TEXT
) ENGINE=InnoDB;

Update:

I have checked the first and the last record, and both are present in the table. The file does contain records with many empty values:

1503898236,10.5271308,52.7468051,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

Update 2:

These are records that I found to be missing from the database:

4228380062,9.9386752,53.6135468,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Dammwild,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228278589,9.9391503,53.5960304,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Kaninchen,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228278483,9.9396935,53.5960729,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Onager,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4226772791,8.8394263,54.1354887,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Familienlagune Perlebucht,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,

It seems that almost all records whose osm_id starts with 4 are missing. That is strange.
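One way to produce a list of missing records like the one above is to compare the ids in the CSV with the ids that ended up in the table. The sketch below assumes the two id lists have already been written to text files, one id per line. The file names csv_ids.txt and db_ids.txt are hypothetical, and the sample ids are a small subset used only for illustration. In practice the first list could come from `cut -d',' -f1` on the CSV and the second from a `SELECT osm_id FROM point_of_interest` export.

```shell
# Hypothetical inputs: ids taken from the CSV and ids exported from the table.
work=$(mktemp -d)
printf '%s\n' 6991 4228380062 4228278589 4226772791 > "$work/csv_ids.txt"
printf '%s\n' 6991 4226772791 > "$work/db_ids.txt"

# comm requires both inputs to be sorted in the same order.
sort "$work/csv_ids.txt" > "$work/csv_sorted.txt"
sort "$work/db_ids.txt" > "$work/db_sorted.txt"

# -23 suppresses lines unique to the second file and lines common to both,
# leaving only ids that are in the CSV but not in the table.
missing_ids=$(comm -23 "$work/csv_sorted.txt" "$work/db_sorted.txt")
echo "$missing_ids"

rm -rf "$work"
```

With the sample data, this prints the two ids that appear only in the CSV list.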

Best answer

Check whether the file contains duplicate ids:

Show the file

# cat mycsv.csv
6991,10.4232704,49.4970160,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Bauernhaus aus Seubersdorf,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228380062,9.9386752,53.6135468,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Dammwild,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228278589,9.9391503,53.5960304,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Kaninchen,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228278483,9.9396935,53.5960729,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Onager,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4226772791,8.8394263,54.1354887,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Familienlagune Perlebucht,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,
4228278589,9.9391503,53.5960304,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Kaninchen,,,,,,,,,,,,,,,,,,,,attraction,,,,,,,

Count the lines

# wc -l mycsv.csv
6 mycsv.csv

Remove the duplicate ids and count again

# cut -d',' -f1 mycsv.csv | sort | uniq | wc -l
5
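The two counts show that a duplicate exists, but not which id is repeated. As a minimal sketch, `uniq -d` prints only the values that occur more than once. The sample file below is a shortened, hypothetical stand-in for mycsv.csv: it keeps just the first three fields of each row, since only the first field matters for this check.

```shell
# Build a small sample with the same six ids as the example file
# (trailing empty fields omitted; only the first column is inspected).
csv=$(mktemp)
printf '%s\n' \
  '6991,10.4232704,49.4970160' \
  '4228380062,9.9386752,53.6135468' \
  '4228278589,9.9391503,53.5960304' \
  '4228278483,9.9396935,53.5960729' \
  '4226772791,8.8394263,54.1354887' \
  '4228278589,9.9391503,53.5960304' > "$csv"

# Extract the id column, sort it, and keep one copy of each repeated id.
dup_ids=$(cut -d',' -f1 "$csv" | sort | uniq -d)
echo "$dup_ids"

rm -f "$csv"
```

For the sample this prints 4228278589, the id that appears twice. On the real file, the header line would need to be skipped first, for example with `tail -n +2`.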

A similar question about MySQL load ignoring some records can be found on Stack Overflow: https://stackoverflow.com/questions/37767005/
