I am loading a CSV file with apoc.periodic.iterate and apoc.load.csv, but it always fails with a NullPointerException:
neo4j> CALL apoc.periodic.iterate('
CALL apoc.load.csv("http://128.194.9.150:9999/On_Time_On_Time_Performance_2018_1.csv", {}) yield map as row return row
','
MATCH (sc:City {name: row.OriginCityName}), (tc:City {name: row.DestCityName})
MERGE (sc)-[f:Flight {flightDate: row.FlightDate, flightNum: toInt(row.FlightNum)}]->(tc)
', {batchSize:200, iterateList:true, parallel:true});
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| batches | total | timeTaken | committedOperations | failedOperations | failedBatches | retries | errorMessages | batch | operations | wasTerminated |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2851 | 570131 | 888 | 313800 | 256400 | 1282 | 0 | | {total: 2851, committed: 1569, failed: 1282, errors: {`java.lang.NullPointerException`: 1282}} | {total: 570131, committed: 313800, failed: 256400, errors: } | FALSE |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
However, when I load the same file with Neo4j's LOAD CSV command, it works correctly:
neo4j> LOAD CSV WITH HEADERS FROM "http://128.194.9.150:9999/On_Time_On_Time_Performance_2018_1.csv" as row
MATCH (sc:City {name: row.OriginCityName}), (tc:City {name: row.DestCityName})
MERGE (sc)-[f:Flight {flightDate: row.FlightDate, flightNum: toInt(row.FlightNum)}]->(tc)
;
0 rows available after 3395077 ms, consumed after another 0 ms
Created 255988 relationships, Set 511976 properties
The CSV file comes from this site: https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time
How should I load the CSV file to avoid the NullPointerException?
Best answer
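You have clearly changed the default headers of the transtats data, but I assume you have verified those. The only problem I see right away is the parallel:true in your apoc.periodic.iterate call. Given the data (highly correlated, with many rows sharing the same origin and destination), that is bound to cause problems, most likely because parallel batches try to attach relationships to the same City nodes at the same time.
Can you try parallel:false? That should give you exactly the same result as the regular LOAD CSV.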
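For reference, a minimal sketch of the suggested change, reusing the call from the question with only parallel switched to false. It assumes the City nodes already exist and that the CSV really has the OriginCityName, DestCityName, FlightDate and FlightNum columns used in the question. toInteger is written here in place of toInt because toInt was removed in later Neo4j versions; on the version used in the question either should work.

```cypher
// Same two statements as in the question; batches now run one after another.
CALL apoc.periodic.iterate('
CALL apoc.load.csv("http://128.194.9.150:9999/On_Time_On_Time_Performance_2018_1.csv", {}) YIELD map AS row RETURN row
','
MATCH (sc:City {name: row.OriginCityName}), (tc:City {name: row.DestCityName})
MERGE (sc)-[f:Flight {flightDate: row.FlightDate, flightNum: toInteger(row.FlightNum)}]->(tc)
', {batchSize: 200, iterateList: true, parallel: false});
```

If the run is clean, failedOperations and failedBatches in the result row should both be 0.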
Hope this helps.
Regards, Tom
Update
CREATE CONSTRAINT ON (c:City) ASSERT c.name IS UNIQUE;
LOAD CSV WITH HEADERS FROM "file:///488042997_T_ONTIME.csv" AS row
WITH [row.ORIGIN_CITY_NAME, row.DEST_CITY_NAME] as names
UNWIND names as cityName
WITH DISTINCT cityName as theName
CREATE (c:City {name: theName});
// Added 328 labels, created 328 nodes, set 328 properties, completed after 3308 ms.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///488042997_T_ONTIME.csv" AS row
WITH DISTINCT row.ORIGIN_CITY_NAME as ocn, row.DEST_CITY_NAME as dcn, row.FL_DATE as fdate, toInteger(row.FL_NUM) as fnum
MATCH (sc:City {name: ocn}), (tc:City {name: dcn})
CREATE (sc)-[f:FLIGHT {flightDate: fdate, flightNum: fnum}]->(tc);
// Set 1139338 properties, created 569669 relationships, completed after 38773 ms.
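A quick way to check the result of the two load steps is to count what was created. This is only a sketch: it assumes the database held no City nodes or FLIGHT relationships before the load, so the counts can be compared directly with the numbers reported above.

```cypher
// Number of City nodes; the first load step reported 328 nodes created.
MATCH (c:City) RETURN count(c) AS cities;

// Number of FLIGHT relationships; the second load step reported 569669 created.
MATCH (:City)-[f:FLIGHT]->(:City) RETURN count(f) AS flights;
```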
This question (csv: NullPointerException when loading a CSV with neo4j's apoc.periodic.iterate procedure) comes from Stack Overflow: https://stackoverflow.com/questions/50557893/