When I wanted to upload data to my Apache Cassandra "Test Cluster", I opened a terminal and ran:
export PATH=/home/mypc/dsbulk-1.7.0/bin:$PATH
source ~/.bashrc
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test
But I got:
At least 1 record does not match the provided schema.mapping or schema.query. Please check that the connector configuration and the schema configuration are correct.
Operation LOAD_20201105-103000-577734 aborted: Too many errors, the maximum allowed is 100.
total | failed | rows/s | p50ms | p99ms | p999ms | batches
104 | 104 | 0 | 0,00 | 0,00 | 0,00 | 0,00
Rejected records can be found in the following file(s): mapping.bad
Errors are detailed in the following file(s): mapping-errors.log
Last processed positions can be found in positions.txt
What does this mean? Why can't I load the data?
Thanks!
Best answer
The error means that you didn't provide a mapping between the CSV data and the table. This can be done in two ways:
- if the header column names in the CSV file match the column names in Cassandra, then pass -header true
- provide the mapping explicitly with the -m option (see docs) - you need to map the CSV columns onto the Cassandra columns.
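For example, assuming the CSV's header row matches the table's columns (option 1), or that the table has columns named id and name (option 2 - those column names are just placeholders for illustration), the load command would look like:

```shell
# Option 1: CSV has a header row whose names match the Cassandra columns
dsbulk load -url /home/mypc/Desktop/test/file.csv \
    -k keyspace_test -t table_test \
    -header true

# Option 2: no usable header, map CSV columns to Cassandra columns explicitly
# ("0" and "1" are CSV column indices; id/name are hypothetical column names)
dsbulk load -url /home/mypc/Desktop/test/file.csv \
    -k keyspace_test -t table_test \
    -header false \
    -m "0=id, 1=name"
```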
There is a series of very good blog posts on different aspects of using DSBulk:
- https://www.datastax.com/blog/2019/03/datastax-bulk-loader-introduction-and-loading
- https://www.datastax.com/blog/2019/04/datastax-bulk-loader-more-loading
- https://www.datastax.com/blog/2019/04/datastax-bulk-loader-common-settings
- https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
- https://www.datastax.com/blog/2019/07/datastax-bulk-loader-counting
- https://www.datastax.com/blog/2019/12/datastax-bulk-loader-examples-loading-other-locations
The first two cover data loading in detail.
Regarding "ubuntu - How to load data into Apache Cassandra using the DataStax Bulk Loader (Ubuntu)?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64695641/