ubuntu - How do I load data into Apache Cassandra with the DataStax Bulk Loader (Ubuntu)?

When I want to upload data to my Apache Cassandra "Test Cluster", I open a terminal and run:

export PATH=/home/mypc/dsbulk-1.7.0/bin:$PATH

source ~/.bashrc

dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test

But...

At least 1 record does not match the provided schema.mapping or schema.query. Please check that the connector configuration and the schema configuration are correct.
Operation LOAD_20201105-103000-577734 aborted: Too many errors, the maximum allowed is 100.

total | failed | rows/s | p50ms | p99ms | p999ms | batches
  104 |    104 |      0 |  0,00 |  0,00 |   0,00 |    0,00

Rejected records can be found in the following file(s): mapping.bad
Errors are detailed in the following file(s): mapping-errors.log
Last processed positions can be found in positions.txt

What does this mean? Why can't I load the data?

Thanks!

Best answer

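The error means that you did not provide a mapping between the CSV data and the table. There are two ways to do that:

If the CSV file has a header row whose column names match the column names in Cassandra, use -header true
Provide the mapping explicitly with the -m option (see the docs): you need to map the CSV columns to the Cassandra columns.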
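
The two options above could look roughly like the sketch below. The file path, keyspace, and table come from the question; the column names id, name, and value are placeholders (an assumption, since the question does not show the table schema), and the small run helper is only there so the commands can be previewed before anything is loaded.

```shell
#!/bin/sh
# Values taken from the question.
CSV=/home/mypc/Desktop/test/file.csv
KEYSPACE=keyspace_test
TABLE=table_test

# Dry-run by default: print the command instead of executing it.
# Set DRY_RUN=0 to actually call dsbulk.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Option 1: the first CSV line is a header whose names match the table columns.
run dsbulk load -url "$CSV" -k "$KEYSPACE" -t "$TABLE" -header true

# Option 2: no usable header; map CSV field positions to table columns.
# id, name, value are hypothetical column names; replace them with your own.
run dsbulk load -url "$CSV" -k "$KEYSPACE" -t "$TABLE" -header false -m "0=id,1=name,2=value"
```

Only one of the two commands is needed for a given file. If the load still fails, the mapping-errors.log file named in the error output lists the reason each record was rejected.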

There is a series of very good blog posts covering different aspects of using DSBulk:

The first two cover data loading in detail.

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/64695641/
