java - Mapping Spark dataset columns with underscores to Java object fields

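My Spark dataset has a column named no_of_items. The corresponding Java model (Product) has a field named noOfItems. When I convert the dataset to a typed dataset of Product with the following code: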

df.as(Encoders.bean(Product.class));

it throws the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`noOfItems`' given input columns: [category, sub_category, no_of_items];

How can I fix this?

Best answer

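Encoders.bean matches columns to bean properties by name, so no_of_items is never matched to noOfItems. Rename the column before converting the df to a dataset: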

df.withColumnRenamed("no_of_items", "noOfItems").as(Encoders.bean(Product.class));
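If several columns use underscores, renaming them one at a time gets tedious. A minimal sketch of a helper that derives the camelCase name from a snake_case column name is shown below. ColumnNames and toCamelCase are hypothetical names, not part of Spark, and the sketch assumes every bean property is simply the camelCase form of its column name. The Spark calls in the trailing comment (df.columns() and withColumnRenamed) are the standard Dataset methods.

```java
// Hypothetical helper (not part of Spark): converts snake_case column
// names to the camelCase property names that a Java bean would use.
final class ColumnNames {

    private ColumnNames() {
    }

    /** Converts e.g. "no_of_items" to "noOfItems"; names without underscores are returned unchanged. */
    static String toCamelCase(String snake) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : snake.toCharArray()) {
            if (c == '_') {
                // Capitalize the next letter, but ignore leading underscores.
                upperNext = out.length() > 0;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}

// Possible use with Spark, assuming df is a Dataset<Row>:
//
//   Dataset<Row> renamed = df;
//   for (String column : df.columns()) {
//       renamed = renamed.withColumnRenamed(column, ColumnNames.toCamelCase(column));
//   }
//   Dataset<Product> products = renamed.as(Encoders.bean(Product.class));
```

With the columns from the exception above, category stays as it is, while sub_category and no_of_items become subCategory and noOfItems, so Product would need properties with exactly those names.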

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/57589518/
