java - Why is the maximum precision of org.apache.spark.sql.types.DecimalType in Spark SQL 38?

Tags: java scala apache-spark apache-spark-sql

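I am using Apache Spark SQL to process structured big data. While working with the Spark SQL data types I came across DecimalType, which can store larger numbers than any other Spark SQL data type, but only up to a precision of 38. This is despite the documentation (http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types) saying that it is backed internally by Scala's BigDecimal, which allows a precision of approximately 2^32. Why is that?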

I need Spark SQL to give me the same functionality that Scala's BigDecimal provides. How can I achieve this, or is there another approach I could try?

Best answer

Under the hood, Spark uses Java's BigDecimal.

https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html

A BigDecimal consists of an arbitrary precision integer unscaled value and a 32-bit integer scale. If zero or positive, the scale is the number of digits to the right of the decimal point. If negative, the unscaled value of the number is multiplied by ten to the power of the negation of the scale. The value of the number represented by the BigDecimal is therefore (unscaledValue × 10^(-scale)).
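To make the quoted definition concrete, here is a minimal sketch using only java.math, with no Spark dependency. The class name DecimalPrecisionSketch and its helpers rebuild and bitsNeeded are hypothetical names chosen for this illustration. The last part rests on an assumption the answer above does not state: that a limit of 38 digits is a natural choice because any 38-digit unscaled value fits in a signed 128-bit (16-byte) integer, while 39 digits does not. Treat it as a plausible explanation rather than a confirmed design rationale.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class DecimalPrecisionSketch {

    // Rebuilds a BigDecimal from its two components: unscaledValue × 10^(-scale).
    static BigDecimal rebuild(BigDecimal value) {
        return new BigDecimal(value.unscaledValue(), value.scale());
    }

    // Bits needed for the magnitude of the largest unscaled value with the
    // given number of decimal digits, i.e. 10^digits - 1 (sign bit excluded).
    static int bitsNeeded(int digits) {
        return BigInteger.TEN.pow(digits).subtract(BigInteger.ONE).bitLength();
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("123.45");
        System.out.println(value.unscaledValue()); // 12345
        System.out.println(value.scale());         // 2
        System.out.println(value.precision());     // 5 (digits in the unscaled value)
        System.out.println(rebuild(value).equals(value)); // true

        // Assumption (not from the answer): 38 digits fit in a signed 128-bit integer.
        System.out.println(bitsNeeded(38)); // 127, fits next to one sign bit
        System.out.println(bitsNeeded(39)); // 130, no longer fits in 128 bits
    }
}
```

Precision counts the digits of the unscaled value, so a cap on precision is a cap on how large that integer can grow. java.math.BigDecimal itself has no such cap, which is why the limit in the question has to come from Spark's DecimalType rather than from the class it is backed by.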

The original question, "java - Why is the maximum precision of org.apache.spark.sql.types.DecimalType in Spark SQL 38?", can be found on Stack Overflow: https://stackoverflow.com/questions/40585319/
