java - Calling a dollar-sign function in Java with Spark SQL

Tags: java scala apache-spark apache-spark-sql

I have the following code (Java with Spark SQL):

    import static org.apache.spark.sql.functions.col;
    ...

    System.out.println("=== Filtering records with average age more than 20 ===");
    Dataset<Row> result = df.filter(col("age").$less(20));

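I have never seen a Java method call that starts with a dollar sign. I tried googling it, but so far my best guess is that it is the result of Java calling Scala code (although there is no $less function in the Scala source code).

Could you provide a solid explanation for this?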

Best answer

The answer can be found here: http://www.codecommit.com/blog/java/interop-between-java-and-scala

Operators are Methods

One of the most obvious differences between Java and Scala is that Scala supports operator overloading. In fact, Scala supports a variant of operator overloading which is far stronger than anything offered by C++, C# or even Ruby. With very few exceptions, any symbol may be used to define a custom operator. This provides tremendous flexibility in DSLs and even your average, every-day API (such as List and Map).

Obviously, this particular language feature is not going to translate into Java quite so nicely. Java doesn’t support operator overloading of any variety, much less the über-powerful form defined by Scala. Thus, Scala operators must be compiled into an entirely non-symbolic form at the bytecode level, otherwise Java interop would be irreparably broken, and the JVM itself would be unable to swallow the result.

A good starting place for deciding on this translation is the way in which operators are declared in Scala: as methods. Every Scala operator (including unary operators like !) is defined as a method within a class:

    abstract class List[+A] {
      def ::[B >: A](e: B) = ...

      def +[B >: A](e: B) = ...
    }

Since Scala classes become Java classes and Scala methods become Java methods, the most obvious translation would be to take each operator method and produce a corresponding Java method with a heavily-translated name. In fact, this is exactly what Scala does. The above class will compile into the equivalent of this Java code:

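    // Pseudo-Java: `<B super A>` is not legal Java syntax; it only mirrors
    // the Scala lower bound `B >: A` to show the shape of the compiled methods.
    public abstract class List<A> {
      public <B super A> List<B> $colon$colon(B e) { ... }

      public <B super A> List<B> $plus(B e) { ... }
    }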
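The reason such translated names can be called from Java at all is that `$` is a legal character in Java identifiers. The following is a minimal sketch of that point only; `DollarMethodDemo` and `Age` are hypothetical names invented for illustration and are not part of Spark or Scala. The `$less` method is written by hand to imitate the name a Scala compiler would give a `<` operator method.

```java
public class DollarMethodDemo {
    /** Hypothetical stand-in for a Scala class that defines a `<` operator. */
    public static final class Age {
        private final int value;

        public Age(int value) {
            this.value = value;
        }

        // Hand-written imitation of the name Scala gives to `def <(other: Int)`.
        public boolean $less(int other) {
            return value < other;
        }
    }

    public static void main(String[] args) {
        Age age = new Age(18);
        System.out.println(age.$less(20)); // prints "true"
        System.out.println(age.$less(10)); // prints "false"
    }
}
```

Seen this way, `col("age").$less(20)` in the question is an ordinary Java method call on an object whose class happens to be written in Scala.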

Every allowable symbol in Scala’s method syntax has a corresponding translation of the form “$trans”. A list of supported translations is one of those pieces of documentation that you would expect to find on the Scala website. However, alas, it is absent. The following is a table of all of the translations of which I am aware:

┌────────────────┬─────────────┐
│ Scala Operator │ Compiles To │
├────────────────┼─────────────┤
│  =             │  $eq        │
├────────────────┼─────────────┤
│  >             │  $greater   │
├────────────────┼─────────────┤
│  <             │  $less      │
├────────────────┼─────────────┤
│  +             │  $plus      │
├────────────────┼─────────────┤
│  -             │  $minus     │
├────────────────┼─────────────┤
│  *             │  $times     │
├────────────────┼─────────────┤
│  /             │  $div       │
├────────────────┼─────────────┤
│  !             │  $bang      │
├────────────────┼─────────────┤
│  @             │  $at        │
├────────────────┼─────────────┤
│  #             │  $hash      │
├────────────────┼─────────────┤
│  %             │  $percent   │
├────────────────┼─────────────┤
│  ^             │  $up        │
├────────────────┼─────────────┤
│  &             │  $amp       │
├────────────────┼─────────────┤
│  ~             │  $tilde     │
├────────────────┼─────────────┤
│  ?             │  $qmark     │
├────────────────┼─────────────┤
│  |             │  $bar       │
├────────────────┼─────────────┤
│  \             │  $bslash    │
├────────────────┼─────────────┤
│  :             │  $colon     │
└────────────────┴─────────────┘

Using this table, you should be able to derive the “real name” of any Scala operator, allowing its use from within Java. Of course, the ideal solution would be if Java actually supported operator overloading and could use Scala’s operators directly, but somehow I doubt that will happen any time soon.
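The table lookup described above can be sketched as a small Java helper. This is an illustration under stated assumptions, not the Scala compiler's own implementation: `ScalaOperatorNames` and `encode` are hypothetical names, and the mapping simply restates the table character by character, leaving every other character unchanged.

```java
import java.util.Map;

public class ScalaOperatorNames {
    // Symbol-to-name mapping copied from the table above.
    private static final Map<Character, String> TRANSLATIONS = Map.ofEntries(
        Map.entry('=', "$eq"),
        Map.entry('>', "$greater"),
        Map.entry('<', "$less"),
        Map.entry('+', "$plus"),
        Map.entry('-', "$minus"),
        Map.entry('*', "$times"),
        Map.entry('/', "$div"),
        Map.entry('!', "$bang"),
        Map.entry('@', "$at"),
        Map.entry('#', "$hash"),
        Map.entry('%', "$percent"),
        Map.entry('^', "$up"),
        Map.entry('&', "$amp"),
        Map.entry('~', "$tilde"),
        Map.entry('?', "$qmark"),
        Map.entry('|', "$bar"),
        Map.entry('\\', "$bslash"),
        Map.entry(':', "$colon")
    );

    /** Replaces each operator symbol with its table entry; other characters pass through. */
    public static String encode(String scalaName) {
        StringBuilder javaName = new StringBuilder();
        for (char c : scalaName.toCharArray()) {
            javaName.append(TRANSLATIONS.getOrDefault(c, String.valueOf(c)));
        }
        return javaName.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("<"));      // prints "$less"
        System.out.println(encode("::"));     // prints "$colon$colon"
        System.out.println(encode("<="));     // prints "$less$eq"
        System.out.println(encode("filter")); // prints "filter"
    }
}
```

Multi-character operators are handled by concatenating the per-symbol names, which matches the `::` to `$colon$colon` example in the excerpt.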

**** This answer was originally posted by someone else but was deleted for some reason (I would be glad if the original author could repost it).

Regarding "java - Calling a dollar-sign function in Java with Spark SQL", the original question is on Stack Overflow: https://stackoverflow.com/questions/53937130/
