Hi, I'm experimenting with Spark window functions. I need row_number to start from 0. Here is my code.
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id")))
The row numbers start from 1. I tried the following two variants.
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id") -1))
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id"))) -1
Neither works for me. I need my row_number to start from zero. Any help would be appreciated.
Best answer
Try this:
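from pyspark.sql.functions import row_number
from pyspark.sql.window import Window

# Subtract 1 from the Column returned by over(w), inside withColumn
w = Window.partitionBy("name","mark1","mark2").orderBy("id")
target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank",
    row_number().over(w) - 1)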
This works in PySpark.
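
Since the question is written in Scala, here is a rough Scala sketch of the same idea. As far as I can tell from the parentheses, the first attempt applies `- 1` to the window specification and the second applies it to the DataFrame returned by withColumn, and neither of those types supports subtraction. The value returned by row_number().over(...) is a Column, so the `- 1` has to go directly on it. The sample rows and the local SparkSession below are made up for illustration; in spark-shell a session named `spark` already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

// Local session for illustration only (assumption, not from the question).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("row-number-from-zero")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for target1.
val target1 = Seq(
  (10, "a", 1, 1, "v1"),
  (11, "a", 1, 1, "v2"),
  (12, "b", 2, 2, "v1")
).toDF("id", "name", "mark1", "mark2", "version")

val w = Window.partitionBy("name", "mark1", "mark2").orderBy("id")

// row_number().over(w) is a Column, so subtract 1 from it
// before the closing parenthesis of withColumn.
val target2 = target1
  .select("id", "name", "mark1", "mark2", "version")
  .withColumn("rank", row_number().over(w) - 1)

target2.orderBy("id").show()
```

Pulling the window specification into its own `val w` also makes it easier to see which expression the `- 1` belongs to.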
Regarding apache-spark - Set row_number to start from 0, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49998448/