Scala collections' sliding() is inconsistent when the window size is greater than the step

Tags: scala scala-collections sliding

This is sliding() from the Scala collections API:

/** Groups elements in fixed size blocks by passing a "sliding window"
   *  over them (as opposed to partitioning them, as is done in grouped.)
   *  @see [[scala.collection.Iterator]], method `sliding`
   *
   *  @param size the number of elements per group
   *  @param step the distance between the first elements of successive
   *         groups
   *  @return An iterator producing ${coll}s of size `size`, except the
   *          last and the only element will be truncated if there are
   *          fewer elements than size.
   */
  def sliding(size: Int, step: Int): Iterator[Repr] =

A simple way to understand this is that sliding is just (0 until this.length by step).map(i => slice(i, i + size)). But this interpretation does not hold when size > step:

object SlidingTest extends App {
  val n = 10

  val r1 = 0 until n

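  // Reference implementation: one slice per start index 0, step, 2*step, ...
  // Note: subclassing Range compiles on Scala 2.12 and earlier; Range is sealed in 2.13.
  val r2 = new Range(start = 0, end = n, step = 1) {
    override def sliding(size: Int, step: Int) =
      (indices by step).iterator.map(i => slice(i, i + size))
  }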

  for {
    i <- 1 to 2*n
    j <- 1 to 2*n
    s1 = r1.sliding(i, j).toList.map(_.toList)
    s2 = r2.sliding(i, j).toList.map(_.toList)
    if s1 != s2
  } println(s"Sliding fail for size=$i and step=$j: [s1=$s1; s2=$s2]")
}

Consider in particular r1 = 0 until 10. According to the documentation, r1.sliding(size = 2, step = 1) should be:

List(List(0, 1), List(1, 2), List(2, 3), List(3, 4), List(4, 5), List(5, 6), List(6, 7), List(7, 8), List(8, 9), List(9))

But it is actually:

List(List(0, 1), List(1, 2), List(2, 3), List(3, 4), List(4, 5), List(5, 6), List(6, 7), List(7, 8), List(8, 9))

(that is, the final truncated slice is missing).

Another snippet, copied from the Scaladoc:

 /** Returns an iterator which presents a "sliding window" view of
   *  another iterator.  The first argument is the window size, and
   *  the second is how far to advance the window on each iteration;
   *  defaults to `1`.  Example usages:
   *  {{{
   *    // Returns List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))
   *    (1 to 5).iterator.sliding(3).toList
   *    // Returns List(List(1, 2, 3, 4), List(4, 5))
   *    (1 to 5).iterator.sliding(4, 3).toList
   *    // Returns List(List(1, 2, 3, 4))
   *    (1 to 5).iterator.sliding(4, 3).withPartial(false).toList
   *    // Returns List(List(1, 2, 3, 4), List(4, 5, 20, 25))
   *    // Illustrating that withPadding's argument is by-name.
   *    val it2 = Iterator.iterate(20)(_ + 5)
   *    (1 to 5).iterator.sliding(4, 3).withPadding(it2.next).toList
   *  }}}
   *
   *  @note Reuse: $consumesAndProducesIterator
   */
  def sliding[B >: A](size: Int, step: Int = 1): GroupedIterator[B] =
    new GroupedIterator[B](self, size, step)

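What am I doing wrong?

Best answer

It groups the elements, and it stops as soon as every element has been placed in a group.

It does not produce a group at every possible step.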

scala> (1 to 100).sliding(size=100,step=1).toList.size
res0: Int = 1

scala> (1 to 100).sliding(size=99,step=1).toList.size
res1: Int = 2

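In your example, you expect it to create an extra group containing only 9, even though the collection has already been completely grouped.

You also showed an example in which the remaining elements form a partial group: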

scala> (1 to 5).sliding(size=4,step=3).toList
res4: List[scala.collection.immutable.IndexedSeq[Int]] = List(Vector(1, 2, 3, 4), Vector(4, 5))

The extra group is needed there because 5 has not yet been grouped.

Edit: a possible rewording of the Scaladoc:
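The rule described above can be sketched as a small model. This is only an illustration of the observed behaviour, not the library's actual implementation: `SlidingModel` and `modelSliding` are hypothetical names, and the sketch assumes an indexed sequence and values small enough that the index arithmetic does not overflow `Int`. A window starting at index `start` is emitted only if it contains at least one element that no earlier window has covered.

```scala
object SlidingModel {
  /** Hypothetical model of `sliding(size, step)` on an indexed sequence.
    * Windows start at 0, step, 2*step, ... and a window is emitted only
    * while it still contains an element no earlier window has covered.
    */
  def modelSliding[A](xs: IndexedSeq[A], size: Int, step: Int): List[IndexedSeq[A]] = {
    require(size >= 1 && step >= 1, "size and step must be positive")
    val n = xs.length

    // First index inside the window at `start` that is new, i.e. not
    // already covered by the previous window [start - step, start - step + size).
    def firstNewIndex(start: Int): Int =
      if (start == 0) 0 else math.max(start, start - step + size)

    Iterator.iterate(0)(_ + step)
      .takeWhile(start => firstNewIndex(start) < n) // stop once nothing new is left
      .map(start => xs.slice(start, start + size))  // slice truncates at the end
      .toList
  }
}
```

For `0 until 10` with size 2 and step 1, the window that would start at index 9 has no new element (9 already appeared in the previous window), so the model stops after nine groups. For `1 to 5` with size 4 and step 3, the window starting at index 3 still contains the ungrouped 5, so a truncated group of 4 and 5 is emitted.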

An iterator producing ${coll}s of size `size`, except the last element (which may be the only element) will be truncated if there are fewer than `size` elements remaining to be grouped.
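If a window at every start position is really what is needed, including the trailing truncated ones the question expected, one option is to slice explicitly instead of calling `sliding`. The following is a minimal sketch along the lines of the formula in the question; `ExplicitWindows` and `windowsAtEveryStep` are hypothetical names, not part of the standard library.

```scala
object ExplicitWindows {
  /** Hypothetical helper: one window per start index 0, step, 2*step, ...
    * regardless of whether the window adds any new element.
    */
  def windowsAtEveryStep[A](xs: IndexedSeq[A], size: Int, step: Int): List[IndexedSeq[A]] = {
    require(size >= 1 && step >= 1, "size and step must be positive")
    // `indices` is a Range over valid positions; `by` picks every step-th one.
    xs.indices.by(step).map(i => xs.slice(i, i + size)).toList
  }
}
```

With `0 until 10`, size 2 and step 1, this yields ten windows, the last being the single element 9, whereas the standard `sliding` yields nine.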

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/42093762/
