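I need to create an iterator from a stream. Both the parent stream and the child streams are built from non-interfering, stateless operations, so the obvious strategy is to use flatMap.
It turns out that on the first hasNext() call the iterator traverses the entire first substream, and I don't understand why. Although iterator() is a terminal operation, it is clearly stated that it should not consume the stream.
I need the objects produced by the substreams to be generated one at a time.
To reproduce the problem, I mocked my real code with a sample that shows the same behavior: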
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class FreeRunner {

    public static void main(String[] args) {
        AtomicInteger x = new AtomicInteger();
        Iterator<C> iterator = Stream.generate(() -> null)
                .takeWhile(y -> x.incrementAndGet() < 5)
                .filter(y -> x.get() % 2 == 0)
                .map(n -> new A("A" + x.get()))
                .flatMap(A::getBStream)
                .filter(Objects::nonNull)
                .map(B::toC)
                .iterator();
        while (iterator.hasNext()) {
            System.out.println("after hasNext()");
            C next = iterator.next();
            System.out.println(next);
        }
    }

    private static class A {
        private final String name;

        public A(String name) {
            this.name = name;
            System.out.println(" > created " + name);
        }

        public Stream<B> getBStream() {
            AtomicInteger c = new AtomicInteger();
            return Stream.generate(() -> null)
                    .takeWhile(x -> c.incrementAndGet() < 5)
                    .map(n -> c.get() % 2 == 0 ? null : new B(this.name + "->B" + c.get()));
        }

        public String toString() {
            return name;
        }
    }

    private static class B {
        private final String name;

        public B(String name) {
            this.name = name;
            System.out.println(" >> created " + name);
        }

        public String toString() {
            return name;
        }

        public C toC() {
            return new C(this.name + "+C");
        }
    }

    private static class C {
        private final String name;

        public C(String name) {
            this.name = name;
            System.out.println(" >>> created " + name);
        }

        public String toString() {
            return name;
        }
    }
}
Running it prints:
> created A2
>> created A2->B1
>>> created A2->B1+C
>> created A2->B3
>>> created A2->B3+C
after hasNext()
A2->B1+C
after hasNext()
A2->B3+C
> created A4
>> created A4->B1
>>> created A4->B1+C
>> created A4->B3
>>> created A4->B3+C
after hasNext()
A4->B1+C
after hasNext()
A4->B3+C
Process finished with exit code 0
Debugging makes it clear that iterator.hasNext() triggers the creation of the B and C objects.
The desired behavior is instead:
> created A2
>> created A2->B1
>>> created A2->B1+C
after hasNext()
A2->B1+C
>> created A2->B3
>>> created A2->B3+C
after hasNext()
A2->B3+C
> created A4
>> created A4->B1
>>> created A4->B1+C
after hasNext()
A4->B1+C
>> created A4->B3
>>> created A4->B3+C
after hasNext()
A4->B3+C
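What am I missing here?
Best answer
I found a way out, but I had to sacrifice the laziness of the main stream. As I posted in the comments above, the problem I was trying to simplify with the mock code is about reading an Excel file sheet by sheet (filtered by sheet name) and iterating over all rows to create objects from the data in the spreadsheet.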
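The original idea still looks sound to me, but apparently the Stream.iterator() implementation consumes the entire nested stream of the first A object on the first hasNext() call after the iterator is created.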
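The buffering can be observed in isolation with a much smaller pipeline. The following is a minimal sketch, not taken from the original post: the class name FlatMapBufferingDemo and the method producedOnFirstHasNext are made up for illustration. It counts how many inner elements have passed through peek() by the time the first hasNext() returns. On the JDK versions where this behavior has been reported, the count equals the full size of the inner stream rather than 1, but that is an implementation detail and not a documented contract, so treat the exact number as an assumption.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class FlatMapBufferingDemo {

    // Hypothetical helper: returns how many inner elements were produced
    // by the time the first hasNext() call has returned.
    public static int producedOnFirstHasNext() {
        AtomicInteger produced = new AtomicInteger();
        Iterator<Integer> it = Stream.of("outer")
                .flatMap(o -> Stream.of(1, 2, 3, 4)
                        .peek(i -> produced.incrementAndGet()))
                .iterator();
        it.hasNext(); // only asks whether at least one element is available
        return produced.get();
    }

    public static void main(String[] args) {
        System.out.println("inner elements produced by first hasNext(): "
                + producedOnFirstHasNext());
    }
}
```

If the printed count is larger than 1, the iterator has pulled more of the substream than a single hasNext() strictly needs, which matches what the output of the original example shows.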
So I gave up on flatMap() and used reduce(Stream::concat) to concatenate all the streams produced by A.getBStream():
public static void main(String[] args) {
    AtomicInteger x = new AtomicInteger();
    Iterator<C> it = Stream.generate(() -> null)
            .takeWhile(y -> x.incrementAndGet() < 5)
            .filter(y -> x.get() % 2 == 0)
            .map(a -> new A("A" + x.get()))
            .map(A::getBStream)
            .filter(Objects::nonNull)
            .reduce(Stream::concat)
            .orElseGet(Stream::empty)
            .filter(Objects::nonNull)
            .map(B::toC)
            .iterator();
    while (it.hasNext()) {
        System.out.println("after hasNext()");
        C next = it.next();
        System.out.println(next);
    }
}
This produces the following output:
> created A2
> created A4
>> created A2->B0
>>> created A2->B0+C
after hasNext()
A2->B0+C
>> created A2->B1
>>> created A2->B1+C
after hasNext()
A2->B1+C
>> created A2->B2
>>> created A2->B2+C
after hasNext()
A2->B2+C
>> created A2->B3
>>> created A2->B3+C
after hasNext()
A2->B3+C
>> created A2->B4
>> created A4->B0
>>> created A4->B0+C
after hasNext()
A4->B0+C
>> created A4->B1
>>> created A4->B1+C
after hasNext()
A4->B1+C
>> created A4->B2
>>> created A4->B2+C
after hasNext()
A4->B2+C
>> created A4->B3
>>> created A4->B3+C
after hasNext()
A4->B3+C
>> created A4->B4
The price paid is that A2 and A4 are created up front, but all the B objects are still generated lazily.
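If creating the outer objects up front is too expensive, one possible alternative is to skip flatMap() and reduce() altogether and flatten by hand at the iterator level. The following is a rough sketch under stated assumptions, not part of the original answer: the class LazyFlatten and the method flatMapLazily are hypothetical names, and the sketch never closes the inner streams, which could matter when they wrap resources such as open files. It opens an inner stream only when the previous one is exhausted, and pulls from it through its own iterator().

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.stream.Stream;

public class LazyFlatten {

    // Hypothetical helper: flattens lazily, advancing the outer iterator
    // only when the current inner iterator has no more elements.
    public static <T, R> Iterator<R> flatMapLazily(
            Iterator<T> outer,
            Function<? super T, ? extends Stream<? extends R>> mapper) {
        return new Iterator<R>() {
            private Iterator<? extends R> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Skip over empty inner streams until an element is found
                // or the outer iterator runs out.
                while (!inner.hasNext()) {
                    if (!outer.hasNext()) {
                        return false;
                    }
                    inner = mapper.apply(outer.next()).iterator();
                }
                return true;
            }

            @Override
            public R next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> it = flatMapLazily(
                Stream.of("A2", "A4").iterator(),
                a -> Stream.of(1, 3).map(i -> {
                    System.out.println(" >> created " + a + "->B" + i);
                    return a + "->B" + i;
                }));
        while (it.hasNext()) {
            System.out.println("after hasNext()");
            System.out.println(it.next());
        }
    }
}
```

With the classes from the question, the call would look something like flatMapLazily(aStream.iterator(), A::getBStream), followed by the null filtering and B::toC mapping inside the consuming loop. Each hasNext() should then trigger at most one element of work on the current inner stream, as long as the inner pipelines do not themselves contain a flatMap().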
Source: the Stack Overflow question "Java stream's iterator forces flatMap to traverse substream before getting the first item": https://stackoverflow.com/questions/67374235/