java - Did I find a bug in java.io.PipedInputStream?

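I'm not certain, but I'm fairly sure I have found a bug (or an undocumented feature) in the Oracle Java implementation (I can confirm that 1.7.0_67 and 1.8.0_31 are affected).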

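Symptoms

When the pipe is full, a write to the pipe can block for up to one second longer than it takes for space in the pipe to become free again. A minimal example of the problem follows (I have pushed the example shown here to a repository on GitHub):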

private static void threadA() throws IOException, InterruptedException {
  logA("Filling pipe...");
  pos.write(new byte[5]);
  logA("Pipe full. Writing one more byte...");
  pos.write(0);
  logA("Done.");
}

private static void threadB() throws IOException, InterruptedException {
  logB("Sleeping a bit...");
  Thread.sleep(100);
  logB("Making space in pipe...");
  pis.read();
  logB("Done.");
}

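pis and pos are a connected PipedInputStream and PipedOutputStream instance, respectively. logA and logB are helper functions that print the thread name (A or B), a timestamp in milliseconds, and the message. The output is as follows: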

     0 A: Filling pipe...
     6 B: Sleeping a bit...
     7 A: Pipe full. Writing one more byte...
   108 B: Making space in pipe...
   109 B: Done.
  1009 A: Done.
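The two methods above omit the stream setup, so here is a minimal self-contained sketch for reproducing the measurement. It assumes a 5-byte pipe (the question only implies this, since 5 bytes fill the pipe), and the class and method names are made up for illustration. The main thread plays the role of thread B so that the reader stays alive while the writer is blocked.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class PipeStallDemo {
    /** Returns how long (in ms) the writer stayed blocked in write(0) on a full pipe. */
    static long measureBlockedMillis() throws IOException, InterruptedException {
        PipedOutputStream pos = new PipedOutputStream();
        // Assumption: a 5-byte pipe, so that write(new byte[5]) fills it exactly.
        PipedInputStream pis = new PipedInputStream(pos, 5);
        long[] blockedNanos = new long[1];

        Thread writer = new Thread(() -> {
            try {
                pos.write(new byte[5]);              // fills the pipe
                long start = System.nanoTime();
                pos.write(0);                        // blocks until space is available
                blockedNanos[0] = System.nanoTime() - start;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "A");
        writer.start();

        Thread.sleep(100);   // give the writer time to hit the full pipe
        pis.read();          // frees one byte of space
        writer.join();       // join() makes blockedNanos visible to this thread

        pos.close();
        pis.close();
        return blockedNanos[0] / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Writer blocked for " + measureBlockedMillis() + " ms");
    }
}
```

On an affected JDK the reported time should be close to the full wait timeout rather than the 100 ms the reader actually needed, but the exact figure depends on the JDK and on thread scheduling.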

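As you can see, A: Done arrives 900 ms after B: Done, which is roughly one second (1000 ms) after A started waiting at 7 ms. This is caused by the implementation of PipedInputStream in Oracle Java 1.7.0_67, which looks like this: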

private void awaitSpace() throws IOException {
    while (in == out) {
        checkStateForReceive();

        /* full: kick any waiting readers */
        notifyAll();
        try {
            wait(1000);
        } catch (InterruptedException ex) {
            throw new java.io.InterruptedIOException();
        }
    }
}

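wait(1000) returns only when the timeout expires (1000 ms, as seen above) or when notifyAll() is called, and the latter happens only in the following cases:

In awaitSpace(), just before the wait(1000), as we can see in the snippet above
In receivedLast(), which is called when the stream is closed (not applicable here)
In read(), but only while read() is waiting for an empty buffer to fill up, which is also not applicable here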

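Question

Does anyone have enough Java experience to tell me whether this is intended behavior? The awaitSpace() method is used by PipedOutputStream.write(...) to wait for free space, and its contract simply states: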

This method blocks until all the bytes are written to the output stream.

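Strictly speaking this is not violated, but a one-second wait seems rather long. If I were to fix this (minimizing the wait time), I would suggest inserting a notifyAll() at the end of every read so that waiting writers are notified. To avoid the extra time overhead of synchronization, a simple boolean flag could be used (without compromising thread safety).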
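The idea proposed above can be illustrated with a small stand-alone bounded buffer. This is only a sketch of the general technique, not the JDK's code and not a patch for PipedInputStream. The class name NotifyingPipe and the field writerWaiting are hypothetical, and using the flag to skip unneeded notifyAll() calls is one possible reading of the "boolean flag" suggestion.

```java
class NotifyingPipe {
    private final byte[] buffer;
    private int head;                  // index of the next byte to read
    private int count;                 // number of bytes currently stored
    private boolean writerWaiting;     // true while a writer is parked on a full buffer

    NotifyingPipe(int size) {
        buffer = new byte[size];
    }

    synchronized void write(int b) throws InterruptedException {
        while (count == buffer.length) {
            writerWaiting = true;      // tell readers that a wake-up is needed
            wait();                    // no timeout needed: read() notifies us
        }
        buffer[(head + count) % buffer.length] = (byte) b;
        count++;
        notifyAll();                   // wake readers waiting for data
    }

    synchronized int read() throws InterruptedException {
        while (count == 0) {
            wait();                    // woken by write()
        }
        int b = buffer[head] & 0xFF;
        head = (head + 1) % buffer.length;
        count--;
        if (writerWaiting) {           // notify only if a writer is actually blocked
            writerWaiting = false;
            notifyAll();
        }
        return b;
    }

    public static void main(String[] args) throws Exception {
        NotifyingPipe pipe = new NotifyingPipe(2);
        pipe.write(1);
        pipe.write(2);                 // buffer is now full
        Thread writer = new Thread(() -> {
            try {
                pipe.write(3);         // blocks until the read below frees a slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        Thread.sleep(50);
        System.out.println("read " + pipe.read());
        writer.join();
        System.out.println("writer finished");
    }
}
```

Because the flag is only read and written while holding the monitor, it adds no synchronization of its own. If several writers are blocked, all of them are woken and those that still find the buffer full set the flag again.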

Affected Java versions

So far I have been able to verify this on Java 7 and Java 8, specifically the following versions:

$ java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

$ java -version
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)

Best answer

This is a well-known problem with Piped*Stream, and the final resolution (for JDK-8014239) was "Won't Fix".

JDK-4545831: PipedInputStream performance problems

The class blocks for read when the buffer is empty and blocks for write when the buffer is full. It blocks by calling wait(1000), however a reader will only be woken by a writer who encounters a full buffer (or the wait times out) and a writer will only be woken by a reader who encounters an empty buffer (or the wait times out).

Customer Workaround : Notify()ing the PipedInputStream after every read()/write() would probably solve the problem, but still results in suboptimal performance as many unnecessary notify() calls are being made.
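The workaround quoted above might look roughly like the following sketch. It relies on an implementation detail visible in the awaitSpace() snippet from the question: the blocked writer waits on the PipedInputStream's own monitor, so notifying that object from outside wakes it. The helper readAndNotify and the class name are hypothetical, and the 5-byte pipe size is an assumption carried over from the question's example.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class NotifyAfterReadDemo {
    /** Reads one byte, then wakes any thread waiting on the stream's monitor. */
    static int readAndNotify(PipedInputStream in) throws IOException {
        int b = in.read();
        synchronized (in) {
            in.notifyAll();            // wakes a writer parked in awaitSpace()
        }
        return b;
    }

    /** Returns how long (in ms) the writer stayed blocked on the full pipe. */
    static long measureBlockedMillis() throws IOException, InterruptedException {
        PipedOutputStream pos = new PipedOutputStream();
        PipedInputStream pis = new PipedInputStream(pos, 5);   // assumed pipe size
        long[] blockedNanos = new long[1];

        Thread writer = new Thread(() -> {
            try {
                pos.write(new byte[5]);                        // fills the pipe
                long start = System.nanoTime();
                pos.write(0);                                  // blocks until notified
                blockedNanos[0] = System.nanoTime() - start;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "A");
        writer.start();

        Thread.sleep(100);             // let the writer block first
        readAndNotify(pis);            // free one byte and wake the writer
        writer.join();

        pos.close();
        pis.close();
        return blockedNanos[0] / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Writer blocked for " + measureBlockedMillis() + " ms");
    }
}
```

As the bug report notes, this issues a notification on every read whether or not anyone is waiting, so it trades the one-second stall for extra monitor traffic.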

JDK-4404700: PipedInputStream too slow due to polling (alt implementation proposed)

The java.io.PipedInputStream is too slow because it polls to check for new data. Every second it tests if new data is available. When data is available it potentially wastes almost a second. It also has an unsettable small buffer. I propose to consider the following implementation of both PipedInputStream and PipedOutputStream, which is simpler and much faster.

BT2:EVALUATION

We should keep this around as a target of opportunity for merlin and tiger. Due to the age of the classes the submitted code is designed to replace, there may be compatibility issues involved in using it.

JDK-8014239: PipedInputStream not notifying waiting readers on receive

When reading/writing from PipedInputStream/PipedOutputStream pair, read() blocks exactly for one second when new data is written into PipedOutputStream. The reason for this is that PipedInputStream only wakes waiting readers, when during receive() the buffer is filled. The solution is very simple, add a notifyAll() at the end of both receive() methods in PipedInputStream.

It's not obvious how the majority of real-life scenarios would benefit from the proposed change. Per-write notifications may lead to an unnecessary writer stalls. Thus defeating one of the main purposes of a pipe -- time-decoupling readers from writers and buffering. PipedInputStream/PipedWriter API gives us a flexible way to control how often we would like the reader(s) to be notified on new data. Namely, flush(). Calling flush() at the right time we can control latency and throughput.
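The evaluation above points to flush() as the supported way to wake readers. A minimal sketch of that usage follows, comparing how long a blocked reader takes to see a byte with and without a flush() after the write. The class and method names are illustrative only, and the measured values will vary with the JDK and with scheduling.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class FlushLatencyDemo {
    /** Returns the delay (in ms) between write() and the blocked reader returning. */
    static long readerLatencyMillis(boolean flush) throws IOException, InterruptedException {
        PipedOutputStream pos = new PipedOutputStream();
        PipedInputStream pis = new PipedInputStream(pos);
        long[] readDone = new long[1];

        Thread reader = new Thread(() -> {
            try {
                pis.read();                          // blocks while the pipe is empty
                readDone[0] = System.nanoTime();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "reader");
        reader.start();

        Thread.sleep(100);                           // let the reader start waiting
        long written = System.nanoTime();
        pos.write(42);
        if (flush) {
            pos.flush();                             // notifies threads waiting on the pipe
        }
        reader.join();                               // writer thread stays alive meanwhile

        pos.close();
        pis.close();
        return (readDone[0] - written) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without flush: " + readerLatencyMillis(false) + " ms");
        System.out.println("with flush:    " + readerLatencyMillis(true) + " ms");
    }
}
```

Note that flush() addresses the reader side only. It does not help the scenario in the question, where a writer is waiting for a reader to free space.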

This post, "java - Did I find a bug in java.io.PipedInputStream?", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/28617175/
