node.js - NodeJS通过流复制文件非常慢

标签 node.js performance file-io stream pipe

我在 VMWare 下的 SSD 上使用 Node 复制文件,但性能非常低。我为测量实际速度而运行的基准如下:

$ hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   12004 MB in  1.99 seconds = 6025.64 MB/sec
 Timing buffered disk reads: 1370 MB in  3.00 seconds = 456.29 MB/sec

但是,以下复制文件的 Node 代码非常慢,即使后续运行也不会使其更快:

var fs  = require("fs");
fs.createReadStream("bigfile").pipe(fs.createWriteStream("tempbigfile"));

运行如下:

$ seq 1 10000000 > bigfile
$ ll bigfile -h
-rw-rw-r-- 1 mustafa mustafa 848M Jun  3 03:30 bigfile
$ time node test.js 

real    0m4.973s
user    0m2.621s
sys     0m7.236s
$ time node test.js 

real    0m5.370s
user    0m2.496s
sys     0m7.190s

这里有什么问题,我该如何加快速度?我相信我可以通过调整缓冲区大小在 C 中更快地编写它。让我感到困惑的是,当我编写简单的几乎 pv 等效程序时,将 stdin 连接到 stdout 如下所示,速度非常快。

process.stdin.pipe(process.stdout);

运行如下:

$ dd if=/dev/zero bs=8M count=128 | pv | dd of=/dev/null
128+0 records in 174MB/s] [        <=>                                                                                ]
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.78077 s, 186 MB/s
   1GB 0:00:05 [ 177MB/s] [          <=>                                                                              ]
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.78131 s, 186 MB/s
$ dd if=/dev/zero bs=8M count=128 |  dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.57005 s, 193 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.5704 s, 193 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.61734 s, 233 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.62766 s, 232 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.22107 s, 254 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.23231 s, 254 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.70124 s, 188 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.70144 s, 188 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.51055 s, 238 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.52087 s, 238 MB/s

最佳答案

我不知道你的问题的答案,但也许这有助于你调查问题。

在 Node.js 中 documentation关于流缓冲,它说:

Both Writable and Readable streams will store data in an internal buffer that can be retrieved using writable.writableBuffer or readable.readableBuffer, respectively.

The amount of data potentially buffered depends on the highWaterMark option passed into the stream's constructor. For normal streams, the highWaterMark option specifies a total number of bytes. For streams operating in object mode, the highWaterMark specifies a total number of objects....

A key goal of the stream API, particularly the stream.pipe() method, is to limit the buffering of data to acceptable levels such that sources and destinations of differing speeds will not overwhelm the available memory.

因此,您可以使用缓冲区大小来提高速度:

var fs = require('fs');
var path = require('path');
var from = path.normalize(process.argv[2]);
var to = path.normalize(process.argv[3]);

var readOpts = {highWaterMark: Math.pow(2,16)};  // 65536
var writeOpts = {highWaterMark: Math.pow(2,16)}; // 65536  

var source = fs.createReadStream(from, readOpts);
var destiny = fs.createWriteStream(to, writeOpts)

source.pipe(destiny);

关于node.js - NodeJS通过流复制文件非常慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24005496/

相关文章:

node.js - VS2013 R3 NodeJS项目加载慢

python - Python 是否支持零拷贝 I/O?

python - 如果我不关闭 txt 文件会发生什么

javascript - 将 React 与 Express 混合?

javascript - 安全地解析和评估用户输入

javascript - 使用node.js通过链接创建私有(private)房间

php - Heroku 1x 与 2x 网络测功机

c# - 在 C# 中搜索 ASCII 文件以获取简单关键字的最快方法?

java - java在不同硬件上的性能?

java - 如何确定java中文件每一行的字节数?