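javascript - Reading a file in chunks using AJAX + JavaScript

So, here is my problem: I have a large text file (about 150 MB) containing hundreds of thousands of lines. I need to read the contents of the file, parse it so that the lines are wrapped in the appropriate HTML tags, and write the result to a document opened with window.document.open().

My code works for files up to about 50 MB in size.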

var rawFile = new XMLHttpRequest();
rawFile.open("GET", file, true);
rawFile.onreadystatechange = function () {
    if (rawFile.readyState === 4) {
        if (rawFile.status === 200 || rawFile.status === 0) {
            // the entire response is held in memory as one string
            var allText = rawFile.responseText;
            var contents = allText.split("\n");
            var w = window.open();
            w.document.open();
            for (var i = 0; i < contents.length; i++) {
                // logic that builds str = appropriate tags + contents[i]
                w.document.write(str);
            }
            w.document.close();
        }
    }
};
rawFile.send(null);

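The code works. The logic works. But if the file is larger than 100 MB or so, Chrome crashes. I think reading the file in chunks and then writing each chunk to the document opened with window.document.open() would solve this for me.

Any suggestions on how to achieve this are much appreciated. Thanks :)

(Please ignore any mistakes in the code I posted above; my actual code is very large, so I only wrote a condensed version.)

Best answer

Your approach will cripple the browser because you are processing the entire response at once. A better approach is to break the process up so that you handle smaller chunks, or to stream the file through your process.

Using the Fetch API instead of XMLHttpRequest gives you access to the data as a stream. A big advantage of using a stream is that you don't have to hold the whole response in the browser's memory while you process the content.

The following code outlines how to use a stream to perform the task: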

var file_url = 'URL_TO_FILE';
// @link https://developer.mozilla.org/en-US/docs/Web/API/Request/Request
var myRequest = new Request( file_url );
// fetch returns a promise
fetch(myRequest)
  .then(function(response) {
    var contentLength = response.headers.get('Content-Length');
    // response.body is a readable stream
    // @link https://learn.microsoft.com/en-us/microsoft-edge/dev-guide/performance/streams-api
    var myReader = response.body.getReader();
    // the reader result will need to be decoded to text
    // @link https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/TextDecoder 
    var decoder = new TextDecoder();
    // accumulate decoded text in a buffer until it can be processed
    var buffer = '';
    // you could use the number of bytes received to implement a progress indicator
    var received = 0;
    // read() returns a promise
    myReader.read().then(function processResult(result) {
      // the result object contains two properties:
      // done  - true if the stream is finished
      // value - the data
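      if (result.done) {
        // flush any bytes the decoder is still holding back
        buffer += decoder.decode();
        /* process whatever is left in the buffer */
        return;
      }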
      // update the number of bytes received total
      received += result.value.length;
      // result.value is a Uint8Array so it will need to be decoded
      // buffer the decoded text before processing it
      buffer += decoder.decode(result.value, {stream: true});
      /* process the buffer string */

      // read the next piece of the stream and process the result
      return myReader.read().then(processResult);
    })
  })
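
The `{stream: true}` option passed to `decoder.decode()` above matters because the chunk boundaries are arbitrary byte positions and may fall in the middle of a multi-byte character. The following is a minimal sketch of that effect, not part of the original answer; `decodeChunks` and the sample bytes are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: decode a list of Uint8Array chunks the same way
// the read loop does, with or without the streaming option.
function decodeChunks(chunks, streaming) {
  const decoder = new TextDecoder(); // defaults to UTF-8
  let text = '';
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: streaming });
  }
  // In streaming mode, a final call with no arguments flushes any
  // bytes the decoder is still holding back.
  if (streaming) {
    text += decoder.decode();
  }
  return text;
}

// "é" is two bytes in UTF-8 (0xC3 0xA9); split it across two chunks.
const sampleChunks = [
  new Uint8Array([0x61, 0xC3]),
  new Uint8Array([0xA9, 0x62])
];

const withStream = decodeChunks(sampleChunks, true);     // "aéb"
const withoutStream = decodeChunks(sampleChunks, false); // contains U+FFFD replacement characters
```

Without the streaming option each chunk is decoded in isolation, so the split character is replaced by U+FFFD on both sides of the boundary.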

I haven't implemented the code that processes the buffer, but the algorithm would be as follows:

If the buffer contains a newline character:
    Split the buffer into an array of lines
If there is still more data to read:
    Save the last array item because it may be an incomplete line
    Do this by setting the content of the buffer to that of the last array item
Process each line in the array

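A quick look at Can I Use tells me this won't work in IE, because the Fetch API wasn't implemented before the Edge browser. However, there's no need to despair: as always, some kind soul has implemented a polyfill for unsupported browsers.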
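
Since support varies, it may be worth detecting the required features at run time before choosing the streaming path. The sketch below is an assumption about how one might do that; `supportsStreamingFetch` and the stand-in objects are hypothetical. A fetch polyfill may not provide a streaming `response.body`, so the check looks for that property separately.

```javascript
// Hypothetical helper: reports whether an environment object (normally
// `window`) exposes the pieces the streaming example relies on.
function supportsStreamingFetch(env) {
  return typeof env.fetch === 'function' &&
    typeof env.ReadableStream === 'function' &&
    typeof env.TextDecoder === 'function' &&
    typeof env.Response === 'function' &&
    'body' in env.Response.prototype;
}

// Stand-in objects for illustration only; in a page you would pass `window`.
function FakeResponse() {}
Object.defineProperty(FakeResponse.prototype, 'body', {
  get: function () { return null; }
});
const modernLike = {
  fetch: function () {},
  ReadableStream: function () {},
  TextDecoder: function () {},
  Response: FakeResponse
};
const legacyLike = {}; // e.g. a browser without fetch

const modernOk = supportsStreamingFetch(modernLike); // true
const legacyOk = supportsStreamingFetch(legacyLike); // false
```

If the check fails, the page could fall back to the XMLHttpRequest approach from the question for files small enough to handle in one piece.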

This question on reading a file in chunks using AJAX + JavaScript comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/44493340/
