node.js - Unable to pipe a file read stream from Google Cloud Storage to the Google Drive API

Tags: node.js google-app-engine google-cloud-platform google-drive-api

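I am working on a project in which I read mind map files created with SimpleMind from Google Drive, modify them, and upload them back to Google Drive.

The SMMX files that SimpleMind creates are zip files containing XML files and media files.

My program works fine when I run it locally: the changes I make to the mind maps show up in SimpleMind.

I now want to run the program on Google Cloud Platform using App Engine.

Because of security restrictions, I cannot simply write the files I download from Google Drive to the file system of the application server in the cloud. Instead, I created a storage bucket to store the files in.

When I do this, however, my file gets corrupted: after my program has run, the file no longer contains the zip data but a JSON document, apparently a string representation of the read stream.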

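Running locally - works

Here is a simplified version of my code. It omits the actual modification of the zip file, which is irrelevant to the problem, as well as any error handling; no errors ever occur.

When I run the code locally, I use a write stream and a read stream to save and load the file on the local file system: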

#!/usr/bin/env node

const { readFileSync, createReadStream, createWriteStream } = require('fs');
const { google } = require('googleapis');

const tokenPath = 'google-drive-token.json';
const clientId = 'xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com';
const redirectUri = 'urn:ietf:wg:oauth:2.0:oob';
const clientSecret = 'xxxxxxxxxxxxxxxxxxxxxxxx';
const fileId = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
const fileName = 'deleteme.smmx';

(async () => {
  const auth = new google.auth.OAuth2(clientId, clientSecret, redirectUri);
  const token = JSON.parse(readFileSync(tokenPath));
  auth.setCredentials(token);
  const writeStream = createWriteStream(fileName);
  const drive = google.drive({ version: 'v3', auth });
  let progress = 0;
  const res = await drive.files.get({ fileId, alt: 'media' }, { responseType: 'stream' });
  await new Promise(resolve => {
    res.data.on('data', d => (progress += d.length)).pipe(writeStream);
    writeStream.on('finish', () => {
      console.log(`Done downloading file ${fileName} from Google Drive to local file system (${progress} bytes)`);
      resolve();
    });
  });
  const readStream = createReadStream(fileName);
  progress = 0;
  const media = {
    mimeType: 'application/x-zip',
    body: readStream
      .on('data', d => {
        progress += d.length;
      })
      .on('end', () => console.log(`${progress} bytes read from local file system`))
  };
  await drive.files.update({
    fileId,
    media
  });
  console.log(`File ${fileName} successfully uploaded to Google Drive`);
})();

When I run this script locally it works fine, and the program output is always:

Done downloading file deleteme.smmx from Google Drive to local file system (371 bytes)

371 bytes read from local file system

File deleteme.smmx successfully uploaded to Google Drive



I can run it as many times as I like; each run creates a new version of the file on Google Drive, and each version is 371 bytes in size.
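The download step in the script resolves its promise only on the write stream's finish event, so a failed transfer would leave it pending. As a minimal sketch, and not part of the original script, the same byte-counting copy could be written with Node's stream.pipeline, which also rejects on errors. The helper name copyStream is hypothetical.

```javascript
const { pipeline } = require('stream');
const { promisify } = require('util');

// Promise-returning variant of stream.pipeline (available since Node.js 10).
const pipelineAsync = promisify(pipeline);

// Hypothetical helper: copies source into destination and returns the number
// of bytes that passed through. Rejects if either stream emits an error.
async function copyStream(source, destination) {
  let bytes = 0;
  source.on('data', chunk => {
    bytes += chunk.length;
  });
  await pipelineAsync(source, destination);
  return bytes;
}

module.exports = { copyStream };
```

In the script above this could replace the hand-written promise, for example `const bytes = await copyStream(res.data, writeStream);`.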

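Running in Google Cloud - does not work

Here is a version of the script above that I use to try to do the same thing in Google Cloud, that is, downloading a file from Google Drive and uploading it back to Google Drive, running on App Engine:
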
const { readFileSync } = require('fs');
const { google } = require('googleapis');
const { Storage } = require('@google-cloud/storage');

const tokenPath = 'google-drive-token.json';
const clientId = 'xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com';
const redirectUri = 'urn:ietf:wg:oauth:2.0:oob';
const clientSecret = 'xxxxxxxxxxxxxxxxxxxxxxxx';
const fileId = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
const fileName = 'deleteme.smmx';
const storageBucketId = 'xxxxxxxxxxx';

module.exports = async () => {
  const auth = new google.auth.OAuth2(clientId, clientSecret, redirectUri);
  const token = JSON.parse(readFileSync(tokenPath));
  auth.setCredentials(token);
  const storage = new Storage();
  const bucket = storage.bucket(storageBucketId);
  const file = bucket.file(fileName);
  const writeStream = file.createWriteStream({ resumable: false });
  const drive = google.drive({ version: 'v3', auth });
  let progress = 0;
  const res = await drive.files.get({ fileId, alt: 'media' }, { responseType: 'stream' });
  await new Promise(resolve => {
    res.data.on('data', d => (progress += d.length)).pipe(writeStream);
    writeStream.on('finish', () => {
      console.log(`Done downloading file ${fileName} from Google Drive to Cloud bucket (${progress} bytes)`);
      resolve();
    });
  });
  const readStream = file.createReadStream();
  progress = 0;
  const media = {
    mimeType: 'application/x-zip',
    body: readStream
      .on('data', d => {
        progress += d.length;
      })
      .on('end', () => console.log(`${progress} bytes read from storage`))
  };
  await drive.files.update({
    fileId,
    media
  });
  console.log(`File ${fileName} successfully uploaded to Google Drive`);
  return 0;
};

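The only difference here is that instead of createWriteStream and createReadStream from the Node.js fs module, I use the corresponding methods file.createWriteStream and file.createReadStream from the Google Cloud Storage library.

When I run this code on App Engine in the cloud for the first time, everything seems fine, and the output is the same as when I run it locally: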

Done downloading file deleteme.smmx from Google Drive to Cloud bucket (371 bytes)

371 bytes read from storage

File deleteme.smmx successfully uploaded to Google Drive



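However, when I look at the latest version of the file in the Google Drive web front end, it is no longer my smmx file but a JSON file that looks like a string representation of the read stream: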
{
  "_readableState": {
    "objectMode": false,
    "highWaterMark": 16384,
    "buffer": { "head": null, "tail": null, "length": 0 },
    "length": 0,
    "pipes": null,
    "pipesCount": 0,
    "flowing": true,
    "ended": false,
    "endEmitted": false,
    "reading": false,
    "sync": false,
    "needReadable": true,
    "emittedReadable": false,
    "readableListening": false,
    "resumeScheduled": true,
    "paused": false,
    "emitClose": true,
    "destroyed": false,
    "defaultEncoding": "utf8",
    "awaitDrain": 0,
    "readingMore": false,
    "decoder": null,
    "encoding": null
  },
  "readable": true,
  "_events": {},
  "_eventsCount": 4,
  "_writableState": {
    "objectMode": false,
    "highWaterMark": 16384,
    "finalCalled": false,
    "needDrain": false,
    "ending": false,
    "ended": false,
    "finished": false,
    "destroyed": false,
    "decodeStrings": true,
    "defaultEncoding": "utf8",
    "length": 0,
    "writing": false,
    "corked": 0,
    "sync": true,
    "bufferProcessing": false,
    "writecb": null,
    "writelen": 0,
    "bufferedRequest": null,
    "lastBufferedRequest": null,
    "pendingcb": 0,
    "prefinished": false,
    "errorEmitted": false,
    "emitClose": true,
    "bufferedRequestCount": 0,
    "corkedRequestsFree": { "next": null, "entry": null }
  },
  "writable": true,
  "allowHalfOpen": true,
  "_transformState": {
    "needTransform": false,
    "transforming": false,
    "writecb": null,
    "writechunk": null,
    "writeencoding": null
  },
  "_destroyed": false
}

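It seems that piping the read stream from the Cloud Storage bucket into the write stream that uploads to Google Drive does not work the way I hoped.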
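One way to narrow this down is to check what kind of object file.createReadStream() returns. The sketch below rests on an assumption that the question does not confirm: that the Drive client only streams bodies it recognizes as core Node.js Readable instances and serializes anything else as JSON. The helper describeStream is hypothetical.

```javascript
const { Readable, PassThrough } = require('stream');

// Hypothetical diagnostic: reports how a stream-like object looks to code
// that relies on `instanceof stream.Readable` rather than duck typing.
function describeStream(candidate) {
  return {
    constructorName:
      candidate && candidate.constructor ? candidate.constructor.name : typeof candidate,
    isCoreReadable: candidate instanceof Readable,
    hasPipe: Boolean(candidate) && typeof candidate.pipe === 'function',
  };
}

// A core stream passes the instanceof check.
console.log(describeStream(new PassThrough()));
// A stream-like object from another implementation may have pipe() yet fail it.
console.log(describeStream({ pipe() {} }));

module.exports = { describeStream };
```

In the App Engine script, this could be called as `describeStream(file.createReadStream())` just before the upload.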

What am I doing wrong? What do I need to change so that my code runs correctly in the cloud?

If you are interested, the full source code of my project can be found on GitHub.

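Update: workaround

I found a workaround for this problem:
  • Read the data from the read stream of the Cloud Storage bucket into a buffer
  • Create a readable stream from this buffer, as described in this tutorial
  • Pass this "buffer stream" to the drive.files.update method

This way, the zip file on Google Drive is not corrupted, and a new version with the same content as before is stored, as expected.

However, I find this rather ugly. With large mind map files, for example ones containing many images, it puts a strain on the server because the entire content of the file has to be held in memory.

I would much prefer to get the direct pipe from the Cloud Storage bucket to the Google Drive API working.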
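The workaround steps can be sketched roughly as follows. The tutorial's exact code is not reproduced in the question, so the helper names streamToBuffer and bufferToStream are hypothetical, and this is only one common way to build a readable stream from a buffer.

```javascript
const { Readable } = require('stream');

// Collect every chunk of a readable stream into a single Buffer.
// The whole file ends up in memory, which is the drawback noted above.
function streamToBuffer(readable) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readable.on('data', chunk => chunks.push(chunk));
    readable.on('end', () => resolve(Buffer.concat(chunks)));
    readable.on('error', reject);
  });
}

// Wrap a Buffer in a core Node.js Readable stream.
function bufferToStream(buffer) {
  const readable = new Readable({ read() {} });
  readable.push(buffer);
  readable.push(null); // signal end of data
  return readable;
}

module.exports = { streamToBuffer, bufferToStream };
```

In the App Engine script this would be used as `const buffer = await streamToBuffer(file.createReadStream());` followed by passing `bufferToStream(buffer)` as the `body` of `media`.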

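Best answer

Apparently, you can use a PassThrough stream:

const stream = require('stream');

// storage, bucketName, object and uploadFileToGDrive are not defined in this snippet
const file = storage.bucket(bucketName).file(object.name);
const fileStream = file.createReadStream();

// Pipe the Cloud Storage stream through a core Node.js PassThrough stream
const dataStream = new stream.PassThrough();
fileStream.pipe(dataStream);

await uploadFileToGDrive(dataStream, {
  name: object.name,
  mimeType: object.contentType,
  parents: ['shared_dir_in_g_drive'],
});

Source: https://github.com/googleapis/google-api-nodejs-client/issues/2015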
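Applied to the script from the question, the same idea could look like the sketch below. The helper name toCoreStream is hypothetical, and the error forwarding is an addition that the answer does not mention; pipe() on its own does not propagate errors from the source to the destination.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: pipes any stream-like source through a core Node.js
// PassThrough stream and returns that PassThrough.
function toCoreStream(source) {
  const passThrough = new PassThrough();
  // Forward source errors so the consumer of the PassThrough sees them.
  source.on('error', err => passThrough.destroy(err));
  return source.pipe(passThrough);
}

module.exports = { toCoreStream };
```

With this helper, the `body` of `media` in the App Engine script would become `toCoreStream(file.createReadStream())`, keeping the data streaming rather than buffering the whole file in memory.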

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57151911/
