javascript - How to save scraped data from a client-side browser to a user file? Use Electron?

Tags: javascript file browser google-chrome-extension electron

I want to extract data from a website (probably by scraping) and save it to an external file.

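My first idea was to write a Chrome extension to do this for me, but I could not find out how to save to an external file. (I am new to Chrome extensions.) I searched Stack Overflow and found answers like these: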

"You can't do that in a Chrome extension."
"You can do it, but I won't tell you how. ;)"
"Use localStorage."

localStorage does not write to an external file the user can access, and I may need to save many MB of data.

My second idea is to use Electron and write a dedicated browser for the task. Electron has Node built in, so it can save files.
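To make the "Node can save files" point concrete, here is a minimal sketch using Node's built-in `fs` module. The helper name `saveRecords`, the sample record, and the temp-directory path are all made up for illustration. In an Electron app the path would more likely come from the user, for example through `dialog.showSaveDialog` in the main process.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: serialize scraped records as JSON and write them to disk.
function saveRecords(filePath, records) {
  fs.writeFileSync(filePath, JSON.stringify(records, null, 2), 'utf8');
  return filePath;
}

// Example usage with made-up data and an assumed output path in the temp directory.
const outFile = path.join(os.tmpdir(), 'scraped-example.json');
saveRecords(outFile, [{ url: 'https://example.com', text: 'hello' }]);
```

The same call works in a plain Node script, so Electron is only needed if you also want a browser window around it.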

Before I invest time and effort in this, has anyone already tried it? Are there any pitfalls or obstacles ahead?

Best answer

I am posting this quick example as an answer to follow up on the comments. If you want to test it, you need to run npm install request jsdom first.

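const request = require('request');
const jsdom = require('jsdom');

// Fetch the question page and print every comment found in it.
request(
  'https://stackoverflow.com/questions/51896635/how-to-save-scraped-data-from-client-side-browser-to-a-user-file-use-electron?noredirect=1',
  (err, result, body) => {
    // Stop early if the request itself failed (network error, DNS, etc.).
    if (err) {
      console.error(err);
      return;
    }

    // Parse the returned HTML into a DOM that can be queried like in a browser.
    const dom = new jsdom.JSDOM(body);

    const comments = dom.window.document.querySelectorAll('.comment-copy');
    comments.forEach(comment => console.log(`>>> ${comment.innerHTML}\n`));
  }
);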

The output should be the actual comments on this same page:

>>> Regarding extensions, the authoritative source is the <a href="https://developer.chrome.com/extensions/downloads#method-download" rel="nofollow noreferrer">documentation</a>: they can download the data to a file in the default downloads directory, optionally showing the Save As dialog where the user can manually choose any directory.

>>> You don't really need a browser for that. A simple script (in any scripting language really) should be good for this task. If you want to perform queries on your file, you can either process it later with a different script or you can use Node.js and do everything in a single script; there are a bunch of libraries that simulate DOM objects for Node. Worst case you could even spin up a headless Chrome from Node to do all DOM related tasks.

>>> The "download" will be text that I create in the browser, possibly from multiple web pages.  The documentation suggests that it is only possible to download using a URL, not save something created locally.  Or am I wrong?

>>> @ErickRuizdeChavez Yes, Node looks a good way to go, and Electron gives a convenient framework to house it as an app.

>>> In node you can do whatever you want, it is not constrained by the browser sandbox, so you should be able to do whatever you need. Obviously it is not as straightforward as just dropping some javascript on the browser.
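On the question in the comments about whether an extension can only download from a URL: one possible workaround is to wrap the locally generated text in a `data:` URL and pass that to `chrome.downloads.download`. The sketch below assumes the extension's manifest requests the "downloads" permission. The helper names `textToDataUrl` and `saveTextAsFile` are made up for illustration, and I have not checked how well this approach copes with payloads of many MB.

```javascript
// Build a data: URL from text that was generated locally (no server involved).
function textToDataUrl(text, mimeType = 'text/plain') {
  return `data:${mimeType};charset=utf-8,${encodeURIComponent(text)}`;
}

// Hypothetical helper: ask Chrome to save the text as a file.
// Assumes it runs inside an extension that has the "downloads" permission.
function saveTextAsFile(text, filename) {
  if (typeof chrome === 'undefined' || !chrome.downloads) {
    return false; // not running inside an extension context
  }
  chrome.downloads.download({
    url: textToDataUrl(text),
    filename,     // relative to the default downloads directory
    saveAs: true, // show the Save As dialog so the user can pick a location
  });
  return true;
}
```

Outside an extension (for example in plain Node) `saveTextAsFile` just returns `false`, so the URL-building part can be tried on its own.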

Source: the original question on Stack Overflow, "How to save scraped data from client-side browser to a user file? Use Electron?": https://stackoverflow.com/questions/51896635/
