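Is there a way to track overall data usage in Puppeteer? I'm running a program through different proxies and want to see how much data I'm using.
Best answer
The Puppeteer documentation has an example that uses the page.coverage methods to measure the size of JS and CSS only. I modified it and added an option to save the result to a CSV file.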
https://pptr.dev/#?product=Puppeteer&version=v1.20.0&show=api-class-coverage
const puppeteer = require('puppeteer')
const fs = require('fs-extra')

const filePath = 'datausage.csv'

;(async () => {
  const browser = await puppeteer.launch()
  const [page] = await browser.pages()

  // Enable both JavaScript and CSS coverage
  await Promise.all([
    page.coverage.startJSCoverage(),
    page.coverage.startCSSCoverage()
  ])

  // Navigate to page
  await page.goto('https://www.google.com')

  // Disable both JavaScript and CSS coverage
  const [jsCoverage, cssCoverage] = await Promise.all([
    page.coverage.stopJSCoverage(),
    page.coverage.stopCSSCoverage()
  ])

  let totalBytes = 0
  let usedBytes = 0
  const coverage = [...jsCoverage, ...cssCoverage]
  for (const entry of coverage) {
    // Length of the decoded source text, used as an approximation of its size
    totalBytes += entry.text.length
    for (const range of entry.ranges) {
      usedBytes += range.end - range.start - 1
    }
  }

  // Write the CSV header on the first run, then append one row per run
  if (!(await fs.pathExists(filePath))) {
    await fs.writeFile(filePath, 'totalBytes\n')
  }
  await fs.appendFile(filePath, `${totalBytes}\n`)

  console.log(`Total data used: ${totalBytes / 1048576} MBytes`)
  // console.log(`Bytes used: ${usedBytes / totalBytes * 100}%`)
  await browser.close()
})()
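The loop above adds up entry.text.length, which counts UTF-16 code units of the decoded source rather than bytes. Below is a minimal sketch of a helper that reports UTF-8 byte sizes instead. The name summarizeCoverage is hypothetical, and the entry shape is assumed to match what stopJSCoverage() and stopCSSCoverage() return. The result still reflects uncompressed script and stylesheet text, not what actually travelled through the proxy.

```javascript
// Hypothetical helper: sums the UTF-8 size of coverage entries.
// Each entry is assumed to look like { url, text, ranges: [{ start, end }] }.
function summarizeCoverage(entries) {
  let totalBytes = 0
  let usedBytes = 0
  for (const entry of entries) {
    totalBytes += Buffer.byteLength(entry.text, 'utf8')
    for (const range of entry.ranges) {
      // Ranges are character offsets into entry.text; end is treated as exclusive
      usedBytes += Buffer.byteLength(entry.text.slice(range.start, range.end), 'utf8')
    }
  }
  return { totalBytes, usedBytes }
}

// Usage inside the script above (not run here):
// const { totalBytes, usedBytes } = summarizeCoverage([...jsCoverage, ...cssCoverage])
```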
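If you want more detail, covering images, media, documents, fetch, fonts, and XHR, you can read the content-length response header of every resource Puppeteer requests. Responses sent without that header are not counted. I wrote this code as an example: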
const puppeteer = require('puppeteer')
const fs = require('fs-extra')

const filePath = 'datausage.csv'

;(async () => {
  const browser = await puppeteer.launch({ headless: false })
  const [page] = await browser.pages()

  // Set Request Interception to detect images, fonts, media, and others
  await page.setRequestInterception(true)

  let totalBytes = 0
  page.on('request', request => {
    request.continue()
  })
  page.on('response', response => {
    const headers = response.headers()
    const length = parseInt(headers['content-length'], 10)
    // Skip responses whose content-length header is missing or not a number
    if (!Number.isNaN(length)) {
      totalBytes += length
    }
  })

  // Navigate to page
  await page.goto('https://www.google.com', { waitUntil: 'networkidle0', timeout: 0 })

  // Write the CSV header on the first run, then append one row per run
  if (!(await fs.pathExists(filePath))) {
    await fs.writeFile(filePath, 'totalBytes\n')
  }
  await fs.appendFile(filePath, `${totalBytes}\n`)

  console.log(`Total data used: ${totalBytes / 1048576} MBytes`)
  await browser.close()
})()
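The script above keeps a single total. To get the per-type breakdown mentioned earlier (images, fonts, XHR, and so on), the response handler could key the tally on the request's resource type. This is a minimal sketch under that assumption. The name recordResponse is hypothetical, and responses without a content-length header are still skipped.

```javascript
// Hypothetical helper: adds one response's Content-Length to a per-type tally.
function recordResponse(totals, resourceType, headers) {
  const length = Number.parseInt(headers['content-length'], 10)
  if (Number.isNaN(length)) return totals // header missing or malformed
  totals[resourceType] = (totals[resourceType] || 0) + length
  return totals
}

// Usage inside the script above (not run here):
// const totals = {}
// page.on('response', response => {
//   recordResponse(totals, response.request().resourceType(), response.headers())
// })
```

Each key of totals could then be written as its own CSV column instead of the single totalBytes column.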
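PS: I don't know whether this is accurate, so please try it yourself and check that it matches your actual data usage. If you find it correct, please mark it as the accepted answer.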
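One possible way to cross-check the header-based total is to read the byte counts that Chrome itself reports over the DevTools protocol: the Network.loadingFinished event carries an encodedDataLength field for each finished request. This was not part of the original answer, and the sketch below is only an approximation. The name trackEncodedBytes is hypothetical, and it assumes a CDP session obtained from page.target().createCDPSession(). Failed requests and upload traffic are not included.

```javascript
// Hypothetical helper: counts bytes reported by the DevTools protocol.
// `client` is assumed to be a Puppeteer CDPSession, or any object with
// on(event, handler) and send(method).
async function trackEncodedBytes(client) {
  const usage = { totalBytes: 0 }
  client.on('Network.loadingFinished', event => {
    usage.totalBytes += event.encodedDataLength
  })
  await client.send('Network.enable')
  return usage
}

// Usage (not run here):
// const client = await page.target().createCDPSession()
// const usage = await trackEncodedBytes(client)
// await page.goto('https://www.google.com', { waitUntil: 'networkidle0' })
// console.log(`Encoded bytes received: ${usage.totalBytes}`)
```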
This question about node.js and finding data usage in Puppeteer comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/58379372/