I have the following Puppeteer script, which works correctly. This code extracts all the information from the table.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const tableRows = await page.$$('table > tbody tr');
  await page.goto("https://www.mismarcadores.com/baloncesto/espana/liga-endesa/partidos/");
  // Collect the text of every .time cell
  const time = await page.evaluate(() => {
    const tables = Array.from(document.querySelectorAll('table tr .time'));
    return tables.map(table => table.textContent)
  });
  // Collect the text of every .team-home cell
  const teamHome = await page.evaluate(() => {
    const tables = Array.from(document.querySelectorAll('table tr .team-home'));
    return tables.map(table => table.textContent)
  });
  // Collect the text of every .team-away cell
  const teamAway = await page.evaluate(() => {
    const tables = Array.from(document.querySelectorAll('table tr .team-away'));
    return tables.map(table => table.textContent)
  });
  // Print the three lists side by side, one match at a time
  for (let i = 0; i < time.length; i++) {
    console.log(time[i]);
    console.log(teamHome[i]);
    console.log(teamAway[i]);
  }
  await browser.close();
})();
Now I am trying to write it in a better way, and I have the following code.
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto("https://www.mismarcadores.com/baloncesto/espana/liga-endesa/partidos/");
  console.log("started evaluating");
  var data = await page.evaluate(() => {
    Array.from(
      document.querySelectorAll('table tr')
    ).map(row => {
      return {
        time: row.querySelector(".time"),
        teamHome: row.querySelector(".team-home"),
        teamAway: row.querySelector(".team-away")
      };
    });
  });
  console.log(data);
})();
When I run the second script, I get undefined.
The goal is for the second script to produce the same result as the first.
Can anyone help me?
Best answer
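You need to select the tr elements more specifically (for example, by adding the .stage-scheduled class) and return the .textContent property instead of the elements themselves. The callback passed to page.evaluate also has to return the array, otherwise the result is undefined. Try this: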
var data = await page.evaluate(() => {
  return Array.from(
    document.querySelectorAll('table tr.stage-scheduled')
  ).map(row => {
    return {
      time: row.querySelector(".time").textContent,
      teamHome: row.querySelector(".team-home").textContent,
      teamAway: row.querySelector(".team-away").textContent,
    };
  });
});
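
The two points behind this fix can be tried without a browser. The sketch below is only an illustration, not part of the original answer. It first shows that an arrow function with a block body yields undefined unless it has an explicit return. It then shows why rows without the expected cells are a problem: querySelector returns null for them, so reading .textContent directly would throw. The helpers cellText, extractMatch and fakeRow are hypothetical names introduced here, and the sample values are made up.

```javascript
// 1) A block-bodied arrow function returns undefined without an explicit `return`.
const withoutReturn = () => {
  [1, 2, 3].map((n) => n * 2); // the array is built, then discarded
};
const withReturn = () => {
  return [1, 2, 3].map((n) => n * 2);
};

// 2) Hypothetical helper: text of a child cell, or null when the row has no such cell.
function cellText(row, selector) {
  const cell = row.querySelector(selector);
  return cell ? cell.textContent : null;
}

// Hypothetical helper: builds the same object shape as the answer's map callback.
function extractMatch(row) {
  return {
    time: cellText(row, ".time"),
    teamHome: cellText(row, ".team-home"),
    teamAway: cellText(row, ".team-away"),
  };
}

// Stand-in for a DOM row so the sketch runs in plain Node.js (an assumption, not real DOM).
function fakeRow(cells) {
  return {
    querySelector: (selector) =>
      selector in cells ? { textContent: cells[selector] } : null,
  };
}

// Made-up sample data: one match row and one header-like row without match cells.
const matchRow = fakeRow({
  ".time": "18:30",
  ".team-home": "Team A",
  ".team-away": "Team B",
});
const headerRow = fakeRow({});

console.log(withoutReturn()); // prints undefined
console.log(withReturn());
console.log(extractMatch(matchRow));
console.log(extractMatch(headerRow));
```

Inside page.evaluate, a function like extractMatch would have to be defined within the callback itself, because the callback runs in the page context. Narrowing the selector to tr.stage-scheduled, as the answer does, avoids the null case in the first place; a null check is simply a more defensive alternative.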
Source: Stack Overflow question on scraping a table with Puppeteer (javascript): https://stackoverflow.com/questions/54883901/