
javascript - How can I make Puppeteer load websites faster?

Reposted. Author: 行者123. Updated: 2023-12-03 12:14:44

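I'm using Puppeteer to automate some tasks and it works fine, but when I load a website it takes longer than it does in my regular browser. I tried to use caching with this code: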

const puppeteer = require('puppeteer');

// Record the start time so the total launch + navigation time can be logged.
const time = new Date();

async function test() {
  const browser = await puppeteer.launch({
    headless: true,
    executablePath: "D:\\Desktop\\node_modules\\puppeteer\\.local-chromium\\win64-848005\\chrome-win\\chrome.exe",
    args: ['--no-sandbox'],
  });
  const page = await browser.newPage();
  const response = await page.goto('https://example.com/');
  // Elapsed time in milliseconds.
  console.log(`${new Date() - time}`);
  console.log(response);
  await browser.close();
}

test();
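This works for example.com: the cache is stored and the page loads faster. My target website, however, doesn't seem to allow the cache to be stored.
Is there any other way to speed things up?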

Best answer

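If you just want pages to load faster while scraping, and you don't depend on certain images or JavaScript files, you can block those resources.
Block by resource type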

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);

  page.on('request', (req) => {
    // Drop image requests; let everything else through.
    if (req.resourceType() === 'image') {
      req.abort();
    } else {
      req.continue();
    }
  });

  await page.goto('https://bbc.com');
  await page.screenshot({path: 'no-images.png', fullPage: true});
  await browser.close();
})();
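The same idea extends to more than images. Below is a minimal sketch, assuming that stylesheets, fonts, and media are also unnecessary for your scrape. The names BLOCKED_TYPES, shouldBlockType, and blockHeavyResources are hypothetical helpers introduced for illustration and are not part of Puppeteer; only setRequestInterception, on('request'), resourceType, abort, and continue are Puppeteer calls.

```javascript
// Resource types to skip. This set is an assumption; adjust it per site.
const BLOCKED_TYPES = new Set(['image', 'media', 'font', 'stylesheet']);

// Pure decision helper, so it can be checked without launching a browser.
function shouldBlockType(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Hypothetical helper: wires the decision into a Puppeteer page object.
async function blockHeavyResources(page) {
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (shouldBlockType(req.resourceType())) {
      req.abort();
    } else {
      req.continue();
    }
  });
}
```

Call `await blockHeavyResources(page)` before `page.goto(...)`. Blocking stylesheets may change the layout, so leave them out of the set if you rely on screenshots.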
Block by domain
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
  });
  const page = await browser.newPage();
  const options = {
    waitUntil: 'networkidle2',
    timeout: 30000,
  };

  // Before: normal navigation
  await page.goto('https://theverge.com', options);
  await page.screenshot({path: 'before.png', fullPage: true});
  const metrics = await page.metrics();
  console.info(metrics);

  // After: navigation with some domains blocked

  // Array of third-party domains to block
  const blockedDomains = [
    'https://pagead2.googlesyndication.com',
    'https://creativecdn.com',
    'https://www.googletagmanager.com',
    'https://cdn.krxd.net',
    'https://adservice.google.com',
    'https://cdn.concert.io',
    'https://z.moatads.com',
    'https://cdn.permutive.com',
  ];
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    const url = request.url();
    // Abort any request whose URL starts with a blocked origin.
    if (blockedDomains.some((d) => url.startsWith(d))) {
      request.abort();
    } else {
      request.continue();
    }
  });

  await page.goto('https://theverge.com', options);
  await page.screenshot({path: 'after.png', fullPage: true});

  const metricsAfter = await page.metrics();
  console.info(metricsAfter);

  await browser.close();
})();
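The startsWith check above only matches the exact origins listed. As a possible variation, and not what the answer itself does, the sketch below matches on the parsed hostname so that subdomains and http:// URLs are covered too. The function name isBlockedUrl and the shortened blockedHosts list are assumptions made for illustration.

```javascript
// Hostnames to block, without scheme (a subset of the domains listed above).
const blockedHosts = ['googlesyndication.com', 'googletagmanager.com', 'moatads.com'];

// Returns true when the URL's hostname equals a blocked host or is a subdomain of one.
function isBlockedUrl(rawUrl, hosts = blockedHosts) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname;
  } catch {
    return false; // not a parseable absolute URL, so let it through
  }
  return hosts.some((h) => hostname === h || hostname.endsWith('.' + h));
}
```

Inside the request handler you could then write `if (isBlockedUrl(request.url())) request.abort(); else request.continue();`.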
Resources:
https://github.com/addyosmani/puppeteer-webperf

Regarding "javascript - How can I make Puppeteer load websites faster?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66562959/
