
javascript - How to get the absolute path of '<img src=''>' in Node from response.body

Reposted. Author: 太空狗. Updated: 2023-10-29 15:59:31

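So I want to use request-promise to pull the body of a page. Once I have the page, I want to collect all the <img> tags and get an array of those images' src values. Assume the src attributes on the page contain both relative and absolute paths. I want an array of absolute paths for the imgs on the page. I know I could build the absolute paths with some string manipulation and the npm path module, but I'd like to find a better way.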

var rp = require('request-promise'),
    cheerio = require('cheerio');

var options = {
    uri: 'http://www.google.com',
    method: 'GET',
    resolveWithFullResponse: true
};

rp(options)
    .then(function (response) {
        // Declared with var so that $ does not leak as an implicit global
        var $ = cheerio.load(response.body);
        var relativeLinks = $("img");
        relativeLinks.each(function () {
            var link = $(this).attr('src');
            console.log(link);
            if (link.startsWith('http')) {
                console.log('abs');
            }
            else {
                console.log('rel');
            }
        });
    });

Result

  /logos/doodles/2016/phoebe-snetsingers-85th-birthday-5179281716019200-hp.gif
rel

Best answer

Store your page URL in a variable and use url.resolve to join the pieces together. In the Node REPL, this works for both relative and absolute paths (hence "resolve"):

$:~/Projects/test$ node
> var base = "https://www.google.com";
undefined
> var imageSrc = "/logos/doodles/2016/phoebe-snetsingers-85th-birthday-5179281716019200-hp.gif";
undefined
> var url = require('url');
undefined
> url.resolve(base, imageSrc);
'https://www.google.com/logos/doodles/2016/phoebe-snetsingers-85th-birthday-5179281716019200-hp.gif'
> imageSrc = base + imageSrc;
'https://www.google.com/logos/doodles/2016/phoebe-snetsingers-85th-birthday-5179281716019200-hp.gif'
> url.resolve(base, imageSrc);
'https://www.google.com/logos/doodles/2016/phoebe-snetsingers-85th-birthday-5179281716019200-hp.gif'
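The REPL session above covers a root-relative path and an already absolute URL. As a rough sketch of how url.resolve treats the other forms a src attribute can take, the script below resolves a few sample values against a page URL. The base URL and the sample paths are made up for illustration and are not taken from the question.

```javascript
// url.resolve belongs to Node's legacy URL API, the one used in this answer.
const url = require('url');

// Assumed page URL, chosen to include a directory so relative paths are visible.
const base = 'https://www.google.com/a/b.html';

// Hypothetical src values covering the common cases.
const samples = [
  '/logos/a.gif',              // root-relative
  'images/b.png',              // relative to the page's directory
  '../c.png',                  // parent-directory relative
  '//cdn.example.com/d.png',   // protocol-relative
  'https://example.com/e.png'  // already absolute
];

const resolved = samples.map(function (src) {
  return url.resolve(base, src);
});

resolved.forEach(function (absolute, i) {
  console.log(samples[i] + ' -> ' + absolute);
});
// /logos/a.gif -> https://www.google.com/logos/a.gif
// images/b.png -> https://www.google.com/a/images/b.png
// ../c.png -> https://www.google.com/c.png
// //cdn.example.com/d.png -> https://cdn.example.com/d.png
// https://example.com/e.png -> https://example.com/e.png
```

Note that the base should be the full URL of the page that was fetched, not only its origin, because path-relative values are resolved against the page's directory.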

Your code would change to:

var rp = require('request-promise'),
    cheerio = require('cheerio'),
    url = require('url'),
    base = 'http://www.google.com';

var options = {
    uri: base,
    method: 'GET',
    resolveWithFullResponse: true
};

rp(options)
    .then(function (response) {
        // Declared with var so that $ does not leak as an implicit global
        var $ = cheerio.load(response.body);
        var relativeLinks = $("img");
        relativeLinks.each(function () {
            var link = $(this).attr('src');
            if (!link) {
                return; // skip <img> tags that have no src attribute
            }
            // Relative paths are resolved against base; absolute URLs pass through unchanged
            var fullImagePath = url.resolve(base, link); // should be absolute
            console.log(link, '->', fullImagePath);
            if (link.startsWith('http')) {
                console.log('abs');
            }
            else {
                console.log('rel');
            }
        });
    });
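The question asks for an array of absolute paths, so here is a minimal sketch of that last step. It swaps in the WHATWG URL constructor (new URL(input, base), a global in current Node versions), which the Node documentation recommends over the legacy url.resolve. The helper name toAbsoluteUrls and the sample src values are hypothetical and only serve the illustration.

```javascript
// Resolve a list of raw src attribute values against the page URL.
// `toAbsoluteUrls` is a hypothetical helper name, not part of any library.
function toAbsoluteUrls(srcs, base) {
  const result = [];
  for (const src of srcs) {
    // Skip entries from <img> tags that had no usable src attribute.
    if (typeof src !== 'string' || src.trim() === '') {
      continue;
    }
    try {
      // Absolute inputs ignore the base; relative inputs are resolved against it.
      result.push(new URL(src, base).href);
    } catch (err) {
      // new URL throws a TypeError on values it cannot parse; skip those.
    }
  }
  return result;
}

// Sample values standing in for what $(this).attr('src') might return.
const absolute = toAbsoluteUrls(
  ['/logos/doodle.gif', undefined, 'https://example.com/x.png', 'img/y.png'],
  'http://www.google.com'
);
console.log(absolute);
```

With cheerio, the input array could plausibly be built with something like $('img').map(function () { return $(this).attr('src'); }).get() before passing it to the helper.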

For javascript - How to get the absolute path of '<img src=' '>' in Node from response.body, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37733871/
