javascript - Scraping product details from samsclub.com

Reposted. Author: 行者123. Updated: 2023-12-03 11:54:04
I am using PHP to scrape data from SamsClub.com:

    $res = file_get_contents('http://www.samsclub.com/sams/bath-towel-apple-gr-100-cotton/prod10450797.ip');

I wrote a function based on PHP's explode() to extract the data:

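function getData($content, $start, $end) {
    // Keep everything after the first occurrence of $start...
    $str = explode($start, $content);
    if (!isset($str[1])) {
        return ''; // $start marker not found
    }
    // ...then cut it off at the first occurrence of $end.
    $str = explode($end, $str[1]);
    return $str[0];
}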
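As a quick illustration of how this marker-based extraction behaves, here is a minimal sketch run against a hard-coded string instead of the live page. The markup in $sample is hypothetical and only stands in for whatever the real page contains; the function is repeated so the snippet runs on its own.

```php
<?php
// Same marker-based extraction as getData() above, repeated so this runs standalone.
function getData($content, $start, $end) {
    $str = explode($start, $content);
    $str = explode($end, $str[1]);
    return $str[0];
}

// Hypothetical markup (an assumption); the real page structure may differ.
$sample = '<span class="itemNum">Item #: 252414</span>'
        . '<span class="model">Model #: SAMA-B</span>';

// Grab the text between each start marker and the next closing tag.
$itemNumber  = getData($sample, 'Item #: ', '</span>');
$modelNumber = getData($sample, 'Model #: ', '</span>');

echo $itemNumber . "\n";
echo $modelNumber . "\n";
```

This only works when both markers are present in the fetched HTML, which is why it cannot find values that the page loads later through a separate request.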

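All the required data is fetched successfully, but one thing remains: the product's variations, meaning the other colors. As you can see in the snapshot, different colors are available.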

[screenshot: available color options]

When another color is selected, the product's item number and model number change as well, as shown in the snapshot below.

[screenshot: item number and model number for the selected color]

I just want to get the item number and model number for the other colors as well.

Looking forward to your responses.

Best answer

For this you need to use a library (PHP Simple HTML DOM Parser). Just upload simple_html_dom.php somewhere you can include it from (in my code it sits in the same folder).

<?php

// Product page to scrape; this is the only value that needs changing.
$url = 'http://www.samsclub.com/sams/bath-towel-apple-gr-100-cotton/prod10450797.ip';

include('simple_html_dom.php');

$html = file_get_html($url);
$colour = array();
$item = array();
$model = array();

// The inline script inside div#variance carries the colour => SKU id map,
// plus the product id and the product type.
$script = $html->find('div[id=variance] script', 0)->innertext;
$script = preg_replace('/\s+/', ' ', $script);
$scripts = explode(";", $script);

$script = $scripts[2];
$id = $scripts[4];
$type = $scripts[5];

// Strip the JavaScript wrapper so that only the JSON string remains.
$script = str_replace("skuJson.skuVariantJson = $.parseJSON('", "", $script);
$script = str_replace("')", "", $script);

$colours = json_decode($script);

// Pull the quoted values out of the remaining statements.
preg_match("/'([a-z0-9]*)'/", $type, $types);
$type = $types[1];
preg_match("/'([a-z0-9]*)'/", $id, $ids);
$id = $ids[1];

// The last script on the page holds the number sent as the "_" parameter.
$script = $html->find('script', -1)->innertext;
$scripts = explode(";", $script);

$time = $scripts[0];
preg_match('/"([0-9]*)"/', $time, $times);
$time = $times[1];

// One ajax request per colour, each returning that variant's details.
foreach ($colours as $key => $value) {
    $url = 'http://www.samsclub.com/sams/shop/product/ajax/ajaxSkuVariant.jsp?skuId='. $value .'&productId='. $id .'&productType='. $type .'&_='. $time;
    $html = file_get_html($url);
    preg_match('/"legacyItemNumber":"([0-9]*)"/', $html, $match);
    $item[] = $match[1];
    preg_match('/"model":"([a-z-]*)"/i', $html, $match);
    $model[] = $match[1];
    // The key carries one trailing character that is not part of the colour name.
    $colour[] = substr($key, 0, -1);
}

// Print results
echo "<pre>"; print_r($colour); echo "</pre>";
echo "<pre>"; print_r($item); echo "</pre>";
echo "<pre>"; print_r($model); echo "</pre>";

?>
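The two regular expressions used inside the loop can be tried without touching the network. This is a minimal sketch, assuming a response fragment shaped like the one below: only the field names legacyItemNumber and model come from the code above, while the rest of the JSON is made up for illustration. It also checks the return value of preg_match(), so a missing field yields null instead of an undefined-index notice.

```php
<?php
// Hypothetical response fragment (an assumption); only the two field names
// are taken from the regular expressions in the answer.
$response = '{"skuId":"sku100","legacyItemNumber":"252414","model":"SAMA-B"}';

$item = null;
$model = null;

// preg_match() returns 1 on a match, so guard before reading $match[1].
if (preg_match('/"legacyItemNumber":"([0-9]*)"/', $response, $match)) {
    $item = $match[1];
}
// The /i flag lets [a-z-] match upper-case model codes as well.
if (preg_match('/"model":"([a-z-]*)"/i', $response, $match)) {
    $model = $match[1];
}

echo $item . "\n";
echo $model . "\n";
```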

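The only thing you need to change is the $url variable at the top. Why all this code, you may ask? Because the data you are looking for is not on the same page: it is loaded via ajax each time a colour is clicked, so we end up making several requests (one per colour). Here is the output: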

Array
(
[0] => White
[1] => Burgundy
[2] => Apple Green
[3] => Lilac
[4] => Chocolate
[5] => Sage
[6] => Grey
[7] => PckBlue
[8] => Linen
[9] => null
[10] => Plum
[11] => Clay
[12] => Light Blue
)

Array
(
[0] => 252368
[1] => 252505
[2] => 252414
[3] => 433076
[4] => 252389
[5] => 117268
[6] => 252438
[7] => 613317
[8] => 252382
[9] => 433083
[10] => 252541
[11] => 117175
[12] => 252400
)

Array
(
[0] => SAMW-B
[1] => SAMB-B
[2] => SAMA-B
[3] => SAMLC-B
[4] => SAMCH-B
[5] => SAMSS-B
[6] => SAMGR-B
[7] => SAMPB-B
[8] => SAMLI-B
[9] => SAMDR-B
[10] => SAMP-B
[11] => SAMTC-B
[12] => SAMLB-B
)
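The request-building step from the answer (strip the JavaScript wrapper, decode the JSON, then form one ajax URL per colour) can also be sketched offline. Everything in $script, along with the values of $productId, $productType and $time, is a placeholder chosen for illustration; only the wrapper text, the endpoint and the parameter names are taken from the answer's code.

```php
<?php
// Hypothetical inline script, shaped after the wrapper the answer strips.
// The colour keys and SKU ids are made up.
$script = <<<'JS'
skuJson.skuVariantJson = $.parseJSON('{"White_":"sku100","Apple Green_":"sku200"}')
JS;

// Remove the JavaScript wrapper so that only the JSON text remains.
$json = str_replace('skuJson.skuVariantJson = $.parseJSON(\'', '', $script);
$json = str_replace('\')', '', $json);
$colours = json_decode($json, true);

// Placeholder values (assumptions); the answer reads these from the page.
$productId = 'prod10450797';
$productType = 'item';
$time = '1409900000000';

$base = 'http://www.samsclub.com/sams/shop/product/ajax/ajaxSkuVariant.jsp?';
$urls = array();
foreach ($colours as $key => $skuId) {
    // Drop the trailing character from the key, as the answer does.
    $name = substr($key, 0, -1);
    $urls[$name] = $base . http_build_query(array(
        'skuId' => $skuId,
        'productId' => $productId,
        'productType' => $productType,
        '_' => $time,
    ), '', '&');
}

print_r($urls);
```

Using http_build_query() instead of string concatenation is a design choice of this sketch: it URL-encodes each value, which matters if an id ever contains a reserved character.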

Regarding "javascript - Scraping product details from samsclub.com", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25681374/
