
php - Goutte - Getting a list with the date on top and the titles below

Reposted. Author: 行者123. Updated: 2023-12-02 01:50:25

I am using "fabpot/goutte": "^4.0".

I am trying to get the dates and releases from a website into an array.

Please find my runnable example below:

<?php

require_once '../vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;
use Goutte\Client;

try {

    $resArr = array();
    $tempArr = array();

    $url = "https://www.steelcitycollectibles.com/product-release-calendar";

    // get page
    $client = new Client();
    $content = $client->request('GET', $url)->html();
    $crawler = new Crawler($content, null, null);

    $table = $crawler->filter('#schedule'); //->first()->closest('table');

    $index = 0;
    $resArr = array();
    $table->filter('div')
        ->each(function (Crawler $tr) use (&$index, &$resArr) {

            if ($tr->filter('.schedule-date')->count() > 0) {
                $releaseDate = $tr->filter('.schedule-date')->text();
            }

            if ($tr->filter('div > div.eight.columns > a')->count() > 0) {
                $releaseStr = $tr->filter('div > div.eight.columns > a')->text();
                array_push($resArr, [$releaseDate, $releaseStr]);
            }

        });

    var_dump($resArr);
} catch (Exception $e) {}

However, I do not get the correct date for each item: some entries come back with a null date.

For those null values, I want to add the correct date, in this case 12/20/21.

Best answer

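Assuming you want to apply the most recently seen date to every element of the array, you only need to set a default value and then update it inside the loop. The variable has to be captured by reference as well, because the anonymous function's local state is reset on every call.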
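As a small, self-contained illustration of that point, the sketch below contrasts capturing a variable by value with capturing it by reference. The sample data and variable names are hypothetical and not taken from the page; only the `use (&$var)` syntax comes from the answer.

```php
<?php

// Hypothetical sample rows: null marks a row that carries no date of its own.
$rows = ['12/22/21', null, null, '12/24/21', null];

// Captured by value: every call starts again from the value bound when the
// closure was created, so assignments made inside it are lost.
$last = 'default';
$byValue = array_map(function ($row) use ($last) {
    if ($row !== null) {
        $last = $row;
    }
    return $last;
}, $rows);

// Captured by reference: assignments persist from one call to the next.
$last = 'default';
$byReference = array_map(function ($row) use (&$last) {
    if ($row !== null) {
        $last = $row;
    }
    return $last;
}, $rows);

// $byValue     is ['12/22/21', 'default',  'default',  '12/24/21', 'default']
// $byReference is ['12/22/21', '12/22/21', '12/22/21', '12/24/21', '12/24/21']
echo json_encode($byValue, JSON_UNESCAPED_SLASHES), PHP_EOL;
echo json_encode($byReference, JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Only the by-reference version carries the last seen date forward, which is the behaviour the question asks for.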

<?php

require_once '../vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;
use Goutte\Client;

try {

    $resArr = [];

    // sample markup copied from the page; the closing heredoc marker stays in the
    // first column so that it also parses on PHP versions before 7.3
    $content = <<<HTML
<div id="schedule" class="schedule nine columns">
    <div class="schedule-date">12/22/21</div>
    <div class="schedule-list clear">
        <div class="eight columns">
            <a href="xxx" class="schedule-product-title ">2022 Gold Rush Autographed Full-Size Speed Flex Helmet Edition Series 1 2-Box Case</a>
        </div>
        <div class="schedule-notify three columns">
            <release-schedule-notify type="'release'"/>
        </div>
    </div>
    <div class="schedule-list clear">
        <div class="eight columns">
            <a href="xxx" class="schedule-product-title ">2022 Gold Rush Autographed Full-Size Speed Flex Helmet Edition Series 1 Box</a>
        </div>
        <div class="schedule-notify three columns">
            <release-schedule-notify type="'release'"/>
        </div>
    </div>
    <div class="schedule-date">12/24/21</div>
    <div class="schedule-list clear">
        <div class="eight columns">
            <a href="xxx">2021 Panini Flawless Baseball Hobby 2-Box Case</a>
        </div>
        <div class="schedule-notify three columns">
            <release-schedule-notify type="'release'"/>
        </div>
    </div>
    <div class="schedule-list clear">
        <div class="eight columns">
            <a href="xxx">2021 Panini Flawless Baseball Hobby Box</a>
        </div>
        <div class="schedule-notify three columns">
            <release-schedule-notify type="'release'"/>
        </div>
    </div>
</div>
HTML;

    $crawler = new Crawler($content);

    // use today's date as a default, in case the first date is missing
    $releaseDate = (new DateTime())->format("m/d/y");

    // select only the direct children of #schedule: calling filter('div') on the
    // container also matches #schedule itself, which reports the first product twice
    $crawler->filter('#schedule > div')
        ->each(function (Crawler $row) use (&$resArr, &$releaseDate) {
            if ($row->filter('.schedule-date')->count() > 0) {
                // update the date if it exists, otherwise continue with the old one
                $releaseDate = $row->filter('.schedule-date')->text();
            }
            if ($row->filter('div > div.eight.columns > a')->count() > 0) {
                $releaseStr = $row->filter('div > div.eight.columns > a')->text();
                $resArr[] = [$releaseDate, $releaseStr];
            }
        });
} catch (Exception $e) {
    // report the failure instead of silently swallowing it
    echo $e->getMessage(), PHP_EOL;
}

echo json_encode($resArr, JSON_PRETTY_PRINT);

Output:

[
    [
        "12\/22\/21",
        "2022 Gold Rush Autographed Full-Size Speed Flex Helmet Edition Series 1 2-Box Case"
    ],
    [
        "12\/22\/21",
        "2022 Gold Rush Autographed Full-Size Speed Flex Helmet Edition Series 1 Box"
    ],
    [
        "12\/24\/21",
        "2021 Panini Flawless Baseball Hobby 2-Box Case"
    ],
    [
        "12\/24\/21",
        "2021 Panini Flawless Baseball Hobby Box"
    ]
]
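If the by-reference capture feels awkward, the same "carry the last date forward" idea can probably be written as a plain foreach loop, where the variable simply lives in the enclosing scope. The sketch below is an alternative under stated assumptions, not the answer's method: it uses PHP's built-in DOMDocument and DOMXPath instead of Goutte, and the markup is a shortened, hypothetical stand-in for the page (the product names are placeholders).

```php
<?php

// Hypothetical, trimmed-down markup modeled on the schedule structure.
$html = <<<HTML
<div id="schedule">
    <div class="schedule-date">12/22/21</div>
    <div class="schedule-list"><div class="eight columns"><a href="xxx">Product A</a></div></div>
    <div class="schedule-list"><div class="eight columns"><a href="xxx">Product B</a></div></div>
    <div class="schedule-date">12/24/21</div>
    <div class="schedule-list"><div class="eight columns"><a href="xxx">Product C</a></div></div>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$result = [];
$releaseDate = null; // stays null until the first date row is seen

// Walk the direct children of #schedule in document order.
foreach ($xpath->query('//div[@id="schedule"]/div') as $row) {
    $classes = preg_split('/\s+/', trim($row->getAttribute('class')));

    if (in_array('schedule-date', $classes, true)) {
        // A date row: remember it for the product rows that follow.
        $releaseDate = trim($row->textContent);
        continue;
    }

    // A product row: take the first link nested one div below the row.
    $link = $xpath->query('./div/a', $row)->item(0);
    if ($link !== null) {
        $result[] = [$releaseDate, trim($link->textContent)];
    }
}

echo json_encode($result, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

With this sample markup, Product A and Product B are paired with 12/22/21 and Product C with 12/24/21. Because a foreach body shares the surrounding scope, no `use (&$releaseDate)` is needed.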

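As a side note, according to the documentation for Goutte, the request() method returns a Crawler object. There is no need to extract the HTML manually and build a new Crawler from it. Change your code to: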

// get page
$crawler = (new Client)->request('GET', $url);

Regarding "php - Goutte - Getting a list with the date on top and the titles below", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/70402267/
