
php - cURL is not working with tags

Reposted. Author: 行者123. Updated: 2023-11-28 12:48:52

I am trying to copy a sentence from a webpage.

My code is:

$request_url ='https://stackoverflow.com/questions/391005/convert-html-css-to-pdf-with-php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request_url); // The url to get links from
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // We want to get the response
$result = curl_exec($ch);
$regex='/<h1 itemprop="name">(.*)<\/h1>/i';
preg_match_all($regex,$result,$parts);
$links=$parts[1];
foreach($links as $link){
echo $link."<br>";
}
curl_close($ch);

This works, but when I change line 6 to the following, it stops working:

$regex='/itemprop="name">(.*)<\/h1>/i';

The markup on the site that I want to copy from is:

<h1 itemprop="name">
<a class="question-hyperlink" href="/questions/391005/convert-html-css-to-pdf-with-php">Convert HTML + CSS to PDF with PHP?</a></h1>

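I want to print "Convert HTML + CSS to PDF with PHP?". Please tell me how to copy and print that sentence from this anchor tag.

Best answer

Alternatively, you can use DOMDocument together with DOMXpath. Consider this example: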

$request_url ='http://stackoverflow.com/questions/391005/convert-html-css-to-pdf-with-php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request_url); // The url to get links from
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // We want to get the response
libxml_use_internal_errors(true);
$result = curl_exec($ch);
$dom = new DOMDocument();
$dom->loadHTML($result);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
// target the title
$title = $xpath->query('//div[@id="question-header"]/h1[@itemprop="name"]/a[@class="question-hyperlink"]')->item(0)->nodeValue;
echo $title; // Convert HTML + CSS to PDF with PHP?
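
The same XPath idea can be tried offline, without a network request. The following is only a minimal sketch: the markup is the snippet from the question, the wrapping div is an assumption taken from the XPath in the answer, and extractTitle is a hypothetical helper name. It also guards against the case where no node matches, which would otherwise cause an error on ->item(0)->nodeValue.

```php
<?php
// Markup copied from the question. The <div id="question-header"> wrapper is an
// assumption based on the XPath used in the answer; in practice $html would be
// the string returned by curl_exec().
$html = <<<HTML
<div id="question-header">
<h1 itemprop="name">
<a class="question-hyperlink" href="/questions/391005/convert-html-css-to-pdf-with-php">Convert HTML + CSS to PDF with PHP?</a></h1>
</div>
HTML;

// Hypothetical helper: returns the link text inside the title <h1>, or null if absent.
function extractTitle(string $html): ?string
{
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//h1[@itemprop="name"]/a[@class="question-hyperlink"]');
    if ($nodes === false || $nodes->length === 0) {
        return null; // nothing matched, so there is no node to read
    }
    return trim($nodes->item(0)->nodeValue);
}

$title = extractTitle($html);
echo $title ?? 'Title not found';
```

Returning null instead of dereferencing item(0) directly keeps the script from failing if the page layout changes.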

Sidenote: This is the oddest kind of scraping question: scraping SO.
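
For readers who still want the regex route from the question, here is a hedged sketch. It assumes the markup is exactly the snippet shown in the question, with a line break after the opening h1 tag. In PCRE, "." does not match a newline unless the s modifier is set, so a pattern that has to span that line break needs /s. The lazy quantifier and the strip_tags call are additions for illustration, not part of the original answer.

```php
<?php
// The snippet from the question, including the newline after the opening <h1>.
$html = "<h1 itemprop=\"name\">\n"
      . "<a class=\"question-hyperlink\" href=\"/questions/391005/convert-html-css-to-pdf-with-php\">"
      . "Convert HTML + CSS to PDF with PHP?</a></h1>";

// Without the "s" modifier, "." cannot cross the newline, so this finds no match.
$withoutS = preg_match('/itemprop="name">(.*)<\/h1>/i', $html, $m1);

// With "s", "." also matches newlines; ".*?" is lazy, so it stops at the first </h1>.
$withS = preg_match('/itemprop="name">(.*?)<\/h1>/is', $html, $m2);

// The capture still contains the <a> tag, so strip the tags and trim whitespace.
$title = $withS ? trim(strip_tags($m2[1])) : null;

echo $title ?? 'Title not found';
```

Regex matching on HTML is fragile when attributes or whitespace change, which is one reason the DOM-based approach in the answer is usually the safer choice.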

Regarding "php - cURL is not working with tags", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/24430624/
