
mysql - Optimizing MySQL inserts for an XML data feed

Reposted · Author: 行者123 · Updated: 2023-11-29 00:51:32

Every night I pull an XML feed from a company, and the process needs some serious optimization because it takes a very long time to run.

The code below shows how I'm doing it now, but there must be a better way. Basically, I pull in each product, then the retailers that stock that product.

//db connect
include '../php/lib/dbconnect.inc';

$categories = array(1, 2, 4, 8, 9);

foreach ($categories as $cat_id) { // loop over the category ids defined above

// first request: only used to read the total product count for this category
$url = "http://*********.com/feed?f=PRSP_UK_xx&categories=$cat_id&limit=100&startproducts=0&price_min=0.01&sortproducts=score&show=properties";
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_HEADER, 0);
curl_setopt($c, CURLOPT_USERPWD, "****:****");
$xml = simplexml_load_string(curl_exec($c));
curl_close($c);

$num_items = (int) $xml->{'product-count'}; // total products in this category

$ii = 0;

while ($ii <= $num_items) { // this sets the number of items from start of xml feed

$url = "http://********.com/feed?f=PRSP_UK_xx&categories=$cat_id&limit=100&startproducts=$ii&price_min=0.01&sortproducts=score&show=properties";

$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_HEADER, 0);
curl_setopt($c, CURLOPT_USERPWD, "****:****");
$xml = simplexml_load_string(curl_exec($c));
curl_close($c);

// load each product first

foreach ($xml->product as $products) {

$title = $products->name;

$title = preg_replace('/[^a-z0-9\s]/i', '', $title);

$PRid = $products->id;

$author = $products->properties->group->property[2]->value;

$author = preg_replace('/[^a-z0-9\s]/i', '', $author);

$genre = $products->properties->group->property[4]->value;

$genre = preg_replace('/[^a-z0-9\s]/i', '', $genre);

$prodcat = $products->{'category'};

$prodcat = preg_replace('/[^a-z0-9\s]/i', '', $prodcat);

$prodcatID = $products->{'category-id'};

$lowprice = $products->{'lowest-price'};

$highprice = $products->{'highest-price'};

$imageURL = $products->{'image-url'};

$userrating = $products->rating[0]->average;

$userrating = str_replace(",",".",$userrating);

$profrating = $products->rating[0]->average; // NB: reads the same node as $userrating above

$profrating = str_replace(",",".",$profrating);

$addline = mysql_query("
insert into PRprodINFO (
PRid,
main_category,
title,
author,
genre,
prodcat,
prodcatID,
userrating,
profrating,
lowprice,
highprice,
imageURL
)
VALUES (
'$PRid',
'Books',
'$title',
'$author',
'$genre',
'$prodcat',
'$prodcatID',
'$userrating',
'$profrating',
'$lowprice',
'$highprice',
'$imageURL'
) ON DUPLICATE KEY UPDATE lowprice='$lowprice', highprice='$highprice'",$db);

if(!$addline) { echo "cannot add to table here".mysql_error(); exit; } // debug

// now each retailer associated with the product

foreach ($products->retailer as $retailer) {

$id = $retailer->{'id'};

$name = $retailer->{'name'};

$name = addslashes($name);

$link = $retailer->{'link'};

$logoURL = $retailer->{'logo'};

$stockinfo = $retailer->{'stock-info'};

$price = $retailer->{'price'};

$priceshipmin = $retailer->{'price-with-shipping-min'};

$priceshipmax = $retailer->{'price-with-shipping-max'};

$dummyid = $PRid.$id;

$id = preg_replace('/[^a-z0-9\s]/i', '', $id);

$stockinfo = preg_replace('/[^a-z0-9\s]/i', '', $stockinfo);

$dummyid = preg_replace('/[^a-z0-9\s]/i', '', $dummyid);

$addretail = mysql_query("
insert into PRretailerinfo (
PRid,
id,
dummyid,
category_id,
name,
link,
logoURL,
stockinfo,
price,
priceshipmin,
priceshipmax
)
VALUES (
'$PRid',
'$id',
'$dummyid',
'$cat_id', // was '$i', which is undefined; category_id presumably wants the feed category
'$name',
'$link',
'$logoURL',
'$stockinfo',
'$price',
'$priceshipmin',
'$priceshipmax'
) ON DUPLICATE KEY UPDATE price='$price', priceshipmin='$priceshipmin', priceshipmax='$priceshipmax'",$db);

if(!$addretail) { echo "cannot add to table - price is".$price.mysql_error(); exit; } // debug

} // close
} // close

// add 100 to url to get next 100 items
$ii = ($ii+100);

}

} // whole thing

I assume there is a better way than doing this row by row, since the whole run covers roughly 800,000 products with an average of 4 retailers per product.

I believe it would be faster to build one long multi-row query and run a single insert, but I can't work out how to do that.
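For the multi-row part: MySQL accepts `INSERT ... VALUES (...),(...),...`, and `ON DUPLICATE KEY UPDATE` can refer to the incoming row via `VALUES()`. A minimal sketch, assuming the `PRprodINFO` column list from the question; the `addslashes`-based escaping and the sample rows are placeholders for illustration only (with a live connection you would use `mysqli_real_escape_string()` or prepared statements):

```php
<?php
// Build one multi-row INSERT for PRprodINFO instead of one query per product.
function buildProductInsert(array $rows): string
{
    $tuples = [];
    foreach ($rows as $row) {
        // quote and escape every value in the row, then wrap as a (...) tuple
        $escaped = array_map(fn($v) => "'" . addslashes((string)$v) . "'", $row);
        $tuples[] = '(' . implode(',', $escaped) . ')';
    }
    return "INSERT INTO PRprodINFO
              (PRid, main_category, title, author, genre, prodcat, prodcatID,
               userrating, profrating, lowprice, highprice, imageURL)
            VALUES " . implode(",\n", $tuples) . "
            ON DUPLICATE KEY UPDATE
              lowprice = VALUES(lowprice), highprice = VALUES(highprice)";
}

// In the product loop, rows would be buffered and flushed every few hundred
// iterations; here two made-up rows stand in for the feed data.
$sql = buildProductInsert([
    ['101', 'Books', 'Title One', 'Author A', 'Fiction', 'Books', '9', '4.5', '4.5', '1.99', '9.99', 'http://example.com/1.jpg'],
    ['102', 'Books', 'Title Two', 'Author B', 'Crime',   'Books', '9', '3.8', '3.8', '2.50', '7.00', 'http://example.com/2.jpg'],
]);
echo $sql;
```

Batching a few hundred rows per statement typically cuts round trips dramatically compared with one `INSERT` per product.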

Best Answer

The slowness is most likely not caused by issuing many insert queries; far more likely, it is caused by the large number of HTTP requests you make to fetch the data. Is there any way to pull more data back from the server at once?

... perhaps by changing your product-fetch parameter from limit=100 to limit=$num_items

Given your comment, I suppose you could try separate threads to speed things up: one for downloading content and one for inserting into your tables. That way you are always fetching new data instead of fetching, waiting for the insert to finish, and then fetching more. The code can get quite complicated, though.
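PHP has no lightweight threads, but the download side can be overlapped with the `curl_multi` API, which runs several feed-page requests concurrently; each finished page can then be parsed and inserted while the next batch downloads. A rough sketch under that assumption (the URLs and credentials would be the masked ones from the question):

```php
<?php
// Fetch several feed pages concurrently with curl_multi. $urls would be
// built from the feed URL with different startproducts offsets.
function fetchPages(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $i => $url) {
        $c = curl_init($url);
        curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($c, CURLOPT_USERPWD, "****:****"); // masked, as in the question
        curl_multi_add_handle($mh, $c);
        $handles[$i] = $c;
    }
    // Drive all transfers until every handle has finished.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh); // wait for socket activity instead of spinning
        }
    } while ($running && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $i => $c) {
        $bodies[$i] = curl_multi_getcontent($c);
        curl_multi_remove_handle($mh, $c);
        curl_close($c);
    }
    curl_multi_close($mh);
    return $bodies;
}
```

Each returned body can be handed to `simplexml_load_string()` and inserted while the next group of pages is in flight, which approximates the producer/consumer split the answer describes without real threads.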

Regarding mysql - Optimizing MySQL inserts for an XML data feed, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8369107/
