
javascript - zombie JS : intermittently crashes when called repeatedly from a for loop

Reposted. Author: 行者123. Updated: 2023-12-03 07:37:07

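I have a ZombieJS Node server on Heroku that scrapes data from the internet. The server code is called from a for loop on the client. Each iteration of the loop makes one server call, which triggers one zombie scrape. Sometimes the server crashes with the error below. This only happens when the for loop runs for many iterations.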

How can I make the code robust enough to handle multiple concurrent client calls, each of which comes from its own for loop?

Code:

var express = require('express');
var app = express();
var Browser = require('zombie'); // tried changing var to const; no difference
var assert = require('assert');

app.set('port', (process.env.PORT || 5000));

var printMessage = function() { console.log("Node app running on " + app.get('port')); };

var getAbc = function(response, input)
{
    var browser = new Browser();
    browser.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0';
    browser.runScripts = true;
    var url = "http://www.google.com/ncr";

    browser.visit(url, function() {
        browser.fill('q', input).pressButton('Google Search', function(){
            // parsing number of results from browser object

            response.writeHead(200, {'Content-Type': 'text/plain'});
            response.end(numberOfSearchResults);
        });
    });
}

var handleXyz = function(request, response)
{
    getAbc(response, request.query.input);
}

app.listen(app.get('port'), printMessage);
app.post('/xyz', handleXyz);

Error:

assert.js:86
  throw new assert.AssertionError({
        ^
No open window with an HTML document
    at Browser.field (/app/node_modules/zombie/lib/index.js:811:7)
    at Browser.fill (/app/node_modules/zombie/lib/index.js:903:24)
    at /app/cfv1.js:42:11
    at done (/app/node_modules/zombie/lib/eventloop.js:589:9)
    at timeout (/app/node_modules/zombie/lib/eventloop.js:594:33)
    at Timer.listOnTimeout (timers.js:119:15)

I have a similar project using HorsemanJS/PhantomJS that fails in a similar way (and I am stuck there too!): NodeJS server can't handle multiple users

Best answer

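In general, I think you should be careful about, or avoid, sending large numbers of unsolicited requests to a remote server. Many sites will throttle you and/or start refusing connections. That said, I believe I have found the root of the problem in this particular case.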

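I tested the code snippet. In this particular case, if you send too many requests, Google resets the connection. When the connection is reset, one of the variables ends up failing an assertion.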

The error that appears when the connection is reset:

zombie TypeError: read ECONNRESET
    at zombie/lib/pipeline.js:89:15
    at tryCatcher (zombie/node_modules/bluebird/js/release/util.js:16:23)
    at Promise._settlePromiseFromHandler (zombie/node_modules/bluebird/js/release/promise.js:497:31)
    at Promise._settlePromise (zombie/node_modules/bluebird/js/release/promise.js:555:18)
    at Promise._settlePromise0 (zombie/node_modules/bluebird/js/release/promise.js:600:10)
    at Promise._settlePromises (zombie/node_modules/bluebird/js/release/promise.js:679:18)
    at Async._drainQueue (zombie/node_modules/bluebird/js/release/async.js:125:16)
    at Async._drainQueues (zombie/node_modules/bluebird/js/release/async.js:135:10)
    at Immediate.Async.drainQueues [as _onImmediate] (zombie/node_modules/bluebird/js/release/async.js:16:14)
    at processImmediate [as _immediateCallback] (timers.js:383:17)

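I looked further into your original error, and its root cause is in fact the connection reset above. When the reset happens, document.documentElement ends up with a falsy value, which then makes the assertion in the field function in zombie/lib/index.js fail: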

assert(this.document && this.document.documentElement, 'No open window with an HTML document');
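One way to keep that assertion from taking the whole server down is to check the same condition before calling fill. The following is only a sketch under assumptions: safeSearch and its callback shape are hypothetical names, and the browser argument is assumed to expose visit, fill, pressButton and document the way the Zombie browser in the question does.

```javascript
// Sketch only: `safeSearch` is a hypothetical helper, not part of Zombie.
// `browser` is assumed to behave like the Zombie browser from the question.
function safeSearch(browser, url, input, callback) {
  browser.visit(url, function (visitError) {
    // Zombie's field() asserts on this same condition. Checking it first
    // turns an uncaught AssertionError into an ordinary error value.
    var doc = browser.document;
    if (!doc || !doc.documentElement) {
      return callback(visitError || new Error('No HTML document loaded for ' + url));
    }
    try {
      browser.fill('q', input).pressButton('Google Search', function (pressError) {
        // Hand the browser back so the caller can parse the results page.
        callback(pressError || null, browser);
      });
    } catch (err) {
      // fill() and pressButton() can throw synchronously, for example when
      // the field or button is not present on the page that was loaded.
      callback(err);
    }
  });
}
```

In the handler from the question, the callback could then answer with an error status (for instance response.writeHead(502, ...)) when an error is passed, instead of letting the process crash.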

I think the simplest solution is to handle the error on the client side and try to recover gracefully.
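As a rough illustration of client-side recovery, the sketch below retries a failed call with an increasing delay and runs the loop's calls one at a time rather than all at once. All names here (retryWithBackoff, runSequentially, callServer, maxAttempts, baseDelayMs) are hypothetical, and callServer is assumed to return a Promise that rejects when the server call fails.

```javascript
// Sketch only: every name below is hypothetical and chosen for illustration.

// Calls task() until it succeeds or maxAttempts is reached, waiting
// baseDelayMs, then 2x, then 4x ... between attempts.
function retryWithBackoff(task, options) {
  var opts = options || {};
  var maxAttempts = opts.maxAttempts || 3;
  var baseDelayMs = opts.baseDelayMs || 500;

  function attempt(n) {
    return Promise.resolve()
      .then(function () { return task(n); })
      .catch(function (err) {
        if (n >= maxAttempts) { throw err; } // give up and surface the last error
        var delayMs = baseDelayMs * Math.pow(2, n - 1);
        return new Promise(function (resolve) { setTimeout(resolve, delayMs); })
          .then(function () { return attempt(n + 1); });
      });
  }

  return attempt(1);
}

// Replaces the for loop: sends one request at a time, each with retries,
// and resolves with the results in input order.
function runSequentially(inputs, callServer, options) {
  return inputs.reduce(function (chain, input) {
    return chain.then(function (results) {
      return retryWithBackoff(function () { return callServer(input); }, options)
        .then(function (value) { return results.concat([value]); });
    });
  }, Promise.resolve([]));
}
```

Sending the requests sequentially also lowers the request rate seen by the remote site, which is the condition that triggered the connection resets in the first place. If one input still fails after the last attempt, runSequentially rejects with that error, so the caller decides whether to skip it or stop.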

Regarding "javascript - zombie JS : intermittently crashes when called repeatedly from a for loop", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35563187/
