
python - Connection was refused by other side: 111: Connection refused

Repost. Author: 行者123. Updated: 2023-11-30 22:08:44

I have a LinkedIn spider. It runs fine on my local machine, but when I deploy it to Scrapinghub I get this error:

Error downloading <GET https://www.linkedin.com/>: Connection was refused by other side: 111: Connection refused.

The full log from Scrapinghub is:

2018-08-30 12:58:34 INFO Log opened.
2018-08-30 12:58:34 INFO [scrapy.log] Scrapy 1.0.5 started
2018-08-30 12:58:34 INFO [scrapy.utils.log] Scrapy 1.0.5 started (bot: facebook_stats)
2018-08-30 12:58:34 INFO [scrapy.utils.log] Optional features available: ssl, http11, boto
2018-08-30 12:58:34 INFO [scrapy.utils.log] Overridden settings: {'NEWSPIDER_MODULE': 'facebook_stats.spiders', 'STATS_CLASS': 'sh_scrapy.stats.HubStorageStatsCollector', 'LOG_LEVEL': 'INFO', 'SPIDER_MODULES': ['facebook_stats.spiders'], 'RETRY_TIMES': 10, 'RETRY_HTTP_CODES': [500, 503, 504, 400, 403, 404, 408], 'BOT_NAME': 'facebook_stats', 'MEMUSAGE_LIMIT_MB': 950, 'DOWNLOAD_DELAY': 1, 'TELNETCONSOLE_HOST': '0.0.0.0', 'LOG_FILE': 'scrapy.log', 'MEMUSAGE_ENABLED': True, 'USER_AGENT': 'Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.7'}
2018-08-30 12:58:34 INFO [scrapy.log] HubStorage: writing items to https://storage.scrapinghub.com/items/341545/3/9
2018-08-30 12:58:34 INFO [scrapy.middleware] Enabled extensions: CoreStats, TelnetConsole, MemoryUsage, LogStats, StackTraceDump, CloseSpider, SpiderState, AutoThrottle, HubstorageExtension
2018-08-30 12:58:35 INFO [scrapy.middleware] Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2018-08-30 12:58:35 INFO [scrapy.middleware] Enabled spider middlewares: HubstorageMiddleware, HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2018-08-30 12:58:35 INFO [scrapy.middleware] Enabled item pipelines: CreditCardsPipeline
2018-08-30 12:58:35 INFO [scrapy.core.engine] Spider opened
2018-08-30 12:58:36 INFO [scrapy.extensions.logstats] Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-08-30 12:58:36 INFO TelnetConsole starting on 6023
2018-08-30 12:59:32 ERROR [scrapy.core.scraper] Error downloading <GET https://www.linkedin.com/>: Connection was refused by other side: 111: Connection refused.
2018-08-30 12:59:32 INFO [scrapy.core.engine] Closing spider (finished)
2018-08-30 12:59:33 INFO [scrapy.statscollectors] Dumping Scrapy stats:
2018-08-30 12:59:34 INFO [scrapy.core.engine] Spider closed (finished)
2018-08-30 12:59:34 INFO Main loop terminated.

How can I fix this?

Best answer

LinkedIn prohibits scraping:

Prohibited Software and Extensions

LinkedIn is committed to keeping its members' data safe and its website free from fraud and abuse. In order to protect our members’ data and our website, we don't permit the use of any third party software, including "crawlers", bots, browser plug-ins, or browser extensions (also called "add-ons"), that scrapes, modifies the appearance of, or automates activity on LinkedIn’s website. Such tools violate the User Agreement, including, but not limited to, many of the "Don'ts" listed in Section 8.2…

It stands to reason that they may be actively blocking connections from Scrapinghub and similar services.
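The "111" in the error is the Linux errno for ECONNREFUSED, which means the TCP connection itself was rejected before any HTTP request was sent. As a rough way to see what that condition looks like, here is a minimal sketch using only the Python standard library. The helper name `probe_tcp` and its return strings are hypothetical, chosen for illustration; it is not part of Scrapy or Scrapinghub, and the demonstration only talks to the local machine.

```python
import errno
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt one TCP connection and classify the outcome.

    Hypothetical helper for illustration only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # errno.ECONNREFUSED is 111 on Linux, matching the "111" in the log.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Any other socket-level failure (DNS error, unreachable network, ...).
        return "error: " + errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__":
    # Local demonstration: open a listener on a free loopback port,
    # probe it, close it, then probe the same port again.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print(probe_tcp("127.0.0.1", port))  # listener is up
    listener.close()
    print(probe_tcp("127.0.0.1", port))  # nothing is listening any more
```

Running the same kind of probe from the local machine and from the hosted environment would show whether the refusal depends on where the connection originates, which is what the answer above suspects.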

Regarding python - Connection was refused by other side: 111: Connection refused, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52098291/
