
java - JSoup timeout not working as expected

Reposted. Author: 搜寻专家. Updated: 2023-11-01 03:49:22

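I am trying to download a page's content with JSoup. If the whole operation (opening the connection plus reading) takes more than 8 seconds, I want it to abort immediately. I assumed that this is exactly what the timeout(int millis) method is for. According to the javadoc: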

Set the request timeouts (connect and read). If a timeout occurs, an IOException will be thrown. The default timeout is 3 seconds (3000 millis). A timeout of zero is treated as an infinite timeout.

I wrote a simple piece of code to simulate that operation:

    final int TIME_OUT = 8000;
    final String USER_AGENT_STRING = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)";
    final String url = "http://reguler-pmb-tanggamus.va.web.id/";

    long time = System.currentTimeMillis();
    try {
        Document doc = Jsoup.connect(url).userAgent(USER_AGENT_STRING).timeout(TIME_OUT).get();
        System.out.println("Done crawling " + url + ", took " + (System.currentTimeMillis() - time) + " millis");
        System.out.println("Content: " + doc);
    } catch (Exception e) {
        System.out.println("Failed after " + (System.currentTimeMillis() - time) + " millis");
        e.printStackTrace();
    }

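I ran this small script in a single-threaded environment against a few "problematic" sites. I assumed that, whether it succeeded or an exception was caught, the operation would never take more than 8 seconds (8000 milliseconds). Unfortunately, that is not the case: sometimes it succeeds after more than a minute, with no exception at all: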

Done crawling http://reguler-pmb-tanggamus.va.web.id/, took 68215 millis
Content: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> ...

Sometimes (though rarely) it fails after more than a minute with a SocketTimeoutException.

Has anyone run into this problem?

Best answer

The problem the OP is facing appears to be a bug in Jsoup 1.8.3.

I was able to reproduce your finding. I would suggest you file a bug report @ github.com/jhy/jsoup/issues (luksch)

The OP filed an issue: https://github.com/jhy/jsoup/issues/628
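
Whatever the root cause turns out to be, one way to bound the total time regardless of how the library treats its own timeout is to run the fetch on a worker thread and wait for it with a deadline. The following is only a minimal sketch of that idea, not something from the original answer: the class HardDeadline and the helper callWithDeadline are hypothetical names, and a sleeping task stands in for the real Jsoup call so that the example runs without a network or the Jsoup jar.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HardDeadline {

    // Hypothetical helper (not part of Jsoup): runs the task on a worker thread
    // and stops waiting once the overall deadline has passed.
    static <T> T callWithDeadline(Callable<T> task, long deadlineMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "deadline-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // Blocks for at most deadlineMillis, then throws TimeoutException.
            return future.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption; a thread blocked in socket I/O may not react to it.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the slow fetch. With Jsoup on the classpath this could be:
        // () -> Jsoup.connect(url).userAgent(USER_AGENT_STRING).timeout(TIME_OUT).get()
        Callable<String> slowTask = () -> {
            Thread.sleep(2000);
            return "late";
        };

        long start = System.currentTimeMillis();
        try {
            callWithDeadline(slowTask, 200);
        } catch (TimeoutException e) {
            System.out.println("Aborted after " + (System.currentTimeMillis() - start) + " millis");
        }
    }
}
```

Note that this only frees the caller: the worker thread may keep reading from the socket in the background until the connection itself ends, because interrupting a thread does not necessarily unblock classic socket I/O. In a crawler that handles many slow sites, those lingering threads would need to be accounted for.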

Regarding java - JSoup timeout not working as expected, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32678890/
