java - Apache's HttpClient stops working when MaxPoolSize in PoolingHttpClientConnectionManager is too small

Reposted by: 行者123. Last updated: 2023-12-05 09:19:36

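I have a SwingWorker that takes a list of URLs (possibly hundreds). In its doInBackground() method it loops over these URLs, each time creating a CloseableHttpClient, managed by a PoolingHttpClientConnectionManager, and an HttpGet. The client executes the HttpGet and writes the content (an image) to a file (if possible! Not all URLs are valid; some return 404).

This works for about 100 requests, until the connectionManager's maxTotal (or defaultMaxPerRoute) is reached. Then everything stops: the client's execute call hangs and no exception is thrown.

So I thought I would set maxTotal and defaultMaxPerRoute to 1000 and see what happens. It works when I try to download 1500 images, but it feels very wrong! I want to reuse clients, not keep 1000 of them in a pool.

Socket and connection timeouts make no difference, debugging doesn't tell me what is happening, and creating new HttpClients without the PoolingHttpClientConnectionManager doesn't work either. Closing the client and/or the result doesn't help either.

How should I manage the clients or set up the pool so that my SwingWorker can download even thousands of images?

I'll try to reduce my SwingWorker code to the important parts (I almost forgot: it actually loops over a list of objects, each of which has 3 URLs):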

// this is how I init the connectionManager (outside the swing worker)
this.connectionManager = new PoolingHttpClientConnectionManager();
this.connectionManager.setMaxTotal(100);
this.connectionManager.setDefaultMaxPerRoute(100);

// this is where the images are downloaded in the swing worker
@Override
protected Void doInBackground() throws Exception{
    for(MyUrls myUrls : myUrlsList){
        client = HttpClients.custom().setConnectionManager(connectionManager).build();         
        for(MyImage image : myUrls.getImageList()){
            File outputFile = null;
            HttpEntity entity = null;
            switch(image.getImageSize()){
                case 1:
                    HttpGet httpGet = new HttpGet(image.getUrl());
                    httpGet.setConfig(RequestConfig.custom().setSocketTimeout(1000).setConnectTimeout(1000).build());  // doesn't change anything
                    response = client.execute(httpGet);
                    if(response.getStatusLine().getStatusCode() >= 200 && response.getStatusLine().getStatusCode() < 300){
                        entity = response.getEntity();
                        if(entity != null){
                            String contentType = entity.getContentType().getValue();
                            String extension = "." + (contentType.contains("/") ? contentType.substring(contentType.indexOf("/") + 1) : "jpg");
                            outputFile = new File(image.getName()+extension);
                        }
                    }else{
                        System.err.println("download of "+image.getName()+" failed: " + response.getStatusLine().getStatusCode());
                    }
                    break;
                case 2:
                   // the other cases are pretty much the same
            }
            if(entity != null && outputFile != null){
                try(InputStream inputStream = entity.getContent(); FileOutputStream outputStream = new FileOutputStream(outputFile)){
                    byte[] buffer = new byte[1024];
                    int bytesRead;
                    while((bytesRead = inputStream.read(buffer)) != -1){
                        outputStream.write(buffer, 0, bytesRead);
                    }
                }
            }
        }
    }
    return null; // a Void return type still needs an explicit return
}

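Best answer

You are probably leaking connections, because you only handle them properly when you get an OK response (in the 200 to 300 range) that can be dumped to the file output stream.

TL;DR: always dispose of the entity

Disposing of the entity (the "content" of the response) is the only way to release the connection object. To do so, in a finally clause for every request/response cycle, you can call EntityUtils.consume(responseEntity), and/or close the response object (response.close()), and/or read the entity stream to its end or close it (stream.close()).

Your code does consume the response stream, but only in some cases (mainly when the request succeeds). In those cases it uses the API correctly, but in the others (for example a 404 response) you never consume or release the entity.

You can also use the ResponseHandler variant of the API, which takes care of all this for you. I recommend using it.

If you don't always do this, HttpClient considers the connection active, so it is never released and never reused. At some point you end up with an empty connection pool or a system error (for example, too many open files).

Note that this applies not only to 200 OK responses but to every call (even a 404 usually has a response body, and if you don't consume that body the connection is not released).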
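To make the "silent stop" concrete, here is a minimal sketch of the leak using only the JDK. It is a toy model, not HttpClient's implementation: a `Semaphore` stands in for the connection pool, each permit plays the role of one pooled connection, and the names `PoolLeakDemo` and `run` are invented for illustration. As far as I know, in HttpClient 4.x the wait for a free pooled connection is governed by a separate connection request timeout (`RequestConfig.Builder#setConnectionRequestTimeout`), not by the socket or connect timeouts, which would explain why the timeouts in the question change nothing.

```java
import java.util.concurrent.Semaphore;

// Toy stand-in for a connection pool (hypothetical model, not HttpClient code):
// each permit plays the role of one pooled connection.
final class PoolLeakDemo {

    /**
     * Simulates `requests` sequential requests against a pool of `poolSize`
     * connections. Every request is treated like a 404 whose body is never read.
     * Returns how many requests managed to lease a connection.
     */
    static int run(int requests, int poolSize, boolean releaseInFinally) {
        Semaphore pool = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            // A real pool may wait here for a free connection;
            // the toy simply stops once nothing is free.
            if (!pool.tryAcquire()) {
                break; // pool exhausted: every connection is still leased
            }
            try {
                served++;
                // ... look at the status code, decide not to download ...
            } finally {
                if (releaseInFinally) {
                    pool.release(); // plays the role of response.close()
                }
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("leaky:    " + run(10, 3, false)); // prints "leaky:    3"
        System.out.println("released: " + run(10, 3, true));  // prints "released: 10"
    }
}
```

With the release skipped, the run stalls as soon as the number of requests reaches the pool size, which matches the "works for about 100 requests" symptom when maxTotal is 100. Releasing in `finally` lets a pool of 3 serve any number of requests.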

Tips from the user guide

I suggest reading the "Fundamentals" section of the user guide: https://hc.apache.org/httpcomponents-client-ga/tutorial/html/fundamentals.html

Here is an example of request execution process in its simplest form:

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    <...>
} finally {
    response.close();
}

The very important part is the close() in the finally clause.

1.1.5. Ensuring release of low level resources

In order to ensure proper release of system resources one must close either the content stream associated with the entity or the response itself

And:

1.1.6. Consuming entity content

The recommended way to consume the content of an entity is by using its HttpEntity#getContent() or HttpEntity#writeTo(OutputStream) methods. HttpClient also comes with the EntityUtils class, which exposes several static methods to more easily read the content or information from an entity. Instead of reading the java.io.InputStream directly, one can retrieve the whole content body in a string / byte array by using the methods from this class. However, the use of EntityUtils is strongly discouraged unless the response entities originate from a trusted HTTP server and are known to be of limited length.

About the response handler variant:

1.1.8. Response handlers

The simplest and the most convenient way to handle responses is by using the ResponseHandler interface, which includes the handleResponse(HttpResponse response) method. This method completely relieves the user from having to worry about connection management.
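The reason a handler-style API removes the burden can be sketched in a few lines of plain Java. This is a hypothetical miniature, not the real ResponseHandler interface: `HandlerPatternDemo`, `execute` and `availableConnections` are invented names, a `Semaphore` again stands in for the pool, and an `int` status code stands in for the response. The point is only that the component that leases the connection also owns the `finally` block, so the caller's code cannot forget to release.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical miniature of the response-handler idea; all names are invented.
final class HandlerPatternDemo {
    private final Semaphore pool;

    HandlerPatternDemo(int poolSize) {
        this.pool = new Semaphore(poolSize);
    }

    /** Leases a connection, hands a fake status code to the handler, always returns the lease. */
    <T> T execute(int statusCode, Function<Integer, T> handler) {
        pool.acquireUninterruptibly();
        try {
            return handler.apply(statusCode);
        } finally {
            pool.release(); // runs even if the handler throws or ignores the response
        }
    }

    int availableConnections() {
        return pool.availablePermits();
    }

    public static void main(String[] args) {
        HandlerPatternDemo client = new HandlerPatternDemo(2);
        int saved = 0;
        for (int i = 0; i < 100; i++) {
            int status = (i % 2 == 0) ? 200 : 404; // half of the fake requests fail
            boolean ok = client.execute(status, code -> code >= 200 && code < 300);
            if (ok) {
                saved++;
            }
        }
        System.out.println(saved);                         // prints 50
        System.out.println(client.availableConnections()); // prints 2
    }
}
```

A pool of 2 serves all 100 calls because no call can return without giving its lease back, whatever the handler does with the status.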

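Tips from Stack Overflow

You can also read: Why did the author use EntityUtils.consume(httpEntity);?

Side note

The point of having a multithreaded connection manager is that you can have one (and only one) client and share it across your whole application. You don't need to build a client per request and have them all share a connection manager, although you can.
Having a single client can help if you use caching, cookies, authentication and so on; those features will not work if you create a new client for every request.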

1.2.1. HttpClient thread safety

HttpClient implementations are expected to be thread safe. It is recommended that the same instance of this class is reused for multiple request executions.
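As a rough illustration of why one shared instance is enough, the sketch below lets several worker threads draw from a single shared pool. It is a JDK-only toy under the same assumptions as before: the `Semaphore` models the pool, `SharedPoolDemo` and `peakLeases` are invented names, and the one-millisecond sleep is a placeholder for network I/O. It does not exercise HttpClient itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of many threads sharing one pooled resource (not HttpClient code).
final class SharedPoolDemo {

    /** Runs `tasks` jobs on `threads` workers; returns the peak number of simultaneous leases. */
    static int peakLeases(int tasks, int threads, int poolSize) {
        Semaphore sharedPool = new Semaphore(poolSize);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                futures.add(workers.submit(() -> {
                    sharedPool.acquire();
                    try {
                        peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(1); // placeholder for network I/O
                    } finally {
                        inUse.decrementAndGet();
                        sharedPool.release(); // always hand the lease back
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task and surface failures
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int peak = peakLeases(40, 8, 3);
        System.out.println(peak <= 3); // prints true
    }
}
```

All 40 tasks finish even though only 3 leases exist, and the number of simultaneous leases never exceeds the pool size. A small pool is therefore not a problem in itself, as long as every lease is returned.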

Regarding "java - Apache's HttpClient stops working when MaxPoolSize in PoolingHttpClientConnectionManager is too small", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40134171/
