java - Migrating from Commons HttpClient to HttpComponents Client

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 19:33:57

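I want to migrate from Commons HttpClient (3.x) to HttpComponents Client (4.x), but I am having trouble handling redirects. The code works under Commons HttpClient but breaks after the migration to HttpComponents Client. Some links get unwanted redirects, but when I set "http.protocol.handle-redirects" to "false", a large number of links stop working entirely.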

Commons HttpClient 3.x:

    private static HttpClient httpClient = null;
    private static MultiThreadedHttpConnectionManager connectionManager = null;
    private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);

        connectionManager = new MultiThreadedHttpConnectionManager();
        connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file
        connectionManager.getParams().setMaxTotalConnections(1000);
        httpClient = new HttpClient(connectionManager);
    }




    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {

        if (StringUtils.isEmpty(url))
            return null;

        GetMethod getMethod = new GetMethod(url);
        // Note: this local client shadows the pooled static httpClient declared above.
        HttpClient httpClient = new HttpClient();
        //configureMethod(getMethod);
        //ObjectInputStream oin = null;
        InputStream in = null;
        int code = -1;
        String html = "";
        String lastModified = null;
        try {
            code = httpClient.executeMethod(getMethod);

            in = getMethod.getResponseBodyAsStream();
            //oin = new ObjectInputStream(in);
            //html = getMethod.getResponseBodyAsString();
            html = CharStreams.toString(new InputStreamReader(in));
        }
        catch (Exception except) {
            // Failures are swallowed here; code stays -1 and html stays empty.
        }
        finally {
            try {
                //oin.close();
                in.close();
            }
            catch (Exception except) {}

            getMethod.releaseConnection();
            connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
        }

        if (code <= 400) {
            return html.replaceAll("\\s+", " ");
        } else {
            // IOException matches the declared throws clause.
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }

HttpComponents Client 4.x:

    private static HttpClient httpClient = null;
    private static HttpParams params = null;
    //private static MultiThreadedHttpConnectionManager connectionManager = null;
    private static ThreadSafeClientConnManager connectionManager = null;
    private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);

        connectionManager = new ThreadSafeClientConnManager();
        connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file
        connectionManager.setMaxTotal(1000);
        httpClient = new DefaultHttpClient(connectionManager);

        // HTTP parameters store headers etc.
        params = new BasicHttpParams();
        params.setParameter("http.protocol.handle-redirects", false);
    }




    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {

        if (StringUtils.isEmpty(url))
            return null;

        InputStream in = null;
        //int code = -1;
        String html = "";

        // Prepare a request object
        HttpGet httpget = new HttpGet(url);
        httpget.setParams(params);

        // Execute the request
        HttpResponse response = httpClient.execute(httpget);

        // The response status
        //System.out.println(response.getStatusLine());
        int code = response.getStatusLine().getStatusCode();

        // Get hold of the response entity
        HttpEntity entity = response.getEntity();

        // If the response does not enclose an entity, there is no need
        // to worry about connection release
        if (entity != null) {
            try {
                //code = httpClient.executeMethod(getMethod);

                //in = getMethod.getResponseBodyAsStream();
                in = entity.getContent();
                html = CharStreams.toString(new InputStreamReader(in));
            }
            catch (Exception except) {
                // IOException matches the declared throws clause; the cause is preserved.
                throw new IOException("URL: " + url + " returned response code " + code, except);
            }
            finally {
                try {
                    //oin.close();
                    in.close();
                }
                catch (Exception except) {}

                //getMethod.releaseConnection();
                connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS);
                connectionManager.closeExpiredConnections();
            }
        }

        if (code <= 400) {
            return html;
        } else {
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }

I do not want redirects, but under HttpClient 4.x, if I enable redirects I get some unwanted ones, for example http://www.walmart.com/ => http://mobile.walmart.com/. Under HttpClient 3.x, no such redirects occur.

What do I need to do to migrate from HttpClient 3.x to HttpClient 4.x without breaking the code?
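
The behaviour described in the question suggests treating "follow redirects" and "follow every redirect" as separate decisions. The following is a minimal sketch of that idea and is not part of the original post: with automatic redirects disabled, calling code could inspect the `Location` header of a 3xx response and decide per target whether to follow it. `RedirectFilter`, `resolveLocation`, and `isUnwantedRedirect` are hypothetical names, and the rule itself (flag targets whose host starts with `mobile.` or `m.`) is an assumption chosen for illustration. It uses only `java.net.URI`, so it runs without HttpClient on the classpath.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical helper: flags redirects that lead from a regular host to a mobile host. */
final class RedirectFilter {

    private RedirectFilter() {}

    /** Resolves a possibly relative Location header value against the requested URL. */
    static URI resolveLocation(String requestedUrl, String locationHeader) {
        return URI.create(requestedUrl).resolve(locationHeader);
    }

    /**
     * Assumed rule (not from the original post): a redirect is unwanted when the
     * target host looks like a mobile site and the requested host does not.
     */
    static boolean isUnwantedRedirect(String requestedUrl, String locationHeader) {
        String from = URI.create(requestedUrl).getHost();
        String to = resolveLocation(requestedUrl, locationHeader).getHost();
        if (from == null || to == null) {
            return false; // no host information, so nothing to compare
        }
        return isMobileHost(to) && !isMobileHost(from);
    }

    private static boolean isMobileHost(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        return h.startsWith("mobile.") || h.startsWith("m.");
    }

    public static void main(String[] args) {
        System.out.println(isUnwantedRedirect("http://www.walmart.com/", "http://mobile.walmart.com/"));
        System.out.println(isUnwantedRedirect("http://example.com/a", "/b"));
    }
}
```

Under this rule the redirect from the question is flagged, while a relative redirect on the same host is not. In HttpClient 4.1 and later, a check like this could plausibly be wired in through a custom `RedirectStrategy`; that wiring is not shown here.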

Best Answer

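This is not a problem with HttpClient 4.x. It is probably the way the target server handles the request: because the user agent is httpclient, the request may be treated as coming from a mobile device (the target server may treat any user agent other than well-known browsers such as Chrome or Mozilla as mobile).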

Set the user agent manually with the following code:

    // Send a browser-like User-Agent with every request made by this client.
    httpClient.getParams().setParameter(
        org.apache.http.params.HttpProtocolParams.USER_AGENT,
        "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"
    );
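
To make the answer's reasoning concrete, here is a hedged, self-contained sketch showing how the same request can receive a redirect or a page depending only on the `User-Agent` header. It is not the asker's code and does not use Apache HttpClient: it swaps in the JDK's `HttpURLConnection` and the JDK's built-in `com.sun.net.httpserver.HttpServer` so that it runs without third-party jars. The class name `UserAgentRedirectDemo`, the `/mobile` path, the user-agent strings, and the server rule (redirect anything whose user agent lacks `Mozilla`) are all assumptions for illustration; real sites such as the one in the question may decide differently.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: a local server that redirects based on the User-Agent header. */
final class UserAgentRedirectDemo {

    private UserAgentRedirectDemo() {}

    /** Assumed server rule: any client that does not claim to be Mozilla is sent to /mobile. */
    private static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            if (ua == null || !ua.contains("Mozilla")) {
                exchange.getResponseHeaders().set("Location", "/mobile");
                exchange.sendResponseHeaders(302, -1); // -1 means no response body
            } else {
                byte[] body = "desktop".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Sends one GET with the given User-Agent, without following redirects. */
    private static int statusFor(int port, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection(Proxy.NO_PROXY);
        conn.setInstanceFollowRedirects(false);
        conn.setRequestProperty("User-Agent", userAgent);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** Returns the status code the local server gives to each user agent, in order. */
    static int[] probe(String... userAgents) {
        HttpServer server = null;
        try {
            server = startServer();
            int port = server.getAddress().getPort();
            int[] codes = new int[userAgents.length];
            for (int i = 0; i < userAgents.length; i++) {
                codes[i] = statusFor(port, userAgents[i]);
            }
            return codes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        int[] codes = probe(
                "Apache-HttpClient/4.x", // illustrative library-style user agent
                "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");
        System.out.println("library-style user agent -> " + codes[0]);
        System.out.println("browser-style user agent -> " + codes[1]);
    }
}
```

Under these assumptions the library-style user agent receives a 302 and the browser-style one a 200, which mirrors the effect the answer attributes to the default HttpClient user agent.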

Regarding "java - Migrating from Commons HttpClient to HttpComponents Client", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10431561/
