javax.net.ssl.HttpsURLConnection returns gibberish

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 14:17:19

I am writing a simple HTTPS client that pulls down the HTML of a web page over HTTPS. I can connect to the page without problems, but the HTML I get back is gibberish.

public String GetWebPageHTTPS(String URI){
    BufferedReader read;
    URL inputURI;
    String line;
    String renderedPage = "";
    try{
        inputURI = new URL(URI);
        HttpsURLConnection connect;
        connect = (HttpsURLConnection)inputURI.openConnection();
        connect.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401");
        read = new BufferedReader (new InputStreamReader(connect.getInputStream()));
        while ((line = read.readLine()) != null)
            renderedPage += line;
        read.close();
    }
    catch (MalformedURLException e){
        e.printStackTrace();
    }
    catch (IOException e){
        e.printStackTrace();
    }
    return renderedPage;
}

When I pass it a string such as https://kat.ph/, it returns roughly 10,000 characters of gibberish.

Edit: here is my code, modified to accept self-signed certificates, but I still get what looks like an encrypted stream:

public String GetWebPageHTTPS(String URI){
    TrustManager[] trustAllCerts = new TrustManager[] {
        new X509TrustManager() {
            public java.security.cert.X509Certificate[] getAcceptedIssuers() {
                return null;
            }
            public void checkClientTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            }
            public void checkServerTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            }
        }
    };
    try {
        SSLContext sc = SSLContext.getInstance("SSL");
        sc.init(null, trustAllCerts, new java.security.SecureRandom());
        HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
    } catch (GeneralSecurityException e) {
    }
    try {
        System.out.println("URI: " + URI);
        URL url = new URL(URI);
    } catch (MalformedURLException e) {
    }
    BufferedReader read;
    URL inputURI;
    String line;
    String renderedPage = "";
    try{
        inputURI = new URL(URI);
        HttpsURLConnection connect;
        connect = (HttpsURLConnection)inputURI.openConnection();
        read = new BufferedReader (new InputStreamReader(connect.getInputStream()));
        while ((line = read.readLine()) != null)
            renderedPage += line;
        read.close();
    }
    catch (MalformedURLException e){
        e.printStackTrace();
    }
    catch (IOException e){
        e.printStackTrace();
    }
    return renderedPage;
}

Best answer

"Is it compressed? stackoverflow.com/questions/8249522/…" (comment by Mahesh Guruswamy)
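A quick way to test that suspicion is to look at the first two bytes of the raw response body: every gzip stream begins with the magic bytes 0x1f 0x8b. The sketch below is only an illustration; the class GzipSniffer and its methods are hypothetical names that do not appear in the original post, and the gzip helper exists only so the check can be tried without a network call.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not from the original post) for spotting gzip data. */
final class GzipSniffer {

    /** Returns true when the data starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksGzipped(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /** Compresses text in memory, standing in for a gzip-encoded response body. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed (and fully flushed) before the bytes are read.
        return buffer.toByteArray();
    }
}
```

If looksGzipped returns true for the bytes read from connect.getInputStream(), the "gibberish" is compressed HTML rather than anything TLS-related, since HttpsURLConnection has already decrypted the stream by that point.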

Yes, it turned out the response was simply gzip-compressed. This is how I solved it:

// Imports needed by this method:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.zip.GZIPInputStream;

public String GetWebPageGzipHTTP(String URI){
    String html = "";
    try {
        URLConnection connect = new URL(URI).openConnection();
        BufferedReader in = null;
        connect.setReadTimeout(10000);
        connect.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401");
        // Decompress only when the server reports a gzip-encoded body.
        if ("gzip".equals(connect.getHeaderField("Content-Encoding"))){
            in = new BufferedReader(new InputStreamReader(new GZIPInputStream(connect.getInputStream())));
        } else {
            in = new BufferedReader(new InputStreamReader(connect.getInputStream()));
        }
        String inputLine;
        // Note: readLine() strips line terminators, so the page is joined into one line.
        while ((inputLine = in.readLine()) != null){
            html+=inputLine;
        }
        in.close();
        return html;
    } catch (Exception e) {
        // On any failure, return whatever was read so far (possibly an empty string).
        return html;
    }
}
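The same idea can be sketched a little more defensively. The version below is an illustration under stated assumptions, not the author's code: the class GzipAwareFetcher and its methods are hypothetical names, UTF-8 is assumed as the page charset (the original relies on the platform default), and the request explicitly sends Accept-Encoding: gzip, which the original does not. It reads raw bytes rather than lines, so line breaks in the page are preserved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not from the original post). */
final class GzipAwareFetcher {

    /** Reads the whole body, decompressing it first when contentEncoding is "gzip". */
    static String decodeBody(InputStream raw, String contentEncoding, Charset charset) {
        try (InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fetches a page, assuming UTF-8 content (an assumption, not from the post). */
    static String fetch(String uri) throws IOException {
        URLConnection connect = new URL(uri).openConnection();
        connect.setReadTimeout(10000);
        // Tell the server that a gzip-encoded body is acceptable.
        connect.setRequestProperty("Accept-Encoding", "gzip");
        return decodeBody(connect.getInputStream(),
                connect.getContentEncoding(), StandardCharsets.UTF_8);
    }

    /** Compresses text in memory so decodeBody can be tried without a network call. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Calling GzipAwareFetcher.fetch("https://kat.ph/") would correspond to the call in the question. Keeping decodeBody separate from the network code means the decompression path can be exercised with an in-memory stream.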

Regarding "javax.net.ssl.HttpsURLConnection returns gibberish", the corresponding question on Stack Overflow is: https://stackoverflow.com/questions/16611447/
