java - Fetching page source with URLConnection.getInputStream() (amazon.de)

Reposted by: 行者123 · Updated: 2023-12-01 16:51:31

When I want to fetch the source code of a particular web page, I use the following code:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL("https://google.de");
URLConnection urlConnect = url.openConnection();
StringBuilder sb = new StringBuilder();
// getInputStream() is where the error occurs with the amazon URL
try (BufferedReader br = new BufferedReader(new InputStreamReader(urlConnect.getInputStream()))) {
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line).append('\n');
    }
}
String htmlData = sb.toString();

The code above works without problems, but when it is called with this URL...

URL url = new URL("https://amazon.de");

...then you sometimes get an IOException: the server returned HTTP response code 503. To me this makes no sense, because I can open the Amazon page in a browser without any error.

Best answer

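When you request https://amazon.de with curl -v https://amazon.de, the response carries either a 503 or a 301 status code (if you follow the redirect, you get a 503 from the referenced location https://www.amazon.de/). The body contains the following note: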

To discuss automated access to Amazon data please contact api-services-support@amazon.com. For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.de/ref=rm_5_sv, or our Product Advertising API at https://partnernet.amazon.de/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.

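I assume Amazon returns this response when it detects that your request comes from a non-browser context (for example by inspecting the User-Agent header), to nudge you toward using its APIs instead of scraping the site directly.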
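
To check that assumption from Java, a minimal sketch along the following lines may help. It sends an explicit User-Agent header and, instead of letting getInputStream() throw on a 503, reads the error stream so the server's note stays visible. The class name PageFetcher, the Result holder, the timeouts, and the User-Agent string in main are all illustrative assumptions, not part of the original question or answer, and nothing here guarantees that Amazon will accept the request.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    /** Status code plus body text of a single HTTP response (hypothetical helper type). */
    public static final class Result {
        public final int status;
        public final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /**
     * Requests the address with the given User-Agent header.
     * For 4xx/5xx responses the body is read from getErrorStream(),
     * so no IOException is thrown just because of the status code.
     */
    public static Result fetch(String address, String userAgent) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds

        try {
            int status = conn.getResponseCode();
            InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            StringBuilder sb = new StringBuilder();
            if (in != null) { // getErrorStream() returns null when there is no error body
                // UTF-8 is assumed here; the real charset is in the Content-Type header
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = br.readLine()) != null) {
                        sb.append(line).append('\n');
                    }
                }
            }
            return new Result(status, sb.toString());
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative browser-like value; any string can be passed here.
        String userAgent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";
        Result result = fetch("https://www.amazon.de/", userAgent);
        System.out.println("HTTP " + result.status);
        // Print only the start of the body to keep the output short.
        System.out.println(result.body.substring(0, Math.min(500, result.body.length())));
    }
}
```

Comparing the status returned for the default Java User-Agent with the one returned for a browser-like value would show whether the header is what triggers the 503. Even if a browser-like header gets through, the note in the response body points to the official APIs as the intended route for automated access.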

Regarding java - Fetching page source with URLConnection.getInputStream() (amazon.de), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39202405/
