java - Browser does not support frames

Reposted. Author: 行者123. Updated: 2023-12-02 08:14:48

I am trying to write a Java program that logs in to an Achievo instance, using screen scraping.

I managed to log in with the following code:

@Test
public void testLogin() throws Exception {
    HashMap<String, String> data = new HashMap<String, String>();
    data.put("auth_user", "user");
    data.put("auth_pw", "password");
    doSubmit("https://someurl.com/achievo/index.php", data);
}

private void doSubmit(String url, HashMap<String, String> data) throws Exception {
    URL siteUrl = new URL(url);
    HttpsURLConnection conn = (HttpsURLConnection) siteUrl.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setDoInput(true);
    //conn.setRequestProperty( "User-agent", "spider" );
    //conn.setRequestProperty("User-agent", "Opera/9.80 (X11; Linux i686; U; en) Presto/2.7.62 Version/11.01");

    conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 1.2.30703)");

    DataOutputStream out = new DataOutputStream(conn.getOutputStream());

    // Build the form body as key=value pairs joined by '&'
    Set<String> keys = data.keySet();
    Iterator<String> keyIter = keys.iterator();
    StringBuilder content = new StringBuilder("");
    for (int i = 0; keyIter.hasNext(); i++) {
        Object key = keyIter.next();
        if (i != 0) {
            content.append("&");
        }
        content.append(key + "=" + URLEncoder.encode(data.get(key), "UTF-8"));
    }
    System.out.println(content.toString());

    out.writeBytes(content.toString());
    out.flush();
    out.close();

    // Print the response body line by line
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line = "";
    while ((line = in.readLine()) != null) {
        System.out.println(line);
    }
    in.close();
}

However, when the Achievo login succeeds, I am redirected to the main page, which returns:

<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<title>Achievo</title>
</head>
<frameset rows="113,*" frameborder="0" border="0">
<frame name="top" scrolling="no" noresize src="top.php?atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
<frameset cols="210,*" frameborder="0" border="0">
<frame name="menu" scrolling="no" noresize src="menu.php?atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
<frame name="main" scrolling="auto" noresize src="dispatch.php?atknodetype=pim.pim&atkaction=pim&atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
</frameset>
<noframes>
<body bgcolor="#CCCCCC" text="#000000">
<p>Your browser doesnt support frames, but this is required to run Achievo</p>
</body>
</noframes>
</frameset>

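So what I get is "Your browser doesnt support frames, but this is required to run Achievo".

I tried accessing the dispatch.php frame directly, since that is probably the one I want, but it reports that my session has expired and that I need to log in again.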

Is there a way to fake a frame? Or to somehow keep the connection open, change the URL, and try to fetch the dispatch.php frame?

Using HtmlUnit, I did the following:

WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3);
HtmlPage page = webClient.getPage("https://someurl.com/index.php");
System.out.println(page.asXml());

List<HtmlForm> forms = page.getForms();
assertTrue(forms != null && !forms.isEmpty());

HtmlForm form = forms.get(0);
HtmlSubmitInput submit = form.getInputByName("login");
HtmlInput inputUsername = form.getInputByName("auth_user");
HtmlInput inputPw = form.getInputByName("auth_pw");

inputUsername.setValueAttribute("foo");
inputPw.setValueAttribute("bar");

HtmlPage page2 = submit.click();

CookieManager cookieManager = webClient.getCookieManager();
Set<Cookie> cookies = cookieManager.getCookies();
System.out.println("Is cookie " + cookieManager.isCookiesEnabled());

for (Cookie cookie : cookies) {
    System.out.println(cookie.toString());
}

System.out.println(page2.asXml());
webClient.closeAllWindows();

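Here I fetched the form, submitted it, and got the same message back. When I print the cookies, I can see that I do have one. The question now is: how do I fetch the dispatch.php frame using the login cookie?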

Best answer

This kind of scraping is a little tricky, and there are a few things to consider.

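  1. Does the Achievo application set any cookies? If so, you will need to accept them and send them along with the next request.
  2. By the looks of it, you will need to parse that HTML page and extract the frame you want to load. I suspect you get the session-expired message because you are not sending the cookie, or something similar. Make sure you use the exact URL given in the frameset.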
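
The second point can be sketched roughly as follows, using only the JDK. This is a minimal illustration, not Achievo-specific code: the class name FrameLinks and the method extract are hypothetical, and the regular expression assumes that each frame tag lists name before src with double-quoted values, as in the frameset shown in the question. A real HTML parser would be more robust.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Achievo or of any library.
class FrameLinks {
    // Matches <frame ... name="x" ... src="y"; the \b after "frame" skips <frameset> tags.
    // Assumption: name comes before src and both values are double-quoted.
    private static final Pattern FRAME = Pattern.compile(
            "<frame\\b[^>]*?\\bname=\"([^\"]*)\"[^>]*?\\bsrc=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns frame name -> absolute URL, resolved against the URL of the frameset page.
    static Map<String, URI> extract(String html, URI pageUrl) {
        Map<String, URI> frames = new LinkedHashMap<>();
        Matcher m = FRAME.matcher(html);
        while (m.find()) {
            // Undo HTML escaping of '&' in case the page encodes it as &amp;
            String src = m.group(2).replace("&amp;", "&");
            // Keeps the query string (including the session parameter) exactly as given.
            // URI.resolve throws IllegalArgumentException if src is not a valid URI.
            frames.put(m.group(1), pageUrl.resolve(src));
        }
        return frames;
    }
}
```

With the response body of the login request in a string, FrameLinks.extract(body, URI.create("https://someurl.com/achievo/index.php")).get("main") would give the absolute dispatch.php URL to request next, in the same session.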

I suggest using the Apache HttpClient module. It is more full-featured than the standard Java URL provider and can manage things like cookies for you.
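
If adding a dependency is not an option, the JDK's own java.net.CookieManager is a possible alternative to HttpClient for the cookie part. It is a different technique from the one recommended above. The sketch below is only an illustration: the class SessionCookies and both method names are hypothetical, and accepting all cookies is an assumption made for simplicity.

```java
import java.io.IOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; only JDK classes are used.
class SessionCookies {
    // Installs a JVM-wide cookie store. URLConnection-based HTTP requests made
    // afterwards store received cookies in it and send them back automatically.
    static CookieManager install() {
        // Assumption: accept every cookie; a stricter CookiePolicy may be preferable.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    // Returns the Cookie header value that would be sent to the given URL,
    // or an empty string when no stored cookie applies. Useful for debugging.
    static String cookieHeaderFor(CookieManager manager, URI url) throws IOException {
        Map<String, List<String>> headers = manager.get(url, Collections.emptyMap());
        List<String> values = headers.get("Cookie");
        return (values == null || values.isEmpty()) ? "" : String.join("; ", values);
    }
}
```

Calling SessionCookies.install() once before the login request means that a later openConnection() on the dispatch.php URL from the frameset should carry the session cookie, which is the behaviour the question is missing.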

Regarding "java - Browser does not support frames", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/6662960/
