
java - Accessing a WebElement from the previous page in history throws an error

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:01:53

I have a website where I need to scrape pages that are nested four levels deep.

Level 1
    --->Level 2
        --->Level 3
            --->Level 4
        --->Level 3
            --->Level 4
    --->Level 2

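So I have to go back and forth, visiting every level 4 page under every level 3 page, every level 3 page under every level 2 page, and every level 2 page under level 1.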

To do that, I wrote nested loops:

List<WebElement> chapters = driver.findElements(By.xpath("/html[1]/body[1]/div[2]/div[1]/div[4]/div[3]/div[1]/div[1]/table[1]/tbody[1]/tr[*]/td[3]/a"));
for(WebElement chapter: chapters)
{
    String chapter_name = chapter.getText();
    String chapter_url = chapter.getAttribute("href");

    System.out.println("CHAPTER : " + chapter_name + "URL : " + chapter_url);
    driver.get(chapter_url);

    List<WebElement> topics = driver.findElements(By.xpath("/html[1]/body[1]/div[2]/div[1]/div[4]/div[3]/div[1]/div[1]/table[1]/tbody[1]/tr[*]/td[3]/a"));
    for(WebElement topic: topics)
    {
        String topic_name = topic.getText();
        String topic_url = topic.getAttribute("href");

        System.out.println("\tTOPIC : " + topic_name + "URL : " + topic_url);
        driver.get(topic_url);
        List<WebElement> sub_topics = driver.findElements(By.xpath("/html[1]/body[1]/div[2]/div[1]/div[4]/div[3]/div[1]/div[1]/table[1]/tbody[1]/tr[*]/td[3]/a"));
        for(WebElement sub_topic : sub_topics)
        {
            String sub_topic_name = sub_topic.getText();
            String sub_topic_url = sub_topic.getAttribute("href");

            System.out.println("\t\tSUBTOPIC : " + sub_topic_name + "URL : " + sub_topic_url);
            driver.get(sub_topic_url);
            List<WebElement> problems = driver.findElements(By.xpath("/html[1]/body[1]/div[2]/div[1]/div[4]/div[3]/div[1]/div[1]/table[1]/tbody[1]/tr[*]/td[3]/a"));
            for(WebElement problem : problems)
            {
                System.out.println("\t\t\t"+problem.getText());
            }
            driver.navigate().back();
        }
        driver.navigate().back();
    }
    driver.navigate().back();
}

But I get the following exception:

Exception in thread "main" org.openqa.selenium.NoSuchElementException: Web element reference not seen before: dcbb0aef-d165-4450-964c-535fc4577f69
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '3.14.0', revision: 'aacccce0', time: '2018-08-02T20:05:20.749Z'
System info: host: 'workstation', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '4.15.0-39-generic', java.version: '1.8.0_181'
Driver info: org.openqa.selenium.firefox.FirefoxDriver
Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 63.0.3, javascriptEnabled: true, moz:accessibilityChecks: false, moz:geckodriverVersion: 0.23.0, moz:headless: false, moz:processID: 13651, moz:profile: /tmp/rust_mozprofile.gx46rW..., moz:useNonSpecCompliantPointerOrigin: false, moz:webdriverClick: true, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, platformVersion: 4.15.0-39-generic, rotatable: false, setWindowRect: true, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
Session ID: 55d3e16e-5920-414d-b047-a24f5483a2c7
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.openqa.selenium.remote.http.W3CHttpResponseCodec.createException(W3CHttpResponseCodec.java:187)
at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:122)
at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:49)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:158)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:548)
at org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:276)
at org.openqa.selenium.remote.RemoteWebElement.getText(RemoteWebElement.java:160)
at firstTest.Getlinks.main(Getlinks.java:52)

This is probably because navigating back reloads the page, so the state is lost. What is the solution or best practice in this situation?
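The suspected cause can be illustrated with a toy model that needs no browser. This is only a sketch and not how Selenium is implemented: `FakeBrowser`, `Handle`, and `snapshotTexts` are hypothetical names invented for this example. The assumption it encodes is that an element handle is valid only for the page load that produced it, so any navigation (including going back) invalidates handles obtained earlier, while plain strings copied out beforehand stay usable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not Selenium): element handles become unusable after a page load. */
class StaleHandleDemo {

    /** Hypothetical stand-in for a browser; every load starts a new generation. */
    static final class FakeBrowser {
        private int generation = 0;
        private List<String> linkTexts = new ArrayList<>();

        /** Simulates get(url) or navigate().back(): the page content is rebuilt. */
        void load(List<String> texts) {
            generation++;
            linkTexts = new ArrayList<>(texts);
        }

        /** Returns handles bound to the current page generation. */
        List<Handle> findLinks() {
            List<Handle> handles = new ArrayList<>();
            for (int i = 0; i < linkTexts.size(); i++) {
                handles.add(new Handle(this, generation, i));
            }
            return handles;
        }
    }

    /** Hypothetical stand-in for a WebElement reference. */
    static final class Handle {
        private final FakeBrowser browser;
        private final int generation;
        private final int index;

        Handle(FakeBrowser browser, int generation, int index) {
            this.browser = browser;
            this.generation = generation;
            this.index = index;
        }

        String getText() {
            // A handle from an earlier page load is rejected, even if the same
            // content has been loaded again in the meantime.
            if (generation != browser.generation) {
                throw new IllegalStateException("handle belongs to an earlier page load");
            }
            return browser.linkTexts.get(index);
        }
    }

    /** Copies every text out while the handles are still valid. */
    static List<String> snapshotTexts(List<Handle> handles) {
        List<String> texts = new ArrayList<>();
        for (Handle h : handles) {
            texts.add(h.getText());
        }
        return texts;
    }

    public static void main(String[] args) {
        FakeBrowser browser = new FakeBrowser();
        browser.load(Arrays.asList("Chapter 1", "Chapter 2"));
        List<Handle> chapters = browser.findLinks();
        List<String> chapterTexts = snapshotTexts(chapters);

        browser.load(Arrays.asList("Topic 1"));                // follow a link
        browser.load(Arrays.asList("Chapter 1", "Chapter 2")); // go back

        System.out.println(chapterTexts);                      // the strings are still usable
        try {
            chapters.get(1).getText();                         // the old handle is not
        } catch (IllegalStateException e) {
            System.out.println("Old handle rejected: " + e.getMessage());
        }
    }
}
```

In the model, the first loop iteration succeeds and the second one fails, which matches the stack trace above: the failure is raised from `getText()` on an element that was located before the navigation.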

Best answer

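It is definitely the back navigation. Every time you navigate back you get a freshly loaded page, and the elements you stored earlier can no longer be interacted with. I noticed that all of your XPaths fetch links (they are identical, by the way), so I modified the code in a way that may solve your problem: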

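import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

// Assumes these members live in a class with an initialized WebDriver field named `driver`.
private static final By XPATH = By.xpath("/html[1]/body[1]/div[2]/div[1]/div[4]/div[3]/div[1]/div[1]/table[1]/tbody[1]/tr[*]/td[3]/a");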

public void testMethod() {
    // Copy the link texts out immediately; only Strings survive navigation.
    List<WebElement> chapters = driver.findElements(XPATH);
    List<String> chapterTexts = getTextsFromElements(chapters);

    scanChapters(chapterTexts);
}

private List<String> getTextsFromElements(List<WebElement> els) {
    List<String> texts = new ArrayList<>();
    for (WebElement el : els) {
        texts.add(el.getText());
    }
    return texts;
}

private void scanChapters(List<String> chapterTexts) {
    for (String chapterText : chapterTexts) {
        // Re-locate the link on the current page instead of reusing an old WebElement.
        WebElement chapter = driver.findElement(By.linkText(chapterText));
        String chapter_url = chapter.getAttribute("href");
        System.out.println("CHAPTER : " + chapterText + "URL : " + chapter_url);
        driver.get(chapter_url);

        List<WebElement> topics = driver.findElements(XPATH);
        List<String> topicTexts = getTextsFromElements(topics);
        scanTopics(topicTexts);

        driver.navigate().back();
    }
}

private void scanTopics(List<String> topicTexts) {
    for (String topicText : topicTexts) {
        // Look the link up again by its text after every back navigation.
        WebElement topic = driver.findElement(By.linkText(topicText));
        String topic_url = topic.getAttribute("href");
        System.out.println("\tTOPIC : " + topicText + "URL : " + topic_url);
        driver.get(topic_url);

        List<WebElement> sub_topics = driver.findElements(XPATH);
        List<String> subTopicTexts = getTextsFromElements(sub_topics);
        scanSubTopics(subTopicTexts);

        driver.navigate().back();
    }
}

private void scanSubTopics(List<String> subTopicTexts) {
    for (String subTopicText : subTopicTexts) {
        WebElement subTopic = driver.findElement(By.linkText(subTopicText));
        String sub_topic_url = subTopic.getAttribute("href");
        System.out.println("\t\tSUBTOPIC : " + subTopicText + "URL : " + sub_topic_url);
        driver.get(sub_topic_url);

        // The problems are read before any further navigation, so these elements are still valid.
        List<WebElement> problems = driver.findElements(XPATH);
        for (WebElement problem : problems) {
            System.out.println("\t\t\t" + problem.getText());
        }

        driver.navigate().back();
    }
}
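A different approach from the answer's (not something the answer proposes) is to copy each link's text and href into plain strings as soon as a page is loaded, and then open every child page directly by URL, so no back navigation or re-lookup by link text is needed. The sketch below shows only the traversal and runs without a browser. `UrlCrawler`, `Link`, and the `linkExtractor` hook are hypothetical names introduced for this example. With Selenium, the hook would presumably call `driver.get(url)`, then `driver.findElements(XPATH)`, and copy `getText()` and `getAttribute("href")` into `Link` objects before returning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Depth-limited crawl that keeps only Strings between page loads. */
class UrlCrawler {

    /** Plain-data copy of an anchor; no live browser reference is retained. */
    static final class Link {
        final String text;
        final String url;

        Link(String text, String url) {
            this.text = text;
            this.url = url;
        }
    }

    // Hypothetical hook: loads the page at a URL and returns its links as plain data.
    private final Function<String, List<Link>> linkExtractor;
    private final List<String> log = new ArrayList<>();

    UrlCrawler(Function<String, List<Link>> linkExtractor) {
        this.linkExtractor = linkExtractor;
    }

    /** Loads url, records its links, and follows them while depth is below maxDepth. */
    void crawl(String url, int depth, int maxDepth) {
        // Copy the result so that later page loads cannot affect this list.
        List<Link> links = new ArrayList<>(linkExtractor.apply(url));
        for (Link link : links) {
            log.add(indent(depth) + link.text + " -> " + link.url);
            if (depth < maxDepth) {
                crawl(link.url, depth + 1, maxDepth);
            }
        }
    }

    List<String> getLog() {
        return log;
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append('\t');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In-memory stand-in for the site so the sketch runs without a browser.
        Map<String, List<Link>> site = new HashMap<>();
        site.put("/index", Arrays.asList(new Link("Chapter 1", "/c1")));
        site.put("/c1", Arrays.asList(new Link("Topic 1", "/c1/t1")));

        UrlCrawler crawler = new UrlCrawler(
                url -> site.getOrDefault(url, Collections.<Link>emptyList()));
        // Depths 0 to 2 (chapters, topics, subtopics) are followed; links found
        // at depth 3 (problems) are only recorded.
        crawler.crawl("/index", 0, 3);
        crawler.getLog().forEach(System.out::println);
    }
}
```

The trade-off is that every page is requested by URL rather than reached through the browser history, which assumes the pages can be opened directly without state carried over from the previous page.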

For "java - Accessing a WebElement from the previous page in history throws an error", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53362801/
