
java - Exception in thread "main" java.net.MalformedURLException: no protocol error while finding broken links in a page using Selenium and Java

Reposted. Author: 行者123. Updated: 2023-11-30 05:18:33

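I am trying to find the broken links in a page using Selenium (Java), but I ran into a problem: the following exception stops the code from running. The code counts all the links on the page and then retrieves the URL of each link. Please take a look at the problem and suggest a solution.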

Exception in thread "main" java.net.MalformedURLException: no protocol: 
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at fire.Weil.main(Weil.java:57)

My code is:

package fire;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class Weil {

    public static void main(String[] args) throws MalformedURLException, IOException {

        System.setProperty("webdriver.gecko.driver", "C:\\Users\\sumitk\\Downloads\\Selenium Drivers\\Gecodriver\\geckodriver.exe");
        WebDriver driver = new FirefoxDriver();

        //delete all cookies
        driver.manage().deleteAllCookies();

        //dynamic wait
        driver.manage().timeouts().pageLoadTimeout(30, TimeUnit.SECONDS);
        driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);

        //open site
        driver.get("https://www.weil.com/");

        //1. get the list of all the links and images
        List<WebElement> linklist = driver.findElements(By.tagName("a"));
        linklist.addAll(driver.findElements(By.tagName("img")));

        System.out.println("Size of full links and images--->" + linklist.size());

        List<WebElement> activeLinks = new ArrayList<WebElement>();

        //2. iterate linklist: exclude all the links/images that do not have an href attribute
        for (int i = 0; i < linklist.size(); i++)
        {
            System.out.println(linklist.get(i).getAttribute("href"));
            if (linklist.get(i).getAttribute("href") != null)
            {
                activeLinks.add(linklist.get(i));
            }
        }

        //get the size of active links list.
        System.out.println("Size of active links and images--->" + activeLinks.size());

        //3. check the href url, with httpconnection api.
        for (int j = 0; j < activeLinks.size(); j++)
        {
            HttpURLConnection connection = (HttpURLConnection) new URL(activeLinks.get(j).getAttribute("href")).openConnection();
            connection.connect();
            String response = connection.getResponseMessage();
            connection.disconnect();
            System.out.println(activeLinks.get(j).getAttribute("href") + " --->" + response);
        }
    }

}

Best answer

This error message...

Exception in thread "main" java.net.MalformedURLException: no protocol:

...implies that your program is trying to access a URL that has no protocol, i.e. neither HTTP nor HTTPS is present.
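To make the failure concrete, here is a minimal sketch that reproduces the message without Selenium. The class name NoProtocolDemo and the helper parseError are made up for illustration; only the java.net.URL constructor is taken from the code in the question.

```java
import java.net.MalformedURLException;
import java.net.URL;

class NoProtocolDemo {

    // Hypothetical helper: returns null when the string parses as a URL,
    // otherwise the message of the MalformedURLException that was thrown.
    static String parseError(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // An empty href value: the same message as in the stack trace above.
        System.out.println(parseError(""));                            // no protocol: 
        // A string without a scheme fails the same way.
        System.out.println(parseError("www.weil.com/people"));         // no protocol: www.weil.com/people
        // A full http(s) URL parses without an exception.
        System.out.println(parseError("https://www.weil.com/people")); // null
    }
}
```

Nothing follows the colon in the exception message from the question, which suggests that the string passed to new URL(...) was empty.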

Your logic was nearly perfect. A few remarks:

  • It is possible that some of the <a> elements within the webpage https://www.weil.com/ have an href attribute with no value assigned. For example:

    • <a class="canvas-button ss-icon" href="">?</a>
    • <a class="search-button ss-icon" href="">Search</a>
  • An empty href is not null, so these elements still pass the href != null check and are counted by this line:

    System.out.println("Size of active links and images--->"+ activeLinks.size());
    //prints: Size of active links and images--->72
  • But if you print the href attributes:

    for(int i=0; i<activeLinks.size(); i++)
    System.out.println(activeLinks.get(i).getAttribute("href"));
  • The first two lines are blank, as follows:

    <blank>
    <blank>
    https://www.weil.com/
    https://www.weil.com/
    https://www.weil.com/people
  • I made a few simple tweaks to your code, as follows:

    • Replaced findElements(By.tagName("a")) with findElements(By.xpath("//a[contains (@href, 'weil')]"))
    • Replaced findElements(By.tagName("img")) with findElements(By.xpath("//img[contains (@src, 'weil')]"))
  • The execution results are as follows:

    • Code block:

      import java.io.IOException;
      import java.net.HttpURLConnection;
      import java.net.URL;
      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.List;

      import org.openqa.selenium.By;
      import org.openqa.selenium.WebDriver;
      import org.openqa.selenium.WebElement;
      import org.openqa.selenium.chrome.ChromeDriver;
      import org.openqa.selenium.chrome.ChromeOptions;

      public class A_Chrome_Demo {

          public static void main(String[] args) throws IOException {
              System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
              ChromeOptions options = new ChromeOptions();
              options.addArguments("start-maximized");
              options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
              options.setExperimentalOption("useAutomationExtension", false);
              WebDriver driver = new ChromeDriver(options);
              driver.get("https://www.weil.com/");
              // Only elements whose href/src contains 'weil' are collected, so empty href values are left out
              List<WebElement> linklist = driver.findElements(By.xpath("//a[contains (@href, 'weil')]"));
              linklist.addAll(driver.findElements(By.xpath("//img[contains (@src, 'weil')]")));
              System.out.println("Size of full links and images--->" + linklist.size());
              List<WebElement> activeLinks = new ArrayList<WebElement>();
              for (int i = 0; i < linklist.size(); i++)
              {
                  // img elements carry src rather than href, so they print null here and are dropped
                  System.out.println(linklist.get(i).getAttribute("href"));
                  if (linklist.get(i).getAttribute("href") != null)
                      activeLinks.add(linklist.get(i));
              }
              System.out.println("Size of active links and images--->" + activeLinks.size());
              // Request each remaining href and print the HTTP response message
              for (int j = 0; j < activeLinks.size(); j++)
              {
                  HttpURLConnection connection = (HttpURLConnection) new URL(activeLinks.get(j).getAttribute("href")).openConnection();
                  connection.connect();
                  String response = connection.getResponseMessage();
                  connection.disconnect();
                  System.out.println(activeLinks.get(j).getAttribute("href") + " --->" + response);
              }
              // Close the browser once all links have been checked
              driver.quit();
          }
      }
    • Console output:

      Size of full links and images--->46
      https://www.weil.com/about-weil
      https://extranet.weil.com/
      https://login.weil.com/
      https://www.weil.com/articles/weil-elects-16-new-partners-and-announces-new-counsel-class-2019
      https://www.weil.com/articles/weil-announces-weil-legal-innovators-program
      https://www.weil.com/articles/weil-partners-receive-top-honors-in-2019
      https://www.weil.com/articles/two-weil-partners-named-among-turnarounds-workouts-outstanding-restructuring-lawyers-for-2019
      https://careers.weil.com/
      https://www.weil.com/articles/weil-wins-five-2019-law360-practice-group-of-the-year-awards
      https://www.weil.com/articles/weil-earns-2020-litigation-department-of-the-year-honorable-mention-from-the-american-lawyer
      https://www.weil.com/articles/weil-leads-three-of-the-five-top-bankruptcy-cases-of-2019
      https://www.weil.com/about-weil/about-weil-prominent-matters
      https://www.weil.com/articles/weil-represented-french-state-in-landmark-privatization-and-ipo-of-francaise-des-jeux
      https://www.weil.com/articles/weil-litigators-clinch-four-win-week-showcasing-cross-departmental-strengths
      https://www.weil.com/articles/weil-advised-guggenheim-securities-and-morgan-stanley-on-jack-in-the-boxs-1-3b-securitization
      https://www.weil.com/about-weil/not-for-profit
      https://www.weil.com/articles/weil-secures-asylum-for-burkina-faso-native-escaping-persecution
      https://www.weil.com/articles/weils-2019-pro-bono-annual-review-our-finest-hours
      https://www.weil.com/articles/weil-and-nysba-task-force-deliver-report-on-wrongful-convictions-in-new-york-state
      https://www.weil.com/about-weil/diversity-and-inclusion
      https://www.weil.com/articles/weil-named-a-2020-best-place-to-work-for-lgbtq-equality
      https://www.weil.com/articles/three-weil-partners-named-best-practitioners-in-their-fields
      http://business-finance-restructuring.weil.com/
      http://eurorestructuring.weil.com/
      http://privateequity.weil.com/
      http://governance.weil.com/
      http://product-liability.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/
      https://tax.weil.com/latest-thinking/cryptoassets-hmrc-uk-tax-net-widens/
      http://business-finance-restructuring.weil.com/automatic-stay/denial-of-stay-relief-is-a-final-order-says-the-u-s-supreme-court/
      http://business-finance-restructuring.weil.com/news/weil-wins-five-2019-law360-practice-group-of-the-year-awards/
      https://www.weil.com/about-weil/green-policy
      https://www.weil.com/about-weil/sitemap
      https://www.weil.com/about-weil/privacy-policy
      https://www.weil.com/about-weil/privacy-shield-notice
      https://www.weil.com/about-weil/regulatory-information
      https://www.weil.com/about-weil/disclaimer
      null
      null
      null
      Size of active links and images--->43
      https://www.weil.com/about-weil --->OK
      https://extranet.weil.com/ --->OK
      https://login.weil.com/ --->OK
      https://www.weil.com/articles/weil-elects-16-new-partners-and-announces-new-counsel-class-2019 --->OK
      https://www.weil.com/articles/weil-announces-weil-legal-innovators-program --->OK
      https://www.weil.com/articles/weil-partners-receive-top-honors-in-2019 --->OK
      https://www.weil.com/articles/two-weil-partners-named-among-turnarounds-workouts-outstanding-restructuring-lawyers-for-2019 --->OK
      https://careers.weil.com/ --->OK
      https://www.weil.com/articles/weil-wins-five-2019-law360-practice-group-of-the-year-awards --->OK
      https://www.weil.com/articles/weil-earns-2020-litigation-department-of-the-year-honorable-mention-from-the-american-lawyer --->OK
      https://www.weil.com/articles/weil-leads-three-of-the-five-top-bankruptcy-cases-of-2019 --->OK
      https://www.weil.com/about-weil/about-weil-prominent-matters --->OK
      https://www.weil.com/articles/weil-represented-french-state-in-landmark-privatization-and-ipo-of-francaise-des-jeux --->OK
      https://www.weil.com/articles/weil-litigators-clinch-four-win-week-showcasing-cross-departmental-strengths --->OK
      https://www.weil.com/articles/weil-advised-guggenheim-securities-and-morgan-stanley-on-jack-in-the-boxs-1-3b-securitization --->OK
      https://www.weil.com/about-weil/not-for-profit --->OK
      https://www.weil.com/articles/weil-secures-asylum-for-burkina-faso-native-escaping-persecution --->OK
      https://www.weil.com/articles/weils-2019-pro-bono-annual-review-our-finest-hours --->OK
      https://www.weil.com/articles/weil-and-nysba-task-force-deliver-report-on-wrongful-convictions-in-new-york-state --->OK
      https://www.weil.com/about-weil/diversity-and-inclusion --->OK
      https://www.weil.com/articles/weil-named-a-2020-best-place-to-work-for-lgbtq-equality --->OK
      https://www.weil.com/articles/three-weil-partners-named-best-practitioners-in-their-fields --->OK
      http://business-finance-restructuring.weil.com/ --->Forbidden
      http://eurorestructuring.weil.com/ --->Forbidden
      http://privateequity.weil.com/ --->Forbidden
      http://governance.weil.com/ --->Forbidden
      http://product-liability.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/ --->Forbidden
      https://tax.weil.com/latest-thinking/cryptoassets-hmrc-uk-tax-net-widens/ --->Forbidden
      http://business-finance-restructuring.weil.com/automatic-stay/denial-of-stay-relief-is-a-final-order-says-the-u-s-supreme-court/ --->Forbidden
      http://business-finance-restructuring.weil.com/news/weil-wins-five-2019-law360-practice-group-of-the-year-awards/ --->Forbidden
      https://www.weil.com/about-weil/green-policy --->OK
      https://www.weil.com/about-weil/sitemap --->OK
      https://www.weil.com/about-weil/privacy-policy --->OK
      https://www.weil.com/about-weil/privacy-shield-notice --->OK
      https://www.weil.com/about-weil/regulatory-information --->OK
      https://www.weil.com/about-weil/disclaimer --->OK
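The XPath filter above only fits pages whose links contain 'weil'. A more general option, sketched here as an assumption and not as part of the original answer, is to filter the href strings themselves before calling new URL(...), and to judge each link by its numeric status code. The class LinkCheckSketch, its method names, the 5-second timeouts and the rule that any status of 400 or above counts as broken are all illustrative choices.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

class LinkCheckSketch {

    // True only for absolute http(s) URLs; null, "", "mailto:..." and
    // "javascript:..." values are skipped instead of reaching new URL(...).
    static boolean isCheckable(String href) {
        if (href == null) {
            return false;
        }
        String lower = href.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Assumption: any 4xx or 5xx status is reported as broken.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    // Sends a HEAD request and returns the numeric HTTP status code.
    static int statusOf(String href) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(href.trim()).openConnection();
        connection.setRequestMethod("HEAD");
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // URLs are taken from the command line; with no arguments nothing is requested.
        for (String href : args) {
            if (!isCheckable(href)) {
                System.out.println(href + " ---> skipped");
                continue;
            }
            int code = statusOf(href);
            System.out.println(href + " ---> " + code + (isBroken(code) ? " (broken)" : ""));
        }
    }
}
```

In the Selenium loop, isCheckable(element.getAttribute("href")) could take the place of the != null test. Under this rule the Forbidden (403) responses in the output above would be reported as broken, although a 403 can also mean that the server rejected this particular client while the page itself still exists.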

Reference

You can find a relevant detailed discussion in:

Regarding java - Exception in thread "main" java.net.MalformedURLException: no protocol error while finding broken links in a page using Selenium and Java, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59929164/
