java - Need to capture data from multiple web pages using Selenium

Reposted. Author: 行者123. Updated: 2023-12-01 08:54:35

Web page: http://www.forbes.com/companies/icbc/

package selenium;

import java.util.List;
import java.util.concurrent.TimeUnit;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.By.ByTagName;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.ie.InternetExplorerDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class ForbesTest {

WebDriver driver;
String url;


@Before
public void setUp() throws Exception {

System.setProperty("webdriver.ie.driver","D:\\IEDriverServer_x64_2.53.1\\IEDriverServer.exe");
driver=new InternetExplorerDriver();
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
url="http://www.forbes.com/companies/icbc/";
driver.get(url);
}

@After
public void tearDown() throws Exception {
// quit() closes every window and ends the session; a close() call after it would fail
driver.quit();
}

@Test
public void test() throws InterruptedException {
Thread.sleep(10000);
WebElement tab=driver.findElement(By.className("large"));
Thread.sleep(1000);
String text= tab.getText();
System.out.println(text);

WebElement col1=driver.findElement(By.tagName("dt"));
//Thread.sleep(1000);
String industry= col1.getText();
if(industry.equals("Industry")){
System.out.println(industry);

WebElement col2=driver.findElement(By.tagName("dd"));
//Thread.sleep(1000);
String industryName= col2.getText();
System.out.println(industryName);
}
String forbesWebsite= driver.getCurrentUrl();
System.out.println(forbesWebsite);
WebElement nextPage=driver.findElement(By.className("next-number"));
nextPage.click();
// No close() here: tearDown() already quits the driver after each test
}
}

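I want to capture the rank, company, country/region, sales, sales rank, profits, profits rank, assets, assets rank, market value, market value rank, industry, founding year, company website, employees, headquarters city, CEO name, the Forbes.com company profile page, and the year.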

Best answer

To get the industry text:

String industryName= driver.findElement(By.xpath("//*[contains(text(),'Industry')]//following::dd[1]")).getText();

To get the text for "Founded":

String founded= driver.findElement(By.xpath("//*[contains(text(),'Founded')]//following::dd[1]")).getText();

So you only need to replace the string with the label text you want, like this:

xpath = //*[contains(text(),'String')]//following::dd[1]
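The pattern above can be wrapped in a small helper so that each field does not need its own hand-written XPath. The following is only a minimal sketch: `FieldXPath`, `forLabel`, and `forLabels` are hypothetical names, not part of Selenium, and it assumes the page shows each label in an element that is followed by a `dd` holding the value, as the answer implies. Only the "Industry" and "Founded" labels come from the answer; the exact label text for the other fields would have to be checked against the page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Selenium): builds the answer's XPath for a given label.
class FieldXPath {

    // Returns the XPath of the first <dd> that follows an element whose text contains the label.
    static String forLabel(String label) {
        // A single quote would break the quoted XPath literal, so reject it in this sketch.
        if (label.contains("'")) {
            throw new IllegalArgumentException("label must not contain a single quote: " + label);
        }
        return "//*[contains(text(),'" + label + "')]//following::dd[1]";
    }

    // Builds one XPath per label, keeping the order in which the labels were given.
    static Map<String, String> forLabels(List<String> labels) {
        Map<String, String> xpaths = new LinkedHashMap<>();
        for (String label : labels) {
            xpaths.put(label, forLabel(label));
        }
        return xpaths;
    }

    public static void main(String[] args) {
        // "Industry" and "Founded" are the two labels used in the answer.
        Map<String, String> xpaths = forLabels(Arrays.asList("Industry", "Founded"));
        for (Map.Entry<String, String> entry : xpaths.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

With a live `WebDriver` such as the `driver` field in the question, each generated string would be passed to `driver.findElement(By.xpath(...)).getText()` to read the value for that label.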

Regarding "java - Need to capture data from multiple web pages using Selenium", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42129874/
