
java - Extracting data from a web page with Java

Reposted. Author: 太空宇宙. Updated: 2023-11-04 14:08:35

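I am working on a project that involves collecting job offers from the web. As a first step, I want to extract data (job-offer data) from a specific web page, so I would like to know whether there is an API or existing code that can help me.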

Best answer

For example, you can make the request with Apache HttpClient and parse the response with Jsoup:

    import java.io.IOException;

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.impl.client.HttpClientBuilder;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.HTTP;
    import org.apache.http.util.EntityUtils;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class ... {

        Document doc;

        // The method name is illustrative; it only wraps the statements so they compile.
        // url and params are supplied by the caller.
        void extract(String url, String params) throws IOException {
            // Send a GET request and read the response body as a UTF-8 string.
            HttpClient client = HttpClientBuilder.create().build();
            HttpGet requestGet = new HttpGet(url + params);
            HttpResponse response = client.execute(requestGet);
            HttpEntity entity = response.getEntity();
            String responseString = EntityUtils.toString(entity, "UTF-8");

            /*
             * Here you can retrieve the information with the Jsoup library.
             * This example extracts data from a table element.
             */
            doc = Jsoup.parse(responseString);
            // get(1) selects the second <table> on the page; it throws
            // IndexOutOfBoundsException if the page has fewer than two tables.
            Element elementsByTag = doc.getElementsByTag("table").get(1);

            Elements rows = elementsByTag.getElementsByTag("tr");
            for (Element row : rows) {
                // TODO: read the cells of each row, e.g. row.getElementsByTag("td")
            }
        }
    }
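
The snippet above passes `url + params` to `HttpGet` without showing how `params` is built. One possible way, sketched here with the standard library only, is to URL-encode each key and value and join them into a query string. The class name `QueryStringSketch`, the method `buildParams`, and the example URL and parameters are all hypothetical and not part of the original answer; `URLEncoder.encode(String, Charset)` requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the "params" suffix that is appended to the base URL.
public class QueryStringSketch {

    // Returns "?key=value&key2=value2" with keys and values encoded as UTF-8
    // (form encoding, so a space becomes "+"). Returns "" for an empty map.
    public static String buildParams(Map<String, String> query) {
        if (query.isEmpty()) {
            return "";
        }
        StringJoiner joined = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> entry : query.entrySet()) {
            joined.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return joined.toString();
    }

    private static String encode(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Example values only; a real job site will expect its own parameter names.
        Map<String, String> query = new LinkedHashMap<>();
        query.put("q", "java developer");
        query.put("location", "New York");
        String url = "https://example.com/jobs";
        System.out.println(url + buildParams(query));
    }
}
```

The resulting string can be used in place of `params` when constructing the `HttpGet`. A `LinkedHashMap` is used so the parameters keep their insertion order.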

Regarding "java - Extracting data from a web page with Java", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28603012/
