
java - Extracting the image type from HTML using JSoup

Reposted. Author: 行者123. Updated: 2023-12-02 06:10:01

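How can I extract the image type from a URL using JSoup? I am parsing an HTML page and can get each image URL (using absUrl()). However, I need to determine the image's type. Right now everything ends up as .png, which is obviously wrong for most image types. Any ideas?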

Best answer

First, save the image to a file. Here is some code that can help you do that:

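import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class DownloadImages {

    // The URL of the website. This is just an example
    private static final String webSiteURL = "http://www.supercars.net/gallery/119513/2841/5.html";

    // The path of the folder that you want to save the images to
    private static final String folderPath = "<FOLDER PATH>";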

    public static void main(String[] args) {

        try {

            // Connect to the website and get the HTML
            Document doc = Jsoup.connect(webSiteURL).get();

            // Get all elements with the img tag
            Elements img = doc.getElementsByTag("img");

            for (Element el : img) {

                // For each element, get the absolute URL of the src attribute
                String src = el.absUrl("src");

                // absUrl returns an empty string when no absolute URL can be built; skip those
                if (src.isEmpty()) {
                    continue;
                }

                System.out.println("Image Found!");
                System.out.println("src attribute is : " + src);

                getImages(src);

            }

        } catch (IOException ex) {
            System.err.println("There was an error");
            Logger.getLogger(DownloadImages.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    private static void getImages(String src) throws IOException {

        // Drop a trailing slash, if any, so that the last path segment is the file name
        if (src.endsWith("/")) {
            src = src.substring(0, src.length() - 1);
        }

        // Extract the name of the image from the src attribute (without the leading "/")
        String name = src.substring(src.lastIndexOf("/") + 1);

        System.out.println(name);

        // Open a stream on the URL and copy it to disk byte by byte;
        // try-with-resources closes both streams even if the copy fails
        URL url = new URL(src);
        try (InputStream in = url.openStream();
             OutputStream out = new BufferedOutputStream(new FileOutputStream(folderPath + "/" + name))) {
            for (int b; (b = in.read()) != -1;) {
                out.write(b);
            }
        }
    }
}

Once the images are saved on disk, you can find their extensions like this (FilenameUtils comes from Apache Commons IO):

String extension = FilenameUtils.getExtension("/path/to/file/image.png");

When you are done, delete the files, also using Java.
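An extension taken from a file name only reflects how the file was named, not what it contains. If the saved files may carry the wrong extension, one option is to ask the JDK's ImageIO which reader recognizes the file's content. The following is a minimal sketch under that assumption; the class name ImageFormatProbe and the method detectFormat are hypothetical names chosen for illustration, and only formats for which an ImageIO reader is installed will be recognized.

```java
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.Locale;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

final class ImageFormatProbe {

    // Hypothetical helper: returns the lower-cased format name reported by the
    // first ImageIO reader that recognizes the file's content, or "" if none does.
    static String detectFormat(File file) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(file)) {
            if (in == null) {
                return "";
            }
            // Readers are matched against the leading bytes of the stream,
            // so the file name and extension play no role here
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                return "";
            }
            ImageReader reader = readers.next();
            try {
                return reader.getFormatName().toLowerCase(Locale.ROOT);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

After calling ImageFormatProbe.detectFormat(file) on a downloaded file, the file can be removed with file.delete() or java.nio.file.Files.deleteIfExists(file.toPath()).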

I'm not sure how to get the extension directly from the URL.
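One possible approach, offered here as a sketch rather than as part of the original answer, is to parse the URL string itself and read the extension from its last path segment. The class UrlExtension and the method extensionFromUrl are hypothetical names, not JSoup API. The result is only a hint: a URL may have no extension at all, or the server may return a different format than the name suggests.

```java
import java.net.URI;
import java.util.Locale;

final class UrlExtension {

    // Hypothetical helper (not part of JSoup): returns the lower-cased extension
    // of the last path segment of the URL, or "" when there is none.
    static String extensionFromUrl(String src) {
        String path;
        try {
            // URI.getPath() excludes the query string and the fragment
            path = URI.create(src).getPath();
        } catch (IllegalArgumentException ex) {
            // The string is not a syntactically valid URI
            return "";
        }
        if (path == null) {
            // Opaque URIs such as data: URLs have no path
            return "";
        }
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0 || dot == fileName.length() - 1) {
            // No dot, a leading dot only, or a trailing dot: treat as no extension
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

With the loop from the answer, this could be called as UrlExtension.extensionFromUrl(el.absUrl("src")), with no download required.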

Regarding java - Extracting the image type from HTML using JSoup, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21972317/
