
java - Skip all tags except the year tag when parsing an XML file

Reposted. Author: 行者123. Updated: 2023-12-01 17:21:44

I have an XML file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp-2019-11-22.dtd">
<dblp>
<phdthesis mdate="2016-05-04" key="phd/dk/Heine2010">
<author>Carmen Heine</author>
<title>Modell zur Produktion von Online-Hilfen.</title>
<year>2010</year>
<school>Aarhus University</school>
<pages>1-315</pages>
<isbn>978-3-86596-263-8</isbn>
<ee>http://d-nb.info/996064095</ee>
</phdthesis><phdthesis mdate="2020-02-12" key="phd/Hoff2002">
<author>Gerd Hoff</author>
<title>Ein Verfahren zur thematisch spezialisierten Suche im Web und seine Realisierung im Prototypen HomePageSearch</title>
<year>2002</year> ... (the file continues with more publication records)

From this file, I only want to extract the contents of the "year" tag. I have tried this code:

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Publications {
    String year1 = "YEAR";

    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler() {
                boolean year = false;

                // parser starts parsing a specific element inside the document
                public void startElement(String uri, String localName, String year1, Attributes attributes) throws SAXException {
                    System.out.println("Start Element :" + year1);
                    if (year1.equalsIgnoreCase("YEAR")) {
                        year = true;
                    }
                }

                // parser ends parsing the specific element inside the document
                public void endElement(String uri, String localName, String year1) throws SAXException {
                    System.out.println("End Element:" + year1);
                }

                // reads the text value of the currently parsed element
                public void characters(char ch[], int start, int length) throws SAXException {
                    if (year) {
                        System.out.println("Year : " + new String(ch, start, length));
                        year = false;
                    }
                }
            };
            saxParser.parse("dblp-2020-04-01.xml", handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The result is not what I expected. It prints details for every tag, not only the year tag:

Start Element :ee
End Element:ee
End Element:phdthesis
Start Element :phdthesis
Start Element :author
End Element:author
Start Element :title
End Element:title
Start Element :year
Year : 1990
End Element:year (...)

Can anyone suggest code that outputs only the contents of the "year" tag?

Best answer

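You can use the Declarative Stream Mapping (DSM) stream parsing library. It lets you declare which data you want to extract from the XML.

Here is an example mapping for this XML: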

result:
  path: /dblp/phdthesis/year
  type: array

Java code:

DSM dsm = new DSMBuilder(new File("path/to/mapping.yaml")).setType(DSMBuilder.TYPE.XML).create();
Object result = dsm.toObject(xmlFileContent);
// JSON representation of the result
dsm.getObjectMapper().writerWithDefaultPrettyPrinter().writeValue(System.out, result);

Result in JSON format:

 [ "2010", "2002" ]

Working example: https://repl.it/@MehmetFatihFat3/DSMFilterData1
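If you would rather stay with the JDK's SAX parser, the extra lines in the question's output appear to come from the two println calls in startElement and endElement, which run for every tag. Below is a minimal sketch of a handler that prints only the year values. The names YearExtractor, YearHandler and extractYears are made up for this illustration and are not part of the question or of DSM. It also buffers the text in a StringBuilder, because SAX is allowed to deliver the text of one element in several characters() calls. The sample parses a small inline string; for the real file you would pass an InputSource for dblp-2020-04-01.xml instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class YearExtractor {

    // Hypothetical handler: collects the text of every <year> element and ignores all other tags.
    static class YearHandler extends DefaultHandler {
        final List<String> years = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inYear = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("year".equalsIgnoreCase(qName)) {
                inYear = true;
                text.setLength(0); // start a fresh buffer for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append instead of printing here.
            if (inYear) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("year".equalsIgnoreCase(qName)) {
                years.add(text.toString().trim());
                inYear = false;
            }
        }
    }

    // Hypothetical helper: parses the given source and returns all year values in document order.
    static List<String> extractYears(InputSource source) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        YearHandler handler = new YearHandler();
        parser.parse(source, handler);
        return handler.years;
    }

    public static void main(String[] args) throws Exception {
        // Small inline sample modeled on the question's data.
        String xml = "<dblp>"
            + "<phdthesis><author>Carmen Heine</author><year>2010</year></phdthesis>"
            + "<phdthesis><author>Gerd Hoff</author><year>2002</year></phdthesis>"
            + "</dblp>";
        for (String year : extractYears(new InputSource(new StringReader(xml)))) {
            System.out.println("Year : " + year);
        }
    }
}
```

With the inline sample this prints "Year : 2010" and "Year : 2002" and nothing else, since no other callback writes to the console.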

For "java - Skip all tags except the year tag when parsing an XML file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61280069/
