java - Extracting values from an XML file using XPath, SAX, or DOM for this specific scenario

Reposted. Author: 行者123. Updated: 2023-11-30 04:37:32

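I am currently working on an academic project in Java with XML. The actual task is to parse the XML and pass the required values on for further processing, preferably in a HashMap. Here is a short snippet of the actual XML.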

<root>
<BugReport ID = "1">
<Title>"(495584) Firefox - search suggestions passes wrong previous result to form history"</Title>

<Turn>
<Date>'2009-06-14 18:55:25'</Date>
<From>'Justin Dolske'</From>
<Text>
<Sentence ID = "3.1"> Created an attachment (id=383211) [details] Patch v.2</Sentence>
<Sentence ID = "3.2"> Ah. So, there's a ._formHistoryResult in the....</Sentence>
<Sentence ID = "3.3"> The simple fix it to just discard the service's form history result.</Sentence>
<Sentence ID = "3.4"> Otherwise it's trying to use a old form history result that no longer applies for the search string.</Sentence>
</Text>
</Turn>

<Turn>
<Date>'2009-06-19 12:07:34'</Date>
<From>'Gavin Sharp'</From>
<Text>
<Sentence ID = "4.1"> (From update of attachment 383211 [details])</Sentence>
<Sentence ID = "4.2"> Perhaps we should rename one of them to _fhResult just to reduce confusion?</Sentence>
</Text>
</Turn>

<Turn>
<Date>'2009-06-19 13:17:56'</Date>
<From>'Justin Dolske'</From>
<Text>
<Sentence ID = "5.1"> (In reply to comment #3)</Sentence>
<Sentence ID = "5.2"> &amp;gt; (From update of attachment 383211 [details] [details])</Sentence>
<Sentence ID = "5.3"> &amp;gt; Perhaps we should rename one of them to _fhResult just to reduce confusion?</Sentence>
<Sentence ID = "5.4"> Good point.</Sentence>
<Sentence ID = "5.5"> I renamed the one in the wrapper to _formHistResult. </Sentence>
<Sentence ID = "5.6"> fhResult seemed maybe a bit too short.</Sentence>
</Text>
</Turn>

.....
and so on
</BugReport>

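Many commenters, such as "Justin Dolske", have commented on this report. What I am really looking for is the list of commenters together with all the sentences each of them wrote across the whole XML file. Something like if(from == justin dolske) getHisAllSentences(), and likewise for every other commenter. I have tried many different ways, using XPath, SAX, and DOM, to get the sentences for just "Justin Dolske" or another commenter, and even a generic form that covers all commenters, but failed. I am new to these technologies, including Java, and do not know how to achieve this.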

Can anyone show me specifically how to do this with the technologies above, or is there a better strategy?

(Note: later I want to put the result into a HashMap, for example HashMap (key, value), where key = the commenter's name (Justin Dolske) and value = all of that commenter's sentences.)

Urgent help would be greatly appreciated.

Best answer

There are several ways to achieve what you need.

  • One way is to use JAXB. There are a number of tutorials about it on the web, so feel free to refer to them.

  • You could also build a DOM, extract the data from it, and then put it into a HashMap.
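Since the question also mentions SAX, here is a rough sketch of how a streaming handler could build the same name-to-sentences map without loading the whole document. It is not part of the original answer. The class name SaxSentences and the SAMPLE string are hypothetical, and the sketch assumes that <From> appears before <Text> inside each <Turn> and that <Sentence> contains plain text only, as in the snippet from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch, not the original answer's code.
class SaxSentences extends DefaultHandler {

    // Hypothetical sample that mirrors the structure of the XML in the question.
    static final String SAMPLE =
          "<root><BugReport ID=\"1\">"
        + "<Turn><From>'Justin Dolske'</From><Text>"
        + "<Sentence ID=\"3.1\"> Patch v.2</Sentence>"
        + "<Sentence ID=\"3.2\"> The simple fix.</Sentence>"
        + "</Text></Turn>"
        + "<Turn><From>'Gavin Sharp'</From><Text>"
        + "<Sentence ID=\"4.1\"> Rename it?</Sentence>"
        + "</Text></Turn>"
        + "<Turn><From>'Justin Dolske'</From><Text>"
        + "<Sentence ID=\"5.4\"> Good point.</Sentence>"
        + "</Text></Turn>"
        + "</BugReport></root>";

    // Commenter name (as written in <From>) -> all sentences by that commenter.
    final Map<String, List<String>> sentencesByName = new LinkedHashMap<>();

    private final StringBuilder text = new StringBuilder();
    private String currentName; // name from the <From> of the current <Turn>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // start collecting text for the element that just opened
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if ("From".equals(qName)) {
            currentName = value;
        } else if ("Sentence".equals(qName) && currentName != null) {
            sentencesByName.computeIfAbsent(currentName, k -> new ArrayList<>()).add(value);
        } else if ("Turn".equals(qName)) {
            currentName = null; // do not carry a name over into the next Turn
        }
        text.setLength(0);
    }

    static Map<String, List<String>> parse(String xml) {
        SaxSentences handler = new SaxSentences();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.sentencesByName;
    }

    public static void main(String[] args) {
        parse(SAMPLE).forEach((name, sentences) ->
                System.out.println("Name: " + name + "\tTotal Sentences: " + sentences.size()));
    }
}
```

Note that the keys keep the single quotes that surround the names in the sample XML, so a lookup would use "'Justin Dolske'" rather than "Justin Dolske". For a file on disk, the same handler could be passed to SAXParser.parse together with a java.io.File.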

A reference implementation of the DOM approach looks like this:

import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XMLReader {

    // Maps each commenter name (the text of <From>) to all sentences that commenter wrote
    private HashMap<String, ArrayList<String>> namesSentencesMap;

    public XMLReader() {
        namesSentencesMap = new HashMap<String, ArrayList<String>>();
    }

    private Document getDocument(String fileName) {
        Document document = null;

        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File(fileName));
        } catch (Exception exe) {
            // Report the failure instead of silently swallowing it; the caller receives null
            System.err.println("Could not parse " + fileName + ": " + exe);
        }

        return document;
    }

    private void buildNamesSentencesMap(Document document) {
        if (document == null) {
            return;
        }

        // Get each Turn block
        NodeList turnList = document.getElementsByTagName("Turn");

        for (int turnIndex = 0; turnIndex < turnList.getLength(); turnIndex++) {
            Element turnElement = (Element) turnList.item(turnIndex);
            // Assumption: each <Turn> has one <From> element; a Turn without one is skipped
            Element fromElement = (Element) turnElement.getElementsByTagName("From").item(0);
            if (fromElement == null) {
                continue;
            }
            // The quotes around the name in the sample XML are kept as part of the key
            String fromName = fromElement.getTextContent().trim();
            // Extracting sentences - First check whether the map contains
            // an ArrayList corresponding to the name. If yes, then use that,
            // else create a new one
            ArrayList<String> sentenceList = namesSentencesMap.get(fromName);
            if (sentenceList == null) {
                sentenceList = new ArrayList<String>();
            }
            // Extract sentences from the Turn node
            NodeList sentenceNodeList = turnElement.getElementsByTagName("Sentence");
            for (int sentenceIndex = 0; sentenceIndex < sentenceNodeList.getLength(); sentenceIndex++) {
                sentenceList.add(((Element) sentenceNodeList.item(sentenceIndex)).getTextContent());
            }
            // Put the list back in the map
            namesSentencesMap.put(fromName, sentenceList);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = new XMLReader();
        reader.buildNamesSentencesMap(reader.getDocument("<your_xml_file>"));

        for (String name : reader.namesSentencesMap.keySet()) {
            System.out.println("Name: " + name + "\tTotal Sentences: " + reader.namesSentencesMap.get(name).size());
        }
    }
}

Note: this is only a demo; you will need to modify it to suit your needs. I wrote it against your XML to show one way of doing it.
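For the narrower case in the question, fetching the sentences of one commenter only, an XPath query is another possibility. The following is a minimal sketch and not part of the original answer. The class name XPathSentences, the helper sentencesFor, and the SAMPLE string are hypothetical. It assumes the names in <From> are wrapped in single quotes, as in the snippet from the question, and that the name passed in contains no double quote character.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch, not the original answer's code.
class XPathSentences {

    // Hypothetical sample that mirrors the structure of the XML in the question.
    static final String SAMPLE =
          "<root><BugReport ID=\"1\">"
        + "<Turn><From>'Justin Dolske'</From><Text>"
        + "<Sentence ID=\"3.1\"> Patch v.2</Sentence>"
        + "<Sentence ID=\"3.2\"> The simple fix.</Sentence>"
        + "</Text></Turn>"
        + "<Turn><From>'Gavin Sharp'</From><Text>"
        + "<Sentence ID=\"4.1\"> Rename it?</Sentence>"
        + "</Text></Turn>"
        + "<Turn><From>'Justin Dolske'</From><Text>"
        + "<Sentence ID=\"5.4\"> Good point.</Sentence>"
        + "</Text></Turn>"
        + "</BugReport></root>";

    // Returns every sentence, in document order, from Turns written by the given commenter.
    static List<String> sentencesFor(String xml, String name) {
        // The <From> text in the sample carries single quotes, so they are added to the
        // literal here. Assumption: 'name' itself contains no double quote character.
        String expression =
                "//Turn[normalize-space(From)=\"'" + name + "'\"]/Text/Sentence";

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes;
        try {
            nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }

        List<String> sentences = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            sentences.add(nodes.item(i).getTextContent().trim());
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String sentence : sentencesFor(SAMPLE, "Justin Dolske")) {
            System.out.println(sentence);
        }
    }
}
```

This evaluates one query per commenter, which suits a single lookup. To collect every commenter at once, a single pass that fills a map, as in the DOM code above, is likely the simpler choice.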

Regarding "java - Extracting values from an XML file using XPath, SAX, or DOM for this specific scenario", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13107624/
