
java - How do I print only certain elements from a text file with XML tags to a new text file?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 14:39:01

I need help with something that sounds simple but is giving me some trouble.

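I have a text file (record.txt) that contains a root element "PatientRecord" and child tags ("first name", "age", blood type, address, etc.) that repeat over and over, but with different values, because the file holds one record per person. I only want to print the values between the tags to a new text file for each person, and only for the elements I choose. For example, of the tags mentioned above, I need only the name and age, not the rest of that patient's information. How can I print just those values, separated by commas, and then move on to the next patient? Here is my code so far: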

package patient.records;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class ProcessRecords {
    private static final String FILE = "C:\\Users\\Desktop\\records.txt";
    private static final String RECORD_START_TAG = "<PatientRecord>";
    private static final String RECORD_END_TAG = "</PatientRecord>";
    private static final String newFileName = "C:\\Users\\Desktop\\DataFolder\\";

    public static void main(String[] args) throws Exception {
        String scan;
        FileReader file = new FileReader(FILE);
        BufferedReader br = new BufferedReader(file);
        Writer writer = null;

        while ((scan = br.readLine()) != null) {
            if (scan.contains(RECORD_START_TAG)) {

                //This is the logic I am missing that will only grab the element values
                //between the tags inside of the file

                writer = new BufferedWriter(new OutputStreamWriter(
                        new FileOutputStream(newFileName + "Record Data" + ".txt"), "utf-8"));
            } else if (scan.contains(RECORD_END_TAG)) {
                writer.close();
                writer = null;
            } else {
                // only write if writer is not null
                if (writer != null) {
                    writer.write(scan);
                }
            }
        }
        br.close();
    }
} //This is the end of my code

The text file I am reading (record.txt) looks like this:

<PatientRecord> <---first patient record--->
<---XML Schema goes here--->
<Info>
<age>66</age>
<first_name>john</first_name>
<last_name>smith</last_name>
<mailing_address>200 main street</mailing_address>
<blood_type>AB</blood_type>
<phone_number>000-000-0000</phone_number>
</PatientRecord>
<PatientRecord> <---second patient record--->
<---XML Schema goes here--->
<Info>
<age>27</age>
<first_name>micheal</first_name>
<last_name>thompson</last_name>
<mailing_address>123 baker street</mailing_address>
<blood_type>O</blood_type>
<phone_number>111-222-3333</phone_number>
</PatientRecord>

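So, in theory, if I wanted to print only the values of the first name, mailing address, and blood type tags for every patient in this text file, the output should look like this: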

john, 200 main street, AB
//this line is blank
micheal, 123 baker street, O

Thank you for all your help. If you think my code should be changed, I am all for it. Thanks.

Best answer

My first instinct was to wrap the entire text in some outer tag and process it as XML, something like...

<Patients>
<PatientRecord> <---first patient record--->
<Info>
<age>66</age>
<first_name>john</first_name>
<last_name>smith</last_name>
<mailing_address>200 main street</mailing_address>
<blood_type>AB</blood_type>
<phone_number>000-000-0000</phone_number>
</PatientRecord>
...
</Patients>

But there are two problems with that...

One, <---first patient record---> is not a valid XML comment or text, and two, there is no closing </Info> tag... [sigh]
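To see both problems for yourself, you could feed a record to the JDK's DOM parser before and after cleaning it up. This is only a rough sketch: the class name WellFormedCheck and the method isWellFormed are made up for illustration, and the sample strings are shortened versions of the record from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original answer.
final class WellFormedCheck {

    /** Returns true if the JDK's DOM parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // The parser rejects the "<---" marker and the unclosed <Info> element
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String raw = "<PatientRecord> <---first patient record--->"
                + "<Info><age>66</age></PatientRecord>";
        String repaired = "<PatientRecord><Info><age>66</age></Info></PatientRecord>";

        System.out.println(isWellFormed(raw));      // false
        System.out.println(isWellFormed(repaired)); // true
    }
}
```

The default parser may also log a "[Fatal Error]" line to standard error for the rejected input; that is expected here.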

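So my next thought was to read each <PatientRecord> individually, as text, and then process it as XML...

Here is the catch: we need to remove anything surrounded by <--- ... --->, including the little arrows. That rests on a lot of assumptions, but hopefully we can overlook that...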

The next problem is that we need to insert the closing </Info> tag...
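In isolation, those two repair steps might look something like the sketch below. The class name RecordRepair and its methods are hypothetical, and it assumes that a marker never spans more than one line and that each record is missing exactly one </Info>. Using a regular expression here is a swapped-in alternative to plain substring handling, chosen because it copes with several markers on the same line.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class RecordRepair {

    // Matches one "<--- ... --->" marker; the reluctant ".*?" stops at the first "--->"
    private static final Pattern MARKER = Pattern.compile("<---.*?--->");

    private static final String RECORD_END_TAG = "</PatientRecord>";

    /** Removes every "<--- ... --->" marker from a single line. */
    static String stripMarkers(String line) {
        return MARKER.matcher(line).replaceAll("");
    }

    /** Inserts the missing </Info> just before the closing </PatientRecord>. */
    static String closeInfo(String record) {
        int end = record.lastIndexOf(RECORD_END_TAG);
        if (end < 0 || record.contains("</Info>")) {
            return record; // no end tag to anchor on, or <Info> is already closed
        }
        return record.substring(0, end) + "</Info>" + record.substring(end);
    }

    public static void main(String[] args) {
        String raw = "<PatientRecord> <---first patient record--->"
                + "<---XML Schema goes here--->"
                + "<Info><age>66</age></PatientRecord>";
        // prints: <PatientRecord> <Info><age>66</age></Info></PatientRecord>
        System.out.println(closeInfo(stripMarkers(raw)));
    }
}
```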

After that, it all becomes pretty simple...

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class Test {

    private static final String RECORD_START_TAG = "<PatientRecord>";
    private static final String RECORD_END_TAG = "</PatientRecord>";

    public static void main(String[] args) {
        File records = new File("Records.txt");
        try (BufferedReader br = new BufferedReader(new FileReader(records))) {
            StringBuilder record = null; // text of the record being collected, or null between records
            String text = null;
            while ((text = br.readLine()) != null) {

                // Cut out a "<--- ... --->" marker (assumes at most one per line)
                if (text.contains("<---") && text.contains("--->")) {
                    String start = text.substring(0, text.indexOf("<---"));
                    String end = text.substring(text.indexOf("--->") + 4);
                    text = start + end;
                }

                // Skip lines that are blank once the marker is gone
                if (text.trim().length() > 0) {
                    if (text.startsWith(RECORD_START_TAG)) {

                        record = new StringBuilder(128);
                        record.append(text);

                    } else if (record != null && text.startsWith(RECORD_END_TAG)) {

                        // Add the missing </Info> so the record is well-formed XML
                        record.append("</Info>");
                        record.append(text);

                        try (ByteArrayInputStream bais = new ByteArrayInputStream(record.toString().getBytes())) {

                            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(bais);
                            XPath xPath = XPathFactory.newInstance().newXPath();
                            XPathExpression exp = xPath.compile("PatientRecord/Info/first_name");
                            Node firstName = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            exp = xPath.compile("PatientRecord/Info/mailing_address");
                            Node address = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            exp = xPath.compile("PatientRecord/Info/blood_type");
                            Node bloodType = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            System.out.println(
                                    firstName.getTextContent() + ", "
                                    + address.getTextContent() + ", "
                                    + bloodType.getTextContent());

                        } catch (ParserConfigurationException | XPathExpressionException | SAXException ex) {
                            ex.printStackTrace();
                        }
                        record = null; // this record is done; wait for the next start tag

                    } else if (record != null) {

                        // Ignore stray text outside a record instead of failing on a null builder
                        record.append(text);

                    }
                }
            }
        } catch (IOException exp) {
            exp.printStackTrace();
        }
    }

}

This prints...

john, 200 main street, AB
micheal, 123 baker street, O
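The question also asks to pick the elements freely and to write the result to a text file rather than the console. Below is a minimal sketch of one way that could be done, assuming each record has already been repaired into well-formed XML. The class name RecordFields, the method extract, and the temp-file location are all hypothetical; the question's own DataFolder path would go in its place.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the original answer.
final class RecordFields {

    /** Returns the requested child values of PatientRecord/Info, joined with ", ". */
    static String extract(String recordXml, List<String> fields) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(recordXml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        List<String> values = new ArrayList<>();
        for (String field : fields) {
            // evaluate(String, Object) yields the element's text, or "" if it is absent
            values.add(xPath.evaluate("PatientRecord/Info/" + field, doc).trim());
        }
        return String.join(", ", values);
    }

    public static void main(String[] args) throws Exception {
        String record = "<PatientRecord><Info><age>66</age><first_name>john</first_name>"
                + "<mailing_address>200 main street</mailing_address>"
                + "<blood_type>AB</blood_type></Info></PatientRecord>";
        String line = extract(record, List.of("first_name", "mailing_address", "blood_type"));

        // Assumed output location for the sketch: a temp file
        Path out = Files.createTempFile("record-data", ".txt");
        Files.write(out, List.of(line), StandardCharsets.UTF_8);

        System.out.println(line); // john, 200 main street, AB
    }
}
```

Passing the element names as a list means switching to, say, name and age only needs a different list, not new XPath code.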

All in all, go back to whoever gave you this file, give them a slap, and tell them to convert it to valid XML...

Regarding "java - How do I print only certain elements from a text file with XML tags to a new text file?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25195086/
