java - SAX parser - cannot split an XML file into files of a specified size

Reposted. Author: 行者123. Updated: 2023-12-02 12:27:54

I am having trouble reading an XML file and splitting it into multiple files with a SAX parser. Assume the input is generated XML of the following form:

<?xml version="1.0" encoding="utf-8"?>
<record-table>
<record>
<record_id>12345</record_id>
<record_rows>
<record_row>str1234</record_row>
</record_rows>
</record>
<footer>
<record_count>12345</record_count>
<record_row_count>12345</record_row_count>
</footer>
</record-table>

To keep things clean and tidy, I made a TODO list:

XML splitting:
* Splits file generated by XML generation functionality in multiple files of configurable size.
* Asks the user for the XML file location.
* Asks the user for the maximum size of a single file, in bytes.
* Each split file must conform to schema.
* Elements record_count and record_row_count should contain actual numbers for each file.
* Files should be split as close to specified limit as possible.
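
The schema and footer requirements in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the asker's code: `ChunkSketch`, `buildChunk` and `sizeInBytes` are hypothetical names, each output file is assumed to hold exactly one record (so `record_count` is written as 1), and row values are assumed to contain no characters that need XML escaping. Size is measured in UTF-8 bytes because the limit is given in bytes, not characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ChunkSketch {

    // Hypothetical helper: renders one complete output file for a single record.
    // Assumes one record per file and row values that need no XML escaping.
    static String buildChunk(String recordId, List<String> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n")
          .append("<record-table>\n")
          .append("<record>\n")
          .append("<record_id>").append(recordId).append("</record_id>\n")
          .append("<record_rows>\n");
        for (String row : rows) {
            sb.append("<record_row>").append(row).append("</record_row>\n");
        }
        sb.append("</record_rows>\n")
          .append("</record>\n")
          .append("<footer>\n")
          // The footer carries the actual counts for this file only.
          .append("<record_count>1</record_count>\n")
          .append("<record_row_count>").append(rows.size()).append("</record_row_count>\n")
          .append("</footer>\n")
          .append("</record-table>\n");
        return sb.toString();
    }

    // Size of the rendered file in bytes when written as UTF-8.
    static int sizeInBytes(String chunk) {
        return chunk.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String chunk = buildChunk("12345", Arrays.asList("str1234", "str5678"));
        System.out.println(chunk);
        System.out.println("bytes: " + sizeInBytes(chunk));
    }
}
```

A splitter could call `sizeInBytes` on the candidate chunk before adding each row and start a new file once the next row would push it over the limit; that policy is a design choice rather than something the requirements spell out.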

So far I have tried many times to read the file; the program runs but does nothing.

Draft code:

    public static void splitXML(File fileToSplit, int splitFileSize) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            XMLReader reader = parser.getXMLReader();

            reader.parse(new InputSource(new FileInputStream(fileToSplit)));
            reader.setContentHandler(new DefaultHandler() {

                public static final String DIRECTORY = "target/results";

                private int fileSize = 0;

                private File fileLocation;

                // counts number of files created
                private int fileCount = 0;

                // counts characters to decide where to split file
                private long charCount = 0;
                // data line buffer (is reset when the file is split)
                private StringBuilder recordRowDataLines = new StringBuilder();

                // temporary variables used for the parser events
                private String currentElement = null;
                private String currentRecordId = null;
                private String currentRecordRowData = null;

                public final long TAG_CHAR_SIZE = 5;

                @Override
                public void startDocument() throws SAXException {
                    File directory = new File(DIRECTORY);
                    if (!directory.exists())
                        directory.mkdir();
                }

                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
                    currentElement = qName;
                }

                @Override
                public void endElement(String uri, String localName, String qName) throws SAXException {
                    if (qName.equals("record_rows")) {
                        try {
                            savePatch();
                        } catch (IOException e) {
                            throw new SAXException(e);
                        }
                    }
                    if (qName.equals("record_row")) { // one record finished - save in buffer & calculate size so far
                        charCount += tagSize("record_row");
                        recordRowDataLines.append("<record_row>")
                                .append(currentRecordRowData)
                                .append("</record_row>");
                        if (charCount >= fileSize) { // if max size was reached, save what was read so far in a new file
                            try {
                                savePatch();
                            } catch (IOException ex) {
                                throw new SAXException(ex);
                            }
                        }
                    }
                    currentElement = null;
                }

                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    System.out.println(new String(ch, start, length));
                    if (currentElement == null) {
                        return;
                    }
                    if (currentElement.equals("record_id")) {
                        currentRecordId = new String(ch, start, length);
                    }
                    if (currentElement.equals("record_row")) {
                        currentRecordRowData = new String(ch, start, length);
                        charCount += currentRecordRowData.length(); // storing size so far
                    }
                }

                public long tagSize(String tagName) {
                    return TAG_CHAR_SIZE + tagName.length() * 2; // size of text + tags
                }

                public void savePatch() throws IOException {
                    ++fileCount;
                    StringBuilder stringBuilder = new StringBuilder();
                    stringBuilder.append("<record part='")
                            .append(fileCount)
                            .append("'><record_id>")
                            .append(currentRecordId)
                            .append("</record_id>")
                            .append("<record_rows>")
                            .append(recordRowDataLines)
                            .append("</record_rows></record>");
                    File fragment = new File(DIRECTORY, "data_part_" + fileCount + ".xml");
                    System.out.println("File " + fragment.getAbsolutePath() + "has been saved!");

                    try (FileWriter out = new FileWriter(fragment)) {
                        out.write(stringBuilder.toString());
                    } catch (Exception e) {
                        e.printStackTrace();
                    }

                    // flush current information that was saved.
                    recordRowDataLines = new StringBuilder();
                    charCount = 0;
                }
            });

        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

What the main class looks like:

    public class Main {

        public static void main(String[] args) {
            System.out.println("Welcome!");

            <omitted>
            File f = CommonUtils.requestFilePath();
            int fileSize = CommonUtils.requestUserValueInt("Enter file split size : ");
            XMLSplitter.splitXML(f, fileSize);
        }
    }

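Hopefully you can see what I cannot. Please help.

Best answer

You should call setContentHandler before calling parse. XMLReader.parse runs synchronously and returns only after the whole document has been read, so a handler registered afterwards never receives any events. That is why the program runs but does nothing.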
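
A minimal sketch of the corrected ordering, reduced to counting rows so that it stays self-contained. `HandlerOrderDemo`, `RowCounter` and `countRows` are hypothetical names introduced only for this illustration; the asker's splitting handler would be registered at the same point.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerOrderDemo {

    // Hypothetical handler for illustration: counts <record_row> elements.
    static class RowCounter extends DefaultHandler {
        int rows = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("record_row".equals(qName)) {
                rows++;
            }
        }
    }

    static int countRows(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RowCounter counter = new RowCounter();
            // Register the handler first ...
            reader.setContentHandler(counter);
            // ... then parse; events are delivered while parse() is running.
            reader.parse(new InputSource(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
            return counter.rows;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record-table><record><record_id>1</record_id><record_rows>"
                + "<record_row>a</record_row><record_row>b</record_row>"
                + "</record_rows></record></record-table>";
        System.out.println(countRows(xml)); // prints 2
    }
}
```

Swapping the two marked calls reproduces the symptom from the question: the parse completes without error, but the handler sees no events and the count stays at zero.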

Regarding "java - SAX parser - cannot split an XML file into files of a specified size", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45415799/
