
Java/Groovy : Find XML node by Line number

Reposted. Author: 行者123. Updated: 2023-12-02 11:54:22

Below is my Groovy code for validating an XML file against an XSD schema:

import java.io.File;
import java.io.IOException;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import javax.xml.transform.sax.SAXSource
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.SAXException
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException
import org.xml.sax.ErrorHandler


def validateXMLSchema(String xsdPath, String xmlPath) {
    // Collect every reported problem instead of stopping at the first one
    final List<SAXParseException> exceptions = new LinkedList<SAXParseException>();
    try {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File(xsdPath));
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }

            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }

            @Override
            public void error(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }
        });
        def xmlFile = new File(xmlPath);
        validator.validate(new StreamSource(xmlFile));
        exceptions.each {
            println 'lineNumber : ' + it.lineNumber + '; message : ' + it.message
        }
    } catch (IOException | SAXException e) {
        // Only SAXParseException carries a line number; IOException does not
        def where = (e instanceof SAXParseException) ? "line ${e.lineNumber} " : '';
        println("Exception: ${where}" + e.getMessage());
        return false;
    }
    return exceptions.size() == 0;
}

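Below are some of the validation errors. I can access the line number for each message, and I am trying to find the corresponding node name: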

lineNumber : 106; message : cvc-datatype-valid.1.2.1: '' is not a valid value for 'date'. 
lineNumber : 248; message : cvc-enumeration-valid: Value 'Associate' is not facet-valid with respect to enumeration '[ADJSTR, ADJSMT]

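Is there a simple way to find the node name for each error message from its line number? Or do I have to read that specific line and parse it with XmlSlurper as shown below? I would like to avoid that approach, because it will be slow for larger XML files in production under heavy user load.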

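def getNodeName(xmlFile, lineNumber) {
    // Parser line numbers are 1-based, list indexes are 0-based
    def xmlLine = xmlFile.readLines().get(lineNumber - 1)
    // Only works when the line holds a complete, well-formed element
    def node = new XmlSlurper().parseText(xmlLine.toString())
    node.name()
}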
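
One way to avoid reading the file a second time is to note which element is open at the moment the validator reports a problem. The Java sketch below places an `XMLFilterImpl` between the SAX parser and a `ValidatorHandler`, so the element stack is updated before the validator sees each event. The class name `ElementTrackingValidation`, the method `validate`, and the choice to pass the schema and document as strings are assumptions made to keep the example self-contained. It has not been measured against the production files described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper: streams the document once and tags each
// validation message with the innermost open element.
class ElementTrackingValidation {

    static List<String> validate(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // stack of open element names
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            ValidatorHandler validator = schema.newValidatorHandler();
            validator.setErrorHandler(new ErrorHandler() {
                private void note(SAXParseException e) {
                    problems.add(open.peek() + " @ line " + e.getLineNumber()
                            + ": " + e.getMessage());
                }
                @Override public void warning(SAXParseException e) { note(e); }
                @Override public void error(SAXParseException e) { note(e); }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // stop on fatal validation problems
                }
            });

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // ValidatorHandler expects namespace-aware events
            XMLReader parser = factory.newSAXParser().getXMLReader();

            // The filter pushes before forwarding and pops after forwarding,
            // so the stack top is the element being validated.
            XMLFilterImpl tracker = new XMLFilterImpl(parser) {
                @Override
                public void startElement(String uri, String local, String qName,
                                         Attributes atts) throws SAXException {
                    open.push(qName);
                    super.startElement(uri, local, qName, atts);
                }
                @Override
                public void endElement(String uri, String local, String qName)
                        throws SAXException {
                    super.endElement(uri, local, qName);
                    open.pop();
                }
            };
            tracker.setContentHandler(validator);
            tracker.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException("validation could not complete", e);
        }
        return problems;
    }
}
```

For a file on disk, `new InputSource(new FileInputStream(xmlPath))` and `new StreamSource(new File(xsdPath))` could replace the string readers. The document is still streamed, so memory use should stay close to that of plain validation.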

Best answer

This is not elegant, but the following getNodeName() should be faster:

def getNodeName(xmlFile, lineNumber) {
    def result = "unknown"
    def count = 1
    // Capture only the element name: skip a leading '/' and stop before attributes
    def NODE_REGEX = /.*?<\/?([\w:.-]+).*/
    def br

    try {
        br = new BufferedReader(new FileReader(xmlFile))
        String line
        def isDone = false
        // Stop reading as soon as the requested (1-based) line is reached
        while ((! isDone) && (line = br.readLine()) != null) {
            if (count == lineNumber) {
                def matcher = (line =~ NODE_REGEX)
                if (matcher.matches()) {
                    result = matcher[0][1]
                }
                isDone = true
            }
            count++
        }
    } finally {
        // TODO: better exception handling
        // br is still null if the file could not be opened
        br?.close()
    }

    return result
}

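It simply reads lines until it reaches the offending one, then uses a basic regular expression to extract the name. If you prefer, you could use XmlSlurper at that point, as in your example. The key point is that file I/O and memory use should be greatly reduced, because reading stops at the target line and the file is never loaded as a whole.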
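
For callers on the Java side, the same read-until-the-line idea might look like the sketch below. `LineNodeLookup`, `nodeNameAtLine` and `firstTagName` are hypothetical names, and the pattern is an assumption about what counts as an element name. The sketch also assumes that a tag of the offending element sits on the reported line. A parser may report the line where an element ends, so for elements that span several lines the match can be a closing tag or a different element.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper: looks up the first tag name on a given line.
class LineNodeLookup {

    // '<', an optional '/', then the name; attributes are not captured.
    private static final Pattern TAG = Pattern.compile("</?([\\w:.-]+)");

    /** Returns the first tag name on the 1-based line, or "unknown". */
    static String nodeNameAtLine(Path xmlFile, long lineNumber) {
        if (lineNumber < 1) {
            return "unknown";
        }
        // Files.lines is lazy, so reading stops once the target line is found.
        try (Stream<String> lines = Files.lines(xmlFile)) {
            return lines.skip(lineNumber - 1)
                    .findFirst()
                    .map(LineNodeLookup::firstTagName)
                    .orElse("unknown");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Extracts the first opening or closing tag name from one line of text. */
    static String firstTagName(String line) {
        Matcher m = TAG.matcher(line);
        return m.find() ? m.group(1) : "unknown";
    }
}
```

A call such as `LineNodeLookup.nodeNameAtLine(Path.of(xmlPath), exception.getLineNumber())` would fit inside the error handler loop from the question. Each call re-reads the file from the start, so with many errors it may be worth collecting the line numbers first and resolving them in a single pass.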

Regarding Java/Groovy : Find XML node by Line number, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47701357/
