
Java encoding from UTF-8 to ISO-8859-1 for an XML file

Reposted. Author: 行者123. Updated: 2023-11-30 04:01:10

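I have been trying to convert a UTF-8 string to its ISO-8859-1 equivalent so that I can output it in an XML document, but whatever I try, the output is always displayed incorrectly.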

To simplify the question, I created a code snippet containing all the tests I ran, and then copied and pasted the generated document.

You can also be sure that I tried every possible combination of new String(xxx.getBytes("UTF-8"), "ISO-8859-1") by swapping UTF and ISO, sometimes setting both to the same value. Nothing worked!
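As a minimal sketch of what the new String(xxx.getBytes(...), ...) pattern does: a Java String holds Unicode characters and has no encoding of its own, so a charset only matters when characters are turned into bytes or back. The class name EncodingDemo and the helper redecode are hypothetical names introduced here for illustration; they are not part of the original question.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helper: encodes to UTF-8 bytes, then decodes those bytes as ISO-8859-1,
    // which is the pattern tried in the question.
    static String redecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "héllo", written with an escape so the source file's own encoding does not matter
        String text = "h\u00e9llo";

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);   // 6: é takes two bytes in UTF-8 (0xC3 0xA9)
        System.out.println(latin1.length); // 5: é takes one byte in ISO-8859-1 (0xE9)

        // Each UTF-8 byte becomes its own character, so é turns into the two characters Ã and ©
        String mangled = redecode(text);
        System.out.println(mangled.length());                   // 6
        System.out.println(mangled.equals("h\u00c3\u00a9llo")); // true
    }
}
```

Under this reading, the re-decoding step changes the characters in the string and does not control how the string is later written out as bytes.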

Here is the test snippet:

// @see http://stackoverflow.com/questions/229015/encoding-conversion-in-java
private static String changeEncoding(String input) throws Exception {
    // Create the encoder and decoder for ISO-8859-1
    Charset charset = Charset.forName("ISO-8859-1");
    CharsetDecoder decoder = charset.newDecoder();
    CharsetEncoder encoder = charset.newEncoder();

    // Convert a string to ISO-LATIN-1 bytes in a ByteBuffer
    // The new ByteBuffer is ready to be read.
    ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(input));

    // Convert ISO-LATIN-1 bytes in a ByteBuffer to a CharBuffer and then to a string.
    // The new CharBuffer is ready to be read.
    CharBuffer cbuf = decoder.decode(bbuf);
    return cbuf.toString();
}

// @see http://stackoverflow.com/questions/655891/converting-utf-8-to-iso-8859-1-in-java-how-to-keep-it-as-single-byte
private static String byteEncoding(String input) throws Exception {
    Charset utf8charset = Charset.forName("UTF-8");
    Charset iso88591charset = Charset.forName("ISO-8859-1");

    ByteBuffer inputBuffer = ByteBuffer.wrap(input.getBytes());

    // decode UTF-8
    CharBuffer data = utf8charset.decode(inputBuffer);

    // encode ISO-8859-1
    ByteBuffer outputBuffer = iso88591charset.encode(data);
    byte[] outputData = outputBuffer.array();
    return new String(outputData, "ISO-8859-1");
}

public static Result home() throws Exception {
    DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

    // root elements
    Document doc = docBuilder.newDocument();
    doc.setXmlVersion("1.0");
    doc.setXmlStandalone(true);

    Element rootElement = doc.createElement("test");
    doc.appendChild(rootElement);

    rootElement.setAttribute("original", "héllo");

    rootElement.setAttribute("stringToString", new String("héllo".getBytes("UTF-8"), "ISO-8859-1"));

    rootElement.setAttribute("stringToBytes", changeEncoding("héllo"));

    rootElement.setAttribute("stringToBytes2", byteEncoding("héllo"));

    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

    StringWriter writer = new StringWriter();
    transformer.transform(new DOMSource(doc), new StreamResult(writer));
    String output = writer.getBuffer().toString().replaceAll("\n|\r", "");

    // The following is specific to the Play Framework and renders the response, but I believe this is not
    // the problem (I checked in the developer console: the document is correctly served as "ISO-8859-1").
    response().setHeader("Content-Type", "text/xml; charset=ISO-8859-1");
    return ok(output).as("text/xml");
}

Result:

<?xml version="1.0" encoding="ISO-8859-1"?>
<test original="héllo" stringToBytes="héllo" stringToBytes2="héllo" stringToString="héllo"/>

How should I proceed?

Best answer

For reasons I cannot explain, writing the XML to a file and returning that file as the response fixed the encoding problem.

I decided to keep this question in case someone else runs into a similar problem.

Here is the snippet:

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

File file = new File("Path/to/file.xml");
transformer.transform(new DOMSource(doc), new StreamResult(file));

response().setHeader("Content-Disposition", "attachment;filename=" + file.getName());
response().setHeader("Content-Type", "text/xml; charset=ISO-8859-1");
return ok(file).as("text/xml");
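One plausible explanation, which the original answer does not confirm, is that a StringWriter collects characters and not bytes. In that case the ENCODING output property can only affect the XML declaration, and the framework later encodes the string with its own charset. A File target, by contrast, receives bytes that the transformer has already encoded as ISO-8859-1. If that is the cause, serializing to an in-memory byte stream should behave like the file approach without touching the disk. The sketch below assumes this; Latin1XmlDemo, toLatin1Bytes and sampleDocument are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class Latin1XmlDemo {
    // Hypothetical helper: lets the transformer write ISO-8859-1 bytes into memory
    // instead of writing characters into a StringWriter.
    static byte[] toLatin1Bytes(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Hypothetical helper: builds a one-element document like the one in the question.
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("test");
        root.setAttribute("original", "h\u00e9llo"); // "héllo"
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = toLatin1Bytes(sampleDocument());
        // Decode with the same charset only to display the result on the console
        System.out.println(new String(xml, StandardCharsets.ISO_8859_1));
    }
}
```

The resulting byte array would then be returned as the response body with the same Content-Type header as above, provided the Play version in use accepts a byte array or stream as the body; that part is an assumption and is not shown here.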

Regarding Java encoding from UTF-8 to ISO-8859-1 for an XML file, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21967222/
