
java - Strange characters added at the beginning of a Hadoop file

Reposted. Author: 可可西里. Updated: 2023-11-01 16:39:46

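Whenever I create a new file in Hadoop from Java and write content to it, special characters are added at the beginning of the file. Is there a way to get rid of them? Here is the code: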

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(writer));
String extractedXML = writer.getBuffer().toString().replaceAll("\\r$", "");
FSDataOutputStream fin = fs.create(new Path("/filelocation/input.txt"));
fin.writeUTF(extractedXML);
fin.close();


$ hadoop fs -cat /filelocation/input.txt|head -5
)▒hello world
input1
hello again
hello
welcome again

Best answer

This worked for me. Just replace the following lines:

FSDataOutputStream fin = fs.create(new Path("/filelocation/input.txt"));
fin.writeUTF(extractedXML);
fin.close();

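with the following code, which writes the string through a Writer. writeUTF prefixes the data with a two-byte length field, and those bytes are what show up as the strange characters at the start of the file: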

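// Needs org.apache.hadoop.fs.Path, org.apache.hadoop.util.Progressable and java.io.*
// fs is the FileSystem from the question; fs.create expects a Path, not a String
OutputStream os = fs.create(new Path("/filelocation/input.txt"), new Progressable() {
    @Override
    public void progress() {
        // no-op progress callback
    }
});
// A Writer emits only the encoded characters, with no length prefix
BufferedWriter br = new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
br.write(extractedXML);
br.close();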
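
To see why the replacement helps, here is a minimal sketch that reproduces the difference without a Hadoop cluster. It assumes only that FSDataOutputStream inherits writeUTF from java.io.DataOutputStream, so a plain DataOutputStream over an in-memory buffer stands in for the HDFS stream. The class and method names (WriteUtfPrefixDemo, withWriteUTF, withWriter) are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; an in-memory buffer stands in for the HDFS file.
final class WriteUtfPrefixDemo {

    // Mirrors the question: writeUTF emits a 2-byte length, then modified UTF-8.
    static byte[] withWriteUTF(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(text);
        }
        return sink.toByteArray();
    }

    // Mirrors the answer: a Writer emits only the UTF-8 encoded characters.
    static byte[] withWriter(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            out.write(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] prefixed = withWriteUTF("hello");
        byte[] plain = withWriter("hello");
        System.out.println(prefixed.length);                         // 7
        System.out.println(plain.length);                            // 5
        System.out.printf("%02x %02x%n", prefixed[0], prefixed[1]);  // 00 05
    }
}
```

The two extra bytes in the first result are the length field that `hadoop fs -cat` displays as junk. If a Writer is not wanted, calling `write(extractedXML.getBytes(StandardCharsets.UTF_8))` on the output stream should have the same effect, since that also bypasses writeUTF. Note that writeUTF is also limited to 65535 encoded bytes per call.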

Regarding "java - Strange characters added at the beginning of a Hadoop file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43692878/
