
java - Refactoring automatic detection of a file's encoding

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 03:46:53

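I need to check the encoding of a file. The code below works, but it is rather long. How can this logic be refactored? Is there perhaps a different approach that achieves the same goal?

Code: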

class CharsetDetector implements Checker {

    Charset detectCharset(File currentFile, String[] charsets) {
        Charset charset = null;

        for (String charsetName : charsets) {
            charset = detectCharset(currentFile, Charset.forName(charsetName));
            if (charset != null) {
                break;
            }
        }

        return charset;
    }

    private Charset detectCharset(File currentFile, Charset charset) {
        try {
            BufferedInputStream input = new BufferedInputStream(
                    new FileInputStream(currentFile));

            CharsetDecoder decoder = charset.newDecoder();
            decoder.reset();

            byte[] buffer = new byte[512];
            boolean identified = false;
            while ((input.read(buffer) != -1) && (!identified)) {
                identified = identify(buffer, decoder);
            }

            input.close();

            if (identified) {
                return charset;
            } else {
                return null;
            }

        } catch (Exception e) {
            return null;
        }
    }

    private boolean identify(byte[] bytes, CharsetDecoder decoder) {
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
        } catch (CharacterCodingException e) {
            return false;
        }
        return true;
    }

    @Override
    public boolean check(File fileChack) {
        if (charsetDetector(fileChack)) {
            return true;
        }
        return false;
    }

    private boolean charsetDetector(File currentFile) {
        String[] charsetsToBeTested = { "UTF-8", "windows-1253", "ISO-8859-7" };

        CharsetDetector charsetDetector = new CharsetDetector();
        Charset charset = charsetDetector.detectCharset(currentFile,
                charsetsToBeTested);

        if (charset != null) {
            try {
                InputStreamReader reader = new InputStreamReader(
                        new FileInputStream(currentFile), charset);

                @SuppressWarnings("unused")
                int valueReaders = 0;
                while ((valueReaders = reader.read()) != -1) {
                    return true;
                }

                reader.close();
            } catch (FileNotFoundException exc) {
                System.out.println("File not found!");
                exc.printStackTrace();
            } catch (IOException exc) {
                exc.printStackTrace();
            }
        } else {
            System.out.println("Unrecognized charset.");
            return false;
        }

        return true;
    }
}

Questions:

  • How can this program logic be refactored?
  • What other ways are there to detect an encoding (for example, UTF-16 byte sequences)?
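
On the second question, one JDK-only technique is to look for a byte order mark (BOM) at the start of the file. The sketch below is not from the original post; `BomSniffer` and its methods are hypothetical names chosen for illustration. It only recognises the UTF-8 and UTF-16 marks and deliberately ignores UTF-32.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper: guesses a charset from a leading byte order mark. */
final class BomSniffer {

    private BomSniffer() {}

    /**
     * Returns the charset announced by a BOM at the start of {@code head},
     * or an empty Optional if none of the known marks is present.
     */
    static Optional<Charset> detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            // FF FE 00 00 would be a UTF-32LE mark; this sketch does not handle it.
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    /** True if {@code data} begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Passing the first few bytes of the file to `BomSniffer.detect` is enough. An empty result proves nothing: many UTF-8 files carry no BOM, and single-byte encodings such as windows-1253 and ISO-8859-7 never do, so this check can only complement other detection methods.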

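Best answer

The best way to refactor this code is to bring in a third-party library that does the charset detection for you: such a library will probably do a better job, and it will make your code smaller. See this question for some options.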
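
If adding a dependency is not an option, the trial-decoding idea from the question can also be shortened with the JDK alone. The following is a minimal sketch rather than part of the original answer; `StrictCharsetProbe` and its method names are hypothetical, and it assumes the file is small enough to read into memory. Decoding the whole content in one call avoids two pitfalls of the original loop: a multi-byte UTF-8 sequence can be split across two 512-byte buffers, and the loop stops after the first buffer that happens to decode.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: finds the first candidate charset that decodes the data cleanly. */
final class StrictCharsetProbe {

    private StrictCharsetProbe() {}

    /** Reads the whole file (assumed to fit in memory) and probes the candidates in order. */
    static Optional<Charset> firstMatching(Path file, List<Charset> candidates)
            throws IOException {
        return firstMatching(Files.readAllBytes(file), candidates);
    }

    /** Returns the first candidate, in list order, that decodes every byte without error. */
    static Optional<Charset> firstMatching(byte[] bytes, List<Charset> candidates) {
        return candidates.stream()
                .filter(charset -> decodes(bytes, charset))
                .findFirst();
    }

    /** True if the bytes contain no malformed or unmappable sequence for this charset. */
    static boolean decodes(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

The candidates from the question could be passed as a list built with `Charset.forName("UTF-8")`, `Charset.forName("windows-1253")` and `Charset.forName("ISO-8859-7")`, with UTF-8 first. Order matters because single-byte encodings accept most byte sequences, so a match only shows that the bytes are valid in that charset, not that the file was actually written in it.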

Regarding "java - Refactoring automatic detection of a file's encoding", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/15154577/
