
java - How to split an audio file into multiple segments with Java and CMU Sphinx, then transcribe each segment to text

Reposted. Author: 行者123. Updated: 2023-11-30 07:57:24

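I have written code that transcribes an audio file to text. What I want now is to split the audio file into several parts and then transcribe those parts one after another. Please help.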

        StreamSpeechRecognizer recognizer;
        try
        {
            recognizer = new StreamSpeechRecognizer(configuration);
            java.io.InputStream stream = AppRunner.class.getResourceAsStream(splitFile(new File("/com/dsquare/Arabtec_Construction_INDIA_Private_Limited_convert.wav")));

            System.out.println(stream);
            stream.skip(44); // skip the canonical 44-byte WAV header

            // Simple recognition with generic model
            recognizer.startRecognition(stream);
            SpeechResult result;
            while ((result = recognizer.getResult()) != null)
            {
                System.out.format("Hypothesis: %s\n", result.getHypothesis());

                System.out.println("List of recognized words and their times:");
                for (WordResult r : result.getWords())
                {
                    System.out.println(r);
                }

                // System.out.println("Best 3 hypothesis:");
                for (String s : result.getNbest(3))
                {
                    System.out.println(s);
                }
            }
            // Stop only after every result has been read
            recognizer.stopRecognition();
        }
        catch (IOException e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    public static String splitFile(File f) throws IOException {
        int partCounter = 1; // I like to name parts from 001, 002, 003, ...
                             // you can change it to 0 if you want 000, 001, ...

        int sizeOfFiles = 1024 * 1024; // 1MB
        byte[] buffer = new byte[sizeOfFiles];

        // try-with-resources to ensure closing stream
        try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(f))) {
            String name = f.getName();

            int tmp = 0;
            while ((tmp = bis.read(buffer)) > 0) {
                // write each chunk of data into separate file with different number in name
                File newFile = new File(f.getParent(), name + "."
                        + String.format("%03d", partCounter++));
                try (FileOutputStream out = new FileOutputStream(newFile)) {
                    out.write(buffer, 0, tmp); // tmp is chunk size
                }
            }
        }
        return null;
    }

}
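
A note on the splitting step: cutting a WAV file at fixed byte offsets, as splitFile does, produces parts that have no WAV header after the first one and may cut through the middle of a sample frame. Below is a minimal sketch of one alternative, assuming the input is an uncompressed PCM WAV file that javax.sound.sampled can read. The class name WavSplitter, the method split, and the part naming scheme are hypothetical and do not come from the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper, not part of the original post.
// ASSUMPTION: the input is uncompressed PCM WAV with a known frame size.
final class WavSplitter {

    /** Splits a WAV file into parts of at most secondsPerPart seconds, each with its own header. */
    static List<File> split(File wav, int secondsPerPart)
            throws IOException, UnsupportedAudioFileException {
        List<File> parts = new ArrayList<>();
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            AudioFormat format = in.getFormat();
            int frameSize = format.getFrameSize();
            // The buffer holds a whole number of frames, so no sample is cut in half.
            int framesPerPart = (int) (format.getFrameRate() * secondsPerPart);
            byte[] buffer = new byte[framesPerPart * frameSize];

            int partNo = 1;
            int filled;
            while ((filled = fill(in, buffer)) > 0) {
                File part = new File(wav.getParentFile(),
                        String.format("%s.%03d.wav", wav.getName(), partNo++));
                // Wrap the raw frames in a stream that knows the format,
                // so AudioSystem.write emits a valid WAV header for the part.
                try (AudioInputStream chunk = new AudioInputStream(
                        new ByteArrayInputStream(buffer, 0, filled), format, filled / frameSize)) {
                    AudioSystem.write(chunk, AudioFileFormat.Type.WAVE, part);
                }
                parts.add(part);
            }
        }
        return parts;
    }

    /** Reads until the buffer is full or the stream ends; returns the number of bytes read. */
    private static int fill(AudioInputStream in, byte[] buffer) throws IOException {
        int filled = 0;
        while (filled < buffer.length) {
            int n = in.read(buffer, filled, buffer.length - filled);
            if (n <= 0) {
                break;
            }
            filled += n;
        }
        return filled;
    }
}
```

Each returned file could then be opened with a FileInputStream and passed to recognizer.startRecognition(stream) in turn. Fixed-length parts can still cut a word in half, so this only addresses the file format, not where to cut.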

Best answer

To split the audio file intelligently, consider a diarization tool such as this one developed by the LIUM group:

http://www-lium.univ-lemans.fr/diarization/doku.php/welcome

This tool will give you a *.seg file containing the transition times. Then use ffmpeg or a similar tool to cut the file.
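
As a rough illustration of the two steps above, here is a sketch that turns .seg lines into ffmpeg command lines. It rests on assumptions that should be checked against the LIUM documentation: that each non-comment line is whitespace-separated with the segment start in the third field and its length in the fourth, that both are counted in 10 ms frames, and that comment lines begin with ";;". The class SegToFfmpeg and its method names are hypothetical. The ffmpeg options used (-y, -i, -ss, -t) are standard ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original answer.
final class SegToFfmpeg {

    // ASSUMPTION: start and length are counted in 10 ms frames (100 per second).
    static final double FRAMES_PER_SECOND = 100.0;

    /** Returns {startSeconds, durationSeconds} pairs, sorted by start time. */
    static List<double[]> parseSegments(List<String> segLines) {
        List<double[]> segments = new ArrayList<>();
        for (String line : segLines) {
            String trimmed = line.trim();
            // ASSUMPTION: lines starting with ";;" are comments.
            if (trimmed.isEmpty() || trimmed.startsWith(";;")) {
                continue;
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length < 4) {
                continue; // not a segment line in the assumed layout
            }
            double start = Integer.parseInt(fields[2]) / FRAMES_PER_SECOND;
            double duration = Integer.parseInt(fields[3]) / FRAMES_PER_SECOND;
            segments.add(new double[] {start, duration});
        }
        segments.sort(Comparator.comparingDouble((double[] s) -> s[0]));
        return segments;
    }

    /** Builds an ffmpeg command that cuts one segment out of the input file. */
    static List<String> ffmpegCommand(String input, String output, double start, double duration) {
        return Arrays.asList("ffmpeg", "-y", "-i", input,
                "-ss", seconds(start), "-t", seconds(duration), output);
    }

    private static String seconds(double value) {
        // Locale.ROOT keeps the decimal point a '.', which ffmpeg expects.
        return String.format(Locale.ROOT, "%.2f", value);
    }
}
```

The lines could come from Files.readAllLines on the .seg file, and each command could be run with new ProcessBuilder(command).inheritIO().start().waitFor(), provided ffmpeg is on the PATH.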

Regarding "java - How to split an audio file into multiple segments with Java and CMU Sphinx, then transcribe each segment to text", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32479267/
