java - OpenNLP Sentence Detector API for an entire text file

Reposted. Author: 行者123. Updated: 2023-11-30 09:31:26

Here is code that runs the OpenNLP Sentence Detector API on a single string:

package opennlp;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

public class SentenceDetector {

    public static void main(String[] args) throws IOException {
        SentenceModel model;
        // try-with-resources closes the model stream even if loading fails.
        // A load failure now propagates instead of leaving model == null.
        try (InputStream modelIn = new FileInputStream("en-sent.zip")) {
            model = new SentenceModel(modelIn);
        }

        SentenceDetectorME sentenceDetector = new SentenceDetectorME(model);
        String[] sentences = sentenceDetector.sentDetect(" First sentence. Second sentence.");

        // Print one detected sentence per line
        for (String sentence : sentences) {
            System.out.println(sentence);
        }
    }
}

My question now is: how do I pass in an entire text file and run sentence detection on it, rather than on a single string?

Best answer

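The simple approach: read the whole file into a string and pass it to the detector in the usual way. The following method reads a file's contents into a string: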

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public String readFileToString(String pathToFile) throws IOException {
    StringBuilder contents = new StringBuilder();
    // try-with-resources closes the reader even if a read fails
    try (BufferedReader reader = new BufferedReader(new FileReader(pathToFile))) {
        char[] buffer = new char[512];
        int num;
        while ((num = reader.read(buffer)) != -1) {
            // Append only the characters actually read in this pass
            contents.append(buffer, 0, num);
        }
    }
    return contents.toString();
}
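
To show how the file-reading step and the detection step might fit together, here is a minimal sketch rather than a definitive implementation. The class name `FileSentenceDetection`, the method `detectSentences`, and the `naiveDetector` stand-in are hypothetical names introduced for illustration, and UTF-8 is an assumed file encoding. The detector is passed in as a `Function<String, String[]>` so the sketch runs without a model file. With OpenNLP on the classpath, you would presumably pass `sentenceDetector::sentDetect` from the question's code in its place.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Function;

public class FileSentenceDetection {

    /** Reads the whole file as text. UTF-8 is an assumption; adjust to your data. */
    public static String readFileToString(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    /**
     * Reads the file and hands its full text to the given detector.
     * With OpenNLP, the detector would be sentenceDetector::sentDetect.
     */
    public static String[] detectSentences(Path file, Function<String, String[]> detector)
            throws IOException {
        return detector.apply(readFileToString(file));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java FileSentenceDetection <text-file>");
            return;
        }

        // Stand-in detector so the sketch runs without a model: it splits on
        // whitespace that follows '.', '!' or '?'. This is NOT OpenNLP.
        Function<String, String[]> naiveDetector =
                text -> text.trim().split("(?<=[.!?])\\s+");

        // Print one detected sentence per line
        for (String sentence : detectSentences(Paths.get(args[0]), naiveDetector)) {
            System.out.println(sentence);
        }
    }
}
```

Reading the whole file at once keeps the call identical to the single-string case, at the cost of holding the full text in memory. For very large files, it may be worth feeding the detector one paragraph at a time instead.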

For more on java - OpenNLP Sentence Detector API for an entire text file, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/12895145/
