
java - How to use the Illinois Chunker with a sentence as input?

Reposted. Author: 行者123. Updated: 2023-12-01 11:44:46

I am trying to use the Illinois Chunker on a per-sentence basis. The entry point it provides is essentially the following snippet:

public class ChunksAndPOSTags {
    public static void main(String[] args) {
        String filename = null;
        try {
            filename = args[0];
            if (args.length > 1) throw new Exception();
        }
        catch (Exception e) {
            System.err.println("usage: java edu.illinois.cs.cogcomp.lbj.chunk.ChunksAndPOSTags <input file>");
            System.exit(1);
        }

        Chunker chunker = new Chunker();
        Parser parser = new PlainToTokenParser(
            new WordSplitter(new SentenceSplitter(filename)));
        String previous = "";
        for (Word w = (Word) parser.next(); w != null; w = (Word) parser.next()) {
            String prediction = chunker.discreteValue(w);
            if (prediction.startsWith("B-") ||
                prediction.startsWith("I-") &&
                !previous.endsWith(prediction.substring(2)))
                System.out.print("[" + prediction.substring(2) + " ");
            System.out.print("(" + w.partOfSpeech + " " + w.form + ") ");
            if (!prediction.equals("O") &&
                (w.next == null ||
                 chunker.discreteValue(w.next).equals("O") ||
                 chunker.discreteValue(w.next).startsWith("B-") ||
                 !chunker.discreteValue(w.next).endsWith(prediction.substring(2))))
                System.out.print("] ");
            if (w.next == null)
                System.out.println();
            previous = prediction;
        }
    }
}

How can we modify the code above so that it takes one sentence at a time instead of a text file?

Best answer

You should create your own sentence parser that simply returns your string (your "one sentence at a time").

Here is sample code:

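import LBJ2.parse.Parser;
import LBJ2.nlp.Sentence;

// Feeds a single in-memory sentence to the LBJ2 parser chain.
public class FakeSentenceSplitter implements Parser {

    private final String sentenceText;
    // True once the sentence has been handed out.
    private boolean consumed = false;

    public FakeSentenceSplitter(String sentenceText) {
        this.sentenceText = sentenceText;
    }

    // Returns the sentence once, then null to signal end of input;
    // otherwise the consuming loop would never terminate.
    public Object next() {
        if (consumed) return null;
        consumed = true;
        return new Sentence(sentenceText);
    }

    // Allows the same sentence to be read again.
    public void reset() {
        consumed = false;
    }

    // Nothing to release for an in-memory string.
    public void close() {
    }
}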
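
A pull-style parser like this signals the end of its input by returning null from next(), which is what the `w != null` loop in the question relies on. The following is a minimal, self-contained sketch of that contract. It assumes nothing about LBJ2: `SimpleParser`, `OneShotSentenceParser`, and `OneShotDemo` are hypothetical stand-ins written for illustration, and the sentence is kept as a plain String rather than an LBJ2 `Sentence`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a pull-style parser interface; NOT the real LBJ2 API.
interface SimpleParser {
    Object next();   // returns the next item, or null when input is exhausted
    void reset();    // rewinds to the start of the input
    void close();    // releases resources, if any
}

// Yields a single sentence exactly once, then reports end of input.
final class OneShotSentenceParser implements SimpleParser {
    private final String sentenceText;
    private boolean consumed = false;

    OneShotSentenceParser(String sentenceText) {
        this.sentenceText = sentenceText;
    }

    @Override
    public Object next() {
        if (consumed) {
            return null;          // tells the consuming loop to stop
        }
        consumed = true;
        return sentenceText;
    }

    @Override
    public void reset() {
        consumed = false;         // allow the sentence to be read again
    }

    @Override
    public void close() {
        // nothing to release for an in-memory string
    }
}

final class OneShotDemo {
    // Drains a parser with the same loop shape as the question's for-loop.
    static List<Object> drain(SimpleParser parser) {
        List<Object> items = new ArrayList<>();
        for (Object item = parser.next(); item != null; item = parser.next()) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        SimpleParser parser = new OneShotSentenceParser("The quick brown fox jumps.");
        // The loop ends after one item because next() then returns null.
        System.out.println(drain(parser).size());
    }
}
```

If next() kept returning a fresh item on every call, the drain loop would never see null and would not terminate, so the one-shot flag is the part worth carrying over to any real implementation.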

If you are not already using the LBJ2 package, you can download it here.

After that, use the new sentence splitter in this line:

// sentenceText holds the text of one sentence, not a file name.
Parser parser = new PlainToTokenParser(
    new WordSplitter(new FakeSentenceSplitter(sentenceText)));
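
With one sentence per parser, each pass through the question's loop prints one bracketed line. The bracketing logic itself can be tried without LBJ2. The sketch below re-implements it over plain arrays, with hand-written POS and BIO chunk tags standing in for real Chunker predictions; `BioBracketFormatter` and its `format` method are hypothetical names, not part of the Illinois Chunker.

```java
// Self-contained sketch: NOT LBJ2 code. POS and chunk tags are supplied by hand.
final class BioBracketFormatter {
    // Returns the chunk type of a B-/I- tag, e.g. "NP" for "B-NP".
    private static String typeOf(String tag) {
        return tag.substring(2);
    }

    // Renders tokens as "[TYPE (POS form) ... ] " groups, following the question's loop.
    static String format(String[] forms, String[] pos, String[] tags) {
        StringBuilder out = new StringBuilder();
        String previous = "";
        for (int i = 0; i < forms.length; i++) {
            String tag = tags[i];
            // Open a chunk on B-, or on an I- that does not continue the previous type.
            boolean opens = tag.startsWith("B-")
                    || (tag.startsWith("I-") && !previous.endsWith(typeOf(tag)));
            if (opens) {
                out.append("[").append(typeOf(tag)).append(" ");
            }
            out.append("(").append(pos[i]).append(" ").append(forms[i]).append(") ");
            // Close when the sentence ends or the next tag starts something else.
            if (!tag.equals("O")) {
                String next = (i + 1 < forms.length) ? tags[i + 1] : null;
                boolean closes = next == null
                        || next.equals("O")
                        || next.startsWith("B-")
                        || !next.endsWith(typeOf(tag));
                if (closes) {
                    out.append("] ");
                }
            }
            previous = tag;
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        String line = format(
                new String[] {"The", "dog", "barks", "."},
                new String[] {"DT", "NN", "VBZ", "."},
                new String[] {"B-NP", "I-NP", "B-VP", "O"});
        System.out.println(line);
    }
}
```

Keeping the formatting separate from the tagging in this way makes it easy to check the bracket logic on known tags before wiring in the real chunker.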

Regarding "java - How to use the Illinois Chunker with a sentence as input?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29241471/
