java - How to extract noun phrases from parsed text

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 07:44:50

I parsed a text with a constituency parser and copied the results into a text file, like this:

(ROOT (S (NP (NN Yesterday)) (, ,) (NP (PRP we)) (VP (VBD went) (PP (TO to)....
(ROOT (FRAG (SBAR (SBAR (IN While) (S (NP (PRP I)) (VP (VBD was) (NP (NP (EX...
(ROOT (S (NP (NN Yesterday)) (, ,) (NP (PRP I)) (VP (VBD went) (PP (TO to.....
(ROOT (FRAG (SBAR (SBAR (IN While) (S (NP (NNP Jim)) (VP (VBD was) (NP (NP (....
(ROOT (S (S (NP (PRP I)) (VP (VBD started) (S (VP (VBG talking) (PP.....

I need to extract all noun phrases (NP) from this text file. The code I wrote extracts only the first NP from each line, but I need every noun phrase. My code is:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;

public class nounPhrase {

    // Returns the index of the ')' that closes the '(' at openPos.
    public static int findClosingParen(char[] text, int openPos) {
        int closePos = openPos;
        int counter = 1;
        while (counter > 0) {
            char c = text[++closePos];
            if (c == '(') {
                counter++;
            } else if (c == ')') {
                counter--;
            }
        }
        return closePos;
    }

    public static void main(String[] args) throws IOException {
        ArrayList<String> npList = new ArrayList<>();
        String line;
        String line1;
        int np;

        String Input = "/local/Input/Temp/Temp.txt";
        String Output = "/local/Output/Temp/Temp-out.txt";

        FileInputStream fis = new FileInputStream(Input);
        BufferedReader br = new BufferedReader(new InputStreamReader(fis, "UTF-8"));
        while ((line = br.readLine()) != null) {
            char[] lineArray = line.toCharArray();
            // indexOf returns only the first "(NP" on the line, so only one NP is printed.
            np = findClosingParen(lineArray, line.indexOf("(NP"));
            line1 = line.substring(line.indexOf("(NP"), np + 1);
            System.out.print(line1 + "\n");
        }
    }
}

The output is:

(NP (NN Yesterday))...I need other NPs in this line also
(NP (PRP I)).....I need other NPs in this line also
(NP (NNP Jim)).....I need other NPs in this line also
(NP (PRP I)).....I need other NPs in this line also

My code takes only the first NP on each line, up to its closing parenthesis, but I need to extract all NPs from the text.

Best answer

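While writing your own tree parser is a good exercise (!), if you just want the results, the easiest way is to use more of the functionality of the Stanford NLP tools, namely Tregex, which is designed for exactly this kind of thing. You can change the final while loop to something like this: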

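    // Requires edu.stanford.nlp.trees.Tree and
    // edu.stanford.nlp.trees.tregex.TregexPattern / TregexMatcher.
    TregexPattern tPattern = TregexPattern.compile("NP");
    while ((line = br.readLine()) != null) {
        Tree t = Tree.valueOf(line);
        TregexMatcher tMatcher = tPattern.matcher(t);
        // find() advances to each NP node in turn, including nested ones.
        while (tMatcher.find()) {
            System.out.println(tMatcher.getMatch());
        }
    }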
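
If pulling in the Stanford library is not an option, the parenthesis-counting idea from the question can also be extended to cover every NP on a line. The following is only a rough sketch under some assumptions: the class name `NounPhraseExtractor` and the method `extractAll` are made up for illustration, the label is assumed to be followed by a single space (so `(NP ` matches but a tag such as `(NP-SBJ ` would not), and literal parentheses in the text are assumed to be escaped by the parser (for example as -LRB- / -RRB-). The sample sentence in `main` is invented, not taken from the question's file.

```java
import java.util.ArrayList;
import java.util.List;

final class NounPhraseExtractor {

    // Returns the index of the ')' that closes the '(' at openPos,
    // or -1 if the line ends before the bracket is closed.
    static int findClosingParen(String text, int openPos) {
        int depth = 0;
        for (int i = openPos; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    // Collects every bracketed subtree whose label equals `label`,
    // in order of where it starts; nested subtrees are included.
    static List<String> extractAll(String line, String label) {
        List<String> result = new ArrayList<>();
        String prefix = "(" + label + " ";
        int from = 0;
        while (true) {
            int start = line.indexOf(prefix, from);
            if (start < 0) {
                break;
            }
            int end = findClosingParen(line, start);
            if (end >= 0) {
                result.add(line.substring(start, end + 1));
            }
            // Resume just past this '(' so that nested NPs are found too.
            from = start + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input line, used only to demonstrate the method.
        String line = "(ROOT (S (NP (DT The) (NN cat)) (VP (VBD saw) "
                + "(NP (NP (DT a) (NN dog)) (PP (IN in) (NP (DT the) (NN park)))))))";
        for (String np : extractAll(line, "NP")) {
            System.out.println(np);
        }
    }
}
```

In the question's program, the body of the while loop would then call `extractAll(line, "NP")` and print each element. Note that an NP containing other NPs is reported along with each of the inner ones, which mirrors what the Tregex loop does; Tregex remains the more robust choice because it works on the actual tree rather than on the raw string.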

Regarding "java - How to extract noun phrases from parsed text", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29179453/
