java - NetBeans: StanfordCoreNLP error: Unrecoverable error while loading a tagger model


I am trying to use StanfordCoreNLP to distinguish between singular and plural nouns in a sentence. As a starting point, I am using the code from http://nlp.stanford.edu/software/corenlp.shtml. In NetBeans 8.0 I created a new Java project, downloaded stanford-corenlp-full-2014-06-16, and added its jar files (including the models jar) to the project:

(screenshot: the stanford-corenlp jar files, including the models jar, added under the project's Libraries)

The code of class SingularORPlural:

import java.util.LinkedList;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.LemmaAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

/**
 *
 * @author ha
 */
public class SingularORPlural {

    protected StanfordCoreNLP pipeline;

    public SingularORPlural() {
        // Create StanfordCoreNLP object properties, with POS tagging
        // (required for lemmatization), and lemmatization
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma");

        /*
         * This is a pipeline that takes in a string and returns various analyzed linguistic forms.
         * The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator),
         * and then other sequence model style annotation can be used to add things like lemmas,
         * POS tags, and named entities. These are returned as a list of CoreLabels.
         * Other analysis components build and store parse trees, dependency graphs, etc.
         *
         * This class is designed to apply multiple Annotators to an Annotation.
         * The idea is that you first build up the pipeline by adding Annotators,
         * and then you take the objects you wish to annotate and pass them in and
         * get in return a fully annotated object.
         *
         * StanfordCoreNLP loads a lot of models, so you probably
         * only want to do this once per execution.
         */
        this.pipeline = new StanfordCoreNLP(props);
    }

    public List<String> lemmatize(String documentText) {
        List<String> lemmas = new LinkedList<String>();
        // Create an empty Annotation just with the given text
        Annotation document = new Annotation(documentText);
        // Run all Annotators on this text
        this.pipeline.annotate(document);
        // Iterate over all of the sentences found
        List<CoreMap> sentences = document.get(SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            // Iterate over all tokens in a sentence
            for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
                // Retrieve and add the lemma for each word into the
                // list of lemmas
                lemmas.add(token.get(LemmaAnnotation.class));
            }
        }
        return lemmas;
    }
}

Then in the main method:

System.out.println("Starting Stanford Lemmatizer");
String text = "How could you be seeing into my eyes like open doors? \n";
SingularORPlural slem = new SingularORPlural();
System.out.println( slem.lemmatize(text) );

I get this error:

    run:
Starting Stanford Lemmatizer
Adding annotator tokenize
Adding annotator ssplit

Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:558)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at stanfordposcode.SingularORPlural.<init>(SingularORPlural.java:51)
at stanfordposcode.StanfordPOSCode.main(StanfordPOSCode.java:74)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:857)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:755)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:97)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:77)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:556)
... 6 more
Caused by: java.io.InvalidClassException: edu.stanford.nlp.tagger.maxent.ExtractorDistsim; local class incompatible: stream classdesc serialVersionUID = 2, local class serialVersionUID = 1
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readExtractors(MaxentTagger.java:582)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:808)
... 12 more
Java Result: 1
BUILD SUCCESSFUL (total time: 3 seconds)

How can I fix this error?

Best Answer

I just had the same error, so here is what I found.

The failure happens because the old tagger file you are using ("english-left3words-distsim.tagger") is incompatible with the newer version of the StanfordCoreNLP source/binary/byte code. Everything should be consistent and compatible, i.e. come from the same distribution/build.

The simple answer is: make sure you are using the correct tagger file.
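As a quick sanity check before reworking the build, you can try loading the tagger directly from the models jar that ships with the same release, using the classpath location shown in the error log instead of a stale local copy. This is only a sketch, assuming the matching stanford-corenlp models jar is on the classpath; the class name TaggerLoadCheck is made up for illustration:

import java.io.IOException;

import edu.stanford.nlp.tagger.maxent.MaxentTagger;

public class TaggerLoadCheck {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        // Load the tagger from the models jar on the classpath, using the same
        // path that appears in the error message. If code and models come from
        // the same release, this constructor succeeds without the
        // InvalidClassException about serialVersionUID.
        MaxentTagger tagger = new MaxentTagger(
                "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger");
        System.out.println(tagger.tagString("The cats chase the cat."));
    }
}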

These simple steps will help:

  1. Download: http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip
  2. Add this to your pom.xml (if you are using Maven):
<dependencies>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.4</version>
    </dependency>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.4</version>
        <classifier>models</classifier>
    </dependency>
</dependencies>

Then make sure it works correctly:

import java.io.IOException;

import edu.stanford.nlp.tagger.maxent.MaxentTagger;

public class TagText {
    public static void main(String[] args) throws IOException,
            ClassNotFoundException {

        // Initialize the tagger
        final MaxentTagger tagger = new MaxentTagger("taggers/english-left3words-distsim.tagger");

        // The sample strings
        final String sample1 = "This is a sample text.";
        final String sample2 = "The sailor dogs the hatch.";

        // The tagged strings
        final String tagged1 = tagger.tagString(sample1);
        final String tagged2 = tagger.tagString(sample2);

        // Output the result
        System.out.println(tagged1);
        System.out.println(tagged2);
    }
}
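To connect this back to the original goal of distinguishing singular from plural nouns, the same pipeline from the question can inspect Penn Treebank POS tags rather than lemmas: NN/NNP mark singular nouns and NNS/NNPS mark plural nouns. The sketch below is only an illustration (the class name NounNumberDemo and the sample sentence are made up); the annotation classes themselves are standard CoreNLP:

import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

public class NounNumberDemo {
    public static void main(String[] args) {
        // Only tokenize, ssplit and pos are needed for number detection
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation document = new Annotation("The sailor dogs the hatches.");
        pipeline.annotate(document);

        for (CoreMap sentence : document.get(SentencesAnnotation.class)) {
            for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
                String word = token.get(TextAnnotation.class);
                String tag = token.get(PartOfSpeechAnnotation.class);
                // Penn Treebank tags: NN/NNP = singular noun, NNS/NNPS = plural noun
                if (tag.equals("NN") || tag.equals("NNP")) {
                    System.out.println(word + " -> singular noun (" + tag + ")");
                } else if (tag.equals("NNS") || tag.equals("NNPS")) {
                    System.out.println(word + " -> plural noun (" + tag + ")");
                }
            }
        }
    }
}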

Regarding java - NetBeans: StanfordCoreNLP error: Unrecoverable error while loading a tagger model, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24265004/
