
stanford-nlp - How do I use a custom TokensRegex rules annotator with the Stanford CoreNLP server?

Reposted. Author: 行者123. Updated: 2023-12-02 07:10:02

The TokensRegex rules color annotator (stanford-corenlp-full-2016-10-31/tokensregex/color.rules.txt) loads successfully when CoreNLP is run from the command line, but fails on the web server with java.lang.IllegalArgumentException: Unknown annotator: color

Setup

# custom.properties
annotators=tokenize,ssplit,pos,lemma,ner,regexner,color
customAnnotatorClass.color = edu.stanford.nlp.pipeline.TokensRegexAnnotator
color.rules = tokensregex/color.rules.txt

Command line

$ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props custom.properties -file ./tokensregex/color.input.txt -outputFormat text
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator color with class edu.stanford.nlp.pipeline.TokensRegexAnnotator
...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator color
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Reading TokensRegex rules from tokensregex/color.rules.txt
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Read 7 rules

# color.input.txt.output
Sentence #1 (9 tokens):
Both blue and light blue are nice colors.
[Text=Both CharacterOffsetBegin=0 CharacterOffsetEnd=4 PartOfSpeech=CC Lemma=both NamedEntityTag=O]
[Text=blue CharacterOffsetBegin=5 CharacterOffsetEnd=9 PartOfSpeech=JJ Lemma=blue NamedEntityTag=COLOR NormalizedNamedEntityTag=#0000FF]
...

Server

  1. java -mx2g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -c custom.properties
  2. wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","outputFormat":"json"}' -O -

    HTTP request sent, awaiting response... 500 Internal Server Error
    2016-11-05 14:41:27 ERROR 500: Internal Server Error.

    java.lang.IllegalArgumentException: Unknown annotator: color
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.ensurePrerequisiteAnnotators(StanfordCoreNLP.java:304)
    at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.getProperties(StanfordCoreNLPServer.java:713)
    at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:540)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
    at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Solution

Include the custom annotator properties in the request itself:

wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"color.rules":"tokensregex/color.rules.txt","customAnnotatorClass.color":"edu.stanford.nlp.pipeline.TokensRegexAnnotator","annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","enforceRequirements":"false","outputFormat":"json"}' -O -
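For anyone calling the server from a script rather than wget, here is a minimal Python sketch of the same request. It only builds the POST request with the properties JSON URL-encoded into the query string. The property values are copied from the command above. The function name build_request and the http://localhost:9000/ default are my own choices for illustration, not part of CoreNLP. Actually sending it needs a running server, so that part is left commented out.

```python
import json
import urllib.parse
import urllib.request

# Properties copied from the wget command in the question.
PROPS = {
    "annotators": "tokenize,ssplit,pos,lemma,ner,regexner,color",
    "customAnnotatorClass.color": "edu.stanford.nlp.pipeline.TokensRegexAnnotator",
    "color.rules": "tokensregex/color.rules.txt",
    "enforceRequirements": "false",
    "outputFormat": "json",
}


def build_request(text, base_url="http://localhost:9000/"):
    """Return a POST request whose URL carries PROPS as URL-encoded JSON.

    build_request and base_url are illustrative names (assumptions).
    """
    query = urllib.parse.urlencode({"properties": json.dumps(PROPS)})
    return urllib.request.Request(
        base_url + "?" + query,
        data=text.encode("utf-8"),  # the raw text to annotate, as with --post-data
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("Both blue and light blue are nice colors.")
    print(req.full_url)
    # Sending requires a CoreNLP server listening on port 9000:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Encoding the JSON with urlencode avoids hand-quoting the braces and quotes that the wget version puts directly in the URL.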

Best answer

Add

"enforceRequirements":"false"

to the properties of your request; that should stop this error!
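Since the working request ends up repeating what is already in custom.properties, one option is to derive the request properties from that file's contents and add the enforceRequirements flag on top. The sketch below is a rough illustration only: parse_properties and request_properties are hypothetical helper names, and the parser handles only plain `key = value` lines and `#` or `!` comments, not the full Java properties syntax (no escapes or line continuations).

```python
# Contents of custom.properties from the question, inlined for the example.
CUSTOM_PROPERTIES = """\
# custom.properties
annotators=tokenize,ssplit,pos,lemma,ner,regexner,color
customAnnotatorClass.color = edu.stanford.nlp.pipeline.TokensRegexAnnotator
color.rules = tokensregex/color.rules.txt
"""


def parse_properties(text):
    """Parse simple `key = value` lines; skip blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def request_properties(text, output_format="json"):
    """Properties for a server request: file entries plus the flag from the answer."""
    props = parse_properties(text)
    props["enforceRequirements"] = "false"
    props["outputFormat"] = output_format
    return props


if __name__ == "__main__":
    for key, value in request_properties(CUSTOM_PROPERTIES).items():
        print(f"{key} = {value}")
```

The resulting dictionary can be serialized with json.dumps and passed as the `properties` query parameter, as in the wget command.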

Source: this question, "stanford-nlp - How do I use a custom TokensRegex rules annotator with the Stanford CoreNLP server?", comes from Stack Overflow: https://stackoverflow.com/questions/40441963/
