This article collects Java code examples of the edu.stanford.nlp.process.WordToSentenceProcessor.<init>() method and shows how WordToSentenceProcessor.<init>() is used. The examples come mainly from GitHub, Stack Overflow, Maven and similar platforms and were extracted from selected projects, so they should be useful as references. Details of the WordToSentenceProcessor.<init>() method:
Package path: edu.stanford.nlp.process.WordToSentenceProcessor
Class name: WordToSentenceProcessor
Method name: <init>
Create a WordToSentenceProcessor using a sensible default list of tokens for sentence ending for English/Latin writing systems. The default set is {".", "?", "!"} and any combination of ! or ?, as in !!!?!?!?!!!?!!?!!!. A sequence of two or more consecutive line breaks is taken as a paragraph break, which also splits sentences. This is the usual constructor for sentence breaking of reasonable text that uses hard line breaking, so two blank lines indicate a paragraph break. People commonly use this constructor.
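Before the collected examples, here is a minimal, self-contained sketch of the default constructor in use. It is not taken from any of the projects below; the class name and sample text are made up for illustration, and the tokenization step reuses the PTBTokenizer calls that appear in the examples further down.

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.WordToSentenceProcessor;

public class DefaultSentenceSplitSketch {
  public static void main(String[] args) {
    String text = "This is the first sentence. Is this the second? Yes!";
    // WordToSentenceProcessor expects an already tokenized stream, so tokenize first.
    List<CoreLabel> tokens =
        PTBTokenizer.coreLabelFactory().getTokenizer(new StringReader(text)).tokenize();
    // Default constructor: sentence boundaries are ".", "?", "!" and runs of ! or ?.
    WordToSentenceProcessor<CoreLabel> splitter = new WordToSentenceProcessor<>();
    List<List<CoreLabel>> sentences = splitter.process(tokens);
    System.out.println(sentences.size() + " sentences"); // 3 with the default rules
  }
}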
Code example source: stanfordnlp/CoreNLP
public WordsToSentencesAnnotator(boolean verbose, String boundaryTokenRegex,
                                 Set<String> boundaryToDiscard, Set<String> htmlElementsToDiscard,
                                 String newlineIsSentenceBreak, String boundaryMultiTokenRegex,
                                 Set<String> tokenRegexesToDiscard) {
  this(verbose, false,
       new WordToSentenceProcessor<>(boundaryTokenRegex, null,
           boundaryToDiscard, htmlElementsToDiscard,
           WordToSentenceProcessor.stringToNewlineIsSentenceBreak(newlineIsSentenceBreak),
           (boundaryMultiTokenRegex != null) ? TokenSequencePattern.compile(boundaryMultiTokenRegex) : null,
           tokenRegexesToDiscard));
}
Code example source: stanfordnlp/CoreNLP
/** Return a WordsToSentencesAnnotator that never splits the token stream. You just get one sentence.
 *
 * @return A WordsToSentencesAnnotator.
 */
public static WordsToSentencesAnnotator nonSplitter() {
  WordToSentenceProcessor<CoreLabel> wts = new WordToSentenceProcessor<>(true);
  return new WordsToSentencesAnnotator(false, false, wts);
}
Code example source: stanfordnlp/CoreNLP
wts = new WordToSentenceProcessor<>();
Code example source: stanfordnlp/CoreNLP
/**
 * For internal debugging purposes only.
 */
public static void main(String[] args) {
  new BasicDocument<String>();
  Document<String, Word, Word> htmlDoc = BasicDocument.init("top text <h1>HEADING text</h1> this is <p>new paragraph<br>next line<br/>xhtml break etc.");
  System.out.println("Before:");
  System.out.println(htmlDoc);
  Document<String, Word, Word> txtDoc = new StripTagsProcessor<String, Word>(true).processDocument(htmlDoc);
  System.out.println("After:");
  System.out.println(txtDoc);
  Document<String, Word, List<Word>> sentences = new WordToSentenceProcessor<Word>().processDocument(txtDoc);
  System.out.println("Sentences:");
  System.out.println(sentences);
}
Code example source: stanfordnlp/CoreNLP
/** Return a WordsToSentencesAnnotator that splits on newlines (only), which are then deleted.
 * This constructor counts the lines by putting in empty token lists for empty lines.
 * It tells the underlying splitter to return empty lists of tokens
 * and then treats those empty lists as empty lines. We don't
 * actually include empty sentences in the annotation, though. But they
 * are used in numbering the sentence. Only this constructor leads to
 * empty sentences.
 *
 * @param nlToken Zero or more new line tokens, which might be a {@literal \n} or the fake
 *                newline tokens returned from the tokenizer.
 * @return A WordsToSentencesAnnotator.
 */
public static WordsToSentencesAnnotator newlineSplitter(String... nlToken) {
  // this constructor will keep empty lines as empty sentences
  WordToSentenceProcessor<CoreLabel> wts =
      new WordToSentenceProcessor<>(ArrayUtils.asImmutableSet(nlToken));
  return new WordsToSentencesAnnotator(false, true, wts);
}
Code example source: stanfordnlp/CoreNLP
public static void addEnhancedSentences(Annotation doc) {
  // for every sentence that begins a paragraph: append this sentence and the previous one and see if the
  // sentence splitter would make a single sentence out of them. If so, add it as an extra sentence.
  // for each sieve that potentially uses augmentedSentences in original:
  List<CoreMap> sentences = doc.get(CoreAnnotations.SentencesAnnotation.class);
  WordToSentenceProcessor wsp =
      new WordToSentenceProcessor(WordToSentenceProcessor.NewlineIsSentenceBreak.NEVER); // create a SentenceSplitter that never splits on newline
  int prevParagraph = 0;
  for (int i = 1; i < sentences.size(); i++) {
    CoreMap sentence = sentences.get(i);
    CoreMap prevSentence = sentences.get(i - 1);
    List<CoreLabel> tokensConcat = new ArrayList<>();
    tokensConcat.addAll(prevSentence.get(CoreAnnotations.TokensAnnotation.class));
    tokensConcat.addAll(sentence.get(CoreAnnotations.TokensAnnotation.class));
    List<List<CoreLabel>> sentenceTokens = wsp.process(tokensConcat);
    if (sentenceTokens.size() == 1) { // wsp would have put them into a single sentence --> add enhanced sentence.
      sentence.set(EnhancedSentenceAnnotation.class, constructSentence(sentenceTokens.get(0), prevSentence, sentence));
    }
  }
}
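Several of the constructors shown here take a newlineIsSentenceBreak argument, either as the nested WordToSentenceProcessor.NewlineIsSentenceBreak enum used directly above or as a property string converted by stringToNewlineIsSentenceBreak. The sketch below illustrates that mapping; the strings "always", "never" and "two" are the values used by the ssplit annotator properties, stated here as an assumption rather than taken from the examples.

import edu.stanford.nlp.process.WordToSentenceProcessor;
import edu.stanford.nlp.process.WordToSentenceProcessor.NewlineIsSentenceBreak;

public class NewlineBreakSketch {
  public static void main(String[] args) {
    // "two": two or more consecutive newline tokens act as a paragraph/sentence break.
    NewlineIsSentenceBreak two = WordToSentenceProcessor.stringToNewlineIsSentenceBreak("two");
    // "never": newlines never split sentences (the setting addEnhancedSentences uses above).
    NewlineIsSentenceBreak never = WordToSentenceProcessor.stringToNewlineIsSentenceBreak("never");
    System.out.println(two + " / " + never); // e.g. TWO_CONSECUTIVE / NEVER
  }
}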
Code example source: stanfordnlp/CoreNLP
new WordToSentenceProcessor<>(ArrayUtils.asImmutableSet(new String[]{"\n"}));
this.countLineNumbers = true;
this.wts = wts1;
new WordToSentenceProcessor<>(ArrayUtils.asImmutableSet(new String[]{System.lineSeparator(), "\n"}));
this.countLineNumbers = true;
this.wts = wts1;
new WordToSentenceProcessor<>(ArrayUtils.asImmutableSet(new String[]{PTBTokenizer.getNewlineToken()}));
this.countLineNumbers = true;
this.wts = wts1;
if (Boolean.parseBoolean(isOneSentence)) { // this method treats null as false
WordToSentenceProcessor<CoreLabel> wts1 = new WordToSentenceProcessor<>(true);
this.countLineNumbers = false;
this.wts = wts1;
this.wts = new WordToSentenceProcessor<>(boundaryTokenRegex, boundaryFollowersRegex,
boundariesToDiscard, htmlElementsToDiscard,
WordToSentenceProcessor.stringToNewlineIsSentenceBreak(nlsb),
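The snippet above is a truncated excerpt from the annotator's property-driven constructor, with the left-hand assignments cut off by extraction. For reference, here is a self-contained sketch of the same newline-only splitting idea; the sample text, class name and the tokenizeNLs tokenizer option are assumptions for illustration, not part of the excerpt.

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.WordToSentenceProcessor;
import edu.stanford.nlp.util.ArrayUtils;

public class NewlineOnlySplitSketch {
  public static void main(String[] args) {
    String text = "first line\nsecond line\nthird line";
    // Keep newlines as tokens so that the splitter can see them.
    PTBTokenizer<CoreLabel> tokenizer = new PTBTokenizer<>(
        new StringReader(text), new CoreLabelTokenFactory(), "tokenizeNLs=true");
    List<CoreLabel> tokens = tokenizer.tokenize();
    // Split exactly at the tokenizer's newline token, which is then discarded.
    WordToSentenceProcessor<CoreLabel> splitter = new WordToSentenceProcessor<>(
        ArrayUtils.asImmutableSet(new String[]{ PTBTokenizer.getNewlineToken() }));
    List<List<CoreLabel>> lines = splitter.process(tokens);
    System.out.println(lines.size() + " line-sentences"); // 3 for the sample text
  }
}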
Code example source: edu.stanford.nlp/corenlp
public WordsToSentencesAnnotator(boolean verbose) {
  VERBOSE = verbose;
  wts = new WordToSentenceProcessor<CoreLabel>();
}
Code example source: com.guokr/stan-cn-com
public WordsToSentencesAnnotator(boolean verbose) {
  this(verbose, false, new WordToSentenceProcessor<CoreLabel>());
}
Code example source: com.guokr/stan-cn-com
/** Return a WordsToSentencesAnnotator that never splits the token stream. You just get one sentence.
 *
 * @param verbose Whether it is verbose.
 * @return A WordsToSentencesAnnotator.
 */
public static WordsToSentencesAnnotator nonSplitter(boolean verbose) {
  WordToSentenceProcessor<CoreLabel> wts = new WordToSentenceProcessor<CoreLabel>(true);
  return new WordsToSentencesAnnotator(verbose, false, wts);
}
Code example source: com.guokr/stan-cn-com
public WordsToSentencesAnnotator(boolean verbose, String boundaryTokenRegex,
                                 Set<String> boundaryToDiscard, Set<String> htmlElementsToDiscard,
                                 String newlineIsSentenceBreak) {
  this(verbose, false,
       new WordToSentenceProcessor<CoreLabel>(boundaryTokenRegex,
           boundaryToDiscard, htmlElementsToDiscard,
           WordToSentenceProcessor.stringToNewlineIsSentenceBreak(newlineIsSentenceBreak)));
}
Code example source: edu.stanford.nlp/corenlp
public static WordsToSentencesAnnotator newlineSplitter(boolean verbose) {
  WordToSentenceProcessor<CoreLabel> wts =
      new WordToSentenceProcessor<CoreLabel>("",
          Collections.<String>emptySet(),
          Collections.singleton("\n"));
  return new WordsToSentencesAnnotator(wts, verbose);
}
Code example source: stackoverflow.com
// split via PTBTokenizer (PTBLexer)
List<CoreLabel> tokens = PTBTokenizer.coreLabelFactory().getTokenizer(new StringReader(text)).tokenize();
// do the processing using the Stanford sentence splitter (WordToSentenceProcessor)
WordToSentenceProcessor<CoreLabel> processor = new WordToSentenceProcessor<>();
List<List<CoreLabel>> splitSentences = processor.process(tokens);
// for each sentence
for (List<CoreLabel> s : splitSentences) {
  // for each word
  for (CoreLabel token : s) {
    // here you can get the token value and position, e.g.:
    System.out.println(token.value() + " [" + token.beginPosition() + "-" + token.endPosition() + "]");
  }
}
Code example source: com.guokr/stan-cn-com
public WordsToSentencesAnnotator(boolean verbose, String boundaryTokenRegex,
                                 Set<String> boundaryToDiscard, Set<String> htmlElementsToDiscard,
                                 String newlineIsSentenceBreak, String boundaryMultiTokenRegex,
                                 Set<String> tokenRegexesToDiscard) {
  this(verbose, false,
       new WordToSentenceProcessor<CoreLabel>(boundaryTokenRegex,
           boundaryToDiscard, htmlElementsToDiscard,
           WordToSentenceProcessor.stringToNewlineIsSentenceBreak(newlineIsSentenceBreak),
           (boundaryMultiTokenRegex != null) ? TokenSequencePattern.compile(boundaryMultiTokenRegex) : null,
           tokenRegexesToDiscard));
}