elasticsearch - Elasticsearch: How to limit the Snowball Analyzer

Reposted by: 行者123 Updated: 2023-12-03 02:06:56

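With the Snowball analyzer, a query for "house" also returns results for other words that stem to "house". I need the analyzer for most searches, but in cases like this the returned results are irrelevant. How can I limit what the analyzer does in these cases?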

Best answer

You can do this with the keyword_marker token filter or the stem_exclusion parameter:

Preventing stemming

The stem_exclusion parameter for language analyzers (see Configuring language analyzers) allowed us to specify a list of words that should not be stemmed. Internally, these language analyzers use the keyword_marker token filter to mark the listed words as keywords, which prevents subsequent stemming token filters from touching those words.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/controlling-stemming.html
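
As a minimal sketch of the stem_exclusion approach quoted above, the request below configures a language analyzer with a list of protected words. It assumes that replacing the snowball analyzer with the built-in english language analyzer is acceptable for your index. The index name my_index_exclusions and the analyzer name my_english are hypothetical, and "skies" is only a placeholder for whichever words you want left unstemmed.

```console
PUT /my_index_exclusions
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_english": {
          "type": "english",
          "stem_exclusion": [ "skies" ]
        }
      }
    }
  }
}
```

Fields mapped with my_english would then stem as usual, except for the words listed in stem_exclusion.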

Specifying keywords in a file

While the language analyzers only allow us to specify an array of words in the stem_exclusion parameter, the keyword_marker token filter also accepts a keywords_path parameter which allows us to store all of our keywords in a file. The file should contain one word per line, and must be present on every node in the cluster. See Updating stopwords for tips on how to update this file.
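
A hedged sketch of the file-based variant described above: the keyword_marker filter reads its word list from keywords_path instead of an inline array. The index name my_index_from_file and the file name analysis/no_stem_words.txt are assumptions made for illustration. A relative path is resolved against the Elasticsearch config directory, and the file is expected to hold one word per line on every node.

```console
PUT /my_index_from_file
{
  "settings": {
    "analysis": {
      "filter": {
        "no_stem": {
          "type": "keyword_marker",
          "keywords_path": "analysis/no_stem_words.txt"
        }
      },
      "analyzer": {
        "my_english": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase", "no_stem", "porter_stem" ]
        }
      }
    }
  }
}
```

The no_stem filter has to come before the stemming filter in the chain, because it only marks tokens as keywords; the stemmer that runs afterwards is what skips them.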

This example (from the documentation) shows how to do it:

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "no_stem": {
          "type": "keyword_marker",
          "keywords": [ "skies" ] 
        }
      },
      "analyzer": {
        "my_english": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "no_stem",
            "porter_stem"
          ]
        }
      }
    }
  }
}
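
One way to check the result is the _analyze API, sketched here against the my_index and my_english names from the example above. The JSON request body is an assumption that holds for recent Elasticsearch versions; older releases took the analyzer and text as query-string parameters instead.

```console
GET /my_index/_analyze
{
  "analyzer": "my_english",
  "text": "sky skies"
}
```

If the filter chain works as intended, the token for "skies" should come back unchanged, while words that are not in the keywords list are still passed through the stemmer.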

The specifics will vary with your existing analyzer setup, but this should get you started.

Regarding "elasticsearch - Elasticsearch: How to limit the Snowball Analyzer", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24998462/
