Here is my index:
PUT /my_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"my_ascii_folding": {
"type" : "asciifolding",
"preserve_original": "true"
}
},
"analyzer": {
"include_special_character": {
"type": "custom",
"filter": [
"lowercase",
"my_ascii_folding"
],
"tokenizer": "whitespace"
}
}
}
}
}
PUT /my_index/_mapping/formulas
{
"properties": {
"content": {
"type": "text",
"analyzer": "include_special_character"
}
}
}
POST /_bulk
{"index":{"_index":"my_index","_type":"formulas"}}
{"content":"formula =IF(SUM(3;4;5))"}
{"index":{"_index":"my_index","_type":"formulas"}}
{"content":"some if words: dif difuse"}
GET /my_index/_search
{
"query": {
"simple_query_string" : {
"query": "if(",
"analyzer": "include_special_character",
"fields": ["_all"]
}
}
}
GET /my_index/_search
{
"query": {
"simple_query_string" : {
"query": "=if(",
"analyzer": "include_special_character",
"fields": ["_all"]
}
}
}
Best answer
First, thank you for including all of the requests needed to reproduce the data set locally. That makes tracking down the answer much easier.
There are some rather interesting things going on here. The first thing I want to point out is what actually happens to your queries when the _all field is involved, because there is some subtle behavior that can easily cause confusion.
I will lean on the _analyze endpoint to help show what is happening.
First, here is a request that analyzes how the text is interpreted against the "content" field:
GET my_index/_analyze
{
"analyzer": "include_special_character",
"text": [
"formula =IF(SUM(3;4;5))"
],
"field": "content"
}
{
"tokens": [
{
"token": "formula",
"start_offset": 0,
"end_offset": 7,
"type": "word",
"position": 0
},
{
"token": "=if(sum(3;4;5))",
"start_offset": 8,
"end_offset": 23,
"type": "word",
"position": 1
}
]
}
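To build intuition for what this custom analyzer does (whitespace tokenizer, then the lowercase and asciifolding filters), here is a rough Python sketch. It is an approximation, not the real Lucene implementation, it ignores the preserve_original behavior of asciifolding (which would emit both the folded and the original token when they differ), and the function name is mine:

```python
import unicodedata

def include_special_character(text: str) -> list[str]:
    """Rough sketch of the custom analyzer:
    whitespace tokenizer -> lowercase filter -> asciifolding filter."""
    tokens = []
    for token in text.split():          # whitespace tokenizer
        token = token.lower()           # lowercase token filter
        folded = unicodedata.normalize("NFKD", token)
        folded = folded.encode("ascii", "ignore").decode("ascii")
        tokens.append(folded or token)  # crude asciifolding
    return tokens

print(include_special_character("formula =IF(SUM(3;4;5))"))
# → ['formula', '=if(sum(3;4;5))']
```

Note that "=IF(SUM(3;4;5))" survives as a single term: whitespace is the only split point, so all of the punctuation stays inside the token, exactly as in the _analyze output above.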
(For even more detail per token, "explain": true can be added to an _analyze request.) Next is the same analysis pointed at a different field, "test"; because the analyzer is still specified explicitly in the request, it takes precedence and the output is identical:
GET my_index/_analyze
{
"analyzer": "include_special_character",
"text": [
"formula =IF(SUM(3;4;5))"
],
"field": "test"
}
{
"tokens": [
{
"token": "formula",
"start_offset": 0,
"end_offset": 7,
"type": "word",
"position": 0
},
{
"token": "=if(sum(3;4;5))",
"start_offset": 8,
"end_offset": 23,
"type": "word",
"position": 1
}
]
}
Now drop the explicit analyzer from the request: with no analyzer specified and no mapping for the field, Elasticsearch falls back to the default standard analyzer, and the tokens change completely:
GET my_index/_analyze
{
"text": [
"formula =IF(SUM(3;4;5))"
],
"field": "test"
}
{
"tokens": [
{
"token": "formula",
"start_offset": 0,
"end_offset": 7,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "if",
"start_offset": 9,
"end_offset": 11,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "sum",
"start_offset": 12,
"end_offset": 15,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "3;4;5",
"start_offset": 16,
"end_offset": 21,
"type": "<NUM>",
"position": 3
}
]
}
The _all field has its own analysis chain, separate from the content field. The Elasticsearch documentation puts it this way: "The _all field is just a text field, and accepts the same parameters that other string fields accept, including analyzer, term_vectors, index_options, and store."
GET my_index/_analyze
{
"analyzer": "include_special_character",
"text": [
"some if words: dif difuse"
],
"field": "content"
}
{
"tokens": [
{
"token": "some",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 0
},
{
"token": "if",
"start_offset": 5,
"end_offset": 7,
"type": "word",
"position": 1
},
{
"token": "words:",
"start_offset": 8,
"end_offset": 14,
"type": "word",
"position": 2
},
{
"token": "dif",
"start_offset": 15,
"end_offset": 18,
"type": "word",
"position": 3
},
{
"token": "difuse",
"start_offset": 19,
"end_offset": 25,
"type": "word",
"position": 4
}
]
}
GET my_index/_analyze
{
"analyzer": "include_special_character",
"text": [
"=if("
],
"field": "_all"
}
{
"tokens": [
{
"token": "if",
"start_offset": 1,
"end_offset": 3,
"type": "<ALPHANUM>",
"position": 0
}
]
}
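The single token above shows the key point: when "=if(" goes through standard-style analysis, the punctuation is discarded and only "if" survives. A crude stand-in for that behavior (the real standard analyzer does full Unicode word segmentation; for example, it kept "3;4;5" together as a <NUM> token earlier, which this sketch would not reproduce):

```python
import re

def standard_ish(text: str) -> list[str]:
    # Crude stand-in for the standard analyzer: lowercase, then keep
    # only alphanumeric runs; punctuation like '=' and '(' is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())

print(standard_ish("=if("))
# → ['if']
```

So a query for "=if(" against _all effectively becomes a query for "if", which is why the special characters appear to be ignored in the _all searches below.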
GET /my_index/_search
{
"query": {
"match": {
"_all": "=if("
}
}
}
GET /my_index/_search
{
"query": {
"match": {
"_all": "if("
}
}
}
GET my_index/formulas/AV9GIDTggkgblFY6zpKT/_termvectors?fields=content
{
"_index": "my_index",
"_type": "formulas",
"_id": "AV9GIDTggkgblFY6zpKT",
"_version": 1,
"found": true,
"took": 0,
"term_vectors": {
"content": {
"field_statistics": {
"sum_doc_freq": 7,
"doc_count": 2,
"sum_ttf": 7
},
"terms": {
"=if(sum(3;4;5))": {
"term_freq": 1,
"tokens": [
{
"position": 1,
"start_offset": 8,
"end_offset": 23
}
]
},
"formula": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 7
}
]
}
}
}
}
}
PUT /my_index2
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"my_ascii_folding": {
"type": "asciifolding",
"preserve_original": "true"
}
},
"analyzer": {
"include_special_character_gram": {
"type": "custom",
"filter": [
"lowercase",
"my_ascii_folding"
],
"tokenizer": "ngram_tokenizer"
}
},
"tokenizer": {
"ngram_tokenizer": {
"type": "ngram",
"min_gram": 2,
"max_gram": 5,
"token_chars": [
"letter",
"digit",
"punctuation",
"symbol"
]
}
}
}
}
}
PUT /my_index2/_mapping/formulas
{
"properties": {
"content": {
"type": "text",
"analyzer": "include_special_character_gram"
}
}
}
POST /_bulk
{"index":{"_index":"my_index2","_type":"formulas"}}
{"content":"formula =IF(SUM(3;4;5))"}
{"index":{"_index":"my_index2","_type":"formulas"}}
{"content":"some if words: dif difuse"}
GET my_index2/formulas/AV9GZ3sSgkgblFY6zpK2/_termvectors?fields=content
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK2",
"_version": 1,
"found": true,
"took": 0,
"term_vectors": {
"content": {
"field_statistics": {
"sum_doc_freq": 102,
"doc_count": 2,
"sum_ttf": 106
},
"terms": {
"(3": {
"term_freq": 1,
"tokens": [
{
"position": 46,
"start_offset": 15,
"end_offset": 17
}
]
},
"(3;": {
"term_freq": 1,
"tokens": [
{
"position": 47,
"start_offset": 15,
"end_offset": 18
}
]
},
... Omitting the rest because of max response lengths.
}
}
}
GET /my_index2/_search
{
"query": {
"match": {
"content": {
"analyzer": "keyword",
"query": "=if("
}
}
}
}
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 2.9511943,
"hits": [
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK2",
"_score": 2.9511943,
"_source": {
"content": "formula =IF(SUM(3;4;5))"
}
},
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK3",
"_score": 0.30116585,
"_source": {
"content": "some if words: dif difuse"
}
}
]
}
}
GET my_index2/_analyze
{
"analyzer": "include_special_character_gram",
"text": [
"=if("
],
"field": "t"
}
{
"tokens": [
{
"token": "=i",
"start_offset": 0,
"end_offset": 2,
"type": "word",
"position": 0
},
{
"token": "=if",
"start_offset": 0,
"end_offset": 3,
"type": "word",
"position": 1
},
{
"token": "=if(",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 2
},
{
"token": "if",
"start_offset": 1,
"end_offset": 3,
"type": "word",
"position": 3
},
{
"token": "if(",
"start_offset": 1,
"end_offset": 4,
"type": "word",
"position": 4
},
{
"token": "f(",
"start_offset": 2,
"end_offset": 4,
"type": "word",
"position": 5
}
]
}
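The gram list above is straightforward to reproduce. Here is a minimal Python sketch of the ngram tokenizer with min_gram=2 and max_gram=5, ignoring the token_chars classes (irrelevant for this input, since "=if(" contains no whitespace):

```python
def ngrams(text: str, min_gram: int = 2, max_gram: int = 5) -> list[str]:
    """All substrings of length min_gram..max_gram, emitted per start
    offset from shortest to longest, mirroring the _analyze output."""
    return [text[i:i + n]
            for i in range(len(text))
            for n in range(min_gram, max_gram + 1)
            if i + n <= len(text)]

print(ngrams("=if("))
# → ['=i', '=if', '=if(', 'if', 'if(', 'f(']
```

Because "=if(" is also one of the grams produced at index time from "=IF(SUM(3;4;5))", an exact-token search for it can now hit the first document.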
GET my_index2/_analyze
{
"analyzer": "keyword",
"text": [
"=if("
]
}
{
"tokens": [
{
"token": "=if(",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 0
}
]
}
GET /my_index2/_search
{
"query": {
"match": {
"content": {
"query": "=if(",
"analyzer": "keyword"
}
}
}
}
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.56074005,
"hits": [
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK2",
"_score": 0.56074005,
"_source": {
"content": "formula =IF(SUM(3;4;5))"
}
}
]
}
}
PUT /my_index2/_mapping/formulas
{
"properties": {
"content": {
"type": "text",
"analyzer": "include_special_character_gram",
"search_analyzer": "keyword"
}
}
}
GET /my_index2/_search
{
"query": {
"match": {
"content": "=if("
}
}
}
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.56074005,
"hits": [
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK2",
"_score": 0.56074005,
"_source": {
"content": "formula =IF(SUM(3;4;5))"
}
}
]
}
}
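Putting the two halves of the final mapping together: content is ngram-analyzed at index time, while "search_analyzer": "keyword" leaves the query string as a single token, so a match simply asks whether that token is among the indexed grams. A toy end-to-end model of this (not how Lucene actually stores or scores anything; all names are mine):

```python
def ngrams(text: str, min_gram: int = 2, max_gram: int = 5) -> list[str]:
    return [text[i:i + n]
            for i in range(len(text))
            for n in range(min_gram, max_gram + 1)
            if i + n <= len(text)]

def index_terms(doc: str) -> set[str]:
    # Index-time analysis: lowercase, split on whitespace (space is not
    # in token_chars), then ngram each chunk.
    terms = set()
    for chunk in doc.lower().split():
        terms.update(ngrams(chunk))
    return terms

docs = {
    "AV9GZ3sSgkgblFY6zpK2": "formula =IF(SUM(3;4;5))",
    "AV9GZ3sSgkgblFY6zpK3": "some if words: dif difuse",
}
inverted = {doc_id: index_terms(text) for doc_id, text in docs.items()}

# Search-time analysis with the keyword analyzer: one token, unchanged.
query_token = "=if("
hits = [doc_id for doc_id, terms in inverted.items() if query_token in terms]
print(hits)
# → ['AV9GZ3sSgkgblFY6zpK2']
```

Only the formula document contains "=if(" as an indexed gram, so only it matches, which is exactly the behavior the final searches show.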
GET /my_index2/_search
{
"query": {
"simple_query_string": {
"query": "=if\\(",
"fields": ["content"]
}
}
}
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.56074005,
"hits": [
{
"_index": "my_index2",
"_type": "formulas",
"_id": "AV9GZ3sSgkgblFY6zpK2",
"_score": 0.56074005,
"_source": {
"content": "formula =IF(SUM(3;4;5))"
}
}
]
}
}
This question about elasticsearch simple query string with special characters such as ( and = is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/46877483/