elasticsearch - Elasticsearch NGram Analyzer: changing the order of query results

Reposted. Author: 行者123. Updated: 2023-12-02 22:35:39

Elasticsearch query: changing the displayed results based on score

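The current query returns results for the title field in the following order:

  1. Quick 123
  2. Foxes Quick
  3. Quick
  4. Foxes Quick Quick
  5. Quick Foxes

    Shouldn't result 3 come first instead?

    Also, "Foxes Quick Quick" contains "Quick" twice, so it should get some preference in the results, yet it comes in fourth.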

    Index settings:

    {
      "fundraisers": {
        "settings": {
          "index": {
            "number_of_shards": "5",
            "provided_name": "fundraisers",
            "creation_date": "1546515635025",
            "analysis": {
              "analyzer": {
                "my_analyzer": {
                  "filter": ["lowercase"],
                  "tokenizer": "my_tokenizer"
                },
                "search_analyzer_search": {
                  "filter": ["lowercase"],
                  "tokenizer": "search_tokenizer_search"
                }
              },
              "tokenizer": {
                "my_tokenizer": {
                  "token_chars": ["letter", "digit"],
                  "min_gram": "3",
                  "type": "edge_ngram",
                  "max_gram": "50"
                },
                "search_tokenizer_search": {
                  "token_chars": ["letter", "digit", "whitespace"],
                  "min_gram": "3",
                  "type": "ngram",
                  "max_gram": "50"
                }
              }
            },
            "number_of_replicas": "1",
            "uuid": "mVweO4_sT3Ww00MzdLyavw",
            "version": {
              "created": "6020399"
            }
          }
        }
      }
    }

    Query:

    GET fundraisers/_search?explain=true
    {
      "query": {
        "match_phrase": {
          "title": {
            "query": "qui",
            "analyzer": "my_analyzer"
          }
        }
      }
    }

    Mapping:

    {
      "fundraisers": {
        "mappings": {
          "fundraisers": {
            "properties": {
              "status": {
                "type": "text"
              },
              "suggest": {
                "type": "completion",
                "analyzer": "simple",
                "preserve_separators": true,
                "preserve_position_increments": true,
                "max_input_length": 50
              },
              "title": {
                "type": "text",
                "analyzer": "my_analyzer"
              },
              "twitterUrl": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "videoLinks": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "zipCode": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }

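    Am I overcomplicating this by using match_phrase, search analyzers and ngrams, or is there a simpler way to achieve the expected results?

    Reference:
    https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-match-query.html

    Best answer

    OK, first let's create a minimal and reproducible setup: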

    PUT test
    {
      "settings": {
        "index": {
          "number_of_shards": "1",
          "number_of_replicas": "1",
          "analysis": {
            "analyzer": {
              "my_analyzer": {
                "filter": ["lowercase"],
                "tokenizer": "my_tokenizer"
              },
              "search_analyzer_search": {
                "filter": ["lowercase"],
                "tokenizer": "search_tokenizer_search"
              }
            },
            "tokenizer": {
              "my_tokenizer": {
                "token_chars": ["letter", "digit"],
                "min_gram": "3",
                "type": "edge_ngram",
                "max_gram": "50"
              },
              "search_tokenizer_search": {
                "token_chars": ["letter", "digit", "whitespace"],
                "min_gram": "3",
                "type": "ngram",
                "max_gram": "50"
              }
            }
          }
        }
      },
      "mappings": {
        "_doc": {
          "properties": {
            "title": {
              "type": "text",
              "analyzer": "my_analyzer"
            }
          }
        }
      }
    }

    PUT test/_doc/1
    {
      "title": "Quick 123"
    }

    PUT test/_doc/2
    {
      "title": "Foxes Quick"
    }

    PUT test/_doc/3
    {
      "title": "Quick"
    }

    PUT test/_doc/4
    {
      "title": "Foxes Quick Quick"
    }

    PUT test/_doc/5
    {
      "title": "Quick Foxes"
    }
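
To get a feel for what my_analyzer puts into the index for these titles, here is a rough Python approximation of an edge_ngram tokenizer with token_chars letter and digit, followed by a lowercase filter. It is only a sketch, not Elasticsearch code, and the function name edge_ngrams is made up for illustration; the authoritative output would come from running the analyzer on the index itself.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=50):
    """Approximate my_analyzer: split on non letter/digit characters,
    lowercase, then emit prefixes of each word from min_gram to max_gram."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits (word characters minus "_").
    for word in re.findall(r"[^\W_]+", text.lower()):
        # Words shorter than min_gram produce no tokens at all.
        for size in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:size])
    return tokens


if __name__ == "__main__":
    for title in ["Quick 123", "Foxes Quick", "Quick",
                  "Foxes Quick Quick", "Quick Foxes"]:
        print(title, "->", edge_ngrams(title))
    # The query text goes through the same analysis.
    print("qui", "->", edge_ngrams("qui"))
```

Under this approximation the query text "qui" becomes the single token qui, so a phrase query over it has only one term to match, and every title containing "Quick" holds that token once per occurrence.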

    Then, let's try the simplest query:

    GET test/_search
    {
      "query": {
        "match": {
          "title": {
            "query": "qui"
          }
        }
      }
    }

    Now your order is:

  1. Quick
  2. Foxes Quick Quick
  3. Quick 123
  4. Foxes Quick
  5. Quick Foxes
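
One way to see why this order comes out is to approximate the scoring by hand. The sketch below is not Elasticsearch code. It assumes the default BM25 parameters (k1 = 1.2, b = 0.75), re-implements the edge n-gram analysis in plain Python, counts field length as the number of emitted tokens, and leaves out the IDF factor, which is identical for every document when all of them sit on one shard as in the test index. The helper names are hypothetical.

```python
import re

K1, B = 1.2, 0.75  # assumed default BM25 parameters


def edge_ngrams(text, min_gram=3, max_gram=50):
    """Approximate my_analyzer (edge n-grams of letter/digit runs, lowercased)."""
    tokens = []
    for word in re.findall(r"[^\W_]+", text.lower()):
        for size in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:size])
    return tokens


def tf_component(tf, doc_len, avg_len):
    """Term-frequency part of BM25 with field-length normalization."""
    return tf * (K1 + 1) / (tf + K1 * (1 - B + B * doc_len / avg_len))


titles = {
    1: "Quick 123",
    2: "Foxes Quick",
    3: "Quick",
    4: "Foxes Quick Quick",
    5: "Quick Foxes",
}

tokens = {doc_id: edge_ngrams(title) for doc_id, title in titles.items()}
avg_len = sum(len(toks) for toks in tokens.values()) / len(tokens)

# Score each document for the single query token "qui".
scores = {
    doc_id: tf_component(toks.count("qui"), len(toks), avg_len)
    for doc_id, toks in tokens.items()
}

# Highest score first; ties fall back to document id.
ranking = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

if __name__ == "__main__":
    for doc_id in ranking:
        print(doc_id, titles[doc_id], round(scores[doc_id], 3))
```

Under these assumptions the ranking is documents 3, 4, 1, then 2 and 5 tied, which is the order listed above: the short title "Quick" wins on field length, and the second occurrence in "Foxes Quick Quick" lifts it above the remaining titles. In the original five-shard index, term statistics are computed per shard by default, which may explain the different order seen there.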

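    That is pretty much what you expected, right? There may be other use cases that this query does not cover, but IMO you would then have to use multi_match and search across fields with different analyzers, because I am not sure a phrase search on edge n-grams makes much sense.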
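
As a rough illustration of the multi_match idea, the Python sketch below only builds a request body and prints it as JSON; it does not talk to a cluster. The subfield title.standard is hypothetical: it does not exist in the mapping shown here and would have to be added (for example as a multi-field analyzed with the standard analyzer) before such a query could work. The boost value and the most_fields type are likewise assumptions chosen for the example.

```python
import json


def build_multi_match(query_text):
    """Build a multi_match body that searches the edge n-gram field and a
    hypothetical whole-word subfield, boosting whole-word matches."""
    return {
        "query": {
            "multi_match": {
                "query": query_text,
                "type": "most_fields",
                # "title.standard" is an assumed subfield, not part of the
                # original mapping.
                "fields": ["title", "title.standard^2"],
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted after "GET test/_search".
    print(json.dumps(build_multi_match("quick"), indent=2))
```

With a setup along these lines, a partial input such as "qui" would still match through the edge n-gram field, while a full word such as "quick" could score higher when it also matches the whole-word subfield.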

    Regarding elasticsearch - Elasticsearch NGram Analyzer: changing the order of query results, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54065319/
