
ruby-on-rails-3.2 - Symbols in an Elasticsearch query string

Reposted. Author: 行者123. Updated: 2023-11-29 02:45:20

I have "documents" (ActiveRecord records) with an attribute called deviation. The attribute has values like "Bin X", "Bin $", "Bin q", "Bin %", and so on.

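I am trying to use tire/elasticsearch to search on this attribute. I am using the whitespace analyzer to index the deviation attribute. Here is my code for creating the index: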

settings :analysis => {
  :filter => {
    :ngram_filter => {
      :type => "nGram",
      :min_gram => 2,
      :max_gram => 255
    },
    :deviation_filter => {
      :type => "word_delimiter",
      :type_table => ['$ => ALPHA']
    }
  },
  :analyzer => {
    :ngram_analyzer => {
      :type => "custom",
      :tokenizer => "standard",
      :filter => ["lowercase", "ngram_filter"]
    },
    :deviation_analyzer => {
      :type => "custom",
      :tokenizer => "whitespace",
      :filter => ["lowercase"]
    }
  }
} do
  mapping do
    indexes :id, :type => 'integer'
    [:equipment, :step, :recipe, :details, :description].each do |attribute|
      indexes attribute, :type => 'string', :analyzer => 'ngram_analyzer'
    end
    indexes :deviation, :analyzer => 'whitespace'
  end
end
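
Note that the mapping points `deviation` at the built-in `whitespace` analyzer, not at the custom `deviation_analyzer` defined in the settings. As a rough illustration of the difference, the sketch below imitates both in plain Ruby. The helper names are hypothetical and these are approximations, not calls into Elasticsearch: a whitespace tokenizer splits on whitespace and keeps symbols, and the custom analyzer additionally lowercases each token.

```ruby
# Plain-Ruby stand-ins for the two analyzers (hypothetical helpers,
# approximations only; nothing here talks to Elasticsearch).

# Built-in "whitespace" analyzer: split on runs of whitespace,
# keeping symbols and letter case untouched.
def whitespace_tokens(text)
  text.split
end

# Custom "deviation_analyzer": whitespace tokenizer + lowercase filter.
def deviation_analyzer_tokens(text)
  whitespace_tokens(text).map(&:downcase)
end

p whitespace_tokens('Bin $')          # ["Bin", "$"]
p deviation_analyzer_tokens('Bin $')  # ["bin", "$"]
```

In both cases the symbol survives as its own token, so the symbol is unlikely to be lost at index time for this field.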

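The search seems to work fine when the query string contains no special characters. For example, Bin X returns only those records that contain the words Bin AND X. However, searching for something like Bin $ or Bin % returns every result that contains the word Bin, almost ignoring the symbol (results that do contain the symbol rank higher than results that don't).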

Here is the search method I created:

def self.search(params)
  tire.search(load: true) do
    query { string "#{params[:term].downcase}:#{params[:query]}", default_operator: "AND" }
    size 1000
  end
end
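
To make the query concrete, the sketch below reproduces the string interpolation from `search` in isolation. Both helper names are hypothetical and not part of the original model. In Lucene query syntax a field prefix applies only to the term that directly follows it, so in `deviation:Bin $` the `$` may be matched against the default field instead of `deviation`. Whether that accounts for the behaviour described here is an assumption; the second helper shows the field-grouping form, `field:(terms)`, which applies the field to every term.

```ruby
# Hypothetical helper mirroring the interpolation used inside `search`.
def build_query_string(params)
  "#{params[:term].downcase}:#{params[:query]}"
end

# Hypothetical variant: wrap the user input in parentheses so that
# Lucene's field grouping applies the field to all terms.
def build_grouped_query_string(params)
  "#{params[:term].downcase}:(#{params[:query]})"
end

params = { term: 'Deviation', query: 'Bin $' }
puts build_query_string(params)          # deviation:Bin $
puts build_grouped_query_string(params)  # deviation:(Bin $)
```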

And here is how I build the search form:

<div>
  <%= form_tag issues_path, :class=> "formtastic issue", method: :get do %>
    <fieldset class="inputs">
      <ol>
        <li class="string input medium search query optional stringish inline">
          <% opts = ["Description", "Detail","Deviation","Equipment","Recipe", "Step"] %>
          <%= select_tag :term, options_for_select(opts, params[:term]) %>
          <%= text_field_tag :query, params[:query] %>
          <%= submit_tag "Search", name: nil, class: "btn" %>
        </li>
      </ol>
    </fieldset>
  <% end %>
</div>

Best answer

You can sanitize the query string. Here is a sanitizer that works for everything I have tried throwing at it:

def sanitize_string_for_elasticsearch_string_query(str)
  # Escape special characters
  # http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
  escaped_characters = Regexp.escape('\\/+-&|!(){}[]^~*?:')
  str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

  # AND, OR and NOT are used by lucene as logical operators. We need
  # to escape them
  ['AND', 'OR', 'NOT'].each do |word|
    escaped_word = word.split('').map {|char| "\\#{char}" }.join('')
    str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
  end

  # Escape odd quotes: when the number of double quotes is odd, escape
  # the last one. \2 is the text after that quote (the pattern has only
  # two groups, so \3 would silently drop it).
  quote_count = str.count '"'
  str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1

  str
end

params[:query] = sanitize_string_for_elasticsearch_string_query(params[:query])
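
To see what this kind of sanitizer does to typical input, here is a self-contained sketch of the same three steps. The name `escape_query_string` is hypothetical, and it uses the block form of `gsub`/`sub` so the replacement text is not subject to backreference parsing. It is an illustration of the approach, not a drop-in replacement.

```ruby
# Hypothetical, self-contained variant of the sanitizer (illustration only).
def escape_query_string(str)
  # Characters reserved by the Lucene query parser, as listed in the answer.
  special = Regexp.escape('\\/+-&|!(){}[]^~*?:')

  # Step 1: prefix each reserved character with a backslash.
  escaped = str.gsub(/([#{special}])/) { "\\#{$1}" }

  # Step 2: backslash-escape every letter of AND, OR and NOT so they are
  # not parsed as boolean operators.
  %w[AND OR NOT].each do |word|
    escaped_word = word.chars.map { |char| "\\#{char}" }.join
    escaped = escaped.gsub(/\s*\b#{word}\b\s*/) { " #{escaped_word} " }
  end

  # Step 3: with an odd number of double quotes, escape the last one.
  if escaped.count('"').odd?
    escaped = escaped.sub(/(.*)"(.*)/m) { "#{$1}\\\"#{$2}" }
  end

  escaped
end

puts escape_query_string('Bin $')          # Bin $
puts escape_query_string('Bin (1+1)')      # Bin \(1\+1\)
puts escape_query_string('cats AND dogs')  # cats \A\N\D dogs
puts escape_query_string('say "hi')        # say \"hi
```

Note that `$` and `%` are not in the escaped set, so an input such as `Bin $` passes through unchanged; the sanitizer mainly protects against characters that the query parser treats as syntax.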

Regarding ruby-on-rails-3.2 - Symbols in an Elasticsearch query string, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16205341/
