
ruby-on-rails - Solr, Sunspot, Bad Request, illegal character

Reposted. Author: 行者123. Updated: 2023-12-04 22:12:31

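I am introducing Sunspot search into my project. I got a proof of concept working by searching only on the name field. When I added the description field and reindexed Solr, I got the following error.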

** Invoke sunspot:reindex (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sunspot:reindex
Skipping progress bar: for progress reporting, add gem 'progress_bar' to your Gemfile
rake aborted!
RSolr::Error::Http: RSolr::Error::Http - 400 Bad Request
Error: {'responseHeader'=>{'status'=>400,'QTime'=>18},'error'=>{'msg'=>'Illegal character ((CTRL-CHAR, code 11))
at [row,col {unknown-source}]: [42,1]','code'=>400}}

Request Data: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><add><doc><field name=\"id\">ItemsDesign 1322</field><field name=\"type\">ItemsDesign</field><field name=\"type\">ActiveRecord::Base</field><field name=\"class_name\">ItemsDesign</field><field name=\"name_text\">River City Clocks Musical Multi-Colored Quartz Cuckoo Clock</field><field name=\"description_text\">This colorful chalet style German quartz cuckoo clock accurately keeps time and plays 12 different melodies. Many colorful flowers are painted on the clock case and figures of a Saint Bernard and Alpine horn player are on each side of the clock dial. Two decorative pine cone weights are suspended beneath the clock case by two chains. The heart shaped pendulum continously swings back and forth.&#13;On every

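I am assuming the bad character is the &#13; that you can see towards the bottom. That &#13; is littered through a lot of the descriptions. I'm not even sure what character that is.

What can I do to make Solr ignore it, or to clean the data so that Solr can handle it?

Thanks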
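A note on the error message: the Solr response reports character code 11, which is the vertical tab (VT, 0x0B). The &#13; entity visible in the request is a carriage return (code 13), which XML 1.0 permits. A minimal sketch for locating the offending characters in a description follows. The helper names `xml_illegal_control?` and `illegal_xml_chars` are hypothetical and not part of Sunspot or RSolr, and the check covers only the ASCII control range below 0x20.

```ruby
# Hypothetical helpers (not part of Sunspot or RSolr).
# XML 1.0 allows only tab (0x09), line feed (0x0A) and carriage return (0x0D)
# among the characters below 0x20; every other one is rejected by the parser.
XML_ALLOWED_CONTROLS = [0x09, 0x0A, 0x0D].freeze

def xml_illegal_control?(char)
  code = char.ord
  code < 0x20 && !XML_ALLOWED_CONTROLS.include?(code)
end

# Returns one hash per illegal character: its index in the string and its code.
def illegal_xml_chars(text)
  text.each_char.with_index
      .select { |char, _index| xml_illegal_control?(char) }
      .map { |char, index| { index: index, code: char.ord } }
end

# Made-up sample: "\r" is code 13 (allowed), "\v" is code 11 (illegal).
sample = "swings back and forth.\rOn every\vhour"
illegal_xml_chars(sample).each do |hit|
  puts "illegal character code #{hit[:code]} at index #{hit[:index]}"
end
```

Running something like this over the description column (for example in a Rails console) would show which records carry a code 11 character.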

Best answer

Put the following in an initializer to automatically strip any UTF-8 control characters from Sunspot calls:

# config/initializers/sunspot.rb
module Sunspot
  #
  # DataExtractors present an internal API for the indexer to use to extract
  # field values from models for indexing. They must implement the #value_for
  # method, which takes an object and returns the value extracted from it.
  #
  module DataExtractor #:nodoc: all
    #
    # AttributeExtractors extract data by simply calling a method on the block.
    #
    class AttributeExtractor
      def initialize(attribute_name)
        @attribute_name = attribute_name
      end

      def value_for(object)
        Filter.new( object.send(@attribute_name) ).value
      end
    end

    #
    # BlockExtractors extract data by evaluating a block in the context of the
    # object instance, or if the block takes an argument, by passing the object
    # as the argument to the block. Either way, the return value of the block is
    # the value returned by the extractor.
    #
    class BlockExtractor
      def initialize(&block)
        @block = block
      end

      def value_for(object)
        Filter.new( Util.instance_eval_or_call(object, &@block) ).value
      end
    end

    #
    # Constant data extractors simply return the same value for every object.
    #
    class Constant
      def initialize(value)
        @value = value
      end

      def value_for(object)
        Filter.new(@value).value
      end
    end

    #
    # A Filter to allow easy value cleaning
    #
    class Filter
      def initialize(value)
        @value = value
      end

      def value
        strip_control_characters @value
      end

      def strip_control_characters(value)
        return value unless value.is_a? String

        value.chars.inject("") do |str, char|
          unless char.ascii_only? and (char.ord < 32 or char.ord == 127)
            str << char
          end
          str
        end
      end
    end
  end
end
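Note that the filter above drops every ASCII character below code 32, which includes tab, line feed and carriage return, so multi-line descriptions are flattened when indexed. If that is not wanted, a narrower variant could be used inside `strip_control_characters`. The sketch below is an assumption, not the code from the linked issue: the method name `strip_xml_illegal_controls` is hypothetical, and it keeps the three control characters XML 1.0 allows while still removing DEL (127), as the answer does.

```ruby
# Hypothetical alternative to Filter#strip_control_characters (an assumption,
# not taken from Sunspot or from the linked issue).
# Matches the ASCII control characters XML 1.0 rejects (everything below 0x20
# except tab, line feed and carriage return), plus DEL (0x7F).
ILLEGAL_CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/

def strip_xml_illegal_controls(value)
  # Non-string values (numbers, dates, nil) pass through untouched.
  return value unless value.is_a?(String)

  value.gsub(ILLEGAL_CONTROL_CHARS, "")
end

# The vertical tab is removed; the carriage return and newline survive.
cleaned = strip_xml_illegal_controls("back and forth.\r\nOn every\vhour")
puts cleaned.inspect
```

Whether to keep or drop line breaks depends on how the field is analyzed in Solr; either choice avoids the 400 response, since only the characters below 0x20 other than tab, line feed and carriage return are rejected by the XML parser.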

Source (Sunspot GitHub Issues): Sunspot Solr Reindexing failing due to illegal characters

Regarding ruby-on-rails - Solr, Sunspot, Bad Request, illegal character, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23375336/
