
elastic-stack - Logstash cannot parse array indices from a URL

Reposted · Author: 行者123 · Updated: 2023-12-01 03:30:37

I am trying to extract query parameters from a URL. The troublesome line in the log file I am parsing looks like this:

127.0.0.1 - - [09/May/2016:09:32:19 +0200] "GET /ps?attrib[vendor][]=GOK&attrib[vendor][0]=GOK HTTP/1.1" 200 12049 "-" "-"  
The first occurrence of attrib produces a hash (as expected). The second occurrence, however, triggers an exception:
IndexError: string not matched
[]= at org/jruby/RubyString.java:3910
set at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-event-2.3.3-java/lib/logstash/util/accessors.rb:64
[]= at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-event-2.3.3-java/lib/logstash/event.rb:136
filter at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-kv-2.1.0/lib/logstash/filters/kv.rb:287
each at org/jruby/RubyHash.java:1342
filter at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-kv-2.1.0/lib/logstash/filters/kv.rb:287
multi_filter at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/filters/base.rb:151
each at org/jruby/RubyArray.java:1613
multi_filter at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/filters/base.rb:148
filter_func at (eval):189
filter_batch at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:267
each at org/jruby/RubyArray.java:1613
inject at org/jruby/RubyEnumerable.java:852
filter_batch at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:265
worker_loop at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:223
start_workers at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:201

My guess is that this happens because Logstash interprets the array index from the URL as a string, while the index is actually an integer.
After several days of googling and trying different configurations I have hit a dead end. Any idea how to make this work?
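To see what the kv filter is working with, the query string can be split into key/value pairs outside Logstash. This is a minimal Python sketch; `parse_qsl` stands in for the kv filter's `field_split => "&"` step and is not part of Logstash itself:

```python
from urllib.parse import parse_qsl

# The query string from the problematic log line.
query = "attrib[vendor][]=GOK&attrib[vendor][0]=GOK"

# Splitting on "&" and "=" leaves the bracket syntax intact in the keys;
# with recursive => true the kv filter then tries to turn "[]" and "[0]"
# into nested fields, and the string index "0" is where the
# "IndexError: string not matched" comes from.
pairs = parse_qsl(query)
print(pairs)
# [('attrib[vendor][]', 'GOK'), ('attrib[vendor][0]', 'GOK')]
```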

For debugging purposes:

logstash config


input {
  file {
    path => "/var/log/apache2/some.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth}\s?(%{NUMBER:seconds:int}\/%{NUMBER:microseconds:int})? \[%{HTTPDATE:timestamp}\] "%{WORD:verb} (%{WORD:schema}:)?[\S]+/(%{DATA:endpoint})\?%{DATA:query_string} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}(\s{1}(?:%{HOSTNAME:backend_used}|-) (?:%{NUMBER:backend_time_seconds:float}|-)s)?'
    }
  }

  urldecode {
    field => "query_string"
    charset => "ISO-8859-1"
  }

  kv {
    field_split => "&"
    source => "query_string"
    recursive => true
    allow_duplicate_values => false
  }

  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }

  geoip {
    source => "clientip"
  }

  useragent {
    source => "agent"
    target => "useragent"
  }
}

output {
  stdout {
    codec => json
  }
}
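As a quick sanity check of what the grok filter extracts, here is a stripped-down stand-in for the pattern above (a hypothetical, much simpler regex, not the real grok pattern) that pulls `query_string` out of the sample log line:

```python
import re

# Sample line from the question.
line = ('127.0.0.1 - - [09/May/2016:09:32:19 +0200] '
        '"GET /ps?attrib[vendor][]=GOK&attrib[vendor][0]=GOK HTTP/1.1" '
        '200 12049 "-" "-"')

# Grab everything between the "?" in the request and " HTTP/".
m = re.search(r'"\w+ [^?]*\?(?P<query_string>\S+) HTTP/', line)
print(m.group("query_string"))
# attrib[vendor][]=GOK&attrib[vendor][0]=GOK
```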

custom dynamic template


{
  "template": "apache_elk_example",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "numeric_detection": true,
      "dynamic_templates": [
        {
          "message_field": {
            "mapping": {
              "index": "analyzed",
              "omit_norms": true,
              "type": "string"
            },
            "match_mapping_type": "string",
            "match": "message"
          }
        },
        {
          "string_fields": {
            "mapping": {
              "index": "analyzed",
              "omit_norms": true,
              "type": "string",
              "dynamic": true,
              "fields": {
                "raw": {
                  "index": "not_analyzed",
                  "ignore_above": 256,
                  "type": "string"
                }
              }
            },
            "match_mapping_type": "string",
            "match": "*"
          }
        }
      ],
      "properties": {
        "geoip": {
          "dynamic": true,
          "properties": {
            "location": {
              "type": "geo_point"
            }
          },
          "type": "object"
        },
        "@version": {
          "index": "not_analyzed",
          "type": "string"
        }
      },
      "_all": {
        "enabled": true
      }
    }
  }
}

Best answer

attrib[vendor][]=GOK and attrib[vendor][0]=GOK are semantically almost identical to PHP. Have you considered removing the numeric indices before the kv filter? Something like:

# Remove numeric indices from arrays, otherwise the kv filter will choke, eg:
# attrib[vendor][0]=GOK becomes attrib[vendor][]=GOK
mutate {
  gsub => [
    "query_string", "\[\d+\]", "[]"
  ]
}

kv {
  ...
}
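The effect of the gsub can be checked outside Logstash. A minimal Python sketch of the same substitution, using the query string from the log line above:

```python
import re

query = "attrib[vendor][]=GOK&attrib[vendor][0]=GOK"

# Same pattern as the mutate/gsub above: any "[<digits>]" becomes "[]".
cleaned = re.sub(r"\[\d+\]", "[]", query)
print(cleaned)
# attrib[vendor][]=GOK&attrib[vendor][]=GOK
```

With allow_duplicate_values => false in the kv filter, the two now-identical pairs collapse into one, so the filter never sees a numeric index.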

Regarding "elastic-stack - Logstash cannot parse array indices from a URL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38126600/
