
logstash grok: parse a line with the json filter


I am using ELK (Elasticsearch, Kibana, Logstash, Filebeat) to collect logs. I have a log file with lines like the following, each containing a JSON object, and my goal is to use Logstash Grok to extract the key/value pairs inside the JSON and forward them to Elasticsearch.

2018-03-28 13:23:01  charge:{"oldbalance":5000,"managefee":0,"afterbalance":"5001","cardid":"123456789","txamt":1}

2018-03-28 13:23:01 manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}

I am using the Grok Debugger to build the regex pattern and check the results. My current pattern is:

%{TIMESTAMP_ISO8601} %{SPACE} %{WORD:$:data}:{%{QUOTEDSTRING:key1}:%{BASE10NUM:value1}[,}]%{QUOTEDSTRING:key2}:%{BASE10NUM:value2}[,}]%{QUOTEDSTRING:key3}:%{QUOTEDSTRING:value3}[,}]%{QUOTEDSTRING:key4}:%{QUOTEDSTRING:value4}[,}]%{QUOTEDSTRING:key5}:%{BASE10NUM:value5}[,}]

As you can see, this is hard-coded, but in the real logs the JSON keys can be any word, the values can be integers, doubles, or strings, and the number of keys varies, so this solution is not acceptable. My parsing result is shown below, for reference only. I am using the standard Grok patterns.

My question is: is it wise to try to extract the keys inside the JSON at all, given that Elasticsearch itself uses JSON? And second, if I do extract key/value pairs from the JSON, is there a correct and concise Grok pattern for it?

The current Grok pattern gives the following output when parsing the first of the lines above.

{
  "TIMESTAMP_ISO8601": [["2018-03-28 13:23:01"]],
  "YEAR": [["2018"]],
  "MONTHNUM": [["03"]],
  "MONTHDAY": [["28"]],
  "HOUR": [["13", null]],
  "MINUTE": [["23", null]],
  "SECOND": [["01"]],
  "ISO8601_TIMEZONE": [[null]],
  "SPACE": [[""]],
  "WORD": [["charge"]],
  "key1": [["\"oldbalance\""]],
  "value1": [["5000"]],
  "key2": [["\"managefee\""]],
  "value2": [["0"]],
  "key3": [["\"afterbalance\""]],
  "value3": [["\"5001\""]],
  "key4": [["\"cardid\""]],
  "value4": [["\"123456789\""]],
  "key5": [["\"txamt\""]],
  "value5": [["1"]]
}

Second edit

Is it possible to use Logstash's json filter here? In my case, though, the JSON is only part of the line/event, not the whole event.


Third edit

I cannot see the updated solution actually parsing the JSON. My configuration is as follows:

filter {
  grok {
    match => {
      "message" => [
        "%{TIMESTAMP_ISO8601}%{SPACE}%{GREEDYDATA:json_data}"
      ]
    }
  }
}


filter {
  json {
    source => "json_data"
    target => "parsed_json"
  }
}

Instead of key:value pairs, I get the message plus the JSON as one string, and parsed_json is never populated: the JSON is not being parsed.

The test data is as follows:

2018-03-28 13:23:01  manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}
2018-03-28 13:23:03 payment:{"cuurentValue":5001,"reload":0,"newbalance":"5002","posid":"987654321","something":"new3","additionalFields":2}
2018-03-28 13:24:07 management:{"cuurentValue":5002,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}

[2018-06-04T15:01:30,017][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"manage:{\"cuurentValue\":5000,\"payment\":0,\"newbalance\":\"5001\",\"posid\":\"123456789\",\"something\":\"new2\",\"additionalFields\":1}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'manage': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}"; line: 1, column: 8]>}
[2018-06-04T15:01:30,017][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"payment:{\"cuurentValue\":5001,\"reload\":0,\"newbalance\":\"5002\",\"posid\":\"987654321\",\"something\":\"new3\",\"additionalFields\":2}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'payment': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"payment:{"cuurentValue":5001,"reload":0,"newbalance":"5002","posid":"987654321","something":"new3","additionalFields":2}"; line: 1, column: 9]>}
[2018-06-04T15:01:34,986][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"management:{\"cuurentValue\":5002,\"payment\":0,\"newbalance\":\"5001\",\"posid\":\"123456789\",\"something\":\"new2\",\"additionalFields\":1}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'management': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"management:{"cuurentValue":5002,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}"; line: 1, column: 12]>}

Please check the result (the original post included a screenshot, omitted here).

Best answer

You can use GREEDYDATA to assign the whole JSON block to a separate field, like this:

%{TIMESTAMP_ISO8601}%{SPACE}%{GREEDYDATA:json_data}

This will create a separate field for your JSON data:

{
  "TIMESTAMP_ISO8601": [["2018-03-28 13:23:01"]],
  "json_data": [["charge:{\"oldbalance\":5000,\"managefee\":0,\"afterbalance\":\"5001\",\"cardid\":\"123456789\",\"txamt\":1}"]]
}

Then apply a json filter on the json_data field, as follows:

json {
  source => "json_data"
  target => "parsed_json"
}
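
Note, however, that GREEDYDATA captured this way also swallows the charge:/manage: prefix, which is exactly what produces the "Unrecognized token" parser errors in the question's third edit: the json filter is handed a string like manage:{...}, which is not valid JSON. Below is a minimal sketch that captures the prefix into its own field so that json_data holds only the JSON object; the field names log_time and log_type are my own choices, not from the original post:

filter {
  grok {
    match => {
      # Capture the label (e.g. "charge", "manage") separately with %{WORD},
      # so json_data starts at the opening brace and is valid JSON.
      "message" => "%{TIMESTAMP_ISO8601:log_time}%{SPACE}%{WORD:log_type}:%{GREEDYDATA:json_data}"
    }
  }
  json {
    source => "json_data"
    target => "parsed_json"
  }
}

With this, parsed_json should hold the key/value pairs (oldbalance, managefee, and so on) as a nested object, and log_type keeps the charge/manage label for later filtering.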

A similar question about logstash grok and parsing a line with the json filter can be found on Stack Overflow: https://stackoverflow.com/questions/49933195/
