
elasticsearch - Grok pattern for data delimited by pipes with whitespace and optional values


I have a text/log file in which the values are separated by the pipe symbol "|", with multiple spaces around the separators.

I would also like to do this without using gsub.

Does anyone know how to write a grok pattern to extract this in Logstash? I'm new to this, so thanks in advance. Below is an example line:

5000|       |       |applicationLog     |ClientLog      |SystemLog      |Green      |       |2014-01-07 11:58:48.76948      |12345 (0x1224)|1) Error 2)Sample Log | Configuration Manager

Best answer

Since the number of | characters between the different words is not consistent, you can match those separators with .*? and extract the remaining data with predefined grok patterns:

%{NUMBER:num}.*?%{WORD:2nd}.*?%{WORD:3rd}.*?%{WORD:4th}.*?%{WORD:5th}.*?%{TIMESTAMP_ISO8601}

This will give you:
{
  "num": [["5000"]],
  "BASE10NUM": [["5000"]],
  "2nd": [["applicationLog"]],
  "3rd": [["ClientLog"]],
  "4th": [["SystemLog"]],
  "5th": [["Green"]],
  "TIMESTAMP_ISO8601": [["2014-01-07 11:58:48.76948"]],
  "YEAR": [["2014"]],
  "MONTHNUM": [["01"]],
  "MONTHDAY": [["07"]],
  "HOUR": [["11", null]],
  "MINUTE": [["58", null]],
  "SECOND": [["48.76948"]],
  "ISO8601_TIMEZONE": [[null]]
}

You can test it on the online grok debugger.

Since you are new to grok, you may also want to read up on the grok filter plugin basics.
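
For context, a minimal Logstash filter sketch applying the pattern above might look like the following. Note that the :timestamp name on TIMESTAMP_ISO8601 is my addition so the match is stored as a named field; everything else follows the pattern as given.

filter {
  grok {
    # Lazily skip the pipe/space separators with .*? and capture the named fields
    match => {
      "message" => "%{NUMBER:num}.*?%{WORD:2nd}.*?%{WORD:3rd}.*?%{WORD:4th}.*?%{WORD:5th}.*?%{TIMESTAMP_ISO8601:timestamp}"
    }
  }
}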

If possible, I would also suggest looking at the dissect filter, which is faster and more efficient than grok:

The Dissect filter is a kind of split operation. Unlike a regular split operation where one delimiter is applied to the whole string, this operation applies a set of delimiters to a string value. Dissect does not use regular expressions and is very fast. However, if the structure of your text varies from line to line then Grok is more suitable. There is a hybrid case where Dissect can be used to de-structure the section of the line that is reliably repeated and then Grok can be used on the remaining field values with more regex predictability and less overall work to do.
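
As a rough sketch only (the field names below are placeholders, not from the original answer), a dissect mapping for the sample line could treat every | as a delimiter and then trim the padding spaces with mutate:

filter {
  dissect {
    # One %{...} capture per pipe-delimited column of the sample line
    mapping => {
      "message" => "%{num}|%{col2}|%{col3}|%{app_log}|%{client_log}|%{system_log}|%{status}|%{col8}|%{timestamp}|%{thread}|%{error_text}|%{component}"
    }
  }
  mutate {
    # dissect keeps the spaces that pad each column, so strip them afterwards
    strip => ["app_log", "client_log", "system_log", "status", "timestamp", "error_text", "component"]
  }
}

If the last part of the line ("1) Error 2)Sample Log | Configuration Manager") is actually a single field that happens to contain a |, dissect cannot tell that apart, and grok would be the better fit there.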

Regarding "elasticsearch - Grok pattern for data delimited by pipes with whitespace and optional values", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50934589/
