
windows - Regex match in a log file, returning dynamic content above and below the match

Reposted. Author: 可可西里. Updated: 2023-11-01 11:34:12

I have some catch-all log files in the following format:

timestamp event summary
foo details
account name: userA
bar more details
timestamp event summary
baz details
account name: userB
qux more details
timestamp etc.

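I'd like to search the log file for userB and, if found, echo everything from the preceding timestamp up to (but not including) the following timestamp. There will probably be several events matching my search. Preferably, each match would be wrapped in some sort of --- start --- and --- end --- markers.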

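This would be perfect for pcregrep -M, right? The problem is that GnuWin32's pcregrep crashes when running multiline regex searches against large files, and these catch-all logs can be 100 MB or more.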

What I've tried

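My hackish workaround so far uses grep -B15 -A30 to find the matching lines and print the surrounding content, then pipes the now more manageable chunks into pcregrep for polishing. The problem is that some events are fewer than 10 lines long while others exceed 30, and I get unexpected results wherever the shorter events are encountered.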

:parselog <username> <logfile>

set silent=1
set count=0
set deez=20\d\d-\d\d-\d\d \d\d:\d\d:\d\d
echo Searching %~2 for records containing %~1...

for /f "delims=" %%I in (
    'grep -P -i -B15 -A30 ":\s+\b%~1\b(@mydomain\.ext)?$" "%~2" ^| pcregrep -M -i "^%deez%(.|\n)+?\b%~1\b(@mydomain\.ext|\r?\n)(.|\n)+?\n%deez%" 2^>NUL'
) do (
    echo(%%I| findstr "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL && (
        if defined silent (
            set silent=
            set found=1
            set /a "count+=1"
            echo;
            echo ---------------start of record !count!-------------
        ) else (
            set silent=1
            echo ----------------end of record !count!--------------
            echo;
        )
    )
    if not defined silent echo(%%I
)

goto :EOF
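
The trouble with fixed context windows can be shown on a toy log. The following is only a rough sketch in Python: `fixed_context` is a hypothetical helper that imitates the idea behind grep -B/-A (it does not reproduce grep's merging of overlapping windows or its `--` separators), and the log lines are invented placeholders modelled on the format above.

```python
# Toy log modelled on the format above; "timestamp" marks the start of a record.
# All of these lines are invented placeholders, not real log data.
LOG = """\
timestamp event A
account name: userA
timestamp event B
account name: userB
timestamp event C
foo details
bar details
account name: userC
""".splitlines()


def fixed_context(lines, needle, before, after):
    """Return a fixed-size window of lines around every line containing needle.

    Hypothetical helper: a simplified imitation of grep -B<before> -A<after>.
    """
    windows = []
    for i, line in enumerate(lines):
        if needle in line:
            windows.append(lines[max(0, i - before): i + after + 1])
    return windows


# Event B is only two lines long, so a window of 2 before / 2 after overshoots it.
window = fixed_context(LOG, "userB", before=2, after=2)[0]
for line in window:
    print(line)
```

The printed window starts with the tail of event A and runs on into event C, so any later multiline match has to cope with fragments of neighbouring records. A record-aware approach avoids this because it splits on the timestamp lines rather than on a line count.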

Is there a better way? I came across an interesting-looking awk command, something like:

awk "/start pattern/,/end pattern/" logfile

...but it would also need to match a pattern in the middle. Unfortunately, I'm not very familiar with awk syntax. Any suggestions?


Ed Morton suggested that I provide some sample log records and the expected output.

Sample catch-all log

2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730158    Mon Mar 25 08:02:28 2013    529 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:

Reason: Unknown user name or bad password

User Name: user5f

Domain: MYDOMAIN

Logon Type: 3

Logon Process: Advapi

Authentication Package: Negotiate

Workstation Name: dc3

Caller User Name: dc3$

Caller Domain: MYDOMAIN

Caller Logon ID: (0x0,0x3E7)

Caller Process ID: 400

Transited Services: -

Source Network Address: 169.254.7.86

Source Port: 40838
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account: USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:

Reason: Account locked out

User Name: USER6Q@MYDOMAIN.TLD

Domain: MYDOMAIN

Logon Type: 3

Logon Process: Advapi

Authentication Package: Negotiate

Workstation Name: dc3

Caller User Name: dc3$

Caller Domain: MYDOMAIN

Caller Logon ID: (0x0,0x3E7)

Caller Process ID: 400

Transited Services: -

Source Network Address: 169.254.7.89

Source Port: 55314
2013-03-25 08:02:32 Auth.Notice 169.254.5.62 Mar 25 08:36:38 DC4.mydomain.tld MSWinEventLog 5 Security 201326798 Mon Mar 25 08:36:37 2013 4624 Microsoft-Windows-Security-Auditing N/A Audit Success DC4.mydomain.tld 12544 An account was successfully logged on.

Subject:
Security ID: S-1-0-0
Account Name: -
Account Domain: -
Logon ID: 0x0

Logon Type: 3

New Logon:
Security ID: S-1-5-21-606747145-1409082233-725345543-160838
Account Name: DEPTACCT16$
Account Domain: MYDOMAIN
Logon ID: 0x1158e6012c
Logon GUID: {BCC72986-82A0-4EE9-3729-847BA6FA3A98}

Process Information:
Process ID: 0x0
Process Name: -

Network Information:
Workstation Name:
Source Network Address: 169.254.114.62
Source Port: 42183

Detailed Authentication Information:
Logon Process: Kerberos
Authentication Package: Kerberos
Transited Services: -
Package Name (NTLM only): -
Key Length: 0

This event is generated when a logon session is created. It is generated on the computer that was accessed.

The subject fields indicate...
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730162 Mon Mar 25 08:02:30 2013 675 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Pre-authentication failed:

User Name: USER8Y

User ID: %{S-1-5-21-606747145-1409082233-725345543-3904}

Service Name: krbtgt/MYDOMAIN

Pre-Authentication Type: 0x0

Failure Code: 0x19

Client Address: 169.254.87.158
2013-03-25 08:02:32 Auth.Critical etc.

Example command

call :parselog user6q \\path\to\catch-all.log

Expected result

---------------start of record 1-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account: USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
---------------end of record 1-------------


---------------start of record 2-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:

Reason: Account locked out

User Name: USER6Q@MYDOMAIN.TLD

Domain: MYDOMAIN

Logon Type: 3

Logon Process: Advapi

Authentication Package: Negotiate

Workstation Name: dc3

Caller User Name: dc3$

Caller Domain: MYDOMAIN

Caller Logon ID: (0x0,0x3E7)

Caller Process ID: 400

Transited Services: -

Source Network Address: 169.254.7.89

Source Port: 55314
---------------end of record 2-------------

Best answer

This is all you need with GNU awk (for IGNORECASE):

$ cat tst.awk
function prtRecord() {
    if (record ~ regexp) {
        printf "-------- start of record %d --------%s", ++numRecords, ORS
        printf "%s", record
        printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
    }
    record = ""
}
BEGIN{ IGNORECASE=1 }
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }

Or with any awk:

$ cat tst.awk
function prtRecord() {
    if (tolower(record) ~ tolower(regexp)) {
        printf "-------- start of record %d --------%s", ++numRecords, ORS
        printf "%s", record
        printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
    }
    record = ""
}
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }

Either way, you'd run it on UNIX as:

$ awk -v regexp=user6q -f tst.awk file

I don't know the Windows syntax, but I'd expect it to be very similar if not identical.

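Note the use of tolower() in the second script to lowercase both sides of the comparison, so that the match is case-insensitive. If you can instead pass in a search regexp that is already in the correct case, there is no need to call tolower() on either side of the comparison. No big deal either way; it would probably just speed the script up slightly.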

$ awk -v regexp=user6q -f tst.awk file
-------- start of record 1 --------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account: USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
--------- end of record 1 ---------

-------- start of record 2 --------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:

Reason: Account locked out

User Name: USER6Q@MYDOMAIN.TLD

Domain: MYDOMAIN

Logon Type: 3

Logon Process: Advapi

Authentication Package: Negotiate

Workstation Name: dc3

Caller User Name: dc3$

Caller Domain: MYDOMAIN

Caller Logon ID: (0x0,0x3E7)

Caller Process ID: 400

Transited Services: -

Source Network Address: 169.254.7.89

Source Port: 55314
--------- end of record 2 ---------
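
For readers who have Python to hand on Windows rather than awk, the same buffer-and-flush idea can be sketched as below. This is an illustrative port, not part of the original answer: the function names `matching_records` and `print_records` and the abbreviated `SAMPLE` lines are assumptions made up for the example, and `re.IGNORECASE` stands in for the awk IGNORECASE / tolower() handling.

```python
import re
from typing import Iterable, Iterator

# Assumption: a record starts on a line beginning with a date such as 2013-03-25,
# mirroring the awk pattern /^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/.
RECORD_START = re.compile(r"^\d+-\d+-\d+")


def matching_records(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield each whole record whose text matches pattern, ignoring case.

    lines must keep their trailing newlines (as an open file object yields them).
    Only one record is buffered at a time, so the log is streamed, not loaded.
    """
    wanted = re.compile(pattern, re.IGNORECASE)
    record = []
    for line in lines:
        # A new timestamp line closes the record collected so far.
        if RECORD_START.match(line) and record:
            text = "".join(record)
            if wanted.search(text):
                yield text
            record = []
        record.append(line)
    # Flush the final record, like the END rule in the awk script.
    if record:
        text = "".join(record)
        if wanted.search(text):
            yield text


def print_records(lines: Iterable[str], pattern: str) -> None:
    """Print matching records wrapped in start/end markers."""
    for n, text in enumerate(matching_records(lines, pattern), start=1):
        print(f"-------- start of record {n} --------")
        print(text, end="")
        print(f"--------- end of record {n} ---------\n")


# Abbreviated, made-up stand-ins for the log records in the question.
SAMPLE = """\
2013-03-25 08:02:32 Logon attempt
Logon account: USER6Q
2013-03-25 08:02:33 Logon Failure
User Name: user5f
2013-03-25 08:02:34 Logon Failure
User Name: USER6Q@MYDOMAIN.TLD
"""

records = list(matching_records(SAMPLE.splitlines(keepends=True), "user6q"))
print_records(SAMPLE.splitlines(keepends=True), "user6q")
```

For a real log, an open file object can be passed directly as `lines`, which should keep memory use bounded by the size of the largest single record rather than the size of the file.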

Regarding "windows - Regex match in a log file, returning dynamic content above and below the match", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15628017/
