
python - How to split a log file using awk or sed (replacing a Python script)

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:01:12

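Suppose you have one instrument log file per day. The instrument may restart several times during the day, and for some reason you want a separate file for each restart.

I ended up doing this with Python, but I would like to do the same thing with awk or sed. Please let me know what you think.

The Python script, split_instrument_log.py: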

def split_instrument_log(filename):
    """Split filename into filename.0, filename.1, ... at each restart marker."""
    first_line = '--- ServiceHost Start ---'
    count = 0
    with open(filename, 'r') as handle:
        text = handle.read()
    # Split on newline + marker, so a marker on the very first line stays in chunk 0.
    split_text = text.split('\n' + first_line)
    for split in split_text:
        split_file_name = filename + "." + str(count)
        with open(split_file_name, 'w') as split_handle:
            if count > 0:
                # split() consumed the marker, so write it back first.
                split_handle.write(first_line)
            split_handle.write(split)
        count = count + 1

filename = "instrument.log"
split_instrument_log(filename)

Sample instrument.log:

--- ServiceHost Start ---
11:43:54.745 00000001 HOST I Creating System 2/19/2018 11:43:54 AM
...
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I Creating System 2/19/2018 2:47:37 PM
...
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I Creating System 2/19/2018 6:27:57 PM
...

Resulting instrument.log.0:

--- ServiceHost Start ---
11:43:54.745 00000001 HOST I Creating System 2/19/2018 11:43:54 AM
...

I also have another log in which each line begins with a timestamp and an address, for example:

[05/02/2018 13:32:30.160 UTC] Main Thread (0xb4692000)/ 0 INF socMainExecutable

How can the awk script be updated for this log, given that the timestamp and address are not constant?

Best answer

With awk this is straightforward.

Input:

$ more instrument.log
--- ServiceHost Start ---
11:43:54.745 00000001 HOST I Creating System 2/19/2018 11:43:54 AM
blabla1
blabla2
blabla3
...
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I Creating System 2/19/2018 2:47:37 PM
...
blabla4
blabla5
blabla6
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I Creating System 2/19/2018 6:27:57 PM
...
blabla7
blabla8
blabla9

awk script:

awk -v i=-1 '/--- ServiceHost Start ---/{i++; print $0 > "instrument.log."i; next}{print $0 >> "instrument.log."i}' instrument.log

Output:

$ more instrument.log.?
::::::::::::::
instrument.log.0
::::::::::::::
--- ServiceHost Start ---
11:43:54.745 00000001 HOST I Creating System 2/19/2018 11:43:54 AM
blabla1
blabla2
blabla3
...
::::::::::::::
instrument.log.1
::::::::::::::
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I Creating System 2/19/2018 2:47:37 PM
...
blabla4
blabla5
blabla6
::::::::::::::
instrument.log.2
::::::::::::::
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I Creating System 2/19/2018 6:27:57 PM
...
blabla7
blabla8
blabla9

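Explanation:

  • -v i=-1 passes the variable i to awk with an initial value of -1. You could also define it in a BEGIN clause instead: BEGIN{i=-1}.
  • /--- ServiceHost Start ---/{i++; print $0 > "instrument.log."i; next}: whenever awk finds a line containing --- ServiceHost Start ---, it increments i and prints that line to the file "instrument.log."i before moving on to the next line. (If the file already exists, it is overwritten.)
  • {print $0 >> "instrument.log."i}: every other line is simply appended to the file "instrument.log."i.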
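
The follow-up about the timestamp-prefixed log is not addressed above. As a minimal sketch in Python (the language of the question's original script), the awk logic can be mirrored line by line with a regular expression as the marker, so the varying timestamp and address no longer matter. The function name split_log, the RESTART pattern, and the assumption that a restart is signalled by the line containing socMainExecutable are illustrative guesses, not something the source confirms.

```python
import re

# Assumed restart marker: a bracketed UTC timestamp, any thread name, a hex
# address in parentheses, and the constant text "socMainExecutable".
# Only the constant parts are matched literally; adjust this to the real log.
RESTART = re.compile(
    r"^\[\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3} UTC\] "
    r".*\(0x[0-9a-fA-F]+\).*\bsocMainExecutable\b"
)


def split_log(path, marker=RESTART):
    """Write path.0, path.1, ... starting a new file at each marker line.

    Returns the list of file names written.
    """
    written = []
    out = None
    try:
        with open(path, "r") as src:
            for line in src:
                if marker.search(line):
                    # A new restart: close the previous part, open the next.
                    if out is not None:
                        out.close()
                    name = f"{path}.{len(written)}"
                    out = open(name, "w")
                    written.append(name)
                # Lines before the first marker are dropped in this sketch.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

For the original instrument log, the same function could be called as split_log("instrument.log", re.compile(r"--- ServiceHost Start ---")). The same idea should carry over to awk by replacing the fixed /--- ServiceHost Start ---/ pattern with a regular expression that matches only the constant parts of the line.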

Regarding "python - How to split a log file using awk or sed (replacing a Python script)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49230695/
