
bash - How can I split this string?


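I'm currently trying to clean up some log files so that they are in an easier-to-read format. I've been using the GNU cut command, which works fairly well, but I can't think of a good way to remove the [INFO] part of the string.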

logs/logs/server_1283258036.log:2010-08-31 23:06:51 [INFO] <NateMar> where?!
logs/logs/server_1281904775.log:2010-08-15 22:59:53 [INFO] <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh
logs/logs/server_1282136782.log:2010-08-18 16:27:32 [INFO] <pinguin> <pinguin>§F :/
logs/logs/server_1282136782.log:2010-08-18 16:27:37 [INFO] <TotempaaltJ> <TotempaaltJ>§F That helped A LOT
logs/logs/server_1282136782.log:2010-08-18 16:27:37 [INFO] <Rizual> §b<Rizual>§F hm?
logs/logs/server_1282136782.log:2010-08-18 16:29:10 [INFO] <pinguin> <pinguin>§F bah
logs/logs/server_1282136782.log:2010-08-18 16:29:35 [INFO] <TotempaaltJ> <TotempaaltJ>§F Finished my houses
logs/logs/server_1282136782.log:2010-08-18 16:29:40 [INFO] <TotempaaltJ> <TotempaaltJ>§F or whatever
logs/logs/server_1282136782.log:2010-08-18 16:30:47 [INFO] <Rizual> §b<Rizual>§So much iron
logs/logs/server_1282136782.log:2010-08-18 16:30:58 [INFO] <TotempaaltJ> <TotempaaltJ>§F Ah yes, furnaces don't work.o
logs/logs/server_1282136782.log:2010-08-18 16:31:01 [INFO] <Rizual> §b<Rizual>§F They do
logs/logs/server_1282136782.log:2010-08-18 16:31:06 [INFO] <TotempaaltJ> <TotempaaltJ>§F Hm
logs/logs/server_1282136782.log:2010-08-18 16:31:08 [INFO] <Rizual> §b<Rizual>§F just need to use /lighter
logs/logs/server_1282136782.log:2010-08-18 16:31:12 [INFO] <Valrix> <Valrix>§FNotch fixed them?

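Ultimately I'd like to reduce each string to something like the following. Keep in mind that the logs come in two formats: an older format with two copies of the name, as in most of the lines above, and a newer format where the name appears only once (see the first log line, the <NateMar> one).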

2010-08-31 23:06:51 <NateMar> where?!    
2010-08-15 22:59:53 <BoonTheMoon> ohhhhhh (this one would require both the same editing as above, plus removal of the "extra" name §b<BoonTheMoon>§)

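How should I go about this? I've considered using awk, but I have a hard time understanding how it works, so I'm not sure how to set something up to do this. Any help would be appreciated, thanks!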

Best answer

Here are a few more ways to do this, in sed, awk and bash:

[ghoti@pc ~]$ cat text
logs/logs/server_1283258036.log:2010-08-31 23:06:51 [INFO] <NateMar> where?!
logs/logs/server_1281904775.log:2010-08-15 22:59:53 [INFO] <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

[ghoti@pc ~]$ sed 's/^[^:]*://;s/[[][^]]*[]] //' text
2010-08-31 23:06:51 <NateMar> where?!
2010-08-15 22:59:53 <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

[ghoti@pc ~]$ awk '{sub(/^[^:]+:/,""); $3=""} 1' text
2010-08-31 23:06:51  <NateMar> where?!
2010-08-15 22:59:53  <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

[ghoti@pc ~]$ while read line; do line=${line#*:}; echo "${line/\[*\] }"; done < text
2010-08-31 23:06:51 <NateMar> where?!
2010-08-15 22:59:53 <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

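These are simple, but their brevity means they may not be perfect. For example, the awk script deletes the third "word" but leaves behind the space that separated the now-empty word from its neighbour, so the output contains two consecutive spaces.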
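To see the leftover separator in isolation, here is a minimal sketch on a made-up four-word line. The function names blank_third and blank_third_squeezed are hypothetical helpers for illustration only, not part of the answer.

```shell
# Blanking a field keeps the separators on both sides of it,
# so the rebuilt record contains two consecutive spaces.
blank_third() { awk '{ $3 = "" } 1'; }

# Same idea, but collapse the first run of two spaces back into one.
blank_third_squeezed() { awk '{ $3 = ""; sub(/  /, " ") } 1'; }

printf 'a b c d\n' | blank_third            # a b  d
printf 'a b c d\n' | blank_third_squeezed   # a b d
```

Note that sub() only replaces the first double space it finds, so a message that already contained two spaces further along the line would be left alone.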

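Note that while one-liners look "elegant" for quick jobs, explicit code is usually the better idea, especially when you have to handle unknown input data or won't be checking the results right after your code runs.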

This is harder to read, but may be safer, depending on your input:

[ghoti@pc ~]$ awk '$3~/^[[].+[]]$/{$3="";sub(/  /," ")} {sub(/^[^:]+:/,"")} 1' text
2010-08-31 23:06:51 <NateMar> where?!
2010-08-15 22:59:53 <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

For the bash script, using a character class is safer than using a glob:

[ghoti@pc ~]$ shopt -s extglob
[ghoti@pc ~]$ while read line; do line=${line#*:}; echo "${line/\[+([[:upper:]])\] /}"; done < text
2010-08-31 23:06:51 <NateMar> where?!
2010-08-15 22:59:53 <BoonTheMoon> §b<BoonTheMoon>§ohhhhhh

Note that the extglob shopt option lets you use more advanced pattern matching inside parameter substitution patterns. Run man bash and look for Pathname Expansion for details.
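Why the character class is safer can be sketched with a message that itself contains a bracketed word. The sample line and the function names strip_glob and strip_class are made up for illustration. The plain glob takes the longest match, from the first "[" to the last "] ", while the extglob pattern only accepts a run of uppercase letters between the brackets.

```shell
shopt -s extglob   # must be enabled before the extglob pattern is used

# Plain glob: "*" is greedy, so everything up to the last "] " is removed.
strip_glob() { printf '%s\n' "${1/\[*\] /}"; }

# extglob: +([[:upper:]]) matches one or more uppercase letters only.
strip_class() { printf '%s\n' "${1/\[+([[:upper:]])\] /}"; }

sample='2010-08-18 16:27:32 [INFO] <a> look [here] now'
strip_glob "$sample"    # 2010-08-18 16:27:32 now
strip_class "$sample"   # 2010-08-18 16:27:32 <a> look [here] now
```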

Update:

You've added a new requirement to your question that wasn't there originally. Here's how to handle the new requirement with awk:

awk '$3~/^[[].+[]]$/{$3="";sub(/  /," ")} {sub(/^[^:]+:/,"")} $3~/^<.+>$/{sub(/^(§b)?<[[:alpha:]]+>§/,"",$4)} 1' text

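If the third field looks like a nickname in angle brackets, this simply strips the coloured nickname from the fourth field. It works for the samples you posted, but only you can determine whether it is right for your data.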

And in bash:

shopt -s extglob
while read date time tag nick line; do
    printf "%s %s %s %s\n" "${date#*:}" "$time" "$nick" "${line/#*([^< ])$nick??}"
done < text
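The loop above relies on read splitting each line on whitespace into the named variables, with the last variable receiving the rest of the line. A minimal sketch of just that splitting step follows. The helper name show_fields is hypothetical, and it uses read -r (an addition here, not in the loop above) so that backslashes in a message are kept literally.

```shell
# Hypothetical helper: split one log line the way the loop does and print
# the five pieces separated by "|" so the field boundaries are visible.
show_fields() {
  printf '%s\n' "$1" | {
    read -r date time tag nick line
    # ${date#*:} drops the shortest prefix ending in ":", i.e. the file name.
    printf '%s|%s|%s|%s|%s\n' "${date#*:}" "$time" "$tag" "$nick" "$line"
  }
}

show_fields 'logs/logs/server_1283258036.log:2010-08-31 23:06:51 [INFO] <NateMar> where?!'
# 2010-08-31|23:06:51|[INFO]|<NateMar>|where?!
```

Because the tag lands in its own variable, dropping [INFO] is just a matter of not printing it, which is what the printf in the loop does.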

Regarding "bash - How can I split this string?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12343156/
