
bash - Extract text between two strings on different lines

Reposted. Author: 行者123. Updated: 2023-11-29 09:16:04

I have a large email file that contains random hosts like the following:

......
HOSTS: test-host,host2.domain.com,
host3.domain.com,another-testing-host,host.domain.
com,host.anotherdomain.net,host2.anotherdomain.net,
another-local-host, TEST-HOST

DATE: August 11 2015 9:00
.......

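The hosts are always separated by commas, but the list can span one, two, or more lines (I can't control this; unfortunately, it is what the email client does).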

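So I need to extract all the text between the string "HOSTS:" and the string "DATE:", join it into a single line, and replace the commas with newlines, like this: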

test-host
host2.domain.com
host3.domain.com
another-testing-host
host.domain.com
host.anotherdomain.net
host2.anotherdomain.net
another-local-host
TEST-HOST

This is what I have come up with so far, but I lose everything that is on the same line as "HOSTS":

sed '/HOST/,/DATE/!d;//d' ${file} | tr -d '\n' | sed -E "s/,\s*/\n/g"

Best answer

Something like this might work for you:

sed -n '/HOSTS:/{:a;N;/DATE/!ba;s/[[:space:]]//g;s/,/\n/g;s/.*HOSTS:\|DATE.*//g;p}' "$file"
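To try the command, here is a minimal sketch that runs it against the sample from the question. The function name extract_hosts and the temporary file are illustrative assumptions, not part of the original answer. It also assumes GNU sed, since \| in the pattern, \n in the replacement, and a label followed by ";" on one line are not portable to every sed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner from the answer (GNU sed assumed).
extract_hosts() {
  sed -n '/HOSTS:/{:a;N;/DATE/!ba;s/[[:space:]]//g;s/,/\n/g;s/.*HOSTS:\|DATE.*//g;p}' "$1"
}

# Sample input copied from the question, written to a temporary file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
......
HOSTS: test-host,host2.domain.com,
host3.domain.com,another-testing-host,host.domain.
com,host.anotherdomain.net,host2.anotherdomain.net,
another-local-host, TEST-HOST

DATE: August 11 2015 9:00
.......
EOF

# Prints one host per line, matching the list shown in the question.
extract_hosts "$sample"
```

Because all whitespace is deleted before the commas are converted, a host name that was wrapped in the middle (host.domain. / com) is joined back together.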

Breakdown:

-n                        # Suppress automatic printing of the pattern space
/HOSTS:/ {                # On a line containing the literal HOSTS:
  :a                      # Label used for branching (goto)
  N                       # Append the next input line to the pattern space
  /DATE/!ba               # Loop back to :a until the pattern space contains DATE
  s/[[:space:]]//g        # Delete all whitespace, including the embedded newlines
  s/,/\n/g                # Replace each comma with a newline
  s/.*HOSTS:\|DATE.*//g   # Remove everything up to and including HOSTS:,
                          # and everything from DATE onward
  p                       # Print the pattern space
}
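For comparison, the pipeline from the question can probably be kept with a small change. As far as I can tell, the empty regex in //d reuses the last regex that was applied, so it deletes both the HOSTS line and the DATE line, which would explain why the first line of hosts goes missing. The sketch below strips the HOSTS: prefix with a substitution instead and deletes only the DATE: line. The function name hosts_via_pipeline and the temporary file are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same idea as the pipeline in the question, but the
# HOSTS: prefix is stripped with s/// and only the DATE: line is deleted.
hosts_via_pipeline() {
  # 1. Keep only the HOSTS: .. DATE: range, drop the prefix and the DATE: line.
  # 2. Join the wrapped lines and remove spaces.
  # 3. Turn each comma into a newline.
  sed '/HOSTS:/,/DATE:/!d; s/^HOSTS:[[:space:]]*//; /^DATE:/d' "$1" |
    tr -d ' \n' |
    tr ',' '\n'
  # tr removed every newline, so terminate the last line explicitly.
  echo
}

# Sample input copied from the question, written to a temporary file.
mail_file=$(mktemp)
trap 'rm -f "$mail_file"' EXIT
cat > "$mail_file" <<'EOF'
......
HOSTS: test-host,host2.domain.com,
host3.domain.com,another-testing-host,host.domain.
com,host.anotherdomain.net,host2.anotherdomain.net,
another-local-host, TEST-HOST

DATE: August 11 2015 9:00
.......
EOF

hosts_via_pipeline "$mail_file"
```

This variant avoids the GNU-only \| and \n constructs, at the cost of assuming that HOSTS: and DATE: each start a line, as they do in the sample.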

Regarding "bash - Extract text between two strings on different lines", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38015153/
