
linux - How do I combine two adjacent lines based on the total word count of the two lines (recursively)?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 00:31:13

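I am trying to merge two consecutive lines only if their combined word count is less than 20. A word here is a run of consecutive characters delimited by spaces or the end of the line.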

Example input:

1This line has five words.
2This line has unfortunately six words.
3This line has also six words.
4The above three lines have a total of 18 words, which is less than 20, and should be combined into one line.
5This line has only 6 words.

Expected output:

1This line has five words. 2This line has unfortunately six words. 3This line has also six words.
4The above three lines have a total of 18 words, which is less than 20, and should be combined into one line.
5This line has only 6 words.

I have the following code as a starting point, but I don't know how to write the condition so that it checks two consecutive lines.

awk '{while (sum(NF + NF+1) > 20) {sub ("\n", "")}}1'

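There are two problems. First, while (sum(NF + NF+1) > 20): how do I make it check the sum over two consecutive lines? Second, for some reason sub ("\n", "") does not remove the newline at the end of the line, even when I try it on a single line.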

Thanks.

Best answer

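Awk reads its input one line at a time and cannot know how many fields (words) the next line contains without reading it. So your logic cannot work.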
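
On the second problem from the question: with the default record separator, awk strips the trailing newline before the line is stored in $0, so sub("\n", "") has nothing to match. The following is a minimal sketch that illustrates this, not part of the original answer; the function name count_newlines is made up for the example.

```shell
# Hypothetical demo: count newline characters found inside awk records.
count_newlines() {
    # gsub returns the number of substitutions made in $0 for each record.
    awk '{ n += gsub(/\n/, "") } END { print n + 0 }'
}

# Two input lines, yet no record contains a newline, so this prints 0.
printf 'one two\nthree four\n' | count_newlines
```

Joining lines therefore comes down to controlling what gets printed, for example by buffering, and not to editing $0.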

Here is a straightforward way to do it: buffer lines until the running word count reaches 20, flush the buffer, and continue.

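awk '
# Running word count still below 20: append this line to the buffer.
(c += NF) < 20 {
    buf = (buf sep $0)
    sep = OFS
    next
}
# Limit reached: print the buffer, then restart it with the current line.
{
    if (NR > 1)
        print buf
    buf = $0
    sep = OFS    # later appends need a separator even if line 1 landed here
    c = NF
}
END {
    print buf
}' file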
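
The same buffering idea can be wrapped in a shell function with the threshold passed in as a parameter. This is only a sketch under assumptions: the function name merge_short_lines and the awk variables limit, count and have_buf are invented here and do not come from the original answer. It uses an explicit have_buf flag in place of the NR > 1 check, so empty input prints nothing.

```shell
# merge_short_lines LIMIT [FILE...]   (hypothetical helper; reads stdin if no FILE)
merge_short_lines() {
    limit=$1
    shift
    awk -v limit="$limit" '
        # Running word count still below the limit: append to the buffer.
        (count += NF) < limit {
            buf = (have_buf ? buf OFS : "") $0
            have_buf = 1
            next
        }
        # Limit reached: flush the buffer and restart it with this line.
        {
            if (have_buf) print buf
            buf = $0
            have_buf = 1
            count = NF
        }
        # Print whatever is still buffered at end of input.
        END { if (have_buf) print buf }
    ' "$@"
}

# Run it on the sample input from the question with a limit of 20 words.
merge_short_lines 20 <<'EOF'
1This line has five words.
2This line has unfortunately six words.
3This line has also six words.
4The above three lines have a total of 18 words, which is less than 20, and should be combined into one line.
5This line has only 6 words.
EOF
```

With that sample input, the first three lines should come out joined on one line and the last two unchanged, as in the expected output from the question.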

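Regarding "linux - How do I combine two adjacent lines based on the total word count of the two lines (recursively)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58366811/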
