
linux - How to delete the remaining records after the second occurrence of a pattern in a .CSV file

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 23:24:04

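I have a .CSV file with a few records after the header. However, before the end of the file there is a duplicate of the header, followed by some more records (which I don't need). Is there a way to detect the second occurrence of the header pattern and delete the rest of the file after that duplicate header? A sample of the file is shown below.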

col0,col1, col2, col3 , col4 , col5, col6 ,
1value0,1value1,1value2,1value3,1value4,1value5,1value6,
2value0, 2value1, 2value2, 2value3, 2value4, 2value5, 2value6,
3value, 3value1, 3value2, 3value3, 3value4, 3value5, 3value6,
2value0, 4value1, 4value2, 4value3, 4value4, 4value5, 4value6,
5value0, 5value1, 5value2, 5value3, 5value4, 5value5, 5value6,
6value0, 6value1, 6value2, 6value3, 6value4, 6value5, 6value6,
,,,,,,,
,,,,,,,
,,,,,,,
(n-1)value0, (n-1)value1, (n-1)value2, (n-1)value3, (n-1)value4, (n-1)value5, (n-1)value6,
(n)value0, (n)value1, (n)value2, (n)value3, (n)value4, (n)value5, (n)value6,
col0,col1, col2, col3 , col4 , col5, col6 ,
1,unwanted, records, after, the, duplicate, header
2,unwanted, records, after, the, duplicate, header
3,unwanted, records, after, the, duplicate, header

The output I expect looks like this:

col0,col1, col2, col3 , col4 , col5, col6 ,
1value0,1value1,1value2,1value3,1value4,1value5,1value6,
2value0, 2value1, 2value2, 2value3, 2value4, 2value5, 2value6,
3value, 3value1, 3value2, 3value3, 3value4, 3value5, 3value6,
2value0, 4value1, 4value2, 4value3, 4value4, 4value5, 4value6,
5value0, 5value1, 5value2, 5value3, 5value4, 5value5, 5value6,
6value0, 6value1, 6value2, 6value3, 6value4, 6value5, 6value6,
,,,,,,,
,,,,,,,
,,,,,,,
(n-1)value0, (n-1)value1, (n-1)value2, (n-1)value3, (n-1)value4, (n-1)value5, (n-1)value6,
(n)value0, (n)value1, (n)value2, (n)value3, (n)value4, (n)value5, (n)value6,

P.S.: I have GNU sed 4.1.5 and GNU Awk 3.1.5.

Any help is greatly appreciated.

Best answer

Try this:

awk 'a~$0{exit}NR==1{a=$0}1' file
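
How the one-liner works: NR==1{a=$0} saves the first line in a, the trailing 1 prints every line, and a~$0{exit} stops as soon as the saved header matches the current line. Note that a~$0 treats the current line as a regular expression, so a completely blank line (an empty regex) would be expected to trigger the exit early, and lines containing regex metacharacters such as ( or . are not compared literally. The following is a minimal sketch of a variant that compares the whole line with == instead. The function name trim_after_repeated_header is hypothetical and not part of the original answer, and the sketch assumes the repeated header is character-for-character identical to the first line.

```shell
# Hypothetical helper (not from the original answer): print a file up to,
# but not including, the second occurrence of its first line.
trim_after_repeated_header() {
    awk '
        NR == 1      { header = $0; print; next }  # remember and print the header
        $0 == header { exit }                      # exact repeat of the header: stop
                     { print }                     # any other record: keep it
    ' "$1"
}
```

It can be called as trim_after_repeated_header file > file.trimmed, writing the result to a new file rather than editing in place. It uses only basic awk features, so it should also work with the GNU Awk 3.1.5 mentioned in the question.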

Regarding "linux - How to delete the remaining records after the second occurrence of a pattern in a .CSV file", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/17972318/
