for-loop - AWK: concatenate and process three or more files with a method similar to the FNR==NR approach

Reposted. Author: 行者123. Updated: 2023-12-05 03:40:04

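I am learning awk, and I found that the FNR==NR approach is a very common way to process two files. If FNR==NR, the current line comes from the first file. FNR resets to 1 at the start of each file while NR keeps counting across the concatenated input, so !(FNR==NR) obviously means the second file.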
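For reference, the two-file idiom described above can be sketched as follows. This is only an illustration: the join condition (field 3 of file2 matching field 2 of file1) and the names `dir`, `seen`, and `two_file_out` are assumptions made for the example, not part of the question.

```shell
# Recreate two of the sample files from this question in a scratch directory.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/file1"
printf 'X|Y|A3\nX|Y|A4\nX|Y|A5\n' > "$dir/file2"

# NR == FNR holds only while the first file is being read:
# remember its second field, then skip to the next record.
# Every later record belongs to file2; print it if its third field was seen.
two_file_out=$(awk -F'|' '
NR == FNR { seen[$2]; next }
$3 in seen
' "$dir/file1" "$dir/file2")

printf '%s\n' "$two_file_out"
```

With the sample data this prints the two file2 rows whose third field (A3, A4) also occurs in file1.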

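When it comes to three or more files, I can't see a way to tell which is the second file and which is the third, because both satisfy the same !(FNR==NR) condition. This made me try to figure out how there could be something like FNR2 and FNR3.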

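So I implemented a way to process three files in a single awk call, pretending that each file has its own counter: FNR1, FNR2, FNR3. For each file I wrote a for loop that runs separately. The condition is the same for every loop, NR==FNR#, and I actually got what I expected.

So I wonder whether there is a more sober, concise method that delivers results similar to the following awk code.

Sample file contents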

$ cat file1
X|A1|Z
X|A2|Z
X|A3|Z
X|A4|Z
$ cat file2
X|Y|A3
X|Y|A4
X|Y|A5
$ cat file3
A1|Y|Z
A4|Y|Z

AWK for loop

$ cat fnrarray.sh
awk -v FS='[|]' '{ for(i=FNR ; i<=NR && i<=FNR && NR==FNR; i++) {x++; print "NR:",NR,"FNR1:",i,"FNR:",FNR,"\tfirst file\t"}
for(i=FNR ; i+x<=NR && i<=FNR && NR==FNR+x; i++) {y++; print "NR:",NR,"FNR2:",i+x,"FNR:",FNR,"\tsecond file\t"}
for(i=FNR ; i+x+y<=NR && i<=FNR && NR==FNR+x+y; i++) {print "NR:",NR,"FNR3:",i+x+y,"FNR:",FNR,"\tthird file\t"}
}' file1 file2 file3

Current and desired output

$ sh fnrarray.sh
NR: 1 FNR1: 1 FNR: 1 first file
NR: 2 FNR1: 2 FNR: 2 first file
NR: 3 FNR1: 3 FNR: 3 first file
NR: 4 FNR1: 4 FNR: 4 first file
NR: 5 FNR2: 5 FNR: 1 second file
NR: 6 FNR2: 6 FNR: 2 second file
NR: 7 FNR2: 7 FNR: 3 second file
NR: 8 FNR3: 8 FNR: 1 third file
NR: 9 FNR3: 9 FNR: 2 third file

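You can see that NR lines up with FNR#, and it is easy to read which NR belongs to which file#.

Another method

I found this method, FNR==1{++f} f==1 {}, here: Handling 3 Files using awk

But with this method arr1[1] is replaced every time a new line is read.

Failed attempt 1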

$ awk -v FS='[|]' 'FNR==1{++f} f==1 {split($2,arr1); print arr1[1]}' file1 file2 file3 
A1
A2
A3
A4
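
A note on why the value changes on every line here: split() deletes all elements of its target array before storing the new pieces, so an array filled by split() only ever holds data from the current record. The f counter is not the cause. A minimal sketch, assuming the goal is to collect the second field of every file1 line, keeps the f==1 guard and appends instead (the names `dir`, `k`, and `append_out` are introduced only for this example):

```shell
# Recreate the three sample files in a scratch directory.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/file1"
printf 'X|Y|A3\nX|Y|A4\nX|Y|A5\n' > "$dir/file2"
printf 'A1|Y|Z\nA4|Y|Z\n' > "$dir/file3"

append_out=$(awk -F'|' '
FNR == 1 { ++f }                            # f counts the files started so far
f == 1   { arr1[++k] = $2; print arr1[1] }  # append to arr1, never overwrite
' "$dir/file1" "$dir/file2" "$dir/file3")

printf '%s\n' "$append_out"
```

This prints A1 four times, once per line of file1, matching the for-loop version.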

Success with the for loop (arr1[1] does not change)

$ awk -v FS='[|]' '{for(i=FNR ; i<=NR && i<=FNR && NR==FNR; i++) {arr1[++k]=$2; print arr1[1]}}' file1 file2 file3 
A1
A1
A1
A1

Best answer

When it comes to three or more files I can't see a way to tell which is the second and which is the third file, as both have the same !(FNR==NR) condition. This made me try to figure out how there can be something like FNR2 and FNR3?

Here is an example:

$ cat f1
X|A1|Z
X|A2|Z
X|A3|Z
X|A4|Z

$ cat f2
X|Y|A3
X|Y|A4
X|Y|A5

$ cat f3
A1|Y|Z
A4|Y|Z

Sample output:

$ awk -F '|' 'FNR==1{file++}{array[file, FNR]=$0; max=max>FNR?max:FNR}END{for(f=1; f<=file; f++){ for(row=1; row<=max; row++){ key=f SUBSEP row; if(key in array)print "file: "f,"row :"row,"record: "array[key]   } }}' f1 f2 f3
file: 1 row :1 record: X|A1|Z
file: 1 row :2 record: X|A2|Z
file: 1 row :3 record: X|A3|Z
file: 1 row :4 record: X|A4|Z
file: 2 row :1 record: X|Y|A3
file: 2 row :2 record: X|Y|A4
file: 2 row :3 record: X|Y|A5
file: 3 row :1 record: A1|Y|Z
file: 3 row :2 record: A4|Y|Z

Explanation:

awk -F '|' 'FNR==1{                   # FNR resets to 1 at the start of every file
    file++                            # so increment the file counter whenever FNR==1
}
{
    # array name  : array
    # array key   : file, FNR
    # array value : $0, the current record/row
    array[file, FNR] = $0;
    # track the largest row count seen in any file
    max = max > FNR ? max : FNR
}

END{                                  # END block runs after all files are read
    # iterate over the files;
    # the variable file now holds the total number of files read
    for(f=1; f<=file; f++)
    {
        # iterate over the records of each file;
        # the variable max holds the maximum row count
        for(row=1; row<=max; row++)
        {
            # build the key: file-number SUBSEP row-number
            key=f SUBSEP row;
            # if the key exists in array, print its value
            if(key in array)
                print "file: "f,"row :"row,"record: "array[key]
        }
    }
}' f1 f2 f3
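
If the records do not need to be held until END, the same FNR==1 counter can label each line as it is read. The following is a minimal sketch of that streaming variant, not part of the original answer. The output format and the names `dir` and `stream_out` are chosen only for illustration.

```shell
# Recreate the three sample files in a scratch directory.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/f1"
printf 'X|Y|A3\nX|Y|A4\nX|Y|A5\n' > "$dir/f2"
printf 'A1|Y|Z\nA4|Y|Z\n' > "$dir/f3"

stream_out=$(awk -F'|' '
FNR == 1 { file++ }                              # a new file has started
{ print "NR:", NR, "FNR:", FNR, "file:", file }  # label every record
' "$dir/f1" "$dir/f2" "$dir/f3")

printf '%s\n' "$stream_out"
```

A pattern such as file == 2 { ... } can then select the records of one particular file, in the same way NR==FNR selects the first. One caveat: an empty file never produces a record with FNR==1, so the counter skips it.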

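Another option is to use true multidimensional arrays, as shown below. This is gawk-specific, of course.

This assumes the file names are unique; otherwise use FNR==1{ file++} and use file in place of FILENAME.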

$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2018 Free Software Foundation.

$ awk -F '|' '{
true_multi_array[FILENAME][FNR] = $0
}
END{
for(file in true_multi_array)
for(row in true_multi_array[file])
print "file:",file, "row :" row, "record:" true_multi_array[file][row]
}' f1 f2 f3
file: f1 row :1 record:X|A1|Z
file: f1 row :2 record:X|A2|Z
file: f1 row :3 record:X|A3|Z
file: f1 row :4 record:X|A4|Z
file: f2 row :1 record:X|Y|A3
file: f2 row :2 record:X|Y|A4
file: f2 row :3 record:X|Y|A5
file: f3 row :1 record:A1|Y|Z
file: f3 row :2 record:A4|Y|Z
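
To see why the unique-file-name assumption matters, here is a small sketch that passes the same file twice and counts records per key. It uses plain awk arrays rather than gawk arrays of arrays, and the names `dir`, `n`, `by_name`, and `by_counter` are hypothetical.

```shell
# Recreate one sample file in a scratch directory.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/f1"

# Keyed by FILENAME: both passes over f1 land in the same bucket.
by_name=$(awk '{ n[FILENAME]++ } END { for (k in n) print n[k] }' "$dir/f1" "$dir/f1")

# Keyed by a per-file counter: each pass gets its own bucket.
by_counter=$(awk 'FNR == 1 { file++ } { n[file]++ }
END { for (f = 1; f <= file; f++) print f, n[f] }' "$dir/f1" "$dir/f1")

printf 'by FILENAME: %s\n' "$by_name"
printf 'by counter:\n%s\n' "$by_counter"
```

Keyed by FILENAME the two passes merge into a single count of 8, while the counter keeps them apart as two groups of 4.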

Regarding "for-loop - AWK: concatenate and process three or more files with a method similar to the FNR==NR approach", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68300877/
