gpt4 book ai didi

bash - AWK - 通过 for 循环和条件检查处理多个文件

转载 作者:行者123 更新时间:2023-11-29 09:36:01 25 4
gpt4 key购买 nike

文件 1:我的文件名_WEEK.csv

w27_2018,257,1,26.20,0.00,24.26
w28_2018,257,1,7.97,0.00,24.26
w29_2018,257,1,34.86,0.00,24.26
w30_2018,257,1,3.29,0.00,24.26

文件 2:myfilename_MONTH.csv

m07_2018,257,1,94.78,0.00,121.31
m08_2018,257,1,719.60,0.00,262.47
m09_2018,257,1,14925.60,0.00,13903.24
m10_2018,257,1,51099.66,0.00,81600.69

文件 3:myfilename_HALF.csv

h02_2018,257,1,155345.19,480029.21,235802.91
h01_2019,257,1,273961.84,552545.36,140706.27
h02_2018,258,1,3250552.06,1299785.91,3697749.57
h01_2019,258,1,3582585.66,2670427.72,4009391.28

日历文件:

20180805,08/05/2018,w27_2018,WK27 2018,m07_2018,AUG 2018,q03_2018,Q03 2018,h02_2018,H02 2018,a2018,FY2018,27,WEEK 27,01,SUNDAY
20180806,08/06/2018,w27_2018,WK27 2018,m07_2018,AUG 2018,q03_2018,Q03 2018,h02_2018,H02 2018,a2018,FY2018,27,WEEK 27,02,MONDAY
...
20180811,08/11/2018,w27_2018,WK27 2018,m07_2018,AUG 2018,q03_2018,Q03 2018,h02_2018,H02 2018,a2018,FY2018,27,WEEK 27,07,SATURDAY
20180812,08/12/2018,w28_2018,WK28 2018,m07_2018,AUG 2018,q03_2018,Q03 2018,h02_2018,H02 2018,a2018,FY2018,28,WEEK 28,01,SUNDAY
..
20180816,08/16/2018,w28_2018,WK28 2018,m07_2018,AUG 2018,q03_2018,Q03 2018,h02_2018,H02 2018,a2018,FY2018,28,WEEK 28,05,THURSDAY

预期输出(为便于阅读而添加的换行符):

2018,w27_2018,WK27 2018,257,1,26.20,0.00,24.26
2018,w27_2018,WK27 2018,258,1,97192.07,9028.38,52130.32
2018,w27_2018,WK27 2018,300,1,181.44,0.00,-69.72

2018,m07_2018,AUG 2018,257,1,94.78,0.00,121.31
2018,m07_2018,AUG 2018,258,1,509253.46,45141.91,399648.71
2018,m07_2018,AUG 2018,300,1,409.10,0.00,-348.60

2018,h02_2018,H02 2018,257,1,155345.19,480029.21,235802.91
2018,h02_2018,H02 2018,258,1,3250552.06,1299785.91,3697749.57
2018,h02_2018,H02 2018,300,1,1112.93,0.00,-1164.35

我想加入所有 myfilename_* 使用 calendar_file 添加标签和财政年度:

个别命令是:

awk -F, 'NR==FNR {a[$3]=substr($12,3,4) FS $3 FS $4; next} {print a[$1] FS $2 FS $3 FS $4 FS $5 FS $6}' calendar_file myfilename_WEEK.csv >> my_report.csv

awk -F, 'NR==FNR {a[$5]=substr($12,3,4) FS $5 FS $6; next} {print a[$1] FS $2 FS $3 FS $4 FS $5 FS $6}' calendar_file myfilename_MONTH.csv >> my_report.csv

awk -F, 'NR==FNR {a[$9]=substr($12,3,4) FS $9 FS $10; next} {print a[$1] FS $2 FS $3 FS $4 FS $5 FS $6}' calendar_file myfilename_HALF.csv >> my_report.csv

我正在尝试将所有这些连接到一个循环中:

我已经尝试了以下但它不起作用:

    for exp_file in `ls myfilename_*.csv`
do
awk -F, '\
{ \
if(NR==FNR && FILENAME ~ /WEEK/) {a[$3]=substr($12,3,4) FS $3 FS $4; next} ;\
if(NR==FNR && FILENAME ~ /MONTH/) {a[$5]=substr($12,3,4) FS $5 FS $6; next} ;\
if(NR==FNR && FILENAME ~ /HALF/) {a[$9]=substr($12,3,4) FS $9 FS $10; next} ;\
{print a[$1] FS $2 FS $3 FS $4 FS $5 FS $6} \
}' calendar_file $exp_file >> my_report.csv
done

我怎样才能做到这一点?提前感谢您的帮助!

最佳答案

第一种方式(GNU awk,如果你没有GNU awk请留言):

awk -F, 'NR==FNR{y=substr($12,3,4); a[$3]=y FS $3 FS $4; b[$5]=y FS $5 FS $6; c[$9]=y FS $9 FS $10; next} FNR==1{printf nl;nl="\n"} match(FILENAME, /myfilename_([A-Z]*)/, f){NF=6;switch(f[1]){case "WEEK": $1=a[$1];break; case "MONTH": $1=b[$1];break; case "HALF": $1=c[$1];}}1' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

多行以提高可读性:

awk -F, '
NR==FNR{
y=substr($12,3,4);
a[$3]=y FS $3 FS $4;
b[$5]=y FS $5 FS $6;
c[$9]=y FS $9 FS $10;
next
}
FNR==1{printf nl;nl=ORS} ## The newlines between sectors, if you do not need those newlines then remove this line.
match(FILENAME, /myfilename_([A-Z]*)/, f){
NF=6; ## To limit results for 6 columns only, can remove it here.
switch(f[1]){
case "WEEK":
$1=a[$1];
break;
case "MONTH":
$1=b[$1];
break;
case "HALF":
$1=c[$1];
}
}1' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

对其的更新:

awk -F, '
NR==FNR{
y=substr($12,3,4);
a[$3]=y FS $3 FS $4;
b[$5]=y FS $5 FS $6;
c[$9]=y FS $9 FS $10;
next
}
FNR==1{printf nl;nl=ORS} ## The newlines between sectors, if you do not need those newlines then remove this line.
match(FILENAME, /myfilename_([A-Z]*)/, f){
NF=6; ## To limit results for 6 columns only, can remove in your case.
$1 = f[1]=="WEEK" ? a[$1] : ( f[1]=="MONTH" ? b[$1] : (f[1]=="HALF" ? c[$1] : $1) )
}1' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

第二种方式,更简洁且不使用switch(也是GNU awk):

awk -F, '
NR==FNR{
y=substr($12,3,4);
a[$3 "WEEK"]=y FS $3 FS $4;
a[$5 "MONTH"]=y FS $5 FS $6;
a[$9 "HALF"]=y FS $9 FS $10;
next
}
FNR==1{printf nl;nl=ORS} ## The newlines between sectors, if you do not need those newlines then remove this line.
match(FILENAME, /myfilename_([A-Z]*)/, f){
$1=a[$1 f[1]];
}1' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

第三种方式:如果您的数据都与它们的文件名相对应,就像您在示例中展示的那样,还有第三种方式可以消除 match 的需要,所以它可以在其他 awk 上工作:

awk -F, '
NR==FNR{
y=substr($12,3,4);
a[$3 "w"]=y FS $3 FS $4;
a[$5 "m"]=y FS $5 FS $6;
a[$9 "h"]=y FS $9 FS $10;
next
}
FNR==1{printf nl;nl=ORS} ## The newlines between sectors, if you do not need those newlines then remove this line.
$1~/^([wmh])[0-9]{2}_[0-9]{4}/{ ## Check first fields to make sure it matches, the checking is optional if your data is all like you showed.
$1=a[$1 substr($1,1,1)]
print
}' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

再考虑一下,根据您的数据文件名关系,实际上没有必要检查第一个字母(也没有文件名):

awk -F, '
NR==FNR{
y=substr($12,3,4);
a[$3]=y FS $3 FS $4;
a[$5]=y FS $5 FS $6;
a[$9]=y FS $9 FS $10;
next
}
FNR==1{printf nl;nl=ORS} ## The newlines between sectors, if you do not need those newlines then remove this line.
{ ## Add $1~/^([wmh])[0-9]{2}_[0-9]{4}/ to the beginning of this line if you want to check and make sure first column.
$1=a[$1]
}1' OFS=, calendar_file myfilename_{WEEK,MONTH,HALF}.csv

关于bash - AWK - 通过 for 循环和条件检查处理多个文件,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55546075/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com