linux - Calculating averages over irregular intervals without considering missing values in a shell script?

Reposted. Author: 太空狗. Updated: 2023-10-29 11:42:20

I have a dataset with many missing values, marked as -999. Part of the data is

input.txt
30
-999
10
40
23
44
-999
-999
31
-999
54
-999
-999
-999
-999
-999
-999
10
23
2
5
3
8
8
7
9
6
10
and so on

I want to calculate the average over successive intervals of 5, 6, 6 rows, without considering missing values.

The desired output is

ofile.txt
25.75 (i.e. take the first 5 rows and average them, ignoring missing values: (30+10+40+23)/4)
43 (i.e. take the next 6 rows and average them, ignoring missing values: (44+31+54)/3)
-999 (i.e. take the next 6 rows; since all of them are missing, write the missing value -999)
8.6 (i.e. take the next 5 rows and average them: (10+23+2+5+3)/5)
8 (i.e. take the next 6 rows and average them)

If it were a fixed interval (say 5), I could do it like this

awk '!/\-999/{sum += $1; count++} NR%5==0{print count ? (sum/count) :-999;sum=count=0}' input.txt

I asked a similar question for regular intervals here: Calculating average without considering missing values in shell script? But here I am asking for a solution for irregular intervals.

Best answer

Using AWK

awk -v f="5" 'f&&f--&&$0!=-999{c++;v+=$0} NR%17==0{f=5;r++} 
!f&&NR%17!=0{f=6;r++} r&&!c{print -999;r=0} r&&c{print v/c;r=v=c=0}
END{if(c!=0)print v/c}' input.txt

Output

25.75
43
-999
8.6
8

Breakdown

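f&&f--&&$0!=-999{c++;v+=$0} # while rows remain in the current group (f>0), count down; add valid values and increment the count
NR%17==0{f=5;r++} # every 17 rows (5+6+6), restart the pattern with a 5-row group
!f&&NR%17!=0{f=6;r++} # otherwise, when a group ends, start a 6-row group
r&&!c{print -999;r=0} # group ended with no valid values: print -999
r&&c{print v/c;r=v=c=0} # group ended with valid values: print the average and reset
END{
if(c!=0) # print the average of any leftover partial group
print v/c
}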
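
The same 5, 6, 6 grouping can also be sketched with the group sizes passed in as a parameter, which may be easier to adapt to other patterns. This is only an illustrative alternative, not the answer's method: the function name avg_groups and the awk variables sizes and missing are hypothetical names introduced here, and the sketch assumes one numeric value per line on standard input.

```shell
# Hypothetical helper (not part of the original answer).
# $1: space-separated group sizes that repeat cyclically, e.g. "5 6 6"
# $2: missing-value marker (assumed default: -999)
avg_groups() {
    awk -v sizes="$1" -v missing="${2:--999}" '
    BEGIN { n = split(sizes, size, " "); g = 1 }   # g = index of current group size
    {
        if ($1 != missing) { sum += $1; count++ }  # accumulate valid values only
        if (++rows == size[g]) {                   # current group is complete
            print (count ? sum / count : missing)
            sum = count = rows = 0
            g = (g % n) + 1                        # move to next size, wrapping around
        }
    }
    END { if (rows) print (count ? sum / count : missing) }  # leftover partial group
    '
}

# Sample run on the 28 rows shown in the question.
printf '%s\n' 30 -999 10 40 23 44 -999 -999 31 -999 54 \
    -999 -999 -999 -999 -999 -999 10 23 2 5 3 8 8 7 9 6 10 |
    avg_groups "5 6 6"
# prints 25.75, 43, -999, 8.6, 8 (one per line)
```

With a file, this would be called as avg_groups "5 6 6" < input.txt. Unlike the one-liner above, a trailing partial group that contains only missing values is printed as -999 rather than skipped; whether that is desirable depends on the data.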

Regarding "linux - Calculating averages over irregular intervals without considering missing values in a shell script?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38516981/
