Where am I going wrong? Trying to output average with awk

Reposted by: bug小助手, updated: 2023-10-25 22:07:27

I am taking the following data and trying to output the average with awk.

$ cat chores.csv
Chore Name,Assigned to,estimate,done?
Laundry,Chelsey,45,N
Wash Windows,Sam,60,Y
Mop kitchen,Sam,20,N
Clean cookware,Chelsey,30,N
Unload dishwasher,Chelsey,10,N
Dust living room,Chelsey,20,N
Wash the dog,Sam,40,N

Here is the script I wrote:

#!/bin/awk

BEGIN {
NR>1
FS=","
} $3 > 0 {
i++ ; tot+=$3
avg=tot/i
}
END{
printf "\nAverage: %.2f\n ", avg
}

When I run it, I get an incorrect output

awk -f avg.awk chores.csv

Average: 28.12

The answer should be 32.14

More replies

Add debugging prints, especially at the end, e.g. print "tot=" tot "\ti=" i. Worst case is to add debugging in the main loop. Good luck.

Only the conditions immediately before a block gate whether that block is run. When you put NR>1 in the BEGIN instead of before the { i++; tot+=$3 }, you stop it from having any use.
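To make the point about gating concrete, here is a minimal sketch on made-up two-row data (the `data` variable and its contents are assumptions for illustration, not the asker's file). It counts how many lines the main block runs for when `NR>1` sits inside BEGIN versus in front of the action:

```shell
# Made-up sample input: one header line followed by two data rows.
data='name,estimate
a,10
b,20'

# NR>1 inside BEGIN is a bare expression: evaluated once, result thrown away.
in_begin=$(printf '%s\n' "$data" |
    awk 'BEGIN { NR>1; FS="," } { n++ } END { print n }')

# NR>1 as a pattern in front of the action: the block only runs from line 2 on.
as_pattern=$(printf '%s\n' "$data" |
    awk 'BEGIN { FS="," } NR>1 { n++ } END { print n }')

echo "in BEGIN: $in_begin, as pattern: $as_pattern"
```

With the condition in BEGIN the block still runs for all three lines, header included; as a pattern it runs for the two data rows only.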

BTW, think of computing avg in the END block so you're doing it only once instead of over and over for every line.

Once you get an answer to the question you asked, make sure to add a test for i being non-zero before you try to divide by it, e.g. print (i ? tot/i : 0) or similar.
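A rough sketch of that guard, assuming a header-only input so that no row ever matches and `i` stays unset (the input line is copied from the question; the `result` variable is just for illustration):

```shell
# Header-only input: nothing passes NR > 1, so i and tot are never set.
result=$(printf 'Chore Name,Assigned to,estimate,done?\n' |
    awk -F, 'NR > 1 && $3 > 0 { i++; tot += $3 }
             END { printf "Average: %.2f\n", (i ? tot/i : 0) }')

echo "$result"
```

Without the `i ? ... : 0` test, the same input would make awk attempt a division by zero in END.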

Recommended answers

You are counting the header line, even though it doesn't have the number you want.
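One way to see this for yourself is to trace the counter and total per matched line. This is only a sketch, with the question's data pasted into a shell variable (the `data` and `trace` names are assumptions). The header's third field is the string "estimate", which does not look like a number, so `$3 > 0` falls back to a string comparison and the header passes the test:

```shell
# The data from the question, held in a variable instead of chores.csv.
data='Chore Name,Assigned to,estimate,done?
Laundry,Chelsey,45,N
Wash Windows,Sam,60,Y
Mop kitchen,Sam,20,N
Clean cookware,Chelsey,30,N
Unload dishwasher,Chelsey,10,N
Dust living room,Chelsey,20,N
Wash the dog,Sam,40,N'

# Print the running count and total for every line that passes $3 > 0.
trace=$(printf '%s\n' "$data" |
    awk -F, '$3 > 0 { i++; tot += $3; print NR ": field=" $3 ", i=" i ", tot=" tot }')

printf '%s\n' "$trace"
```

The first trace line shows the header being counted while adding 0 to the total, and the last shows i=8 with tot=225, which is where 225/8 = 28.12 comes from instead of 225/7 = 32.14.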

Change to:

#!/bin/awk -f

BEGIN {
    FS=","
}
NR > 1 && $3 > 0 {  # NR > 1 check moved here
    i++
    tot += $3
}
END {
    avg=tot/i
    printf "\nAverage: %.2f\n ", avg
}

This also removes the NR > 1 from the BEGIN block, where it's not needed, and calculates the average only once, in the END, instead of for each row, as you're only printing that in the end anyway. Makes the code a bit cleaner.

Your attempt at screening out the header line clearly isn't working. An obvious possibility would be something on this general order (untested, but simple enough I'd expect it to work anyway):

$3 ~ /^[0-9]+$/    {
        i++;
        tot+=$3;
        avg=tot/i;
    }

Personally, I'd probably compute avg only once, in the END clause though. It's not clear to me what the NR>1 is intended to do. Maybe you intended it to be part of a pattern instead of an action? And even with a trivial awk script, it's worth the trouble to indent decently, so the script would look something like this:

#!/bin/awk -f

BEGIN {
    FS=","
}

$3 ~ /^[0-9]+$/ {
    i++
    tot += $3
}

END {
    avg = tot/i
    printf "\nAverage: %.2f\n ", avg
}

To simplify the script for another answer:

awk -F, '$3 > 0  { i++; tot += $3 }  END { printf "\nAverage: %.2f\n ", tot/(i-1) }'

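You do not need to skip the first line in gawk: its third field is not a number, so it adds 0 to the total. It is still counted by $3 > 0, though, which is why the total is divided by i-1. The division can also be done directly in the printf call.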

Also, IMHO the estimate column can be zero but can't be negative, so you can skip the check, and there is no need for a counter (the variable i). So the script can become even simpler:

awk -F, '{  tot += $3 }  END { printf "\nAverage: %.2f\n ", tot/(NR-1) }'

More replies

@CharlesDuffy: Although "apparently" is often used to mean something like "probably", I'm using its real meaning, so this sentence is essentially equivalent to: "It is apparent that your attempt at screening out the header line isn't working."

@CharlesDuffy: I guess--I've edited to strengthen the statement a bit. Not sure it makes a big difference, but I guess it doesn't hurt anything, anyway.

Those versions aren't equivalent. The second script includes 0s in the average while the first one didn't. Granted, there are no 0s in the example data, but that doesn't mean they can't occur.
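A small sketch of the difference, assuming a made-up extra row ("Nap", with an estimate of 0) that is not in the original data:

```shell
# Hypothetical data: two real rows from the question plus an invented zero row.
data='Chore Name,Assigned to,estimate,done?
Laundry,Chelsey,45,N
Nap,Sam,0,N
Wash Windows,Sam,60,Y'

# First version: rows with a 0 estimate are filtered out of both sum and count.
filtered=$(printf '%s\n' "$data" |
    awk -F, '$3 > 0 { i++; tot += $3 } END { printf "%.2f\n", tot/(i-1) }')

# Second version: every data row counts, so the 0 pulls the average down.
unfiltered=$(printf '%s\n' "$data" |
    awk -F, '{ tot += $3 } END { printf "%.2f\n", tot/(NR-1) }')

echo "filtered: $filtered, unfiltered: $unfiltered"
```

Here the first version averages 105 over 2 rows and the second averages 105 over 3, so which one is right depends on whether a 0 estimate should count as a data point.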

@MaksVerver, correct. The OP does not mention zeroes, so I just added an example in this direction :)
