perf stat vs. perf record

Reposted by: 行者123. Updated: 2023-12-02 08:08:29

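I am confused about the difference between perf record and perf stat when it comes to counting events (such as page faults, cache misses, and anything else from perf list). I have two questions below. The answer to "Question 1" may also help answer "Question 2", but in case it does not, I have written both out explicitly.

Question 1:
It is my understanding that perf stat gets a "summary" of counts, but when used with the -I option it gets the counts at the specified millisecond interval. With this option, does it sum up the counts over the interval, get the average over the interval, or something else entirely? I assume it is summed up. The perf wiki states that it is aggregated, but I guess that could mean either.

Question 2:
Why doesn't perf stat -e <event1> -I 1000 sleep 5 give about the same counts as if I summed up the counts over each second for the following command: perf record -e <event1> -F 1000 sleep 5?

For example, if I use "page-faults" as the event for event1, I get the following output under each command. (I am assuming the period field is the count for the event in perf record's perf.data file.)

perf stat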

    perf stat -e page-faults -I 1000 sleep 5
    #           time             counts unit events
         1.000252928                 54      page-faults                                                 
         2.000498389      <not counted>      page-faults                                                 
         3.000569957      <not counted>      page-faults                                                 
         4.000659987      <not counted>      page-faults                                                 
         5.000837864                  2      page-faults

perf record

    perf record -e page-faults -F 1000 sleep 5
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.016 MB perf.data (6 samples) ]
    perf script -F period
             1
             1
             1
             5
            38
           164

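I expected that if I summed up the counts from perf stat, I would get the same value as the sum from perf record. If I use the -c option with perf record and give it an argument of 1, I do get a close match. Is this just a coincidence because of the relatively small number of page faults?

References I have used so far:

brendangregg's perf blog

The perf record and perf stat links on this page, referred to above as the "perf wiki"

I dug through here to see how and when perf record actually records data versus when it writes to perf.data.

Thanks in advance for any and all insight you can provide.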

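Best answer

First of all, your test case using sleep and page-faults is not an ideal one. There should be no page-fault events during the sleep itself, so you cannot really expect anything interesting. To make things easier to reason about, I suggest using the ref-cycles (hardware) event and a busy workload such as awk 'BEGIN { while(1){} }'.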

Question 1: It is my understanding that perf stat gets a "summary" of counts but when used with the -I option gets the counts at the specified millisecond interval. With this option does it sum up the counts over the interval or get the average over the interval, or something else entirely? I assume it is summed up.

Yes. The values are simply summed up over each interval. You can confirm this by testing:

$ perf stat -e ref-cycles -I 1000 timeout 10s awk 'BEGIN { while(1){} }'
#           time             counts unit events
 1.000105072      2,563,666,664      ref-cycles                                                  
 2.000267991      2,577,462,550      ref-cycles                                                  
 3.000415395      2,577,211,936      ref-cycles                                                  
 4.000543311      2,577,240,458      ref-cycles                                                  
 5.000702131      2,577,525,002      ref-cycles                                                  
 6.000857663      2,577,156,088      ref-cycles                                                  

[ ... snip ... ]
[ Note that it may not be as nicely consistent on all systems due to dynamic frequency scaling ]

$ perf stat -e ref-cycles -I 3000 timeout 10s awk 'BEGIN { while(1){} }' 
#           time             counts unit events
 3.000107921      7,736,108,718      ref-cycles                                                  
 6.000265186      7,732,065,900      ref-cycles                                                  
 9.000372029      7,728,302,192      ref-cycles
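
As a quick arithmetic check of the two runs above, the following sketch adds the first three 1000 ms counts and compares the total with the first 3000 ms count. The numbers are copied from the output above. The runs were separate, so only approximate agreement can be expected, and the variable names are purely illustrative.

```python
# Counts copied from the two perf stat runs above (ref-cycles).
one_second_counts = [2_563_666_664, 2_577_462_550, 2_577_211_936]
three_second_count = 7_736_108_718

summed = sum(one_second_counts)
averaged = summed / len(one_second_counts)
relative_diff = abs(summed - three_second_count) / three_second_count

print(f"sum of three 1 s intervals:     {summed:,}")
print(f"average of three 1 s intervals: {averaged:,.0f}")
print(f"first 3 s interval:             {three_second_count:,}")
print(f"relative difference (sum):      {relative_diff:.2%}")
```

The 3000 ms value sits close to the sum and nowhere near the average, which is consistent with each interval line reporting the events counted during that interval.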

Question 2: Why doesn't perf stat -e <event1> -I 1000 sleep 5 give about the same counts as if I summed up the counts over each second for the following command perf record -e <event1> -F 1000 sleep 5?

perf stat -I takes milliseconds, whereas perf record -F takes Hz (1/s), so the counterpart of perf stat -I 1000 is perf record -F 1. Indeed, with our more stable event and workload, this looks better:

$ perf stat -e ref-cycles -I 1000 timeout 10s awk 'BEGIN { while(1){} }'
#           time             counts unit events
 1.000089518      2,578,694,534      ref-cycles                                                  
 2.000203872      2,579,866,250      ref-cycles                                                  
 3.000294300      2,579,857,852      ref-cycles                                                  
 4.000390273      2,579,964,842      ref-cycles                                                  
 5.000488375      2,577,955,536      ref-cycles                                                  
 6.000587028      2,577,176,316      ref-cycles                                                  
 7.000688250      2,577,334,786      ref-cycles                                                  
 8.000785388      2,577,581,500      ref-cycles                                                  
 9.000876466      2,577,511,326      ref-cycles                                                  
10.000977965      2,577,344,692      ref-cycles                                                  
10.001195845            466,674      ref-cycles    

$ perf record -e ref-cycles -F 1 timeout 10s awk 'BEGIN { while(1){} }'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (17 samples) ]

$ perf script -F time,period        
3369070.273722:          1 
3369070.273755:          1 
3369070.273911:       3757 
3369070.273916:    3015133 
3369070.274486:          1 
3369070.274556:          1 
3369070.274657:       1778 
3369070.274662:    2196921 
3369070.275523: 47192985748 
3369072.663696: 2578692405 
3369073.663547: 2579122382 
3369074.663609: 2580015300 
3369075.664085: 2579873741 
3369076.664433: 2578638211 
3369077.664379: 2578378119 
3369078.664175: 2578166440 
3369079.663896: 2579238122

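So you see, the results eventually become stable for perf record -F as well. Unfortunately, the documentation of perf record is very poor. You can learn what the -c and -F settings mean by looking at the documentation of the underlying system call, man perf_event_open: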

sample_period, sample_freq A "sampling" event is one that generates an overflow notification every N events, where N is given by sample_period. A sampling event has sample_period > 0. When an overflow occurs, requested data is recorded in the mmap buffer. The sample_type field controls what data is recorded on each overflow.

sample_freq can be used if you wish to use frequency rather than period. In this case, you set the freq flag. The kernel will adjust the sampling period to try and achieve the desired rate. The rate of adjustment is a timer tick.
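
The fixed-period mode described in the quote can be pictured with a small toy model. This is only a sketch, not how perf or the kernel is implemented: period_mode_samples is a hypothetical helper, and the total of 56 page faults is simply 54 + 2 from the perf stat output in the question, which came from a different run than the perf record output.

```python
def period_mode_samples(total_events, sample_period):
    """Toy model: one sample per `sample_period` events, each tagged with that period."""
    if sample_period <= 0:
        raise ValueError("sample_period must be positive")
    # Events left over after the last full period never trigger an overflow.
    return [sample_period] * (total_events // sample_period)


# The two non-empty intervals from the perf stat run in the question.
total_page_faults = 54 + 2

every_event = period_mode_samples(total_page_faults, 1)   # like -c 1
every_tenth = period_mode_samples(total_page_faults, 10)  # like -c 10

print(len(every_event), sum(every_event))
print(len(every_tenth), sum(every_tenth))
```

With a period of 1 every event produces a sample, so in this model the summed periods equal the event count. That would be consistent with the close match the question reports for -c 1. With a larger period, the events after the last overflow are not represented in any sample.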

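So while perf stat uses an internal timer to read the counter value every -I milliseconds, perf record sets up an event overflow counter to take a sample every -c events. That means it samples every N events (e.g. every N page-faults or cycles). With -F, it tries to regulate this overflow value to achieve the desired frequency. It tries different values and tunes them up or down accordingly. This eventually works for counters with a stable rate, but it gives erratic results for dynamic events.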
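
The regulation described above can be illustrated with a toy simulation. This is a sketch under loud assumptions and not the kernel's actual algorithm: simulate_frequency_mode, period_spread, the damping factor of 0.5, and the two workload models are all invented for illustration. The steady rate of about 2.578e9 events per second is taken from the ref-cycles output above.

```python
def simulate_frequency_mode(event_rate_at, target_hz, duration_s, damping=0.5):
    """Toy frequency-mode sampler; returns a list of (timestamp_s, period) samples.

    event_rate_at(t) is a hypothetical workload model giving events per second
    at time t (must be > 0). `damping` is an arbitrary illustration knob.
    """
    samples = []
    t = 0.0
    period = 1  # start by sampling every single event
    while True:
        dt = period / event_rate_at(t)  # time until `period` events have occurred
        t += dt
        if t > duration_s:
            return samples
        samples.append((t, period))
        observed_rate = period / dt  # events per second seen by this sample
        ideal_period = observed_rate / target_hz
        # Move part of the way toward the ideal period, never below 1.
        period = max(1, round(period + damping * (ideal_period - period)))


def period_spread(samples, warmup=10):
    """Ratio of largest to smallest period after skipping the warm-up samples."""
    periods = [p for _, p in samples[warmup:]]
    return max(periods) / min(periods)


def steady(t):
    return 2_578_000_000  # roughly the ref-cycles per second seen above


def bursty(t):
    # Alternates between a fast and a slow phase every 10 seconds.
    return 3_000_000_000 if int(t // 10) % 2 == 0 else 1_000_000_000


steady_samples = simulate_frequency_mode(steady, target_hz=1, duration_s=60)
bursty_samples = simulate_frequency_mode(bursty, target_hz=1, duration_s=60)

print("steady spread:", round(period_spread(steady_samples), 3))
print("bursty spread:", round(period_spread(bursty_samples), 3))
```

In this model the steady workload settles on a period close to the per-second event rate, much like the last eight samples of the perf script output above. The bursty workload keeps chasing a moving target, so its periods never settle.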

This post on perf stat vs. perf record is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/49216628/
