
linux - Why does perf stat show "stalled-cycles-backend" as <not supported>?

Reposted · Author: IT王子 · Updated: 2023-10-29 00:14:12

Running perf stat ls shows:

Performance counter stats for 'ls':

1.388670 task-clock # 0.067 CPUs utilized
2 context-switches # 0.001 M/sec
0 cpu-migrations # 0.000 K/sec
266 page-faults # 0.192 M/sec
3515391 cycles # 2.531 GHz
2096636 stalled-cycles-frontend # 59.64% frontend cycles idle
<not supported> stalled-cycles-backend
2927468 instructions # 0.83 insns per cycle
# 0.72 stalled cycles per insn
615636 branches # 443.328 M/sec
22172 branch-misses # 3.60% of all branches

0.020657192 seconds time elapsed

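Why is stalled-cycles-backend shown as <not supported>? What CPU, hardware, kernel, or user-space software do I need in order to see this value?

So far I have tried this on several Intel Core i5 and i7 systems (Ivy Bridge), with matching perf versions, on RHEL and on Linux 3.12 for x86_64. None of them supports stalled-cycles-backend.

More information: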

$ perf list | grep stalled
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]

$ ls /sys/devices/cpu/events/
branch-instructions bus-cycles cache-references instructions mem-stores
branch-misses cache-misses cpu-cycles mem-loads stalled-cycles-frontend

$ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-frontend
event=0x0e,umask=0x01,inv,cmask=0x01
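
The sysfs string above can be folded into a single raw event code. The following is a minimal sketch, not part of the original post: the helper name `sysfs_to_raw` is hypothetical, and the bit positions are an assumption based on the layout that x86 kernels typically publish under /sys/bus/event_source/devices/cpu/format/, so check those files on your own machine before relying on it.

```python
# Assumed bit positions (typical x86 layout from
# /sys/bus/event_source/devices/cpu/format/); verify on your own system.
FIELD_SHIFT = {"event": 0, "umask": 8, "edge": 18, "inv": 23, "cmask": 24}


def sysfs_to_raw(spec: str) -> str:
    """Fold a sysfs event description into a perf raw event string."""
    config = 0
    for term in spec.strip().split(","):
        name, _, value = term.partition("=")
        # A bare term such as "inv" is a flag and counts as 1.
        number = int(value, 16) if value else 1
        config |= number << FIELD_SHIFT[name]
    return f"r{config:x}"


if __name__ == "__main__":
    # The string shown for stalled-cycles-frontend in the question.
    print(sysfs_to_raw("event=0x0e,umask=0x01,inv,cmask=0x01"))  # r180010e
```

Under these assumptions, the resulting string has the same rNNN form that perf stat -e accepts for raw events.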

Edit: I just tried this on an AMD Phenom II X6 1045T CPU, under Ubuntu 12.04 with Linux 3.2 (32-bit). There it does show values for both stalled-cycles-frontend and stalled-cycles-backend.

Best answer

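It looks like perf has not yet been updated to know about all the performance-monitoring events that Ivy Bridge supports. Fortunately, there is a generic (albeit cumbersome) interface that gives you access to the full list of performance-monitoring events. When I glanced through that list I did not see stalled-cycles-backend, but maybe I missed it, or maybe it has been broken down into the various events that can stall the backend.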

We start with

perf list --help

...which shows the following note

    1. Intel(R) 64 and IA-32 Architectures Software Developer's Manual
Volume 3B: System Programming Guide
http://www.intel.com/Assets/PDF/manual/253669.pdf

...armed with that URL, you eventually arrive at

http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf

...you want Section 19.3

19.3 PERFORMANCE MONITORING EVENTS FOR 3RD GENERATION INTEL® CORE™ PROCESSORS 3rd generation Intel® Core™ processors and Intel Xeon processor E3-1200 v2 product family are based on Intel microarchitecture code name Ivy Bridge. They support architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3AH.
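
To check whether a machine carries the 06_3AH signature mentioned above, one option is to read the decimal "cpu family" and "model" fields from /proc/cpuinfo and format them as hex. This is a rough sketch rather than anything from the original answer: the function name `display_family_model` is made up for illustration, and it assumes the usual x86 /proc/cpuinfo field names.

```python
import re


def display_family_model(cpuinfo_text: str) -> str:
    """Format the CPU signature as DisplayFamily_DisplayModel, e.g. '06_3A'."""
    # /proc/cpuinfo on x86 reports both fields in decimal.
    family = re.search(r"^cpu family\s*:\s*(\d+)", cpuinfo_text, re.M)
    model = re.search(r"^model\s*:\s*(\d+)", cpuinfo_text, re.M)
    if family is None or model is None:
        raise ValueError("no 'cpu family' / 'model' fields found")
    return f"{int(family.group(1)):02X}_{int(model.group(1)):02X}"


if __name__ == "__main__":
    # Hand-written sample in /proc/cpuinfo style; model 58 is 0x3A.
    sample = "cpu family\t: 6\nmodel\t\t: 58\nmodel name\t: example CPU\n"
    print(display_family_model(sample))  # 06_3A
```

On a real system you would pass the contents of /proc/cpuinfo instead of the sample string.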

...so for the architectural events you want Table 19-1

19.1 ARCHITECTURAL PERFORMANCE-MONITORING EVENTS Architectural performance events are introduced in Intel Core Solo and Intel Core Duo processors. They are also supported on processors based on Intel Core microarchitecture. Table 19-1 lists pre-defined architectural performance events that can be configured using general-purpose performance counters and associated event-select registers.

Table 19-1. Architectural Performance Events

[Image: Table 19-1 from the Intel manual; not preserved in this copy]

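...now the tricky part: take the UMask Value as the upper two hex digits and the Event Num as the lower two hex digits of a four-digit hexadecimal number. That number is the raw hardware event code you pass to perf stat.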
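
The composition rule can be written down as a small helper. This is only an illustrative sketch: the function name `raw_event` is hypothetical, and the example values (UMask 41H, Event Num 2EH) are the ones implied by the r412e code that this answer uses for cache misses.

```python
def raw_event(umask: int, event_num: int) -> str:
    """Build a perf raw event string: 'r', then UMask (high byte), then Event Num (low byte)."""
    if not (0 <= umask <= 0xFF and 0 <= event_num <= 0xFF):
        raise ValueError("umask and event_num must each fit in one byte")
    return f"r{umask:02x}{event_num:02x}"


if __name__ == "__main__":
    # UMask 41H and Event Num 2EH give the code used for cache misses.
    print(raw_event(0x41, 0x2E))  # r412e
```

The printed string is what goes after -e on the perf stat command line.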

perf stat --help
   -e, --event=
Select the PMU event. Selection can be a symbolic event name (use
perf list to list all events) or a raw PMU event (eventsel+umask) in
the form of rNNN where NNN is a hexadecimal event descriptor.

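...it says NNN, but you can give it NNNN. Let's verify that this works by asking perf stat for cache misses twice: once by symbolic event name and once by the hex number from Table 19-1. To keep it simple, we'll run the date command.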

$ perf stat -e r412e -e cache-misses date

Fri Mar 28 09:28:52 CDT 2014

Performance counter stats for 'date':

2292 r412e
2292 cache-misses

0.003322663 seconds time elapsed

$
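
The comparison above can also be done programmatically. Below is a minimal sketch, assuming the plain-text layout shown in this post (a count followed by the event name); the name `parse_perf_stat` is hypothetical, and fractional rows such as task-clock are deliberately skipped.

```python
import re

# A counter row: an integer count (optionally with thousands separators) or a
# "<not supported>" / "<not counted>" marker, followed by the event name.
COUNTER_LINE = re.compile(
    r"^\s*(<not supported>|<not counted>|\d[\d,]*)\s+([\w./:-]+)"
)


def parse_perf_stat(text: str) -> dict:
    """Map event name to its integer count, or None if no count was reported."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match is None:
            continue  # headers, blank lines, timing lines, fractional rows
        value, name = match.groups()
        counters[name] = None if value.startswith("<") else int(value.replace(",", ""))
    return counters


if __name__ == "__main__":
    sample = """Performance counter stats for 'date':

2292 r412e
2292 cache-misses

0.003322663 seconds time elapsed
"""
    counts = parse_perf_stat(sample)
    print(counts["r412e"] == counts["cache-misses"])  # True
```

Note that perf stat writes its report to stderr by default, so capture that stream if you feed real output into a parser like this.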

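As you can see, both report the same number, so far so good. Now we move on to the non-architectural hardware events in Table 19-5. There are too many to list here, but I'll show a few: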

[Image: excerpt of Table 19-5 from the Intel manual; not preserved in this copy]

Regarding "linux - Why does perf stat show "stalled-cycles-backend" as <not supported>?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22712956/
