
linux - Which perf events can use PEBS?

Reposted. Author: 太空狗. Updated: 2023-10-29 11:07:27

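I want to understand which events can be used with the precise modifier on my CPU (Sandy Bridge).

The Intel Software Developer's Manual (Table 18-32, "PEBS Performance Events for Intel microarchitecture code name Sandy Bridge") lists only the following events: INST_RETIRED, UOPS_RETIRED, BR_INST_RETIRED, BR_MISP_RETIRED, MEM_UOPS_RETIRED, MEM_LOAD_UOPS_RETIRED, MEM_LOAD_UOPS_LLC_HIT_RETIRED. SandyBridge_core_V15.json lists the same events with PEBS > 0.

However, there are some examples of perf usage that add :p to the cycles event, and I can successfully run perf record -e cycles:p on my machine.

Also, perf record -e cycles:p -vv -- sleep 1 prints precise_ip 1. Does this mean that the CPU_CLK_UNHALTED event actually uses PEBS?

Is it possible to get the full list of events that support :p?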

Best answer

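There is a hack to support cycles:p on SandyBridge, which has no PEBS for CPU_CLK_UNHALTED.*. The hack is implemented in the kernel part of perf, in intel_pebs_aliases_snb(). When the user requests -e cycles, which is PERF_COUNT_HW_CPU_CYCLES (translated to CPU_CLK_UNHALTED.CORE), with a nonzero precise modifier, this function changes the hardware event to UOPS_RETIRED.ALL with PEBS: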

    [PERF_COUNT_HW_CPU_CYCLES]      = 0x003c,

    static void intel_pebs_aliases_snb(struct perf_event *event)
    {
        if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
            /*
             * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
             * (0x003c) so that we can use it with PEBS.
             *
             * The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
             * PEBS capable. However we can use UOPS_RETIRED.ALL
             * (0x01c2), which is a PEBS capable event, to get the same
             * count.
             *
             * UOPS_RETIRED.ALL counts the number of cycles that retires
             * CNTMASK micro-ops. By setting CNTMASK to a value (16)
             * larger than the maximum number of micro-ops that can be
             * retired per cycle (4) and then inverting the condition, we
             * count all cycles that retire 16 or less micro-ops, which
             * is every cycle.
             *
             * Thereby we gain a PEBS capable cycle counter.
             */
            u64 alt_config = X86_CONFIG(.event=0xc2, .umask=0x01, .inv=1, .cmask=16);

            alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
            event->hw.config = alt_config;
        }
    }
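
To make the alternative encoding concrete, here is a minimal sketch of how the fields passed to X86_CONFIG combine into a single raw event-select value. It assumes the architectural IA32_PERFEVTSELx bit layout (event select in bits 0-7, unit mask in bits 8-15, INV in bit 23, CMASK in bits 24-31); x86_raw_config() is a hypothetical helper written for illustration, not a kernel function.

```c
#include <stdint.h>

/*
 * Pack the event-select fields used by the alias into one raw value.
 * Assumed layout: event select in bits 0-7, unit mask in bits 8-15,
 * INV in bit 23, counter mask (CMASK) in bits 24-31.
 * x86_raw_config() is a hypothetical helper, not part of the kernel.
 */
static uint64_t x86_raw_config(uint8_t event, uint8_t umask,
                               unsigned inv, uint8_t cmask)
{
    return (uint64_t)event
         | ((uint64_t)umask << 8)
         | ((uint64_t)(inv & 1u) << 23)
         | ((uint64_t)cmask << 24);
}
```

Under these assumptions, x86_raw_config(0xc2, 0x01, 1, 16) evaluates to 0x108001c2. From user space, the same counter could presumably be requested with an event string such as cpu/event=0xc2,umask=0x01,inv=1,cmask=16/, provided the PMU on your machine exposes those format fields.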

The intel_pebs_aliases_snb hack is registered in __init int intel_pmu_init(void), under case INTEL_FAM6_SANDYBRIDGE: / case INTEL_FAM6_SANDYBRIDGE_X:, as

    x86_pmu.event_constraints = intel_snb_event_constraints;
    x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
    x86_pmu.pebs_aliases = intel_pebs_aliases_snb;

pebs_aliases is called from intel_pmu_hw_config() when precise_ip is set to a nonzero value:

    static int intel_pmu_hw_config(struct perf_event *event)
    {
        /* ... */
        if (event->attr.precise_ip) {
            /* ... */
            if (x86_pmu.pebs_aliases)
                x86_pmu.pebs_aliases(event);
        }
        /* ... */
    }
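
The dispatch described above can be modelled in user space as a rough sketch. Everything here is a stand-in: struct mock_event, pebs_aliases_hook, mock_pebs_aliases_snb() and mock_hw_config() are hypothetical names, and the value of RAW_EVENT_MASK is an assumption (event, umask, edge, inv and cmask bits of the event selector), not something quoted from the kernel source above.

```c
#include <stdint.h>

/* Assumed value: event, umask, edge, inv and cmask bits of the selector. */
#define RAW_EVENT_MASK 0xFF84FFFFULL

/* Hypothetical stand-in for struct perf_event; only the fields used here. */
struct mock_event {
    uint64_t config;      /* models event->hw.config */
    unsigned precise_ip;  /* models event->attr.precise_ip */
};

/* Per-model hook, analogous to x86_pmu.pebs_aliases; NULL if a model has none. */
static void (*pebs_aliases_hook)(struct mock_event *);

/* Mirrors the rewrite performed by intel_pebs_aliases_snb(). */
static void mock_pebs_aliases_snb(struct mock_event *ev)
{
    if ((ev->config & RAW_EVENT_MASK) == 0x003c) {
        /* UOPS_RETIRED.ALL (0x01c2) with inv=1 and cmask=16 */
        uint64_t alt = 0x108001c2ULL;

        /* Keep whatever bits lie outside the raw event mask. */
        alt |= ev->config & ~RAW_EVENT_MASK;
        ev->config = alt;
    }
}

/* Mirrors the precise_ip check in intel_pmu_hw_config(). */
static void mock_hw_config(struct mock_event *ev)
{
    if (ev->precise_ip && pebs_aliases_hook)
        pebs_aliases_hook(ev);
}
```

With pebs_aliases_hook set to mock_pebs_aliases_snb, an event with config 0x003c is rewritten only when precise_ip is nonzero; plain cycles (precise_ip 0) and any other event code pass through unchanged.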

The hack was implemented in 2012; see the lkml threads "[PATCH] perf, x86: Make cycles:p working on SNB" and "[tip:perf/core] perf/x86: Implement cycles:p for SNB/IVB", commit cccb9ba9e4ee0d750265f53de9258df69655c40b, http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=cccb9ba9e4ee0d750265f53de9258df69655c40b :

perf/x86: Implement cycles:p for SNB/IVB

Now that there's finally a chip with working PEBS (IvyBridge), we can enable the hardware and implement cycles:p for SNB/IVB.

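I think there is no complete list of such "precise" conversion hacks other than the Linux source code in arch/x86/events/intel/core.c. Grep for static void intel_pebs_aliases (these functions usually implement cycles:p, i.e. CPU_CLK_UNHALTED 0x003c) and check intel_pmu_init for your actual CPU model to see exactly which x86_pmu.pebs_aliases variant is selected: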

  • intel_pebs_aliases_core2: INST_RETIRED.ANY_P (0x00c0) with CNTMASK=16 instead of cycles:p
  • intel_pebs_aliases_snb: UOPS_RETIRED.ALL (0x01c2) with CNTMASK=16 instead of cycles:p
  • intel_pebs_aliases_precdist: for the maximum value of precise_ip, INST_RETIRED.PREC_DIST (0x01c0) instead of cycles:ppp, on SKL, IVB, HSW, BDW
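
As a small illustration of how the modifier relates to precise_ip, the sketch below counts the p characters in a modifier string (so :p gives 1 and :ppp gives 3) and flags the level at which the PREC_DIST alias from the list would apply. Both helpers are hypothetical and simplified: the real perf parser also accepts an uppercase P (maximum detected level), which is ignored here, and treating 3 as the maximum level is an assumption.

```c
/*
 * Count the 'p' characters in a perf event modifier ("p", "pp", "upp", ...).
 * Simplified: an uppercase 'P' is not handled by this hypothetical helper.
 */
static unsigned precise_level(const char *modifier)
{
    unsigned level = 0;

    for (const char *c = modifier; *c != '\0'; c++)
        if (*c == 'p')
            level++;
    return level;
}

/* Assumption: level 3 (":ppp") is the maximum and selects INST_RETIRED.PREC_DIST. */
static int wants_precdist(unsigned level)
{
    return level >= 3;
}
```

For example, precise_level("p") matches the precise_ip 1 printed by perf record -e cycles:p -vv, while precise_level("ppp") yields 3, the case the precdist alias handles.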

Regarding "linux - Which perf events can use PEBS?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42166846/
