
Is there a clear and reliable methodology for workload profiling/modeling?




I want to evaluate the hardware resource requirements of a certain workload from the following aspects:




  1. Hyper-threading friendliness

  2. Sensitivity to memory latency/memory bandwidth

  3. Sensitivity to the number of allocated physical cores (for container-based workloads: the impact of changing the number of cores in the cpuset while the CPU quota stays unchanged)

  4. Sensitivity to pagecache

  5. Sensitivity to available cache space size


Are there relevant methodologies, tools, or papers I can use in practice for the aspects above?


I think the question is too broad. Each point can be addressed separately, and each generally requires a different approach. For 2), I would track the IMC (integrated memory controller) counters and the precise memory-latency counters, for example. For 1), it is quite complicated to do rigorously, and a good approach is certainly architecture-dependent. A not-too-bad solution is to just run the workload with and without hyper-threading and compare the results, while tracking the source of any slowdown with some PMU events. For 3), a strong-scaling or weak-scaling experiment may be enough.
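To make the strong-scaling suggestion for 3) concrete, here is a minimal sketch in Python. It assumes Linux and uses plain CPU affinity as a stand-in for a container cpuset, which is only an approximation because it does not reproduce a CPU quota. The helper names `time_on_cpus` and `strong_scaling_table` are hypothetical, chosen for illustration. The problem size stays fixed while the core count changes, and speedup and parallel efficiency are computed relative to the smallest core count measured.

```python
import os
import subprocess
import time


def time_on_cpus(cmd, cpus):
    """Run cmd pinned to the given CPU ids and return wall time in seconds.

    Linux only: os.sched_setaffinity is applied in the child before exec.
    Affinity approximates a cpuset; it does not emulate a cgroup CPU quota.
    """
    start = time.perf_counter()
    subprocess.run(
        cmd,
        check=True,
        preexec_fn=lambda: os.sched_setaffinity(0, cpus),
    )
    return time.perf_counter() - start


def strong_scaling_table(timings):
    """Turn {core_count: wall_time} into (cores, speedup, efficiency) rows.

    The same fixed problem size is assumed for every entry. Speedup is
    relative to the smallest core count; efficiency is speedup divided by
    the factor by which the core count grew.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows
```

An efficiency that drops sharply when going from N to 2N cores suggests the workload stops benefiting from extra cores around N. Repeating each measurement several times and pinning to physical cores rather than sibling hardware threads should make the numbers easier to interpret.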


IMO, there is no such thing as "available cache space size" at a low level. It is an abstract concept, and the amount is hard to evaluate because of many low-level behaviors (not to mention that it is architecture-dependent and often not completely documented). You can change the cache size, but often not really the available amount. Cache-line allocation is done dynamically by the processor. In some cases, it can bypass some cache levels and not allocate in them at all.


@JérômeRichard I am considering using Intel's RDT technology to control/monitor the cache capacity used by the program. Do you think this method is okay?
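As a rough illustration of what an RDT-based sweep could look like, the sketch below generates the capacity bitmasks for a cache-allocation sweep in the format used by the Linux resctrl `schemata` file (for example `L3:0=f`). It only builds strings and does not touch `/sys/fs/resctrl`. The function names are hypothetical, and the number of ways and the cache ids are assumptions that would have to be read from the actual machine (for instance from `/sys/fs/resctrl/info/L3/cbm_mask`). Intel CAT generally requires the set bits of a mask to be contiguous, and some CPUs require more than one bit (see `min_cbm_bits`), so a sweep starting at one way may not be valid everywhere.

```python
def contiguous_cbm(ways, total_ways, shift=0):
    """Return a capacity bitmask with `ways` contiguous bits set.

    total_ways is the width of the full mask on the target CPU (assumed
    known); shift moves the allocated region away from bit 0.
    """
    if ways < 1 or shift < 0 or ways + shift > total_ways:
        raise ValueError("mask does not fit in the available ways")
    return ((1 << ways) - 1) << shift


def schemata_line(mask, cache_ids=(0,), resource="L3"):
    """Format one schemata line, applying the same mask to every cache id."""
    domains = ";".join(f"{cid}={mask:x}" for cid in cache_ids)
    return f"{resource}:{domains}"


def cache_sweep(total_ways, cache_ids=(0,)):
    """Schemata lines that grow the allocation from 1 way to all ways."""
    return [
        schemata_line(contiguous_cbm(ways, total_ways), cache_ids)
        for ways in range(1, total_ways + 1)
    ]
```

Running the workload once per generated line and recording its throughput would give a curve of performance against allocated ways. As noted in the comment above, the allocated capacity is an upper bound on what the program can use, not a measurement of what it actually occupies.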

