
java - What is the best way to determine how much time an EMR job spent on Map vs. Reduce tasks?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:58:35

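I am running a custom-JAR Hadoop job on Amazon's AWS EMR, and I would like to collect data on how much time was spent running all of the Map tasks versus how much was spent running the Reduce tasks. Is there a way to get this data from the framework that I have not found yet? If not, does anyone have suggestions on the best way to generate it?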

Thanks,

Best answer

You can find this information in the Job Counters section of the client log. For example:

Job Counters 
Killed reduce tasks=1
Launched map tasks=1
Launched reduce tasks=7
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=1071855
Total time spent by all reduces in occupied slots (ms)=4083210
**Total time spent by all map tasks (ms)=23819**
**Total time spent by all reduce tasks (ms)=45369**
Total vcore-milliseconds taken by all map tasks=23819
Total vcore-milliseconds taken by all reduce tasks=45369
Total megabyte-milliseconds taken by all map tasks=34299360
Total megabyte-milliseconds taken by all reduce tasks=130662720
Map-Reduce Framework
Map input records=3929235
Map output records=15716940
Map output bytes=132989251
Map output materialized bytes=633590
Input split bytes=86
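
If you want these two numbers in a script rather than reading them off the log by eye, the counter lines can be pulled out with a small parser. The following is only a minimal sketch, assuming the client log has already been saved as text and that each counter appears as a `name=value` line as in the excerpt above. The class name `CounterLogParser` and the method `parseCounters` are hypothetical names chosen for illustration; they are not part of Hadoop or EMR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): extracts "name=value" counter
// lines from the text of a MapReduce client log.
class CounterLogParser {
    // Matches lines of the form "name=digits", tolerating leading whitespace
    // and the "**" emphasis markers that wrap two lines in the excerpt above.
    private static final Pattern COUNTER_LINE =
            Pattern.compile("^\\s*\\**\\s*(.+?)=(\\d+)\\s*\\**\\s*$");

    // Returns counters in the order they appear; lines that are not
    // counters (such as the "Job Counters" group header) are skipped.
    static Map<String, Long> parseCounters(String log) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : log.split("\\R")) {
            Matcher m = COUNTER_LINE.matcher(line);
            if (m.matches()) {
                counters.put(m.group(1).trim(), Long.parseLong(m.group(2)));
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        // A few lines copied from the log excerpt in the answer.
        String log = String.join("\n",
                "Job Counters",
                "Launched map tasks=1",
                "Launched reduce tasks=7",
                "**Total time spent by all map tasks (ms)=23819**",
                "**Total time spent by all reduce tasks (ms)=45369**");

        Map<String, Long> counters = parseCounters(log);
        long mapMs = counters.get("Total time spent by all map tasks (ms)");
        long reduceMs = counters.get("Total time spent by all reduce tasks (ms)");
        System.out.println("map tasks: " + mapMs + " ms, reduce tasks: " + reduceMs + " ms");
    }
}
```

Note that these counters are totals summed over all tasks of each kind, so they are not wall-clock durations for the map and reduce phases. If the driver code has access to the `Job` object, the same values should also be readable from the job's counters (in Hadoop 2.x, `JobCounter.MILLIS_MAPS` and `JobCounter.MILLIS_REDUCES`), which avoids parsing the log at all.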

Regarding "java - What is the best way to determine how much time an EMR job spent on Map vs. Reduce tasks?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27696722/
