gpt4 book ai didi

java - 如何在 java 中使用 yarn api 提交 mapreduce 作业

转载 作者:可可西里 更新时间:2023-11-01 15:25:30 25 4
gpt4 key购买 nike

我想使用 YARN java API 提交我的 MR 作业,我尝试像 WritingYarnApplications 那样做, 但我不知道要添加什么 amContainer,下面是我写的代码:

package org.apache.hadoop.examples;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.GetNewApplicationResponse;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.util.Records;
import org.mortbay.util.ajax.JSON;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class YarnJob {
private static Logger logger = LoggerFactory.getLogger(YarnJob.class);

public static void main(String[] args) throws Throwable {

Configuration conf = new Configuration();
YarnClient client = YarnClient.createYarnClient();
client.init(conf);
client.start();

System.out.println(JSON.toString(client.getAllQueues()));
System.out.println(JSON.toString(client.getConfig()));
//System.out.println(JSON.toString(client.getApplications()));
System.out.println(JSON.toString(client.getYarnClusterMetrics()));

YarnClientApplication app = client.createApplication();
GetNewApplicationResponse appResponse = app.getNewApplicationResponse();

ApplicationId appId = appResponse.getApplicationId();

// Create launch context for app master
ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class);
// set the application id
appContext.setApplicationId(appId);
// set the application name
appContext.setApplicationName("test");
// Set the queue to which this application is to be submitted in the RM
appContext.setQueue("default");

// Set up the container launch context for the application master
ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
//amContainer.setLocalResources();
//amContainer.setCommands();
//amContainer.setEnvironment();

appContext.setAMContainerSpec(amContainer);
appContext.setResource(Resource.newInstance(1024, 1));

appContext.setApplicationType("MAPREDUCE");

// Submit the application to the applications manager
client.submitApplication(appContext);
//client.stop();
}
}

我可以使用命令界面正确运行 mapreduce 作业:

hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount /user/admin/input /user/admin/output/

但是我如何在 yarn java api 中提交这个 wordcount 作业呢?

最佳答案

您不使用 Yarn Client 提交作业,而是使用 MapReduce API 提交作业。 See this link for Example

但是,如果您需要对作业进行更多控制,例如获取完成状态、Mapper 阶段状态、Reducer 阶段状态等,您可以使用

job.submit();

代替

job.waitForCompletion(true)

您可以使用函数 job.mapProgress() 和 job.reduceProgress() 来获取状态。作业对象中有许多功能可供您探索。

就你的查询而言

hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount /user/admin/input /user/admin/output/

这里发生的是您正在运行 wordcount.jar 中可用的驱动程序。您使用的不是“java -jar wordcount.jar”,而是“hadoop jar wordcount.jar”。你也可以使用“yarn jar wordcount.jar”。与 java -jar 命令相比,Hadoop/Yarn 将设置必要的额外类路径。这将执行您的驱动程序的“main()”,该程序在命令中指定的类 org.apache.hadoop.examples.WordCount 中可用。

您可以在这里查看源代码 Source for WordCount class

我假设您想通过 yarn 提交作业的唯一原因是将其与某种服务集成,从而在某些事件上启动 MapReduce2 作业。

为此,您始终可以让您的驱动程序 main() 像这样。

public class MyMapReduceDriver extends Configured implements Tool {
public static void main(String[] args) throws Exception {

Configuration conf = new Configuration();

/******/

int errCode = ToolRunner.run(conf, new MyMapReduceDriver(), args);

System.exit(errCode);
}

@Override
public int run(String[] args) throws Exception {

while(true) {

try{

runMapReduceJob();
}
catch(IOException e)
{
e.printStackTrace();
}
}
}

private void runMapReduceJob() {

Configuration conf = new Configuration();
Job job = new Job(conf, "word count");
/******/

job.submit();

// Get status
while(job.getJobState()==RUNNING || job.getJobState()==PREP){
Thread.sleep(1000);

System.out.println(" Map: "+ StringUtils.formatPercent(job.mapProgress(), 0) + " Reducer: "+ StringUtils.formatPercent(job.reduceProgress(), 0));

}
}}

希望这对您有所帮助。

关于java - 如何在 java 中使用 yarn api 提交 mapreduce 作业,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47767854/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com