
debugging - How do I debug MapReduce worker nodes from the master node using Eclipse?

Repost · Author: 行者123 · Updated: 2023-12-02 20:08:59

I want to do the following:

I run a MapReduce application (such as WordCount) in Eclipse on the master node, and I would like to observe from Eclipse how the worker nodes execute it, since I know the workflow differs between a local MapReduce job and a fully distributed one.

Is there any way to do this?

Best Answer

You can run the tasks locally; see How to Debug Map/Reduce Programs:

Start by getting everything running (likely on a small input) in the local runner. You do this by setting your job tracker to "local" in your config. The local runner can run under the debugger and runs on your development machine.

A very quick and easy way to set this config variable is to include the following line just before you run the job: conf.set("mapred.job.tracker", "local"); You may also want to do this to make the input and output files be in the local file system rather than in the Hadoop distributed file system (HDFS): conf.set("fs.default.name", "local");

You can also set these configuration parameters in hadoop-site.xml. The configuration files hadoop-default.xml, mapred-default.xml and hadoop-site.xml should appear somewhere in your program's class path when the program runs.
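Putting the two quoted conf.set(...) calls into a config file instead means the same job code runs unmodified both locally and on the cluster. A minimal sketch of such a hadoop-site.xml, using the old (pre-YARN) property names exactly as they appear in the quote above:

```xml
<!-- hadoop-site.xml: force the job into the local runner for debugging -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- "local" runs map/reduce tasks in-process, so Eclipse breakpoints work -->
    <value>local</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <!-- use the local file system instead of HDFS for input/output -->
    <value>local</value>
  </property>
</configuration>
```

Note these are the legacy property names from the quoted wiki page; newer Hadoop releases renamed them (e.g. mapreduce.framework.name, fs.defaultFS), so check the property names for your version.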



If you want to debug tasks on an actual cluster, you have to add debug options to the Java command line (e.g. -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000) and then attach Eclipse remotely to the waiting Java process. For example, you can set mapred.map.child.java.opts. There are several examples of doing this, though the exact approach varies:
  • How to debug hadoop mapreduce jobs from eclipse?
  • REMOTE DEBUGGING OF HADOOP JOB WITH ECLIPSE

Once you understand that the goal is to pass the -agentlib:... argument on the Java command line so that a remote debugger is enabled and Eclipse has something to attach to, the exact implementation details stop mattering. I would avoid modifying hadoop-env.sh, though.
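One way to pass that flag per-job, without touching hadoop-env.sh, is on the job submission command line. A sketch, assuming the legacy mapred.map.child.java.opts property named in the answer and a driver that uses ToolRunner/GenericOptionsParser so that -D is honored (the jar, class, and path names are placeholders):

```shell
# Make each map task JVM wait on port 8000 for a remote debugger to attach.
# suspend=y blocks the task until Eclipse connects (Debug Configurations >
# Remote Java Application, pointed at the worker node's host, port 8000).
hadoop jar myjob.jar MyDriver \
  -D mapred.map.child.java.opts="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000" \
  input/ output/
```

Keep in mind that with a fixed port you can only debug one task JVM per node at a time; a second task on the same machine would fail to bind port 8000.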

AFAIK Cloudera has a VM image that ships with Eclipse preconfigured for local M/R task development; see How-to: Use Eclipse with MapReduce in Cloudera's QuickStart VM.

A similar question on Stack Overflow: https://stackoverflow.com/questions/19135628/
