
hadoop - What is the point of the Oozie MR launcher?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:15:17

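I created a simple Oozie workflow with Sqoop, Hive, and Pig actions. For each of these actions, Oozie starts an MR launcher, which in turn starts the action (Sqoop/Hive/Pig). As a result, the 3 actions in the workflow produce a total of 6 MR jobs.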

Why does Oozie start an MR launcher to start the action instead of starting the action directly?

Best answer

I posted the same question on the Apache Flume forum, and here is the reply.

It's also to keep the Oozie server from being bogged down or becoming unstable. For example, if you have a bunch of workflows running Pig jobs, then you'd have the Oozie server running multiple copies of the Pig client (which is a relatively "heavy" program) directly. By moving all of the user code and external clients to map tasks in the launcher job, the Oozie server remains more lightweight and less prone to errors. It is also much more scalable this way because the launcher jobs distribute the job launching and monitoring to other machines in the cluster; otherwise, with the Oozie server doing everything, we'd have to limit the number of concurrent workflows based on the Oozie server's machine specs (RAM, CPU, etc.). And finally, from an architectural standpoint, the Oozie server itself is stateless; that is, everything is stored in the database and the Oozie server can be taken down at any point without losing anything. If we were to launch jobs directly from the Oozie server, then we'd have some state on the server (e.g. the Pig client cannot be restarted and resumed).
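The job arithmetic from the question and the placement argument in the reply can be sketched as a toy model. This is only an illustration, not Oozie code: the `Placement` class and the `place` function are hypothetical names, and the model assumes that each action results in exactly one MR job of its own, as in the question's workflow.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Toy model of where the work ends up for a batch of workflow actions."""
    clients_on_server: int  # heavy client processes running on the Oozie server
    launcher_jobs: int      # launcher jobs running on the cluster
    action_jobs: int        # MR jobs started by the actions themselves

    @property
    def total_mr_jobs(self) -> int:
        return self.launcher_jobs + self.action_jobs


def place(actions, use_launcher):
    # Assumption: one MR job per action, as in the question's workflow.
    n = len(actions)
    if use_launcher:
        # One launcher job per action; the client runs in a map task on the cluster.
        return Placement(clients_on_server=0, launcher_jobs=n, action_jobs=n)
    # Hypothetical alternative: the server runs every client itself.
    return Placement(clients_on_server=n, launcher_jobs=0, action_jobs=n)


actions = ["sqoop", "hive", "pig"]
with_launcher = place(actions, use_launcher=True)
direct = place(actions, use_launcher=False)

print(with_launcher.total_mr_jobs, with_launcher.clients_on_server)  # 6 0
print(direct.total_mr_jobs, direct.clients_on_server)                # 3 3
```

In this model the launcher approach doubles the MR job count (6 instead of 3) but leaves no client processes on the server, which is the trade-off the reply describes.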

Regarding "hadoop - What is the point of the Oozie MR launcher?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19488758/
