hadoop - Which class parses Hive & Pig into MapReduce

Reposted. Author: 可可西里. Updated: 2023-11-01 15:03:07

Which class parses Pig and Hive commands into MapReduce jobs, and what is the algorithm behind this parsing?

Best answer

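Both Pig and Hive use ANTLR to build a compiler that parses their scripts. If you are not familiar with compiler principles, I suggest reading up on them first.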
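
To make "parsing a script into a tree" concrete, here is a minimal sketch of a lexer and a recursive-descent parser. It is hand-written for illustration and is not what ANTLR generates from the Pig or Hive grammars; the class ToyQueryParser, its Node type, and the tiny SELECT ... FROM ... grammar are all assumptions of mine.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy lexer + recursive-descent parser; hypothetical, not generated by ANTLR. */
public class ToyQueryParser {

    /** A minimal AST node: a label plus ordered children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        Node add(Node child) { children.add(child); return this; }

        /** Renders the tree in a LISP-like form, e.g. (FROM users). */
        @Override
        public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private ToyQueryParser(String[] tokens) { this.tokens = tokens; }

    /** Lexer step: split the input into whitespace-separated tokens. */
    static String[] lex(String query) {
        return query.trim().split("\\s+");
    }

    private String next() {
        if (pos >= tokens.length) throw new IllegalArgumentException("unexpected end of input");
        return tokens[pos++];
    }

    private void expect(String keyword) {
        String t = next();
        if (!t.equalsIgnoreCase(keyword)) {
            throw new IllegalArgumentException("expected " + keyword + " but got '" + t + "'");
        }
    }

    /** Parser step for the toy grammar: query := SELECT column FROM table */
    static Node parse(String query) {
        ToyQueryParser p = new ToyQueryParser(lex(query));
        p.expect("SELECT");
        Node select = new Node("SELECT").add(new Node(p.next()));
        p.expect("FROM");
        Node from = new Node("FROM").add(new Node(p.next()));
        if (p.pos < p.tokens.length) {
            throw new IllegalArgumentException("unexpected token '" + p.tokens[p.pos] + "'");
        }
        return new Node("QUERY").add(from).add(select);
    }

    public static void main(String[] args) {
        System.out.println(parse("SELECT name FROM users"));
    }
}
```

The later stages of a real compiler walk such a tree rather than the raw text, which is why both projects produce an abstract syntax tree first.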

For Pig, the ANTLR grammar files are src/org/apache/pig/parser/QueryLexer.g and src/org/apache/pig/parser/QueryParser.g. They are compiled into org.apache.pig.parser.QueryLexer and org.apache.pig.parser.QueryParser. These two classes, however, only turn a Pig script into an abstract syntax tree. The tree is then converted into an org.apache.pig.newplan.logical.relational.LogicalPlan, and the LogicalPlan is in turn translated into an org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan. Here are some related source files:

org.apache.pig.newplan.logical.relational.LogicalPlan
org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.MROperPlan
org.apache.pig.parser.QueryParserDriver.parse(String)
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(LogicalPlan, Properties)
org.apache.pig.PigServer.launchPlan(PhysicalPlan, String)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(PhysicalPlan, PigContext)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(MROperPlan, MapReduceOper, Configuration, PigContext)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(MROperPlan, String)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(PhysicalPlan, String, PigContext)
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.constructLROutput(List<Result>, List<Result>, Tuple)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.Map.collect(Context, Tuple)
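
The signatures above suggest that the PhysicalPlan is finally compiled into an MROperPlan, i.e. a plan of MapReduce jobs. As a rough idea of what that last step has to decide, here is a toy sketch that cuts a linear operator chain into jobs at shuffle boundaries. It reflects my simplified understanding only: ToyMRCompiler, ToyJob, the string operator names, and the STORE_TMP/LOAD_TMP markers are all hypothetical, and real Pig splits an operator such as GROUP into several physical operators (POLocalRearrange, for example, is one of them) instead of treating it as a single step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy illustration only: none of these types are Pig classes. */
public class ToyMRCompiler {

    /** One toy MapReduce job: map-side operators, then reduce-side operators after the shuffle. */
    static final class ToyJob {
        final List<String> mapOps = new ArrayList<>();
        final List<String> reduceOps = new ArrayList<>();
        boolean hasShuffle = false;

        @Override
        public String toString() {
            return "map=" + mapOps + " reduce=" + reduceOps;
        }
    }

    /** Operators that need all rows of a key in one place, so they force a shuffle. */
    static boolean needsShuffle(String op) {
        return op.equals("GROUP") || op.equals("JOIN") || op.equals("ORDER");
    }

    /** Walks a linear operator chain and starts a new job whenever a second shuffle is needed. */
    static List<ToyJob> compile(List<String> physicalOps) {
        List<ToyJob> jobs = new ArrayList<>();
        ToyJob current = new ToyJob();
        jobs.add(current);
        for (String op : physicalOps) {
            if (needsShuffle(op) && current.hasShuffle) {
                // The reduce phase of this job is taken: spill to temporary
                // storage and continue in a fresh job that reads it back.
                current.reduceOps.add("STORE_TMP");
                current = new ToyJob();
                current.mapOps.add("LOAD_TMP");
                jobs.add(current);
            }
            if (needsShuffle(op)) {
                current.hasShuffle = true;
            }
            if (current.hasShuffle) {
                current.reduceOps.add(op);
            } else {
                current.mapOps.add(op);
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<String> plan = Arrays.asList("LOAD", "FILTER", "GROUP", "FOREACH", "ORDER", "STORE");
        for (ToyJob job : compile(plan)) {
            System.out.println(job);
        }
    }
}
```

For the sample plan this yields two jobs: the first filters on the map side and groups on the reduce side, the second reads the temporary output and sorts it.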

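For Hive, the ANTLR grammar file is ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g. It is compiled into org.apache.hadoop.hive.ql.parse.HiveLexer and org.apache.hadoop.hive.ql.parse.HiveParser. These two classes turn a Hive script into an abstract syntax tree, which is then converted into an org.apache.hadoop.hive.ql.QueryPlan. The mapper and reducer in Hive are ExecMapper and ExecReducer, respectively.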

Here are some related source files:

org.apache.hadoop.hive.cli.CliDriver
org.apache.hadoop.hive.ql.Driver
org.apache.hadoop.hive.ql.Driver.run(String)
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(String, Context)
org.apache.hadoop.hive.ql.parse.ASTNode
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer
org.apache.hadoop.hive.ql.QueryPlan
org.apache.hadoop.hive.ql.Driver.compile(String, boolean)
org.apache.hadoop.hive.ql.exec.TaskRunner
org.apache.hadoop.hive.ql.Driver.execute()
org.apache.hadoop.hive.ql.exec.ExecDriver
org.apache.hadoop.hive.ql.exec.ExecMapper
org.apache.hadoop.hive.ql.exec.ExecReducer
org.apache.hadoop.hive.ql.exec.MapOperator
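
As I understand it, a single generic mapper such as ExecMapper can serve every query because it does not contain query logic itself; it hands each input record to the root of an operator tree built from the plan. The sketch below shows that push-based style with a toy chain. Everything in it (ToyOperatorPipeline, Op, FilterOp, SelectOp, CollectOp, the comma-separated rows) is hypothetical and much simpler than Hive's real operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy push-based operator chain; these are not Hive's operator classes. */
public class ToyOperatorPipeline {

    /** An operator receives a row, does its work, and forwards the result to its child. */
    abstract static class Op {
        Op child;

        /** Links the next operator and returns it so calls can be chained. */
        Op then(Op next) { this.child = next; return next; }

        abstract void process(String[] row);

        void forward(String[] row) { if (child != null) child.process(row); }
    }

    /** Forwards only the rows that satisfy the predicate. */
    static final class FilterOp extends Op {
        private final Predicate<String[]> keep;
        FilterOp(Predicate<String[]> keep) { this.keep = keep; }
        @Override void process(String[] row) { if (keep.test(row)) forward(row); }
    }

    /** Projects the given column indexes. */
    static final class SelectOp extends Op {
        private final int[] columns;
        SelectOp(int... columns) { this.columns = columns; }
        @Override void process(String[] row) {
            String[] out = new String[columns.length];
            for (int i = 0; i < columns.length; i++) out[i] = row[columns[i]];
            forward(out);
        }
    }

    /** Terminal operator: stands in for whatever emits rows out of the map task. */
    static final class CollectOp extends Op {
        final List<String> emitted = new ArrayList<>();
        @Override void process(String[] row) { emitted.add(String.join(",", row)); }
    }

    /** Plays the role of the generic mapper: pushes each input record into the chain. */
    static List<String> runMapper(List<String> lines) {
        Op root = new FilterOp(r -> Integer.parseInt(r[1]) >= 18);
        CollectOp sink = new CollectOp();
        root.then(new SelectOp(0)).then(sink);
        for (String line : lines) {
            root.process(line.split(","));
        }
        return sink.emitted;
    }

    public static void main(String[] args) {
        // Roughly: SELECT name FROM people WHERE age >= 18
        System.out.println(runMapper(Arrays.asList("ann,31", "bob,12", "cy,18")));
    }
}
```

With this design, compiling a query amounts to building the right operator tree and shipping it to the tasks, not generating a new mapper class per query.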

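Finally, I suggest downloading the source code of both projects and browsing it in Eclipse; you can then look up whatever you want to know yourself.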

Regarding "hadoop - Which class parses Hive & Pig into MapReduce", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16959627/
