hadoop - Has anyone overridden the Mapper run(Context) method in their implementation?

Reposted by: 可可西里. Updated: 2023-11-01 15:10:12

https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html#method.summary

The documentation for the run(Context) method of org.apache.hadoop.mapreduce.Mapper says:

Expert users can override this method for more complete control over the execution of the Mapper.

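What is the current default behavior of the run(Context) method?

If I override run(Context), what kind of special control do I get, as per the documentation?

Has anyone overridden this method in their implementations?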

Best answer

What is the current default behavior of the run(Context) method?

The default implementation is visible in the Apache Hadoop source code for the Mapper class:

    /**
     * Expert users can override this method for more complete control over the
     * execution of the Mapper.
     * @param context
     * @throws IOException
     */
    public void run(Context context) throws IOException, InterruptedException {
      setup(context);
      try {
        while (context.nextKeyValue()) {
          map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
      } finally {
        cleanup(context);
      }
    }

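In summary:

Call setup for one-time initialization.

Iterate over all key-value pairs in the input.

Pass each key and value to the map method implementation.

Call cleanup for one-time teardown.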
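One detail of this control flow is easy to miss: cleanup sits in a finally block, so it still runs when map throws, while setup sits outside the try block. The following is a minimal sketch of that ordering in plain Java. MiniMapper and LifecycleDemo are hypothetical stand-ins written for illustration and are not Hadoop classes; only the shape of run mirrors the source shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Mapper; NOT a Hadoop class.
class MiniMapper {
    final List<String> log = new ArrayList<>();

    protected void setup() { log.add("setup"); }
    protected void map(String record) { log.add("map:" + record); }
    protected void cleanup() { log.add("cleanup"); }

    // Same control flow as Mapper.run(Context): setup is outside the try block,
    // cleanup is in the finally block.
    public void run(Iterator<String> input) {
        setup();
        try {
            while (input.hasNext()) {
                map(input.next());
            }
        } finally {
            cleanup();
        }
    }
}

class LifecycleDemo {
    // Runs a MiniMapper over the records "a" and "b"; optionally makes map fail on "b".
    static List<String> trace(boolean failOnSecond) {
        MiniMapper m = new MiniMapper() {
            @Override
            protected void map(String record) {
                if (failOnSecond && record.equals("b")) {
                    throw new IllegalStateException("boom");
                }
                super.map(record);
            }
        };
        try {
            m.run(Arrays.asList("a", "b").iterator());
        } catch (IllegalStateException e) {
            m.log.add("failed");
        }
        return m.log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // [setup, map:a, map:b, cleanup]
        System.out.println(trace(true));  // [setup, map:a, cleanup, failed]
    }
}
```

In the failing run, cleanup is recorded before the exception reaches the caller, which is the behavior the try/finally in the default run provides.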

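If I override run(Context), what kind of special control do I get, as per the documentation?

The default implementation always follows a specific sequence of execution in a single thread. Overriding it is rare, but doing so can open up possibilities for highly specialized implementations, such as a different threading model or an attempt to coalesce redundant key ranges.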
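As one illustration of the kind of control an override gives, here is a sketch of a mapper that stops reading input after a fixed number of records per map task. It is not taken from the Hadoop codebase. The class name LimitedMapper and the configuration key example.mapper.record.limit are hypothetical names chosen for this example, and the code needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical example class: processes at most N records per map task.
public class LimitedMapper extends Mapper<LongWritable, Text, LongWritable, Text> {

    // Hypothetical configuration key, not a built-in Hadoop property.
    static final String LIMIT_KEY = "example.mapper.record.limit";

    @Override
    public void run(Context context) throws IOException, InterruptedException {
        // With no limit configured, this behaves like the default run.
        long limit = context.getConfiguration().getLong(LIMIT_KEY, Long.MAX_VALUE);
        long seen = 0;
        setup(context);
        try {
            // Stop pulling input as soon as the limit is reached.
            while (seen < limit && context.nextKeyValue()) {
                map(context.getCurrentKey(), context.getCurrentValue(), context);
                seen++;
            }
        } finally {
            cleanup(context);
        }
    }
}
```

The sketch keeps the setup, try, finally, cleanup skeleton of the default implementation and changes only the loop condition. Since map is not overridden, the inherited identity map writes each record through unchanged.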

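Has anyone overridden this method in their implementations?

Within the Apache Hadoop codebase, there are two overrides:

ChainMapper allows multiple Mapper class implementations to be chained together for execution within a single map task. Its override of run sets up an object representing the chain and passes each input key/value pair through that chain of mappers.

MultithreadedMapper allows multi-threaded execution of another Mapper class. That Mapper class must be thread-safe. Its override of run starts multiple threads that iterate over the input key-value pairs and pass them to the underlying Mapper.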
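For reference, a job typically opts into MultithreadedMapper from the driver rather than by subclassing it. The sketch below shows only the mapper wiring, under stated assumptions: MultithreadedDriverSketch, UpperCaseMapper, the job name, and the thread count of 4 are all hypothetical, and input/output paths and formats are left out.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

// Hypothetical driver sketch: only the mapper-related configuration is shown.
public class MultithreadedDriverSketch {

    // Hypothetical mapper with no mutable fields, so concurrent map calls
    // do not share state.
    public static class UpperCaseMapper
            extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(key, new Text(value.toString().toUpperCase()));
        }
    }

    public static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "multithreaded-sketch");
        job.setJarByClass(MultithreadedDriverSketch.class);

        // The job runs MultithreadedMapper, which delegates to UpperCaseMapper.
        job.setMapperClass(MultithreadedMapper.class);
        MultithreadedMapper.setMapperClass(job, UpperCaseMapper.class);
        MultithreadedMapper.setNumberOfThreads(job, 4); // arbitrary value for illustration

        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        return job;
    }
}
```

The delegate mapper here keeps no fields, which is the simplest way to satisfy the thread-safety requirement mentioned above.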

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/44637869/
