I have the following simple mrjob script, which reads a large file line by line, performs an operation on each line, and prints the output:
#!/usr/bin/env python
from mrjob.job import MRJob

class LineProcessor(MRJob):
    def mapper(self, _, line):
        yield (line.upper(), None)  # toy example: the mapper just uppercases the line

if __name__ == '__main__':
    # mr_job = LineProcessor(args=['-r', 'hadoop', '/path/to/input'])  # error!
    mr_job = LineProcessor(args=['/path/to/input'])
    with mr_job.make_runner() as runner:
        runner.run()
        for line in runner.stream_output():
            key, value = mr_job.parse_output_line(line)
            print key.encode('utf-8')  # don't care about value in my case
If I pass '-r', 'hadoop' instead (see the comment above), I get the following strange error:
File "mrjob/runner.py", line 727, in _get_steps
'error getting step information: %s', stderr)
Exception: ('error getting step information: %s', 'Traceback (most recent call last):\n File "script.py", line 11, in <module>\n with mr_job.make_runner() as runner:\n File "mrjob/job.py", line 515, in make_runner\n " __main__, which doesn\'t work." % w)\nmrjob.job.UsageError: make_runner() was called with --steps. This probably means you tried to use it from __main__, which doesn\'t work.\n')
How can I create a HadoopJobRunner?
Best answer
Are you missing a steps() method?
def steps(self):
    return [self.mr(
        mapper_init = ...,
        mapper = self.mapper,
        combiner = ...,
        reducer = ...,
    )]
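The error text in the question also points at something else worth checking: it says make_runner() "was called with --steps" from __main__, "which doesn't work". One arrangement that avoids this, sketched here under assumptions rather than taken from the answer, is to keep the job class in its own module and call make_runner() from a separate driver script. The file names line_processor.py and driver.py and the helper iter_output_keys are hypothetical; the runner calls (make_runner, run, stream_output, parse_output_line) are the ones the question's script already uses.

```python
# line_processor.py (hypothetical file name) would hold only the job:
#
#     from mrjob.job import MRJob
#
#     class LineProcessor(MRJob):
#         def mapper(self, _, line):
#             yield (line.upper(), None)
#
#     if __name__ == '__main__':
#         LineProcessor.run()  # the job file's __main__ never calls make_runner()
#
# driver.py (hypothetical file name) then drives the job from outside:


def iter_output_keys(job_cls, args):
    """Run job_cls with the given args and yield each output key.

    job_cls is expected to behave like the MRJob subclass in the question.
    This helper is an illustration, not part of mrjob.
    """
    job = job_cls(args=args)
    with job.make_runner() as runner:
        runner.run()
        for line in runner.stream_output():
            key, _value = job.parse_output_line(line)
            yield key


# Intended use from the driver (the import path is an assumption):
#
#     from line_processor import LineProcessor
#     for key in iter_output_keys(LineProcessor,
#                                 ['-r', 'hadoop', '/path/to/input']):
#         print(key)
```

Passing the job class in as a parameter keeps the driver independent of any one job module. Whether this alone resolves the error on a given mrjob version and cluster is not confirmed by the original thread.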
Regarding "python - How to create a Hadoop runner?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18455538/