作者热门文章
- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
当我声明具有两个源(1 个 gcs 和 1 个 pubsub)的管道时,我收到错误,但仅限于 Beam DirectRunner。与 Google dataflow runner 配合使用,效果很好。我的管道有“流选项 = True”
gcsEventsColl = p | "Read from GCS" >> beam.io.ReadFromText("gs://sample_events_for_beam/*.log") \
| 'convert to dict' >> beam.Map(lambda x: json.loads(x))
liveEventsColl = p | "Read from Pubsub" >> beam.io.ReadFromPubSub(topic="projects/axxxx/topics/input_topic") \
| 'convert to dict2' >> beam.Map(lambda x: json.loads(x))
input_rec = (gcsEventsColl, liveEventsColl) | 'flatten' >> beam.Flatten()
DirectRunner 似乎对 ReadFromText 进行了一些不兼容的转换,但我不明白。
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 564, in run
return self.runner.run_pipeline(self, self._options)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 131, in run_pipeline
return runner.run_pipeline(pipeline, options)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 529, in run_pipeline
pipeline.replace_all(_get_transform_overrides(options))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 504, in replace_all
self._check_replacement(override)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 478, in _check_replacement
self.visit(ReplacementValidator())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 611, in visit
self._root_transform().visit(visitor, self, visited)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 1195, in visit
part.visit(visitor, pipeline, visited)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 1195, in visit
part.visit(visitor, pipeline, visited)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 1195, in visit
part.visit(visitor, pipeline, visited) [Previous line repeated 4 more times]
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 1198, in visit
visitor.visit_transform(self)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/apache_beam/pipeline.py", line 476, in visit_transform
transform_node) RuntimeError: Transform node AppliedPTransform(Read from GCS/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/ProcessKeyedElements/GroupByKey/GroupByKey,
_GroupByKeyOnly) was not replaced as expected.
我认为它与此代码有关,但我不确定: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/direct/direct_runner.py#L375
感谢您的帮助
最佳答案
这是由于错误而导致的内部故障。此错误消息意味着 Python DirectRunner 在尝试重写转换时损坏了管道图。
关于python - 在 Apache Beam 中混合流式和非流式源时,转换节点 AppliedPTransform 未按预期错误替换为 DirectRunner,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/68125864/
当我声明具有两个源(1 个 gcs 和 1 个 pubsub)的管道时,我收到错误,但仅限于 Beam DirectRunner。与 Google dataflow runner 配合使用,效果很好。
当我声明具有两个源(1 个 gcs 和 1 个 pubsub)的管道时,我收到错误,但仅限于 Beam DirectRunner。与 Google dataflow runner 配合使用,效果很好。
我是一名优秀的程序员,十分优秀!