
java - Does Flink reduce on the fly in batch mode?

Reposted · Author: 行者123 · Updated: 2023-11-30 06:01:25

According to the Flink streaming documentation:

The window function can be one of ReduceFunction, FoldFunction or WindowFunction. The first two can be executed more efficiently (see State Size section) because Flink can incrementally aggregate the elements for each window as they arrive.

Does the same apply in batch mode? In the example below I read about 36 GB of data from Cassandra, but I expect the reduced output to be much smaller (about 0.5 GB). Does running this job require Flink to hold the entire input in memory, or is it smart enough to iterate over it?

DataSet<MyRecord> input = ...; // ~36 GB read from Cassandra
DataSet<MyRecord> sampled = input
    .groupBy(MyRecord::getSampleKey)  // group records by their sample key
    .reduce(MyRecord::keepLast);      // collapse each group to its last record
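For context, a method reference like `MyRecord::keepLast` has to fit Flink's `ReduceFunction` contract: take two records, return one. The sketch below is an assumption about what `MyRecord` and `keepLast` might look like (the question does not show them); "last" is modeled here with a hypothetical timestamp field.

```java
/** Illustrative sketch: MyRecord and keepLast are assumptions based on the
 *  question, not code from it. A ReduceFunction-compatible method takes two
 *  records and returns one; "keep last" is modeled as the newer timestamp. */
public class MyRecord {
    final String sampleKey;
    final long timestamp;

    MyRecord(String sampleKey, long timestamp) {
        this.sampleKey = sampleKey;
        this.timestamp = timestamp;
    }

    String getSampleKey() { return sampleKey; }

    /** Usable as a method reference in reduce(): keeps the later record.
     *  Note the key of the returned record matches both inputs, which is
     *  what the documentation requires for a combinable reduce. */
    static MyRecord keepLast(MyRecord a, MyRecord b) {
        return b.timestamp >= a.timestamp ? b : a;
    }
}
```

Because `keepLast` always returns one of its two inputs, the result's key field necessarily matches the group's key, satisfying the constraint quoted in the answer below that keyed fields of the returned object must match the input values.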

Best Answer

According to the documentation on the Reduce Operation in Flink, I see the following:

A Reduce transformation that is applied on a grouped DataSet reduces each group to a single element using a user-defined reduce function. For each group of input elements, a reduce function successively combines pairs of elements into one element until only a single element for each group remains.

Note that for a ReduceFunction the keyed fields of the returned object should match the input values. This is because reduce is implicitly combinable and objects emitted from the combine operator are again grouped by key when passed to the reduce operator.

If I am reading that correctly, Flink performs the reduce on the mapper side (as a combine step) and then again on the reducer side, so the amount of data actually emitted and serialized should be small.
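The memory implication of a combinable reduce can be shown without Flink at all: the runtime only ever keeps one running value per key, so state is bounded by the number of distinct keys, not by the input size. Below is a minimal plain-Java sketch of that idea (the names and the string-record format are illustrative, not from the question):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Sketch (plain Java, no Flink): an incremental, combinable reduce holds
 *  only one running value per key, so memory is proportional to the number
 *  of distinct keys, not to the total input size. */
public class CombinableReduceSketch {
    static <K, V> Map<K, V> incrementalReduce(Iterable<V> input,
                                              Function<V, K> keyFn,
                                              BinaryOperator<V> reducer) {
        Map<K, V> state = new HashMap<>();
        for (V value : input) {
            // Merge each arriving element into the running value for its key;
            // no element is retained after it has been folded in.
            state.merge(keyFn.apply(value), value, reducer);
        }
        return state;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a:1", "b:2", "a:3");
        // Keep the last record seen per key, analogous to MyRecord::keepLast.
        Map<String, String> result = incrementalReduce(
                records, r -> r.split(":")[0], (left, right) -> right);
        System.out.println(result); // two entries: one per distinct key
    }
}
```

Flink applies the same idea twice, once per mapper before shuffling (the combine) and once per reducer, which is why only roughly one record per key per mapper crosses the network.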

Regarding "java - Does Flink reduce on the fly in batch mode?", there is a similar question on Stack Overflow: https://stackoverflow.com/questions/52223288/
