
apache-spark - How do I specify the Kafka consumer group id for Spark Structured Streaming?

Reposted. Author: 行者123. Updated: 2023-12-04 04:00:05

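I want to run two Spark Structured Streaming jobs on the same EMR cluster, both consuming the same Kafka topic. Both jobs are in the running state, but only one of them receives data from Kafka. My Kafka configuration is as follows.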

    spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "xxx")
      .option("subscribe", "sametopic")
      .option("kafka.security.protocol", "SASL_SSL")
      .option("kafka.ssl.truststore.location", "./cacerts")
      .option("kafka.ssl.truststore.password", "changeit")
      .option("kafka.ssl.truststore.type", "JKS")
      .option("kafka.sasl.kerberos.service.name", "kafka")
      .option("kafka.sasl.mechanism", "GSSAPI")
      .load()

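I did not set group.id. My guess is that both jobs end up with the same group id, which causes this problem. However, when I set group.id, I get a message saying that "user-specified consumer groups are not used to track offsets". What is the correct way to solve this? Thanks!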

Best answer

You need to run Spark v3, where the Kafka source accepts the kafka.group.id option described below.

From https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html

kafka.group.id

The Kafka group id to use in Kafka consumer while reading from Kafka. Use this with caution. By default, each query generates a unique group id for reading data. This ensures that each Kafka source has its own consumer group that does not face interference from any other consumer, and therefore can read all of the partitions of its subscribed topics. In some scenarios (for example, Kafka group-based authorization), you may want to use a specific authorized group id to read data. You can optionally set the group id. However, do this with extreme caution as it can cause unexpected behavior. Concurrently running queries (both, batch and streaming) or sources with the same group id are likely interfere with each other causing each query to read only part of the data. This may also occur when queries are started/restarted in quick succession. To minimize such issues, set the Kafka consumer session timeout (by setting option "kafka.session.timeout.ms") to be very small. When this is set, option "groupIdPrefix" will be ignored.
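
As a rough illustration of the option quoted above, here is a minimal sketch in Python (PySpark style), assuming Spark 3.x. The helper names kafka_source_options and read_kafka_stream and the group ids "job-a-group" and "job-b-group" are hypothetical and not from the original question. The security options from the question are left out for brevity. The point is only that the option key is "kafka.group.id" rather than "group.id", and that two concurrently running queries should not share one group id.

```python
from typing import Dict, Optional


def kafka_source_options(
    bootstrap_servers: str,
    topic: str,
    group_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build the Kafka source options for one streaming query."""
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
    }
    if group_id is not None:
        # The source option is "kafka.group.id" (Spark 3.x), not "group.id".
        options["kafka.group.id"] = group_id
    # When group_id is None the option is omitted, so Spark generates
    # a unique group id for the query, as the quoted documentation describes.
    return options


def read_kafka_stream(spark, options: Dict[str, str]):
    """Create the streaming DataFrame; `spark` is an existing SparkSession."""
    return spark.readStream.format("kafka").options(**options).load()


# Hypothetical group ids: one per job, so the two queries do not share a group.
job_a_options = kafka_source_options("xxx", "sametopic", group_id="job-a-group")
job_b_options = kafka_source_options("xxx", "sametopic", group_id="job-b-group")
```

Each job would then pass its own dictionary to read_kafka_stream together with its SparkSession. If no specific authorized group is required, omitting group_id keeps the default behavior in which every query gets its own generated group id.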

Regarding "apache-spark - How do I specify the Kafka consumer group id for Spark Structured Streaming?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63203448/
