
java - Heap space issue when deserializing Avro packets in a Kafka consumer


I am getting a heap-space OutOfMemoryError while deserializing Avro messages in my Kafka consumer.

The consumer code runs in Java against a local Kafka producer and consumer. I tried raising the heap to 10 GB in IntelliJ, but I still hit the problem.
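
(For reference: for a program launched from IntelliJ, the heap limit comes from the run configuration's "VM options" field, e.g. -Xmx10g; changing the IDE's own memory settings has no effect on the consumer process.)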

The simple consumer class code:

// Imports assumed: org.apache.kafka.clients.consumer.*,
// org.apache.kafka.common.serialization.StringDeserializer,
// java.util.Arrays, java.util.Properties; MyClass is the Avro-generated record type.
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "test1");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, AvroDeserializer.class.getName());

KafkaConsumer<String, MyClass> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("fastlog"));

while (true) {
    ConsumerRecords<String, MyClass> records = consumer.poll(100);
    for (ConsumerRecord<String, MyClass> record : records) {
        System.out.printf("----------------------%noffset = %d, key = %s, value = %s%n",
                record.offset(), record.key(), record.value());
    }
}

Here is my deserializer class, which converts the packet back into a plain Java object for processing. The Avro deserializer code:

public T deserialize(String topic, byte[] data) {
    try {
        T result = null;

        if (data != null) {
            LOGGER.debug("data='{}'", DatatypeConverter.printHexBinary(data));

            // A fresh reader and decoder are built for every message.
            DatumReader<GenericRecord> datumReader =
                    new SpecificDatumReader<>(MyClass.getClassSchema());
            Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);

            result = (T) datumReader.read(null, decoder);
            LOGGER.debug("deserialized data='{}'", result);
        }
        return result;
    } catch (Exception ex) {
        throw new SerializationException(
                "Can't deserialize data '" + Arrays.toString(data) + "' from topic '" + topic + "'", ex);
    }
}
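
An aside on the code above, separate from the heap problem: a new SpecificDatumReader and decoder are constructed for every message. Avro allows both to be created once and recycled; a minimal sketch of that, assuming MyClass is the generated record class (additional imports: org.apache.avro.io.BinaryDecoder, java.io.IOException):

// Built once per deserializer instance and recycled across calls.
private final DatumReader<MyClass> reader = new SpecificDatumReader<>(MyClass.getClassSchema());
private BinaryDecoder decoder; // binaryDecoder(data, decoder) reuses this instance

public MyClass deserialize(String topic, byte[] data) {
    if (data == null) return null;
    decoder = DecoderFactory.get().binaryDecoder(data, decoder);
    try {
        return reader.read(null, decoder);
    } catch (IOException e) {
        throw new SerializationException("Can't deserialize data from topic '" + topic + "'", e);
    }
}

Running the consumer fails with the following stack trace: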

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.avro.generic.GenericData$Array.<init>(GenericData.java:245)
at org.apache.avro.generic.GenericDatumReader.newArray(GenericDatumReader.java:391)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:257)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
at kafka.serializer.AvroDeserializer.deserialize(AvroDeserializer.java:59)
at kafka.serializer.AvroDeserializer.deserialize(AvroDeserializer.java:21)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:65)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:55)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:918)
at org.apache.kafka.clients.consumer.internals.Fetcher.access$2600(Fetcher.java:93)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1095)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1200(Fetcher.java:944)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:567)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:528)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1110)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
at SimpleConsumer.main(SimpleConsumer.java:43)

Best Answer

The code you posted doesn't show anything that would run out of memory on its own, but presumably you are storing the values returned by deserialize somewhere and never releasing them. I'd look at whatever calls your deserialize method and check whether you are accumulating all of those results in a list or some other data structure instead of discarding them after processing.
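
For illustration, a minimal sketch of the accumulation pattern to look for, next to the fixed version; allRecords and process() are hypothetical names, not taken from the posted code:

// Leaky pattern: every deserialized value is retained forever.
List<MyClass> allRecords = new ArrayList<>();
while (true) {
    ConsumerRecords<String, MyClass> records = consumer.poll(100);
    for (ConsumerRecord<String, MyClass> record : records) {
        allRecords.add(record.value()); // the heap grows with every message
    }
}

// Non-leaky pattern: handle each value, keep no reference afterwards.
while (true) {
    ConsumerRecords<String, MyClass> records = consumer.poll(100);
    for (ConsumerRecord<String, MyClass> record : records) {
        process(record.value()); // hypothetical handler; the value becomes collectible
    }
}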

Another thing you can do is run a JVM profiler such as JVisualVM and take a heap dump, which will show the types and counts of the objects clogging the JVM heap.
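
As a concrete version of that workflow, the JVM can write the dump automatically at the moment of failure, and jmap can take one from a live process; the dump path and <pid> below are placeholders:

# Write a .hprof snapshot automatically when the OutOfMemoryError is thrown
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/consumer.hprof SimpleConsumer

# Or capture a dump from an already-running consumer process
jmap -dump:live,format=b,file=/tmp/consumer.hprof <pid>

Loading the resulting .hprof file in JVisualVM then shows which classes dominate the retained heap; given the stack trace above, the GenericData$Array instances and whatever is holding on to them are a natural place to start.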

Regarding java - Heap space issue when deserializing Avro packets in a Kafka consumer, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55965414/
