Leveraging Kafka Spring Dead Letter Queue for Resilient Messaging

Introduction

In the realm of distributed systems, robust communication between microservices is paramount. Kafka, with its high throughput and fault-tolerant design, has become a go-to solution for building scalable messaging systems.

However, ensuring message reliability in asynchronous communication can be challenging, especially when dealing with failures and errors. One approach to handle such scenarios is the use of a Dead Letter Queue (DLQ), which acts as a safety net for messages that couldn’t be processed successfully on their initial attempt.

In this blog post, we’ll explore the concept of Kafka Spring Dead Letter Queue and see how to implement it.

Understanding Kafka Spring Dead Letter Queue

A Dead Letter Queue (DLQ) is a special queue where messages that fail to be processed are sent. In the context of Kafka and Spring, the Dead Letter Queue is an invaluable feature that enhances the resilience of message-driven applications. When a consumer encounters an error while processing a message from a Kafka topic, instead of discarding the message outright, it can be redirected to a designated DLQ for further analysis or processing.

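Achieving non-blocking retry and dead letter topic (DLT) functionality with Kafka usually requires setting up extra topics and creating and configuring the corresponding listeners. In Kafka the DLQ is itself a topic, which is why it is usually called a DLT.

Failed messages move down through a series of retry topics until they land in the DLT: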

  • If message processing fails, the message is forwarded to a retry topic with a back-off timestamp.
  • The retry topic consumer checks the timestamp and, if the message is not yet due, pauses consumption for that topic’s partition.
  • When the message is due, partition consumption is resumed and the message is consumed again.
  • If processing fails again, the message is forwarded to the next retry topic, and the pattern repeats until processing succeeds or the attempts are exhausted.
  • If all retry attempts are exhausted, the message is sent to the Dead Letter Topic for visibility and diagnosis.
  • Dead Letter Topic messages can be reprocessed by publishing them back into the first retry topic. This way, they do not interfere with live traffic.
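
The steps above can be modeled in a few lines of plain Java. This is only an illustrative sketch, not how Spring Kafka implements the flow: the class RetryChain, its method names, and the topic names used with it are hypothetical, and no Kafka client is involved.

```java
import java.util.List;

// Hypothetical model of the retry chain: main topic, retry topics, then the DLT.
final class RetryChain {

    // Ordered chain, e.g. [main, retry-0, retry-1, dlt]; the last entry is the DLT.
    private final List<String> topics;

    RetryChain(List<String> topics) {
        if (topics.size() < 3) {
            throw new IllegalArgumentException("need a main topic, at least one retry topic, and a DLT");
        }
        this.topics = List.copyOf(topics);
    }

    // Where a record is forwarded after processing fails on currentTopic.
    // Assumption of this sketch: a failure on the DLT leaves the record on the DLT.
    String nextTopicAfterFailure(String currentTopic) {
        int index = topics.indexOf(currentTopic);
        if (index < 0) {
            throw new IllegalArgumentException("unknown topic: " + currentTopic);
        }
        return topics.get(Math.min(index + 1, topics.size() - 1));
    }

    // A retry consumer processes a record only once its back-off timestamp has passed;
    // until then it would pause the partition.
    static boolean isDue(long dueAtMillis, long nowMillis) {
        return nowMillis >= dueAtMillis;
    }

    // DLT records are reprocessed through the first retry topic, not the main topic,
    // so they stay out of the way of live traffic.
    String reprocessTopic() {
        return topics.get(1);
    }
}
```

For a chain built from "orders", "orders-retry-0", "orders-retry-1" and "orders-dlt", a failure on "orders" is routed to "orders-retry-0", a failure on "orders-retry-1" is routed to "orders-dlt", and reprocessing starts again at "orders-retry-0".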

Non-Blocking Retries in Spring Kafka

Since Spring Kafka 2.7.0, failed deliveries can be forwarded to a series of topics for delayed redelivery.

This can be described with an example:

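import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.DltHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.kafka.retrytopic.TopicSuffixingStrategy;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.retry.annotation.Backoff;
import org.springframework.stereotype.Component;

@Component
public class RetryableKafkaListener {

  // SLF4J logger; assumed here because the original snippet did not declare "log".
  private static final Logger log = LoggerFactory.getLogger(RetryableKafkaListener.class);

  @RetryableTopic(
      attempts = "4", // 1 initial delivery + 3 retries
      backoff = @Backoff(delay = 1000, multiplier = 2.0), // 1 s, 2 s, 4 s
      autoCreateTopics = "false", // retry and DLT topics are not created by the framework
      topicSuffixingStrategy = TopicSuffixingStrategy.SUFFIX_WITH_INDEX_VALUE)
  @KafkaListener(topics = "orders")
  public void listen(String in, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
    log.info("{} from {}", in, topic);
    // Always fail, to demonstrate the retry chain.
    throw new RuntimeException("test");
  }

  // Invoked once all attempts are exhausted.
  @DltHandler
  public void dlt(String in, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
    log.info("{} from {}", in, topic);
  }
}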

With this @RetryableTopic configuration, when the first delivery attempt on orders fails, the record is sent to a topic orders-retry-0 configured for a 1-second delay.

When that delivery fails, the record is sent to a topic orders-retry-1 with a 2-second delay.

When that delivery fails, it goes to a topic orders-retry-2 with a 4-second delay and, finally, to a dead letter topic orders-dlt handled by the @DltHandler method.
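
The topic names and delays above follow from the annotation values: 4 attempts means one initial delivery plus 3 retry topics, and each delay is the previous one multiplied by 2.0. The plain-Java sketch below reproduces that arithmetic. The class BackoffSchedule and its plan method are hypothetical helpers written for this post, not part of Spring Kafka, and the sketch ignores options such as a maximum delay.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mirrors the naming and delays described in the text.
final class BackoffSchedule {

    // Maps each retry topic to its delay in milliseconds, in delivery order.
    // The DLT is listed last with a delay of 0 because this sketch models no delay for it.
    static Map<String, Long> plan(String mainTopic, int attempts, long delayMs, double multiplier) {
        Map<String, Long> plan = new LinkedHashMap<>();
        long delay = delayMs;
        // "attempts" includes the first delivery on the main topic,
        // so there are attempts - 1 retry topics.
        for (int i = 0; i < attempts - 1; i++) {
            plan.put(mainTopic + "-retry-" + i, delay);
            delay = Math.round(delay * multiplier);
        }
        plan.put(mainTopic + "-dlt", 0L);
        return plan;
    }
}
```

Calling BackoffSchedule.plan("orders", 4, 1000, 2.0) yields orders-retry-0 with 1000 ms, orders-retry-1 with 2000 ms, orders-retry-2 with 4000 ms, and orders-dlt last, matching the sequence described above.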

The Kafka Spring Dead Letter Queue is a powerful mechanism for handling message processing failures gracefully. By redirecting failed messages to a separate topic, it gives developers the opportunity to analyze and remediate issues without losing valuable data. Incorporating a DLQ into your Kafka-based applications makes asynchronous messaging more resilient and reliable.
