The Outbox Pattern with Mongo, Kafka and Debezium in C#

Intro

In this article, we explore a practical implementation of the Outbox Pattern using C#, MongoDB, Debezium, and Kafka.

While the primary goal is not to discuss the Outbox Pattern in a theoretical sense, I’ll provide a brief overview for those not already familiar. In a dockerized environment, I’ll demonstrate how to build a reliable system for propagating data change events from service to service.

You can find the complete source code on GitHub.

The Dual Write Problem

The Dual Write Problem is a common challenge in distributed systems, particularly in microservice architectures where different services send events to each other. In essence, the problem arises when a service attempts to simultaneously update its own local database and publish an event to a message broker like Kafka for other services to consume.

While this may seem like a simple process, it’s inherently risky due to the lack of atomicity – there’s no guarantee that both operations will succeed. For example, if a service updates its local database but fails while publishing the event to the broker, the system’s state becomes inconsistent. The opposite scenario – failing to update the local database but successfully publishing the event – also leads to discrepancies.

Here’s a visual representation of a failing dual write:

Solution – the Outbox Pattern

The Outbox Pattern effectively mitigates the risk associated with dual writes by leveraging the atomicity of database transactions. It introduces an “outbox” table (or collection in Mongo terms) within the service’s local database. Instead of directly publishing an event to a message broker and separately updating the local database, both operations are combined into a single transaction with two steps:

  1. The state change is made to the local database, and an accompanying event representation is added to the Outbox collection within the same transaction. This ensures that either both changes are committed or neither are, maintaining the consistency of the local database.
  2. The events stored in the Outbox are then relayed to the message broker, Kafka, in this case, using a database Change Data Capture mechanism (via Debezium, in our implementation).

The following diagram demonstrates this process:

This article will offer you a clear picture of how this pattern works in a real-world scenario.

The Debezium MongoDB Outbox Event Router

For implementing the Outbox pattern in this demo, the Debezium MongoDB Connector plays a critical role. It allows you to reliably capture and stream changes made to the database and propagate them to Kafka. Kafka becomes the conduit for these changes, ensuring they are reliably delivered to consumers.

Here’s how it works.

The Debezium MongoDB Outbox Event Router is a Single Message Transformation (SMT) that is applied to events emitted by the Debezium MongoDB Connector. When a service wants to create an event, it stores it in an Outbox collection within its own database. The Debezium MongoDB Connector then captures the inserts to the Outbox and routes them to Kafka. The Outbox Event Router SMT transforms these events, ensuring that they are routed to the correct Kafka topic with the right key and content.

This forms the backbone of data synchronization among microservices, which is a crucial aspect of maintaining consistency in a distributed system.

Environment Setup

Setting up the environment for our example involves running several services: Zookeeper, Kafka, MongoDB, and Debezium’s Kafka Connect. To ease this process and ensure everything works together seamlessly, we’re using Docker Compose to spin up the environment.

You simply need to execute the init.sh script. It performs a few key steps:

  1. It starts all the necessary services by executing the docker compose file. This includes Zookeeper, Kafka, MongoDB, and Debezium’s Kafka Connect.
  2. Once MongoDB is running, it initiates a MongoDB replica set, which is required by Debezium to capture the data changes.
  3. It then creates the testdb database and the outbox collection within MongoDB.
  4. After setting up MongoDB, the script starts the Debezium MongoDB connector by making a POST request to the Kafka Connect REST API. This connector sends Mongo data change events to Kafka topics.

By running the init.sh script, you will set up a working environment with minimal hassle. The related services will be ready to receive and process events using the Outbox Pattern.

The Demo App

Let’s review some of the core components of the demo application.

Transactional Dual Write in the UserService

The CreateRandomUser method in UserService embodies the essence of the Outbox Pattern. Take a moment to review the implementation:

This method starts by initiating a MongoDB transaction, ensuring that the following operations either completely succeed or fail as a unit. It performs two main operations within this transaction:

  • User Creation: A new User object is created and inserted into the users collection in MongoDB. This signifies an important state change in the application, i.e., creating a new user.
  • Outbox Record Creation: Alongside the user creation, a new OutboxRecord is created and stored in the Outbox collection. This record is a representation of the state change – it includes details such as the AggregateType (“user”), the AggregateId (the ID of the newly created user), the Type of the event (“userCreated”), and the Payload (serialized data of the new user).

If an exception occurs during these operations, the transaction is aborted, leaving the system in a consistent state. If the operations succeed, the transaction is committed, and both the new user and the corresponding outbox event are persisted atomically.

Program Main Code

In the program main code, the CreateRandomUser method is invoked at some time interval which simulates a continuous generation of users and related outbox events. These outbox events are then picked up by Debezium.

The StartConsumer method, on the other hand, constantly listens for new messages in the Kafka topic outbox.event.user. Once a new message for a user creation is received, it prints the details to the console.

Here is the essential part of the StartConsumer implementation:

Running the App

If you start the application, you should see an outcome like this:

Summary

Throughout this article, we’ve explored a concrete implementation of the Outbox Pattern using MongoDB, Kafka, and Debezium in a C# environment. Our journey began by identifying the Dual Write Problem that many distributed systems face and how the Outbox Pattern offers a reliable solution to tackle it.

By following along with this guide, developers not only gain a deeper understanding of the Outbox Pattern but also acquire a practical solution that can be adapted to their own needs. I hope this knowledge empowers practitioners to build more reliable, scalable, and consistent distributed systems.

Resources

  1. GitHub Repo
  2. MongoDB Outbox Event Router
  3. Reliable Microservices Data Exchange With the Outbox Pattern

Site Footer

Subscribe To My Newsletter

Email address