Table of Contents
Intro
Data Replication is a common practice in the world of Distributed Systems.
It allows you to achieve the following:
- The system can continue working even if some parts have failed (and thus increasing availability).
- Offload some heavy reporting queries from the Primary (Leader) node and route them to a dedicated Secondary (Follower/Read-only) node.
- Scale out the number of machines that can serve read queries (and thus increase read throughput).
- Keep data geographically close to the users.
In most scenarios when data is replicated from one server to another, we refer to consistency as “eventual.” Meaning that whatever data we write to a specific node will eventually propagate to all the other nodes. The system will eventually become consistent.
In most cases, Eventual Consistency is all we need – for example, it’s perfectly fine to see your friend’s post on social media with a few seconds delay.
In other cases, though, this time frame required to move the data between servers can have very unpleasant effects – things suddenly disappearing, moving back in time, even privacy breaches!
Such issues appear when events that always need to be observed strictly together and/or one after the other somehow get re-ordered or partially lost.
I know this sounds too blurry and abstract for now but bear with me – you’ll understand very well the meaning of all this by the end of the article.
I believe it’s essential to be aware of those scenarios and familiarize ourselves with the techniques required to deal with them.
This first post will be a little more theoretical on the matter of Causal Consistency Guarantees.
Next, I’ll take the topic to a more practical level and discuss some general architectural considerations and design patterns together with some specific implementation choices in MongoDB.
If that sounds exciting – let’s dig in!
Useful Resources
This article is very much influenced by the Designing Data-Intensive Applications book. I strongly recommend it if you want to learn more about the nuts and bolts of Data-Driven Systems.
Another excellent learning source is the Introduction to Database Engineering course at Udemy.
What is Causal Consistency?
The easiest way to explain Causal (*) Consistency is with an example.
(*) Make sure you read that as “causal,” not “casual.”
Where Did Your Order Go?
Suppose you just placed an order on an e-commerce platform. You get a “success” message.
Then you go to the “all orders” page, and you don’t see the one you just placed.
That would be very irritating, right? Did the order actually go through? Did you get charged? Should you order again?
Then, right when you’re on the phone calling the support center, you reload the page, and the order suddenly appears.
So what happened?
Fundamentals of Database Engineering
Learn ACID, Indexing, Partitioning, Sharding, Concurrency control, Replication, DB Engines, Best Practices and More!
Eventual Consistency Wasn’t Enough
Here’s one possible scenario.
The initial submission of the order went to a database server that accepts writes – a Primary server. However, when you requested to read all your orders, you were redirected to a Secondary replica that’s still behind the Primary, e.g., your order is still not replicated. At some point, the order gets copied over to the Secondary, and you see it after a page refresh.
That’s a perfect example of Eventual Consistency. There was a time interval when the system was in an inconsistent state, but after a while, all the servers got synced with each other, so the system eventually became consistent again.
Still, the user experience was quite frustrating, wasn’t it? Eventual Consistency wasn’t enough in this case. You would expect to see your order right after you submit it successfully.
You need stronger guarantees that cover causal (happens-before) relations.
Transitioning to Causal Consistency
The activities of submitting your order and then reviewing it are causally related. If your Write request (submit the order) successfully went through, your subsequent Read request (read the orders) should return it.
Eventual Consistency doesn’t deal with that.
In this concrete example, the system does not provide a particular type of Causal Consistency Guarantee, called “Read Your Write,” which is one of the simpler ones to comprehend.
There are many types and variations of similarly broken causal relations, but I’ll review only the most common ones here.
Many of the sources that I went through had some fairly theoretical definitions of the topic. Instead, I’ll do my best to present you with an example-centric walkthrough that’s easier to wrap your head around.
Let’s get going!
Read Your Write
Let’s review the “Read Your Write” consistency through another example:
- The Client inserts a new comment for a blog post. This Write goes to the Primary server.
- He gets a confirmation that the comment is inserted.
- The comment is ready to be replicated (there are no guarantees when that happens).
- Meanwhile, the Client reads all the comments for the post. The request is redirected to the Secondary.
- No comments are returned because the comment is still not replicated.
This example is very similar to the “missing order” one from above, so I hope the workflow is clear.
Let’s move to some more exciting scenarios.
Monotonic Reads
Without a Monotonic Reads guarantee, the Client might get a feeling he’s moving back in time.
Explore the following diagram:
Fig. 2 – Violation of “Monotonic Reads” consistency
- Client A inserts a new comment for a blog post. This Write goes to the Primary server.
- He gets a confirmation that the comment is inserted.
- The comment is ready to be replicated (there are no guarantees when that happens).
- Client B reads all the comments for the post.
- This Read happens to hit the Primary node, so the comment inserted by Client A is returned.
- Client B reissues the same request.
- This time he gets redirected to the Secondary replica where the comment is still not replicated, so he gets an empty response.
Such scenarios can be pretty confusing – how did the comment disappear? Did it get deleted?
Monotonic Writes
On social media, a user may want to make his album private before uploading a new picture.
In this case, there are two Write operations – one for updating the album access policy and one for the photo upload.
Imagine what will happen if these two writes are somehow applied in the opposite order – first, the photo is uploaded, and then the album is made private.
In that case, the uploaded photo will be public for some time which was never the intention.
Even worse, what if the “privacy Write” gets lost forever?
Let’s review such scenarios.
Multiple Primary Servers
In the following example, we’ll explore the case when the two writes might get applied in the opposite order.
Let’s say the database supports multiple Primary servers, e.g. they all accept writes that need to be propagated to other servers afterward.
Note that MongoDB, which I’ll focus on in the following article, is a “single leader” database meaning that only one server per replica set accepts writes – the Primary. Therefore this scenario does not apply to this specific database.
Without a Monotonic Writes guarantee, a situation like the one below might occur:
- Client A makes his album private. The Write goes to Primary A.
- He gets a confirmation that the update is successful.
- The update is ready to be replicated (there are no guarantees when that happens).
- Then he uploads a photo that should not be public. This Write goes to Primary B.
- He gets a confirmation that the photo is successfully uploaded. From that point onwards, until the “privacy” update is replicated to Primary B, any user that hits Primary B can see the supposedly private photo of Client A.
- For example, Client B requests all the photos of Client A’s album.
- He gets back a response that contains the photo that should’ve been hidden!
Rollback During a Failover
Let’s now explore a case when a Write might just disappear.
Here’s a potential scenario.
A Primary server steps down (becomes unavailable). This can be due to various reasons – network glitches, software updates, some blocked processes, and so on.
In that case, some other server needs to be promoted to a Primary (a Leader), and the old Primary needs to become a Secondary (a Follower).
This process is known as a Failover.
The thing is, before the failure of the old Primary, it might have had pending writes that were still not replicated. When it returns to life, there’s a new Primary, so it needs to join the replica set as a Secondary.
So what happens to the writes that are still not replicated? At least in MongoDB, they are rolled back.
This is how a Write operation can potentially be lost.
The process is illustrated below:
- Client A makes his album private. The Write goes to Server A, which is currently the Primary server.
- He gets a confirmation that the update is successful.
- Server A fails (becomes unavailable). Note that the replication of the update cannot proceed.
- The point of a failover. Server A steps down as a Primary, and Server B becomes the new Primary.
- When Server A becomes available again and re-joins the replica set, it understands there’s another Primary elected, so it becomes Secondary and rollbacks the update of the album (and any other Write operation that couldn’t be replicated before the failover).
- Client A inserts a photo into the album.
- He gets a confirmation that the insert is successful.
- Client B requests all the photos for Client A’s album. This Read request hits Server B.
- The photo that was supposed to be private gets returned! That’s because the album update got lost.
Writes Follow Reads
Imagine you read a blog post, and you go to the comments section.
You read a comment by some Client A – let’s say he’s asking a question.
You know the answer, so you write your own comment in response to the one by Client A.
Now, imagine a scenario when some Client C goes to the comments and sees only yours but not the one by Client A.
That would be pretty confusing. Your comment alone doesn’t make any sense. It’s supposed to answer a question that is now missing.
The Writes Follow Reads consistency dictates that in this example, your comment should only be seen together with the one by Client A. If both comments are missing – that’s also OK. The important thing is the two writes go hand in hand.
Technically, we can present a violation of “Writes Follow Reads” similarly to the Failover case I showed in the previous section – we need a Write operation to disappear:
- Client A inserts a comment to a post. The Write hits Server A, which is Primary at the moment.
- He gets a confirmation that the insert is successful.
- Client B reads the comments for the post. The Read hits Server A.
- The comment by Client A is returned.
- Server A fails (becomes unavailable). Note that the replication of Client A’s comment cannot proceed.
- The point of a failover. Server A steps down as a Primary, and Server B becomes the new Primary.
- After Server A re-joins the replica set, it turns into a Secondary, and the comment by Client A is rollbacked.
- Client B inserts another comment as a response to Client A’s comment he’s seen before. The Write goes to Server B, which is the new Primary.
- Client B gets a confirmation that the insert is successful.
- Client C joins the party. He reads all the comments for the post. The Read request goes to Server B.
- Server B returns only the comment by Client B, which is more than confusing!
Note that there are other ways to present such a consistency breach – one example is with re-ordered writes with multiple leaders, as shown in Figure 3.
Summary
In this article, you learned about Causal Consistency Guarantees.
You’ve seen some very elaborate examples of things that can go wrong due to data replication in a Distributed System.
The next post will be more practical – I’ll concentrate on MongoDB and the toolset for enforcing the required consistency guarantees.
In the process, I’ll review the topics of Write and Read Concerns, Logical (Lamport) Clock, Causally Consistent Sessions, and more.
Thanks for reading!