Sunday 21 August 2022

Master-master vs master-slave database architecture

In this blog we will look at single-copy, master-slave and multi-master database architectures, and go through the pros and cons of each with some examples.

Database without Replication (Single Copy)

In this architecture one standalone database server is used for all read and write DB operations from the application.

Single Copy Database


Advantage : Easy to set up and use

Disadvantages :

  • Low availability (single point of failure) : If the DB server has a problem, the application goes down
  • Low accessibility : Users far away from the server see high response times due to network delay

Master Slave Replication

In the master-slave replication architecture, the application writes data to a single master database node. The master node replicates data to all slave nodes, either synchronously or asynchronously, in near real time. Read operations of the application should be served from the slave nodes only.

Master Slave Architecture
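
To make the read/write split concrete, here is a minimal Python sketch. The class `ReplicatedDatabase` and its methods are hypothetical names made up for this illustration, plain dictionaries stand in for database nodes, and replication is shown as an immediate copy just to keep it short.

```python
import itertools


class ReplicatedDatabase:
    """Toy router: writes go to the master, reads rotate over the slaves."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        # Round-robin iterator over slave indexes for read load balancing.
        self._next_slave = itertools.cycle(range(slave_count))

    def write(self, key, value):
        # All writes land on the single master node first.
        self.master[key] = value
        # Assumption: replication is an immediate copy to every slave.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        # Reads never touch the master; they are spread across the slaves.
        index = next(self._next_slave)
        return index, self.slaves[index].get(key)


db = ReplicatedDatabase(slave_count=2)
db.write("user:1", "Asha")
first = db.read("user:1")   # served by slave 0
second = db.read("user:1")  # served by slave 1
```

Adding more slaves to this sketch increases read capacity, but every write still passes through the one master.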



Advantages

  • Data Availability : If one slave goes down, reads can be served from another slave
  • Data Accessibility : Slaves at different locations contain copies of the same data, so users accessing the application from different locations get lower response times. You just need to run an instance of the application in that region behind a load balancer. (X-axis scaling in the scale cube)
  • Segregation of read and write operations on different nodes, so it is easy to scale read operations by adding slaves.


Disadvantages :

  • Single point of failure : If the master node goes down, one slave has to be promoted to master. During that time writes can fail and data can be lost.
  • Writes go only to the master node, hence we cannot horizontally scale writes.
  • During election of a new master from the slave nodes, there is no master available to accept writes.


Data replication technique : Synchronous vs Asynchronous


Synchronous : The master waits for the slave to confirm that it has received the data before acknowledging the write to the application.

Asynchronous : The master acknowledges the write to the application right away and sends the data to the slave afterwards.


Synchronous DB replication gives high consistency, but it also increases response time for application users because every write waits for the slaves.

Asynchronous DB replication gives lower response time than synchronous replication, but lower consistency, because a slave may serve stale data until it catches up.


So when Ack = 1 

The master waits synchronously for an acknowledgement from only 1 slave server; the other slaves are updated asynchronously.

This system is less consistent and has high throughput.


When Ack = N 

Where N is total count of slave DB nodes

The master waits synchronously for acknowledgements from all slave servers. This system is highly consistent and has low throughput.
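
The effect of the Ack value can be sketched with a toy master in Python. Everything here is hypothetical and simplified: the class `Master`, the `flush` method that stands in for background replication, and the assumption that the first `ack` slaves in the list are the synchronous ones (a real system would count whichever slaves respond first).

```python
class Master:
    """Toy master that waits for `ack` slaves before confirming a write."""

    def __init__(self, slave_count, ack):
        if not 1 <= ack <= slave_count:
            raise ValueError("ack must be between 1 and the number of slaves")
        self.data = {}
        self.slaves = [{} for _ in range(slave_count)]
        self.ack = ack
        self.pending = []  # (slave_index, key, value) still to be replicated

    def write(self, key, value):
        self.data[key] = value
        for index, slave in enumerate(self.slaves):
            if index < self.ack:
                # Synchronous part: the write call blocks on these slaves.
                slave[key] = value
            else:
                # Asynchronous part: queued and applied some time later.
                self.pending.append((index, key, value))
        return "ok"

    def flush(self):
        # Stands in for background replication catching up.
        for index, key, value in self.pending:
            self.slaves[index][key] = value
        self.pending.clear()

    def slaves_with(self, key):
        # Number of slaves that currently hold the key.
        return sum(1 for slave in self.slaves if key in slave)


fast = Master(slave_count=3, ack=1)
fast.write("order:7", "paid")
up_to_date_after_ack_1 = fast.slaves_with("order:7")   # 1 of 3 slaves

strict = Master(slave_count=3, ack=3)
strict.write("order:7", "paid")
up_to_date_after_ack_n = strict.slaves_with("order:7")  # 3 of 3 slaves

fast.flush()
up_to_date_after_flush = fast.slaves_with("order:7")    # 3 of 3 slaves
```

With Ack = 1, a read that hits one of the two lagging slaves right after the write would not see the new value, which is the consistency cost of the higher throughput.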


You can also set the number of synchronous acknowledgements to 1 + half the number of slaves (rounded down), which is known as a quorum.

(quorum) q = floor(N/2) + 1

Where N is total count of slave DB nodes

For example, if the total number of slave nodes (N) = 5, then the quorum is 3.

If N = 3, then the quorum is 2.
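
The quorum arithmetic above can be checked with a few lines of Python. The function name `quorum` is just an illustrative choice; integer division (`//`) performs the rounding down.

```python
def quorum(slave_count):
    """Return floor(N / 2) + 1 for N slave nodes."""
    if slave_count < 1:
        raise ValueError("slave_count must be at least 1")
    return slave_count // 2 + 1


# Quorum size for a few cluster sizes.
quorum_sizes = {n: quorum(n) for n in (3, 4, 5)}
# quorum_sizes == {3: 2, 4: 3, 5: 3}
```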


(Q) How to handle slave node outage ?

Each slave node keeps the bin log replication data received from the master node. If a slave node goes down and comes back after some time, it resyncs from the last replication log position saved on its local disk up to the latest bin log transaction id on the master node.
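
A rough Python sketch of this catch-up step is shown below. `MasterLog` and `Slave` are hypothetical classes, an in-memory list stands in for the bin log, and transaction ids are assumed to be consecutive integers starting at 1.

```python
class MasterLog:
    """Append-only list standing in for the master's bin log."""

    def __init__(self):
        self.entries = []  # entries[i] holds transaction id i + 1

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries)  # transaction id of the new entry

    def entries_after(self, transaction_id):
        # Everything written after the given transaction id.
        return self.entries[transaction_id:]


class Slave:
    def __init__(self):
        self.data = {}
        self.last_applied = 0  # a real slave would persist this on disk

    def catch_up(self, log):
        # Replay only what was written after the last applied transaction.
        for key, value in log.entries_after(self.last_applied):
            self.data[key] = value
            self.last_applied += 1


log = MasterLog()
slave = Slave()
log.append("a", 1)
slave.catch_up(log)   # slave is in sync at transaction id 1
log.append("b", 2)    # slave is down while these two writes happen
log.append("a", 3)
slave.catch_up(log)   # slave returns and replays ids 2 and 3
```

Because the slave remembers its last applied transaction id, it does not need a full copy of the data after an outage, only the missing tail of the log.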


(Q) How to handle master node outage ?

If the master node goes down in master-slave replication, the slave node with the latest transaction id is picked as the new master.

In case of semi-synchronous replication, the slave node that is replicated synchronously is picked as master.

In case of asynchronous replication, if the master node goes down, some data may be lost, because writes that had not yet reached any slave are gone.
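
The promotion rule can be sketched in Python as follows. `SlaveStatus`, `pick_new_master` and `lost_transactions` are hypothetical names, and the transaction ids are made-up sample values; real election protocols involve more than this single comparison.

```python
from dataclasses import dataclass


@dataclass
class SlaveStatus:
    name: str
    last_transaction_id: int
    reachable: bool = True


def pick_new_master(slaves):
    """Promote the reachable slave that has applied the most transactions."""
    candidates = [s for s in slaves if s.reachable]
    if not candidates:
        raise RuntimeError("no reachable slave to promote")
    return max(candidates, key=lambda s: s.last_transaction_id)


def lost_transactions(master_last_id, new_master):
    # Writes the old master accepted but the promoted slave never received.
    return max(0, master_last_id - new_master.last_transaction_id)


slaves = [
    SlaveStatus("slave-a", last_transaction_id=118),
    SlaveStatus("slave-b", last_transaction_id=120),
    SlaveStatus("slave-c", last_transaction_id=121, reachable=False),
]
new_master = pick_new_master(slaves)  # slave-b, the most advanced reachable slave
lost = lost_transactions(master_last_id=122, new_master=new_master)  # 2 writes
```

In this sample the old master had reached transaction id 122 while the promoted slave stopped at 120, so two asynchronously replicated writes are lost.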



Master Master Replication

It is an architectural style where every node is treated as a master and there are no slave nodes. Read and write operations can be done on any master node, so every master node can respond to client queries. Replication is governed by the Ack value, which decides the consistency and throughput of the application. Each master node is responsible for propagating its data modifications to the rest of the master nodes in the group.

Master Master Architecture
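
A minimal Python sketch of this propagation is given below. `MasterNode` and `connect` are hypothetical names, the node names are arbitrary, and replication is an immediate in-process call. Handling of concurrent writes to the same key on different masters is left out of the sketch.

```python
class MasterNode:
    """Toy multi-master node: serves reads and writes, forwards writes to peers."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def write(self, key, value):
        self.data[key] = value
        # Each master propagates its own modifications to the other masters.
        for peer in self.peers:
            peer.apply_replicated(key, value)

    def apply_replicated(self, key, value):
        # Replicated writes are applied locally and not forwarded again.
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


def connect(nodes):
    # Every node gets every other node as a peer.
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]


mumbai, london = MasterNode("mumbai"), MasterNode("london")
connect([mumbai, london])
mumbai.write("cart:9", ["book"])   # written on one master
london.write("profile:9", "Ravi")  # written on the other master
```

After these two writes, both nodes can answer reads for both keys, which is why a client can be routed to whichever master is closest.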



Advantages

  • No rebalancing and no re-election of a master from slave nodes if one master goes down
  • Data Availability : If one master goes down, reads can be served from another master holding the same copy
  • Data Accessibility : Master nodes at different locations contain copies of the same data, so users accessing the application from different locations get lower response times. You just need to run an instance of the application in that region behind a load balancer. (X-axis scaling in the scale cube)
  • No single point of failure : If one master node goes down, read and write operations can be served by the other master nodes.
  • Write operations can be horizontally scaled by adding more master nodes. This helps to distribute write load.


Disadvantages

  • Lower consistency, as most implementations prefer asynchronous replication.
  • Achieving high consistency requires synchronous replication, which adds latency.



Example of Master Slave Replication : MongoDB

MongoDB picks consistency and partition tolerance in CAP.


Example of Master Master Replication : Cassandra 

Cassandra picks availability and partition tolerance in CAP.


Some important links for reference :

Why mongodb is consistent not available and cassandra is available not consistent


What are master slave database


Database replication master slave


I will come back with some more details on MongoDB and Cassandra in my upcoming blogs.

Thanks



3 comments:

  1. Wonderful, Good explanation of how replication works. Waiting for next blogs in the series.

  2. Simple and lucid. I would really appreciate it if you could provide some examples in future blogs, such as, When can we use Cassandra or columnar databases over No-SQL databases and why?

    Replies
    1. Thanks Avinash. I will surely cover your doubts in my upcoming post.