What is Sharding?

Sharding is a technique for distributing data across multiple resources, typically separate servers or nodes, according to a defined rule. The word “shard” means “a small part of a whole”, so sharding means dividing something large into smaller parts. In a DBMS, sharding is a type of database partitioning in which a large database is split into smaller pieces, called shards, that are stored on different nodes. Because each shard holds only a portion of the data, it is smaller, generally faster to query, and easier to manage than the original database.

Illustration: Consider two scenarios, one with no sharding and one with simple sharding, as illustrated below:

  • Case 1: No Sharding

[Figure: No sharding]

  • Case 2: Simple Sharding

[Figure: Simple sharding]

Need for Sharding

Consider a very large database that has not been sharded. For example, take a college database in which all student records (present and past) are maintained in a single database. It would contain a very large number of records, say 100,000. To find a student in this database, in the worst case (with no index to help) up to 100,000 records have to be examined for each lookup, which is very costly.

Sharding Of A Database

Now consider the same student records divided into smaller shards based on year. Each shard holds only around 1,000 to 5,000 student records. The database becomes much more manageable, and the cost of each lookup drops by a large factor, because a query only has to search the shard for the relevant year. This is why sharding is needed.
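The year-based split described above can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch: the sample records, the choice of admission year as the shard key, and the helper names `build_shards`, `find_unsharded`, and `find_sharded` are all assumptions made for this example, not part of any real database API.

```python
from collections import defaultdict

# Hypothetical student records: (student_id, name, admission_year).
STUDENTS = [
    (1, "Asha", 2019),
    (2, "Ben", 2019),
    (3, "Chen", 2020),
    (4, "Dina", 2021),
    (5, "Eli", 2021),
    (6, "Fay", 2021),
]


def build_shards(records):
    """Group records into shards keyed by admission year (the assumed shard key)."""
    shards = defaultdict(list)
    for record in records:
        shards[record[2]].append(record)
    return dict(shards)


def find_unsharded(records, student_id):
    """Scan the whole table; return (record, rows_examined)."""
    examined = 0
    for record in records:
        examined += 1
        if record[0] == student_id:
            return record, examined
    return None, examined


def find_sharded(shards, year, student_id):
    """Scan only the shard for the given year; return (record, rows_examined)."""
    examined = 0
    for record in shards.get(year, []):
        examined += 1
        if record[0] == student_id:
            return record, examined
    return None, examined


shards = build_shards(STUDENTS)
print(find_unsharded(STUDENTS, 6))    # examines all 6 rows
print(find_sharded(shards, 2021, 6))  # examines only the 3 rows in the 2021 shard
```

Note that the sharded lookup only helps when the query knows the shard key (here, the year). A query that does not know the year would still have to visit every shard.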

Features of sharding:

  1. Sharding makes each piece of the database smaller.
  2. Sharding can make queries faster, since each query searches less data.
  3. Sharding makes the database easier to manage.
  4. Sharding can be complex to implement and operate.
  5. Sharding reduces the cost of individual database operations.
  6. Each shard reads and writes its own data.
  7. Many NoSQL databases offer auto-sharding.
  8. The failure of one shard does not affect data processing on the other shards.
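Features 6 and 8 can be illustrated with a small simulation. The sketch below is a toy, in-memory model: the `Shard` and `Router` classes, the `ShardUnavailable` exception, and the `online` flag are hypothetical names invented for this example, and it assumes records are routed by year as in the college example.

```python
class ShardUnavailable(Exception):
    """Raised when a request targets a shard that is down."""


class Shard:
    """A hypothetical in-memory shard that owns its own data."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.online = True

    def write(self, key, value):
        if not self.online:
            raise ShardUnavailable(self.name)
        self.rows[key] = value

    def read(self, key):
        if not self.online:
            raise ShardUnavailable(self.name)
        return self.rows.get(key)


class Router:
    """Sends each request to the single shard responsible for the given year."""

    def __init__(self, shards):
        self.shards = shards  # maps admission year -> Shard

    def write(self, year, key, value):
        self.shards[year].write(key, value)

    def read(self, year, key):
        return self.shards[year].read(key)


router = Router({2020: Shard("shard-2020"), 2021: Shard("shard-2021")})
router.write(2020, 3, "Chen")
router.write(2021, 4, "Dina")

router.shards[2020].online = False  # simulate a failure of one shard

print(router.read(2021, 4))  # the 2021 shard still serves requests
try:
    router.read(2020, 3)
except ShardUnavailable as exc:
    print(f"unavailable: {exc}")
```

In this model the other shards keep working, but the data held by the failed shard is unavailable until it recovers. Real systems usually pair sharding with replication to cover that case.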


