Blob Storage

Let us say we are designing an Uber-like system that handles cab booking, cab rental, and many other services.

Here the consumer communicates with the driver by sending text or calling or, as a last resort, by sharing an image of a nearby spot to show where he or she wants to go. The driver can then reach the consumer either through the online call service or by using the shared image, so the system needs somewhere to store these images.

Wherever a system holds images and videos, the store that holds them is not usually called a database. It is referred to as blob storage, because the files are put into it directly as whole objects. As a result, no queries can be run over their contents.

The figure below depicts the system architecture of the Uber app, showcasing its data stores and giving an overview of the database design:

System Design of the Uber App

Note: Amazon S3 is one of the most widely used services for this kind of data storage, also known as blob storage.

CDN (Content Delivery Network): Images and videos stored in a datastore (Amazon S3) need to be distributed across servers around the globe, because a huge number of users request the same files. This is comparable to how databases are distributed according to geographical location.

Blob = S3(Datastore) + CDN(Content Delivery Network) 
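
The relationship between the datastore and the CDN can be sketched in a few lines of Python. This is only an in-memory illustration, not the Amazon S3 or any CDN API: the `BlobStore` and `CdnEdge` classes, the object key, and the image bytes are all hypothetical. It shows that a blob is written and read as a whole object under a key, and that an edge server fetches it from the origin once and then serves repeat requests from its own cache.

```python
class BlobStore:
    """Stand-in for an origin datastore: opaque bytes stored under a key."""

    def __init__(self):
        self._objects = {}
        self.reads = 0  # counts how often the origin is actually hit

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class CdnEdge:
    """Stand-in for one CDN edge server that caches blobs near its users."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            # Cache miss: fetch the blob from the origin exactly once.
            self._cache[key] = self._origin.get(key)
        # Cache hit: the origin is not contacted.
        return self._cache[key]


origin = BlobStore()
origin.put("rides/42/pickup.jpg", b"hypothetical image bytes")

edge = CdnEdge(origin)
for _ in range(1000):  # many users requesting the same image
    edge.get("rides/42/pickup.jpg")

print(origin.reads)  # prints 1
```

Note that the only lookup available is by key; nothing in the sketch can filter or query by what is inside the image.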

Text Search Engine Capabilities: If we want users to interact with the service, we need to let them search by text, such as a title and description. Search engine capabilities of this kind are seen in many system architectures, such as Google Maps, Amazon Prime, and many more that we can easily think of. They are commonly provided by Elasticsearch and Solr, both of which are built on top of Apache Lucene.
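
Lucene-based engines answer text queries from an inverted index, which maps each term to the documents that contain it. The following is a minimal sketch of that idea in plain Python, not the Elasticsearch or Solr API. The documents, field values, and the `tokenize`, `build_index`, and `search` helpers are assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every term in the title and description to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in ("title", "description"):
            for term in tokenize(doc[field]):
                index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents with the two searchable fields.
docs = {
    1: {"title": "Airport cab", "description": "Fixed fare ride to the airport"},
    2: {"title": "City rental", "description": "Rent a cab by the hour"},
}
index = build_index(docs)

print(search(index, "airport ride"))  # {1}
print(search(index, "cab"))           # {1, 2}
```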

Now think about what happens if the user misspells words while searching. A text search engine handles this as follows:

Fuzzy Search: It is a technique that works alongside text search. If the user misspells a word while typing and we return no results for the query, it leads to a bad user experience. With fuzzy search, we make the search smart enough to return the closest matches instead of showing nothing.
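
One common way to implement this is to match the typed word against known terms by edit distance, that is, the number of single-character insertions, deletions, and substitutions needed to turn one word into the other. The sketch below assumes a small hypothetical vocabulary and a threshold of 2 edits. Real engines use more elaborate machinery, so treat it as an illustration of the idea rather than how any particular product works.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_lookup(word, vocabulary, max_distance=2):
    """Return vocabulary terms within max_distance edits of word, closest first."""
    word = word.lower()
    scored = sorted((edit_distance(word, term), term) for term in vocabulary)
    return [term for dist, term in scored if dist <= max_distance]


# Hypothetical set of indexed terms.
vocabulary = ["airport", "station", "stadium", "hospital"]

print(fuzzy_lookup("airprot", vocabulary))  # ['airport']
print(fuzzy_lookup("hosptal", vocabulary))  # ['hospital']
```

A misspelled query such as "airprot" still finds "airport", so the user sees a result instead of an empty page.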

Now let us consider a sample system design, be it Google or Amazon, where we want to store data for analytics on all the transactions, according to the system's current or ongoing requirements. We handle this differently: we keep one large store over all the databases, which is commonly known as a data warehouse.

Data Warehousing: All the data is dumped into one store so that it can serve various querying capabilities for generating reports. Data warehousing is generally computed over offline data rather than online data.

Example: Hadoop 
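
The kind of offline report a warehouse produces can be sketched as a batch job over exported records. The example below is plain Python written in a map-and-reduce style; it is not Hadoop code, and the transaction records, field names, and the `map_phase` and `reduce_phase` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical transaction records exported from the online databases.
transactions = [
    {"city": "Delhi", "fare": 250.0},
    {"city": "Mumbai", "fare": 400.0},
    {"city": "Delhi", "fare": 150.0},
]


def map_phase(records):
    """Emit one (city, fare) pair per transaction."""
    for record in records:
        yield record["city"], record["fare"]


def reduce_phase(pairs):
    """Sum the fares for each city."""
    totals = defaultdict(float)
    for city, fare in pairs:
        totals[city] += fare
    return dict(totals)


report = reduce_phase(map_phase(transactions))
print(report)  # {'Delhi': 400.0, 'Mumbai': 400.0}
```

The job reads a snapshot of the data rather than the live databases, which matches the offline nature of warehousing described above.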
