Availability in System Design

Availability is the percentage of time a system is up and able to serve its users.

Availability is a critical factor for tech companies when designing systems that provide services. For example, a Meta outage lasting about 6 hours was reported to correspond to an estimated loss of 60 million dollars.

The level of availability required depends on the service the system offers. For instance, air traffic control requires a higher level of availability than a restaurant reservation system.

How is availability measured?

You may be wondering what these levels are and how they are measured. Availability levels are expressed as a number of ‘nines’ (90%, 99%, 99.9%, and so on) and measured as downtime per year. More ‘nines’ mean less downtime.

The table below shows the downtime per year for each level:

Availability (%)    Downtime per year
90                  ~36.5 days
99                  ~3.65 days
99.9                ~8.76 hours
99.99               ~52.6 minutes
99.999              ~5.26 minutes

Note: Five nines (99.999%) is considered the gold standard of availability.
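The downtime figures for each level follow from simple arithmetic: downtime is the unavailable fraction of the year. The Python sketch below illustrates this, assuming a 365-day year; the function name `downtime_minutes_per_year` is hypothetical and chosen only for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime per year (in minutes) implied by an availability level."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


# Print the downtime for each level of 'nines'.
for pct in (90, 99, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available -> {minutes:.2f} minutes of downtime per year")
```

Dividing the result by 60 gives hours and by 1,440 gives days, so each extra nine cuts the allowed downtime by a factor of ten.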

How to increase Availability?

  1. Eliminate single points of failure (SPOF); this is the most important step
  2. Verify that automatic failover works
  3. Use geographic redundancy
  4. Keep upgrading and improving the system
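As an illustration of automatic failover, the Python sketch below tries a list of backends in order and returns the first successful response. It is a minimal sketch, not a production design: `call_with_failover`, `primary`, and `replica` are hypothetical names, and a failed backend is assumed to raise `ConnectionError`.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(backends: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each backend in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend(request)
        except ConnectionError as err:
            # This backend is unavailable; remember the error and try the next one.
            last_error = err
    raise RuntimeError("all backends failed") from last_error


def primary(request: str) -> str:
    # Hypothetical primary server that is currently down.
    raise ConnectionError("primary is down")


def replica(request: str) -> str:
    # Hypothetical redundant server that is still healthy.
    return f"replica handled {request}"


print(call_with_failover([primary, replica], "ping"))
```

Because the replica takes over without manual intervention, the primary is no longer a single point of failure for this request path. Real systems typically add health checks, timeouts, and replicas in different regions.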

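From the above, we can draw two general conclusions:

  1. Availability tends to be lower in a monolithic architecture, because a single point of failure can take down the whole system.
  2. Availability tends to be higher in a distributed architecture, because redundancy lets other components take over when one fails.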
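The effect of a single point of failure versus redundancy can be estimated with basic probability. The Python sketch below assumes that components fail independently, which is a simplification, and uses hypothetical function names. Components that must all work (a chain) multiply their availabilities; redundant replicas fail only when every replica fails.

```python
from math import prod
from typing import Iterable


def serial_availability(components: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(components)


def parallel_availability(replicas: Iterable[float]) -> float:
    """Availability of redundant replicas in which at least one must be up."""
    return 1 - prod(1 - a for a in replicas)


# Two components, each 99% available (availabilities given as fractions).
chain = serial_availability([0.99, 0.99])
redundant = parallel_availability([0.99, 0.99])

print(f"Both required: {chain:.4%}")
print(f"Either suffices: {redundant:.4%}")
```

Under these assumptions, chaining two 99% components lowers availability to about 98.01%, while running them redundantly raises it to about 99.99%.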
