How Netflix balances the high traffic load

1. Elastic Load Balancer

At Netflix, the Elastic Load Balancer (ELB) is responsible for routing traffic to the front-end services. It uses a two-tier load-balancing scheme in which load is balanced across zones first and then across instances (servers) within a zone.

  • The first tier consists of basic DNS-based round-robin balancing. When a request lands on the first load balancer (see the figure), it is sent to one of the zones that the ELB is configured to use, chosen by round robin.
  • The second tier is an array of load balancer instances that uses round robin to distribute requests across the instances behind it in the same zone.
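
The two tiers described above can be sketched in a few lines of Python. This is only an illustration of round robin over zones followed by round robin over instances. The class name, zone names, and instance names are hypothetical and do not reflect how ELB is actually implemented.

```python
from itertools import cycle


class TwoTierRoundRobin:
    """Round robin over zones first, then over the instances in the chosen zone."""

    def __init__(self, zones):
        # zones: dict mapping a zone name to its list of instance names (hypothetical)
        self._zone_cycle = cycle(zones)
        self._instance_cycles = {
            zone: cycle(instances) for zone, instances in zones.items()
        }

    def route(self):
        zone = next(self._zone_cycle)                 # tier 1: pick the next zone
        instance = next(self._instance_cycles[zone])  # tier 2: pick the next instance in it
        return zone, instance


balancer = TwoTierRoundRobin({
    "zone-a": ["a1", "a2"],
    "zone-b": ["b1", "b2"],
})
routes = [balancer.route() for _ in range(4)]
```

Each zone keeps its own rotation, so consecutive requests alternate between zones while every instance inside a zone receives an equal share of that zone's traffic.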


2. ZUUL

ZUUL is a gateway service that provides dynamic routing, monitoring, resiliency, and security. It supports routing based on query parameters, URL, and path. Let’s understand how its different parts work:

  • The Netty server handles the network protocol, web server, connection management, and proxying work. When a request hits the Netty server, it is proxied to the inbound filter.
  • The inbound filter is responsible for authenticating, routing, or decorating the request. It then forwards the request to the endpoint filter.
  • The endpoint filter either returns a static response or forwards the request to the backend service (the origin, as it is called).
  • Once the endpoint filter receives the response from the backend service, it passes that response to the outbound filter.
  • The outbound filter is used for compressing (zipping) the content, calculating metrics, or adding/removing custom headers. After that, the response is sent back to the Netty server, which returns it to the client.
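
The request flow above (inbound filter, endpoint filter, outbound filter) can be sketched as a chain of plain functions. This is a simplified Python illustration, not the ZUUL API. The `Request` and `Response` types, the header names, and the `ORIGINS` table are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str
    headers: dict = field(default_factory=dict)


# Hypothetical backend services ("origins"), keyed by route name.
ORIGINS = {
    "videos": lambda request: Response(200, "video list"),
    "default": lambda request: Response(404, "not found"),
}


def inbound_filter(request):
    """Authenticate the request and decorate it with a routing decision."""
    if "Authorization" not in request.headers:
        return None  # reject unauthenticated requests
    route = "videos" if request.path.startswith("/videos") else "default"
    request.headers["X-Route"] = route
    return request


def endpoint_filter(request):
    """Return a static response or forward the request to the chosen origin."""
    if request.path == "/health":
        return Response(200, "ok")
    return ORIGINS[request.headers["X-Route"]](request)


def outbound_filter(response):
    """Add a custom header before the response goes back to the client."""
    response.headers["X-Gateway"] = "sketch"
    return response


def handle(request):
    decorated = inbound_filter(request)
    if decorated is None:
        return outbound_filter(Response(401, "unauthorized"))
    return outbound_filter(endpoint_filter(decorated))
```

For example, `handle(Request("/videos/1", {"Authorization": "token"}))` passes through all three stages, while a request without an `Authorization` header is rejected by the inbound filter and never reaches an origin.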

Advantages of using ZUUL:

  • You can define rules that split the traffic, sending different parts of it to different servers.
  • Developers can load test newly deployed clusters by routing some of the existing traffic to them and checking how much load a specific server can bear.
  • You can test new services. When you upgrade a service and want to see how it behaves with real API requests, you can deploy it on one server and redirect part of the traffic to it.
  • You can filter out bad requests by setting custom rules at the endpoint filter or firewall.
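
One common way to redirect a fixed share of traffic to a newly deployed server is to hash a request identifier into a bucket. The sketch below assumes this technique for illustration. It is not taken from ZUUL, and the function name, the cluster labels, and the 10% share are hypothetical.

```python
import hashlib


def pick_cluster(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of requests to the new ("canary") cluster."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Route about 10% of 10,000 hypothetical requests to the canary cluster.
request_ids = [f"req-{n}" for n in range(10_000)]
canary_share = sum(
    pick_cluster(request_id, 10) == "canary" for request_id in request_ids
) / len(request_ids)
```

Because the bucket is derived from the identifier rather than chosen at random, the same request identifier always lands on the same cluster, which makes the behavior of the new service easier to compare against the existing one.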

3. Hystrix

In a complex distributed system, a server may rely on the response of another server. These dependencies can add latency, and because any server will inevitably fail at some point, a single failure can stop the entire system from working. To solve this problem, we can isolate the host application from these external failures.

The Hystrix library is designed to do this job. It helps you control the interactions between these distributed services by adding latency-tolerance and fault-tolerance logic. Hystrix does this by isolating the points of access between services, remote systems, and third-party libraries. The library helps to:

  • Stop cascading failures in a complex distributed system.
  • Control latency and failure from dependencies accessed (typically over the network) via third-party client libraries.
  • Fail fast and rapidly recover.
  • Fall back and gracefully degrade when possible.
  • Enable near real-time monitoring, alerting, and operational control.
  • Provide concurrency-aware request caching and automated batching through request collapsing.
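
The "fail fast" and "fall back" ideas from the list above are commonly implemented with a circuit breaker. The Python sketch below illustrates that general pattern under simplifying assumptions. It is not the Hystrix API (Hystrix is a Java library), and the class name, thresholds, and `clock` parameter are hypothetical.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a while and serve a fallback instead."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None                      # None means the circuit is closed

    def call(self, command, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                return fallback()  # circuit open: fail fast without calling the dependency
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = command()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            return fallback()                    # degrade gracefully
        self._failures = 0
        self._opened_at = None                   # success closes the circuit
        return result
```

A caller wraps each remote call, for example `breaker.call(fetch_recommendations, lambda: cached_recommendations)`, where both names are placeholders. While the circuit is open, the failing dependency is not contacted at all, which is what keeps one slow or broken service from cascading into its callers.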
