Introduction
Scalability is an application's ability to handle an increased load without a decrease in performance or availability.
Types of Scaling
Vertical Scaling (Scaling Up)
Here, we add more resources (CPU, memory, storage) to the machine that runs the application so it can handle the increased traffic.
This is the easiest way to scale an application. Since the application's logic doesn't depend on the machine's resources, scaling up usually requires no changes to the application itself.
Scaling up alone is not enough, for the following reasons:
Machine resources (memory, CPU, and storage) are limited, so you can't scale up endlessly.
Powerful machines are expensive.
If your application is deployed to a single machine, it becomes unavailable whenever that machine goes down.
It's hard to deploy a new version of an application to a single machine without downtime.
Machine maintenance is impossible without application downtime.
Horizontal Scaling (Scaling Out)
Client-side Traffic Balancing
- Domain Name System: You can associate multiple IP addresses with a single domain name and balance traffic between these IPs using different algorithms (round-robin is the most popular).
- Client-side: The client needs to know the server addresses and balance traffic between them by itself. This usually involves a library, provided by the owners of the server application, that clients can use, since clients are unlikely to implement traffic balancing on their own.
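To make the round-robin idea concrete, here is a minimal sketch of what a client-side balancing library might do. The class name `RoundRobinBalancer`, the method `next_server`, and the server addresses are hypothetical and chosen only for illustration; a real library would also need to deal with failed servers and changes to the address list.

```python
import itertools
from typing import Iterable, Iterator


class RoundRobinBalancer:
    """Hands out server addresses from a fixed list, one after another."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server address is required")
        # itertools.cycle repeats the list endlessly in the same order.
        self._cycle: Iterator[str] = itertools.cycle(self._servers)

    def next_server(self) -> str:
        """Return the address the next request should be sent to."""
        return next(self._cycle)


# Hypothetical addresses, for illustration only.
balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
picks = [balancer.next_server() for _ in range(4)]
print(picks)
```

After the third request the balancer wraps around to the first address again, so each server receives roughly the same number of requests.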
Server-side Traffic Balancing
Load Balancer: A server that distributes incoming traffic across multiple servers. There are both hardware and software load balancers. Popular software load balancers include:
NGINX: Web server and reverse proxy that can be used as a load balancer.
HAProxy: Free, open-source, reliable, high-performance TCP/HTTP load balancer.
Apache HTTP Server: Web server that can be used as a load balancer.
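The core decision a server-side load balancer makes can be sketched in a few lines. This is not how NGINX, HAProxy, or Apache HTTP Server are implemented; it is a simplified illustration with hypothetical names (`Backend`, `LoadBalancer`, `pick`) and made-up addresses. It also assumes the balancer tracks a `healthy` flag per server, which is an addition for the sake of the example, so that traffic keeps flowing when one machine goes down.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    """One application server behind the load balancer (hypothetical model)."""
    address: str
    healthy: bool = True


class LoadBalancer:
    """Round-robin over backends, skipping any that are marked unhealthy."""

    def __init__(self, backends: List[Backend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        self._next = 0  # index of the backend to try first on the next pick

    def pick(self) -> Backend:
        # Try each backend at most once, starting where the last pick stopped.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backend available")


# Hypothetical pool; the second server is simulated as being down.
pool = [Backend("app-1:8080"), Backend("app-2:8080"), Backend("app-3:8080")]
pool[1].healthy = False

lb = LoadBalancer(pool)
chosen = [lb.pick().address for _ in range(4)]
print(chosen)
```

Because clients only talk to the load balancer, a failed or newly added server changes the pool without the clients needing to know about it.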