A load balancer is the single point of entry that receives all traffic coming into your application.
Elastic Load Balancing (ELB) helps you spread traffic across multiple EC2 instances, so no single server is overwhelmed.
We've solved the scaling problem with Amazon EC2 Auto Scaling, but a separate problem remains: how incoming traffic is shared among those instances.
Let's have a look at what this means.
Imagine you're at a bustling pizza party, and there are three different tables where you can grab a slice. Strangely, most partygoers are crowding around one table, so the number of people at each table is wildly uneven. The other two tables have plenty of pizza left, but guests arriving at the party often look confused, not sure which table to head to for a delicious slice.
It would be a lot smoother if we introduced a "Pizza Party Host."
This Host stands near the entrance, welcoming guests to the pizza party.
They carefully watch the pizza tables, taking count of how many people are around each one.
When new guests arrive, the Host guides them to the table with the shortest line, making sure all the tables are equally busy.
Everyone gets their pizza quickly and efficiently.
In your AWS environment, picture multiple servers running the same software to serve incoming requests. How does a new request know which server to go to? You don't want one server overwhelmed while the others sit unused.
You need a way to evenly distribute incoming requests among your servers. This solution is what we call load balancing. Just like the "Pizza Party Host" ensures fair pizza distribution, load balancing ensures fair work distribution among your servers.
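To make the Host's rule concrete, here is a minimal Python sketch of "send each new arrival to the least busy table." It is a toy model of the analogy, not how ELB is implemented; the function names and the table counts are made up for illustration, and the routing algorithm a real load balancer uses depends on how it is configured.

```python
def pick_least_busy(active_requests):
    """Return the name of the server with the fewest requests in progress."""
    return min(active_requests, key=active_requests.get)


def distribute(active_requests, new_requests):
    """Assign each new request to whichever server is least busy at that moment.

    Returns a new dict of counts; the input dict is left unchanged.
    """
    counts = dict(active_requests)
    for _ in range(new_requests):
        counts[pick_least_busy(counts)] += 1
    return counts


# Hypothetical starting point: one crowded table and two quiet ones.
tables = {"table-a": 9, "table-b": 1, "table-c": 2}

# Fifteen more guests arrive, and the Host guides each one in turn.
balanced = distribute(tables, 15)
print(balanced)  # {'table-a': 9, 'table-b': 9, 'table-c': 9}
```

No new guest is sent to the crowded table until the other two have caught up, which is exactly what the Host at the entrance was doing.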
Elastic Load Balancing (ELB)
Elastic Load Balancing is the AWS service that automatically distributes incoming application traffic across multiple resources, such as Amazon EC2 instances.
ELB often works hand in hand with EC2 Auto Scaling groups:
When more traffic comes in, EC2 Auto Scaling automatically adds new instances and lets ELB know once they're ready to handle traffic. ELB then starts distributing requests to the new servers too.
Once traffic drops and the EC2 fleet scales in, ELB first stops sending new requests to the instances that are about to be removed, and waits for their existing requests to complete. Once that's done, EC2 Auto Scaling can terminate those instances without disrupting existing customers.
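The scale-out and scale-in handshake described above can be sketched as a small simulation. This is a simplified model under stated assumptions: `ToyLoadBalancer`, its methods, and the instance IDs are hypothetical names invented for this example, not part of any AWS API.

```python
class ToyLoadBalancer:
    """Toy model of registering instances and draining them before removal."""

    def __init__(self):
        self.in_flight = {}    # instance id -> number of requests in progress
        self.draining = set()  # instances that must not receive new requests

    def register(self, instance_id):
        """Scale out: a new instance is ready to take traffic."""
        self.in_flight.setdefault(instance_id, 0)
        self.draining.discard(instance_id)

    def route(self):
        """Send one new request to the least busy instance that is not draining."""
        candidates = [i for i in self.in_flight if i not in self.draining]
        if not candidates:
            raise RuntimeError("no instances available for new requests")
        target = min(candidates, key=self.in_flight.get)
        self.in_flight[target] += 1
        return target

    def finish(self, instance_id):
        """Mark one request on the given instance as completed."""
        self.in_flight[instance_id] -= 1

    def start_draining(self, instance_id):
        """Scale in: stop sending new requests, let existing ones complete."""
        self.draining.add(instance_id)

    def safe_to_terminate(self, instance_id):
        """True once a draining instance has no requests left in progress."""
        return instance_id in self.draining and self.in_flight[instance_id] == 0


lb = ToyLoadBalancer()
lb.register("i-old")
first = lb.route()            # goes to "i-old", the only instance

lb.register("i-new")          # scale out
lb.start_draining("i-old")    # scale in begins for the old instance
second = lb.route()           # new traffic now only reaches "i-new"

blocked = lb.safe_to_terminate("i-old")  # False: one request still running
lb.finish("i-old")
ready = lb.safe_to_terminate("i-old")    # True: nothing left to interrupt
```

The key design point is the order of operations: new traffic is cut off first, and termination only becomes safe after the in-progress count reaches zero.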
Key facts about ELB
If you're using ELB to distribute traffic to your EC2 instances, you don't have to install or maintain load balancing software on those instances yourself, because ELB is a managed service. This is a real time saver!
ELB automatically works across multiple instances, and if one instance fails, it redirects traffic to the healthy ones. Your application remains available even if some instances go down, providing a seamless experience to your app users.
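As a rough illustration of routing around a failed instance, the sketch below hands out requests in turn but only to instances marked healthy. The function name, instance IDs, and the `healthy` dictionary are assumptions for this example; in practice the health status would come from the load balancer's own checks rather than a hand-written dict.

```python
from itertools import cycle


def route_requests(instances, healthy, request_count):
    """Round-robin requests over the instances currently marked healthy."""
    targets = [i for i in instances if healthy[i]]
    if not targets:
        raise RuntimeError("no healthy instances to receive traffic")
    rotation = cycle(targets)
    return [next(rotation) for _ in range(request_count)]


instances = ["i-a", "i-b", "i-c"]

# Hypothetical health status: "i-b" has failed.
healthy = {"i-a": True, "i-b": False, "i-c": True}

assignments = route_requests(instances, healthy, 4)
print(assignments)  # ['i-a', 'i-c', 'i-a', 'i-c']
```

The failed instance simply never appears in the rotation, so users keep getting responses from the remaining two.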
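If heaps of new traffic comes in, ELB scales automatically to handle it. The hourly rate for the load balancer itself stays the same, though usage-based charges can grow with the amount of traffic.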
ELB isn't just used for external traffic (i.e. traffic that comes from the public). It can also handle traffic between different parts of your application.
For example, it can be the messenger between the front end and back end of a website. Front end = the pretty page you see when you load the website. Back end = the behind-the-scenes code that makes everything run smoothly.
All front end servers send their requests to ELB, then ELB directs traffic to the back end servers.
Now, when the number of front end or back end EC2 instances changes, ELB just needs to be told about the change, and it will handle the rest.
The front end doesn't know and doesn't care how many back end instances are running. This situation is called a decoupled architecture.
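A small sketch can show what decoupling buys you. Here the front end holds a reference to one object that stands in for the load balancer and never sees the back-end list. Every class and function name below is hypothetical and chosen only for illustration.

```python
class BackendPool:
    """Stands in for the load balancer: the only thing the front end talks to."""

    def __init__(self):
        self._backends = []
        self._next = 0

    def set_backends(self, backends):
        """Replace the set of back-end handlers, e.g. after scaling."""
        self._backends = list(backends)
        self._next = 0

    def handle(self, request):
        """Pass the request to the next back end in turn."""
        if not self._backends:
            raise RuntimeError("no back-end instances registered")
        backend = self._backends[self._next % len(self._backends)]
        self._next += 1
        return backend(request)


def make_backend(name):
    """Create a toy back-end handler that reports which instance did the work."""
    return lambda request: f"{name} handled {request}"


class FrontEnd:
    """Knows about the pool, not about individual back-end instances."""

    def __init__(self, pool):
        self.pool = pool

    def render(self, page):
        return self.pool.handle(page)


pool = BackendPool()
pool.set_backends([make_backend("backend-1")])
front_end = FrontEnd(pool)
before = front_end.render("/home")

# The back-end fleet is replaced and grows; the front end code is untouched.
pool.set_backends([make_backend("backend-2"), make_backend("backend-3")])
after = [front_end.render("/home"), front_end.render("/home")]
```

The `FrontEnd` class never changes when the back-end fleet does, which is the property the term "decoupled" is pointing at.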