Scaling your app: load balancing

Draft Disclaimer: Please note that this article is currently in draft form and may undergo revisions before final publication. The content, including information, opinions, and recommendations, is subject to change and may not represent the final version. We appreciate your understanding and patience as we work to refine and improve the quality of this article. Your feedback is valuable in shaping the final release.

I might be repeating what was already well illustrated at https://samwho.dev/load-balancing/?ref=bcd.dev

Why

1 person accessing your website vs 1k vs 100k vs 1M vs 100M
single server, single point of failure
increase availability: replication
reduce failure risk
how do you know how many servers you need?
- determine
  - application load per user
  - max users capacity
  - daily users count
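
The three inputs above can be combined into a rough server count. This is only a minimal sketch: the function name, the headroom factor, and the example numbers are assumptions for illustration, not measurements from a real application.

```python
import math


def servers_needed(peak_users, load_per_user, capacity_per_server,
                   headroom=0.8, minimum=2):
    """Estimate how many application servers a traffic peak needs.

    load_per_user and capacity_per_server must share a unit (for example
    requests per second). headroom is the fraction of each server we are
    willing to use; every value here is an assumption to adjust.
    """
    usable_capacity = capacity_per_server * headroom
    required = math.ceil(peak_users * load_per_user / usable_capacity)
    # Never go below the minimum, so that one failure does not take the app down.
    return max(minimum, required)


# Hypothetical numbers: 10,000 users at peak, 0.5 requests/s each,
# and servers that each handle 1,000 requests/s.
print(servers_needed(10_000, 0.5, 1_000))  # prints 7
```

The minimum of 2 mirrors the point about single points of failure: even a tiny app gets a second server.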

When

the system needs redundancy, more capacity, or higher availability
vertical scaling is no longer cutting it

Requirements

servers
- minimum 2 application servers
- 1 persistence server
- 1 load balancer

What

layer 4 vs layer 7
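see the OSI model: layer 4 is the transport layer (TCP/UDP), layer 7 is the application layer (HTTP)
autoscale up and down based on live traffic
- up when a server is running at 80% of its capacity
- down when a server is running below 30% of its capacity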
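
The two thresholds above can be read as a simple decision rule. The sketch below assumes the decision is taken on the average utilization across the pool and that servers are added or removed one at a time; both choices, and the function name, are assumptions, since real autoscalers let you configure this differently.

```python
def scaling_decision(utilizations, scale_up_at=0.80, scale_down_at=0.30,
                     minimum=2):
    """Return the desired server count given each server's utilization (0..1)."""
    current = len(utilizations)
    average = sum(utilizations) / current
    if average >= scale_up_at:
        return current + 1  # pool is hot: add one server
    if average < scale_down_at and current > minimum:
        return current - 1  # pool is idle: remove one server
    return current  # in between: leave the pool alone


print(scaling_decision([0.85, 0.90]))        # prints 3
print(scaling_decision([0.10, 0.20, 0.15]))  # prints 2
print(scaling_decision([0.50, 0.60]))        # prints 2
```

The gap between 30% and 80% keeps the pool from flapping up and down on small traffic changes.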

How

Considerations

sessions, cache, database, and storage need their own instance, shared across the replicas
each of these services can itself use the load balancing techniques described here. Example: a database running as 3 servers, 1 main and 2 replicas, leveraging separate read/write connections
with autoscaling, servers are deleted on a regular basis. We would not want to put anything on them that should not be deleted (i.e. database, cache)
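
To make the 1 main, 2 replicas example concrete, here is a minimal sketch of read/write splitting. The class name and host names are hypothetical, and treating only statements that start with SELECT as reads is a simplifying assumption: reads that must see the latest write may still need to go to the main server.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the main server and spread reads over the replicas."""

    def __init__(self, main, replicas):
        self.main = main
        self._replicas = itertools.cycle(replicas)  # round robin over replicas

    def route(self, sql):
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        return next(self._replicas) if is_read else self.main


router = ReadWriteRouter("db-main", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM users"))           # prints db-replica-1
print(router.route("INSERT INTO users VALUES (1)"))  # prints db-main
print(router.route("select * from posts"))           # prints db-replica-2
```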

Load balancing algorithm

dynamic
- least connections
  - client is routed to the server with the fewest active connections
  - requires keeping track of connections
- least time
  - based on latency: server response time
  - requires keeping track of response time
static
- hash
  - hash(ip or url)
  - the same hash result always maps to the same server
- round robin
  - simplest, easy to understand
  - equal distribution
  - not ideal for long-running user sessions
  - variations:
    - sticky
    - weighted
      - see weight as probability
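
The algorithms listed above can each be sketched in a few lines. This is an illustration under assumptions, not how any particular load balancer implements them: the server names are hypothetical and the connection counts are passed in by hand, whereas a real balancer tracks them itself.

```python
import hashlib
import itertools
import random

SERVERS = ["app1", "app2", "app3"]  # hypothetical pool


def round_robin(servers):
    """Static: hand out servers in order, forever."""
    return itertools.cycle(servers)


def weighted_choice(weights, rng=random):
    """Static: treat each weight as a probability of being picked."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


def hash_pick(key, servers):
    """Static: the same key (client IP or URL) always lands on the same server."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def least_connections(active):
    """Dynamic: pick the server with the fewest active connections."""
    return min(active, key=active.get)


rr = round_robin(SERVERS)
print([next(rr) for _ in range(4)])  # prints ['app1', 'app2', 'app3', 'app1']
print(hash_pick("203.0.113.7", SERVERS) == hash_pick("203.0.113.7", SERVERS))  # prints True
print(least_connections({"app1": 3, "app2": 1, "app3": 2}))  # prints app2
print(weighted_choice({"app1": 3, "app2": 1}))  # app1 about 75% of the time
```

Note that the simple modulo in hash_pick remaps most keys whenever the pool size changes, which matters if servers come and go with autoscaling.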

With Nginx round robin on a pool of servers running the same code

install nginx on all web servers (load balancer included)
deploy the code on the application servers (i.e. nodes 1 and 2)

# Load balancer config
# Round robin is nginx's default when no balancing method is specified
upstream app {
  server public_ip1_here:80;
  server public_ip2_here:80;
  # Faster to use private IPs when the load balancer and nodes are on the same private network
  # server private_ip1_here:80;
  # server private_ip2_here:80;
}

# TODO: force https server block

server {
  listen 80 default_server;
  server_name lb.domain;
  charset utf-8;

  # TODO: add ssl block

  # For certificate renewals
  location /.well-known {
    root /var/www/html;
    try_files $uri $uri/ =404;
  }

  location / {
    # Pass the original host and client details on to the application servers
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://app;
    proxy_redirect off;

    # Handle websocket connections
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
  }
}

Traffic between the load balancer and the application servers is not encrypted because the latter are not reachable from the public internet. If they were, it absolutely should be. This also means that only the load balancer needs HTTPS: the load balancer does the SSL termination.

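TODO: add to Laravel nginx config

# Let certificate challenges through before the dotfile rule below applies
location ^~ /.well-known/acme-challenge {
  allow all;
}

# Deny access to hidden files and directories (anything starting with a dot)
location ~* (?:^|/)\. {
  deny all;
}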

Additional steps if you are using Laravel

trusted proxies? (so that Laravel honors the X-Forwarded-* headers set by the load balancer)

Database: MySQL bind address

localhost
private network IP, so that hosts on the same network can access it

	create user app_user@'10.0.%' identified by 'securepassword';
	grant all privileges on app_db.* to app_user@'10.0.%';

Cache: Redis

bind to the private address, for the same reason as above

Security considerations

application servers should not be accessible over the internet
- port 22 (SSH) from my IP only
- port 80 from the private network
persistence server
- SSH port
load balancer
- HTTP
- HTTPS
- SSH

AWS CodeDeploy

deploy to many servers

deployment with AWS

EC2
- Elastic Compute Cloud
- deployment - Load balancer -> Target groups -> Auto scaling group -> EC2 instances
ECS

Conclusion

Application metrics

response time (min, max, avg)
requests a day
bandwidth a day (in, out)

Based on these application metrics, we can predict the needs of our application.

As with any service, you can go with a managed, done-for-you option:

Google Cloud Load Balancing
AWS Elastic Load Balancing