Scaling your app: load balancing

Draft Disclaimer: Please note that this article is currently in draft form and may undergo revisions before final publication. The content, including information, opinions, and recommendations, is subject to change and may not represent the final version. We appreciate your understanding and patience as we work to refine and improve the quality of this article. Your feedback is valuable in shaping the final release.


I might be repeating what was already well illustrated elsewhere:


  • 1 person accessing your website vs 1k vs 100k vs 1M vs 100M
  • single server, single point of failure
  • increase availability: replication
  • reduce failure risk
  • how do you know how many servers you need
    • determine
      • application load per user
      • maximum user capacity per server
      • daily user count


  • the system can use redundancy for capacity and availability
  • vertical scaling is no longer cutting it


  • servers
    • minimum 2 application servers
    • 1 persistence server
    • 1 load balancer


  • layer 4 vs layer 7
  • see the OSI network layers
  • auto scale up and down based on live traffic
    • up when servers run above 80% utilization
    • down when servers run below 30% utilization



  • sessions, cache, database, storage need their own instance shared across the replicas
  • each server can use the same load balancing techniques described here. Example: database with 3 servers, 1 main and 2 replicas, leveraging read/write connections
  • with autoscaling, servers are created and deleted on a regular basis. We would not want to put anything on them that should not be deleted (ie database, cache)
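
Since a later section assumes Laravel, here is one way the main/replica read/write split could be sketched in `config/database.php`; the hosts, database name, and credentials are placeholders, not values from this setup:

```php
// config/database.php (sketch; IPs are placeholders for your private network)
'mysql' => [
    'driver' => 'mysql',
    // reads are spread over the replicas, writes go to the main server
    'read' => [
        'host' => ['10.0.0.11', '10.0.0.12'],
    ],
    'write' => [
        'host' => ['10.0.0.10'],
    ],
    // re-use the write connection for reads in the same request,
    // so a request can read its own writes before replication catches up
    'sticky' => true,
    'database' => 'app_db',
    'username' => 'app_user',
    'password' => env('DB_PASSWORD'),
],
```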

Load balancing algorithms

  • dynamic
    • least connections
      • client routed to the server with the fewest active connections
      • requires keeping track of connections
    • least time
      • based on latency: server response time
      • requires keeping track of response times
  • static
    • hash
      • hash(ip or url)
      • same server for the same input
    • round robin
      • simplest, easy to understand
      • equal distribution
      • not ideal for long-running user sessions
      • variations:
        • sticky
        • weighted
          • see the weight as a probability
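
Most of these map directly onto Nginx upstream directives. A sketch with placeholder addresses (note that least time, via `least_time`, is only available in the commercial NGINX Plus):

```nginx
# Dynamic: least connections
upstream app_least_conn {
  least_conn;
  server 10.0.0.11:80;
  server 10.0.0.12:80;
}

# Static: hash of the client ip, which also makes clients sticky to a server
upstream app_ip_hash {
  ip_hash;
  server 10.0.0.11:80;
  server 10.0.0.12:80;
}

# Static: weighted round robin, roughly 3 out of every 4 requests
# go to the first server
upstream app_weighted {
  server 10.0.0.11:80 weight=3;
  server 10.0.0.12:80 weight=1;
}
```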

With Nginx round robin on a pool of servers running the same code

  • install nginx on all web servers (load balancer included)
  • deploy the code on the application servers (ie node 1 and node 2)
# Load balancer config
upstream app {
  server public_ip1_here:80;
  server public_ip2_here:80;
  # Faster to put private ips when load balancer and nodes are on the same private network
  # server private_ip1_here:80;
  # server private_ip2_here:80;
}

# TODO: force https server block

server {
  listen 80 default_server;
  server_name lb.domain;
  charset utf-8;
  # TODO: add ssl block

  # For certificate renewals
  location /.well-known {
    root /var/www/html;
    try_files $uri $uri/ =404;
  }

  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://app;
    proxy_redirect off;
    # Handle websocket connections
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
  }
}

We are not encrypting traffic between the load balancer and the application servers because the latter are not reachable from the public internet. If they were, we absolutely would. This also means only the load balancer needs https, with the load balancer doing ssl termination.

TODO: add to laravel nginx config

location ^~ /.well-known/acme-challenge {
  allow all;
}

location ~* (?:^|/)\. {
  deny all;
}

Additional steps if you are using Laravel

  • trusted proxies: configure the TrustProxies middleware so Laravel honours the X-Forwarded-* headers set by the load balancer

Database: MySQL, bind to

  • localhost
  • the private network ip so that hosts on the same network can access it
	CREATE USER app_user@'10.0.%' IDENTIFIED BY 'securepassword';
	GRANT ALL PRIVILEGES ON app_db.* TO app_user@'10.0.%';
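
Assuming a Debian/Ubuntu style MySQL install, the bind address lives in the server config; the path and the ip below are placeholders for your own persistence server:

```ini
# /etc/mysql/mysql.conf.d/mysqld.cnf (path varies by distribution)
[mysqld]
# Listen on the private network interface instead of localhost only
bind-address = 10.0.0.20
```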

Cache: Redis

  • bind to the private address for the same reason as above
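
A sketch of the corresponding Redis settings; the path and ip are placeholders:

```conf
# /etc/redis/redis.conf (path varies by distribution)
# Listen on localhost and the private interface
bind 127.0.0.1 10.0.0.20
# Require a password since other hosts on the network can now connect
requirepass securepassword
```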

Security considerations

  • application servers should not be accessible over the internet
    • port 22 (ssh) from my ip only
    • port 80 (http) from the private network only
  • persistence server
    • ssh port only
  • load balancer
    • http
    • https
    • ssh
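
On Ubuntu these rules can be sketched with ufw on an application server; the admin ip and subnet below are placeholders for your own network:

```shell
# Default deny, then open only what is needed
sudo ufw default deny incoming
sudo ufw default allow outgoing
# ssh from my own ip only (placeholder)
sudo ufw allow from 203.0.113.5 to any port 22 proto tcp
# http from the private network only (placeholder subnet)
sudo ufw allow from 10.0.0.0/24 to any port 80 proto tcp
sudo ufw enable
```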

AWS CodeDeploy

  • deploy to many servers

Deployment with AWS

  • EC2
    • Elastic Compute Cloud
    • deployment: Load balancer -> Target groups -> Auto scaling group -> EC2 instances
  • ECS


Application metrics

  • response time (min, max, avg)
  • requests per day
  • bandwidth per day (in, out)

Based on these metrics we can predict our application's needs.

As with any service, you can go with a managed / done-for-you option:

  • Google Cloud Load Balancing
  • AWS Elastic Load Balancing