Network Load Balancer
HAProxy

A load balancer for TCP- and HTTP-based applications that spreads requests across multiple servers.

May 17, 2021    Avenash Kumar

Introduction

As organizations migrate from their monolithic applications to microservices, the way applications are scaled, secured, and managed has changed significantly. HAProxy, which stands for High Availability Proxy, is a software load balancer and application delivery controller. It comes with all the ingredients that make an application more secure and reliable, including built-in rate limiting, anomaly detection, connection queuing, health checks, and detailed logs and metrics. This article covers the installation and configuration of HAProxy for balancing L4 and L7 network traffic. The configurations described here can be tweaked and tuned for most load balancing requirements. The setup is simplified from a typical production deployment: a single-node HAProxy server, along with five application server instances (running inside Docker containers on the same host) that serve the requests forwarded from HAProxy.

Before You Begin

1) This article uses the terms L4 and L7 wherever possible. These refer to layer 4 and layer 7 of the OSI model, respectively. When discussing load balancing in the industry today, solutions are often bucketed into the L4 and L7 categories. An L4 load balancer is generally unaware of application details, as it deals only with TCP/UDP packets. An L7 load balancer, in contrast, can make complex traffic-balancing decisions, as it deals with the actual message content.

2) This guide uses the term application server in various sections. It refers to any web server, API server, or other server that serves user requests.

Figure: Application servers without a load balancer

Why do we need a network load balancer?

As the name suggests, a load balancer maintains network traffic equilibrium across the available application or web servers. A single application or web server alone is generally not robust enough to handle extreme network traffic during the busiest times of the year.

Moreover, if that server goes down for any reason, end users can no longer access the application. A load balancer brings flexibility, reliability, and availability, and is considered an essential component of any web application, simple or complex.

Figure: Application servers with a load balancer

HAProxy can handle enormous volumes of network traffic with very little resource usage. High-traffic websites like Twitter, Facebook, Instagram, and Reddit serve millions of requests per second through HAProxy.

You don't have to work at such IT giants to justify using a network load balancer. Even a simple self-hosted web application, site, or API needs a load balancer to ensure availability, scalability, and security. It is a necessity rather than a nice-to-have!

To get started, I have used a single-node environment to spin up the HAProxy server and the Spring Boot application servers through Docker. HAProxy will distribute incoming traffic among the configured application servers. In the later part, we will see how to limit end users to specific endpoints of an application through ACLs (Access Control Lists).

Installation

HAProxy is included in the package management systems of most Linux distributions, as well as in Homebrew on macOS:

macOS:

brew install haproxy

Ubuntu 18.04:

sudo apt-get install haproxy

RHEL 8/CentOS 8:

sudo yum install haproxy
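
Whichever method you used, you can confirm that HAProxy is installed and on your PATH by printing its version with the -v flag:

haproxy -v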

Start Application Servers

HAProxy forwards incoming requests to the application servers for processing; it doesn't serve application traffic by itself. For this exercise I have created a Spring Boot application, which exposes four endpoints: /endpoint1, /endpoint2, /admin, and /. The complete source code and snippets for this article are available in my GitHub project.

For ease and simplicity, Docker containers are used to spin up five instances:

docker run -p 8081:80 -d avenashkumar/haproxy-demo-spring-boot-app --appId="Server 1"
docker run -p 8082:80 -d avenashkumar/haproxy-demo-spring-boot-app --appId="Server 2"
docker run -p 8083:80 -d avenashkumar/haproxy-demo-spring-boot-app --appId="Server 3"
docker run -p 8084:80 -d avenashkumar/haproxy-demo-spring-boot-app --appId="Server 4"
docker run -p 8085:80 -d avenashkumar/haproxy-demo-spring-boot-app --appId="Server 5" 

Each application server prints a message (such as "Server 1: Hello from Endpoint1"). In practice, you can think of these as actual application servers running on Kubernetes Pods or VMs.

Assuming your Docker containers are running, you should be able to access localhost:8081/endpoint1.
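
To sanity-check the containers before involving HAProxy, you can hit each mapped port directly; the expected responses below assume the message format mentioned above:

curl http://localhost:8081/endpoint1
# Server 1: Hello from Endpoint1
curl http://localhost:8085/endpoint1
# Server 5: Hello from Endpoint1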

HAProxy - Basic Configurations

At a bare minimum, HAProxy requires two configuration sections: 1) a frontend and 2) a backend. The frontend section defines how HAProxy listens for connections; the backend section defines the application servers to which HAProxy forwards requests.

Frontend Configurations:

frontend www.myapp.com
        bind *:8080
        timeout client 60s
        default_backend all_app_servers

frontend: This marks the start of a frontend configuration section. You may have more than one frontend section in your configuration file, depending on which web servers you want to expose to the internet. Each frontend requires a name, such as www.myapp.com, to differentiate it from the others.
bind: This directive assigns a listener to a given IP address and port. The IP can be omitted to bind to all IP addresses on the server. The port can be a single port, a range, or a comma-delimited list.
timeout client: This setting measures inactivity during periods when we would expect the client to be speaking. The "s" suffix denotes time in seconds.
default_backend: This setting can be found in almost every frontend section. It takes the name of a backend to send traffic to (which is why this name must match the name of a backend section). It can also be combined with use_backend rules; if a request is routed by neither a use_backend nor a default_backend directive, HAProxy returns a 503 Service Unavailable error.

Backend Configurations:

backend all_app_servers
        timeout connect 10s
        timeout server 100s
        server server1 127.0.0.1:8081
        server server2 127.0.0.1:8082
        server server3 127.0.0.1:8083
        server server4 127.0.0.1:8084
        server server5 127.0.0.1:8085

backend: Like frontend, this marks the start of a backend configuration section, where we define the group of servers that will be load balanced and assigned requests to serve. It also requires a name that uniquely identifies it when multiple backend sections are present.
timeout connect: The time that HAProxy will wait for a TCP connection to a backend server to be established. The "s" suffix denotes seconds; without a suffix, the time is assumed to be in milliseconds.
timeout server: This setting measures inactivity when we would expect the backend server to be speaking. When the timeout expires, the connection is closed. Sensible timeouts reduce the risk of deadlocked processes tying up connections that could otherwise be reused.
server: The server directive is the heart of the backend section. Its first argument is a name, followed by the IP address and port of the backend server. You can specify a domain name instead of an IP address; in that case, it is resolved at startup.
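
As an aside, the health checks mentioned in the introduction are enabled per server with the check keyword. This is a minimal sketch and is not used in the rest of this article:

server server1 127.0.0.1:8081 check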

Altogether, this is how my HAProxy configuration file cfg-haproxy.config looks:

frontend www.myapp.com
        bind *:8080
        timeout client 60s
        default_backend all_app_servers
backend all_app_servers
        timeout connect 10s
        timeout server 100s
        server server1 127.0.0.1:8081
        server server2 127.0.0.1:8082
        server server3 127.0.0.1:8083
        server server4 127.0.0.1:8084
        server server5 127.0.0.1:8085

You might have some questions at this point: which load balancing algorithm is used when running HAProxy, and what type (L4/L7) of load balancer is it? These are good questions to ask. By default, HAProxy uses the roundrobin strategy to distribute traffic across the available servers. It also supports a variety of other load balancing strategies, e.g., leastconn (least connections), source, and uri. To use a strategy other than roundrobin, add the balance directive with the strategy name to your backend configuration.
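
For example, here is a sketch of the backend section switched to leastconn via the balance directive (trimmed to two servers for brevity):

backend all_app_servers
        balance leastconn
        timeout connect 10s
        timeout server 100s
        server server1 127.0.0.1:8081
        server server2 127.0.0.1:8082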

As far as the type of load balancer is concerned, HAProxy operates as a simple L4 load balancer by default. To run HAProxy as an L7 load balancer, you need to add the mode directive, with http for L7 or tcp for L4, in both the frontend and backend sections:

frontend www.myapp.com
        bind *:8080
        timeout client 60s
        mode http
        default_backend all_app_servers

backend all_app_servers
        timeout connect 10s
        timeout server 100s
        balance roundrobin
        mode http
        server server1 127.0.0.1:8081
        server server2 127.0.0.1:8082
        server server3 127.0.0.1:8083
        server server4 127.0.0.1:8084
        server server5 127.0.0.1:8085

Finally, you can spin up your HAProxy load balancer. Yay!

haproxy -f cfg-haproxy.config
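
If the file has a syntax problem, HAProxy will refuse to start; you can also validate the configuration without starting the proxy by adding the -c flag:

haproxy -c -f cfg-haproxy.config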

Now you should be able to access localhost:8080/endpoint1 and confirm the roundrobin strategy by hitting the same endpoint repeatedly. If you want sticky-session-like behavior, you can use source instead of roundrobin in the balance directive, which routes each client IP to the same server. But it's better not to do that unless you need it: it reduces scalability, since requests from a given client always land on the same application server.
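
A quick way to watch the rotation, assuming the five containers from earlier are still running:

for i in 1 2 3 4 5; do curl -s http://localhost:8080/endpoint1; echo; done
# Responses should cycle through Server 1 ... Server 5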

Restrict HAProxy from Accessing Specific Endpoints

You may have encountered a situation where you want to restrict end users from accessing specific endpoints of your API. For example, in the current setup I have exposed an /admin endpoint, which I want to keep away from end users. HAProxy lets you define rules to redirect or deny any incoming network traffic. Access Control Lists (ACLs) are a core component of HAProxy: they let you build conditional checks that filter and direct network traffic in real time.

http-request deny if { path -i -m beg /admin }

The http-request deny directive blocks any HTTP request whose path begins with /admin (path -m beg matches the beginning of the path, and -i makes the match case-insensitive). This line must be placed in the frontend configuration section to take effect!
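
In context, the frontend section would look like the sketch below; by default, HAProxy answers denied requests with a 403 Forbidden:

frontend www.myapp.com
        bind *:8080
        timeout client 60s
        mode http
        http-request deny if { path -i -m beg /admin }
        default_backend all_app_servers

curl -i http://localhost:8080/admin
# HTTP/1.1 403 Forbidden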

Serve Requests from Specific Set of Servers

Consider a situation where I want incoming traffic for /endpoint1 handled by server1 and server2. Similarly, server3 and server4 are dedicated to serving requests for /endpoint2, while server5 handles all other incoming requests. This setup suits cases where some application endpoints take much longer to complete the HTTP request/response life cycle; you don't want your end users stuck in a waiting queue behind them.

Again, thanks to ACLs, this behavior can be achieved easily with minimal configuration.

frontend www.myapp.com
        bind *:8080
        timeout client 60s
        mode http
        acl app1 path_end -i /endpoint1
        acl app2 path_end -i /endpoint2
        use_backend app1_backend if app1
        use_backend app2_backend if app2
        default_backend all_app_backend
backend app1_backend
        timeout connect 10s
        timeout server 100s
        balance leastconn
        mode http
        server server1 127.0.0.1:8081
        server server2 127.0.0.1:8082
backend app2_backend
        timeout connect 10s
        timeout server 100s
        balance roundrobin
        mode http
        server server3 127.0.0.1:8083
        server server4 127.0.0.1:8084
backend all_app_backend
        timeout connect 10s
        timeout server 100s
        balance roundrobin
        mode http
        server server5 127.0.0.1:8085

acl: This directive takes a name followed by a condition, built from a fetch method (such as path_end) and a pattern to match.
use_backend: This directive lets you choose among the available backends in the configuration file based on a condition. E.g., to send incoming traffic for /endpoint1 to the app1_backend backend, you combine use_backend with an ACL that checks whether the URL path ends with /endpoint1 (which is what path_end does).

Here I have created three backends: app1_backend is responsible for serving any incoming request for /endpoint1. Similarly, app2_backend serves any incoming traffic for /endpoint2, while all_app_backend handles all other requests. You might wonder why app1_backend uses the leastconn strategy instead of roundrobin. Assume that in our current setup /endpoint1 takes much longer to complete the HTTP request/response life cycle. In such cases, the roundrobin strategy often doesn't work well, and leastconn does the job better: HAProxy sends each request to the server with the fewest active connections, which reduces the chance of a server being overloaded.
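
You can verify the routing with a few requests; assuming the setup above, each response should come only from the servers listed in the matching backend:

curl http://localhost:8080/endpoint1   # answered by Server 1 or Server 2
curl http://localhost:8080/endpoint2   # answered by Server 3 or Server 4
curl http://localhost:8080/            # answered by Server 5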

Conclusion

Although I have discussed only a small set of the features HAProxy supports, you now have a load balancer that listens on port 8080, balances network traffic across several backend application servers, secures a sensitive application endpoint (/admin), and even sends traffic matching a specific URL pattern to a dedicated set of backend servers.

This exercise might seem simple, but you have just built and configured a very powerful load balancer capable of handling a significant amount of network traffic.

Again, for your convenience, I have put all the code and configuration used in this article in my GitHub repo.

© Copyright 2021 Avenash Kumar. All Rights Reserved.