> Load balancing techniques (cont'd)

2. Reducing the number of users per server

A more common approach is to split the population of users across multiple servers. This requires a load balancer between the users and the servers. It can be a hardware appliance, or software installed either on a dedicated front server or on the application servers themselves. Note that every newly deployed component is a new potential point of failure, so a common practice is to have a second load balancer acting as a backup for the primary one.
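As a minimal illustration of splitting users across servers, the sketch below hands out servers in strict rotation. Round-robin is only one possible policy and is assumed here for simplicity; the class name and the server names `app1` to `app3` are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in turn so that users are spread evenly."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        # Each call returns the next server in the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
assignments = [balancer.pick() for _ in range(6)]
```

A real load balancer would combine such a policy with knowledge of which servers are currently available.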

Generally, a hardware load balancer works at the network packet level and acts on routing, using one of the following methods:

  • Direct routing: the load balancer routes the same service address to different local, physical servers, which must be on the same network segment and must all share the same service address. It has the huge advantage of not modifying anything at the IP level, so the servers can reply directly to the user without going back through the load balancer. This is called "direct server return". As the processing power needed for this method is minimal, it is often the one used on front servers of very high traffic sites. On the other hand, it requires solid knowledge of the TCP/IP model to configure all the equipment correctly, including the servers.
  • Tunnelling: it works exactly like direct routing, except that tunnels established between the load balancer and the servers allow the servers to be located on remote networks. Direct server return is still possible.
  • IP address translation (NAT): the user connects to a virtual destination address, which the load balancer translates to one of the servers' addresses. This looks easier to deploy at first glance, because the servers need less configuration. But it requires stricter programming rules: one common error is application servers exposing their internal addresses in some responses. It also requires more work on the load balancer, which has to translate addresses back and forth and maintain a session table, and it requires that the return traffic go through the load balancer as well. Sometimes, session timeouts that are too short on the load balancer induce side effects known as ACK storms. In this case, the only solution is to increase the timeouts, at the risk of saturating the load balancer's session table.
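To make the NAT method more concrete, here is a toy model of the session table the load balancer has to maintain. It is a sketch under assumptions, not how any particular product works: the class `NatTable`, its method names, the round-robin server choice and the injectable `clock` are all hypothetical, and the addresses come from documentation ranges.

```python
import time


class NatTable:
    """Toy NAT session table mapping client endpoints to chosen servers."""

    def __init__(self, virtual_addr, servers, timeout=30.0, clock=time.monotonic):
        self.virtual_addr = virtual_addr
        self.servers = list(servers)
        self.timeout = timeout
        self.clock = clock
        self.sessions = {}  # client (ip, port) -> (server addr, last seen)
        self._next = 0

    def _expire(self):
        # Drop entries that have been idle for longer than the timeout.
        now = self.clock()
        stale = [c for c, (_, seen) in self.sessions.items()
                 if now - seen > self.timeout]
        for client in stale:
            del self.sessions[client]

    def inbound(self, client, dst):
        """Translate a client packet sent to the virtual address."""
        if dst != self.virtual_addr:
            raise ValueError("packet not addressed to the virtual address")
        self._expire()
        if client in self.sessions:
            server, _ = self.sessions[client]
        else:
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
        self.sessions[client] = (server, self.clock())
        return client, server  # (source, translated destination)

    def outbound(self, server, client):
        """Translate a server reply so that it appears to come from the virtual address."""
        self._expire()
        entry = self.sessions.get(client)
        if entry is None or entry[0] != server:
            return None  # no matching session: the reply cannot be translated
        self.sessions[client] = (server, self.clock())
        return self.virtual_addr, client


now = [0.0]  # fake clock so the timeout can be demonstrated deterministically
vip = ("192.0.2.10", 80)
nat = NatTable(vip, [("10.0.0.1", 80), ("10.0.0.2", 80)],
               timeout=30.0, clock=lambda: now[0])
client = ("198.51.100.7", 40000)
forward = nat.inbound(client, vip)
reply = nat.outbound(("10.0.0.1", 80), client)
now[0] = 31.0
late_reply = nat.outbound(("10.0.0.1", 80), client)
```

The last call shows the trade-off described above: once an entry has expired, a late reply no longer matches any session, while a longer timeout keeps more entries in the table.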

At the other end of the spectrum, we find software load balancers. They most often act as reverse proxies, pretending to be the server and forwarding the traffic to the real servers. This implies that the servers themselves cannot be reached directly by the users, and that some protocols might never get load-balanced. They need more processing power than load balancers acting at the network level, but since they splice the communication between the users and the servers, they provide a first level of security by only forwarding what they understand. This is why URL filtering capabilities are often found in these products.
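The idea of "only forwarding what they understand" can be sketched as a check applied to the HTTP request line before anything is passed to a server. The function name `should_forward` and both policy constants are hypothetical; a real proxy parses far more than the first line.

```python
ALLOWED_METHODS = {"GET", "HEAD", "POST"}   # hypothetical policy
BLOCKED_PREFIXES = ("/admin", "/cgi-bin")   # hypothetical policy


def should_forward(request_line):
    """Return True only for request lines that parse cleanly and pass the filter."""
    parts = request_line.strip().split(" ")
    if len(parts) != 3:
        return False  # malformed request line: not forwarded
    method, target, version = parts
    if method not in ALLOWED_METHODS:
        return False
    if version not in ("HTTP/1.0", "HTTP/1.1"):
        return False
    if not target.startswith("/"):
        return False
    path = target.split("?", 1)[0]  # ignore the query string when filtering
    return not path.startswith(BLOCKED_PREFIXES)
```

For instance, `should_forward("GET /index.html HTTP/1.1")` is accepted, while a request for `/admin/users` or a line that is not HTTP at all is rejected before it reaches a server.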

2.1. Testing the servers

To select a server, a load balancer must know which ones are available. For this, it periodically sends them pings, connection attempts, requests, or anything else the administrator considers a valid measure of their state. These tests are called "health checks". A crashed server might respond to ping but not to TCP connections, and a hung server might respond to TCP connections but not to HTTP requests. When a multi-layer Web application server is involved, some HTTP requests will provide instant responses while others will fail. So there is a real interest in choosing the most representative tests permitted by the application and the load balancer. Some tests might retrieve data from a database to ensure the whole chain is valid.

The drawback is that those tests consume a certain amount of server resources (CPU, threads, etc.). They have to be spaced far enough apart to avoid loading the servers too much, but still be close enough to quickly detect a dead server. Health checking is one of the most complicated aspects of load balancing, and it is very common that, after a few tests, the application developers finally implement a special request dedicated to the load balancer, which performs a number of representative internal tests. In this respect, software load balancers are by far the most flexible, because they often provide scripting capabilities, and if a check requires code modifications, the software vendor can generally provide them within a short timeframe.
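The difference between check levels can be sketched with the Python standard library: a TCP check only proves that the port accepts connections, while an HTTP check proves that the application answers. The `/health` path stands for the dedicated request mentioned above and is hypothetical, as are the function names and the `fall`/`rise` thresholds, which are an added assumption to avoid flapping on a single failed probe.

```python
import http.client
import socket


def tcp_check(host, port, timeout=2.0):
    """True if a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/health", timeout=2.0):
    """True if the server answers the request with a 2xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return 200 <= conn.getresponse().status < 300
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


class ServerState:
    """Marks a server down after `fall` failed checks, up after `rise` successes."""

    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.up = True
        self._streak = 0  # consecutive results contradicting the current state

    def record(self, ok):
        if ok == self.up:
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if ok else self.fall
            if self._streak >= threshold:
                self.up = ok
                self._streak = 0
        return self.up
```

A scheduler would call one of the check functions at a fixed interval and feed each result to `record()`. The interval and the thresholds together set the balance between server load and detection delay.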
