Caching Servers
For the top/front layer of your server architecture, i.e. your cache servers, it is also important to have a clear understanding of the scalability issues of TCP connections.
The first thing you may notice when putting load on your system is that the cache server process runs out of file handles (if the start script does not increase the right kernel parameter). This is because the OS uses one file handle per connection (as this is what applications can relate to), and the default number of handles a given user process can create is, on many systems, 1024. However, this is easily remedied with e.g. the ulimit -n command (on at least Linux and FreeBSD), or persistently using /etc/sysctl.conf (on Linux and FreeBSD) or /etc/system (on Solaris). The number of file handles can be cranked up to many hundreds of thousands and is thus no real limitation.
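As a sketch of the remedy above (the limit values shown are illustrative assumptions, not tuned recommendations; consult your OS documentation for the authoritative parameter names):

```shell
# Inspect the current per-process file handle limit (often 1024 by default)
ulimit -n

# A cache start script would typically raise the soft limit before launching
# the daemon, e.g. (131072 is an illustrative value, not a recommendation):
#   ulimit -n 131072

# For a persistent, system-wide ceiling on Linux, /etc/sysctl.conf could
# carry a line such as (again, the value is only an example):
#   fs.file-max = 500000
```

Note that ulimit only affects the current shell session and its children, which is why the start script is the natural place to put it.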
What is more interesting is that the OS, when creating the connection, will create it from a local port to an anonymous (ephemeral) port on the requesting host:
cache01:2323 -> otherhost:1237
The port numbers are defined in the TCP protocol as an unsigned 16-bit number, yielding a maximum of 65535 ports. Luckily, the local port number need not be unique across requesting hosts, since a connection is identified by the full source and destination address/port tuple. Thus, the same port can be used for multiple connections to different IPs:
cache01:2323 -> otherhost:1237
cache01:2323 -> yetanotherhost:4545
This means that the maximum theoretical number of connections a cache server can handle is (65535 - <number of ports reserved for system services, normally 1024>) * <number of incoming IPs>. For this to work as desired, it is important that the load balancer in front of the cache is fully transparent, exposing the incoming IP of the request to the cache server(s) and not the IP of the balancer itself.
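The arithmetic above can be sketched in a few lines of shell (the client IP count is a made-up example figure):

```shell
# Theoretical connection ceiling per cache server:
# (ports per client IP, minus the reserved range) times the number of
# distinct client IPs the balancer exposes to the cache.
ports=65535
reserved=1024          # ports reserved for system services
client_ips=1000        # hypothetical number of distinct incoming IPs

echo $(( (ports - reserved) * client_ips ))   # → 64511000
```

Even a modest number of distinct client IPs therefore pushes the theoretical ceiling far beyond the 65535 ports of a single address pair.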
To illustrate the last point, consider the use case where three users are visiting your web site:
user1:2213 -> load-balancer:80 -> cache01:80
user2:1212 -> load-balancer:80 -> cache01:80
user3:5333 -> load-balancer:80 -> cache01:80
It is important that cache01 either sees the IPs of the requesting clients (user1, user2 and user3) and not the IP of the load balancer (this is the optimal situation), or sees different IPs from the load balancer.
The latter solution is somewhat of a hack, but it also works; just adding an additional interface/IP to the load balancer and/or the cache server will generate an additional set of origin host/port and destination host/port combinations, which will also make the cache server handle more than 65535 connections.
However, if you can, go with the first option and make the load balancer transparent. Your cache server will then be able to handle as many TCP connections as your load balancer can pass on (given, of course, that your OS kernel manages to allocate and recycle TCP connections fast enough).
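On Linux, the kernel's ability to allocate and recycle connections quickly is influenced by a handful of sysctls. The following is a sketch of candidate /etc/sysctl.conf entries; the values are illustrative assumptions, not tuned recommendations:

```shell
# Candidate /etc/sysctl.conf entries (Linux; example values only):
#
#   net.ipv4.ip_local_port_range = 1024 65535   # widen the ephemeral range
#   net.ipv4.tcp_tw_reuse = 1                   # reuse TIME_WAIT sockets
#   net.ipv4.tcp_fin_timeout = 15               # shorten FIN_WAIT_2 linger

# Apply and verify at runtime (requires root):
#   sysctl -w net.ipv4.tcp_tw_reuse=1
#   sysctl net.ipv4.ip_local_port_range
```

Which settings matter, and what values are safe, depends heavily on kernel version and workload, so measure before and after changing them.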