Recently we spent a little time optimizing some servers. These are linux machines running apache, serving static and dynamic content using php. Each apache process consumes 13mb of private resident memory under load, and each box has a gigabit net connection. A sample bit of “large static content” is 2mb. Assume clients consuming that content need about 20s to get it down (100kb/s or so). That means we need to be spoon feeding about 2000 simultaneously connected clients in order to saturate the gigabit connection.
So turn up MaxClients, right? 13mb * 2000 (and a recompile, btw) is about 26gb of RAM. Uh. That’s not gonna work.
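For the record, the knob lives in httpd.conf; with the prefork MPM it would look something like the below (on apache 2.x you can raise ServerLimit in the config, on 1.3 it means recompiling with a bigger HARD_SERVER_LIMIT, hence the aside above):

<IfModule prefork.c>
  ServerLimit  2000
  MaxClients   2000
</IfModule>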
So there are lots of ways to solve this problem, but before we start thinking about that, how would we simulate such a load so that we can validate the existence of this bottleneck now, and its resolution once we fix it?
siege is a great little bit of software that can simulate load:
siege-2.69/src/siege -c 200 -f url.txt
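For reference, url.txt is just a siege URL file, one url per line; here it’s a single line pointing at the 2mb static file (host and path made up):

http://www.example.com/static/big-2mb-file.bin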
This would hit all urls in url.txt with 200 simultaneous connections. But we want each connection to take about 20s (yeah, I have one url to static content in that file).
siege-2.69/src/siege -c 200 -d 20 -f url.txt
Looks better, right? But not quite. The -d flag causes a 20s delay (actually a random delay between 0 and 20) between requests, and I actually want to throttle bandwidth… So how about a little hack:
siege-2.69/src/siege -c 200 -w 20 -f url.txt
-w or --read-delay (which I’ve hacked in and is available in this patch) moves the sleep to after the connection is established.
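To make the idea concrete, here’s a standalone sketch (in C, since that’s what siege is written in) of what a single “slow client” does: connect, send the request, then sleep before draining the response, which keeps an apache child tied up for the duration. This is not the actual patch, just an illustration; host, port, path and delay are whatever you point it at.

/* slow_client.c: a standalone sketch of the slow-client idea (NOT the siege
 * patch itself): connect, send the request, then sleep before draining the
 * response, so the server has to babysit the connection the whole time.
 *
 * build: cc -o slow_client slow_client.c
 * run:   ./slow_client www.example.com 80 /static/big-2mb-file.bin 20
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s host port path delay_seconds\n", argv[0]);
        return 1;
    }
    const char *host = argv[1], *port = argv[2], *path = argv[3];
    unsigned delay = (unsigned) atoi(argv[4]);

    /* look up the host and connect */
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    int rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
        perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    /* fire off a plain HTTP/1.0 request */
    char req[1024];
    snprintf(req, sizeof(req),
             "GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
             path, host);
    write(fd, req, strlen(req));

    /* the interesting part: the connection is established, the request is
     * sent, and now we just sit here while the server waits on us */
    sleep(delay);

    /* finally drain whatever the server has for us */
    char buf[4096];
    while (read(fd, buf, sizeof(buf)) > 0)
        ;
    close(fd);
    return 0;
}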
Sure it’s a hack, but let’s see how far it gets us…
till the next, lloyd