Wednesday, May 31, 2006

Today's the day

Today's also the first day that this site has ever been linked to. The first link comes from ThaiHostTalk.com.

The scaling experience of ThaiLE.com

My previous post regarding Web Server Performance Comparison ends with a note about ThaiLE.com. Now's the time to write about it some more.

ThaiLE.com is the largest banner exchange network in Thailand, with over 2 million banner exchange requests a day as of April '06. This basically means about 10 million SQL queries per day. It comes with real-time transaction updates and a lot of other features, and resides on a single Dell server (P4D 3.0G, 2G RAM, 70G SCSI 15000rpm). The server runs Gentoo Linux with NPTL and a lot of other optimizations.

It used to be LAMP (Linux, Apache2, MySQL and PHP) and now it's 3LMP (Linux, LSWS, Lighty, MySQL and PHP).

During May we saw traffic grow by over 30%, and it was time to do something about the server.

Toward the end of April, the server load average was pretty high, and I started to play with lighty first because of its open source nature. Lighty was able to cut our LA by about 20-30% during peak hours, while memory pressure dropped by about 30-40%. But there were problems in a prior version of lighty that forced me to investigate lsws. With lsws, the lightened feeling that I'd experienced with lighty has carried on, and since its watchdog process is better than lighty's, there have been only slight problems with lsws over the past month. Now with its latest version, I'm able to sleep soundly at night and the system is able to scale some more without adding any hardware. ;)

Oh well, since the lsws STD edition won't give you more than 300 concurrent connections, I had to add lighty running on a second IP on the same machine to serve just the banners. Swap space usage is almost nil (it was a couple hundred MB during the Apache2 tenure) and the system is able to serve about 3 million banner exchange transactions a day (or ~15 million MySQL queries) without any problem, web-server-wise.

The next thing to tinker with was MySQL, and it is now fixed by migrating a couple of high-traffic tables to memory (heap) tables, with cron jobs updating the on-disk tables with the current values from memory.
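The memory-table-plus-cron idea can be sketched roughly like this. It is only an illustration, not the actual ThaiLE.com setup: Python's built-in sqlite3 stands in for MySQL (an in-memory database plays the heap table), and the banner_stats table, its columns and the function names are all made up for the example.

```python
import sqlite3

# Hypothetical schema: one impression counter per banner.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS banner_stats ("
    "banner_id INTEGER PRIMARY KEY, impressions INTEGER NOT NULL)"
)

def open_stores(disk_path):
    """Open the hot in-memory store and the durable on-disk store."""
    mem = sqlite3.connect(":memory:")   # stand-in for a MySQL memory (heap) table
    disk = sqlite3.connect(disk_path)   # stand-in for the on-disk table
    for db in (mem, disk):
        db.execute(SCHEMA)
    return mem, disk

def record_impression(mem, banner_id):
    """Hot path: touches only the in-memory table."""
    mem.execute("INSERT OR IGNORE INTO banner_stats VALUES (?, 0)", (banner_id,))
    mem.execute(
        "UPDATE banner_stats SET impressions = impressions + 1 WHERE banner_id = ?",
        (banner_id,),
    )

def flush_to_disk(mem, disk):
    """What the cron job would do: copy current values from memory to disk."""
    rows = mem.execute("SELECT banner_id, impressions FROM banner_stats").fetchall()
    with disk:  # one transaction, committed on success
        disk.executemany(
            "INSERT OR REPLACE INTO banner_stats (banner_id, impressions) VALUES (?, ?)",
            rows,
        )
    return len(rows)
```

The trade-off is the usual one: anything written since the last flush is lost if the server goes down, so the cron interval decides how much data you are willing to lose.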

End note: Apache is good, Lighty is good, LSWS is good. Choose the one that best suits your needs. Also, I haven't noticed any speed advantage of lsws over lighty, but my experience so far shows that lsws is more stable than lighty when running with php-fastcgi (phplsapi in the case of lsws).

Kudos to all the people who make all this great software possible.

High Availability, and a lot... (work in progress)

High Availability
As more and more mission-critical applications move on the Internet, providing highly available services becomes increasingly important. One of the advantages of a clustered system is that it has hardware and software redundancy, because the cluster system consists of a number of independent nodes, and each node runs a copy of operating system and application software. High availability can be achieved by detecting node or daemon failures and reconfiguring the system appropriately, so that the workload can be taken over by the remaining nodes in the cluster.
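To make the "detect failures, then reconfigure" part concrete, here is a minimal sketch of heartbeat-based failover. It is not taken from any particular HA package: the Cluster class, its method names and the 3-second timeout are all assumptions made up for illustration.

```python
import time

class Cluster:
    """Toy failover: nodes that stop sending heartbeats drop out of rotation."""

    def __init__(self, nodes, timeout=3.0):
        self.timeout = timeout                     # seconds without a heartbeat before a node counts as dead
        self.last_seen = {n: None for n in nodes}  # node -> time of last heartbeat
        self._next = 0                             # round-robin counter

    def heartbeat(self, node, now=None):
        """Record that a node reported in (the failure-detection input)."""
        self.last_seen[node] = time.monotonic() if now is None else now

    def alive(self, now=None):
        """Return the nodes whose last heartbeat is recent enough."""
        now = time.monotonic() if now is None else now
        return [
            n for n, seen in self.last_seen.items()
            if seen is not None and now - seen <= self.timeout
        ]

    def pick(self, now=None):
        """Round-robin over the remaining healthy nodes (the reconfiguration step)."""
        nodes = self.alive(now)
        if not nodes:
            raise RuntimeError("no healthy nodes left")
        node = nodes[self._next % len(nodes)]
        self._next += 1
        return node
```

A real setup also has to make the component doing the picking redundant, otherwise the single point of failure just moves to the load balancer.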

pizzaman: "There are so many possible scenarios for doing HA stuff, hope that I'll be able to play with more of them soon."


Edit: Aug 16, 2006:

During this 3-day holiday, I've read through a lot and found many interesting solutions including:
Several nice articles:
I later stumbled upon:
  • Kevin Minnick's comments on these solutions (although I think pound doesn't provide HA on its own). Kevin touches upon perlbal, a good perl-based RP and load balancer that I'd never heard of until now. Got to have a look later.
  • At danga, I saw its DFS called MogileFS. Looks cool, although it lacks the POSIX compliance that I need.

Tuesday, May 09, 2006

The adventures of scaling, Stage 1