Saturday, September 6, 2003

Why your e-mail is slow (or Dammit! It's not an IM!)

The managers at work held an emergency meeting yesterday. We're seeing these often enough that we're calling them the Friday afternoon meltdown. The cause of this one? The level one helpdesk drone flubbing an explanation (to a senior person) as to why incoming e-mail is being delayed up to 12 hours. He finally gave up and said "It's a massive virus infection." (without adding that it was everyone else on the planet that had the infection. We don't.) So, to practice for Monday morning, let me see if I can do better...

1) E-mail is handled by a number of machines as it goes from point A to point B. When user A hits send, the message gets deposited on his local server.

2) That server may scan the mail for viruses/spam/inappropriate content before dumping it in the outgoing queue (a folder or directory on the hard drive). Normally that queue gets processed every five minutes or so, though the interval varies with system load.

3) The local mail server then hands the mail off to the next server (usually the site's firewall), which puts the mail through a similar scan, queue, and forward process.

4) The previous step is repeated as the message passes from the local network onto the Internet, then onto the recipient's network, until

5) the e-mail is received at user B's local mail server.

Depending on the size of the organizations involved, this hand-off can happen 25 times or more (think about the number of places/people involved in delivering a hand-written letter to Aunt Sophie on the other side of the country).
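To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes (and this is only an assumption) that every hop parks the message in a queue that is swept every five minutes; real servers differ, and many hand mail on almost immediately when idle. The function names are made up for illustration.

```python
def worst_case_queue_delay_minutes(hops, queue_interval_minutes=5.0):
    """Delay if the message just misses the queue run at every hop."""
    return hops * queue_interval_minutes


def average_queue_delay_minutes(hops, queue_interval_minutes=5.0):
    """Delay if the message waits half an interval, on average, at every hop."""
    return hops * queue_interval_minutes / 2


if __name__ == "__main__":
    for hops in (5, 10, 25):
        print(hops, "hops:",
              average_queue_delay_minutes(hops), "min on average,",
              worst_case_queue_delay_minutes(hops), "min worst case")
```

Under those assumptions, 25 hops can add a couple of hours before any server is even overloaded.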

Mail servers are designed to alter their behavior depending on their current processing load. (This applies to Exchange, Sendmail, and Postfix as well as just about any other MTA.) Above a certain load, mail servers will ask delivering MTAs to hold their content so that the local server can catch up on its own deliveries.
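A toy sketch of that "come back later" behavior, in Python. The load threshold and the retry schedule are invented numbers, not the defaults of any real MTA; the reply codes are the standard SMTP ones (220 for ready, 421 for service not available).

```python
def smtp_greeting(load_average, refuse_above=12.0):
    """Reply an overloaded server might give a connecting MTA.

    refuse_above is a hypothetical threshold, not a real product default.
    """
    if load_average >= refuse_above:
        return 421, "Service not available, try again later"
    return 220, "Service ready"


def next_retry_minutes(attempt, base=15, cap=240):
    """Hypothetical sender-side backoff: double the wait each time, up to a cap."""
    return min(base * 2 ** attempt, cap)


if __name__ == "__main__":
    waited = 0
    for attempt in range(5):
        waited += next_retry_minutes(attempt)
        print("after deferral", attempt + 1, "the sender has waited", waited, "minutes")
```

With that made-up schedule, five deferrals in a row at a single busy gateway already add up to more than seven hours.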

Now mix in the SoBig virus. This thing has outperformed even Klez in the sheer volume of infected traffic generated. Given a file size of about 72K and approximately 1,000 infected messages per day for a small-to-medium-sized organization, this means a processing requirement of about 72 MB per day. Throw THAT on top of the organization's normal mail traffic and mix in the usual bandwidth requirements for web browsing abuse, audio streaming, P2P file trading, and the ongoing problem with Blaster/Welchia. What you get is under-sized gateways and/or gateway servers (mail handling devices in this case) slowing down delivery of mail.
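The arithmetic behind that 72 MB figure, as a small Python sketch you can rerun with your own numbers. It uses the round figures above and decimal units (1 MB = 1,000 KB); the function name is made up.

```python
def daily_virus_load_mb(messages_per_day=1000, size_kb=72):
    """Extra mail volume per day, in MB, using 1 MB = 1000 KB."""
    return messages_per_day * size_kb / 1000


if __name__ == "__main__":
    print(daily_virus_load_mb(), "MB per day")
    print(daily_virus_load_mb(messages_per_day=5000), "MB per day at 5,000 messages")
```

Remember that every one of those messages also has to be scanned, queued, and written to disk at each hop, so the raw byte count understates the work involved.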

Want to figure out which servers caused your mail to be delivered late? Read the message header. It'll show a "Received:" line, with a date and time, for each server the message passed through. One thing to remember though: not everyone keeps their system clocks set properly.
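If you'd rather not do the subtraction by hand, here is a minimal Python sketch that pulls the timestamps out of the "Received:" lines and prints the gap between hops. It assumes each line ends with a semicolon followed by the date, which is the usual layout; the sample message and host names are invented, and a wrong clock on any server will skew the numbers.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented sample; the newest "Received:" line is at the top, as in real mail.
SAMPLE = (
    "Received: from gw.example.org by mail.example.org; Sat, 6 Sep 2003 22:05:00 -0400\n"
    "Received: from mail.example.com by gw.example.org; Sat, 6 Sep 2003 10:04:00 -0400\n"
    "Received: from desk.example.com by mail.example.com; Sat, 6 Sep 2003 10:00:00 -0400\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def hop_times(raw_message):
    """Return [(hop description, timestamp), ...] ordered oldest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    for received in msg.get_all("Received", []):
        # The timestamp normally follows the last semicolon on the line.
        info, _, date_part = received.rpartition(";")
        if not info:
            continue  # no semicolon, so no date we can find
        try:
            stamp = parsedate_to_datetime(date_part.strip())
        except (TypeError, ValueError):
            continue  # unparseable date; skip this hop
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
        hops.append((" ".join(info.split()), stamp))
    hops.reverse()  # each server prepends its line, so the oldest is last
    return hops


def hop_delays(hops):
    """Seconds elapsed between each hop and the one before it."""
    return [(later[0], (later[1] - earlier[1]).total_seconds())
            for earlier, later in zip(hops, hops[1:])]


if __name__ == "__main__":
    for description, seconds in hop_delays(hop_times(SAMPLE)):
        print(round(seconds / 60), "min:", description)
```

In the invented sample, the message sits for four minutes on the sender's side and then about twelve hours before the recipient's server accepts it, which points the finger at the hand-off between the two gateways.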

Overall, given the havoc being created by Welchia/SoBig and any organization's tendency to spend the least amount of money possible when buying IT equipment, count yourself lucky that it only took 12 hours for you to get your e-mail. Want something faster? Try using IM or the telephone!