Monday, October 17, 2005

Detecting infected clients via DNS

Consider this another of my you-need-to-know-what-normal-is posts.

About five years ago, a couple of us (at a previous job) wrote
a script to process DNS log files to watch for systems suddenly
performing massive amounts of DNS lookups. In other words, watching for
infected systems.

Someone recently wrote a paper on this same topic
and has received a bit of notoriety for it. There's no black art to it.
It's pretty easy to kluge together.

  1. First, be sure that your internal DNS server can handle a heavier
    load. I recommend running a dedicated server using BSDi (even an
    older version) because the load that BIND puts on BSDi is barely
    noticeable.
  2. Turn on querylog.
    It'll generate log entries like:

    Oct 15 09:18:37 desk named[13556]: client query: IN A +
    Oct 15 09:18:56 desk named[13556]: client query: IN A +
  3. Obviously, Perl is perfect for extracting data from these log
    entries. Write a script to parse each line and insert the data from
    the line into a MySQL or Postgres database.
  4. Then use Perl, PHP, Ruby, or [insert your favorite language here]
    to extract the data in different "views", such as
    total-queries-by-client,
    total-queries-by-network-per-minute (or hour or day),
    total-individual-queries-per-minute-by-target, etc.
  5. To go along with these data "views", it's usually helpful to graph
    the generated metrics for simple crayon-understanding graphics. To
    be useful, you'll want graphs for the last hour, the last day, the
    last week and the last month, along with a user-configurable graph
    generation script, so that you (or someone else) can make quick
    interpretations and compare against previously collected data.
  6. Finally, you'll want a script to periodically clean up the log
    file, either archiving it or deleting it. Running querylog full-time
    will generate massive log files. It may also be a good idea to write
    scripts to aggregate the data in the database server, keeping only
    generic statistical totals for data past a certain age.
  7. Collecting and analyzing metrics such as these is well within the
    talents of the average network admin (and is usually free). I'm
    amazed that companies are willing to shell out big $$$ for
    something as simple as this.

    If you have anything to do with network administration, this is
    something that you should be able to do. If you "own" a network,
    this is something that you want at least one of your network admin
    or security types to do. (Think of it as being able to gather and
    analyze data for troubleshooting.)
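
For anyone who wants a starting point, here is a rough sketch of steps 3 and 4 in Python rather than Perl. Treat all of it as assumption: the log entries quoted above have lost their client address and query name, so the regular expression assumes a BIND 9 style line of the form `client ADDR#PORT: query: NAME IN TYPE +` (the exact layout varies between BIND versions), the addresses and hostnames in `SAMPLE` are made up, SQLite stands in for MySQL or Postgres, and the function names (`parse_line`, `load`, `queries_by_client`, `busy_clients`) are just illustrative.

```python
import re
import sqlite3

# Assumed querylog layout (BIND 9 style); adjust to match your own logs.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+named\[\d+\]:\s+"
    r"client\s+(?P<client>[0-9a-fA-F.:]+)#\d+:\s+"
    r"query:\s+(?P<name>\S+)\s+IN\s+(?P<qtype>\S+)"
)


def parse_line(line):
    """Return (minute, client, name, qtype), or None if the line is not a query."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    minute = m.group("ts")[:-3]  # drop the seconds: "Oct 15 09:18"
    return (minute, m.group("client"), m.group("name").lower(), m.group("qtype"))


def load(lines, db):
    """Insert every parseable line into the queries table; return the row count."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS queries "
        "(minute TEXT, client TEXT, name TEXT, qtype TEXT)"
    )
    rows = [r for r in map(parse_line, lines) if r is not None]
    db.executemany("INSERT INTO queries VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)


def queries_by_client(db):
    """The total-queries-by-client view, busiest client first."""
    return db.execute(
        "SELECT client, COUNT(*) AS n FROM queries "
        "GROUP BY client ORDER BY n DESC, client"
    ).fetchall()


def busy_clients(db, per_minute_limit):
    """Clients that exceeded per_minute_limit lookups in any single minute."""
    return db.execute(
        "SELECT client, minute, COUNT(*) AS n FROM queries "
        "GROUP BY client, minute HAVING COUNT(*) > ? "
        "ORDER BY n DESC, client, minute",
        (per_minute_limit,),
    ).fetchall()


# Made-up sample lines in the assumed format, plus one non-query line.
SAMPLE = [
    "Oct 15 09:18:37 desk named[13556]: client 192.0.2.10#1025: query: www.example.com IN A +",
    "Oct 15 09:18:56 desk named[13556]: client 192.0.2.10#1026: query: mail.example.com IN A +",
    "Oct 15 09:18:58 desk named[13556]: client 192.0.2.11#4000: query: www.example.com IN A +",
    "Oct 15 09:19:02 desk named[13556]: some unrelated message",
]

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    load(SAMPLE, db)
    for client, n in queries_by_client(db):
        print(client, n)
```

The per-minute limit passed to `busy_clients` is the part that depends on knowing what normal is: pick it from your own graphs, not from a number someone else hands you.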