Logging: Apache and IIS

Overview

Your webserver must log in a format that Urchin can interpret in order to produce fully detailed reports. This article explains the setup for the two most common webservers, Apache and Microsoft IIS. For maximum reporting depth, enable logging of Referral and User Agent information.

To enable unique visitor reporting with the Urchin Tracking Module (UTM), you must also enable cookie logging. UTM-based tracking is the only way to get true unique visitor reporting. It is advisable, although not required, to decide whether you want to use UTM before changing your webserver logging. If you do, enable cookies in your logs now; it does no harm to log cookies before UTM is installed on your website. You may want to look over the section on Visitor Tracking to familiarize yourself with the UTM installation before proceeding.

Configuration

Apache

By default, Apache generally logs in what is called common log format, and it also provides an option to log in a more detailed format known as NCSA extended/combined log format. For optimal reporting, Urchin requires a variation of the NCSA extended/combined format. To configure Apache to use the appropriate format, do the following:

  1. Make a backup copy of your httpd.conf file. Then use a text editor to open your original httpd.conf.
  2. Locate the section containing lines that begin with the word LogFormat.
  3. Insert a new LogFormat line using one of the forms shown below, depending on whether you will be using UTM or not. The LogFormat entry must be added to your configuration file as a single line without carriage returns or line breaks. Take care to enter every character exactly as shown.

    For websites that will not use UTM:

    LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" urchin

    For UTM-enabled websites:

    LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" urchin

    The word "urchin" at the end of the LogFormat line is a nickname that will be used elsewhere in your httpd.conf to apply this format to a log file. This string can be anything you choose. Using "urchin" will help identify that this entry was created to accommodate Urchin processing.

  4. Examine the <VirtualHost> entry for which you wish to enable this new logging format. Deactivate any existing TransferLog or CustomLog entries within a <VirtualHost></VirtualHost> group by inserting a # in front (e.g. TransferLog becomes #TransferLog). Then insert the following new CustomLog entry, replacing the string path_to_log with the appropriate path to your log location:

    CustomLog path_to_log/access.log urchin

    If you chose some identifier other than "urchin" as the nickname for your LogFormat entry earlier, use that nickname in place of "urchin" in the CustomLog entry.

  5. Save the edits to your httpd.conf file.

  6. IMPORTANT! Check the syntax of your new httpd.conf by running the command:

    apachectl configtest

    This should produce the response Syntax OK. If not, double-check your httpd.conf file and fix any errors. If you cannot get the correct response, do not continue with this procedure. Instead, make a backup copy of your edited file, then restore the original by overwriting the edited version with the copy of httpd.conf you saved at the start of this procedure. This will ensure that your webserver continues to work normally while you figure out what is wrong with your changes.

  7. Once you have confirmed the syntax of your httpd.conf, restart Apache. The preferred method is by calling the apachectl script, which is typically installed with Apache.

    apachectl restart

  8. Check the logging. Open a browser and hit the site in question a few times. Then examine the last few lines of the log file specified in your CustomLog entry. You should see that several recent hits have been written to the log. For the Urchin modified extended/combined log format, a log line will look similar to this:

    64.40.51.27 www.urchin.com - [28/Aug/2002:15:11:01 -0700] "GET /images/urchin_header_logo.gif HTTP/1.1" 200 3017 "http://www.urchin.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

    If you have configured UTM on your site and have turned on cookie logging, a log line will look similar to this:

    64.40.51.27 www.urchin.com - [28/Aug/2002:15:11:01 -0700] "GET /images/urchin_header_logo.gif HTTP/1.1" 200 3017 "http://www.urchin.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" "__utma=171060324.1378004559.1063331913.1063334677.1063521838.3; __utmb=171060324; __utmc=171060324"

    Note the additional UTM cookie information at the end of the line.
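
The check in step 8 can also be scripted. The following is a minimal Python sketch, not part of Urchin, that tests whether a log line matches the field layout of the LogFormat entries above. The pattern and the function name parse_urchin_line are assumptions made for illustration; the sample line is the UTM example shown above.

```python
import re

# Assumed pattern for the layout:
# %h %v %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" ["%{Cookie}i"]
# The trailing cookie field is optional so both LogFormat variants match.
URCHIN_LINE = re.compile(
    r'^(?P<host>\S+) (?P<vhost>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"'
    r'(?: "(?P<cookie>(?:[^"\\]|\\.)*)")?$'
)


def parse_urchin_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = URCHIN_LINE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    '64.40.51.27 www.urchin.com - [28/Aug/2002:15:11:01 -0700] '
    '"GET /images/urchin_header_logo.gif HTTP/1.1" 200 3017 '
    '"http://www.urchin.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" '
    '"__utma=171060324.1378004559.1063331913.1063334677.1063521838.3; '
    '__utmb=171060324; __utmc=171060324"'
)

fields = parse_urchin_line(sample)
# The cookie entry is None when the line was written without cookie logging.
print(fields["vhost"], fields["status"], fields["cookie"] is not None)
```

A result of None for a freshly written line suggests that the LogFormat entry was mistyped or that the CustomLog entry still refers to a different format nickname.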

Microsoft Internet Information Server (IIS)

Note: Microsoft IIS uses a W3C logging format.

Urchin can provide very basic reporting if your IIS log files have, at the very least, the following fields:

  • Date
  • Time
  • C-IP
  • CS-URI-Stem
  • SC-Status
  • SC-Bytes

These are required fields. Without them you will not get meaningful reporting. However, this minimal logging does not provide enough information for Referral and Browser reporting. Therefore it is advisable to set more detailed logging properties for your IIS server.
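
W3C extended log files declare their columns in a #Fields: directive near the top of the file, so the presence of the required fields can be checked without reading any data lines. The sketch below is illustrative only: the function name missing_required_fields and the header lines are hypothetical, and the comparison is case-insensitive because the list above is capitalized while log files normally use lowercase names.

```python
# Required fields from the list above, in the lowercase form used in log headers.
REQUIRED_FIELDS = {"date", "time", "c-ip", "cs-uri-stem", "sc-status", "sc-bytes"}


def missing_required_fields(log_lines):
    """Return the required fields absent from the first #Fields: directive.

    Returns None if the lines contain no #Fields: directive at all.
    """
    for line in log_lines:
        if line.startswith("#Fields:"):
            logged = {name.lower() for name in line[len("#Fields:"):].split()}
            return REQUIRED_FIELDS - logged
    return None


# Hypothetical header lines, for illustration only.
header = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-uri-stem sc-status",
]
print(sorted(missing_required_fields(header)))
```

An empty result means all six required fields are being logged; any names returned need to be enabled before Urchin can produce meaningful reports.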

IIS logging properties are configured either separately for each domain on the server, or globally. For servers with more than a few domains, the global option is recommended. The following steps will ensure that the required log file fields are being recorded. If you elect to log additional fields, Urchin will simply ignore them at processing time. However, logging unneeded fields increases the size of your log files, so it is best to log only the fields needed by Urchin.

  1. Launch the IIS services management tool by going to Start->Programs->Administrative Tools->Computer Management.
  2. Expand the Services and Applications tree, then select Internet Information Services, which should bring up a list of websites (except on Windows Server 2003, which requires that you further expand the Web Sites folder to get a listing of sites).
  3. Right-click on the entry for the site you want to modify and select Properties.
  4. Select the Web Site tab and, in the section at the bottom of this screen, verify that the Enable Logging checkbox is checked. Then from the Active Log Format dropdown menu choose W3C Extended Log File Format.
  5. Click on the Properties button next to the Active Log Format box.
  6. Select the Extended Properties tab.
  7. Check the boxes for the following fields:
      Date [ date ]
      Time [ time ]
      Client IP Address [ c-ip ]
      User Name [ cs-username ]
      Method [ cs-method ]
      URI Stem [ cs-uri-stem ]
      URI Query [ cs-uri-query ]
      Protocol Status [ sc-status ]
      Bytes Sent [ sc-bytes ]
      User Agent [ cs[User-Agent] ]
      Referer [ cs[Referer] ]
      Cookie [ cs[Cookie] ] (This field is only required for UTM tracking)
  8. Make sure the Process Accounting box is unchecked, as it does not provide useful web access activity information.
  9. Select Apply and OK on each window to save your settings.
  10. It is not necessary to restart IIS. Your logs should immediately begin logging according to the new settings.
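
To confirm that the new settings took effect, the newest log file can be read back and each data line paired with the column names from its #Fields: directive. The Python sketch below shows one way to do this. The function name parse_w3c_log and the log excerpt are hypothetical, and the sketch assumes that the log header writes the header-based fields with parentheses, for example cs(User-Agent), and that spaces inside values are written as + signs.

```python
def parse_w3c_log(lines):
    """Yield one dict per data line, keyed by the active #Fields: names."""
    field_names = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # A new directive replaces the column list for the lines after it.
            field_names = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and field_names:
            values = line.split(" ")
            # Skip lines whose column count does not match the directive.
            if len(values) == len(field_names):
                yield dict(zip(field_names, values))


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-username cs-method cs-uri-stem cs-uri-query "
    "sc-status sc-bytes cs(User-Agent) cs(Referer) cs(Cookie)",
    "2002-08-28 22:11:01 64.40.51.27 - GET /images/urchin_header_logo.gif - "
    "200 3017 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0) "
    "http://www.urchin.com/ "
    "__utma=171060324.1378004559.1063331913.1063334677.1063521838.3;"
    "+__utmb=171060324;+__utmc=171060324",
]

for entry in parse_w3c_log(sample_log):
    print(entry["cs-uri-stem"], entry["sc-status"], entry["cs(Referer)"])
```

If the referer, user agent, or cookie keys are missing from the parsed entries, the corresponding boxes in step 7 were probably not checked for that site.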