Introduction
Urchin can process virtually any log file format. Once you provide the necessary information about the log format, Urchin reads and parses the raw data according to your configuration. This article describes a step-by-step method for creating custom log formats. Once a custom format is created, it can be selected in the Administration Interface as the format for any log file.
Certain log data is required in order to create certain reports, so make sure your log contains the minimum required fields for Urchin to process it. If you are unsure, see the Log Files - Logging - Other Webservers document in the Urchin Administration section of the Documentation Center.
Creating Custom Formats
A custom log format is created by creating a format specification file in the
[urchin install dir]/lib/custom/logformats/
folder. A sample file called 'custom.lf' is provided. Multiple custom files can be created; each one needs the '.lf' extension for Urchin to recognize it. The default built-in log formats such as apache, w3c, netscape, etc. are located in
[urchin install dir]/lib/reporting/logformats/
In each of the above directories, the fieldlist.txt file lists the available fields. The custom folder holds the custom fields list and the reporting folder holds the standard fields list. Custom log formats can refer to fields in either list by using the field id numbers. Once a custom format is created, it is available for selection in the Administration Interface. Here are the basic steps for creating a custom format:
Step 1: Copy the example
In the 'lib/custom/logformats' folder of the Urchin
distribution, make a copy
of the 'custom.lf' file. The new filename should not use
spaces, and it needs
the '.lf' extension.
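The copy in this step can also be scripted. The following is a minimal Python sketch and is not part of Urchin: the function names are made up for illustration, and the folder path you pass in is whatever your own installation uses.

```python
import shutil
from pathlib import Path


def is_valid_format_name(filename: str) -> bool:
    """Check the two naming rules from Step 1: no spaces, '.lf' extension."""
    return " " not in filename and filename.endswith(".lf") and filename != ".lf"


def copy_sample_format(logformats_dir: str, new_name: str) -> Path:
    """Copy custom.lf to new_name inside the same folder (hypothetical helper)."""
    if not is_valid_format_name(new_name):
        raise ValueError(f"invalid format file name: {new_name!r}")
    folder = Path(logformats_dir)
    target = folder / new_name
    if target.exists():
        # Refuse to overwrite a format file that is already there.
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(folder / "custom.lf", target)
    return target
```

Calling copy_sample_format with the path of your 'lib/custom/logformats' folder and a name such as 'myserver.lf' (a placeholder name) produces the new file to edit in the next step.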
Step 2: Set the primary positions
Edit the file created in the above step. The file contains
a lot of useful
information about each variable. The first step is to set
the PrimaryPositions
variable. This comma separated list identifies each field
(by id number)
in the log format and its relative position. Use the
fieldlist.txt files in
the two directories mentioned above to determine which field
ids to use. If
you have custom date and time formats that are different
from Apache and IIS
formats, use fields 16 and 17 respectively. If the date and
time are together
in one field, just use field 16.
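To make the idea of a positional field list concrete, here is a small Python sketch of how a comma separated list of field ids could be lined up against the columns of one log line. It only illustrates the concept and does not describe how Urchin is implemented. Ids 16 and 17 are the custom date and time fields named above; the ids 101 and 102 and the sample log line are hypothetical, and real ids must come from the fieldlist.txt files.

```python
def parse_positions(primary_positions: str) -> list[int]:
    """Turn a comma separated id list such as '16,17,101' into integers."""
    return [int(part) for part in primary_positions.split(",") if part.strip()]


def map_fields(line: str, positions: list[int], separator: str = " ") -> dict[int, str]:
    """Pair each column of a log line with the field id at the same position."""
    columns = line.split(separator)
    if len(columns) != len(positions):
        raise ValueError(f"expected {len(positions)} columns, found {len(columns)}")
    return dict(zip(positions, columns))


# Hypothetical example: date, time, then two made-up field ids.
positions = parse_positions("16,17,101,102")
record = map_fields("2004-03-01 12:30:45 /index.html 200", positions)
```

The column count check reflects why the order of the list matters: each id describes the column at the same relative position in the log line.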
Step 3: Check the fields separator
Check the FieldSeparator1 variable, and set this to the
separator between
your fields. Use \s for a space and \t for a tab.
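As an illustration of the \s and \t notation, the Python sketch below converts a separator written that way into the literal character and splits a line with it. The function name and the sample line are assumptions made for this example; Urchin performs this step internally.

```python
def decode_separator(value: str) -> str:
    """Translate the escaped notation (\\s, \\t) into the real character."""
    return value.replace("\\s", " ").replace("\\t", "\t")


# A tab separated sample line (made up for illustration).
sample = "2004-03-01\t12:30:45\t/index.html"
fields = sample.split(decode_separator("\\t"))
```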
Step 4: Is the HTTP status field available?
If the HTTP status field is available, then leave the
StatusRequired variable
set to YES. This will separate valid and error hits
appropriately. If there
is no status field in the log, then set this to NO, so that all
hits will be
considered valid.
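The effect of StatusRequired can be pictured with the following Python sketch. It is a conceptual illustration only: the rule that codes of 400 and above count as errors, and the handling of a missing status, are assumptions made for the example rather than Urchin's documented behavior, and the function name is hypothetical.

```python
from typing import Optional


def classify_hit(status: Optional[str], status_required: bool) -> str:
    """Return 'valid' or 'error' for one hit.

    With status_required False, the status is ignored and every hit is valid.
    With status_required True, codes >= 400 are treated as errors
    (an assumed threshold, chosen for illustration).
    """
    if not status_required:
        return "valid"
    if status is None or not status.isdigit():
        # Assumption: a missing or non-numeric status counts as an error.
        return "error"
    return "error" if int(status) >= 400 else "valid"
```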
Step 5: Are you using a custom Date/Time format?
If so, then edit the CustomDateFormat for field 16 and the
CustomTimeFormat
for field 17. The format is specified using the % variables
defined in the
Custom Date Format article later in this section of the
documentation.
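Before writing a custom date or time pattern, it can help to confirm that you have read the timestamp layout in your log correctly. The Python sketch below does this with the standard library's strptime. Python's % directives are used here only as a stand-in; Urchin's own % variables are defined in the Custom Date Format article and may differ. The sample timestamps and patterns are made up.

```python
from datetime import datetime


def matches_pattern(sample: str, pattern: str) -> bool:
    """Return True if sample parses cleanly under a Python strptime pattern."""
    try:
        datetime.strptime(sample, pattern)
        return True
    except ValueError:
        return False


# Separate date and time columns (fields 16 and 17 in the text above).
date_ok = matches_pattern("01/03/2004", "%d/%m/%Y")
time_ok = matches_pattern("12:30:45", "%H:%M:%S")
# Date and time together in one column (field 16 only).
combined_ok = matches_pattern("2004-03-01 12:30:45", "%Y-%m-%d %H:%M:%S")
```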
Step 6: Check the other variables
Most of the other variables will be OK for most custom log
formats, but
check the comments provided in the file on each variable to
see if it
applies to your situation.
Step 7: Do you have custom calculated fields?
In addition to the format, you can specify custom calculated
fields in the
specification file. An example with comments is provided at
the bottom
of the file. The custom calculated field works the same way
as the
Advanced Filter in the filtering section except that custom
calculated
fields are processed first. Please see the article on
custom calculated
fields for more information.
Save the file and you are ready to assign it as the format for a log file in the Administration Interface. It will automatically show up as one of the format options in the pull-down menu for log formats.