Filtering Overview

This article was authored by Panalysis, an Urchin Software Authorized Consultant.

Filtering is an important concept in Urchin 6. Filters allow you to control how data is shown in Urchin's reports. A filter can be as simple as including or excluding data, or it can apply complex modifications to the data.

Typical uses of filters include:

  • Excluding your own activities from the website.
  • Creating reports that are restricted to a specific area or directory in the website.
  • Adding the domain name to the Top Content report.
  • Adding the URL query parameters to the Top Content report.

Filters work by applying one of five possible actions to a specific field in the log file or Urchin reports. The possible actions are:

  • Exclude – Removing data from the reports.
  • Include – Limiting the data shown in reports to specific items.
  • Search & Replace – Modifying the data by replacing it with other data.
  • Lookup Table – Adding content to the reports based on an external table of data. E.g. converting ?id=1 into ?id=My Home Page
  • Advanced – Modifying the data by extracting and combining data.
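The Lookup Table action can be illustrated with a short Python sketch. This is not Urchin's implementation or configuration format; the table contents, the function name apply_lookup and the example URLs are hypothetical, chosen only to mirror the ?id=1 example in the list of actions.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical lookup table: maps an id value to a readable page name.
LOOKUP = {"1": "My Home Page", "2": "Contact Us"}

def apply_lookup(url, table, param="id"):
    """Replace the value of `param` in the query string with its table entry.

    Values with no entry in the table are left unchanged.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    rewritten = [(k, table.get(v, v) if k == param else v) for k, v in pairs]
    query = "&".join(f"{k}={v}" for k, v in rewritten)
    return parts.path + ("?" + query if query else "")

print(apply_lookup("/page.php?id=1", LOOKUP))  # /page.php?id=My Home Page
print(apply_lookup("/page.php?id=9", LOOKUP))  # /page.php?id=9 (no table entry)
```

The point of the sketch is that a lookup adds information from an external table, whereas Search & Replace can only rearrange what is already in the field.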

To access filters you must be logged in as an Administrator. Once logged in, click the Configuration tab at the top right-hand corner of the screen and go to Urchin Profiles. From there you can add a filter either through the Filter Manager, or by editing or creating a Profile, opening the Filters tab and clicking the plus icon.

Filtering Sequence

In Urchin 6 you can assign a filter to either a Profile or a Log Source.

Urchin processes filters just prior to the creation of the reports. The processing sequence that Urchin uses is as follows:

Read raw log file -> Create the "Raw" log file fields -> Process the fields into the final "Auto" fields -> Process filters in profile -> Process filters in log source -> Show final report.

This means that you can only use filters to modify data that is shown in the report. As such, most filters use the fields marked "(Auto)" to modify the data shown in the final reports. There are cases where you will need the fields marked "(Raw)", but these are relatively rare.

Important: If filters of different types are used, they are applied in the following order:

  • Advanced, Search & Replace
  • Lookup Table
  • Exclude and Include

If multiple filters of the same type are specified, they are processed in the order in which they were linked to the log sources via the admin interface (or directly in the database). Each filter only receives the data that remains after the previous filter has been applied.
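The ordering rule can be sketched in Python as a stable sort: filters are grouped by type priority, and filters of equal priority keep the order in which they were linked. This is only an illustration. The filter names are hypothetical, and treating the types that share a bullet (Advanced with Search & Replace, Exclude with Include) as one priority tier is an assumption about how the list should be read.

```python
# Assumed priority tiers, lower runs first. Types listed together
# in the article are given the same tier (an assumption).
TYPE_ORDER = {
    "advanced": 0,
    "search_replace": 0,
    "lookup_table": 1,
    "exclude": 2,
    "include": 2,
}

def processing_order(filters):
    """Return filters in processing order.

    `filters` is a list of (name, type) tuples in the order they were linked.
    sorted() is stable, so filters in the same tier keep their link order.
    """
    return sorted(filters, key=lambda f: TYPE_ORDER[f[1]])

# Hypothetical filters, listed in the order they were linked.
linked = [
    ("only /docs/", "include"),
    ("map ids", "lookup_table"),
    ("add domain", "advanced"),
    ("drop office IP", "exclude"),
    ("lowercase", "search_replace"),
]

for name, kind in processing_order(linked):
    print(kind, "-", name)
```

In this sketch the Include filter runs near the end even though it was linked first, so it sees URLs that the Advanced and Lookup Table filters have already rewritten.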

A common mistake is to apply two Include filters in sequence. E.g.

Filter 1: Include a directory /somedirectory/
Filter 2: Include another directory /someotherdirectory/

As a page can't reside in both directories, the resulting report will show 0 pages. The reason is that the second filter only receives data located in /somedirectory/. Because the second filter only includes the pattern /someotherdirectory/, no page remaining after Filter 1 can match it.
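The effect is easy to reproduce outside Urchin. The following Python sketch chains two include steps over a hypothetical list of page paths; the include helper and the sample paths are illustrative only, and Python's re module stands in for Urchin's own pattern matching.

```python
import re

def include(pages, pattern):
    """Keep only the pages that match the pattern, like an Include filter."""
    return [p for p in pages if re.search(pattern, p)]

# Hypothetical page paths.
pages = [
    "/somedirectory/a.html",
    "/someotherdirectory/b.html",
    "/index.html",
]

after_first = include(pages, r"^/somedirectory/")
after_second = include(after_first, r"^/someotherdirectory/")

print(after_first)   # ['/somedirectory/a.html']
print(after_second)  # []
```

Each step can only narrow what the previous one kept, so two Include filters in sequence behave like a logical AND, not an OR.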

The correct solution to this problem is to use a single filter with a Regular Expression that matches both directories. E.g. ^/(somedirectory|someotherdirectory)/.*
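A combined pattern of this kind can be checked quickly before it goes into a filter. The sketch below uses Python's re module, which is an assumption: Urchin's regular expression dialect may differ in details, although simple alternation like this is widely supported. The sample paths are hypothetical.

```python
import re

# One pattern covering both directories.
pattern = re.compile(r"^/(somedirectory|someotherdirectory)/.*")

# Hypothetical page paths.
pages = [
    "/somedirectory/a.html",
    "/someotherdirectory/b.html",
    "/somedirectoryX/c.html",
    "/index.html",
]

kept = [p for p in pages if pattern.search(p)]
print(kept)  # ['/somedirectory/a.html', '/someotherdirectory/b.html']
```

The slash after the closing parenthesis matters: it prevents the pattern from matching directories whose names merely begin with one of the two names, such as /somedirectoryX/.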

Hint: Before adding a filter to an Urchin Profile, test its pattern in the relevant report using the Filter Field at the top of the report. E.g. test a URL pattern in the Content Optimization -> Content Performance -> Top Content report.