Many sites today use CGI, ASP, or another scripting mechanism to provide dynamic content. Often, a single script is used to deliver multiple pages of information. While this can be a handy way to track user sessions or provide "live" content, it poses an additional challenge for meaningful reporting.
By default, Urchin strips all the parameters associated with a page request (e.g. those that would typically be used with a CGI or ASP) and stores only the pathname of the page requested in its database. The DynamicURL filtering feature allows you to use regular expressions to selectively capture these parameters and present them in an intuitive way.
As an example, a CGI script might be used to deliver information about all products in a catalog. The script draws from a database, and uses parameters passed through the request to determine which product to display. The resulting hit in the webserver log for this request might look like:
/cgi-bin/showProduct.cgi?sessionId=123456789&productId=knobs
|______________________| |_________________________________|
Under normal operation, Urchin will record that the showProduct.cgi page was requested, and the "?" and all parameters that follow it will be stripped. By using a DynamicURL filter, Urchin can store some or all of the parameters and produce a unique page record based on the parameter list.
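The default behavior described above can be illustrated with a minimal Python sketch. This is not Urchin's own code; the function name strip_query is hypothetical and only models the idea that the pathname is kept while the query string is discarded.

```python
def strip_query(request: str) -> str:
    """Return the path portion of a request, dropping '?' and everything after it."""
    # partition() splits on the first '?'; requests without one come back unchanged.
    path, _, _ = request.partition("?")
    return path


request = "/cgi-bin/showProduct.cgi?sessionId=123456789&productId=knobs"
print(strip_query(request))  # prints /cgi-bin/showProduct.cgi
```

With this behavior, every product in the catalog collapses into the single page record for showProduct.cgi.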
Now in this example, we don't necessarily want to capture the entire second part of the request because of the "sessionId." Let's assume that this parameter changes for each visit and we get 30,000 visits per day. Including this piece of information would create far too many unique pages and render the Pages reporting useless. Instead we just want to capture the "productId" and report only on that information.
We may still want to know which script was used as well as which product was implicated in the request. By using a DynamicURL filter, we can capture multiple parts of the request and recombine them into a new, formatted request ready for reporting. Here is an example of a filter that could be used with the page request above:
This regular expression will match the above request no matter what the value of the sessionId or productId was. The parentheses capture the parts of the request that we want to keep for reporting. The effective request of the above example would look like:
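The capture-and-recombine idea can be sketched in Python as follows. This is an illustration under stated assumptions, not Urchin's filter syntax: the pattern, the rewrite function, and the "path/parameter=value" output format are all hypothetical, with the output format modeled on the report entries shown in the examples further down.

```python
import re

# Hypothetical pattern: group 1 keeps the script path, group 2 keeps productId.
# The sessionId value is matched but not captured, so it never reaches the report.
PATTERN = re.compile(
    r"^(/cgi-bin/showProduct\.cgi)\?sessionId=[^&]*&(productId=[^&]*)$"
)


def rewrite(request: str) -> str:
    """Recombine the captured parts into a single page record."""
    match = PATTERN.match(request)
    if match is None:
        # Requests that do not match are left unmodified.
        return request
    script, product = match.groups()
    return f"{script}/{product}"


print(rewrite("/cgi-bin/showProduct.cgi?sessionId=123456789&productId=knobs"))
# prints /cgi-bin/showProduct.cgi/productId=knobs
```

Because the sessionId is not captured, two visits with different session IDs for the same product produce the same page record.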
Up to 5 sets of parentheses can be used, and multiple filters can be applied. If a request does not match the DynamicURL filter, it is left unmodified, but still included in the reporting. This allows you to use multiple DynamicURL filters for each area of a site. Keep in mind there is a slight performance hit for each filter used.
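A minimal Python sketch of several filters working side by side is shown below. It rests on assumptions the text does not confirm: the first matching filter wins, the search.cgi script is hypothetical, and make_filter and apply_filters are illustrative names rather than Urchin settings. Only the limit of 5 capture groups and the "unmatched requests are left unmodified" rule come from the text.

```python
import re

MAX_GROUPS = 5  # limit on sets of parentheses stated in the text


def make_filter(pattern: str, template: str):
    """Compile a filter and reject patterns with more than MAX_GROUPS capture groups."""
    compiled = re.compile(pattern)
    if compiled.groups > MAX_GROUPS:
        raise ValueError(f"at most {MAX_GROUPS} capture groups are allowed")
    return compiled, template


# One hypothetical filter per area of the site. (?:.*&)? skips any leading
# parameters without capturing them.
FILTERS = [
    make_filter(r"^(/cgi-bin/showProduct\.cgi)\?(?:.*&)?(productId=[^&]*)", r"\1/\2"),
    make_filter(r"^(/cgi-bin/search\.cgi)\?(?:.*&)?(keyword=[^&]*)", r"\1/\2"),
]


def apply_filters(request: str) -> str:
    """Return the rewritten request, or the request unchanged if nothing matches."""
    for compiled, template in FILTERS:
        match = compiled.match(request)
        if match:
            # Assumption: the first matching filter wins.
            return match.expand(template)
    return request
```

Every request is tested against each pattern until one matches, which is one way to picture why each additional filter carries a small performance cost.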
Note that DynamicURL filters can only be applied to the base URL and query string that form the page request. They cannot be used to filter referrals or any other fields in the log file. Also, when DynamicURLs and FilterIn/FilterOut are used together, the DynamicURL will be applied after the other filters, so consider how one set of filters affects the others when choosing what to filter.
Example 1: We want to capture all the specific Knowledgebase article IDs in the Urchin 4 report for help.urchin.com. Here's a sample of what the Request portion of the hit looks like in the log file:
1. /knowledge.cgi            1,081   46.43%
2. /knowledge.cgi/id=767       244   10.48%
3. /knowledge.cgi/id=807       136    5.84%
4. /knowledge.cgi/id=768        50    2.15%
5. /knowledge.cgi/id=777        40    1.72%
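To show how report rows like these could arise, here is a hedged Python sketch that tallies page records. The request strings, the "cat" parameter, and the page_record function are hypothetical; they are not taken from the help.urchin.com logs, and the counts are not meant to reproduce the figures above.

```python
import re
from collections import Counter

# Hypothetical filter: keep the script path and a numeric article id.
ID_FILTER = re.compile(r"^(/knowledge\.cgi)\?(?:.*&)?(id=\d+)")


def page_record(request: str) -> str:
    """Rewrite requests carrying an id; otherwise fall back to the bare path."""
    match = ID_FILTER.match(request)
    if match:
        return f"{match.group(1)}/{match.group(2)}"
    # No id present: default behavior, the query string is dropped.
    return request.partition("?")[0]


# Made-up requests for illustration only.
requests = [
    "/knowledge.cgi?id=767",
    "/knowledge.cgi?id=767",
    "/knowledge.cgi?id=807",
    "/knowledge.cgi",
    "/knowledge.cgi?cat=faq",
]

report = Counter(page_record(r) for r in requests)
for page, hits in report.most_common():
    print(page, hits)
```

Requests without an article ID still count toward the plain /knowledge.cgi entry, which is consistent with that entry remaining the largest row in the report.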
Example 2: We want to capture all the search keywords used in the Urchin 4 report for help.urchin.com. Here's a sample of what the Request portion of the hit looks like in the log file:
1. /knowledge.cgi                        1,373   68.65%
2. /knowledge.cgi/keyword=utm               29    1.45%
3. /knowledge.cgi/keyword=default+page      18    0.90%
4. /knowledge.cgi/keyword=no+referral       11    0.55%
5. /knowledge.cgi/keyword=scheduler         10    0.50%