udb-sanitizer: Database Maintenance Utility

Overview

The Urchin Database Maintenance Utility, udb-sanitizer, provides a means for checking the Urchin 6 profile databases and performing various maintenance operations on these databases.

The types of operations that udb-sanitizer can perform are:

  • Roll back databases to a previous saved backup state
  • Delete profile data for a day, multiple days, or an entire month

Usage

udb-sanitizer is located in the util directory of the Urchin 6 distribution.

Usage of the utility is as follows:

  udb-sanitizer [-h] (prints usage message and exits)

  udb-sanitizer [-v] (prints version and exits)

  udb-sanitizer -p profile [-a account] [-d YYYYMM[DD]] [-bfhprq] [-z [-e DD]]

where:

   -b  go directly to rollback option

   -d  specifies year-month and optionally the day to operate on

   -e  with the -z and -d options, zero multiple days (from the -d day through the -e day) in the same month

   -f  force action to occur without confirmation

   -h  print this help information

   -p  specifies name of profile (required)

   -a  specifies name of account

   -r  go directly to remove option

   -q  quiet mode, suppress output except for critical user confirmation

   -z  go directly to zero-day option 

Note: When udb-sanitizer is called with options that do not completely describe what action to take, it displays the usage text (equivalent to the -h option). To perform an action without any user interaction, combine the -f and -q options with the applicable -d, -b, -r, or -z options.

Operation

In normal operation, udb-sanitizer is invoked from a command shell with the options that apply to the planned type of operation.

The actions associated with the options presented above are:

1. Data rollback (-b)

  • This option allows the user to revert a profile to previously archived data. The user is presented with a list of ZIP archive backups to choose from, and the contents of the selected archive replace the existing reporting database. The ZIP archives are named using the convention "YYYYMM-backupv6-YYYYMMDDHHMMSS.zip", where the first YYYYMM is the month of data being backed up (e.g. 200803 refers to March 2008) and the YYYYMMDDHHMMSS portion is the timestamp of when the ZIP archive was created. This timestamp helps in determining which ZIP archive to roll back to.
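
When several archives exist, it can help to split each file name into its two parts before choosing one. The following is a minimal sketch that relies only on the naming convention described above; the function name parse_backup_name and the sample file name (including its timestamp) are made up for illustration and are not part of Urchin.

```shell
# Hypothetical helper (not part of Urchin): split a backup archive name of
# the form YYYYMM-backupv6-YYYYMMDDHHMMSS.zip into the month of data it
# holds and the time the archive was created.
parse_backup_name() {
    name=$(basename "$1")
    d2='[0-9][0-9]'    # glob pattern for two digits
    case "$name" in
        # 6 digits, the literal "-backupv6-", 14 digits, ".zip"
        $d2$d2$d2-backupv6-$d2$d2$d2$d2$d2$d2$d2.zip)
            ;;
        *)
            echo "not a backup archive name: $name" >&2
            return 1
            ;;
    esac
    month=${name%%-*}      # everything before the first "-": YYYYMM
    stamp=${name##*-}      # everything after the last "-": YYYYMMDDHHMMSS.zip
    stamp=${stamp%.zip}    # drop the extension
    echo "data_month=$month created=$stamp"
}

# Made-up example file name that follows the documented convention.
parse_backup_name "200803-backupv6-20080415093000.zip"
```

For the sample name, the function reports the data month 200803 and the creation timestamp 20080415093000. Names that do not match the convention are rejected with a non-zero return code.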

2. Delete monthly data (-r)

  • All data for a particular profile for the specified month is removed. This option is useful for zeroing out a month's statistics when the data is incorrect, e.g. the wrong filters were applied or the wrong logs were processed. It is also useful when advanced profile parameters such as the click path depth or referral level have changed and that month's Urchin reporting data should be updated to reflect the change. This action can be performed without user interaction by invoking udb-sanitizer with the -f, -q and -r arguments, with -d giving the month, e.g.

udb-sanitizer -f -q -r -d 200309 -p mysite.com

3. Zero data for one or more days (-z)

  • This option zeroes out data for a selected range of days within one month, so that Urchin log processing can be rerun for those days only (e.g. urchin -p profile -d YYYYMMDD). To zero out a single day without user interaction, invoke udb-sanitizer with the -f, -q, -z and -d arguments, e.g.

udb-sanitizer -f -q -z -d 20030907 -p mysite.com
  • and for multiple days by including the -e argument as well to specify an end day, e.g.

udb-sanitizer -f -z -d 20030907 -e 10 -p mysite.com
  • which will zero out data for September 7th through the 10th. Note that the -e argument expects only the ending day of the month. This is more efficient than invoking multiple instances of udb-sanitizer to zero out a single day at a time, as the database indexes and headers are checked only once. The index/header check can take a noticeable amount of time on profiles with a lot of data.
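
Because zeroing cannot be undone, it may be worth checking a day range before running it unattended. The sketch below validates the arguments and prints the command line instead of executing it. The function name zero_range_cmd is hypothetical; the options it emits are the ones documented above.

```shell
# Hypothetical wrapper (not part of Urchin): validate a day range and print
# the udb-sanitizer command that would zero it. Nothing is executed.
zero_range_cmd() {
    profile=$1    # profile name
    start=$2      # first day, as YYYYMMDD
    end=$3        # last day, as DD (same month as the start day)
    d2='[0-9][0-9]'    # glob pattern for two digits
    case "$start" in
        $d2$d2$d2$d2) ;;
        *) echo "start must be YYYYMMDD" >&2; return 1 ;;
    esac
    case "$end" in
        $d2) ;;
        *) echo "end must be DD" >&2; return 1 ;;
    esac
    start_day=${start#??????}    # last two digits of YYYYMMDD
    # Strip one leading zero so values such as 08 compare as plain decimal.
    if [ "${end#0}" -lt "${start_day#0}" ]; then
        echo "end day precedes start day" >&2
        return 1
    fi
    echo "udb-sanitizer -f -q -z -d $start -e $end -p $profile"
}

zero_range_cmd mysite.com 20030907 10
```

Printing rather than running the command is a deliberate choice here: the output can be reviewed first and then pasted into a shell once the range is confirmed.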

4. If the profile is part of an account, specify the account name using the -a argument.

Important: Actions that delete daily or monthly data cannot be undone! The only recourse is to reprocess the webserver logs for that time period to repopulate the profile databases. Use these options with care.
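
After a range of days has been zeroed, each day has to be reprocessed to repopulate the profile. As a rough sketch, the loop below prints one reprocessing command per day, using the urchin -p profile -d YYYYMMDD form quoted in the zero-day description. The function name reprocess_cmds is hypothetical, and the commands are only printed, not run.

```shell
# Hypothetical helper (not part of Urchin): print one "urchin" reprocessing
# command per day in a range. Nothing is executed.
reprocess_cmds() {
    profile=$1    # profile name
    month=$2      # YYYYMM
    first=$3      # first day as a plain integer without leading zero, e.g. 7
    last=$4       # last day as a plain integer without leading zero, e.g. 10
    day=$first
    while [ "$day" -le "$last" ]; do
        # %02d pads single-digit days to two digits (7 becomes 07)
        printf 'urchin -p %s -d %s%02d\n' "$profile" "$month" "$day"
        day=$((day + 1))
    done
}

reprocess_cmds mysite.com 200309 7 10
```

For the range zeroed in the earlier example (September 7th through the 10th), this prints four commands, one for each of 20030907 through 20030910.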