This article describes some of the terms used when Urchin processing ends under error conditions, and explains how to recover using the sanitizer utility.
Hard close, soft close, stop button
When an error condition occurs during processing, Urchin will attempt to properly close the Urchin database for the currently running Profile. The first attempt is called a 'soft close', which properly flushes all of the buffers and updates headers and indexes so that data is complete. As of Urchin 4.1, the soft close will update the log tracking as well, so that processing can continue where it left off. A successful 'soft close', indicated by the word 'done', results in a stable and clean database.
If the soft close fails or an unrecoverable error occurs, Urchin will 'hard close' the database, which flushes the I/O buffers but does not complete any writes the Urchin process had in progress. After a hard close, the database is left in an unknown state and should be checked with the sanitizer.
Clicking the available Stop Button causes a soft close on UNIX, but kills the process on Windows. When the process is killed this way on Windows, Urchin exits immediately and the database is left in an unknown state, similar to a hard close. Killing the process from a command line on UNIX is effectively a hard close.
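The cases above can be summarized as a small lookup. The following is a minimal Python sketch and not part of Urchin; the dictionary keys, labels, and function name are hypothetical and chosen only for illustration. It assumes that a soft close triggered by the Stop Button actually succeeds.

```python
# Illustrative only: summarizes the close behavior described above.
# The keys and labels are hypothetical, not Urchin identifiers.
CLOSE_BEHAVIOR = {
    ("unix", "stop_button"): "soft close",
    ("windows", "stop_button"): "process killed (similar to a hard close)",
    ("unix", "kill_command"): "hard close",
}


def needs_sanitizer(platform, action):
    """Return True when the database may be left in an unknown state."""
    outcome = CLOSE_BEHAVIOR[(platform, action)]
    # Only a successful soft close leaves a stable and clean database.
    return outcome != "soft close"
```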
Using the udb-sanitizer utility
The udb-sanitizer utility, located in the 'util' folder within the Urchin installation, can perform five recovery functions in addition to checking the integrity of the data. Access the udb-sanitizer with a Command Prompt or other command tool. Change directories into the 'util' folder within Urchin and run:
udb-sanitizer -p (profile) [-d YYYYMM]
where (profile) is the name of the profile you wish to examine, and YYYYMM is the year/month of the data you wish to examine. If the date is not provided, all data will be examined for the profile.
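If the command is assembled by a script, it can help to validate the date before calling the utility. The following is a minimal Python sketch that only builds the argument list and does not run anything. The function name and the profile name "example_profile" are hypothetical; only the -p and -d options come from the usage shown above, and the executable name may differ by platform.

```python
import re


def build_sanitizer_command(profile, month=None, executable="udb-sanitizer"):
    """Build the argument list for udb-sanitizer without running it."""
    if not profile:
        raise ValueError("a profile name is required")
    args = [executable, "-p", profile]
    if month is not None:
        # -d takes the year/month as YYYYMM, e.g. 200503 for March 2005.
        if not re.fullmatch(r"\d{4}(0[1-9]|1[0-2])", month):
            raise ValueError("month must be in YYYYMM form")
        args += ["-d", month]
    return args


# "example_profile" is a placeholder, not a real profile.
print(build_sanitizer_command("example_profile", "200503"))
# prints ['udb-sanitizer', '-p', 'example_profile', '-d', '200503']
```

Omitting the month leaves out -d, so all data for the profile is examined. Because the sanitizer presents its recovery options after the checks, it is probably best launched from an interactive terminal in the 'util' folder rather than from an unattended script.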
The sanitizer will output the results of the initial checks, including the state of the header and data records. Note that even if all the checks pass with an 'ok', there could still be problems with the indexes. It is recommended to rebuild the indexes (after fixing other issues) any time a hard close occurs.
The sanitizer will provide you with five options:
- Rollback data to state before last run
- Delete this month entirely
- Rebuild header to match data
- Rebuild indexes
- Zero out one day
The second and fifth options (delete this month entirely, zero out one day) delete data. After using either of them, you will most likely need to temporarily turn off log tracking while reprocessing the affected date ranges. The log tracking information is not deleted, so unless log tracking is disabled, Urchin will think that it has already processed this data.
The rebuild header option should be used if the Header check fails, and you do not wish to roll back to a previous state. This will align the Header with the actual data records in the database.
The rebuild index option can be run to rebuild the database indexes. If any index errors occur, it is recommended to run this option.
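The guidance above on choosing an option can be sketched as a small decision helper. This is a hypothetical Python illustration, not part of the sanitizer; the function and parameter names are assumptions, and the returned strings simply repeat the option names listed earlier.

```python
def recommend_steps(header_ok, index_errors, hard_close, allow_rollback=False):
    """Suggest sanitizer options, in order, based on the check results."""
    steps = []
    if not header_ok:
        # Rebuild the header unless rolling back to the previous state is acceptable.
        if allow_rollback:
            steps.append("Rollback data to state before last run")
        else:
            steps.append("Rebuild header to match data")
    if index_errors or hard_close:
        # Indexes are rebuilt last, after other issues have been fixed.
        steps.append("Rebuild indexes")
    return steps
```

For example, a hard close where every check reports 'ok' still yields a single recommended step, rebuilding the indexes.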