Custom E-commerce Logs

Overview

Many shopping carts can capture and log valuable purchase information, but they often do so in formats other than ELF or ELF2, which Urchin cannot process automatically. This article explains how to create a custom log format for your E-commerce log file if you cannot alter your shopping cart to generate ELF/ELF2. Before continuing, please read the article titled "Custom Log Formats" in the Advanced Topics->Customization section of the Document Library, which explains the creation of custom logs in detail.

E-commerce Log Format Types

Shopping carts can log information about purchases and the items purchased in either a single-line format or a multi-line format. In a single-line format, each line contains all the information needed to completely describe a transaction and the items purchased, and all lines have the same layout. In a multi-line format, multiple lines are used to describe a purchase, with one format for the transaction lines and another for the item lines. ELF/ELF2 logs are multi-line formats. Examine your E-commerce logs to determine whether the data is single-line or multi-line, as this affects how you set up your custom log format, then follow the instructions below for your type of log format.

General E-commerce Logging Requirements

Regardless of the format of the log entries your shopping cart produces, each entry must contain the date and time and at least one of the following fields to provide visitor correlation:

  • Remote Host or IP Address (for IP-Only or IP-Useragent visitor methods)
  • Useragent (for IP-Useragent visitor method)
  • Cookies (for UTM or SID visitor method)
  • Session ID (for SID visitor method)

If the fields required by your visitor method are missing, Urchin will not produce a meaningful analysis of your revenue. Urchin also defines the following E-commerce fields:

  • %{ORDERID} is the order number
  • %{STORE} is the name/id of the storefront
  • %{SESSIONID} is the unique session identifier of the customer
  • %{TOTAL} is the transaction total including tax and shipping (decimal only, no '$' characters)
  • %{TAX} is the amount of tax charged on the subtotal
  • %{SHIPPING} is the amount of shipping charges
  • %{BILL_CITY} is the billing city of the customer
  • %{BILL_STATE} is the billing state of the customer
  • %{BILL_ZIP} is the billing zip code of the customer
  • %{BILL_COUNTRY} is the billing country of the customer
  • %{PRODUCT_CODE} is the identifier of the product
  • %{PRODUCT_NAME} is the name of the product
  • %{VARIATION} is an optional variation of the product (color, size, etc.)
  • %{PRICE} is the unit price of the product (decimal only, no '$' characters)
  • %{QUANTITY} is the quantity ordered of this product
  • %{UPSOLD} is a boolean (0|1) indicating whether the product was on sale
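
The correlation requirement above can be checked before you hand a log to Urchin. The following Python sketch is only an illustration and is not part of Urchin: the dictionary keys (datetime, remote_host, useragent, cookies, session_id) are hypothetical labels for already-parsed fields, and treating either Cookies or Session ID as sufficient for the SID method is an assumption drawn from the list above.

```python
# Hypothetical mapping of visitor methods to the fields each one relies on,
# following the list above. Each method maps to one or more alternative sets.
REQUIRED_BY_METHOD = {
    "IP-Only": [{"remote_host"}],
    "IP-Useragent": [{"remote_host", "useragent"}],
    "UTM": [{"cookies"}],
    "SID": [{"cookies"}, {"session_id"}],  # assumption: either one is enough
}


def entry_is_usable(fields, method):
    """Return True if the entry has a date/time plus the fields for `method`."""
    # Treat empty strings and "-" placeholders as missing values.
    present = {name for name, value in fields.items()
               if value not in (None, "", "-")}
    if "datetime" not in present:
        return False
    return any(needed <= present for needed in REQUIRED_BY_METHOD[method])


entry = {
    "datetime": "26/Aug/2003:11:43:02 -0700",
    "remote_host": "123.123.123.123",
    "useragent": "Mozilla/4.0",
    "cookies": "",
}
print(entry_is_usable(entry, "IP-Useragent"))  # True
print(entry_is_usable(entry, "UTM"))           # False
```

A check like this only tells you whether the fields are present; it says nothing about whether Urchin will accept their formatting.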

Single-line Format Logs

Follow these instructions if your E-commerce log file contains only hits that share a single line format, as explained above.

  1. Create a new custom log format in the lib/custom/logformats directory by making a copy of the custom.lf.sample logformat file. Name your copy with a .lf suffix.
  2. Edit your new custom log format file and set the following entries based on the recommendations below:
    • PrimaryPositions: This entry specifies the order of fields in your log file. Create a comma-separated list of field ids that describes your field order. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file. See example below.
    • SecondaryPositions: Leave this as '-' since it is not used for single-line format log files.
    • PrimaryKey: Leave this as '-' since it is not used for single-line format log files.
    • SecondaryKey: Leave this as '-' since it is not used for single-line format log files.
    • PrimaryContent: Valid entries for this field are TRANSACTION or ITEM. If the hits in your log file describe the purchase of each individual product, set this to ITEM. If the hits in the log file describe the entire purchase, set this to TRANSACTION.
    • SecondaryContent: Leave this as '-' since it is not used for single-line format log files.
    • CommentKey: If some of the lines in your log file are comments or are not considered hits and begin with a specific character, enter the character here.
    • FieldSeparator1: The field separators define which characters are considered field separators. Typical entries are tabs (\t) and spaces (\s). Set these appropriately based on the characters between the fields in your log file.
    • FieldSeparator2: See FieldSeparator1 above
    • QuotesEscapeSep: This specifies whether field separators will be ignored inside a field that contains quote "" characters. This should probably be left as YES.
    • BracketsEscapeSep: This specifies whether field separators will be ignored inside a field that contains bracket [] characters. This should probably be left as YES.
    • MergSuccessiveSep: This specifies whether to consider two separator characters in a row as one separator. This can probably be left as NO.
    • CleanWhiteSpace: This specifies whether to remove white space from the ends of the fields when they are parsed. This can probably be left as NO.
    • StatusRequired: Leave this set to NO unless your hits contain web-server-style status codes.
    • CustomDateFormat: If your log contains a custom date format, set this to the strptime format string that describes the date field.
    • CustomTimeFormat: If your log contains a custom time format, set this to the strptime format string that describes the time field.
  3. Save your custom log format in the lib/custom/logformats directory.
  4. Select the custom log format for your log source in the Urchin Admin interface.
  5. Process your log file(s) with Urchin.
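
If you need to work out a strptime pattern for CustomDateFormat or CustomTimeFormat, it can help to test candidate patterns outside Urchin first. The sketch below uses Python's datetime.strptime, which accepts the common strptime directives; the sample date and time values and the two patterns are hypothetical, and Urchin's own parser may differ in which directives it supports.

```python
from datetime import datetime

# Hypothetical date and time fields as a shopping cart might log them.
date_field = "2003-08-26"
time_field = "11:43:02"

# Candidate strptime patterns for the two custom entries.
custom_date_format = "%Y-%m-%d"
custom_time_format = "%H:%M:%S"

# Parse both fields together; a ValueError here means a pattern is wrong.
parsed = datetime.strptime(
    f"{date_field} {time_field}",
    f"{custom_date_format} {custom_time_format}",
)
print(parsed.isoformat())  # 2003-08-26T11:43:02
```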

Single-line Format Example

The following example is a single hit from a log that only has transaction data.

12345 123.123.123.123 "Urchin Store" [26/Aug/2003:11:43:02 -0700] 192.73 "San Diego" "CA" 92101 "US" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)" "__utma=171060324.2734232095.1061444425.1061444425.1061444763.2"

The list below shows each field in the example, in order, alongside its id number from the lib/reporting/logformats/fieldlist.txt file. These id numbers are the values you enter in the PrimaryPositions entry of your custom log format file.
  1. Transaction ID: 25
  2. Remote Host or IP Address: 12
  3. Store Name: 26
  4. Apache Date/Time: 3
  5. Total Cost: 28
  6. Bill City: 31
  7. Bill State: 32
  8. Bill Zip: 33
  9. Bill Country: 34
  10. User Agent: 13
  11. Cookies: 14
Based on the list above, you would set the following entries in the custom logformat file:
PrimaryPositions: "25, 12, 26, 3, 28, 31, 32, 33, 34, 13, 14"
SecondaryPositions: -
PrimaryKey: -
SecondaryKey: -
PrimaryContent: TRANSACTION
SecondaryContent: -
CommentKey: #
FieldSeparator1: \s
FieldSeparator2: \t
QuotesEscapeSep: YES
BracketsEscapeSep: YES
MergSuccessiveSep: NO
CleanWhiteSpace: NO
StatusRequired: NO
CustomDateFormat: -
CustomTimeFormat: -

PrimaryPositions specifies the field order, and PrimaryContent tells Urchin that this log contains transactions (general information about purchases) rather than items. The two field separators were set to space and tab because the fields are separated by white space. No custom date/time formats were specified because the date/time is already formatted as an Apache date.
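
To see how these settings divide the example hit into fields, the following Python sketch imitates the separator and escaping options. It is an illustration only and not Urchin's parser: the function split_fields and its parameters are hypothetical, and details such as dropping the quote and bracket characters from the field values are assumptions.

```python
def split_fields(line, separators=" \t", quotes_escape=True,
                 brackets_escape=True, merge_successive=False):
    """Split a log line into fields, loosely mimicking the options above."""
    fields, current = [], []
    in_quotes = in_brackets = False
    prev_was_sep = False
    for ch in line.rstrip("\r\n"):
        if quotes_escape and ch == '"' and not in_brackets:
            in_quotes = not in_quotes  # separators inside quotes are ignored
            prev_was_sep = False
            continue
        if brackets_escape and not in_quotes and ch in "[]":
            in_brackets = (ch == "[")  # separators inside brackets are ignored
            prev_was_sep = False
            continue
        if ch in separators and not in_quotes and not in_brackets:
            if merge_successive and prev_was_sep:
                continue  # collapse a run of separators into one
            fields.append("".join(current))
            current = []
            prev_was_sep = True
            continue
        current.append(ch)
        prev_was_sep = False
    fields.append("".join(current))
    return fields


# The example hit from this article, written as one line.
line = (
    '12345 123.123.123.123 "Urchin Store" [26/Aug/2003:11:43:02 -0700] '
    '192.73 "San Diego" "CA" 92101 "US" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)" '
    '"__utma=171060324.2734232095.1061444425.1061444425.1061444763.2"'
)

# Field ids in the same order as the PrimaryPositions entry.
primary_positions = [25, 12, 26, 3, 28, 31, 32, 33, 34, 13, 14]
fields = split_fields(line)
record = dict(zip(primary_positions, fields))

print(len(fields))  # 11
print(record[26])   # Urchin Store
print(record[3])    # 26/Aug/2003:11:43:02 -0700
```

Because the quote and bracket escapes are on, the spaces inside "Urchin Store", the date/time, and the useragent do not split those fields, so the hit yields exactly the eleven fields listed in PrimaryPositions.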

Multi-line Format Logs

Urchin can read multi-line formats as long as the first character of each line identifies which format that line uses. For example, in ELF/ELF2 log files each transaction line begins with a '!' exclamation character, while the item lines do NOT have a leading '!' character.

Follow these instructions if your E-commerce log file contains two different line formats, one for the transaction and the other for product or item details.

  1. Create a new custom log format in the lib/custom/logformats directory by making a copy of the custom.lf.sample logformat file. Name your copy with a .lf suffix.
  2. Edit the new custom log format file and set the following entries based on the recommendations below:
    • PrimaryPositions: This entry specifies the order of fields in the lines identified by PrimaryKey. Create a comma-separated list of field ids that describes the field order of those lines. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file.
    • SecondaryPositions: This entry specifies the order of fields in the lines identified by SecondaryKey. Create a comma-separated list of field ids that describes the field order of those lines. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file.
    • PrimaryKey: Set the primary key to the character that identifies a log file line as having the format described by PrimaryPositions.
    • SecondaryKey: Set the secondary key to the character that identifies a log file line as having the format described by SecondaryPositions.
    • PrimaryContent: Valid entries for this field are TRANSACTION or ITEM. If the hits in your log file describe the purchase of each individual product, set this to ITEM. If the hits in the log file describe the entire purchase, set this to TRANSACTION.
    • SecondaryContent: See PrimaryContent above
    • CommentKey: If some of the lines in your log file are comments or are not considered hits and begin with a specific character, enter the character here.
    • FieldSeparator1: The field separators define which characters are considered field separators. Typical entries are tabs (\t) and spaces (\s). Set these appropriately based on the characters between the fields in your log file.
    • FieldSeparator2: See FieldSeparator1 above
    • QuotesEscapeSep: This specifies whether field separators will be ignored inside a field that contains quote "" characters. This should probably be left as YES.
    • BracketsEscapeSep: This specifies whether field separators will be ignored inside a field that contains bracket [] characters. This should probably be left as YES.
    • MergSuccessiveSep: This specifies whether to consider two separator characters in a row as one separator. This can probably be left as NO.
    • CleanWhiteSpace: This specifies whether to remove white space from the ends of the fields when they are parsed. This can probably be left as NO.
    • StatusRequired: Leave this set to NO unless your hits contain web-server-style status codes.
    • CustomDateFormat: If your log contains a custom date format, set this to the strptime format string that describes the date field.
    • CustomTimeFormat: If your log contains a custom time format, set this to the strptime format string that describes the time field.
  3. Save your custom log format in the lib/custom/logformats directory.
  4. Select the custom log format for your log source in the Urchin Admin interface.
  5. Process your log file(s) with Urchin.
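
Before choosing PrimaryKey and SecondaryKey, it can be useful to confirm that every line in your log really starts with one of the expected characters. The Python sketch below groups lines by their leading character. It is a rough illustration, not Urchin's behavior: the '+' secondary key, the '#' comment key, and the tab-separated sample lines are all hypothetical, and how Urchin treats the key character itself during field parsing is not covered here.

```python
def classify_lines(lines, primary_key="!", secondary_key="+", comment_key="#"):
    """Group log lines by their leading key character."""
    primary, secondary, skipped = [], [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(comment_key):
            skipped.append(line)    # blank lines and comments
        elif line.startswith(primary_key):
            primary.append(line)    # layout given by PrimaryPositions
        elif line.startswith(secondary_key):
            secondary.append(line)  # layout given by SecondaryPositions
        else:
            skipped.append(line)    # no recognized key: inspect these by hand
    return primary, secondary, skipped


# Hypothetical multi-line log: one transaction line followed by two item lines.
sample = [
    "# hypothetical cart log",
    "!12345\t123.123.123.123\t192.73",
    "+12345\tSKU-1\t2",
    "+12345\tSKU-2\t1",
]

primary, secondary, skipped = classify_lines(sample)
print(len(primary), len(secondary), len(skipped))  # 1 2 1
```

Any non-comment line that ends up in the skipped group suggests that the key characters you picked do not cover every line format in the log.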