Administrative API Developer’s Guide: Protocol
Introduction
The Google Search Appliance Administration API enables administrators to configure a search appliance programmatically. This API provides functions for creating, retrieving, updating, and deleting search appliance configuration settings.
The Google Search Appliance Administration API follows the principles of the Google Data APIs. Google Data APIs are based on both the Atom 1.0 and RSS 2.0 syndication formats in addition to the Atom Publishing Protocol.
The audience for this guide is XML programmers who have access to a Google Search Appliance. The user name and password for the Admin Console are required to obtain the authentication token necessary to run applications for this API.
This guide consists of the following sections:
API Operations
To use this API, send HTTP requests to a search appliance to instruct it to create, retrieve, update, or delete configuration information in the search appliance.
This section explains the different types of operations that the API supports. See also How the API Works, which identifies the URL that corresponds to each API operation.
The operations are as follows:
- Create--Operations to add a new object, such as a collection or front end. To perform any of these operations, issue an HTTP POST request with the appropriate URL. The body of the POST request is an XML document that contains information about a resource to create.
- Retrieve--Operations to request and obtain information about search appliance features. For information on the Google Data API retrieval operations, see the Google Search Appliance Administrative API Developer’s Guide: Java and Google Search Appliance Administrative API Developer’s Guide: .NET. To retrieve information about a resource, issue an HTTP GET request to the appropriate URL that identifies a resource to retrieve.
- Update--Operations to modify information about a search appliance. To update the information, issue an HTTP PUT request to the appropriate URL. The body of the PUT request is an XML document that contains information about a resource to update.
- Delete--Operations to delete objects such as a collection or a front end. To perform any of these operations, issue an HTTP DELETE request to the appropriate URL. The URL contains information that identifies a resource to delete.
The search appliance verifies that all create and update requests contain valid XML, include all required data fields, and meet authentication requirements.
Authenticating Your Google Search Appliance Account
You can send API requests over HTTPS or HTTP.
Specify an authentication token with each API request. The search appliance uses the token to authorize access to the operation that you request. Authentication tokens are available only to users who have administrative rights to the search appliance, and the tokens authorize operations only within a search appliance.
To obtain an authentication token, submit an HTTP POST request to port 8443 on a search appliance as shown in the following URL:
https://Search_Appliance:8443/accounts/ClientLogin
The following guidelines apply to the request:
- Include in the POST body a string in the following format: &Email=username&Passwd=password
  Make the following changes to this string:
  - Replace username with a user name that has an Admin Console administrator account.
  - Replace password with the password for the Admin Console account.
- The user name and password values must be URL-encoded. For example, the URL-encoded form of the AcQ.87@ password is the AcQ%2E87%40 value.
- The POST request must specify the value application/x-www-form-urlencoded for the Content-Type header.
In response to the POST request, the search appliance returns a page that contains your authentication token. The authentication token is the Auth value on that page, and you need to extract the token from the page. When you submit an API request, you must set the Content-Type and Authorization headers as follows:
Content-type: application/atom+xml
Authorization: GoogleLogin auth=your-authentication-token
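The login steps above can be sketched in Python. This is a minimal illustration, not an official client: the helper names are hypothetical, and the parser assumes the login response is a plain-text page of key=value lines, one of which starts with Auth=. Note that urlencode leaves "." unencoded, whereas the example above shows it as %2E; both forms decode to the same value.

```python
import urllib.request
from urllib.parse import urlencode


def build_login_body(username, password):
    """Return the URL-encoded POST body in the &Email=...&Passwd=... format."""
    # The leading "&" mirrors the format string shown in this guide.
    return "&" + urlencode({"Email": username, "Passwd": password})


def parse_auth_token(response_text):
    """Extract the Auth value from the login response (assumed key=value lines)."""
    for line in response_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Auth":
            return value.strip()
    raise ValueError("no Auth value found in the login response")


def api_headers(token):
    """Headers to send with every subsequent API request."""
    return {
        "Content-Type": "application/atom+xml",
        "Authorization": "GoogleLogin auth=" + token,
    }


def fetch_token(host, username, password):
    """POST the credentials to port 8443 and return the token (needs a live appliance)."""
    request = urllib.request.Request(
        "https://%s:8443/accounts/ClientLogin" % host,
        data=build_login_body(username, password).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return parse_auth_token(response.read().decode("utf-8"))
```

Only fetch_token touches the network; the other helpers are pure functions, so they can be checked without an appliance.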
How the API Works
To execute an operation using the API, submit an HTTP POST, GET, PUT, or DELETE request to the URL that corresponds to the operation that you wish to perform. Each URL includes variables that identify the resource that you are creating, retrieving, updating, or deleting. The URL pattern is as follows:
http://Search_Appliance:8000/feeds/Collection_Name/Entry
The Collection_Name and Entry values indicate a search appliance configuration. Note that all create and update requests (POST and PUT requests) also require that you submit an XML document that contains the information needed to fulfill the request. Send the content using the application/atom+xml content type. The section XML Request Formats explains the XML structures.
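As a small illustration of the URL pattern, the following Python sketch assembles a feed URL from its parts. The function name and the default port argument are illustrative assumptions; the percent-encoding of the Entry segment follows the encoded entry URLs that appear in the crawler access rule responses later in this guide.

```python
from urllib.parse import quote


def feed_url(host, collection, entry=None, port=8000):
    """Build http://host:port/feeds/Collection_Name[/Entry]."""
    url = "http://%s:%d/feeds/%s" % (host, port, quote(collection, safe=""))
    if entry is not None:
        # Encode every reserved character, so an entry that is itself a URL
        # pattern stays within a single path segment.
        url += "/" + quote(entry, safe="")
    return url
```

For example, feed_url("gsa", "config", "crawlURLs") yields the crawl URLs entry address, and omitting entry yields the address of a whole feed.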
XML Element Definitions
The following XML elements can be used in a reporting API request. The elements are listed in the order that they appear in an API request.
The ampersand (&) character must be XML-escaped as &amp; when used in <gsa:content ...> values. For example:
<gsa:content name='followURLs'>^http://my.domain.com/index.php?a=2&amp;b=1</gsa:content>
atom:feed
Definition
The <atom:feed> element encapsulates an API response to a request to retrieve all the information in one configuration collection.
Example
<atom:feed xmlns="http://www.w3.org/2005/Atom"
xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">
Child Elements
atom:id, atom:link, atom:entry
Content Format
Container
atom:entry
Definition
The <atom:entry> element encapsulates an API request or an API Atom response.
Child Elements
atom:id, gsa:content, atom:link
Content Format
Container
atom:id
Definition
The value of the <atom:id> element is a permanent, unique identifier for a feed. This element is included in API responses.
Example
<atom:id>https://gsa/feeds/config/crawlURLs</atom:id>
Child Element
Content Format
String (IRI)
atom:link
Definition
The <atom:link> tag provides an RFC 3987 IRI reference (http://www.ietf.org/rfc/rfc3987.txt) related to an API results feed or a resource in the feed.
Attributes
Name |
Format |
Description |
---|---|---|
|
Text |
The
Use an HTTP
GET request to retrieve a resource, an HTTP PUT request to update a resource, and an HTTP DELETE request to delete a resource. |
|
Text |
The |
Example
<atom:link rel="edit" type="application/atom+xml"
href="https://gsa/feeds/config/crawlURLs"/>
Parent Element
Content Format
Empty
atom:updated
Definition
The <atom:updated> tag identifies the date and time that an entry in an Atom feed was updated.
Example
<atom:updated>1970-01-01T00:00:00.000Z</atom:updated>
Parent Elements
Content Format
Date
gsa:content
Definition
The <gsa:content> tag specifies properties of the search appliance Admin Console settings. The <entry> must contain at least one <gsa:content>. The name attribute specifies the name of the property, and the content of the element holds the value of the property.
Example
<gsa:content name='crawlURLs'>http://yourdomain.com/</gsa:content>
Parent Element
Content Format
Complex
XML Request Formats
For API requests to create or update information (HTTP POST and PUT requests), the body of a request must be an XML document that provides the data necessary to complete a request.
For API requests to retrieve or delete information (HTTP GET and DELETE requests), the URL and HTTP request type specify all of the information that the search appliance needs to fulfill the request. For create and update requests, put all necessary information in <gsa:content> XML tags.
The following example updates the crawl URLs in a search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://ent1:8000/feeds/config/crawlURLs</id>
  <gsa:content name='crawlURLs'>http://yourdomain.com/</gsa:content>
  <gsa:content name='startURLs'>http://yourdomain.com/</gsa:content>
  <gsa:content name='doNotCrawlURLs'>
    http://yourdomain.com/not_allow
  </gsa:content>
</entry>
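A request body of this shape can also be generated rather than written by hand. The following Python sketch uses the standard xml.etree.ElementTree module, which escapes characters such as & automatically. The build_entry name is hypothetical, and the sketch assumes the appliance accepts any well-formed entry that uses these two namespaces.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"

# Serialize Atom as the default namespace and the appliance namespace as "gsa".
ET.register_namespace("", ATOM_NS)
ET.register_namespace("gsa", GSA_NS)


def build_entry(properties):
    """Return an Atom <entry> document with one <gsa:content> per property."""
    entry = ET.Element("{%s}entry" % ATOM_NS)
    for name, value in properties.items():
        content = ET.SubElement(entry, "{%s}content" % GSA_NS, {"name": name})
        content.text = value  # special characters are escaped on output
    body = ET.tostring(entry, encoding="unicode")
    return "<?xml version='1.0' encoding='UTF-8'?>\n" + body
```

Encode the returned string as UTF-8 before sending it, so that the bytes match the encoding named in the XML declaration.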
XML Response Formats
Depending on the API request, the search appliance Administrative API returns an XML response. The XML response is a Google Data Atom entry. The <entry> must contain at least one <gsa:content>. All search appliance related information is put in <gsa:content> XML tags. For example, the following listing defines a GSAEntry response as an XML document that contains information about the crawl URLs. The client libraries convert this XML response into a GSAEntry object.
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://ent1:8000/feeds/config/crawlURLs</id>
  <updated>2008-12-08T20:11:58.342Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlURLs'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlURLs'/>
  <gsa:content name='entryID'>crawlURLs</gsa:content>
  <gsa:content name='crawlURLs'>http://yourdomain.com/</gsa:content>
  <gsa:content name='startURLs'>http://yourdomain.com/</gsa:content>
  <gsa:content name='doNotCrawlURLs'>
    http://yourdomain.com/not_allow
  </gsa:content>
</entry>
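Without the client libraries, a response entry can be reduced to a plain dictionary. The Python sketch below is one possible way to do so; parse_entry is a hypothetical helper, and stripping surrounding whitespace from each value is an assumption based on the line-wrapped values in the examples.

```python
import xml.etree.ElementTree as ET

GSA_NS = "http://schemas.google.com/gsa/2007"


def parse_entry(xml_text):
    """Map each <gsa:content> name attribute to its (stripped) text value."""
    if isinstance(xml_text, str):
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    return {
        content.get("name"): (content.text or "").strip()
        for content in root.iter("{%s}content" % GSA_NS)
    }
```

Applied to the response above, the result contains the keys entryID, crawlURLs, startURLs, and doNotCrawlURLs.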
Content Sources
The sections that follow describe how to configure the Content Sources features of the Admin Console:
- Crawl URLs
- Data Source Feed
- Feeds Trusted IP Addresses
- Crawl Schedule
- Crawler Access Rules
- Host Load Schedule
- Freshness Tuning
- Recrawl URL Patterns
- Connector Managers
- OneBox Settings
- OneBox Modules
- Crawl Status
- Document Status
Crawl URLs
Retrieve and update crawl URLs for a search appliance using the crawlURLs entry of the config feed.
Property | Description |
---|---|
doNotCrawlURLs | Do not crawl URLs with the following URL patterns. |
followURLs | Follow and crawl only URLs with the following URL patterns. |
startURLs | Start crawling from the following URL patterns. |
Retrieving Crawl URLs
To get the crawl URLs information for a search appliance, send an authenticated GET request to the config feed URL:
http://Search_Appliance:8000/feeds/config/crawlURLs
The following example response contains the current crawl URLs values for a search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/crawlURLs</id>
  <updated>2008-12-12T07:49:32.957Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlURLs'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlURLs'/>
  <gsa:content name='entryID'>crawlURLs</gsa:content>
  <gsa:content name='startURLs'>http://www.example.com/</gsa:content>
  <gsa:content name='doNotCrawlURLs'>.xls$</gsa:content>
  <gsa:content name='followURLs'>http://www.example.com/</gsa:content>
</entry>
Updating Crawl URLs
To update crawl URLs information for a search appliance, send an authenticated PUT request to the config feed URL:
http://Search_Appliance:8000/feeds/config/crawlURLs
The following example overwrites the crawl URLs specified in the entry to update:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/crawlURLs</id>
  <gsa:content name='entryID'>crawlURLs</gsa:content>
  <gsa:content name='startURLs'>http://www.example.com/</gsa:content>
  <gsa:content name='doNotCrawlURLs'>.xls$</gsa:content>
  <gsa:content name='followURLs'>http://www.example.com/</gsa:content>
</entry>
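The authenticated GET and PUT requests described in this section can be prepared with Python's urllib.request. This is a sketch under stated assumptions: build_request is a hypothetical helper, the token comes from the ClientLogin step, and the request is only constructed here. Sending it requires passing it to urllib.request.urlopen against a reachable appliance.

```python
import urllib.request


def build_request(method, url, token, body=None):
    """Prepare an authenticated API request; body is an XML string or None."""
    headers = {
        "Content-Type": "application/atom+xml",
        "Authorization": "GoogleLogin auth=" + token,
    }
    data = body.encode("utf-8") if body is not None else None
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Retrieve the current values, then overwrite them with a new entry document.
CRAWL_URLS = "http://Search_Appliance:8000/feeds/config/crawlURLs"
get_request = build_request("GET", CRAWL_URLS, "your-authentication-token")
put_request = build_request("PUT", CRAWL_URLS, "your-authentication-token",
                            body="<entry>...</entry>")  # placeholder body
```

The same helper covers POST and DELETE, because only the method name and the presence of a body change between operations.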
Data Source Feed
Retrieve, delete, and destroy data source feed information for a search appliance using the feed feed. The Google Search Appliance supports an interface known as the “feeds interface,” which is different from a Google Data API feed. To differentiate between these terms, the feeds interface on the search appliance is referred to as a data source feed. For more information on data source feeds, see the Feeds Protocol Developer’s Guide.
Parameter |
Description |
---|---|
|
To get all feed information, this parameter is the feed data source ( |
|
The first log line to retrieve. The default value is line 1. |
|
The maximum number of log lines to retrieve. The default value is 50 lines. |
The following properties provide data source feed information.
Property | Description |
---|---|
errorRecords | The number of documents that had errors and were not added to the data source feed. |
feedDataSource | The name of the data source feed. |
feedState | Feed state: |
feedTime | The timestamp for the search appliance at the start of each stage (in milliseconds). |
feedType | Feed type, |
fromLine | The starting line of the log. |
logContent | The log content. |
successRecords | The number of documents that have completed indexing. |
toLine | The end line of the log. |
totalLines | The total lines in the log. |
updateMethod | The command sent to a search appliance to delete a data source feed. The value can only be delete. |
Retrieving Data Source Feed Information
To retrieve information about all data source feeds for a search appliance, send an authenticated GET request to the feed feed URL:
http://Search_Appliance:8000/feeds/feed?query=feedDataSource
The following example result includes current feeds values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'
      xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/feed</id>
  <updated>2008-12-12T12:57:22.970Z</updated>
  <link rel='http://schemas.example.com/g/2005#feed' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
  <generator version='0.5' uri='http://gsa:8000/gsa'>
    Google Search Appliance
  </generator>
  <openSearch:startIndex>1</openSearch:startIndex>
  <entry>
    <id>http://gsa:8000/feeds/feed/Feed_ID</id>
    <updated>2008-12-12T12:57:22.970Z</updated>
    <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
    <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
    <gsa:content name='entryID'>
      sample_feed2_20081212_005647_000000_FULL_FEED_0
    </gsa:content>
    <gsa:content name='errorRecords'>0</gsa:content>
    <gsa:content name='successRecords'>1</gsa:content>
    <gsa:content name='feedType'>0</gsa:content>
    <gsa:content name='feedDataSource'>sample_feed2</gsa:content>
    <gsa:content name='feedState'>2</gsa:content>
    <gsa:content name='feedTime'>1229072207000</gsa:content>
  </entry>
  <entry>
    <id>http://gsa:8000/feeds/feed/Feed_ID</id>
    <updated>2008-12-12T12:57:22.970Z</updated>
    <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
    <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/feed'/>
    <gsa:content name='entryID'>
      sample_feed_20081212_005123_000000_FULL_FEED_0
    </gsa:content>
    <gsa:content name='errorRecords'>1</gsa:content>
    <gsa:content name='successRecords'>0</gsa:content>
    <gsa:content name='feedType'>0</gsa:content>
    <gsa:content name='feedDataSource'>sample_feed</gsa:content>
    <gsa:content name='feedState'>4</gsa:content>
    <gsa:content name='feedTime'>1229071883000</gsa:content>
  </entry>
</feed>
In the request URL, the query is a feedDataSource value. Alternatively, you can get all the feeds if you do not supply a query. Whether or not you supply a query, you can get information about at most five feeds for each feedDataSource value.
To get information about individual feeds from a search appliance, send an authenticated GET request to the feed feed URL:
http://Search_Appliance:8000/feeds/feed/Feed_File_ID
The result is an entry that includes current values for an individual feed:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/feed/Feed_ID</id>
  <updated>2008-12-12T13:03:27.434Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/feed/Feed_ID'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/feed/Feed_ID/'/>
  <gsa:content name='entryID'>
    sample_feed_20081212_005123_000000_FULL_FEED_0
  </gsa:content>
  <gsa:content name='toLine'>1</gsa:content>
  <gsa:content name='errorRecords'>1</gsa:content>
  <gsa:content name='successRecords'>0</gsa:content>
  <gsa:content name='logContent'>
    ProcessNode: Not match URL patterns, skipping record with URL: http://www.sample_feed.com/sample_data.html
  </gsa:content>
  <gsa:content name='feedType'>0</gsa:content>
  <gsa:content name='fromLine'>1</gsa:content>
  <gsa:content name='totalLines'>1</gsa:content>
  <gsa:content name='feedDataSource'>sample_feed</gsa:content>
  <gsa:content name='feedState'>4</gsa:content>
  <gsa:content name='feedTime'>1229071883000</gsa:content>
</entry>
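A response that lists several data source feeds can be turned into a list of dictionaries, one per entry. The following Python sketch shows one way to do that with the standard library; parse_feed is a hypothetical helper, and stripping whitespace from values is an assumption based on the line-wrapped entryID values in the examples.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"


def parse_feed(xml_text):
    """Return one {property name: value} dictionary per <entry> in a feed."""
    if isinstance(xml_text, str):
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("{%s}entry" % ATOM_NS):
        entries.append({
            content.get("name"): (content.text or "").strip()
            for content in entry.findall("{%s}content" % GSA_NS)
        })
    return entries
```

The entryID of each dictionary is the value to use as Feed_File_ID when requesting, deleting, or destroying an individual feed.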
Deleting a Data Source Feed
To delete a data source feed from a search appliance, you must delete one of its individual feed files by sending an authenticated PUT request to the feed feed URL:
http://Search_Appliance:8000/feeds/feed/Feed_File_ID
The Feed_File_ID used in this command corresponds to an entryID, as shown in Retrieving Data Source Feed Information. To delete a data source, you must delete one of its feed files.
Use the following XML for the PUT request:
<?xml version='1.0' encoding='UTF-8'?>
<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
            xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="updateMethod">delete</gsa:content>
</atom:entry>
After deletion, the deleted feed name continues to exist, but has a feed type of DELETED. To remove a feed completely, use the destroy option.
Destroying a Data Source Feed
To destroy a data source feed on a search appliance, send an authenticated DELETE request to the feed feed URL:
http://Search_Appliance:8000/feeds/feed/Feed_File_ID
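The difference between deleting and destroying a data source feed comes down to the HTTP method and the body. The Python sketch below prepares both requests side by side; the function names are hypothetical, the token is assumed to come from the ClientLogin step, and nothing is sent until a request is passed to urllib.request.urlopen.

```python
import urllib.request
from urllib.parse import quote

# Body of the PUT request that marks a feed as deleted.
DELETE_BODY = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    '<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"\n'
    '            xmlns:gsa="http://schemas.google.com/gsa/2007">\n'
    '  <gsa:content name="updateMethod">delete</gsa:content>\n'
    "</atom:entry>\n"
)


def _feed_file_request(method, host, feed_file_id, token, body=None):
    url = "http://%s:8000/feeds/feed/%s" % (host, quote(feed_file_id, safe=""))
    headers = {
        "Content-Type": "application/atom+xml",
        "Authorization": "GoogleLogin auth=" + token,
    }
    data = body.encode("utf-8") if body is not None else None
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def delete_feed_request(host, feed_file_id, token):
    """PUT with updateMethod=delete: the feed name remains on the appliance."""
    return _feed_file_request("PUT", host, feed_file_id, token, DELETE_BODY)


def destroy_feed_request(host, feed_file_id, token):
    """DELETE with no body: the feed is removed entirely."""
    return _feed_file_request("DELETE", host, feed_file_id, token)
```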
Feeds Trusted IP Addresses
Retrieve and update the trusted IP addresses for feeds for a search appliance using the feedTrustedIP entry of the config feed.
Property | Description |
---|---|
trustedIPs | Trusted IP addresses: Either a list of IP addresses or |
Retrieving Feeds Trusted IP Addresses
To get the feeds trusted IP address information for a search appliance, send an authenticated GET request to the config feed URL:
http://Search_Appliance:8000/feeds/config/feedTrustedIP
The result is an entry that includes current feeds trusted IP values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/feedTrustedIP</id>
  <updated>2008-12-12T09:17:20.830Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/feedTrustedIP'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/feedTrustedIP'/>
  <gsa:content name='entryID'>feedTrustedIP</gsa:content>
  <gsa:content name='trustedIPs'>all</gsa:content>
</entry>
Updating Feeds Trusted IP Addresses
To update feeds trusted IP information for a search appliance, send an authenticated PUT request to the config feed URL:
http://Search_Appliance:8000/feeds/config/feedTrustedIP
The following example updates the feeds trusted IP specified in an entry:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/feedTrustedIP</id>
  <gsa:content name='entryID'>feedTrustedIP</gsa:content>
  <gsa:content name='trustedIPs'>127.0.0.1</gsa:content>
</entry>
Crawl Schedule
Retrieve and update the crawl schedule of a search appliance using the crawlSchedule entry of the config feed.
Property | Description |
---|---|
isScheduledCrawl | Displays the crawl mode. You can also change crawl modes by setting 1 for scheduled crawl or 0 for continuous crawl mode. |
crawlSchedule | The crawl schedule, available only in scheduled crawl mode. A scheduled crawl begins on the Day and Time and continues for the specified Duration. |
Retrieving a Crawl Schedule
To check the crawl mode and get the crawl schedule, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/config/crawlSchedule
The response is as follows:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/crawlSchedule</id>
  <updated>2008-12-11T06:29:35.862Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlSchedule'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/crawlSchedule'/>
  <gsa:content name='entryID'>crawlSchedule</gsa:content>
  <gsa:content name='isScheduledCrawl'>0</gsa:content>
</entry>
Updating a Crawl Schedule
To update the crawl schedule, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/config/crawlSchedule
The following example changes the crawl schedule:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>crawlSchedule</gsa:content>
  <gsa:content name='isScheduledCrawl'>1</gsa:content>
  <gsa:content name='crawlSchedule'>0,0300,360 2,0000,1200</gsa:content>
</entry>
The following example changes crawl mode to continuous crawl:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>crawlSchedule</gsa:content>
  <gsa:content name='isScheduledCrawl'>0</gsa:content>
</entry>
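The crawlSchedule value in the scheduled crawl example, 0,0300,360 2,0000,1200, can be handled as a list of structured items. The Python sketch below assumes that each whitespace-separated item has the form Day,Time,Duration, in that order. That reading is inferred from the example and the property description, and this section does not state the day numbering or the unit of the duration, so treat the field names as assumptions.

```python
from typing import List, NamedTuple


class CrawlWindow(NamedTuple):
    day: int       # assumed: numeric day of the week
    time: str      # kept as text to preserve leading zeros, for example "0300"
    duration: int  # assumed: length of the crawl window


def parse_crawl_schedule(value: str) -> List[CrawlWindow]:
    """Split a crawlSchedule value into its Day,Time,Duration items."""
    windows = []
    for item in value.split():
        day, time, duration = item.split(",")
        windows.append(CrawlWindow(int(day), time, int(duration)))
    return windows


def format_crawl_schedule(windows: List[CrawlWindow]) -> str:
    """Join windows back into the space-separated form used in the example."""
    return " ".join("%d,%s,%d" % (w.day, w.time, w.duration) for w in windows)
```

Keeping the time as a string avoids losing the leading zero when a schedule is parsed, edited, and written back.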
Crawler Access Rules
Create, retrieve, update, and delete crawler access rules on a search appliance.
Crawler access rules instruct the crawler how to authenticate when crawling protected content, as shown in the following list of properties:
Property | Description |
---|---|
domain | Windows domain (for NTLM) or empty (for HTTP Basic authorization) |
isPublic | Indicates whether users can get results on both the public content (normally available to everyone) and the secure (confidential) content. The value can be |
order | The entries in crawler access rules are sequential rules. The order indicates the sequence. The order is an integer value starting from |
password | Password for authentication. |
urlPattern | URL pattern that matches files with secure content. |
username | User name for authentication. |
Inserting a Crawler Access Rule
To insert a new crawler access rule, send an authenticated POST request to the following URL:
http://Search_Appliance:8000/feeds/crawlAccessNTLM
The following example inserts a new crawler access rule:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>
    #URL pattern for the new crawler access rule
  </gsa:content>
  <gsa:content name='domain'>domainone</gsa:content>
  <gsa:content name='isPublic'>1</gsa:content>
  <gsa:content name='username'>username</gsa:content>
  <gsa:content name='password'>password</gsa:content>
</entry>
Retrieving Crawler Access Rules
To retrieve a list of crawler access rules, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/crawlAccessNTLM
The following example shows a sample result:
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'
      xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/crawlAccessNTLM</id>
  <updated>2009-03-22T06:33:40.471Z</updated>
  <link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
  <generator version='0.5' uri='http://gsa:8000/gsa'>
    Google Search Appliance
  </generator>
  <openSearch:startIndex>1</openSearch:startIndex>
  <entry>
    <id>http://gsa:8000/feeds/crawlAccessNTLM/http://example.com/</id>
    <updated>2009-03-22T06:33:40.471Z</updated>
    <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
    <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
    <gsa:content name='entryID'>http://example.com/</gsa:content>
    <gsa:content name='urlPattern'>http://example.com/</gsa:content>
    <gsa:content name='username'>userone</gsa:content>
    <gsa:content name='order'>1</gsa:content>
    <gsa:content name='domain'>domainone</gsa:content>
    <gsa:content name='isPublic'>0</gsa:content>
  </entry>
  <entry>
    <id>http://gsa:8000/feeds/crawlAccessNTLM/http://example2.com/</id>
    <updated>2009-03-22T06:33:40.471Z</updated>
    <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
    <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM'/>
    <gsa:content name='entryID'>http://example2.com/</gsa:content>
    <gsa:content name='urlPattern'>http://example2.com/</gsa:content>
    <gsa:content name='username'>usertwo</gsa:content>
    <gsa:content name='order'>2</gsa:content>
    <gsa:content name='domain'></gsa:content>
    <gsa:content name='isPublic'>1</gsa:content>
  </entry>
</feed>
To retrieve an individual crawler access rule, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/crawlAccessNTLM/urlPattern
The following example shows a sample result:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/crawlAccessNTLM/http%3A%2F%2Fexample.com%2F</id>
  <updated>2009-03-23T10:19:55.045Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM/http%3A%2F%2Fexample.com%2F'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/crawlAccessNTLM/http%3A%2F%2Fexample.com%2F'/>
  <gsa:content name='entryID'>http://example.com/</gsa:content>
  <gsa:content name='urlPattern'>http://example.com/</gsa:content>
  <gsa:content name='username'>userone</gsa:content>
  <gsa:content name='order'>1</gsa:content>
  <gsa:content name='domain'>domainone</gsa:content>
  <gsa:content name='isPublic'>0</gsa:content>
</entry>
The password property is not available when retrieving crawler access rules.
Updating a Crawler Access Rule
To update a crawler access rule, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/crawlAccessNTLM/urlPattern
The following is an example of a request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='urlPattern'>#new URL pattern</gsa:content>
  <gsa:content name='domain'>newdomain</gsa:content>
  <gsa:content name='isPublic'>0</gsa:content>
  <gsa:content name='order'>2</gsa:content>
  <gsa:content name='username'>newuser</gsa:content>
  <gsa:content name='password'>newpass</gsa:content>
</entry>
Deleting a Crawler Access Rule
To delete a crawler access rule, send an authenticated DELETE request to the following URL:
http://Search_Appliance:8000/feeds/crawlAccessNTLM/urlPattern
Host Load Schedule
Retrieve and update the host load schedule for a search appliance using the hostLoad entry of the config feed.
Property | Description |
---|---|
defaultHostLoad | The default web server host load, a float value. |
exceptionHostLoad | Exceptions to the default web server host load. This property consists of one or more lines of text in the following format: Where: |
maxURLs | Maximum number of URLs to crawl, an integer value. |
Retrieving a Host Load Schedule
To get the host load schedule information for a search appliance, send an authenticated GET request to the config feed URL:
http://Search_Appliance:8000/feeds/config/hostLoad
The result is an entry that contains the current host load schedule values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/hostLoad</id>
  <updated>2008-12-15T13:28:00.931Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/hostLoad'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/hostLoad'/>
  <gsa:content name='entryID'>hostLoad</gsa:content>
  <gsa:content name='defaultHostLoad'>3.6</gsa:content>
  <gsa:content name='exceptionHostLoad'>www.example.com 1 2 2.3</gsa:content>
  <gsa:content name='maxURLs'>2000</gsa:content>
</entry>
Updating a Host Load Schedule
To update the host load schedule information for a search appliance, send an authenticated PUT request to the config feed URL:
http://Search_Appliance:8000/feeds/config/hostLoad
The following example overwrites a host load schedule:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/hostLoad</id>
  <gsa:content name='entryID'>hostLoad</gsa:content>
  <gsa:content name='defaultHostLoad'>2.4</gsa:content>
  <gsa:content name='exceptionHostLoad'>
    * 3 5 1.2
    www.example.com 1 6 3.6
  </gsa:content>
  <gsa:content name='maxURLs'>3000</gsa:content>
</entry>
Freshness Tuning
Increase or decrease how often a search appliance crawls a URL pattern using the freshness entry of the config feed.
Property | Description |
---|---|
archiveURLs | URL patterns for pages that contain archival or rarely changing content. |
forceURLs | URL patterns for pages to recrawl regardless of their response to |
frequentURLs | URL patterns for pages on which content changes often (typically more than once a day). |
Retrieving Freshness Tuning Settings
To get the settings for freshness tuning, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/config/freshness
The response is as follows:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/freshness</id>
  <updated>2008-12-11T07:16:26.220Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/config/freshness'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/config/freshness'/>
  <gsa:content name='entryID'>freshness</gsa:content>
  <gsa:content name='archiveURLs'>http://good/</gsa:content>
  <gsa:content name='frequentURLs'>http://frequent/</gsa:content>
  <gsa:content name='forceURLs'>http://force/</gsa:content>
</entry>
Updating Freshness Tuning Settings
To update the settings for freshness tuning, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/config/freshness
The following is an example of a request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>freshness</gsa:content>
  <gsa:content name='archiveURLs'>http://good/</gsa:content>
  <gsa:content name='frequentURLs'>http://frequent/</gsa:content>
  <gsa:content name='forceURLs'>http://force/</gsa:content>
</entry>
Recrawl URL Patterns
Recrawl URL patterns using the recrawlNow entry of the command feed.
If you discover a set of URLs that you want crawled (usually because of changes made to the web pages, or because a temporary error or misconfiguration was present when the crawler last tried to crawl the URLs), you can enter the pattern to inject it quickly into the queue of URLs that the search appliance is crawling.
Property | Description |
---|---|
recrawlURLs | URL patterns to be recrawled. |
Recrawling URL Patterns
To recrawl URL patterns, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/command/recrawlNow
The following is an example of a request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>recrawlNow</gsa:content>
  <gsa:content name='recrawlURLs'>http://recrawl/page.html</gsa:content>
</entry>
The following is an example of a request body with multiple recrawl URLs:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>recrawlNow</gsa:content>
  <gsa:content name='recrawlURLs'>http://recrawl/page1.html
http://recrawl/page2.html
http://recrawl/page3.html
  </gsa:content>
</entry>
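A recrawl request body for several URL patterns can be generated from a list. The Python sketch below assumes that the patterns are separated by line breaks inside recrawlURLs; the example above only shows that they are separated by whitespace, so the separator is an assumption. The build_recrawl_entry name is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"
ET.register_namespace("", ATOM_NS)
ET.register_namespace("gsa", GSA_NS)


def build_recrawl_entry(url_patterns):
    """Return the XML body for a PUT to /feeds/command/recrawlNow."""
    entry = ET.Element("{%s}entry" % ATOM_NS)
    properties = (
        ("entryID", "recrawlNow"),
        ("recrawlURLs", "\n".join(url_patterns)),  # assumed: one pattern per line
    )
    for name, value in properties:
        content = ET.SubElement(entry, "{%s}content" % GSA_NS, {"name": name})
        content.text = value
    body = ET.tostring(entry, encoding="unicode")
    return "<?xml version='1.0' encoding='UTF-8'?>\n" + body
```

Because the serializer escapes special characters, patterns that contain & or < are safe to pass in unescaped.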
Connector Managers
Insert, retrieve, update, and delete connector managers on a search appliance.
Property | Description |
---|---|
description | A description of the connector manager. |
status | The status of the connection between a Google Search Appliance and the connector manager deployed on an application server. The value can be |
url | The URL of the application server where the connector manager is installed. |
Inserting a Connector Manager
To insert a new connector manager, send an authenticated POST
request to the following URL:
http://Search_Appliance:8000/feeds/connectorManager
The following example inserts a new connector manager:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’entryID’>ConnectorManagerOne</gsa:content> <gsa:content name=’description’>Connector Manager One Description</gsa:content> <gsa:content name=’url’>http://example.com:port/</gsa:content> </entry>
Retrieving Connector Managers
To retrieve a list of connector managers, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/connectorManager
The following example shows a sample result:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/connectorManager</id> <updated>2009-03-22T06:31:15.357Z</updated> <link rel=’http://schemas.google.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/connectorManager/ConnectorManagerOne</id> <updated>2009-03-22T06:31:15.357Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <gsa:content name=’entryID’>ConnectorManagerOne</gsa:content> <gsa:content name=’status’>Disconnected</gsa:content> <gsa:content name=’description’> Connector Manager One Description</gsa:content> <gsa:content name=’url’>http://example.com:port/</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/connectorManager/ConnectorManagerTwo</id> <updated>2009-03-22T06:31:15.357Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager’/> <gsa:content name=’entryID’>ConnectorManagerTwo</gsa:content> <gsa:content name=’status’>Disconnected</gsa:content> <gsa:content name=’description’> Connector Manager Two Description </gsa:content> <gsa:content name=’url’>http://example2.com:port/</gsa:content> </entry> </feed>
To retrieve an individual connector manager, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/connectorManager/ConnectorManager_Name
The following example shows a sample result:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/connectorManager/ConnectorManagerOne</id> <updated>2009-03-22T06:33:26.140Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager/ConnectorManagerOne’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/connectorManager/ConnectorManagerOne’/> <gsa:content name=’entryID’>ConnectorManagerOne</gsa:content> <gsa:content name=’status’>Disconnected</gsa:content> <gsa:content name=’description’>Connector Manager One Description</gsa:content> <gsa:content name=’url’>http://example.com:port/</gsa:content> </entry>
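Responses come back either as a feed that wraps several entries or as a single entry. The following Python sketch shows one way to read the gsa:content properties from both kinds of response. The function name parse_entries is hypothetical, and the sample feed is a shortened version of the result shown above.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "gsa": "http://schemas.google.com/gsa/2007",
}


def parse_entries(xml_text):
    """Return one dict of property name -> value per entry in the response."""
    root = ET.fromstring(xml_text)
    # A single-entry response has <entry> as its root element;
    # a feed response wraps any number of <entry> elements.
    if root.tag == "{%s}entry" % NS["atom"]:
        entries = [root]
    else:
        entries = root.findall("atom:entry", NS)
    return [
        {c.get("name"): (c.text or "").strip()
         for c in entry.findall("gsa:content", NS)}
        for entry in entries
    ]


sample_feed = """
<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <entry>
    <gsa:content name='entryID'>ConnectorManagerOne</gsa:content>
    <gsa:content name='status'>Disconnected</gsa:content>
  </entry>
  <entry>
    <gsa:content name='entryID'>ConnectorManagerTwo</gsa:content>
    <gsa:content name='status'>Disconnected</gsa:content>
  </entry>
</feed>
"""

for manager in parse_entries(sample_feed):
    print(manager["entryID"], manager["status"])
```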
Updating a Connector Manager
To update the description
and url
in a connector manager, send an authenticated PUT
request to the following URL:
http://Search_Appliance:8000/feeds/connectorManager/ConnectorManager_Name
The following is an example of a request body:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’description’>new description</gsa:content> <gsa:content name=’url’>#new URL</gsa:content> </entry>
Deleting a Connector Manager
To delete a connector manager, send an authenticated DELETE
request to the following URL:
http://Search_Appliance:8000/feeds/connectorManager/ConnectorManager_Name
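A delete operation needs only the entry URL and no request body. The following Python sketch builds such a request. The function name is hypothetical, and the Authorization header value is assumed to come from your authentication step. Percent-encoding the entry name is a defensive choice for names that contain reserved characters.

```python
import urllib.parse
import urllib.request


def build_delete_request(appliance, feed, entry_name, auth_header):
    """Build (but do not send) a DELETE request for one entry of a feed."""
    name = urllib.parse.quote(entry_name, safe="")  # encode reserved characters
    url = f"http://{appliance}:8000/feeds/{feed}/{name}"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": auth_header}
    )


request = build_delete_request(
    "Search_Appliance", "connectorManager", "ConnectorManagerOne", "PLACEHOLDER_TOKEN"
)
# To send: urllib.request.urlopen(request)
```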
OneBox Settings
Retrieve or update a OneBox setting for a search appliance using the oneboxSetting
entry of the config
feed.
Property | Description |
---|---|
maxResults | Maximum number of OneBox results per search. |
timeout | OneBox response timeout. |
Retrieving OneBox Settings
To get a OneBox setting for a search appliance, send an authenticated GET
request to the config
feed URL:
http://Search_Appliance:8000/feeds/config/oneboxSetting
The following example result is an entry that includes current OneBox setting values for the search appliance:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/config/oneboxSetting</id> <updated>2008-12-12T09:21:47.477Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/config/oneboxSetting’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/config/oneboxSetting’/> <gsa:content name=’entryID’>oneboxSetting</gsa:content> <gsa:content name=’maxResults’>2</gsa:content> <gsa:content name=’timeout’>1000</gsa:content> </entry>
Updating OneBox Settings
To update the OneBox settings for a search appliance, send an authenticated PUT
request to the config
feed URL:
http://Search_Appliance:8000/feeds/config/oneboxSetting
The following example overwrites the OneBox setting specified in the entry to update:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/config/oneboxSetting</id> <gsa:content name=’entryID’>oneboxSetting</gsa:content> <gsa:content name=’maxResults’>3</gsa:content> <gsa:content name=’timeout’>2000</gsa:content> </entry>
OneBox Modules
Retrieve the names of OneBox modules, or delete OneBox modules from a search appliance, using the onebox
feed.
Property | Description |
---|---|
logContent | The log content for OneBox logs. |
Retrieving OneBox Module Names
To get the OneBox information for a search appliance, send an authenticated GET
request to the onebox
feed URL:
http://Search_Appliance:8000/feeds/onebox
The following example retrieves the current OneBox values for the search appliance:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/onebox</id> <updated>2008-12-15T13:37:36.678Z</updated> <link rel=’http://schemas.example.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/onebox/oneboxone</id> <updated>2008-12-15T13:37:36.678Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <gsa:content name=’entryID’>oneboxone</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/onebox/oneboxtwo</id> <updated>2008-12-15T13:37:36.678Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox’/> <gsa:content name=’entryID’>oneboxtwo</gsa:content> </entry> </feed>
The onebox
feed supplies only the names of the OneBox modules.
To view OneBox information for an individual OneBox module, send an authenticated GET
request to the onebox
feed URL for a OneBox name:
http://Search_Appliance:8000/feeds/onebox/OneBox_Name
The result is an entry that includes current individual OneBox values for a search appliance:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/onebox/oneboxone</id> <updated>2008-12-15T13:39:42.895Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox/oneboxone’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/onebox/oneboxone’/> <gsa:content name=’entryID’>oneboxone</gsa:content> <gsa:content name=’logContent’>onebox logs</gsa:content> </entry>
Deleting a OneBox Module
To delete a OneBox module from a search appliance, send an authenticated DELETE
request to the onebox
feed URL:
http://Search_Appliance:8000/feeds/onebox/OneBox_Name
Crawl Status
Check the crawl status, and also pause or resume crawl using the pauseCrawl
entry of the command
feed.
Property |
Description |
---|---|
|
|
Retrieving the Crawl Status
To check the status of the crawl, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/command/pauseCrawl
The response result is as follows:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/command/pauseCrawl</id> <updated>2008-12-11T08:55:57.824Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/command/pauseCrawl’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/command/pauseCrawl’/> <gsa:content name=’entryID’>pauseCrawl</gsa:content> <gsa:content name=’pauseCrawl’>0</gsa:content> </entry>
Pausing or Resuming Crawl
To pause or resume crawl, send an authenticated PUT
request to the following URL:
http://Search_Appliance:8000/feeds/command/pauseCrawl
The following is an example of a request to resume crawl:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’entryID’>pauseCrawl</gsa:content> <gsa:content name=’pauseCrawl’>0</gsa:content> </entry>
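The following Python sketch reads the pauseCrawl property from a status response and builds the body for a pause or resume request. The function names are hypothetical. This guide shows only the value 0, which resumes the crawl. The sketch assumes that the value 1 pauses it, so verify that against your search appliance before relying on it.

```python
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"


def crawl_is_paused(entry_xml):
    """Return True if the pauseCrawl property is anything other than 0."""
    root = ET.fromstring(entry_xml)
    for content in root.iter(GSA_CONTENT):
        if content.get("name") == "pauseCrawl":
            return (content.text or "").strip() != "0"
    raise ValueError("pauseCrawl property not found in response")


def pause_crawl_body(pause):
    """Build the PUT body. Assumption: 1 pauses the crawl, 0 resumes it."""
    flag = "1" if pause else "0"
    return (
        "<?xml version='1.0' encoding='UTF-8'?>"
        "<entry xmlns='http://www.w3.org/2005/Atom' "
        "xmlns:gsa='http://schemas.google.com/gsa/2007'>"
        "<gsa:content name='entryID'>pauseCrawl</gsa:content>"
        f"<gsa:content name='pauseCrawl'>{flag}</gsa:content>"
        "</entry>"
    )


resume_body = pause_crawl_body(False)
```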
Document Status
Retrieve the status of the documents that have been crawled and served using the documentStatus
entry of the status
feed. The properties for the document status are:
Property | Description |
---|---|
crawledURLsToday | The number of documents crawled since midnight. (Midnight pertains to the time that is set on the search appliance.) |
crawlPagePerSecond | Current crawling rate measured in pages per second. |
errorURLsToday | Document errors that occurred since midnight on the search appliance. |
filteredBytes | Document bytes that have been filtered by domain, language, file type, or metadata. |
foundURLs | The number of URLs found that match crawl patterns. |
servedURLs | The total number of documents that have been served. |
Retrieving Document Status
To retrieve document status, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/status/documentStatus
The response result is as follows:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/stats/documentStatus</id> <updated>2008-12-11T08:38:05.048Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/status/documentStatus’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/status/documentStatus’/> <gsa:content name=’entryID’>documentStatus</gsa:content> <gsa:content name=’servedURLs’>0</gsa:content> <gsa:content name=’crawlPagePerSecond’>0</gsa:content> <gsa:content name=’crawledURLsToday’>0</gsa:content> <gsa:content name=’foundURLs’>1</gsa:content> <gsa:content name=’filteredBytes’>0</gsa:content> <gsa:content name=’errorURLsToday’>0</gsa:content> </entry>
Index
The sections that follow describe how to configure the Index features of the Admin Console:
Collections
Create, retrieve, update, and delete collections on a search appliance.
A collection is a group of URL patterns that can be searched separately from other URL patterns.
Property | Description |
---|---|
collectionName | The name of a collection to create (only required when creating a new collection). |
doNotCrawlURLs | The URL patterns to exclude from this collection. |
followURLs | The URL patterns to include in this collection. |
 | The collection settings exported from the Admin Console. Only required when creating a new collection by the |
insertMethod | The method of creating (only required when creating a new collection). Possible values: |
Creating a Collection
To create a new collection, send an authenticated POST
request to the following URL:
http://Search_Appliance:8000/feeds/collection
To create a new collection with a default setting, use the following entry:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’collectionName’>new_collection</gsa:content> <gsa:content name=’insertMethod’>default</gsa:content> </entry>
To specify the settings for a new collection, send the following entry:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’collectionName’>new_collection</gsa:content> <gsa:content name=’insertMethod’>customize</gsa:content> <gsa:content name=’followURLs’>#url in new collection</gsa:content> <gsa:content name=’doNotCrawlURLs’># url not in new collection</gsa:content> </entry>
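The two request bodies above differ only in the insertMethod value and in whether URL patterns are included. The following Python sketch chooses between them when it builds the POST request. The function names are hypothetical. Joining multiple patterns with newlines, the Content-Type value, and the Authorization header value are all assumptions.

```python
import urllib.request
from xml.sax.saxutils import escape


def _content(name, value):
    # One gsa:content element; escape() protects &, < and > in values.
    return f"<gsa:content name='{name}'>{escape(value)}</gsa:content>"


def build_create_collection_request(appliance, name, auth_header,
                                    follow_urls=None, do_not_crawl_urls=None):
    """Build (but do not send) a POST request that creates a collection."""
    customized = follow_urls is not None or do_not_crawl_urls is not None
    parts = [
        _content("collectionName", name),
        _content("insertMethod", "customize" if customized else "default"),
    ]
    if customized:
        # Assumption: multiple patterns are separated by newlines.
        parts.append(_content("followURLs", "\n".join(follow_urls or [])))
        parts.append(_content("doNotCrawlURLs", "\n".join(do_not_crawl_urls or [])))
    body = (
        "<?xml version='1.0' encoding='UTF-8'?>"
        "<entry xmlns='http://www.w3.org/2005/Atom' "
        "xmlns:gsa='http://schemas.google.com/gsa/2007'>"
        + "".join(parts)
        + "</entry>"
    )
    return urllib.request.Request(
        f"http://{appliance}:8000/feeds/collection",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/atom+xml",  # assumed media type
                 "Authorization": auth_header},
    )


default_request = build_create_collection_request(
    "Search_Appliance", "new_collection", "PLACEHOLDER_TOKEN")
custom_request = build_create_collection_request(
    "Search_Appliance", "new_collection", "PLACEHOLDER_TOKEN",
    follow_urls=["http://www.example.com/"])
```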
Retrieving All Collections
To retrieve a list of collections, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/collection
The following example shows a sample result:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/collection</id> <updated>2008-12-11T08:01:21.253Z</updated> <link rel=’http://schemas.example.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance</generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/collection/default_collection</id> <updated>2008-12-11T08:01:21.253Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <gsa:content name=’entryID’>default_collection</gsa:content> <gsa:content name=’followURLs’>/</gsa:content> <gsa:content name=’doNotCrawlURLs’></gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/collection/new2_collection</id> <updated>2008-12-11T08:01:21.253Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection’/> <gsa:content name=’entryID’>new_collection</gsa:content> <gsa:content name=’followURLs’>#urls in new collection</gsa:content> <gsa:content name=’doNotCrawlURLs’></gsa:content> </entry> </feed>
Retrieving a Collection
To retrieve the attributes of a single collection, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/collection/Collection_Name
The following example response shows the result:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/collection/default_collection</id> <updated>2008-12-11T08:18:04.372Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection/default_collection’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/collection/default_collection’/> <gsa:content name=’entryID’>default_collection</gsa:content> <gsa:content name=’followURLs’>/</gsa:content> <gsa:content name=’doNotCrawlURLs’></gsa:content> </entry>
Updating a Collection
To update an attribute in a collection, send an authenticated PUT
request to the following URL:
http://Search_Appliance:8000/feeds/collection/Collection_Name
The following is an example of a request body:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’followURLs’>#updated urls</gsa:content> <gsa:content name=’doNotCrawlURLs’></gsa:content> </entry>
Deleting a Collection
To delete a collection, send an authenticated DELETE
request to the following URL:
http://Search_Appliance:8000/feeds/collection/Collection_Name
Index Diagnostics
List crawled documents and retrieve the status of documents in a search appliance using the diagnostics
feed.
Document Status Values
The following tables list document status values.
all
to indicate any status value.
Successful Crawl:
Value |
Description |
---|---|
|
Crawled from remote server |
|
Crawled from cache |
Crawl Errors:
Value |
Description |
---|---|
|
Redirect with no location header |
|
Document not found (404) |
|
Other HTTP 400 Errors |
|
HTTP 0 error |
|
Permanent DNS failure |
|
Empty document |
|
Image conversion failed |
|
Authentication failed |
|
Conversion error |
|
HTTP 500 error |
|
Robots.txt unreachable |
|
Temporary DNS failure |
|
Connection failed |
|
Connection timeout |
|
Connection closed |
|
Connection refused |
|
Connection reset |
|
No route to host |
|
Other error |
Crawl Exclusions:
Value |
Description |
---|---|
|
Not in URLs to crawl |
|
In URLs not to crawl |
|
Off domain redirect |
|
Long redirect chain |
|
Infinite URL space |
|
Unhandled protocol |
|
URL too long |
|
Robots no-index |
|
Rejected by rewrite rules |
|
Unknown extension |
|
Disallowed by a meta tag |
|
Disallowed by robots |
|
Unhandled content type |
|
No filter for content type |
|
Robots.txt forbidden |
Listing Crawled Documents
Query parameters:
Parameter |
Description |
---|---|
|
Name of the collection that you want to list. The default value is the last used collection. |
|
|
|
|
|
The page you want to view. The files from a URI may be separated into several pages to return. The page number starts from |
|
The key field of sorting. |
|
The prefix of the URI of the documents that you want to list. If not blank, it must contain at least |
|
A filter of the document status. The values of |
To list documents, send an authenticated GET
request to the root entry of the diagnostics feed:
http://Search_Appliance:8000/feeds/diagnostics?uriAt=http%3A%2F%2Fserver.com%2Fsecured%2Ftest1
The request returns a description
entry, a set of document status entries, and a set of directory status entries.
Description entry properties:
Property | Description |
---|---|
<Entry Name> | description |
numPages | The total number of pages to return. |
uriAt | The prefix of the URL taken from the query parameters. |
Directory status entry properties:
Property | Description |
---|---|
<Entry Name> | The URL of a directory. |
numCrawledURLs | The number of crawled documents in a directory. |
numExcludedURLs | The number of excluded URL patterns in a directory. |
numRetrievalErrors | The number of retrieval errors for documents in a directory. |
type | DirectoryContentData |
Document status entry properties:
Property | Description |
---|---|
<Entry Name> | The URL pattern of a document to check its status. |
docState | The status of a document. The values of |
isCookieServerError | Indicates if the cookie server encountered an error. |
timeStamp | The last time that the search appliance indexed a document. |
type | FileContentData |
Example:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/diagnostics</id> <updated>2009-03-26T04:47:40.814Z</updated> <link rel=’http://schemas.google.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics?uriAt=http%3A%2F%2Fserver.com%2Fsecured%2Ftest1%2F’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/diagnostics/http://server.com/secured/test1/level_1_0</id> <updated>2009-03-26T04:47:40.813Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T04:47:40.813Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <gsa:content name=’entryID’> http://server.com/secured/test1/level_1_0 </gsa:content> <gsa:content name=’numCrawledURLs’>217</gsa:content> <gsa:content name=’numExcludedURLs’>0</gsa:content> <gsa:content name=’type’>DirectoryContentData</gsa:content> <gsa:content name=’numRetrievalErrors’>0</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/diagnostics/http://server.com/secured/test1/doc_0_0.html</id> <updated>2009-03-26T04:47:40.814Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T04:47:40.814Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <gsa:content name=’entryID’> http://server.com/secured/test1/doc_0_0.html </gsa:content> <gsa:content name=’isCookieServerError’>0</gsa:content> <gsa:content name=’timeStamp’>1238042696</gsa:content>
<gsa:content name=’docState’>2</gsa:content> <gsa:content name=’type’>FileContentData</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/diagnostics/description</id> <updated>2009-03-26T04:47:40.814Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T04:47:40.814Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics’/> <gsa:content name=’entryID’>description</gsa:content> <gsa:content name=’numPages’>1</gsa:content> <gsa:content name=’uriAt’>http://server.com/secured/test1/</gsa:content> </entry> </feed>
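The uriAt value in the request URL is percent-encoded. As a small illustration, the following Python sketch produces the listing URL with the standard library. The function name is hypothetical, and only the uriAt parameter that appears in the example request is used.

```python
from urllib.parse import urlencode


def diagnostics_list_url(appliance, uri_at):
    """Return the diagnostics feed URL for documents under the uri_at prefix."""
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    return f"http://{appliance}:8000/feeds/diagnostics?{urlencode({'uriAt': uri_at})}"


url = diagnostics_list_url("Search_Appliance", "http://server.com/secured/test1")
print(url)
```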
Getting Crawled Document Status
Get the status for documents that have been crawled for a collection.
Parameter |
Description |
---|---|
|
Name of the collection for which you want to list the document status. The default value is the last used collection. |
To retrieve detailed information for a document, send an authenticated GET
request to a document entry of the diagnostics
feed.
http://Search_Appliance:8000/feeds/diagnostics/http%3A%2F%2Fserver.com%2Fsecured%2Ftest1%2Fdoc_0_2.html
A detailed document status entry is returned with the following properties.
Property | Description |
---|---|
<Entry Name> | The URL of a document. |
backwardLinks | The number of backward links for the document. |
collectionList | The list of collections that contain the document. |
contentSize | The size of the document content. |
contentType | The type of the document. |
crawlFrequency | The frequency at which the document is being scheduled to crawl, with possible values of |
crawlHistory | A multi-line history of the document crawl. Each line includes the timestamp when the document was crawled, the document status code, and a description. For status code values, see Document Status Values. |
currentlyInFlight | Indicates whether the document is currently in process. |
date | The date that the document was indexed. |
forwardLinks | The number of forward links for the document. |
isCached | Indicates whether a cached page for the document is indexed. |
lastModifiedDate | The last modified date of the document. |
latestOnDisk | The timestamp of the version being served. |
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/diagnostics/http%3A%2F%2Fexample.com%2Fdoc.html</id> <updated>2009-03-26T05:41:43.724Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T05:41:43.724Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics/http%3A%2F%2Fexample.com%2Fdoc.html’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/diagnostics/http%3A%2F%2Fexample.com%2Fdoc.html’/> <gsa:content name=’entryID’>http://example.com/doc.html</gsa:content> <gsa:content name=’backwardLinks’>0</gsa:content> <gsa:content name=’forwardLinks’>0</gsa:content> <gsa:content name=’isCached’>1</gsa:content> <gsa:content name=’lastModifiedDate’>-1</gsa:content> <gsa:content name=’collectionList’>Default,default_collection</gsa:content> <gsa:content name=’date’>-1</gsa:content> <gsa:content name=’currentlyInFlight’>0</gsa:content> <gsa:content name=’contentSize’>641</gsa:content> <gsa:content name=’contentType’>text/html</gsa:content> <gsa:content name=’crawlFrequency’>normal</gsa:content> <gsa:content name=’crawlHistory’> 1245977534 2 Unchanged. 1245955634 1 Crawled: New Document 1245951054 2 Unchanged. </gsa:content> <gsa:content name=’latestOnDisk’>1245977534</gsa:content> </entry>
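The following Python sketch covers two steps of this request: percent-encoding the document URL to form the entry URL, and splitting a crawlHistory value into records. The function names are hypothetical. The parser assumes that each history record occupies its own line in the form "timestamp status-code description", as the sample response suggests.

```python
from urllib.parse import quote


def document_status_url(appliance, doc_url):
    """Return the diagnostics entry URL for a single document."""
    # safe='' forces ':' and '/' in the document URL to be percent-encoded.
    return f"http://{appliance}:8000/feeds/diagnostics/{quote(doc_url, safe='')}"


def parse_crawl_history(text):
    """Split a crawlHistory value into (timestamp, status_code, description).

    Assumption: one record per line, fields separated by whitespace.
    """
    records = []
    for line in text.strip().splitlines():
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # skip blank or incomplete lines
        description = parts[2].strip() if len(parts) > 2 else ""
        records.append((int(parts[0]), int(parts[1]), description))
    return records


history = parse_crawl_history("""
1245977534 2 Unchanged.
1245955634 1 Crawled: New Document
1245951054 2 Unchanged.
""")
```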
Content Statistics
Get content statistics for each kind of document using the contentStatistics
feed.
Common query parameters for all requests:
Parameter |
Description |
---|---|
|
Name of the collection which you want to list. The default value is the last used collection. |
Content statistics entry properties:
Property | Description |
---|---|
<Entry Name> | The content type of documents, such as |
avgSize | The average document size of this content type. |
maxSize | The maximum document size of this content type. |
minSize | The minimum document size of this content type. |
numFiles | The number of files of this content type. |
totalSize | The total document size of this content type. |
Retrieving Content Statistics for All Document Types
To retrieve content statistics for all kinds of documents in a search appliance, send an authenticated GET
request to the root
entry of the contentStatistics
feed.
http://Search_Appliance:8000/feeds/contentStatistics
A list of content statistics entries is returned.
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/contentStatistics</id> <updated>2009-03-26T05:45:33.701Z</updated> <link rel=’http://schemas.google.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/contentStatistics/text/html</id> <updated>2009-03-26T05:45:33.701Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T05:45:33.701Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <gsa:content name=’entryID’>text/html</gsa:content> <gsa:content name=’numFiles’>1,037</gsa:content> <gsa:content name=’minSize’>606</gsa:content> <gsa:content name=’avgSize’>2.5k</gsa:content> <gsa:content name=’totalSize’>2.5M</gsa:content> <gsa:content name=’maxSize’>38k</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/contentStatistics/text/pdf</id> <updated>2009-03-26T05:45:33.701Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T05:45:33.701Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics’/> <gsa:content name=’entryID’>text/pdf</gsa:content> <gsa:content name=’numFiles’>3</gsa:content> <gsa:content name=’minSize’>24k</gsa:content> <gsa:content name=’avgSize’>136k</gsa:content> <gsa:content name=’totalSize’>407k</gsa:content> <gsa:content name=’maxSize’>217k</gsa:content> 
</entry> </feed>
Retrieving Content Statistics for a Document Type
To retrieve content statistics for a document type in a search appliance, send an authenticated GET
request to the content statistics entry of the contentStatistics
feed.
http://Search_Appliance:8000/feeds/contentStatistics/text%2Fpdf
A content statistics entry is returned.
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/contentStatistics/text%2Fpdf</id> <updated>2009-03-26T05:51:32.659Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T05:51:32.659Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics/text%2Fpdf’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/contentStatistics/text%2Fpdf’/> <gsa:content name=’entryID’>text/pdf</gsa:content> <gsa:content name=’numFiles’>3</gsa:content> <gsa:content name=’minSize’>24k</gsa:content> <gsa:content name=’avgSize’>136k</gsa:content> <gsa:content name=’totalSize’>407k</gsa:content> <gsa:content name=’maxSize’>217k</gsa:content> </entry>
Reset Index
Reset your crawling queues and delete your search index, removing all its contents.
Property | Description |
---|---|
resetIndex | Set to |
resetStatusCode | Status code for resetting the index. |
resetStatusMessage | Status message. Possible values are |
Retrieving Status After Resetting the Index
To check the status of resetting the index, send an authenticated GET
request to the following URL:
http://Search_Appliance:8000/feeds/command/resetIndex
An example response result is as follows:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/command/resetIndex</id> <updated>2008-12-11T09:00:21.907Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/command/resetIndex’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/command/resetIndex’/> <gsa:content name=’entryID’>resetIndex</gsa:content> <gsa:content name=’resetStatusCode’>2</gsa:content> <gsa:content name=’resetIndex’>1</gsa:content> <gsa:content name=’resetStatusMessage’>PROGRESS</gsa:content> </entry>
Resetting the Index
To reset the index, send an authenticated PUT
request to the following URL:
http://Search_Appliance:8000/feeds/command/resetIndex
The following is an example of resetting the index:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <gsa:content name=’resetIndex’>1</gsa:content> </entry>
Search
The sections that follow describe how to configure the Search features of the Admin Console:
- Front Ends, Remove URLs, and Relative OneBoxes
- Output Format XSLT Stylesheet
- KeyMatch
- Related Queries
- Query Suggestion
- Search Status
Front Ends, Remove URLs, and Relative OneBoxes
Retrieve, update, and delete front ends, remove URLs, and relative OneBox modules for a search appliance using the frontend
feed. A relative OneBox is a OneBox module that you assign to work with a front end. Remove URLs are URL patterns that you want to exclude from appearing in an index for a front end.
Property | Description |
---|---|
frontendOnebox | OneBox modules for a front end. Specify a comma-separated list of OneBox module names. The OneBox names display in alphabetic order. |
removeUrls | Remove URLs for a front end. |
Retrieving Front Ends, Remove URLs, and Relative OneBoxes
To get front end information for a search appliance, send an authenticated GET
request to the frontend
feed URL:
http://Search_Appliance:8000/feeds/frontend
The result is a feed that includes the current front end values for a search appliance:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/frontend</id> <updated>2008-12-15T14:48:14.851Z</updated> <link rel=’http://schemas.example.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/frontend/default_frontend</id> <updated>2008-12-15T14:48:14.851Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend’/> <gsa:content name=’entryID’>default_frontend</gsa:content> <gsa:content name=’frontendOnebox’>oneboxone,oneboxtwo</gsa:content> <gsa:content name=’removeUrls’>http://www.example.com/</gsa:content> </entry> </feed>
To get the information for an individual front end, send an authenticated GET request to the frontend feed URL for the front end name:
http://Search_Appliance:8000/feeds/frontend/Front_End
The following result is an entry that includes current individual front end values for a search appliance:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/frontend/default_frontend</id> <updated>2008-12-15T16:21:26.012Z</updated> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend/default_frontend’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/frontend/default_frontend’/> <gsa:content name=’entryID’>default_frontend</gsa:content> <gsa:content name=’frontendOnebox’>oneboxone,oneboxtwo</gsa:content> <gsa:content name=’removeUrls’>http://www.example.com/</gsa:content> </entry>
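A returned entry can be read by collecting its gsa:content elements. The sketch below is one way to do that in Python; `entry_properties` is a hypothetical helper name, and the sample is an abbreviated copy of the entry shown above.

```python
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"


def entry_properties(entry_xml):
    """Map each gsa:content name in an entry to its text value."""
    root = ET.fromstring(entry_xml)
    return {c.get("name"): (c.text or "").strip() for c in root.iter(GSA_CONTENT)}


# Abbreviated copy of the front end entry shown above.
sample = """<entry xmlns='http://www.w3.org/2005/Atom'
    xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/frontend/default_frontend</id>
  <gsa:content name='entryID'>default_frontend</gsa:content>
  <gsa:content name='frontendOnebox'>oneboxone,oneboxtwo</gsa:content>
  <gsa:content name='removeUrls'>http://www.example.com/</gsa:content>
</entry>"""

props = entry_properties(sample)
oneboxes = props["frontendOnebox"].split(",")  # comma-separated module names
print(props["entryID"], oneboxes, props["removeUrls"])
```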
Updating Remove URLs and Relative OneBoxes
To update the remove URLs and relative OneBoxes that are associated with a front end for a search appliance, send an authenticated PUT request to the frontend feed URL:
http://Search_Appliance:8000/feeds/frontend/Front_End
The following example updates the values for remove URLs and relative OneBox modules for a front end:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/frontend/default_frontend</id>
  <gsa:content name='entryID'>default_frontend</gsa:content>
  <gsa:content name='frontendOnebox'>oneboxtwo</gsa:content>
  <gsa:content name='removeUrls'>http://www.example2.com/</gsa:content>
</entry>
Inserting Remove URLs and Relative OneBoxes
To insert a front end and remove URLs for a search appliance, send an authenticated POST request to the frontend feed URL:
http://Search_Appliance:8000/feeds/frontend
The following example specifies a URL pattern to remove from an index for the frontend_one front end:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/frontend/frontend_one</id>
  <gsa:content name='entryID'>frontend_one</gsa:content>
  <gsa:content name='removeUrls'>http://www.example3.com/</gsa:content>
</entry>
Note: The frontendOnebox property is not supported.
Deleting a Front End
To delete a front end from a search appliance, send an authenticated DELETE request to the frontend feed URL for that front end:
http://Search_Appliance:8000/feeds/frontend/Front_End
Output Format XSLT Stylesheet
Retrieve and update the XSLT template and other output format properties for each language of each front end using the frontend entry of the outputFormat feed.
Parameter |
Description |
---|---|
|
Specify a language for the output format properties that you want to retrieve. Each front end can contain multiple languages, and each language has its own output format properties. Each front end + language can have its own XSLT stylesheet. The Administrators who use the Admin Console set the language in their browser and the Admin Console then displays in that language (if the Admin Console has been translated into that language). Hence the |
Use the following properties to retrieve an output format stylesheet.
Property |
Description |
---|---|
|
|
|
|
|
In a retrieving operation, language is determined by the language specified by |
|
|
|
The output format of the XSLT code. |
Note: The restoreDefaultFormat content and the styleSheetContent content are mutually exclusive. For each update action, you can restore the output format stylesheet XSLT to its original default values, set the stylesheet XSLT to a custom format, or do neither, but you cannot do both.
Retrieving the Output Format XSLT Stylesheet
To get the output format stylesheet information for a search appliance, send an authenticated GET request to the outputFormat feed URL:
http://Search_Appliance:8000/feeds/outputFormat/Front_End?language=Language_Code
The result is an entry that includes all stylesheet information for the designated Front_End and Language_Code.
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/outputFormat/default_frontend</id>
  <updated>2008-12-09T23:59:51.078Z</updated>
  <link rel='self' type='application/atom+xml' href='http://gsa:8000/feeds/outputFormat/default_frontend'/>
  <link rel='edit' type='application/atom+xml' href='http://gsa:8000/feeds/outputFormat/default_frontend'/>
  <gsa:content name='entryID'>default_frontend</gsa:content>
  <gsa:content name='isStyleSheetEdited'>0</gsa:content>
  <gsa:content name='styleSheetContent'>
    <!-- *** START OF STYLESHEET *** -->
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:include href="customer-onebox.xsl"/>
    <xsl:output method="html"/>
    <xsl:variable name="show_logo">1</xsl:variable>
    <xsl:variable name="logo_url">images/Title_Left.png</xsl:variable>
    <xsl:variable name="logo_width">200</xsl:variable>
    <xsl:variable name="logo_height">78</xsl:variable>
    ........
    <xsl:template match="@*|node()"/>
    </xsl:stylesheet>
    <!-- *** END OF STYLESHEET *** -->
  </gsa:content>
  <gsa:content name='isDefaultLanguage'>1</gsa:content>
  <gsa:content name='language'>en</gsa:content>
</entry>
Updating the Output Format XSLT Stylesheet
To update the output format stylesheet information for a search appliance, send an authenticated PUT request to the outputFormat feed URL:
http://Search_Appliance:8000/feeds/outputFormat/Front_End
Specify the language in the language property of the entry to update.
The stylesheet properties specified in the entry overwrite the current values for the designated Front_End and language.
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/outputFormat/default_frontend</id>
  <gsa:content name='entryID'>default_frontend</gsa:content>
  <gsa:content name='language'>en</gsa:content>
  <gsa:content name='restoreDefaultFormat'>1</gsa:content>
  <gsa:content name='styleSheetContent'>
    <!-- *** START OF STYLESHEET *** -->
    <!-- XML escaped XSLT code goes here -->
    <!-- *** END OF STYLESHEET *** -->
  </gsa:content>
  <gsa:content name='isDefaultLanguage'>1</gsa:content>
</entry>
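The following Python sketch shows one way to assemble an update entry while respecting the rule that restoreDefaultFormat and styleSheetContent are mutually exclusive. The function name `output_format_entry` and its arguments are hypothetical. ElementTree escapes the XSLT text automatically when it serializes the entry, which is how the sketch produces XML-escaped stylesheet content.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"


def output_format_entry(language, stylesheet=None, restore_default=False):
    """Build a PUT body that restores the default XSLT or sets a custom one."""
    if stylesheet is not None and restore_default:
        raise ValueError(
            "restoreDefaultFormat and styleSheetContent are mutually exclusive"
        )
    properties = {"language": language}
    if restore_default:
        properties["restoreDefaultFormat"] = "1"
    if stylesheet is not None:
        properties["styleSheetContent"] = stylesheet

    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("gsa", GSA_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for name, value in properties.items():
        content = ET.SubElement(entry, f"{{{GSA_NS}}}content", {"name": name})
        content.text = value  # '<', '>' and '&' are escaped on serialization
    return ET.tostring(entry, encoding="unicode")


xslt = (
    '<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" '
    'version="1.0"/>'
)
body = output_format_entry("en", stylesheet=xslt)
print(body)
```

Calling `output_format_entry("en", restore_default=True)` yields the restore variant, and passing both options raises `ValueError` before anything is sent.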
KeyMatch
Retrieve or update KeyMatch settings on a search appliance using the keymatch feed. KeyMatch lets you promote specific web pages on your site. The parameters for this feed are:
Parameter | Description |
---|---|
query | A query string to perform a full-text search. For example, if you specify |
startLine | The starting line number of a result, the default value is |
maxLines | The number of result lines in a response, the default value is |
The keymatch feed has the following properties:
Property | Description |
---|---|
line_number | The line_number of the KeyMatch configuration rule. |
newLines | The KeyMatch settings to replace the existing values. You can specify multiple lines of KeyMatch values. The line delimiter is |
numLines | The total number of result lines. |
originalLines | The original KeyMatch settings to change. You can include multiple lines of KeyMatch values. The line delimiter is |
startLine | The starting line number of the KeyMatch configuration to change. The minimum value is |
updateMethod | The method to change KeyMatch configurations. Possible values are: |
A KeyMatch configuration rule is in the following format:
Search_Terms,KeyMatch_Type,URL,Title
The KeyMatch_Type is one of three values: KeywordMatch, PhraseMatch, or ExactMatch. The Search_Terms and URL fields cannot be empty. The KeyMatch configuration conforms to the CSV format, which uses a comma to separate values.
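Because each rule is a CSV line, it can be split and checked with Python's csv module before it is sent. The sketch below applies the constraints stated above; `parse_keymatch_rule` and the dictionary keys it returns are names invented for this example.

```python
import csv

KEYMATCH_TYPES = ("KeywordMatch", "PhraseMatch", "ExactMatch")


def parse_keymatch_rule(line):
    """Split one KeyMatch rule into its four fields and validate them."""
    fields = next(csv.reader([line.strip()]))
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    search_terms, match_type, url, title = fields
    if match_type not in KEYMATCH_TYPES:
        raise ValueError(f"unknown KeyMatch_Type: {match_type!r}")
    if not search_terms or not url:
        raise ValueError("Search_Terms and URL cannot be empty")
    return {
        "search_terms": search_terms,
        "type": match_type,
        "url": url,
        "title": title,
    }


rule = parse_keymatch_rule(
    "Python,KeywordMatch,http://www.python.org/,Python Programming Language"
)
print(rule)
```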
Retrieving KeyMatch Settings
To get KeyMatch settings, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/keymatch/Front_End_Name?query=Search_String&startLine=Start_Line&maxLines=Max_Lines
The following example retrieves KeyMatch settings. Note that each gsa:content element whose name is a number (gsa:content name="2", or 0 or 1) shows the use of the line_number property:
<?xml version="1.0" ?> <entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007"> <id>http://ent1:8000/feeds/keymatch/default_frontend</id> <updated>2008-12-05T03:13:19.806Z</updated> <link href="http://ent1:8000/feeds/keymatch/default_frontend" rel="self" type="application/atom+xml"/> <link href="http://ent1:8000/feeds/keymatch/default_frontend" rel="edit" type="application/atom+xml"/> <gsa:content name="entryID">default_frontend</gsa:content> <gsa:content name="2"> Google News,ExactMatch,http://news.google.com/,News </gsa:content> <gsa:content name="numLines">3</gsa:content> <gsa:content name="1"> Google Search,PhraseMatch,http://www.google.com/,I’m Feeling Lucky! </gsa:content> <gsa:content name="0"> Python,KeywordMatch,http://www.python.org/,Python Programming Language </gsa:content> </entry>
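The numbered gsa:content elements do not arrive in line order in the example above, so a client may want to sort them. This Python sketch does that; `numbered_lines` is a hypothetical helper, and the sample is a shortened copy of the entry above.

```python
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"


def numbered_lines(entry_xml):
    """Return (line_number, rule) pairs sorted by line number."""
    root = ET.fromstring(entry_xml)
    lines = []
    for content in root.iter(GSA_CONTENT):
        name = content.get("name", "")
        if name.isdigit():  # numeric names are line numbers; others are metadata
            lines.append((int(name), (content.text or "").strip()))
    return sorted(lines)


# Shortened copy of the KeyMatch entry shown above.
sample = """<entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="entryID">default_frontend</gsa:content>
  <gsa:content name="2">Google News,ExactMatch,http://news.google.com/,News</gsa:content>
  <gsa:content name="numLines">3</gsa:content>
  <gsa:content name="1">Google Search,PhraseMatch,http://www.google.com/,Lucky</gsa:content>
  <gsa:content name="0">Python,KeywordMatch,http://www.python.org/,Python</gsa:content>
</entry>"""

for number, rule in numbered_lines(sample):
    print(number, rule)
```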
Updating KeyMatch Settings
To change KeyMatch settings, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/keymatch/Front_End
The following example appends KeyMatch settings:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>append</gsa:content>
  <gsa:content name='newLines'>
image,KeywordMatch,http://images.google.com/,Google Image Search
video,KeywordMatch,http://www.youtube.com/,Youtube
rss feed,PhraseMatch,http://www.google.com/reader,Reader
  </gsa:content>
</entry>
The following example updates KeyMatch settings:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>update</gsa:content>
  <gsa:content name='startLine'>0</gsa:content>
  <gsa:content name='originalLines'>
image,KeywordMatch,http://images.google.com/,Google Image Search
video,KeywordMatch,http://www.youtube.com/,Youtube
rss feed,PhraseMatch,http://www.google.com/reader,Reader
  </gsa:content>
  <gsa:content name='newLines'>
,,,
video,KeywordMatch,http://video.google.com/,Video Search
rss feed,PhraseMatch,http://www.example.com/,RSS example
  </gsa:content>
</entry>
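Note: In the update example, the empty rule (,,,) in newLines deletes the corresponding rule in originalLines.
The following example replaces a KeyMatch setting: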
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>replace</gsa:content>
  <gsa:content name='newLines'>
image,KeywordMatch,http://images.google.com/,Google Image Search
video,KeywordMatch,http://www.youtube.com/,Youtube
rss feed,PhraseMatch,http://www.google.com/reader,Reader
  </gsa:content>
</entry>
Related Queries
Retrieve or update related queries on a search appliance using the synonym feed. (Related queries are also known as synonyms.)
Use related queries to associate alternative words or phrases with specified search terms.
Parameter | Description |
---|---|
query | A query string to perform a full-text search. For example, if you specify |
startLine | The starting line number of the results, the default value is |
maxLines | The number of result lines in a response, the default value is |
Use the following properties:
Property | Description |
---|---|
line_number | The line_number of a related query configuration rule in the list of rules. |
newLines | The new related query configuration to change. You can include multiple lines of related query values. The line delimiter is |
numLines | The number of total result lines. |
originalLines | The original related query configurations to change. You can include multiple lines of related query values. The line delimiter is |
startLine | The starting line number of the related query configuration to change. The minimum value is |
updateMethod | The method to change related query configurations. Possible values are: |
A related queries configuration rule is in the following format:
Search_Terms,Related_Queries
The Search_Terms and the Related_Queries values cannot be empty. The related queries configuration conforms to the CSV format, which uses a comma to separate values.
Retrieving Related Queries
To get related queries, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/synonym/Front_End?query=Search_String&startLine=Start_Line&maxLines=Max_Lines
The following example retrieves related queries:
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007"> <id>http://ent1:8000/feeds/synonym/default_frontend</id> <updated>2008-12-15T06:41:20.954Z</updated> <link href="http://sa42.example.com:8000/feeds/synonym/default_frontend" rel="self" type="application/atom+xml"/> <link href="http://sa42.example.com:8000/feeds/synonym/default_frontend" rel="edit" type="application/atom+xml"/> <gsa:content name="entryID">default_frontend</gsa:content> <gsa:content name="2">stock,security</gsa:content> <gsa:content name="numLines">3</gsa:content> <gsa:content name="1">google,googol</gsa:content> <gsa:content name="0">airplane,aircraft</gsa:content> </entry>
Updating Related Queries
To change related queries, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/synonym/Front_End
The following example appends related queries:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>append</gsa:content>
  <gsa:content name='newLines'>
airplane,aircraft
google,googol
stock,security
  </gsa:content>
</entry>
The following example updates related queries:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>update</gsa:content>
  <gsa:content name='startLine'>0</gsa:content>
  <gsa:content name='originalLines'>
airplane,aircraft
google,googol
  </gsa:content>
  <gsa:content name='newLines'>
airplane,helicopter
,
  </gsa:content>
</entry>
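Note: In the update example, the empty rule (,) in newLines deletes the corresponding related query in originalLines.
The following example replaces all related queries: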
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='updateMethod'>replace</gsa:content>
  <gsa:content name='newLines'>
airplane,aircraft
google,googol
stock,security
  </gsa:content>
</entry>
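The three request bodies above differ only in which properties they carry, so they can be generated by a single function. The Python sketch below is one possible shape. `update_entry` is a hypothetical name, the list of methods is limited to the three used in the examples, and joining rules with a newline is an assumption rather than something this section states.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"
UPDATE_METHODS = ("append", "update", "replace")  # methods used in the examples


def update_entry(method, new_lines, original_lines=None, start_line=None):
    """Build the PUT body for an append, update, or replace request."""
    if method not in UPDATE_METHODS:
        raise ValueError(f"unsupported updateMethod: {method!r}")
    properties = {"updateMethod": method}
    if method == "update":
        # Mirrors the update example: it also carries startLine and originalLines.
        if original_lines is None or start_line is None:
            raise ValueError("update needs original_lines and start_line")
        properties["startLine"] = str(start_line)
        properties["originalLines"] = original_lines
    properties["newLines"] = new_lines

    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("gsa", GSA_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for name, value in properties.items():
        ET.SubElement(entry, f"{{{GSA_NS}}}content", {"name": name}).text = value
    return ET.tostring(entry, encoding="unicode")


# Assumption: rules are separated by a newline.
body = update_entry(
    "update",
    new_lines="airplane,helicopter\n,",
    original_lines="airplane,aircraft\ngoogle,googol",
    start_line=0,
)
print(body)
```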
Query Suggestion
There are two features for working with query suggestions:
Query Suggestion Blacklist
The query suggestion blacklist supports the /suggest feature described in the "Query Suggestion Service /suggest Protocol" chapter of the Search Protocol Reference. This feature uses the suggest feed to retrieve and update the query suggestion blacklist entries.
Property | Description |
---|---|
suggestBlacklist | Content of the suggest blacklist file. |
The query suggestion blacklist supports the regular expressions in the re2 library (http://code.google.com/p/re2/wiki/Syntax). To specify an exact match, use the following syntax:
^the_word_to_match$
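A small Python sketch of the exact-match syntax follows. `exact_match_pattern` is a hypothetical helper. Python's `re` module is used only as a stand-in to demonstrate the anchoring, on the assumption that a simple anchored, backslash-escaped pattern like this reads the same way in re2.

```python
import re


def exact_match_pattern(word):
    """Anchor a literal word so that only the whole suggestion matches."""
    return "^" + re.escape(word) + "$"


pattern = exact_match_pattern("bad_word_1")
print(pattern)

# Python's re module stands in for re2 in this check.
print(bool(re.search(pattern, "bad_word_1")))
print(bool(re.search(pattern, "bad_word_12")))
```

Without the anchors, an entry such as bad_word_3 in the blacklist example would presumably also match longer suggestions that contain it.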
Retrieving Query Suggestion Blacklist Information
Retrieve query suggestion blacklist information as follows:
GET request URL: http://Search_Appliance:8000/feeds/suggest/suggestBlacklist
Updating Query Suggestion Blacklist Entries
Update query suggestion blacklist entries as follows:
PUT request URL: http://Search_Appliance:8000/feeds/suggest/suggestBlacklist
<?xml version='1.0' encoding='UTF-8'?>
<atom:entry xmlns:atom='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007' xmlns:apps='http://schemas.google.com/apps/2006'>
  <gsa:content name='suggestBlacklist'>
bad_word_3
^bad_word_1$
car[0-9]{4}.*
  </gsa:content>
</atom:entry>
Query Suggestion Refresh
The query suggestion refresh supports the /suggest feature described in the "Query Suggestion Service /suggest Protocol" chapter of the Search Protocol Reference. This feature uses the suggest feed to refresh the query suggestion database.
Property | Description |
---|---|
suggestRefresh | Triggers a query suggestion refresh. |
Refresh query suggestions as follows:
PUT request URL: http://Search_Appliance:8000/feeds/suggest/suggestRefresh
<?xml version='1.0' encoding='UTF-8'?>
<atom:entry xmlns:atom='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007' xmlns:apps='http://schemas.google.com/apps/2006'>
  <gsa:content name='suggestRefresh'>1</gsa:content>
</atom:entry>
Search Status
Retrieve serving status for a search appliance using the servingStatus entry of the status feed.
Property | Description |
---|---|
queriesPerMinute | Average queries per minute recently served on the search appliance. |
searchLatency | Recent search latency in seconds. |
Retrieving the Serving Status Entry
To get the current search appliance serving status, send an authenticated GET request to the status feed URL:
http://Search_Appliance:8002/feeds/status/servingStatus
The following result is an entry that includes the current serving status values for the search appliance:
<?xml version="1.0" encoding="UTF-8"?> <entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007"> <id>http://gsa:8002/feeds/status/servingStatus</id> <updated>2014-03-14T16:05:56.668Z</updated> <link rel="self" type="application/atom+xml" href="http://gsa:8002/feeds/status/servingStatus"/> <link rel="edit" type="application/atom+xml" href="http://gsa:8002/feeds/status/servingStatus"/> <gsa:content name="entryID">servingStatus</gsa:content> <gsa:content name="searchLatency">0.07</gsa:content> <gsa:content name="queriesPerMinute">0.6</gsa:content> </entry>
Reports
The sections that follow describe how to configure the Reports features of the Admin Console:
Search Reports
Generate, update, and delete search reports using the searchReport feed and the following properties.
Property | Description |
---|---|
<Entry Name> | <Search_Report_Name>@<Collection_Name> |
collectionName | (Write only) The collection name, which is only needed when creating a search report. |
diagnosticTerms | Terms to exclude when running scripts that create diagnostic data from test queries. All the specified terms in a search query are removed from the report. Use commas to separate multiple terms. |
isFinal | (Read only) Indicates if the search report contains the final result. If so, it means the last update date is later than |
reportContent | (Read only) The search report content, which is returned only when you retrieve the search report content and the content is ready. |
reportCreationDate | (Read only) The creation date of the search report. |
reportDate | The dates of the queries that are collected in the search report. |
reportName | (Write only) The report name, which is only needed when creating a search report. |
reportState | (Read only) The status of the search report. |
topCount | The number of top queries to be generated. |
withResults | Indicates if a search has results. The default value is |
Listing a Search Report
List a search report using the following query parameters:
Parameter |
Description |
---|---|
|
Collection name for the search report. The default value is |
To list search report entries, send an authenticated GET request to the root entry of the searchReport feed:
http://Search_Appliance:8000/feeds/searchReport/
A list of search report entries is returned:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchReport</id> <updated>2009-03-26T07:26:55.991Z</updated> <link rel=’http://schemas.google.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/searchReport/aaa@default_collection</id> <updated>2009-03-26T07:26:55.991Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T07:26:55.991Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <gsa:content name=’entryID’>aaa@default_collection</gsa:content> <gsa:content name=’diagnosticTerms’>comments</gsa:content> <gsa:content name=’reportState’>2</gsa:content> <gsa:content name=’reportCreationDate’> March 26, 2009 12:14:14 AM PDT </gsa:content> <gsa:content name=’reportDate’>month_3_2009</gsa:content> <gsa:content name=’withResults’>true</gsa:content> <gsa:content name=’topCount’>100</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/searchReport/bbb@default_collection</id> <updated>2009-03-26T07:26:55.991Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T07:26:55.991Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <gsa:content name=’entryID’>bbb@default_collection</gsa:content> <gsa:content name=’diagnosticTerms’></gsa:content> <gsa:content 
name=’reportState’>2</gsa:content> <gsa:content name=’reportCreationDate’> March 26, 2009 12:24:16 AM PDT </gsa:content> <gsa:content name=’reportDate’>month_3_2009</gsa:content> <gsa:content name=’withResults’>true</gsa:content> <gsa:content name=’topCount’>100</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry> </feed>
Creating a Search Report
Create a new search report entry by sending an authenticated POST request to the root entry of the searchReport feed:
http://Search_Appliance:8000/feeds/searchReport/
The possible date formats for reports are as follows.
Purpose |
Format |
---|---|
Date |
date |
Month |
month |
Year |
year |
Date range |
range |
An example request with content is:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='reportName'>bbb</gsa:content>
  <gsa:content name='collectionName'>default_collection</gsa:content>
  <gsa:content name='reportDate'>month_3_2009</gsa:content>
  <gsa:content name='withResults'>true</gsa:content>
  <gsa:content name='topCount'>100</gsa:content>
</entry>
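The reportDate values that appear in this guide's examples are month_3_2009 and date_3_25_2009. The Python sketch below generates values of those two shapes from a date object. The field order (month, day, year) and the lack of zero padding are inferred from those examples, the function names are hypothetical, and the year and date range formats are not covered because no example of them appears here.

```python
import datetime


def month_report_date(day):
    """Format a reportDate covering the month that contains `day`."""
    return f"month_{day.month}_{day.year}"


def date_report_date(day):
    """Format a reportDate covering the single date `day`."""
    return f"date_{day.month}_{day.day}_{day.year}"


sample_day = datetime.date(2009, 3, 25)
print(month_report_date(sample_day))
print(date_report_date(sample_day))
```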
A new search report entry is generated and returned:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchReport</id> <updated>2009-03-26T07:22:25.162Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T07:22:25.162Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport’/> <gsa:content name=’entryID’>bbb@default_collection</gsa:content> <gsa:content name=’diagnosticTerms’></gsa:content> <gsa:content name=’reportState’>1</gsa:content> <gsa:content name=’reportCreationDate’> March 26, 2009 12:22:25 AM PDT </gsa:content> <gsa:content name=’reportDate’>month_3_2009</gsa:content> <gsa:content name=’withResults’>true</gsa:content> <gsa:content name=’topCount’>100</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry>
Retrieving a Search Report
To check search report status and retrieve search report content, send an authenticated GET request to a search report entry of the searchReport feed:
http://Search_Appliance:8000/feeds/searchReport/aaa@default_collection
The following is a returned search report entry that contains log content (if the content is ready):
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchReport/aaa%40default_collection</id> <updated>2009-03-26T07:14:56.343Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T07:14:56.343Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport/aaa%40default_collection’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport/aaa%40default_collection’/> <gsa:content name=’entryID’>aaa@default_collection</gsa:content> <gsa:content name=’diagnosticTerms’>comments</gsa:content> <gsa:content name=’reportState’>2</gsa:content> <gsa:content name=’reportContent’>******Report Content******</gsa:content> <gsa:content name=’reportCreationDate’> March 26, 2009 12:14:14 AM PDT </gsa:content> <gsa:content name=’reportDate’>month_3_2009</gsa:content> <gsa:content name=’withResults’>true</gsa:content> <gsa:content name=’topCount’>100</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry>
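The entry name combines the report name and collection name with an at sign, and the id and link values in the response show that character percent-encoded as %40. The Python sketch below builds an entry URL in the encoded form. `report_entry_url` and its parameters are hypothetical names for this example.

```python
from urllib.parse import quote, unquote


def report_entry_url(appliance, feed, report_name, collection):
    """Build the entry URL for a report named Report_Name@Collection_Name."""
    entry_name = f"{report_name}@{collection}"
    # safe="" makes quote() encode "@" as %40, as in the response ids above.
    return f"http://{appliance}:8000/feeds/{feed}/{quote(entry_name, safe='')}"


url = report_entry_url("gsa", "searchReport", "aaa", "default_collection")
print(url)
print(unquote(url.rsplit("/", 1)[1]))  # recovers the readable entry name
```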
Updating a Search Report
Update the search report status and get search report content by sending an authenticated PUT request to a search report entry of the searchReport feed. The request entry does not need any properties.
http://Search_Appliance:8000/feeds/searchReport/bbb@default_collection
An example request is:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
</entry>
A search report entry is returned:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchReport/bbb%40default_collection</id> <updated>2009-03-26T07:24:16.099Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T07:24:16.099Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport/bbb%40default_collection’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchReport/bbb%40default_collection’/> <gsa:content name=’entryID’>bbb@default_collection</gsa:content> <gsa:content name=’diagnosticTerms’></gsa:content> <gsa:content name=’reportState’>3</gsa:content> <gsa:content name=’reportCreationDate’> March 26, 2009 12:22:25 AM PDT </gsa:content> <gsa:content name=’reportDate’>month_3_2009</gsa:content> <gsa:content name=’withResults’>true</gsa:content> <gsa:content name=’topCount’>100</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry>
Deleting a Search Report
To delete a search report, send an authenticated DELETE request to a search report entry of the searchReport feed:
http://Search_Appliance:8000/feeds/searchReport/bbb@default_collection
The search report entry is deleted.
Search Logs
Generate, update, and delete search logs using the searchLog feed.
Search log entry properties:
Property | Description |
---|---|
<Entry Name> | <Search_Log_Name>@<Collection_Name> |
collectionName | (Write only) The collection name, which is only needed when creating a search log. |
fromLine | (Read only) The starting line of a search log that returns in |
isFinal | (Read only) Indicates if the search log contains the final result. If so, it means the last update date is later than |
logContent | (Read only) A part of the search log content that is returned when getting search log content and the content is ready. |
reportCreationDate | (Read only) The creation date of a search log. |
reportDate | The dates for the queries that are collected in the search log. |
reportName | (Write only) The report name, which is only needed when creating a search log. |
reportState | (Read only) The status of the search log: |
toLine | (Read only) The ending line of the search log that is returned in |
totalLines | (Read only) The number of lines in the search log that are returned in |
Listing a Search Log
List the entries in a search log using the following query parameters:
Parameter |
Description |
---|---|
|
Collection Name of a search log. The default value is |
To list search log entries, send an authenticated GET request to the root entry of the searchLog feed:
http://Search_Appliance:8000/feeds/searchLog/
A list of search log entries is returned:
<?xml version=’1.0’ encoding=’UTF-8’?> <feed xmlns=’http://www.w3.org/2005/Atom’ xmlns:openSearch=’http://a9.com/-/spec/opensearchrss/1.0/’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchLog</id> <updated>2009-03-26T06:44:31.094Z</updated> <link rel=’http://schemas.google.com/g/2005#feed’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <generator version=’0.5’ uri=’http://gsa:8000/gsa’> Google Search Appliance </generator> <openSearch:startIndex>1</openSearch:startIndex> <entry> <id>http://gsa:8000/feeds/searchLog/aaa@default_collection</id> <updated>2009-03-26T06:44:31.094Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T06:44:31.094Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <gsa:content name=’entryID’>aaa@default_collection</gsa:content> <gsa:content name=’reportState’>2</gsa:content> <gsa:content name=’reportCreationDate’> March 25, 2009 11:20:20 PM PDT </gsa:content> <gsa:content name=’reportDate’>date_3_25_2009</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry> <entry> <id>http://gsa:8000/feeds/searchLog/bbb@default_collection</id> <updated>2009-03-26T06:44:31.094Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T06:44:31.094Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <gsa:content name=’entryID’>bbb@default_collection</gsa:content> <gsa:content name=’reportState’>2</gsa:content> <gsa:content name=’reportCreationDate’> March 25, 2009 11:42:28 PM PDT </gsa:content> <gsa:content name=’reportDate’>date_3_25_2009</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> 
</entry> </feed>
Creating a Search Log
To create a new search log entry, send an authenticated POST request to the root entry of the searchLog feed:
http://Search_Appliance:8000/feeds/searchLog/
An example request with content is:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='reportName'>bbb</gsa:content>
  <gsa:content name='collectionName'>default_collection</gsa:content>
  <gsa:content name='reportDate'>date_3_25_2009</gsa:content>
</entry>
A new search log entry is generated and returned:
<?xml version=’1.0’ encoding=’UTF-8’?> <entry xmlns=’http://www.w3.org/2005/Atom’ xmlns:gsa=’http://schemas.google.com/gsa/2007’> <id>http://gsa:8000/feeds/searchLog</id> <updated>2009-03-26T06:42:28.742Z</updated> <app:edited xmlns:app=’http://purl.org/atom/app#’> 2009-03-26T06:42:28.742Z </app:edited> <link rel=’self’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <link rel=’edit’ type=’application/atom+xml’ href=’http://gsa:8000/feeds/searchLog’/> <gsa:content name=’entryID’>bbb@default_collection</gsa:content> <gsa:content name=’reportState’>1</gsa:content> <gsa:content name=’reportCreationDate’> March 25, 2009 11:42:28 PM PDT </gsa:content> <gsa:content name=’reportDate’>date_3_25_2009</gsa:content> <gsa:content name=’isFinal’>false</gsa:content> </entry>
Retrieving Search Log Content
To check the search log status and get search log content, send an authenticated GET request to a search log entry of the searchLog feed using the following parameters.
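Parameter | Description |
---|---|
query | Query string for the logContent. |
maxLines | The maximum number of logContent lines to retrieve. |
startLine | The first logContent line to retrieve. |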
Example:
http://Search_Appliance:8000/feeds/searchLog/aaa@default_collection?query=document
If the content is ready, the search appliance returns a search log entry with logContent:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/searchLog/aaa%40default_collection</id>
  <updated>2009-03-26T06:22:41.416Z</updated>
  <app:edited xmlns:app='http://purl.org/atom/app#'>
    2009-03-26T06:22:41.416Z
  </app:edited>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/searchLog/aaa%40default_collection'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/searchLog/aaa%40default_collection'/>
  <gsa:content name='entryID'>aaa@default_collection</gsa:content>
  <gsa:content name='toLine'>2</gsa:content>
  <gsa:content name='logContent'>
    127.0.0.2!127.0.0.1 - - [25/Mar/2009:23:18:43 -0800]
    "GET /search?q=document&btnG=Google+Search&access=p&
    client=default_frontend&output=xml_no_dtd&
    proxystylesheet=default_frontend&sort=date%3AD%3AL%3Ad1&
    entqr=0&oe=UTF-8&ie=UTF-8&ud=1&site=default_collection&
    ip=172.30.120.197 HTTP/1.1" 200 2432 3 0.02
    127.0.0.2!127.0.0.1 - - [25/Mar/2009:23:18:14 -0800]
    "GET /search?q=document&btnG=Google+Search&access=p&
    client=default_frontend&output=xml_no_dtd&
    proxystylesheet=default_frontend&sort=date%3AD%3AL%3Ad1&
    entqr=0&oe=UTF-8&ie=UTF-8&ud=1&site=default_collection&
    ip=172.30.120.197 HTTP/1.1" 200 2432 3 0.02
  </gsa:content>
  <gsa:content name='reportState'>2</gsa:content>
  <gsa:content name='fromLine'>1</gsa:content>
  <gsa:content name='totalLines'>2</gsa:content>
  <gsa:content name='reportCreationDate'>
    March 25, 2009 11:20:20 PM PDT
  </gsa:content>
  <gsa:content name='reportDate'>date_3_25_2009</gsa:content>
  <gsa:content name='isFinal'>false</gsa:content>
</entry>
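As an illustration of the retrieval step, the following Python sketch builds the entry URL and turns a returned entry into a dictionary. The helper names are hypothetical. The entry ID is percent-encoded here (`@` becomes `%40`), which matches the form shown in the response `id` rather than the raw `@` in the example URL.

```python
import urllib.parse
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"


def search_log_url(host, entry_id, **params):
    """Build the GET URL for one search log entry, with optional parameters."""
    encoded_id = urllib.parse.quote(entry_id, safe="")
    url = f"http://{host}:8000/feeds/searchLog/{encoded_id}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def entry_properties(entry_xml):
    """Return the gsa:content values of an entry as a dict keyed by name."""
    root = ET.fromstring(entry_xml)
    return {
        c.get("name"): (c.text or "").strip()
        for c in root.iter(GSA_CONTENT)
    }
```

A caller might check that `logContent` is present in the returned dictionary before reading it, since the entry carries log content only once the content is ready.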
Updating a Search Log
To update the search log status and get search log content, send an authenticated PUT request to a search log entry of the searchLog feed. There are no properties for this use of the searchLog feed:
http://Search_Appliance:8000/feeds/searchLog/bbb@default_collection
Send an entry with no properties as the request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
</entry>
The search appliance returns the search log entry:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/searchLog/bbb%40default_collection</id>
  <updated>2009-03-26T06:50:05.928Z</updated>
  <app:edited xmlns:app='http://purl.org/atom/app#'>
    2009-03-26T06:50:05.928Z
  </app:edited>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/searchLog/bbb%40default_collection'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/searchLog/bbb%40default_collection'/>
  <gsa:content name='entryID'>bbb@default_collection</gsa:content>
  <gsa:content name='reportState'>3</gsa:content>
  <gsa:content name='reportCreationDate'>
    March 25, 2009 11:42:28 PM PDT
  </gsa:content>
  <gsa:content name='reportDate'>date_3_25_2009</gsa:content>
  <gsa:content name='isFinal'>false</gsa:content>
</entry>
Deleting a Search Log
To delete a search log, send an authenticated DELETE request to a search log entry of the searchLog feed:
http://Search_Appliance:8000/feeds/searchLog/bbb@default_collection
The search appliance deletes the search log entry.
GSA Unification
The sections that follow describe how to configure the GSA Unification features of the Admin Console:
- Configuring a GSA Unification Network
- Adding a GSA Unification Node
- Retrieving a Node Configuration
- Retrieving All Node Configurations
- Updating a Node Configuration
- Deleting a Node
GSA Unification is also known as dynamic scalability. GSA Unification features are provided by the federation feed.
Configuring a GSA Unification Network
Retrieve, update, create, or delete the GSA Unification node configuration and retrieve the node configuration of all nodes in the network on the Google Search Appliance.
Property | Description |
---|---|
 | The ID of the search appliance, required to identify the node in node operations. |
federationNetworkIP | The private tunnel IP address (virtual address) for the node. This address must be an RFC 1918 address. GSA Unification works best when the IP addresses of the nodes are numerically near, such as 10.1.1.1, 10.1.1.2, 10.1.1.3, and so on. The search appliance disallows GSA Unification for nodes that are not in the same /16 subnet. This is a problem only if there are more than 65534 nodes in a GSA Unification network. GSA Unification nodes communicate on TCP port 10999. |
hostname | The host name of the search appliance. |
nodeType | The type of search appliance. Possible values include PRIMARY and SECONDARY. |
scoringBias | The scoring bias value for this node. Valid values are integers between -99 and 99. The scoring bias value reflects the weighting to be given to results from this node. A higher value means a higher weighting. [Figure: scoring bias values and their Admin Console equivalents] |
secretToken | The secret token that you use to establish a connection to this node. This token can be any non-empty string. The remote search appliance needs this token for the connection handshake. |
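The constraints in the table can be checked on the client before a node is submitted. The sketch below is one possible pre-flight check using Python's standard `ipaddress` module. The function name and message texts are hypothetical, and it assumes the -99 to 99 range is inclusive.

```python
import ipaddress

# The three private address blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def validate_node(network_ip, scoring_bias, secret_token, peer_ips=()):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    address = ipaddress.ip_address(network_ip)
    if not any(address in net for net in RFC1918_NETWORKS):
        problems.append("federationNetworkIP is not an RFC 1918 address")
    # All nodes are expected to share the same /16 subnet.
    subnet = ipaddress.ip_network(f"{network_ip}/16", strict=False)
    for peer in peer_ips:
        if ipaddress.ip_address(peer) not in subnet:
            problems.append(f"{peer} is outside {subnet}")
    # Assumes the documented range includes both endpoints.
    if not isinstance(scoring_bias, int) or not -99 <= scoring_bias <= 99:
        problems.append("scoringBias must be an integer between -99 and 99")
    if not secret_token:
        problems.append("secretToken must be a non-empty string")
    return problems
```

For example, `validate_node("10.1.1.1", 20, "token", ["10.1.1.2", "10.1.1.3"])` reports no problems, while a peer at `10.2.0.1` would be flagged as outside `10.1.0.0/16`.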
Adding a GSA Unification Node
To add a GSA Unification node, send an authenticated POST request to the following URL:
http://Search_Appliance:8000/feeds/federation
The following is an example of a request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>S4-JAX9N2PQ4GNAB</gsa:content>
  <gsa:content name='nodeType'>SECONDARY</gsa:content>
  <gsa:content name='federationNetworkIP'>10.0.0.2</gsa:content>
  <gsa:content name='secretToken'>token</gsa:content>
  <gsa:content name='hostname'>host1.domain.com</gsa:content>
  <gsa:content name='scoringBias'>20</gsa:content>
</entry>
Retrieving a Node Configuration
To retrieve the configuration information about a GSA Unification node, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/federation/Appliance_Id
The following example shows a sample result for a secondary node:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB</id>
  <updated>2008-12-11T08:18:04.372Z</updated>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB'/>
  <gsa:content name='entryID'>S4-JAX9N2PQ4GNAB</gsa:content>
  <gsa:content name='nodeType'>SECONDARY</gsa:content>
  <gsa:content name='federationNetworkIP'>10.0.0.2</gsa:content>
  <gsa:content name='secretToken'>token</gsa:content>
  <gsa:content name='hostname'>host1.domain.com</gsa:content>
  <gsa:content name='scoringBias'>20</gsa:content>
  <gsa:content name='remoteFrontend'>remoteFrontend</gsa:content>
  <gsa:content name='slaveTimeout'>100</gsa:content>
</entry>
The following example shows a sample result for a primary node:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB</id>
  <updated>2008-12-11T08:18:04.372Z</updated>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/federation/S4-JAX9N2PQ4GNAB'/>
  <gsa:content name='entryID'>S4-JAX9N2PQ4GNAB</gsa:content>
  <gsa:content name='nodeType'>PRIMARY</gsa:content>
  <gsa:content name='federationNetworkIP'>10.0.0.2</gsa:content>
  <gsa:content name='secretToken'>token</gsa:content>
  <gsa:content name='hostname'>host1.domain.com</gsa:content>
  <gsa:content name='secondaryNodes'>Appliance_ID1, Appliance_ID2</gsa:content>
</entry>
Retrieving All Node Configurations
To retrieve information on all GSA Unification nodes, send an authenticated GET request to the following URL:
http://Search_Appliance:8000/feeds/federation
The following example shows a sample result that contains a secondary node and a primary node:
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'
      xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/federation</id>
  <updated>2008-12-11T08:01:21.253Z</updated>
  <link rel='http://schemas.example.com/g/2005#feed'
      type='application/atom+xml'
      href='http://gsa:8000/feeds/federation'/>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/federation'/>
  <generator version='0.5' uri='http://gsa:8000/gsa'>
    Google Search Appliance
  </generator>
  <openSearch:startIndex>1</openSearch:startIndex>
  <entry>
    <id>http://gsa:8000/feeds/federation/ApplianceId1</id>
    <updated>2008-12-11T08:01:21.253Z</updated>
    <link rel='self' type='application/atom+xml'
        href='http://gsa:8000/feeds/federation'/>
    <link rel='edit' type='application/atom+xml'
        href='http://gsa:8000/feeds/federation'/>
    <gsa:content name='entryID'>Appliance_Id1</gsa:content>
    <gsa:content name='nodeType'>SECONDARY</gsa:content>
    <gsa:content name='federationNetworkIP'>10.0.0.2</gsa:content>
    <gsa:content name='secretToken'>token</gsa:content>
    <gsa:content name='hostname'>host1.domain.com</gsa:content>
    <gsa:content name='scoringBias'>20</gsa:content>
    <gsa:content name='remoteFrontend'>remoteFrontend</gsa:content>
    <gsa:content name='slaveTimeout'>100</gsa:content>
  </entry>
  <entry>
    <id>http://gsa:8000/feeds/collection/new2_collection</id>
    <updated>2008-12-11T08:01:21.253Z</updated>
    <link rel='self' type='application/atom+xml'
        href='http://gsa:8000/feeds/federation'/>
    <link rel='edit' type='application/atom+xml'
        href='http://gsa:8000/feeds/federation'/>
    <gsa:content name='entryID'>Appliance_Id</gsa:content>
    <gsa:content name='nodeType'>PRIMARY</gsa:content>
    <gsa:content name='federationNetworkIP'>10.0.0.3</gsa:content>
    <gsa:content name='secretToken'>token1</gsa:content>
    <gsa:content name='hostname'>host2.domain.com</gsa:content>
    <gsa:content name='scoringBias'>40</gsa:content>
    <gsa:content name='secondaryNodes'></gsa:content>
  </entry>
</feed>
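A feed like the one above can be reduced to a list of per-node dictionaries. The following Python sketch shows one way to do that with the standard library. The function name is hypothetical, and treating `secondaryNodes` as a comma-separated list is an assumption based on the primary node example earlier in this section.

```python
import xml.etree.ElementTree as ET

ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"
GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"


def parse_nodes(feed_xml):
    """Return one dict of gsa:content values per entry in a federation feed."""
    nodes = []
    for entry in ET.fromstring(feed_xml).iter(ATOM_ENTRY):
        node = {
            c.get("name"): (c.text or "").strip()
            for c in entry.iter(GSA_CONTENT)
        }
        # Assumed format: appliance IDs separated by commas.
        if "secondaryNodes" in node:
            node["secondaryNodes"] = [
                part.strip()
                for part in node["secondaryNodes"].split(",")
                if part.strip()
            ]
        nodes.append(node)
    return nodes
```

An empty `secondaryNodes` element, as in the sample feed, becomes an empty list.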
Updating a Node Configuration
To update the configuration of a node in the GSA Unification network, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/federation/Appliance_Id
The following is an example request body:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='entryID'>Appliance_Id</gsa:content>
  <gsa:content name='nodeType'>SECONDARY</gsa:content>
  <gsa:content name='federationNetworkIP'>10.0.0.5</gsa:content>
  <gsa:content name='secretToken'>token2</gsa:content>
  <gsa:content name='hostname'>host5.domain.com</gsa:content>
  <gsa:content name='scoringBias'>40</gsa:content>
</entry>
Deleting a Node
To delete a node from the GSA Unification network, send an authenticated DELETE request to the following URL:
http://Search_Appliance:8000/feeds/federation/Appliance_Id
Administration
The sections that follow describe how to configure the Administration features of the Admin Console:
License Information
Retrieve license information for a search appliance using the licenseInfo entry of the info feed.
Property | Description |
---|---|
applianceID | Provides the identification value for the Google Search Appliance software. This value is also known as the serial number for the software. |
licenseID | Provides the unique license identification value. |
licenseValidUntil | Identifies when the search appliance software license will expire. |
maxCollections | Indicates the maximum number of collections. Configure collections at the Crawl and Index > Collections page. |
maxFrontends | Indicates the maximum number of front ends. Configure front ends at the Serving > Front Ends page. |
maxPages | Maximum number of content items that you can index with this product. Content items include documents, images, and content from the feeds interface. |
Retrieving License Information
To get the license information for a search appliance, send an authenticated GET request to the info feed URL:
http://Search_Appliance:8000/feeds/info/licenseInfo
The following example result is an entry that includes the current license information values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/info/licenseInfo</id>
  <updated>2008-12-12T09:11:42.455Z</updated>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/info/licenseInfo'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/info/licenseInfo'/>
  <gsa:content name='entryID'>licenseInfo</gsa:content>
  <gsa:content name='maxFrontends'>unlimited</gsa:content>
  <gsa:content name='licenseID'>
    license_S5-QJBPL6N3H8JJA_20081211_220512
  </gsa:content>
  <gsa:content name='maxPages'>unlimited</gsa:content>
  <gsa:content name='maxCollections'>unlimited</gsa:content>
  <gsa:content name='licenseValidUntil'>March 7, 9009</gsa:content>
  <gsa:content name='applianceID'>S5-QJBPL6N3H8JJA</gsa:content>
</entry>
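The limit properties in a license entry hold either a number or the string `unlimited`, so a client usually needs to normalize them. The sketch below is one way to do that. The function name is hypothetical, and it assumes that any limited value is a plain integer, which the sample does not show.

```python
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"
LIMIT_NAMES = ("maxCollections", "maxFrontends", "maxPages")


def license_limits(entry_xml):
    """Return the three license limits; None stands for 'unlimited'."""
    values = {
        c.get("name"): (c.text or "").strip()
        for c in ET.fromstring(entry_xml).iter(GSA_CONTENT)
    }
    limits = {}
    for name in LIMIT_NAMES:
        raw = values[name]  # raises KeyError if the property is missing
        # Assumption: a limited value is a plain integer such as "500000".
        limits[name] = None if raw == "unlimited" else int(raw)
    return limits
```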
Import and Export
Import or export a search appliance configuration using the importExport entry of the config feed.
Common query parameters for all requests:
Parameter | Description |
---|---|
password | The password of the exported configuration. |

The importExport entry properties:

Property | Description |
---|---|
xmlData | The content of the exported configuration. |
password | The password for generating the configuration file. |
Exporting a Configuration
To export a search appliance configuration, send an authenticated GET request to the importExport entry of the config feed:
http://Search_Appliance:8000/feeds/config/importExport?password=12345678
The search appliance returns an importExport entry:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/config/importExport</id>
  <updated>2009-03-26T05:56:23.092Z</updated>
  <app:edited xmlns:app='http://purl.org/atom/app#'>
    2009-03-26T05:56:23.092Z
  </app:edited>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/config/importExport'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/config/importExport'/>
  <gsa:content name='entryID'>importExport</gsa:content>
  <gsa:content name='xmlData'>
    **********configuration content***********
  </gsa:content>
</entry>
Importing a Configuration
To import a search appliance configuration, send an authenticated PUT request to the importExport entry of the config feed:
http://Search_Appliance:8000/feeds/config/importExport
The following example shows an importExport entry with content:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='password'>12345678</gsa:content>
  <gsa:content name='xmlData'>
    **********configuration content***********
  </gsa:content>
</entry>
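An export followed by an import could be wired together roughly as follows. This Python sketch only builds the export URL and the import request body. The function names are hypothetical, and it assumes the configuration travels as escaped text inside the `xmlData` element, which this section does not state explicitly.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"


def export_url(host, password):
    """Build the GET URL for exporting a configuration."""
    query = urllib.parse.urlencode({"password": password})
    return f"http://{host}:8000/feeds/config/importExport?{query}"


def import_body(password, xml_data):
    """Build the PUT body that carries a configuration back to the appliance."""
    ET.register_namespace("gsa", GSA_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for name, value in (("password", password), ("xmlData", xml_data)):
        content = ET.SubElement(entry, f"{{{GSA_NS}}}content", {"name": name})
        # ElementTree escapes markup characters in text when serializing.
        content.text = value
    return ET.tostring(entry, encoding="utf-8")
```

Using `urlencode` for the password keeps characters such as `&` or spaces from corrupting the query string.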
Event Log
Retrieve the event log for a search appliance using the eventLog entry of the logs feed.
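Parameter | Description |
---|---|
query | Query string for the logContent. |
startLine | The first logContent line to retrieve. |
maxLines | The maximum number of logContent lines to retrieve. |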
The following properties enable access to log content.
Property | Description |
---|---|
fromLine | The starting line of the logContent. |
logContent | The log content. |
toLine | The ending line of the logContent. |
totalLines | Total lines of the logContent. |
Retrieving the Event Log
Retrieve the event log information for a search appliance by sending an authenticated GET request to the eventLog feed URL:
http://Search_Appliance:8000/feeds/logs/eventLog?query=User&startLine=Starting_Line&maxLines=Max_Lines
The result is an entry that includes the current event log values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/logs/eventLog</id>
  <updated>2008-12-12T09:03:37.294Z</updated>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/logs/eventLog'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/logs/eventLog'/>
  <gsa:content name='entryID'>eventLog</gsa:content>
  <gsa:content name='toLine'>11</gsa:content>
  <gsa:content name='logContent'>
    @ 2008/12/11 23:39:40: User logged in: [admin logged in from
    172.30.123.69 at 2008_12_11_23_39_40_PST]
    @ 2008/12/11 23:39:38: User logged in: [admin logged in from
    172.30.123.69 at 2008_12_11_23_39_38_PST]
  </gsa:content>
  <gsa:content name='fromLine'>10</gsa:content>
  <gsa:content name='totalLines'>67</gsa:content>
</entry>
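Because each response reports totalLines, a client can walk through a long event log one window at a time. The Python sketch below generates the URLs for such a walk. The function name is hypothetical, and it assumes that line numbering starts at 1 and that totalLines was read from an earlier response to the same query.

```python
import urllib.parse


def event_log_urls(host, total_lines, max_lines, query=None):
    """Yield one eventLog URL per window of max_lines lines."""
    # Assumption: the first log line is line 1.
    for start in range(1, total_lines + 1, max_lines):
        params = {"startLine": start, "maxLines": max_lines}
        if query:
            params = {"query": query, **params}
        yield (
            f"http://{host}:8000/feeds/logs/eventLog?"
            + urllib.parse.urlencode(params)
        )
```

With the sample response above (`totalLines` of 67) and a window of 30 lines, the generator produces three URLs, starting at lines 1, 31, and 61.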
System Status
Retrieve the system status for a search appliance using the systemStatus entry of the status feed.
Property | Description |
---|---|
cpuTemperature | Temperature of the CPU: 0 if okay, 1 if caution, 2 if critical. |
diskCapacity | Remaining disk capacity of the search appliance: 0 if okay, 1 if caution, 2 if critical. |
machineHealth | Health of the internal system components. |
overallHealth | Overall health of the entire search appliance: 0 if okay, 1 if caution, 2 if critical. |
raidHealth | Health of the RAID array: 0 if okay, 1 if caution, 2 if critical. |
Retrieving a System Status Entry
To get the current search appliance system status, send an authenticated GET request to the status feed URL:
http://Search_Appliance:8000/feeds/status/systemStatus
The following result is an entry that includes current system status values for the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <id>http://gsa:8000/feeds/status/systemStatus</id>
  <updated>2008-12-09T23:53:14.288Z</updated>
  <link rel='self' type='application/atom+xml'
      href='http://gsa:8000/feeds/status/systemStatus'/>
  <link rel='edit' type='application/atom+xml'
      href='http://gsa:8000/feeds/status/systemStatus'/>
  <gsa:content name='entryID'>systemStatus</gsa:content>
  <gsa:content name='overallHealth'>0</gsa:content>
  <gsa:content name='diskCapacity'>0</gsa:content>
  <gsa:content name='raidHealth'>0</gsa:content>
  <gsa:content name='cpuTemperature'>0</gsa:content>
  <gsa:content name='machineHealth'>0</gsa:content>
</entry>
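A monitoring script might translate the numeric codes in a status entry into labels and report anything that is not okay. The following Python sketch does this for an entry like the one above. The function names are hypothetical, and applying the 0/1/2 mapping to every property, including machineHealth, is an assumption.

```python
import xml.etree.ElementTree as ET

GSA_CONTENT = "{http://schemas.google.com/gsa/2007}content"
LEVELS = {"0": "okay", "1": "caution", "2": "critical"}


def system_status(entry_xml):
    """Map each status property to 'okay', 'caution' or 'critical'."""
    status = {}
    for content in ET.fromstring(entry_xml).iter(GSA_CONTENT):
        name = content.get("name")
        if name == "entryID":
            continue  # identifies the entry, not a health value
        value = (content.text or "").strip()
        # Unknown codes are passed through unchanged.
        status[name] = LEVELS.get(value, value)
    return status


def needs_attention(status):
    """Return the sorted names of properties whose level is not 'okay'."""
    return sorted(name for name, level in status.items() if level != "okay")
```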
Shut Down and Reboot
Shut down or reboot the search appliance.
Property | Description |
---|---|
command | Command sent to the search appliance. The command can be shutdown or reboot. |
 | Indicates the search appliance status: |
Shutting Down or Rebooting a Search Appliance
To shut down or reboot a search appliance, send an authenticated PUT request to the following URL:
http://Search_Appliance:8000/feeds/command/shutdown
The following example request body reboots the search appliance:
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gsa='http://schemas.google.com/gsa/2007'>
  <gsa:content name='command'>reboot</gsa:content>
</entry>
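Because this request takes the appliance offline, it is worth validating the command before building the request. The Python sketch below is one way to do so. The function name and the `GoogleLogin auth=` header format are assumptions, and `shutdown` as a command value is inferred from the section title and URL, since only `reboot` appears in the example.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GSA_NS = "http://schemas.google.com/gsa/2007"
# "reboot" comes from the example; "shutdown" is an assumed value.
ALLOWED_COMMANDS = ("shutdown", "reboot")


def command_request(host, token, command):
    """Build (but do not send) the PUT request for a shutdown or reboot."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")
    ET.register_namespace("gsa", GSA_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    content = ET.SubElement(entry, f"{{{GSA_NS}}}content", {"name": "command"})
    content.text = command
    return urllib.request.Request(
        f"http://{host}:8000/feeds/command/shutdown",
        data=ET.tostring(entry, encoding="utf-8"),
        method="PUT",
        headers={
            # Both header values are assumptions, not taken from this section.
            "Content-Type": "application/atom+xml",
            "Authorization": f"GoogleLogin auth={token}",
        },
    )
```

Keeping the send step separate (for example, a later call to `urllib.request.urlopen`) leaves room for a confirmation prompt before the appliance goes down.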