
Specifications and Usage Limits

November 2017

This document provides information about system limits in GSA 7.6. Many of these limits are configurable and have default values that you can change from the Admin Console. These limits might have recommended values for acceptable performance or a maximum value that is enforced by the search appliance and cannot be exceeded. Other limits are built into the system and cannot be changed. You can use this information to help you plan your deployment and to configure your system for optimal performance. The information is organized into sections based on feature.

Google recommends accepting default values or customizing settings before crawling starts. To apply changed settings to documents that have already been crawled and indexed, the search appliance must recrawl and reindex those documents.

Content Sources

Web Crawl Default Limit Enforced Notes
Maximum file size to download
  • Text/HTML files
20MB 2048MB Yes The search appliance downloads text and HTML documents up to the maximum file size. If a document is larger than the set value, the search appliance truncates the file and discards the remainder.

The amount of each document that is indexed is determined by the value of Amount to Index per Document in the Admin Console. See Amount of each document to index.

Maximum file size to download
  • Other file types (not HTML or text)
100MB 2048MB Yes Other document types that are within the set value are converted to HTML. If a document is larger than the set value, the search appliance discards the file completely.

Compressed document types, such as Microsoft Office 2007, might not convert properly if the uncompressed file size is larger than the set limit.

The amount of each document that is indexed is determined by the value of Amount to Index per Document in the Admin Console. See Amount of each document to index.

Maximum number of URLs to crawl Number of URLs allowed by your license See table below Yes Decreasing this value could result in removing some documents from the index or preventing new content from being indexed.

If your index already contains the maximum number of URLs, or your license limit has been exceeded, the search appliance reduces the number of indexed documents as follows:

  • Documents are removed to bring the total number of documents to the license limit.
  • Documents with the lock attribute set to true are deleted last.
Maximum number of start URLs, follow patterns, or do not follow patterns No upper limit No There is no limit on the number of URLs you can add in the Admin Console; however, the number of URLs that the search appliance can crawl is limited by your license (see table below).

Also note that adding a large number of unique hostnames to your start URLs can impact performance and cause a proxy error.

Maximum number of concurrent web server connections (host load) 4 Depends on the Web server's load capacity Yes Increasing the host load can speed up the crawl rate but it puts more load on your Web servers. Google recommends that you experiment with the host load settings at off-peak times or in controlled environments so that you can monitor the effect it has on your Web servers.

The number of connections might drop below the set value if the search appliance must reduce the crawl rate to achieve an acceptable response time, depending on system activity.

You can specify the host load as a decimal value, such as .5 or 2.0, with up to two decimal places. The search appliance behaves differently depending on the value you set:

  • A value of 1 or more sets the average number of connections within any one-minute window. For example, a value of 2.0 indicates that, on average, the search appliance opens two concurrent connections to each host.
  • A decimal value of 1 or more also sets an average number of connections. For example, a value of 3.5 indicates that, on average, the search appliance opens 3.5 concurrent connections to each host. Further, a value of 3.5 indicates that 50% of the time, three connections are open and 50% of the time, four connections are open.
  • A decimal value under 1 sets the percentage of time during which the search appliance opens connections. For example, a value of .25 indicates that, on average, connections to the Web or file server are opened 25% of the time, or for approximately 15 seconds per minute. The search appliance bases the number of connections on the responsiveness of the server.
  • A value of zero (0) stops content from being crawled.
Exceptions to Web server host load Depends on the Web server's load capacity No You can set different host load values for individual URL patterns. Note that the search appliance implements host load settings based on IP address. This means that if you add multiple URL patterns for the same IP address (host), the search appliance applies the same default host load value to all of the patterns, even if you set different host load values for the different URL patterns.

When crawling URLs with the same IP address, the search appliance crawls the URL pattern at the higher directory level first (the less restrictive pattern). For example, given the following URL patterns, the second URL is crawled first, regardless of the individual host load settings:

http://www.google.com/enterprise/search/
http://www.google.com/enterprise/

Also, having multiple host load exceptions for the same IP address can negatively affect crawling for that host.

Maximum transmission unit (MTU) --- 1500 bytes Yes Your Web server's MTU must match its requirements and be compatible with the MTU of the search appliance. Consult your Web server's documentation to configure its MTU.
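
The host load rules in the table above can be sketched in a few lines of Python. This is an illustration only: the function name and the returned dictionary keys are hypothetical, and the sketch does not check the two-decimal-place restriction or model how the search appliance reduces connections under load.

```python
import math


def describe_host_load(value: float) -> dict:
    """Interpret a host load setting using the rules described above."""
    if value < 0:
        raise ValueError("host load cannot be negative")
    if value == 0:
        # Zero stops content from being crawled.
        return {"mode": "stopped"}
    if value < 1:
        # Below 1: the fraction of each minute during which connections are opened.
        return {"mode": "duty_cycle", "seconds_per_minute": value * 60}
    # 1 or more: an average number of concurrent connections per host.
    low = math.floor(value)
    share_at_high = value - low
    high = low + 1 if share_at_high > 0 else low
    return {
        "mode": "connections",
        "low": low,
        "high": high,
        "share_at_high": share_at_high,
    }


print(describe_host_load(3.5))
print(describe_host_load(0.25))
```

For 3.5 the sketch reports three and four connections with a 50% share each, and for .25 it reports 15 seconds per minute, matching the examples in the table.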

The number of URLs that your search appliance can crawl depends on the model and license limit. The following table lists the maximum number of URLs that the search appliance can crawl for the crawl patterns that you define.

Search Appliance Model Maximum License Limit Maximum Number of URLs that Match Crawl Patterns
G100 20 million ~ 26 million
G500 100 million ~ 133 million

For more information, refer to Administering Crawl and the Admin Console Help.
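
The ordering rule for URL patterns that share an IP address (described under the host load exceptions above) can be approximated by sorting patterns by directory depth. This is a minimal sketch, assuming all patterns resolve to the same host; the function names are hypothetical, and the search appliance's actual scheduling may consider more than path depth.

```python
from urllib.parse import urlparse


def path_depth(pattern: str) -> int:
    """Count the non-empty path segments in a URL pattern."""
    return len([segment for segment in urlparse(pattern).path.split("/") if segment])


def crawl_order(patterns: list[str]) -> list[str]:
    """Order patterns so that the less restrictive (shallower) pattern comes first."""
    return sorted(patterns, key=path_depth)


patterns = [
    "http://www.google.com/enterprise/search/",
    "http://www.google.com/enterprise/",
]
print(crawl_order(patterns))
```

With the two patterns from the example above, the `/enterprise/` pattern sorts ahead of `/enterprise/search/`.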

Feeds Default Limit Enforced Notes
Maximum size of XML feeds 1GB Yes If your feed is larger than 1GB, consider breaking the feed into smaller feeds that can be pushed more efficiently.
Maximum size of files in a feed See Maximum file size to download. Files greater than the maximum size are truncated at that limit and the remainder of the file is discarded before being converted. This limit applies to both content feeds and Web feeds (metadata-and-url).

Binary files that are truncated might not be properly indexed due to HTML conversion problems. Ensure that binary feed documents are within this limit.

Compressing feed content does not change the feed file size limitations.

Amount of each feed record to index See Amount of each document to index. This limit applies to both content feeds and Web feeds (metadata-and-url).

For more information, refer to the Feeds Protocol Developer's Guide.
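
One way to follow the advice about breaking a large feed into smaller feeds is to group records greedily by size. The sketch below is an assumption-laden illustration: it treats 1GB as 1024**3 bytes, ignores the size of the XML envelope around the records, and uses a hypothetical function name.

```python
def split_into_feeds(record_sizes: list[int], max_feed_bytes: int = 1024**3) -> list[list[int]]:
    """Group record indexes into batches whose total size stays within the limit."""
    batches: list[list[int]] = []
    current: list[int] = []
    current_size = 0
    for index, size in enumerate(record_sizes):
        if size > max_feed_bytes:
            raise ValueError(f"record {index} alone exceeds the feed size limit")
        if current and current_size + size > max_feed_bytes:
            # Close the current batch and start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Four records of 4 bytes each with a 10-byte limit yield two feeds.
print(split_into_feeds([4, 4, 4, 4], max_feed_bytes=10))
```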

OneBox Modules Default Limit Enforced Notes
Maximum number of OneBox modules 0 No upper limit No A OneBox module can be used with any of the front ends on the search appliance.

There is no limit on the number of OneBox modules you can add to a front end, although the number of OneBox modules used is limited by the value of Maximum number of OneBox results per search on the Content Sources > OneBox Modules page in the Admin Console.

Maximum number of OneBox modules used per search 2 20 Yes Google recommends that you use no more than three OneBox modules for searches to prevent usability issues when displaying results.
Maximum number of search results per OneBox module
Default: external provider: 8; internal provider: 3
Limit: external provider: 8; internal provider: 10
Enforced: Yes

If you use the default OneBox stylesheet template, the search appliance returns a maximum of three OneBox results.

The maximum search results for an internal provider is determined by the value of Search results on the Content Sources > OneBox Modules page in the Admin Console.

Maximum number of additional fields returned in a search result from an external OneBox provider 8 8 Yes  
OneBox response timeout 1000 milliseconds No upper limit Yes Specifies how long the search appliance waits for a response from the OneBox provider. Must be at least one millisecond.

The search appliance adds two seconds (2000ms) to the value you set in the Admin Console.

Although this value has no upper limit, the search appliance will not wait longer than the configured query processing time.

OneBox name 32 characters Yes Names must begin with an alphabetic character, and can contain only ASCII alphanumeric characters, underscores (_), hyphens (-), and dots (.). After you create the module definition, you cannot change its name if you later edit the OneBox module configuration.
In a unified environment, OneBox module configuration is available only on the primary search appliance. This means the primary search appliance only serves results from OneBox modules configured on the primary search appliance, not OneBox modules configured on the secondary nodes. Because spelling checkers are enabled as OneBox modules, spelling check is available only for documents indexed on the primary search appliance.

For more information, refer to the OneBox for Enterprise Developer’s Guide and the Admin Console Help.
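
The OneBox naming rule and response timeout behavior from the table above can be expressed as two small checks. This is a sketch, not the search appliance's own validation: the function names are hypothetical, the 20-second default comes from the Query processing time row in the Secure Search table, and capping the total wait (configured value plus two seconds) at that time is an assumption about how the two limits combine.

```python
import re

# Starts with a letter; then ASCII letters, digits, underscore, hyphen, or dot; 32 characters at most.
ONEBOX_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.-]{0,31}")


def is_valid_onebox_name(name: str) -> bool:
    return ONEBOX_NAME.fullmatch(name) is not None


def effective_onebox_wait_ms(configured_ms: int, query_processing_ms: int = 20_000) -> int:
    """Configured timeout plus the two seconds the appliance adds, capped (assumed)."""
    if configured_ms < 1:
        raise ValueError("timeout must be at least one millisecond")
    return min(configured_ms + 2000, query_processing_ms)


print(is_valid_onebox_name("weather.v2"))
print(effective_onebox_wait_ms(1000))
```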

Diagnostics Default Limit Enforced Notes
Crawl queue name 20 characters Yes Queue names can contain ASCII or non-ASCII characters, hyphens, and underscores. The name cannot start with a hyphen.
Number of URLs included in a crawl queue snapshot 0 100,000 Yes A crawl queue snapshot shows the set of URLs that are overdue to be crawled and the URLs that the search appliance is waiting to crawl.
Maximum number of URLs exported in diagnostics reports in .xml format 0 10,000 URLs Yes The index diagnostics report shows the crawl status of all URLs configured on the search appliance. This limit applies to reports in list format that are exported as .xml files using standard Google Sitemaps Protocol format.
Number of links to crawled pages, as shown in the index diagnostics report 0 2000 links Yes If the search appliance crawls and indexes more than 2000 links in a document, for example 2500, the index diagnostics report shows a maximum value of 2000.

Index

Index Settings Default Limit Enforced Notes
Amount of each document to index 2.5MB 10MB Yes Determines how much of each text or HTML document the search appliance indexes, including documents that have been truncated or converted to HTML. After indexing, the search appliance caches the indexed portion of the document and discards the remainder. URLs and metadata in the discarded portion are not indexed.

How much of a document is downloaded before indexing is determined by Maximum size to download in the Admin Console. See Maximum file size to download.

Maximum instances of a word indexed per document 2000 Yes The search appliance indexes up to the first 2000 instances of any word in a document.
Document dates used for sorting last-modified-date returned in the HTTP headers Dates found in the URL, title, body, or meta tags Yes Results without a date are displayed after results with dates, sorted by relevance.
Indexable file formats For a complete list of supported file formats that the search appliance can crawl and index, refer to Indexable File Formats.
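
The download limits in the Web Crawl table and the Amount of each document to index setting combine in two stages, which the following sketch illustrates with the default values. The function name is hypothetical, MB is taken as 1024 * 1024 bytes, and for non-HTML files the original file size stands in for the size of the converted HTML, which is not known in advance.

```python
MB = 1024 * 1024


def indexable_bytes(
    size: int,
    is_text_or_html: bool,
    text_limit: int = 20 * MB,
    other_limit: int = 100 * MB,
    index_limit: int = int(2.5 * MB),
) -> int:
    """Estimate how many bytes of a document can reach the index."""
    if is_text_or_html:
        # Text and HTML files are truncated at the download limit.
        downloaded = min(size, text_limit)
    elif size > other_limit:
        # Other file types over the limit are discarded completely.
        return 0
    else:
        # Assumption: converted HTML is about as large as the original file.
        downloaded = size
    return min(downloaded, index_limit)


print(indexable_bytes(30 * MB, is_text_or_html=True))
print(indexable_bytes(150 * MB, is_text_or_html=False))
```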

 

Entity Recognition Default Limit Enforced Notes
Maximum number of entities per document 50 100 Yes Entity recognition works for languages that are read left to right. It does not work for languages that are read right to left.

For metadata limits, see Maximum number of characters returned for each meta tag in the Search table below.

Maximum number of words in an entity term 20 100 Yes An entity term can be formed by one or more words separated by spaces.

 

Collections Default Limit Enforced Notes
Maximum number of collections One collection named default_collection No upper limit No Creating more than 200 collections can impact performance when importing or exporting a configuration, and when listing the collections.
Collection name 200 characters Yes Names can contain only alphanumeric ASCII characters, underscores, and hyphens. Names cannot begin with a hyphen.
Composite collection name 200 characters Yes Names can contain only alphanumeric ASCII characters, underscores, and hyphens. Names cannot begin with a hyphen.

For more information, refer to Administering Crawl and the Admin Console Help.
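
The naming rule for collections and composite collections can be checked before a name is submitted. The pattern below is a sketch with a hypothetical function name; it encodes only the rule stated in the table (ASCII alphanumerics, underscores, and hyphens, no leading hyphen, 200 characters at most).

```python
import re

# First character: letter, digit, or underscore. Then up to 199 more allowed characters.
COLLECTION_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_-]{0,199}")


def is_valid_collection_name(name: str) -> bool:
    return COLLECTION_NAME.fullmatch(name) is not None


print(is_valid_collection_name("default_collection"))
print(is_valid_collection_name("-starts-with-hyphen"))
```

The same pattern matches the wording used in this document for result biasing policy names, credential group names, and account usernames.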

Search

Search Results Default Limit Enforced Notes
Maximum number of search results for a single query 100 1000 Yes You can increase the default number of search results by using the num query parameter.
Maximum number of results when using the link: query term in search requests 25 1000 Yes The link: query term lists Web pages that have links to the specified Web page.
Result biasing policy name 200 characters Yes Names can contain only alphanumeric ASCII characters, underscores, and hyphens. Names cannot begin with a hyphen.
Number of user results 0 ~ 100,000 No
  • More than 100,000 user results could cause performance issues.
  • Works for Web URLs only, not SMB:// paths.
User result name 23 characters Yes Result names must begin with an alphabetic character and can contain only alphanumeric ASCII characters, underscores, and hyphens.
Snippet length in search results 160 characters 1024 characters Yes
  • Snippets beyond the first 300KB of the document, including its HTTP headers, are not displayed or returned.
  • Snippets for CJK languages are by default 240 characters.
Maximum number of characters returned for each meta tag For Latin characters: 1500 characters, including meta tag name and its contents

For characters in multibyte languages (Chinese, Japanese, and Korean): 500 characters

Yes The search for meta tags is case-insensitive. Use only whole words in the getfields parameter, not partial words or word “stems.”
Number of meta tag matches displayed in search results No upper limit No Meta tags beyond the first 300KB of the document are not displayed or returned.
Number of characters allowed when using the inmeta: query term to filter results 128 characters Yes The 128 character limit includes the inmeta term and the meta tag name and value. This limit is calculated based on the escaped form of the term and meta tag name/value.

For more information, refer to the Search Protocol Reference.
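
Because the 128-character inmeta: limit is calculated on the escaped form of the term, a term that looks short can still exceed it. The sketch below assumes that "escaped" means URL percent-encoding and that the term has the form inmeta:name=value; both are assumptions for illustration, and the function names are hypothetical.

```python
from urllib.parse import quote


def escaped_inmeta_length(name: str, value: str) -> int:
    """Length of the percent-encoded inmeta term, including the inmeta: prefix."""
    return len(quote(f"inmeta:{name}={value}", safe=""))


def fits_inmeta_limit(name: str, value: str, limit: int = 128) -> bool:
    return escaped_inmeta_length(name, value) <= limit


print(escaped_inmeta_length("dept", "sales"))
print(fits_inmeta_limit("dept", "sales and marketing " * 5))
```

Each space, colon, or equals sign costs three characters once escaped, so values with many such characters reach the limit sooner.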

Search Requests Default Limit Enforced Notes
Search request length using the GET command. 2KB 2KB Yes Query strings that exceed this limit are truncated. To submit longer query strings use the POST command.
Search request length using the POST command. 10KB 10KB Yes Note that the search appliance rewrites the query that was input by the user, which means that the final query string is longer than the input string. The POST body can be as large as 10,240 characters.
Maximum length of query terms in search requests 128 characters Yes Limit does not include punctuation or spaces.
Number of query terms allowed in search requests 50 150 Yes
  • Query terms beyond the first 150 are ignored. The search results do not indicate if excess query terms were ignored. This limit includes query terms in parameter q and in any parameters starting with as_.
  • Phrase search using double quotation marks (") does not reduce the number of query terms.
  • Phrase search will not find a document if the phrase includes a word that has more than 2000 instances earlier in the document. The search appliance indexes only the first 2000 instances of any word.
  • Query expansion is disabled when a query contains a special query term, such as inurl: or allintitle:
  • Search queries with special query terms, such as inmeta: or info:, are excluded from the database of query suggestions.
  • Search queries against multiple collections (using AND or OR boolean operators) are excluded as query suggestions.
Number of site or as_sitesearch parameters allowed in search requests 1 Yes  
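
A client can choose between GET and POST based on the length limits above. This is a minimal sketch: the function name is hypothetical, 2KB is taken as 2,048 characters by analogy with the 10,240 characters stated for POST, and the length check ignores any rewriting the search appliance applies to the query afterwards.

```python
from urllib.parse import urlencode

GET_LIMIT = 2 * 1024    # assumed to mean 2,048 characters
POST_LIMIT = 10 * 1024  # 10,240 characters, as stated above


def choose_method(params: dict) -> str:
    """Pick the HTTP method that can carry the encoded query string untruncated."""
    encoded = urlencode(params)
    if len(encoded) <= GET_LIMIT:
        return "GET"
    if len(encoded) <= POST_LIMIT:
        return "POST"
    raise ValueError("query string exceeds the POST limit")


print(choose_method({"q": "gsa usage limits"}))
print(choose_method({"q": "a" * 3000}))
```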

 

Query Expansion Files Default Limit Enforced Notes
Query expansion file name 20 characters Yes The name that you assign to a query expansion file can contain only ASCII characters, underscores, or hyphens. This name does not have to match the actual file name, which can contain any UTF-8 characters.
Maximum size of a query expansion file (synonyms, blacklist, or stopwords) 3MB Yes  
Total number of query expansion files (synonyms + blacklist + stopwords) that can be uploaded 300 Yes Query expansion files that are enabled on a search appliance apply to all front ends whose query expansion policy is set to Local or Full. To add terms for a specific front end, consider adding related queries instead.
Maximum number of lines in a query expansion file 75,000 No Adding more lines can cause the search appliance to return a server error.
Maximum number of terms per line in a query expansion file 100 Yes
  • Adding more than 32 terms to an entry in a local query expansion file can reduce search performance and impact the quality of search results. Each entry must include at least two terms.
  • A file can contain any number of synonyms for a particular search term, for example:

    {joe, joey, joseph}
    {joe, josephine, jo}

Maximum number of words allowed per query term 4 Yes Each term can include multiple words separated by spaces, for example:

{GSA, Google Search Appliance v7}

Only alphanumeric characters and spaces are allowed. You can use spaces instead of hyphens.

Additional requirements for local query expansion files Yes
  • File must use UTF-8 encoding if entries have accented characters. Latin1 is not allowed.
  • The following characters are not allowed in the file: !"#$%()*,-/.:;<?@[\]^'{|}~
  • Entries are case-sensitive.
  • Comment lines must start with the pound sign (#).
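
Several of the per-entry rules above can be checked before a query expansion file is uploaded. The sketch below covers only the brace-delimited set form shown in the examples, such as {joe, joey, joseph}; any other entry form is reported as unrecognized rather than invalid. The function name and messages are hypothetical.

```python
def check_entry(line: str) -> list[str]:
    """Return a list of problems found in one line of a synonyms file."""
    text = line.strip()
    if not text or text.startswith("#"):
        return []  # blank lines and comment lines are skipped
    if not (text.startswith("{") and text.endswith("}")):
        return ["unrecognized entry form (only {a, b} sets are checked)"]
    problems = []
    terms = [term.strip() for term in text[1:-1].split(",")]
    if len(terms) < 2:
        problems.append("an entry needs at least two terms")
    if len(terms) > 100:
        problems.append("more than 100 terms on one line")
    for term in terms:
        words = term.split()
        if not words:
            problems.append("empty term")
        elif len(words) > 4:
            problems.append(f"term has more than 4 words: {term!r}")
        elif not all(word.isalnum() for word in words):
            problems.append(f"term has non-alphanumeric characters: {term!r}")
    return problems


print(check_entry("{GSA, Google Search Appliance v7}"))
print(check_entry("{joe}"))
```

The sketch does not enforce the file-level limits (3MB, 75,000 lines) or the UTF-8 encoding requirement.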

 

Front Ends Default Limit Enforced Notes
Maximum number of front ends One front end named default_frontend No upper limit No Creating more than 200 front ends can slow performance when importing or exporting a configuration.
Search box length 32 characters Limited only by display space Yes This option determines the size of the search box that displays. Users can type search queries beyond the size of the box.
Title length in search results page 70 bytes Yes Specifies the length of the title that displays in search results.
Maximum number of KeyMatch results 3 50 Yes You can increase the default number of KeyMatch results by using the numgm query parameter.
Maximum length of KeyMatch title 1000 Yes Title that displays for KeyMatch results.
Dynamic result clusters Display at the side of search results Side or Top Yes
  • Dynamic result clustering requires that approximately 1000 relevant results be returned before it can display result categories.
  • Dynamic result clusters are not supported when dynamic navigation is enabled.
  • Results must display at top to work with sidebar elements through the Page Layout Helper.
  • Using dynamic result clusters with secure search can significantly increase serving latency. Google recommends using dynamic result clusters for secure search only if the expected number of concurrent users is no more than two or three and only if ACLs are used for authorization (early binding).
Maximum expansions per wildcard term 200 1000 Yes
  • Google recommends setting this value to 1000 unless your search appliance experiences unacceptable latency at serve time. Setting the value to zero (0) disables wildcard search.
  • A wildcard query term must satisfy at least one of the following conditions:
    • A sequence of at least 2 characters at the start of a word, for example: go*
    • A sequence of at least 2 characters at the end of a word, for example: *le
    • A sequence of at least 3 characters anywhere in the word, for example: *ear*
  • Enabling wildcard search can impact crawling performance, particularly for feeds with binary content.
  • Wildcard search does not support:
    • Common queries such as filetype, inurl, and intext.
    • Chinese, Japanese, Korean, or Thai languages.
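
The three conditions for a wildcard query term can be expressed as a short check. This is a sketch with a hypothetical function name; it handles only a leading wildcard, a trailing wildcard, or both, and treats a wildcard in the middle of a term as unsupported because the rules above do not describe that case.

```python
def is_supported_wildcard(term: str) -> bool:
    """Check a term against the minimum-length conditions for wildcard search."""
    if "*" not in term:
        return False
    core = term.strip("*")
    if "*" in core:
        return False  # interior wildcards are not modeled here
    if term.startswith("*") and term.endswith("*"):
        return len(core) >= 3  # for example: *ear*
    return len(core) >= 2      # for example: go* or *le


for term in ["go*", "*le", "*ear*", "g*", "*ea*"]:
    print(term, is_supported_wildcard(term))
```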

 

Dynamic Navigation Notes
  • The names of attributes that you add must exactly match the meta tag NAME. Ensure that you have the correct name for the metadata attribute before adding a dynamic navigation attribute.
  • You can add an unlimited number of metadata attributes. Adding more than five or six attributes, however, can impact a user's ability to view them all in the left navigation column.
  • Maximum number of results returned per metadata attribute is 5000.
  • The combined length of the URL-encoded meta tag name and value has a 118 character limit. If the combined length exceeds 118 characters, the meta tag will be ignored for the purpose of dynamic navigation.
  • Using dynamic navigation with secure search can significantly increase serving latency. Google recommends using dynamic navigation for secure search only if the expected number of concurrent users is no more than two or three and only if ACLs are used for authorization.
  • The search appliance verifies the relevance of up to 30,000 documents for public searches and verifies the relevance and authorization of up to 10,000 documents for secure searches (for the purpose of creating the facets). You can change the default values to any number larger than 0.
    Note: Increasing the number of documents also increases search response time and may impact performance.
  • Dynamic navigation does not aggregate counts for metadata attributes with similar names. For example, "pub" and "publication" might represent the same metadata attribute but their counts are not aggregated. You can use query expansion to add a synonym file with the entry {pub, publication} to include both metadata attributes in one facet. The counts for the two attributes are then aggregated.
  • Currency attributes can support ranges only if the metadata value starts with the $ symbol, for example $23 or $23.00. Values such as 23, 23$ or $ 23 are not supported for range queries. The $ symbol is not required for currency attributes that do not use ranges.
  • Dynamic result clusters are not supported when dynamic navigation is enabled.
  • Sidebar elements are not supported when dynamic navigation is enabled. However, the Expert Search sidebar element is supported with dynamic navigation when users click the Detailed View.
  • To use dynamic navigation in a unified environment, you must have the same front end and the same dynamic navigation attributes configured on all nodes. You must also add a dynamic navigation enabled front end as a remote front end in the master node configuration.
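
Two of the dynamic navigation notes above lend themselves to a quick pre-check of metadata: the 118-character limit on the URL-encoded name and value, and the leading $ required for currency ranges. The sketch assumes percent-encoding for the URL-encoded form and sums the two encoded lengths without a separator; the function names are hypothetical.

```python
import re
from urllib.parse import quote


def fits_dynamic_navigation(name: str, value: str, limit: int = 118) -> bool:
    """True if the encoded meta tag name and value together stay within the limit."""
    return len(quote(name, safe="")) + len(quote(value, safe="")) <= limit


def supports_currency_range(value: str) -> bool:
    """True if the value starts with $ followed directly by a digit, as in $23 or $23.00."""
    return re.match(r"\$\d", value) is not None


print(fits_dynamic_navigation("publication", "annual report"))
for value in ["$23", "$23.00", "23", "23$", "$ 23"]:
    print(value, supports_currency_range(value))
```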

 

Document Preview Default Limit Enforced Notes
Maximum pages per document preview 5 pages 100 pages Yes The search appliance caches document preview images, which can be large and use significant disk space. Setting this value to 0 (which generates previews for all pages in each document) might cause serious performance issues.
Maximum number of document previews available

The number of document previews that can be generated depends on many factors including your available disk space, the number of supported documents in your index, and the values you configure for the following limits in the Admin Console:

  • Amount to Index per Document (Index > Index Settings)
  • Image resolution (Search > Search Features > Document Preview Module)
  • Maximum Pages per Document (Search > Search Features > Document Preview Module)

The larger each generated preview, the fewer preview images might be available. Once the cache is consumed, no more documents are converted for preview. Generating a large number of document previews can seriously impact performance and may prevent previews from being generated for all of your supported documents.

Document previews are not supported in custom front ends.

Supported file types
  • The search appliance can show preview images in search results for documents in the following formats:
    • Microsoft Word (doc, docx)
    • Microsoft PowerPoint (ppt, pptx)
    • Adobe Portable Document Format (pdf)
  • Document previews are not supported for .doc, .pdf, and .ppt files in zip files.
  • The real time translation feature does not support document previews.
  • Not all fonts are supported.

 

Secure Search Default Limit Enforced Notes
Maximum size of a per-URL ACL 10,000 entries 100,000 entries No Specifies the number of entries (users and groups) that can be added to a per-URL ACL. Adding more than 10,000 entries can reduce serving performance.

There are no limits on ACL inheritance chains but long chains can impact performance.

Maximum number of policy ACLs No upper limit No Adding more than 300,000 policy ACLs can reduce serving performance.
Credential group name 200 characters Yes Names can contain only alphanumeric ASCII characters, underscores, and hyphens. Names cannot begin with a hyphen.
Session Idle Time 1800 seconds (30 minutes) 60 minutes Yes Specifies how long a user's search session can be inactive before timing out. Range is 5 to 60 minutes.
Timeout 3 seconds No upper limit Yes
  • If the search appliance does not make the network connection in the specified time, it abandons the attempt.
  • Limit applies to cookie-based, HTTP Basic, SAML, Connectors, and LDAP authentication.
Trust duration 1200 seconds for HTTP Basic, Connectors and LDAP; 300 seconds for cookie-based No upper limit Yes Specifies how long the authentication mechanism's verification of user credentials will be trusted.

The search appliance prompts a user to provide credentials whenever the session idle timer or trust duration times out. For this reason, Google recommends coordinating the two settings.

Query processing time 20 seconds No upper limit Yes Specifies the maximum number of seconds that the search appliance waits for multiple batches of authorization requests to complete. This value should be larger than the timeout value for a batch of authorization requests.

Increasing the default value enables the search appliance to process more batches of authorization requests but if a content server is unresponsive, performance can be negatively impacted.

Timeout for a batch of authorization requests 5 seconds No upper limit Yes Specifies the maximum number of seconds that the search appliance waits to fully process a single batch of authorization requests. This value should be larger than the timeout value for individual requests.
Timeout for individual authorization requests 2.5 seconds No upper limit Yes Specifies the maximum number of seconds that the search appliance waits for the response to a single authorization request to a web server. This value should be smaller than the timeout value for batch requests.

If you decrease this value, slow servers might be unable to respond to authorization requests in time. User results could be incomplete and skewed toward content on the fast servers. In contrast, if you increase this value, slow Web servers can provide additional results but users will experience longer response times.

Timeouts permitted before host is considered unreachable 100 No upper limit Yes Specifies the maximum number of times the search appliance attempts to contact an unresponsive server before adding it to the cache of unreachable hosts. The value should allow for multiple failed attempts due to fluctuations in server traffic that might cause a normal number of timeouts.
Timeout measurement period 300 seconds No upper limit Yes Specifies the time period during which the timeouts permitted parameter is applied. The value should be large enough to accommodate short-lived server unavailability.
Duration of unreachable host cache entry 600 seconds No upper limit Yes Specifies the maximum number of seconds each item is maintained in the cache.
Supported serve-time authentication methods
  • Cookie-based authentication
  • HTTP Basic or NTLM HTTP
  • Kerberos authentication against a domain controller
  • The SAML Authentication Service Provider Interface (SPI)
  • LDAP
  • Digital certificates and certification authorities
Supported Single Sign-On systems
  • Computer Associates SiteMinder 6.0, Policy Server and Web Agent
  • Oracle Access Manager 7.0.4 (formerly Oblix)
  • Cams by Cafesoft, version 3.0

For more information, refer to Creating the Search Experience, Managing Search for Controlled-Access Content, and the Admin Console Help.
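
The three authorization timeouts in the table above are meant to nest: the query processing time should exceed the batch timeout, which should exceed the timeout for individual requests. The sketch below checks that relationship using the documented defaults; the class and field names are hypothetical and do not correspond to Admin Console identifiers.

```python
from dataclasses import dataclass


@dataclass
class AuthzTimeouts:
    query_processing_s: float = 20.0  # default from the table above
    batch_s: float = 5.0
    individual_s: float = 2.5

    def warnings(self) -> list[str]:
        """List any violations of the recommended ordering of the timeouts."""
        found = []
        if self.query_processing_s <= self.batch_s:
            found.append("query processing time should be larger than the batch timeout")
        if self.batch_s <= self.individual_s:
            found.append("batch timeout should be larger than the individual request timeout")
        return found


print(AuthzTimeouts().warnings())
print(AuthzTimeouts(batch_s=2.0).warnings())
```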

Reports

Reports Default Limit Enforced Notes
Maximum number of search reports 500 Yes Search reports remain available for one year from the creation date.
Search report name 20 characters Yes Report names can contain ASCII or non-ASCII characters, hyphens, and underscores. The report name cannot start with a hyphen.
Number of top queries and keywords to include in report 100 No upper limit No  
Maximum number of search log reports to generate and retain 100 Yes The raw search log data is maintained for 90 days and then automatically deleted.
Search log name 20 characters Yes Report names can contain ASCII or non-ASCII characters, hyphens, and underscores. The report name cannot start with a hyphen.

Administration

Administration Default Limit Enforced Notes
Account username 200 characters Yes Usernames can contain only alphanumeric ASCII characters, underscores, and hyphens. Names cannot begin with a hyphen.
SNMP username No limit on length No SNMP usernames can contain only alphanumeric ASCII characters.
Authentication/Authorization passphrase 30 characters Yes  

Multibox Configurations

GSA Unification

In GSA release 7.6, GSA unification is deprecated. It will be removed in a future release.

  • You cannot configure both a unified environment and distributed crawling and serving on the same set of search appliances. Configure either a unified environment or the distributed crawling and serving feature.
  • You cannot combine a unified environment with GSA mirroring in a GSAn configuration. You can, however, configure mirroring for a node participating in a unification configuration.
  • The primary search appliance only serves results from OneBox modules configured on the primary search appliance, not OneBox modules configured on the secondary nodes. Because spelling checkers are enabled as OneBox modules, spelling check is available only for documents indexed on the primary search appliance.

For more information, refer to Configuring GSA Unification and the Admin Console Help.

GSA Mirroring

GSA mirroring requires a sustainable 1MB per second file transfer rate between the master Google Search Appliance and each replica search appliance. To determine whether a network can provide the required file transfer rate, we recommend that you measure the rate by transferring files on your network between the subnets where the search appliances are located. If the file transfer requirement is not met by the network, the mirroring feature might not work as expected.

Your installation must meet the following requirements to participate in a GSA mirroring configuration:

  • All search appliances must be on the same software version. For example, you cannot have one search appliance in the configuration on version 7.4 and another on version 7.6, or one search appliance on 7.4.0.G114 and another on 7.4.0.G120. When you update from one software version to the next, ensure that you update all search appliances in the configuration.
  • All search appliances in the configuration must be licensed for the same document count.
  • The search appliances must be able to contact each other on TCP ports 8000 and 8443 while mirroring is being set up, and any time a change to the mirroring configuration is made.

The search appliance models you have determine which machine is the master and which can be replicas. The primary consideration for choosing the master is that its license count cannot exceed the maximum license limit of the replica search appliance.

Master Search Appliance: Replica Search Appliances
G100: G100 or G500
G500: G500

For more information, refer to Configuring GSA Mirroring and the Admin Console Help.
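
The mirroring requirements above (same software version, same licensed document count, and a master license count within the replica model's maximum license limit) can be checked together. This is a sketch: the class, function, and field names are hypothetical, and the maximum license limits are taken from the model table in the Content Sources section.

```python
from dataclasses import dataclass

# Maximum license limit per model, from the table in the Content Sources section.
MODEL_MAX_LICENSE = {"G100": 20_000_000, "G500": 100_000_000}


@dataclass
class Appliance:
    model: str
    version: str
    licensed_docs: int


def mirroring_problems(master: Appliance, replicas: list[Appliance]) -> list[str]:
    """Return the requirement violations for a proposed mirroring configuration."""
    problems = []
    for number, replica in enumerate(replicas, start=1):
        if replica.version != master.version:
            problems.append(f"replica {number}: software version differs from the master")
        if replica.licensed_docs != master.licensed_docs:
            problems.append(f"replica {number}: licensed document count differs from the master")
        if master.licensed_docs > MODEL_MAX_LICENSE[replica.model]:
            problems.append(f"replica {number}: master license exceeds the {replica.model} maximum")
    return problems


master = Appliance("G100", "7.6", 10_000_000)
replicas = [Appliance("G100", "7.6", 10_000_000), Appliance("G500", "7.6", 10_000_000)]
print(mirroring_problems(master, replicas))
```

The sketch does not test network conditions such as the 1MB per second transfer rate or reachability on TCP ports 8000 and 8443.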

Distributed Crawling and Serving

The following limitations apply to distributed crawling and serving:

  • All search appliances must be in the same data center. If you need to serve results from geographically dispersed locations and have other scalability and serving needs, consider GSA Unification (see Configuring GSA Unification).
  • Each document is stored on a single node in a distributed crawling and serving configuration. If a node fails, all documents stored on that node will be missing from the search results. To ensure that all documents are included in search results after a node failure, configure GSA Mirroring (see Configuring GSA Mirroring) for the nodes in the distributed crawling and serving configuration.
  • All feeds must be processed by the master node. This can create a performance bottleneck if you have a large number of content feeds.
  • You can use Google Search Appliance connectors with distributed crawling and serving. However, Google recommends that you use an external connector manager and connector, installed on a separate host computer, instead of the internal connectors, because onboard connectors are removed from the GSA in release 7.6. For information about using external connectors, see the appropriate connector documentation.
  • You cannot create both a unified environment and distributed crawling and serving on the same set of search appliances. Configure a unified environment or distributed crawling and serving.
  • Composite collections work only with unified environments. Do not create composite collections when you have distributed crawling and serving enabled.
  • Search appliances with different license limits can participate in a single distributed crawling and serving setup. The license limit is enforced on each search appliance independently, while URLs are distributed evenly to each search appliance regardless of its license limit. If a search appliance reaches its license limit, additional URLs that are allocated to it will not be indexed or served.

For more information, refer to Configuring Distributed Crawling and Serving and the Admin Console Help.
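
The last limitation above has a practical consequence: because URLs are distributed evenly regardless of license limits, a node with a small license can drop URLs even when the combined license capacity is sufficient. The following sketch estimates that shortfall; the function name is hypothetical, and the exact way the search appliance distributes a remainder across nodes is an assumption.

```python
def unindexed_urls(total_urls: int, license_limits: list[int]) -> int:
    """Estimate how many URLs are not indexed when URLs are spread evenly across nodes."""
    if not license_limits:
        raise ValueError("at least one node is required")
    base, extra = divmod(total_urls, len(license_limits))
    dropped = 0
    for position, limit in enumerate(license_limits):
        # Assumption: any remainder is spread one URL at a time over the first nodes.
        allocated = base + (1 if position < extra else 0)
        dropped += max(0, allocated - limit)
    return dropped


# Two nodes licensed for 20 million and 5 million URLs, 30 million URLs in total.
print(unindexed_urls(30_000_000, [20_000_000, 5_000_000]))
```

In this example each node is allocated 15 million URLs, so the smaller node leaves 10 million unindexed even though the larger node has spare capacity.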
