Managing Search for Controlled-Access Content

Overview

The Google Search Appliance makes documents in your domain discoverable through search. In addition to public content that is available to everyone, the search appliance can crawl and index documents that require a login and password or another form of authentication. To protect confidentiality at serving time, the search appliance determines whether the user performing the search is authorized to view each document before it displays results.

For example, you might face one of the following situations:

  • You have several sign-on domains, and want to enable employees to search across enterprise content without logging in to many sites.
  • You want to make an article searchable by everyone, but require users to log in before they can view the full text.
  • You want to allow the finance team to search through confidential reports on an accounting site. Members of the finance group can search for reports and read them. Employees in other divisions cannot view accounting reports and should never see these documents in their search results.
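The finance scenario above can be pictured as a filter applied to search results at serving time. The following is only a conceptual sketch in Python, not the search appliance's implementation: the `Document` class, the `visible_results` function, and the group names are all hypothetical and exist purely to illustrate the idea that a secure document is shown only to users who are authorized to view it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical search result; not a search appliance data structure."""
    title: str
    secure: bool = False                    # True for controlled-access content
    allowed_groups: frozenset = frozenset() # groups permitted to view it


def visible_results(results, user_groups):
    """Keep public documents plus secure documents the user's groups may view."""
    groups = set(user_groups)
    return [
        doc for doc in results
        if not doc.secure or (doc.allowed_groups & groups)
    ]


results = [
    Document("Company holiday schedule"),
    Document("Q3 accounting report", secure=True,
             allowed_groups=frozenset({"finance"})),
]

finance_view = [d.title for d in visible_results(results, ["finance"])]
engineering_view = [d.title for d in visible_results(results, ["engineering"])]
```

In this sketch, a member of the finance group sees both documents, while an employee in another division sees only the public one and never learns that the accounting report exists.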

As the search appliance administrator, you must configure the Google Search Appliance to support these kinds of situations.

About this Guide


This guide is intended for the search appliance administrator and developers who need to understand authentication and authorization for the Google Search Appliance. It explains how the Google Search Appliance makes controlled-access content available through search, describes how to configure authentication and authorization, and demonstrates how to make controlled-access content available to authorized users in your organization.

Which Sections of this Guide Should I Read?


This guide helps you to answer the following questions:

  • How do I set up my search appliance to crawl and index controlled-access content?
  • Once I have indexed controlled-access content, how do I specify which content is visible to a user at serving time? Public content (access=p) appears in all search results, while secure content (access=s) is visible only to authorized users.
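As an illustration of the access values mentioned above, the following Python sketch builds a search request URL that carries an access parameter. Treat it as an assumption-laden example rather than a reference: the host name `search.example.com` is a placeholder, the `build_search_url` helper is hypothetical, and the `site`, `client`, and `output` values are assumed defaults that may differ on your search appliance. Only the `p` and `s` values come from this guide.

```python
from urllib.parse import urlencode

# Access values described in this guide: "p" = public, "s" = secure.
ACCESS_VALUES = ("p", "s")


def build_search_url(host, query, access="p",
                     site="default_collection",   # assumed collection name
                     client="default_frontend"):  # assumed front end name
    """Build a search URL for a hypothetical appliance host."""
    if access not in ACCESS_VALUES:
        raise ValueError(f"access must be one of {ACCESS_VALUES}, got {access!r}")
    params = {
        "q": query,
        "site": site,
        "client": client,
        "output": "xml_no_dtd",  # assumed output format
        "access": access,
    }
    return f"https://{host}/search?{urlencode(params)}"


# Placeholder host; replace with your own search appliance.
public_url = build_search_url("search.example.com", "quarterly report")
secure_url = build_search_url("search.example.com", "quarterly report", access="s")
```

Validating the access value before building the URL keeps a typo from silently producing a request with an unintended scope.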

Because some methods of accessing controlled-access content do not support secure serve, the answers to these questions depend on your existing access control infrastructure and on whether your content sources require secure serve.

The following table explains which sections in this guide are most relevant for each access method, and provides links to those sections.

| Access Method | Access Type | Suggested Crawl Method | Suggested Serve Method |
| --- | --- | --- | --- |
| HTTP Basic or NTLM HTTP | Public or secure | Crawler Access (see Configuring Crawl for HTTP Basic or NTLM HTTP) | HTTP Basic or NTLM authentication (see HTTP-Based Authentication) |
| Access content on a Windows or SMB/CIFS file share | Public or secure | Crawler Access (see Configuring Crawl for HTTP Basic or NTLM HTTP) | Pass user credentials and optionally authenticate with LDAP (see Integrating the Search Appliance with an LDAP Server) |
| Single login domain: Windows (Kerberos) Authentication for Windows Server or Sharepoint Server | Public or secure | Crawler Access (see Configuring Crawl for HTTP Basic or NTLM HTTP) | IWA (Integrated Windows Authentication) / Kerberos authentication (see Kerberos-Based Authentication) |
| Single login domain: one set of domain credentials provides access to all content, and the login form does not use frames or JavaScript. | Public or secure | Forms Authentication (see Configuring Crawl for Cookie-Based Access) | Cookie-based authentication (see Cookie-Based Authentication) |
| Single login domain: one set of domain credentials provides access to all content. The login form is plain HTML. Single or multiple cookie domains. | Public or secure | Forms Authentication (see Configuring Crawl for Cookie-Based Access) | Cookie-based authentication (see Cookie-Based Authentication) |
| Multiple login domains: more than one set of credentials are required to provide access to all content. | Public or secure | Forms Authentication (see Configuring Crawl for Cookie-Based Access) | Cookie-based authentication (see Cookie-Based Authentication) |
| Multiple login domains: more than one set of credentials are required to provide access to all content. | Public or secure | Crawler Access (see Configuring Crawl for HTTP Basic or NTLM HTTP) or Forms Authentication (see Configuring Crawl for Cookie-Based Access) | Mixed authentication mechanisms (see Configuring Credential Groups) |

For information about specific secure search limitations, see Specifications and Usage Limits.
