Managing Search for Controlled-Access Content

Use Cases for Public and Secure Serve

This section provides detailed explanations of how to set up crawling for controlled-access content and how to configure the Google Search Appliance to centralize serve-time authentication.


Use Case 1: HTTP Basic or NTLM HTTP Controlled-Access Content with Public Serve


The ABC Company wants to make its controlled-access content discoverable using intranet search. The content is stored on these internal servers:

  • events.abc.int is a simple web server that uses HTTP Basic authentication. This server contains information about internal company events.
  • announce.abc.int is a Microsoft IIS web server that uses Integrated Windows Authentication over NTLM HTTP. This server contains announcements for employees.
  • directory.abc.int is another Microsoft IIS server. This server provides phone and office location information about employees. For the purpose of this example, let’s suppose that content from this server is best provided by a web feed.

All these servers are located on the same domain, abc_corp. Although each of these servers requires authentication, the content itself isn’t sensitive. ABC Company wants to serve the snippet results as public content, viewable by any employee, so there is no reason to require the search appliance to perform document-level authentication when serving results.

ABC Company has these people who interact with this content:

  • Adam, the system administrator
  • Sandra, the search appliance administrator
  • Eric, an employee who needs to find content

Setting up Crawl and Index

First, the system administrator creates a user account for the search appliance, called ABCsearch, and sets up access policies that ensure that the ABCsearch user account is authorized to view all files on events.abc.int and announce.abc.int. The feed process on directory.abc.int has its own account with similar permissions, called ABCfeeder.

Next, the search appliance administrator logs into the Admin Console and performs these actions:

  1. To provide the search appliance with credentials for crawl and index, Sandra opens Content Sources > Web Crawl > Secure Crawl > Crawler Access, and adds rows using the account names and passwords given to her by the system administrator:

    For URLs Matching Pattern, Use:   Username:   In Domain:   Password:   Confirm Password:   Make Public:
    https://events.abc.int/           ABCsearch                ******      ******              X
    https://announce.abc.int/         ABCsearch   abc_corp     ******      ******              X
    https://directory.abc.int/        ABCfeeder   abc_corp     ******      ******              X

    Here, omitting the domain for events.abc.int instructs the search appliance to authenticate using HTTP Basic. For all other servers in this example, the domain entry tells the search appliance to authenticate against a Microsoft IIS Server using NTLM HTTP.

    Because Basic Authentication sends credentials as base-64 encoded clear text, the patterns for events.abc.int all use HTTPS, which protects user names and passwords. Although the use of HTTPS is recommended for Basic Authentication, the search appliance can also authenticate over HTTP. Make Public is selected for all URL patterns.

  2. Under Content Sources > Web Crawl > Start and Block URLs, Sandra clicks Add under Start URLs and adds the URL patterns "https://events.abc.int/" and "https://announce.abc.int/".
  3. Sandra also adds the URL patterns "https://events.abc.int/", "https://announce.abc.int/", and "https://directory.abc.int/" under Follow Patterns.
  4. Finally, she clicks Save to save the changes.
  5. She pushes a web feed to the appliance that includes the URLs from directory.abc.int, using the following syntax:
    <record url="https://directory.abc.int/" authmethod="ntlm">

    Because the record has authmethod=ntlm, the search appliance attempts to authenticate using NTLM HTTP when crawling this content.

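A complete web feed wraps such records in a small XML envelope. The following Python sketch builds a minimal feed; the data source name ("directory") is a hypothetical example, and the real gsafeed schema supports additional elements and attributes not shown here:

```python
import xml.etree.ElementTree as ET

def build_feed(datasource, urls):
    """Build a minimal web feed whose records carry authmethod="ntlm"."""
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource  # hypothetical name
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(feed, "group")
    for url in urls:
        # authmethod="ntlm" tells the crawler to authenticate via NTLM HTTP.
        ET.SubElement(group, "record", url=url, mimetype="text/html",
                      authmethod="ntlm")
    return ET.tostring(feed, encoding="unicode")

feed_xml = build_feed("directory", ["https://directory.abc.int/"])
```

The feed is then pushed to the appliance's feed port in the usual way; only the record attribute discussed above is essential to this example.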
Now that the search appliance has access to all of ABC Company’s controlled-access content, the search appliance administrator starts the crawl and waits for the content to appear in the index.

Populating the Index for Controlled-Access Content

During crawl, the search appliance goes through each of the content sources that have been configured, and uses the credentials under Crawler Access to obtain the controlled-access content.

The search appliance can use multiple protocols to crawl and index controlled-access content.

  • The search appliance connects to events.abc.int over HTTPS. The web server asks for credentials using HTTP Basic Authentication: the search appliance provides the username “ABCsearch” and the password entered in the Admin Console. The web server verifies that ABCsearch has access to view documents on events.abc.int. The search appliance crawls through all documents on events.abc.int and adds them to the index.
  • The search appliance connects to announce.abc.int over HTTPS. The Microsoft IIS server asks for credentials using Windows Authentication: the search appliance provides an NTLM HTTP message that contains the username “ABCsearch” and a response based on the password entered in the Admin Console. The IIS server verifies that ABCsearch has access to view documents on announce.abc.int. The search appliance crawls through all documents on announce.abc.int and adds them to the index.
  • The search appliance receives a web feed that directs it to directory.abc.int with authmethod=ntlm. It connects to directory.abc.int over HTTPS. The Microsoft IIS server asks for credentials using Windows Authentication: the search appliance provides an NTLM HTTP message that contains the username “ABCfeeder” and a response based on the password entered in the Admin Console. The IIS server verifies that ABCfeeder has access to view documents on directory.abc.int. The search appliance crawls through all documents on directory.abc.int and adds them to the index.

Serving Controlled-Access Content to the User as Public Content

ABC Company has decided to make the search results public: the events, announce, and directory servers control access to their content, but employees can discover the information they need by performing a search query.

Eric is an employee of ABC Company. He wants to find an announcement about a colleague’s recent promotion to Director. Eric opens the search page in a web browser and enters a query about “Maria Jones director”. The search appliance performs the following steps before sending Eric to the search results page:

  1. The search appliance checks to see whether any of the content sources require authorization. Although the search appliance had to provide credentials to index the content, the Make Public? checkbox is selected for all of ABC Company’s content sources. All content in the index is labeled as public: no authorization check is required.
  2. The search appliance queries the index and obtains a list of relevant results for Eric’s query.
  3. Eric sees search results from events.abc.int, announce.abc.int, and directory.abc.int that match the query “Maria Jones director”. For instance, Eric finds an all-hands meeting that Maria scheduled from events, a notice about her promotion from announce, and her office phone number and location from directory.

When Eric clicks on one of the links in the search results page, the server that hosts the page requests a response that includes an authentication header. If Eric hasn’t logged in elsewhere, he’ll have to enter a username and password on a login form. Although the search appliance indexed the content as “public,” the server still requires credentials before it displays the full document.

The next time that Eric clicks a link on his search results page, however, his browser forwards an authentication header based on his user name and password to the server. If all the servers in this example are on the same domain and accept the same credentials, Eric shouldn’t have to log in again for as long as he keeps the browser open and the session time hasn’t expired.
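The authentication header used here is just the base-64 encoding of username:password, which is why the crawl configuration for events.abc.int uses HTTPS. A short Python sketch with hypothetical credentials shows how easily the token is reversed:

```python
import base64

def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header value (RFC 7617)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

# Hypothetical credentials, for illustration only.
header = basic_auth_header("user", "pass")

# The token is encoding, not encryption: anyone observing the request
# can recover the original credentials.
decoded = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
```

Because the credentials are recoverable by anyone who can observe the request, sending them over plain HTTP exposes the account; HTTPS encrypts the entire exchange.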


Use Case 2: One Set of Credentials for Multiple Authentication Mechanisms


AlphaLyon is a multi-national corporation with several content servers that use different authentication mechanisms.

  • http://insidealpha.com is the URL for content protected by a single sign-on (SSO) server.
  • apacheserver.alphainside.com is a server for content protected by a custom Apache script that uses cookies from the SSO system.
  • comp.alpha.int is a simple web server that uses HTTP Basic authentication. This server hosts some personnel information from North America.
  • pers.def.int is a Microsoft IIS web server that uses NTLM v2 HTTP. This server hosts global personnel information, excluding North America.
  • AlphaLCM is a connector manager with one connector instance that is used to traverse and index information (including some global personnel information) from AlphaLyon’s Documentum content management system.

There is a single corporate-wide set of credentials for each employee.

Currently, when employees search for protected personnel information, they are prompted for their credentials by each authentication mechanism separately. AlphaLyon’s Information Technology department has set an objective to centralize serve-time authentication for the various servers hosting personnel information. This way, users need to provide their credentials only once for content protected by several authentication mechanisms.

AlphaLyon has these people who interact with this content:

  • Ashish, the system administrator
  • Tanya, the search appliance administrator
  • Joseph, a manager who wants to view personnel information about people in his organization

This use case is based on the assumption that Tanya has added a connector for Documentum and the content from the CMS has been traversed and fed into the search appliance. For information about adding connectors, see Introducing Connectors.

Setting Up Crawl and Index

Ashish, the system administrator, creates a user account for the search appliance, called ALSearch, and sets up access policies that ensure that the ALSearch user account is authorized to view all files on comp.alpha.int and pers.def.int.

Next, Tanya sets up crawl and index of the controlled-access content by performing the following steps:

  1. To provide the search appliance with credentials for crawling and indexing comp.alpha.int, which is protected by HTTP Basic Authentication, and pers.def.int, which uses NTLM HTTP, Tanya opens Content Sources > Web Crawl > Secure Crawl > Crawler Access.
  2. Tanya adds the following rows:

    For URLs Matching Pattern, Use:   Username:   In Domain:        Password:   Confirm Password:   Make Public:
    http://comp.alpha.int/            ALSearch                      *******     *******
    https://pers.def.int/             ALSearch    alphalyon_corp    *******     *******

    Tanya uses the account name and password for ALSearch that were provided by Ashish, the system administrator. Note that, for http://comp.alpha.int/, the In Domain text box is left empty. The empty text box instructs the search appliance to authenticate using HTTP Basic. For https://pers.def.int/, Tanya supplies the domain, which tells the search appliance to authenticate against the server using NTLM HTTP.

    The Make Public checkbox is also cleared. The search appliance has full access to the servers, but labels any results from them as “secure” and requires authentication and authorization checks before displaying secure content in the search results.

  3. Tanya clicks Save.
  4. Next, Tanya needs to provide the search appliance with credentials for crawling and indexing content protected by single sign-on systems (http://insidealpha.com and apacheserver.alphainside.com), so she opens Content Sources > Web Crawl > Secure Crawl > Forms Authentication.
  5. In the Sample Forms Authentication protected URL box, Tanya enters http://insidealpha.com/inside.html.
  6. In the URL Pattern for this rule box, Tanya enters http://insidealpha.com/ and clicks Create a New Forms Authentication Rule.

    The search appliance proxies the login form.

  7. Tanya enters the credentials for the crawler user account and saves the forms authentication rule.

    The search appliance stores the rule for use in crawl for all content under http://insidealpha.com/. When a cookie expires, the search appliance uses the stored crawler account to request a new session cookie.

  8. Next, Tanya uses the Content Sources > Web Crawl > Secure Crawl > Forms Authentication page to add credentials for crawling and indexing apacheserver.alphainside.com. In the Sample Forms Authentication protected URL box, Tanya enters apacheserver.alphainside.com/alphainsider.html.
  9. In the URL Pattern for this rule box, Tanya enters apacheserver.alphainside.com/ and clicks Create.
  10. The search appliance proxies the login form.
  11. Tanya enters the credentials for the crawler user account and saves the forms authentication rule.

    The search appliance stores the rule for use in crawl for all content under apacheserver.alphainside.com/. When a cookie expires, the search appliance uses the stored crawler account to request a new session cookie.

  12. Next, to get the controlled-access content crawled and indexed, Tanya opens Content Sources > Web Crawl > Start and Block URLs.
  13. Tanya clicks Add under Start URLs and adds the following URL patterns:
    • http://comp.alpha.int/
    • https://pers.def.int/
    • http://insidealpha.com/
    • https://apacheserver.alphainside.com/
  14. Tanya also adds these URL patterns in the Follow Patterns box and clicks Save.
  15. To check that the crawling system is currently running, Tanya opens Content Sources > Diagnostics > Crawl Status. The crawl status indicates that the crawl system is running.
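The cookie handling described above (store a forms authentication rule, reuse its session cookie during crawl, and re-run the stored login when a cookie expires) can be sketched as follows; the rule object and login callable are hypothetical stand-ins for the appliance's internal state:

```python
import time

class FormsAuthRule:
    """Hypothetical sketch of a stored forms-authentication rule."""

    def __init__(self, url_pattern, login):
        self.url_pattern = url_pattern
        self._login = login            # callable performing the proxied login
        self._cookie = None
        self._expires_at = 0.0
        self.logins_performed = 0

    def session_cookie(self, now=None):
        """Return a valid session cookie, re-running the login if expired."""
        now = time.time() if now is None else now
        if self._cookie is None or now >= self._expires_at:
            self._cookie, ttl = self._login()
            self._expires_at = now + ttl
            self.logins_performed += 1
        return self._cookie

# Simulated login that issues a cookie valid for 10 "seconds".
rule = FormsAuthRule("http://insidealpha.com/", lambda: ("SESSION=abc123", 10))
c1 = rule.session_cookie(now=0.0)    # first use: performs the login
c2 = rule.session_cookie(now=5.0)    # cookie still valid: reused
c3 = rule.session_cookie(now=15.0)   # cookie expired: login runs again
```

The point of the sketch is only the expiry check: the crawler never prompts a person, because the stored crawler account can repeat the login on demand.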

Now that the search appliance has access to all this protected content, it can populate the index, as described in the following section.

Populating the Index with Controlled-Access Content

During crawl, the search appliance goes through each of the configured content sources and obtains the controlled-access content by using the credentials configured on Content Sources > Web Crawl > Secure Crawl > Crawler Access and the forms authentication credentials configured on Content Sources > Web Crawl > Secure Crawl > Forms Authentication.

For content on comp.alpha.int, which is protected by HTTP Basic Authentication:

  1. The search appliance connects to http://comp.alpha.int/.
  2. The web server asks for credentials using HTTP Basic Authentication.
  3. The search appliance provides the username “ALSearch” and the password entered in the Admin Console.
  4. The web server verifies that ALSearch has access to view documents on comp.alpha.int.
  5. The search appliance crawls through all documents on comp.alpha.int and adds them to the index.
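These five steps are the standard HTTP Basic challenge/response exchange. A self-contained sketch, with a toy in-memory server standing in for comp.alpha.int and a hypothetical password:

```python
import base64

# Toy server state: accounts it accepts (password is hypothetical).
ACCOUNTS = {"ALSearch": "s3cret"}

def server_respond(headers):
    """Return 401 until valid Basic credentials arrive (steps 2 and 4)."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Basic "):
        return 401                      # server challenges for credentials
    user, _, pwd = base64.b64decode(auth[6:]).decode().partition(":")
    return 200 if ACCOUNTS.get(user) == pwd else 401

def crawl_fetch(user, pwd):
    """Step 1: request the page; step 3: retry with credentials after a 401."""
    status = server_respond({})
    if status == 401:
        token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
        status = server_respond({"Authorization": f"Basic {token}"})
    return status
```

With the correct crawler credentials the retry succeeds (step 5 can then fetch and index each document); with wrong credentials the server keeps rejecting the request.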

For content on pers.def.int, which is protected by NTLM HTTP:

  1. The search appliance connects to pers.def.int over HTTPS.
  2. The Microsoft IIS server asks for credentials using Windows Authentication.
  3. The search appliance provides an NTLM HTTP message that contains the username “ALSearch” and a response based on the password entered in the Admin Console.
  4. The IIS server verifies that ALSearch has access to view documents on pers.def.int. The search appliance crawls through all documents on pers.def.int and adds them to the index.

For content on http://insidealpha.com and apacheserver.alphainside.com, which are protected by forms authentication:

  1. First, the search appliance connects to http://insidealpha.com/.
  2. The web server asks for a session cookie.
  3. The search appliance recognizes the URL pattern and provides the cookie that was set in the Admin Console under Content Sources > Web Crawl > Secure Crawl > Forms Authentication.
  4. The web server verifies that the crawler account has access to view documents in the controlled-access directory.
  5. The search appliance crawls through all documents on http://insidealpha.com/ and adds them to the index. Because these documents were accessed through a forms authentication rule with Make Public cleared, they are labeled as secure in the index.
  6. Next, the search appliance connects to apacheserver.alphainside.com/ and repeats steps 2 through 5 by interacting with the Apache server.

When the crawl completes, the index contains content from all of these sources.

Setting Up Serve

To centralize serve-time authentication for the protected content, Tanya, the search appliance administrator, configures the Default credential group:

  1. First, to add the single sign-on server http://insidealpha.com to the credential group, Tanya opens Search > Secure Search > Universal Login Auth Mechanisms > Cookie.

Because the Default credential group is already selected, Tanya does not need to select a credential group from the pull-down menu.

  2. Tanya types http://insidealpha.com/inside.html, a sample URL for the site, in the Sample URL box. Options for adding another cookie-based domain appear on the page. The Default credential group is already selected.
  3. Tanya clicks Save.
  4. Next, to add apacheserver.alphainside.com, Tanya types apacheserver.alphainside.com/alphainsider.html, a sample URL for the content protected by a custom Apache script, in the Sample URL box and clicks Save.
  5. Next, to add the comp.alpha.int web server, which uses HTTP Basic authentication, to the credential group, Tanya clicks the HTTP tab.

The Default credential group is already selected.

  6. Tanya types http://comp.alpha.int/na.html, a sample URL for the site, in the Sample URL box, and clicks Save.

Options for adding another HTTP-based domain appear on the page. The Default credential group is already selected.

  7. To add pers.def.int, which uses NTLM HTTP authentication, Tanya selects the NTLM checkbox, types pers.def.int/emea.html in the Sample URL box, and clicks Save.
  8. Finally, to add the connector manager to the credential group, Tanya clicks the Connectors tab.

The Default credential group is already selected.

  9. Tanya types AlphaLCM, the name of the registered connector manager, in the Connector Manager Name box and clicks Save.

Serving Controlled-Access Content to a User with One Set of Credentials

Joseph is a manager who wants to gather all the personnel records for Pat Smith, an employee who recently joined Joseph’s group from another department. Several systems in the Default credential group contain information about Pat Smith.

The following steps give an overview of the process of serving controlled-access content with Default credential group configured.

  1. Joseph opens the search page in a web browser, enters a query for “Pat Smith,” clicks the public and secure content radio button, and clicks Search.
  2. The Universal Login Form checks Joseph’s existing cookies to see whether the credential group is already satisfied.

The authentication mechanisms return a “rejected” response, meaning that the credential group is not satisfied.

  3. The search appliance prompts Joseph by presenting the Universal Login Form.
  4. Joseph enters his username and password on the Universal Login Form and clicks Login.
  5. The search appliance applies Joseph’s credentials to the systems in the Default credential group and checks each sample URL for access.

None are rejected and the Default credential group is satisfied.

  6. The search appliance queries the index and obtains a list of relevant results for Joseph’s query.
  7. The search appliance checks the list to see whether any of the results require authorization and filters the results based on which results Joseph is authorized to view.
  8. The search appliance directs Joseph’s browser to a search results page that contains all results that match the query “Pat Smith” that Joseph is authorized to view.
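The last steps amount to filtering the raw result list by per-user authorization: public results are always served, and secure results are served only when the authorization check succeeds. A hypothetical sketch:

```python
def filter_results(results, is_authorized):
    """Keep public results plus secure results the user may view."""
    served = []
    for url, is_public in results:
        if is_public or is_authorized(url):
            served.append(url)
    return served

# Hypothetical index entries: (url, labeled-public?).
results = [
    ("http://comp.alpha.int/na.html", False),
    ("https://pers.def.int/emea.html", False),
    ("http://insidealpha.com/welcome.html", True),
]

# Assumed ACL: Joseph may view the North America page but not the EMEA page.
allowed = {"http://comp.alpha.int/na.html"}
served = filter_results(results, lambda url: url in allowed)
```

The unauthorized secure result is silently dropped; the user never learns that it exists.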

Use Case 3: Two Sets of Credentials for Two Connectors

AlphaLyon, from use case 2 (see Use Case 2: One Set of Credentials for Multiple Authentication Mechanisms), has acquired ABC company, from use case 1 (see Use Case 1: HTTP Basic or NTLM HTTP Controlled-Access Content with Public Serve). Content for the merged companies is managed by two different content management systems (CMSs).

  • AlphaLyon’s content is managed by the ECM Documentum Content Management System
  • ABC company’s legacy content is managed by Open Text Livelink ECM

Employees of the merged companies have two corporate-wide sets of credentials:

  • 80% of the merged company’s employees have credentials in AlphaLyon’s system.
  • 25% of the employees have credentials in ABC company’s system.
  • 5% of the employees have credentials in both systems, so the two systems together cover every employee (80% + 25% - 5% = 100%).

AlphaLyon’s IT department wants to centralize serve-time authentication for both systems, using both sets of credentials.

AlphaLyon has these people who interact with this content:

  • Tanya, the search appliance administrator
  • Leslie, an employee who joined AlphaLyon as a result of the merger, who has credentials in both systems and wants to view information from both systems

This use case assumes that Tanya has added connectors for Documentum and Livelink and that the content from the CMSs has been traversed and fed into the search appliance. For information about adding connectors, see Introducing Connectors.


Creating a Credential Group

Tanya needs to configure two credential groups, one credential group for each of the connectors. However, because she is going to configure the Default credential group for Documentum, she only needs to create one additional credential group, for Livelink.

  1. Tanya opens Search > Secure Search > Universal Login.
  2. Tanya creates a credential group for Livelink by typing the name for the new credential group, ABCLivelink, in the Credential Group Name box.
  3. Tanya types a display name for the new credential group in Credential Group Display Name.
  4. Tanya does not select the Require a user-name for this credential group? checkbox because no ACLs require it.
  5. Tanya selects the Group is optional? checkbox because not every employee has credentials for this group.
  6. Tanya clicks Save.

Adding Connectors to the Credential Groups

Next, Tanya configures the Default credential group and the ABCLivelink credential group by adding the connectors to each group:

  1. First, to add the Documentum connector to the Default credential group, Tanya clicks Search > Secure Search > Universal Login Auth Mechanisms > Connectors.

The Default credential group is already selected.

  2. Tanya types AlphaCM, a mechanism name for this entry, in the Mechanism Name box.
  3. Tanya selects the connector instance to be used in the Connector Name box and clicks Save.
  4. Next, to add the Livelink connector to the ABCLivelink credential group, Tanya creates a new entry by selecting the ABCLivelink credential group from the pull-down menu, typing a Mechanism Name, and clicking Save.

Serving Controlled-Access Content to a User with Two Sets of Credentials

Leslie is an employee who works on the “Island” project. She began working on this project in ABC company and continues to work on it after the merger. Both the Documentum and Livelink CMSs have information about this project. Leslie wants to view information about project Island from both systems.

The following steps give an overview of the process of serving controlled-access content with two credential groups (Default and ABCLivelink) configured.

  1. Leslie opens the search page in a web browser, enters a query for “Island,” clicks the public and secure content radio button, and clicks Search.
  2. The Universal Login Form checks to see whether the two credential groups are already satisfied.

The authentication mechanisms return “rejected” responses, meaning that neither of the credential groups are satisfied.

  3. The search appliance prompts Leslie for her user credentials (user name and password) for both systems by presenting the Universal Login Form with two logins: one for the system in the Default credential group and one for the system in the ABCLivelink credential group.
  4. Leslie enters her two usernames and passwords on the Universal Login Form and clicks Login.
  5. The search appliance checks her passwords with the connector managers.

Leslie correctly entered her credentials for the system for the Default credential group but mistyped her password for the system in the ABCLivelink credential group. The Default credential group is satisfied, but the ABCLivelink credential group is not satisfied.

  6. The search appliance again prompts Leslie for her credentials for the system in the ABCLivelink credential group by presenting the Universal Login Form.

Because the Default credential group is already satisfied, its login is disabled (grayed out).

  7. Leslie re-enters her username and password for the system in the ABCLivelink credential group, this time correctly.
  8. The search appliance checks her password with the connector manager. The credential group is satisfied.
  9. The search appliance queries the index and obtains a list of relevant results for Leslie’s query.
  10. The search appliance checks the list to see whether any of the results require authorization and filters the results based on which results Leslie is authorized to view.
  11. The search appliance directs Leslie’s browser to a search results page that contains all results that match the query “Island” that Leslie is authorized to view.
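The retry behavior in this walkthrough (each credential group is verified independently, and only unsatisfied groups are prompted again) can be sketched as follows; the verify function and stored passwords are hypothetical stand-ins for the connector managers:

```python
def satisfy_groups(groups, verify, prompts):
    """Prompt until every credential group verifies; skip satisfied groups.

    groups:  credential group names
    verify:  verify(group, credentials) -> bool
    prompts: iterator of {group: credentials} dicts (simulated user input)
    """
    satisfied = set()
    for entered in prompts:
        for group in groups:
            if group not in satisfied and verify(group, entered.get(group)):
                satisfied.add(group)
        if satisfied == set(groups):
            return True
    return False

# Hypothetical stored passwords per credential group.
PASSWORDS = {"Default": "pw1", "ABCLivelink": "pw2"}
verify = lambda group, cred: cred == PASSWORDS[group]

# Leslie mistypes the ABCLivelink password once, then corrects it; the
# Default login is grayed out on the second form, so it is not re-sent.
attempts = [
    {"Default": "pw1", "ABCLivelink": "typo"},
    {"ABCLivelink": "pw2"},
]
ok = satisfy_groups(["Default", "ABCLivelink"], verify, iter(attempts))
```

The already-satisfied Default group is never re-verified, mirroring the disabled login on the second presentation of the form.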

Use Case 4: Windows Authentication with Kerberos Tickets for Secure Serve

AlphaLyon has decided to upgrade older servers and implement a new security policy that uses Integrated Windows Authentication (IWA) on all machines throughout their internal domain. The domain controller is a Windows server named hal.alphalyon.com.

AlphaLyon is going to upgrade the following servers:

  • products.alphalyon.int is a simple web server that uses HTTP Basic authentication. This server contains information about the company’s products.
  • news.alphalyon.int is a Microsoft IIS web server that uses NTLM HTTP. This server contains news announcements.
  • emp.alphalyon.int is another Microsoft IIS server that uses NTLM HTTP. It provides internal information about employees, such as email addresses and phone numbers.
  • sales.alphalyon.int is a web server that uses HTTP Basic authentication. This server stores general information used by everyone on the sales team.
  • customers.alphalyon.int is a Microsoft IIS server that uses NTLM HTTP. It stores customer directory information, such as phone numbers and addresses.

Our search appliance administrator, Tanya, wants to use Kerberos authentication to enable the search appliance to silently authenticate the user without requiring an HTTP Basic login box.

This use case is based on the following assumptions:

  • Tanya has already set up crawl and index for the protected content by providing the search appliance with credentials on Content Sources > Web Crawl > Secure Crawl > Crawler Access.
  • The following two servers have been crawled with Make Public selected: products.alphalyon.int and news.alphalyon.int, so their content is public. Content on the other servers is secure.
  • The search appliance has already indexed the protected content.

Once again, AlphaLyon has these people who interact with this content:

  • Ashish, the system administrator
  • Tanya, the search appliance administrator
  • Eric, an employee who needs to find content
  • Salim, a sales manager who needs to find information on pricing for the upcoming “AlphaLyon Product” release.


Obtaining a keytab File

Before configuring and activating Kerberos support, Tanya must obtain a Kerberos Service Key Table (keytab) file from the domain controller.

Tanya performs the following actions:

  1. Tanya requests a keytab file for the search appliance from Ashish, the Windows system administrator.
  2. Ashish sends Tanya a keytab file named searchappliance.keytab.
  3. Tanya saves the keytab file on her Desktop.

Configuring and Activating Kerberos Support

Now, Tanya needs to configure the search appliance to check for a user’s session ticket during serve. She also needs to activate Kerberos support:

  1. Tanya opens Search > Secure Search > Universal Login Auth Mechanisms > Kerberos.
  2. Under Specify a Kerberos Key Distribution Center (KDC) / Windows Domain Controller (DC) , Tanya enters hal.alphalyon.com in the Kerberos KDC Hostname box, and clicks Save to save the change.
  3. Under Import a Kerberos Service Key Table (keytab) File, Tanya clicks Choose File and navigates to her Desktop folder.
  4. She selects the keytab file, searchappliance.keytab, and clicks OK to upload the Kerberos key table file to the search appliance.
  5. She clicks Import Kerberos Keytab File to save the change.
  6. In the section labeled Activate IWA (Integrated Windows Authentication) / Kerberos Authentication , she clicks Enable Kerberos support, and clicks Save. Because she is configuring Kerberos support for the Default credential group, she does not need to select a credential group from the pull-down menu.

Now that the search appliance is configured to use Kerberos authentication, any time a user requests secure content, the search appliance attempts to authenticate with the user’s Kerberos session key. No additional setup is needed for secure serve.

Serving Controlled-Access Content to the User as Secure Content with Kerberos Authentication

AlphaLyon now has public and secure search results available on the search appliance, and the search appliance is able to authenticate users against a Windows Domain Controller.

Search by an Authorized User

Salim is looking for a detailed report that discusses sales figures for the new “AlphaLyon Product” release. Salim opens the search page in a web browser and enters a query for “AlphaLyon Product fall sales report”.

The search appliance performs the following steps before sending Salim’s browser to the search results page:

  1. The search appliance queries the index and obtains a list of the most relevant results for Salim’s query. The list of potential results includes announcements about the new AlphaLyon Product release (public content), as well as sales presentations and other sales collateral materials about AlphaLyon Product (secure content).
  2. The search appliance filters the list of results as specified by the front end that applies to Salim’s search. It applies Filters defined in Search > Search Features > Front Ends > Filters and excludes all URLs listed in Search > Search Features > Front Ends > Remove URLs.
  3. The sales collateral materials come from content sources that are labeled “secure”. Before it can serve results for Salim’s query, the search appliance needs more information.
  4. The search appliance checks to see whether Salim has provided credentials that it can use. Salim’s web browser obtains or validates his Kerberos ticket from the network domain controller, which is acting as a Kerberos Key Distribution Center (KDC).
  5. The search appliance sends an authorization request to Salim’s web browser. Because the search appliance is configured to force the use of SSL for secure search, the request is sent over HTTPS. (This configuration is recommended, but optional.)
  6. Because Salim’s Kerberos ticket is valid for use by the search appliance, Salim’s web browser does not display the Universal Login form. His query is silently authenticated through Kerberos.
  7. Salim’s Kerberos ticket is used to generate a session cookie on his computer. The browser sends Salim’s cookie back to the search appliance as an authentication header sent over HTTPS.
  8. Using Salim’s cookie, the search appliance performs an HTTP HEAD request for each of the secure documents in the list of results. If the server returns “HTTP status 401” (Unauthorized) for a document, or the authorization attempt is inconclusive, the document is removed from the list of potential results. Because Salim is a member of the policy group sales, the search appliance should be authorized to request all of the secure sales collateral materials when passing his credentials.
  9. The search appliance creates a list of search result snippets and URLs that meet all of the following criteria:
       - The URL matches Salim’s search query.
       - The URL is not excluded by a filter in Salim’s front end.
       - The URL is not excluded by a Remove URLs pattern in Salim’s front end.
       - The URL is public, or Salim has authorization to view it.
  10. The search appliance directs Salim’s browser to the search results page that contains all public and secure documents that match the query “AlphaLyon Product fall sales report”. Salim should see results from products.alphalyon.int, news.alphalyon.int, emp.alphalyon.int, sales.alphalyon.int, and customers.alphalyon.int.

When Salim clicks on one of the links in his search results page, the browser provides his Kerberos ticket in the authentication header. The next time that Salim performs a search, the search appliance recognizes his session cookie and skips directly to the HTTP HEAD request in step 8. The session cookie set by the search appliance remains valid as long as he keeps the browser open.
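The session shortcut works because the appliance can key a session on the cookie value and skip the Kerberos exchange on subsequent queries. Here is a minimal sketch under the assumption of a simple in-memory session store; the names and data shapes are illustrative, not the appliance’s internals, and ticket validation is assumed to have already happened.

```python
import secrets

SESSIONS = {}  # cookie value -> authenticated user (illustrative store)

def authenticate(request):
    """Return (user, cookie), minting a session cookie on first use.

    request is a dict: "cookie" is the session cookie (if any), and
    "kerberos_user" is the principal from an already-validated ticket.
    """
    cookie = request.get("cookie")
    if cookie in SESSIONS:
        # Known session: skip the Kerberos exchange and go straight
        # to the per-document HEAD checks (step 8).
        return SESSIONS[cookie], cookie
    user = request["kerberos_user"]          # silent Kerberos auth (step 6)
    cookie = secrets.token_hex(16)           # new session cookie (step 7)
    SESSIONS[cookie] = user
    return user, cookie

# First query: the Kerberos ticket is used and a cookie is issued.
user, cookie = authenticate({"kerberos_user": "salim@ABC_CORP"})
# Later queries in the same browser session reuse the cookie.
user2, _ = authenticate({"cookie": cookie, "kerberos_user": None})
print(user == user2)  # → True
```

Closing the browser discards the cookie, which is why the session lasts only as long as the browser stays open.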

The search results page doesn’t tell Salim how many search results match his query or display “Goooooogle” navigation links, because doing so would reveal how many secure documents exist in the index.

Search by an Unauthorized User

Eric isn’t a member of the sales team, but he’s also interested in the new AlphaLyon Product release and wants to know when the sales figures will be posted. Eric opens the search page in a web browser and enters the same query, “AlphaLyon Product fall sales report”. The search appliance performs the following steps before sending Eric’s browser to the search results page:

  1. The search appliance queries the index and obtains a list of the most relevant results for Eric’s query. The list of potential results includes press releases announcing the new AlphaLyon Product release, as well as sales presentations and other sales collateral materials about AlphaLyon Product.
  2. The search appliance filters the list of results as specified by the front end that applies to Eric’s search. It applies Filters defined in Search > Search Features > Front Ends > Filters and excludes all URLs listed in Search > Search Features > Front Ends > Remove URLs.
  3. The sales collateral materials come from content sources that are labeled “secure”. Before it can serve results for Eric’s query, the search appliance needs more information.
  4. The search appliance checks to see whether Eric has provided credentials that it can use. Eric’s web browser obtains or validates his Kerberos ticket from the network domain controller, which is acting as a Kerberos Key Distribution Center (KDC).
  5. The search appliance sends an authentication request to Eric’s web browser. Because the search appliance is configured to force the use of SSL for secure search, the request is sent over HTTPS.
  6. Because Eric’s Kerberos ticket is valid for use by the search appliance, Eric’s web browser does not display the Universal Login Form. His query is silently authenticated through Kerberos.
  7. Eric’s Kerberos ticket is used to generate a session cookie on his computer. The browser sends Eric’s cookie back to the search appliance in a Cookie header over HTTPS.
  8. Using Eric’s cookie, the search appliance performs an HTTP HEAD request for each of the secure documents in the list of results. If the server returns “HTTP status 401” (Unauthorized) for a document, or the authorization attempt is inconclusive, the document is removed from the list of potential results. Because Eric isn’t a member of the policy group sales, the search appliance fails its authorization check using Eric’s credentials. It removes all of the secure sales collateral materials from the list of potential results.
  9. The search appliance creates a list of search result snippets and URLs that meet all of the following criteria:
       - The URL matches Eric’s search query.
       - The URL is not excluded by a filter in Eric’s front end.
       - The URL is not excluded by a Remove URLs pattern in Eric’s front end.
       - The URL is public, or Eric has authorization to view it.
  10. The search appliance directs Eric’s browser to the search results page that contains all public documents that match the query “AlphaLyon Product fall sales report”. Eric should see results from products.alphalyon.int and news.alphalyon.int, but unlike Salim, he doesn’t see any results from emp.alphalyon.int, sales.alphalyon.int, or customers.alphalyon.int.
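The four serving criteria in step 9 can be expressed as a single predicate, which makes the difference between Salim’s and Eric’s result pages easy to see. This sketch uses a hardcoded group table as a simplified stand-in for whatever access checks the content servers actually perform; the function and field names are illustrative.

```python
# Illustrative policy: members of "sales" may view secure sales content.
GROUPS = {"salim": {"sales"}, "eric": set()}

def should_serve(doc, user, query_terms, removed_urls):
    """Apply the step-9 criteria: query match, Remove URLs, and access."""
    matches = all(t in doc["text"].lower() for t in query_terms)
    not_removed = doc["url"] not in removed_urls
    visible = doc["access"] == "public" or "sales" in GROUPS.get(user, set())
    return matches and not_removed and visible

doc = {"url": "http://sales.alphalyon.int/fall.ppt",
       "text": "AlphaLyon Product fall sales report",
       "access": "secure"}
terms = ["alphalyon", "sales"]
print(should_serve(doc, "salim", terms, set()))  # → True
print(should_serve(doc, "eric", terms, set()))   # → False
```

The same secure document passes every criterion for Salim but fails the access criterion for Eric, so it never appears in Eric’s snippet list.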

The search results page doesn’t tell Eric how many search results match his query or display “Goooooogle” navigation links, because doing so would reveal how many secure documents exist in the index.
