Can a single Google Mini crawl both Internet and intranet content?
Yes. A single Google Mini can crawl both Internet and intranet content, and administrators can use the Mini's collections feature to provide separate search engines for publicly available content and proprietary intranet content.
With the collections feature, you can divide up your Google Mini to provide search across multiple web sites in your organization. A collection is a filtered subset of all the documents the Mini has indexed. Once you set up a collection, you tell the Google Mini which collections to search on each query by passing the collection names as a parameter alongside the search keywords.
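As a rough illustration of passing collection names with a query, the sketch below builds a search URL in Python. The parameter names (`q`, `site`, `client`, `output`), the `|` separator for searching several collections at once, the `/search` path, the host name, and the collection names are all assumptions made for this example; check the search protocol documentation for your Mini's software version before relying on them.

```python
from urllib.parse import urlencode


def build_search_url(host, query, collections, frontend="default_frontend"):
    """Build a search URL restricted to the given collections.

    Assumptions (not confirmed by this FAQ): the appliance answers at
    /search, takes the keywords in `q`, and takes collection names in
    `site`, joined with "|" to search any of them.
    """
    if not collections:
        raise ValueError("at least one collection name is required")
    params = {
        "q": query,
        "site": "|".join(collections),  # assumed separator for "any of these"
        "client": frontend,             # assumed front end name
        "output": "xml_no_dtd",         # assumed raw XML output format
    }
    # urlencode percent-escapes the "|" and turns spaces into "+".
    return f"http://{host}/search?{urlencode(params)}"


# Hypothetical host and collection names, for illustration only.
url = build_search_url("mini.example.com", "vacation policy", ["intranet", "support"])
print(url)
```

A support-site search box would pass only its own collection (for example `["support"]`), while a general search box would pass every collection it should cover.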
You can use collections to provide rich functionality on multiple parts of your web infrastructure. For example:
1) Provide search for your external web site and your intranet using just one Google Mini.
2) Have a general search box on your web site for all documents, and then have another search box on your support site which only searches over the support pages.
3) Create different collections for different categories of pages (for example, product pages, support pages, articles, etc.) and have the user choose which categories they want to search through.
There are a few things to note when using collections:
1) If different collections are to be used by different sets of users (for example, your external web site is open to everyone, but your intranet is only open to employees), you should place a proxy web server in front of your Mini. Users send search requests to the proxy server, which then chooses the appropriate collection(s) to serve results from based on the user's permissions.
2) In the case of running your external web site and intranet on one Mini, the proxy server would need to be placed outside your firewall to serve your external web site search. Your intranet users can either go directly to the Mini or to another proxy server inside your firewall.
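The collection choice made by such a proxy can be sketched as follows. This is a minimal illustration, not a supplied component: the group names, the collection names, and the `allowed_collections` helper are all hypothetical, and a real proxy would also need to authenticate the user and forward the query to the Mini.

```python
# Hypothetical mapping from user group to the collections that group may search.
COLLECTIONS_BY_GROUP = {
    "public": ["external_site"],
    "employee": ["external_site", "intranet"],
}


def allowed_collections(user_groups, requested=None):
    """Return the collections a user may search, in a stable order.

    Every user is treated as a member of "public". If `requested` is
    given, only the requested collections the user is allowed to see
    are returned, so an outside visitor cannot ask for "intranet".
    """
    allowed = []
    for group in ["public", *user_groups]:
        for name in COLLECTIONS_BY_GROUP.get(group, []):
            if name not in allowed:
                allowed.append(name)
    if requested is None:
        return allowed
    return [name for name in requested if name in allowed]


print(allowed_collections([]))                                     # anonymous visitor
print(allowed_collections(["employee"]))                           # employee
print(allowed_collections([], requested=["intranet"]))             # denied request
```

The proxy would ignore any collection names supplied by the browser that fall outside this list, and pass only the permitted names on to the Mini.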
For a more robust appliance-based solution (no external proxy required), we recommend the Google Search Appliance with its security enhancement package. This provides document-level security, including authentication of users (using your existing authentication scheme) before showing results that include secure documents. See http://www.google.com/enterprise/gsa/