URL patterns
URL patterns specify which pages you want included in your custom search engine. When you use the control panel or the Google Marker to add sites, you're generating URL patterns. Most URL patterns are simple and specify a whole site. However, more advanced patterns let you pick out portions of sites more precisely.
For example, the pattern 'www.foo.com/bar' matches only the single page 'www.foo.com/bar'. To cover all the pages whose URL starts with 'www.foo.com/bar', you must explicitly add a '*' at the end. In the form-based interfaces for adding sites, 'foo.com' defaults to '*.foo.com/*'. If this is not what you want, you can change it back in the control panel. No such defaulting occurs for patterns that you upload. Also note that URLs are case sensitive: if your site URLs include capital letters, make sure your patterns do as well.
In addition, the use of wildcards in URL patterns allows you to include or exclude multiple pages or portions of a site all at once. The following patterns illustrate how you can use wildcards:
- The wildcard pattern 'www.webmd.com/hw/cancer/*bar' specifies all the URLs that begin with 'www.webmd.com/hw/cancer/' and contain 'bar'.
- The prefix pattern 'www.webmd.com/*' specifies all the URLs that begin with 'www.webmd.com', i.e. all the URLs on the www.webmd.com site.
- The exact-match pattern 'www.webmd.com/' specifies only the URLs 'http://www.webmd.com/' and 'https://www.webmd.com/'.
More detailed examples are included in this table:
| Pattern | Description | Matches | Does not match |
| --- | --- | --- | --- |
| www.example.com/ | Matches a single page | www.example.com/, example.com/ | host.example.com, www.example.com/stamps |
| www.example.com/* | Matches all URLs beginning with www.example.com or example.com | www.example.com, www.example.com/stamps, example.com/stamps | host.example.com/, host.example.com/stamps |
| www.example.com/*kites | Matches all URLs that begin with www.example.com/ or example.com/ and contain the word "kites" | www.example.com/kites.html, www.example.com/kites/page2.html, www.example.com/funwithkites.html | www.example.com, www.example.com/stamps |
| www.example.com/product.asp*cat=Elec | Matches all URLs that begin with www.example.com/product.asp and contain the term 'cat=Elec' | www.example.com/product.asp?sku=20283&cat=Elec | www.example.com, www.example.com/stamps |
| www.example.com/*kites*fly | Matches all URLs that begin with www.example.com/ and contain the words "kites" and "fly" | www.example.com/kites/howto/fly.html, www.example.com/fly/howto/kites.html | www.example.com/kites/help.html, www.example.com/help/fly.html |
| *.example.com/* | Matches all sub-domains under example.com | www.example.com/stamps, host.parent.example.com/kites, example.com/kites/fly.html | example.host.com |
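To check a pattern against a few URLs before adding it, the rows of the table can be approximated in a few lines of Python. This is only a sketch of one reading of the table, not Google's actual matching logic. The function name `matches` and its helpers are hypothetical, and it assumes that a leading 'www.' is optional, that a leading '*.' covers the bare domain and any sub-domain, and that terms after a '*' may appear in any order after the prefix.

```python
def _split(text):
    """Drop any scheme and split into (host, path); path excludes the leading '/'."""
    text = text.split("://", 1)[-1]
    host, _, path = text.partition("/")
    return host, path


def _strip_www(host):
    return host[4:] if host.startswith("www.") else host


def _host_matches(pattern_host, url_host):
    # Assumption: '*.example.com' covers example.com and every sub-domain.
    if pattern_host.startswith("*."):
        base = pattern_host[2:]
        return url_host == base or url_host.endswith("." + base)
    # Assumption: a leading 'www.' is optional on either side.
    return _strip_www(pattern_host) == _strip_www(url_host)


def matches(pattern, url):
    """Return True if url fits pattern under the assumptions stated above."""
    p_host, p_path = _split(pattern)
    u_host, u_path = _split(url)
    if not _host_matches(p_host, u_host):
        return False
    if "*" not in p_path:
        # No wildcard: exact match on the path (comparison is case sensitive).
        return u_path == p_path
    # With wildcards: the text before the first '*' is a required prefix,
    # and every later piece must occur somewhere in the rest of the URL.
    prefix, *terms = p_path.split("*")
    if not u_path.startswith(prefix):
        return False
    rest = u_path[len(prefix):]
    return all(term in rest for term in terms)


print(matches("www.example.com/*kites", "www.example.com/funwithkites.html"))
print(matches("www.example.com/*kites", "www.example.com/stamps"))
```

The two calls print `True` and `False`, mirroring the "kites" row of the table. Wildcards inside the host other than a leading '*.' are not handled by this sketch.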
Adding "top-level domains" such as '*.com' or '*.travel/*' is not permitted in Google site search. A pattern that covers a whole top-level domain returns a "forbidden" error on the Google site search results page.
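If you generate patterns programmatically before uploading them, a quick pre-check can flag entries like the ones above. The following is a minimal sketch under one assumption: a pattern counts as a top-level-domain pattern when its host, after removing a leading '*.', contains no dot. The helper name `is_top_level_pattern` is hypothetical and is not part of any Google tool.

```python
def is_top_level_pattern(pattern):
    """Return True if the pattern's host is a single label such as 'com'."""
    # Drop any scheme, then keep everything before the first '/'.
    host = pattern.split("://", 1)[-1].partition("/")[0]
    # A leading '*.' stands for "any sub-domain", so ignore it.
    if host.startswith("*."):
        host = host[2:]
    # Assumption: a host without a dot is a bare top-level domain.
    return "." not in host


for candidate in ["*.com", "*.travel/*", "*.example.com/*"]:
    print(candidate, is_top_level_pattern(candidate))
```

The loop reports `True` for '*.com' and '*.travel/*' and `False` for '*.example.com/*'. Multi-label suffixes are not detected by this simple check.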