What's with all the cache/nocache stuff and weird filenames?

During its bootstrap process, a Google Web Toolkit application goes through a series of sometimes oddly-named files.  These files, generated by the GWT Compiler, usually seem strange to new users.  To effectively deploy a GWT application, however, it is necessary to understand these files so that they can be placed appropriately on the web server.

These are the important files produced by the GWT Compiler:
  • gwt.js
  • <Module Name>.nocache.html
  • <Alphanumeric>.cache.html
Each of the items above is described below.  First, however, it's important to understand Deferred Binding, since that notion is at the heart of the bootstrap process.  You may want to read up on Deferred Binding before continuing.

Before explaining what each file does, it's useful to summarize the overall bootstrap procedure for a GWT application:
  1. The browser loads and processes the host HTML page.
  2. When the browser encounters the page's <script src="gwt.js"> tag, it immediately downloads and executes the JavaScript code in the file.
  3. gwt.js scans the host page's DOM, looking for a <meta> tag with "name" attribute set to "gwt:module".  gwt.js fetches the name of the GWT module to load from the "content" attribute.
  4. gwt.js constructs a URL to a new filename, using the pattern:  <Module Name from Meta Tag>.nocache.html.
    (For example, if your module is com.company.app.MyApp, then gwt.js will look for com.company.app.MyApp.nocache.html.)
  5. gwt.js then creates a hidden <iframe>, inserts it into the host page's DOM, and loads the .nocache.html file into that iframe.
  6. The .nocache.html file contains JavaScript code that resolves the Deferred Binding configurations (such as browser detection) and then uses a lookup table generated by the GWT Compiler to locate one of the .cache.html files to use.
  7. The .nocache.html file then calls location.replace(), replacing itself with the chosen .cache.html file.
  8. The .cache.html file contains the actual program logic of the GWT application.
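
Steps 3 and 4 above can be sketched roughly as follows.  This is only an illustration, not the actual gwt.js source: the function name nocacheUrls is hypothetical, and plain objects stand in for the <meta> elements that the real script would read from the DOM.

```javascript
// Hypothetical helper (not the real gwt.js code): derive the .nocache.html
// file name(s) from the host page's <meta> tags.
// Each entry in `metaTags` stands in for one <meta> element's name/content pair.
function nocacheUrls(metaTags) {
  return metaTags
    .filter((tag) => tag.name === "gwt:module")       // step 3: find module tags
    .map((tag) => tag.content + ".nocache.html");     // step 4: build the file name
}

// Stand-in for the <meta> tags of a host page.
const hostPageMetaTags = [
  { name: "description", content: "Example host page" },
  { name: "gwt:module", content: "com.company.app.MyApp" },
];

console.log(nocacheUrls(hostPageMetaTags));
// prints [ 'com.company.app.MyApp.nocache.html' ]
```

Because the helper returns a list, it also covers a host page that names more than one module.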

That's the process in a nutshell.  The sections below describe each file in detail.

The gwt.js File
The gwt.js file is comparatively simple.  It is a small amount of meticulously cross-browser JavaScript code that kicks off the GWT startup procedure.  If you are familiar with operating systems, you can think of gwt.js as a sort of bootloader.  Its responsibility is to scan the host HTML page and gather the information required to locate the next phase in the bootstrap process.  Generally this boils down to seeking the <meta> tag that identifies the GWT Module containing your application.

It is possible to specify multiple GWT Modules in a single HTML host page.  The gwt.js code is intended to handle this case, but it is a somewhat rare use case.

The .nocache.html File
The "nocache" file is where Deferred Binding occurs.  Before the application can run, any dynamically-bound code must be resolved.  This might include browser-specific versions of classes, the specific set of string constants appropriate to the user's selected language, and so on.  In Java, this would be handled by simply loading an appropriate service-provider class that implements a particular interface. To maximize performance and minimize download size, however, GWT does this selection up-front in the "nocache" file.

If you were to look inside a .nocache.html file, you would see that it is JavaScript code inside a thin HTML wrapper.  You might wonder why the GWT Compiler doesn't simply emit it as a JavaScript .js file.  The reason is that certain browsers do not correctly handle compression of pure-JavaScript files in some circumstances.  Users unfortunate enough to be using such a browser would effectively download the .js file uncompressed.  Since the GWT mantra is no-compromise, high-performance AJAX code, the GWT Compiler wraps the JavaScript in an HTML file to work around this browser quirk.

The file is named ".nocache.html" to indicate that it should never be cached.  That is, it must be downloaded and executed again each time the browser starts the GWT application.  It must be re-downloaded each time because the GWT Compiler regenerates it on every compile, but under the same file name.  If browsers were allowed to cache the file, they might not download the new version after the GWT application was recompiled and redeployed on the server.  To help prevent caching, the code in gwt.js appends an HTTP GET parameter containing a unique timestamp to the end of the file name.  The browser interprets this as a dynamic HTTP request, and thus should not load the file from cache.
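
A minimal sketch of that timestamp trick is shown below.  The helper name withCacheBuster and the parameter name "t" are assumptions made for illustration; the actual parameter format used by gwt.js may differ.

```javascript
// Hypothetical helper: append a timestamp query parameter so that every
// request for the nocache file uses a URL the browser has not seen before.
// The parameter name "t" is an assumption, not taken from gwt.js.
function withCacheBuster(url, timestampMs) {
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "t=" + timestampMs;
}

const nocacheUrl = "com.company.app.MyApp.nocache.html";
console.log(withCacheBuster(nocacheUrl, Date.now()));
```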

One of the key features of the "nocache" file is a lookup table that maps Deferred Binding permutations to .cache.html filenames.  For example, "Firefox in English" and "Opera in French" would both be entries in the lookup table, pointing to different .cache.html files.
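
Conceptually, the lookup might resemble the sketch below.  The table layout, the key format, the property values, and the file names are all invented for illustration; the table that the GWT Compiler actually emits is structured differently.

```javascript
// Hypothetical lookup table: each key combines Deferred Binding property
// values (browser engine and locale here), and each value is a
// compiler-generated .cache.html file name. All entries are made up.
const permutationTable = {
  "gecko|en": "1A2B3C4D5E6F708192A3B4C5D6E7F809.cache.html",
  "opera|fr": "0F9E8D7C6B5A49382716051423324150.cache.html",
};

// Pick the cache file for the detected browser and locale.
function selectCacheFile(table, userAgent, locale) {
  const key = userAgent + "|" + locale;
  if (!Object.prototype.hasOwnProperty.call(table, key)) {
    throw new Error("No compiled permutation for " + key);
  }
  return table[key];
}

console.log(selectCacheFile(permutationTable, "opera", "fr"));
// prints 0F9E8D7C6B5A49382716051423324150.cache.html
```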

The .cache.html Files
The "cache" files contain your application's logic.  Like the "nocache" file -- and for the same reason -- the "cache" files are HTML rather than pure JavaScript.

They are named according to the MD5 sum of their contents.  This guarantees deterministic behavior by the GWT Compiler:  if you recompile your application without changing code, the contents of the output will not change, and so the MD5 sums will remain the same.  Conversely, if you do change your source code, the output JavaScript code will likewise change, and so the MD5 sums and thus the filenames will change.
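
The naming scheme can be illustrated with Node.js's built-in crypto module.  This is a sketch of content-based naming in general, not the compiler's own code: the helper name cacheFileName, the sample contents, and the choice of upper-case hex digits are assumptions.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical helper: name an output file after the MD5 digest of its contents.
// Upper-casing the hex digest is an assumption made for this sketch.
function cacheFileName(contents) {
  const digest = nodeCrypto.createHash("md5").update(contents, "utf8").digest("hex");
  return digest.toUpperCase() + ".cache.html";
}

const before = cacheFileName("<html><script>/* compiled app v1 */</script></html>");
const again = cacheFileName("<html><script>/* compiled app v1 */</script></html>");
const after = cacheFileName("<html><script>/* compiled app v2 */</script></html>");

console.log(before === again); // true: same contents, same file name
console.log(before === after); // false: changed contents, new file name
```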

Because of this uniqueness guarantee, it is safe (and indeed preferable) for browsers to cache these files, which is reflected in their .cache.html file extension.
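
On the server side, this distinction usually translates into different HTTP caching headers for the two kinds of file.  The sketch below shows one way a server might pick a Cache-Control value from the file name.  The function name cacheControlFor and the specific header values are assumptions for illustration, not settings prescribed by GWT.

```javascript
// Hypothetical helper: choose a Cache-Control header value from a GWT
// output file name. The header values are illustrative assumptions.
function cacheControlFor(fileName) {
  if (fileName.endsWith(".nocache.html")) {
    // Regenerated under the same name on every compile: never cache.
    return "no-cache, no-store, must-revalidate";
  }
  if (fileName.endsWith(".cache.html")) {
    // Named after its contents: safe to cache for a long time (one year here).
    return "public, max-age=31536000";
  }
  return null; // other files: leave the server's default policy in place
}

console.log(cacheControlFor("com.company.app.MyApp.nocache.html"));
console.log(cacheControlFor("1A2B3C4D5E6F708192A3B4C5D6E7F809.cache.html"));
```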

Summary
That is the story behind the somewhat strange GWT file names.  There is indeed a method to the madness:  gwt.js loads the nocache file for Deferred Binding resolution, and the nocache file selects a cache file based on the execution context.