What's with all the cache/nocache stuff and weird filenames?

During its bootstrap process, a Google Web Toolkit application loads a series of sometimes oddly named files. These files, generated by the GWT Compiler, often seem strange to new users. To deploy a GWT application effectively, however, you need to understand these files so that you can place them appropriately on the web server.

These are the important files produced by the GWT Compiler:
  • <Module Name>.nocache.js (or <Module Name>-xs.nocache.js for cross-site script inclusion)
  • <Alphanumeric>.cache.html
  • <Alphanumeric>.gwt.rpc
Each of these files is described below. First, however, it's important to understand Deferred Binding, since that notion is at the heart of the bootstrap process; you might want to read about it before continuing.

Before explaining what each file does, it's useful to summarize the overall bootstrap procedure for a GWT application:
  1. The browser loads and processes the host HTML page.
  2. When the browser encounters the page's <script src="<Module Name>.nocache.js"> tag, it immediately downloads and executes the JavaScript code in the file.
  3. The .nocache.js file contains JavaScript code that resolves the Deferred Binding configurations (such as browser detection) and then uses a lookup table generated by the GWT Compiler to locate one of the .cache.html files to use.
  4. The JavaScript code in .nocache.js then creates a hidden <iframe>, inserts it into the host page's DOM, and loads the .cache.html file into that iframe.
  5. The .cache.html file contains the actual program logic of the GWT application.

That's the process in a nutshell. The sections below describe each file in detail.

The .nocache.js File
In previous versions of GWT, the bootstrap process included the gwt.js file, which kicked off the startup procedure. Its responsibility was to scan the host HTML page and gather the information required to locate the next phase in the bootstrap process. Although the formulation using gwt.js still works, it is no longer necessary, as much of what used to be done in gwt.js has now been factored into the .nocache.js file.

The "nocache" file is where Deferred Binding occurs. Before the application can run, any dynamically-bound code must be resolved. This might include browser-specific versions of classes, the specific set of string constants appropriate to the user's selected language, and so on. In Java, this would be handled by simply loading an appropriate service-provider class that implements a particular interface. To maximize performance and minimize download size, however, GWT does this selection up-front in the "nocache" file.

The file is named ".nocache.js" to indicate that it should never be cached. That is, it must be downloaded and executed again each time the browser starts the GWT application. It must be re-downloaded each time because the GWT Compiler regenerates it on every compile, but under the same file name. If browsers were allowed to cache the file, they might not download the new version after the GWT application was recompiled and redeployed on the server.

One of the key features of the "nocache.js" file is a lookup table that maps Deferred Binding permutations to .cache.html filenames. For example, "Firefox in English" and "Opera in French" would both be entries in the lookup table, pointing to different .cache.html files.
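The lookup described above can be pictured as a map keyed by the resolved Deferred Binding properties. The following is a minimal Java sketch of that idea only; the real table is JavaScript emitted by the GWT Compiler, and the class name, the key format, and the file names used here are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of a permutation-to-file lookup table (not GWT code). */
final class PermutationLookup {
    // Maps a "userAgent|locale" key to a .cache.html file name.
    private final Map<String, String> table = new HashMap<>();

    /** Records which compiled file serves a given permutation. */
    void register(String userAgent, String locale, String cacheFile) {
        table.put(key(userAgent, locale), cacheFile);
    }

    /** Returns the file for the permutation, or null if none was compiled. */
    String select(String userAgent, String locale) {
        return table.get(key(userAgent, locale));
    }

    private static String key(String userAgent, String locale) {
        return userAgent + "|" + locale;
    }

    public static void main(String[] args) {
        PermutationLookup lookup = new PermutationLookup();
        // Hypothetical file names standing in for <Alphanumeric>.cache.html.
        lookup.register("gecko", "en", "AAAA1111.cache.html"); // "Firefox in English"
        lookup.register("opera", "fr", "BBBB2222.cache.html"); // "Opera in French"
        System.out.println(lookup.select("opera", "fr"));
    }
}
```

Each combination of property values gets its own entry, which is why two users on different browsers or in different languages can end up loading different .cache.html files from the same host page.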

The .cache.html Files
The "cache" files contain your application's logic. If you were to look inside a .cache.html file, you would see that it is JavaScript code wrapped in a thin HTML wrapper. You might wonder why the GWT Compiler doesn't simply emit it as a JavaScript .js file. The reason is that certain browsers do not correctly handle compression of pure-JavaScript files in some circumstances, so users unfortunate enough to be using such a browser would download the .js file uncompressed. Since the GWT mantra is no-compromise, high-performance AJAX code, the GWT Compiler wraps the JavaScript in an HTML file to work around this browser quirk.

The .cache.html files are named according to the MD5 sum of their contents. This guarantees deterministic behavior by the GWT Compiler: if you recompile your application without changing code, the contents of the output will not change, and so the MD5 sums will remain the same. Conversely, if you do change your source code, the output JavaScript code will likewise change, and so the MD5 sums and thus the filenames will change.
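Content-based naming of this kind can be sketched in a few lines of Java using the standard MessageDigest class. This is only an illustration of the principle: the exact bytes the GWT Compiler hashes and how it formats the name are not specified here, so the UTF-8 encoding, the uppercase hex format, and the ContentNaming class are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative content-based file naming (not the GWT Compiler's own code). */
final class ContentNaming {
    /** Derives a file name from the MD5 sum of the given compiled output. */
    static String cacheFileName(String compiledOutput) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(compiledOutput.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02X", b)); // two hex digits per byte
            }
            return hex + ".cache.html";
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String v1 = cacheFileName("function run() { return 1; }");
        String v2 = cacheFileName("function run() { return 2; }");
        System.out.println(v1.equals(cacheFileName("function run() { return 1; }"))); // same input, same name
        System.out.println(v1.equals(v2)); // changed input, different name
    }
}
```

Because the name is a pure function of the contents, an unchanged build keeps its file names, while any change to the output produces a new name that browsers have never cached.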

Because of this uniqueness guarantee, it is safe (and indeed preferable) for browsers to cache these files, which is reflected in their .cache.html file extension.
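On the server side, the two naming conventions translate naturally into two caching policies. The sketch below shows one way to choose a Cache-Control header value from the file name. The CachePolicy class and the specific header values are assumptions for illustration, not something GWT mandates; in practice this logic would live in your web server configuration or a servlet filter.

```java
/** Illustrative mapping from GWT output file names to Cache-Control values. */
final class CachePolicy {
    /**
     * Returns a suggested Cache-Control value, or null to leave the
     * server's default behavior in place for other files.
     */
    static String cacheControlFor(String fileName) {
        if (fileName.contains(".nocache.")) {
            // Regenerated under the same name on every compile: never cache.
            return "no-cache, no-store, must-revalidate";
        }
        if (fileName.contains(".cache.")) {
            // Named after its contents: safe to cache for a long time (one year here).
            return "public, max-age=31536000";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(cacheControlFor("myapp.nocache.js"));
        System.out.println(cacheControlFor("AAAA1111.cache.html"));
    }
}
```

The file names in main are hypothetical placeholders for <Module Name>.nocache.js and <Alphanumeric>.cache.html.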

The .gwt.rpc File
In previous versions of GWT, if your application used GWT RPC, the types that you wanted to serialize across the wire had to implement the IsSerializable interface. In GWT 1.4, types that implement the java.io.Serializable interface now also qualify for serialization over RPC, with some conditions.

One of these conditions is that the types that you would like to serialize over the wire must be included in the .gwt.rpc file generated by the GWT Compiler. The .gwt.rpc file serves as a serialization policy that indicates which types implementing java.io.Serializable are allowed to be serialized over the wire. For more details on this and the other conditions for using Serializable types in GWT RPC, see the related FAQ on that topic.
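Conceptually, a serialization policy acts as a whitelist that is consulted before a type is sent over the wire. The following Java sketch models only that idea. It does not reflect the actual .gwt.rpc file format or GWT's own policy classes; the WhitelistPolicy class and its methods are hypothetical.

```java
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

/** Illustrative whitelist of serializable types (not GWT's policy API). */
final class WhitelistPolicy {
    // Fully qualified names of the types permitted on the wire.
    private final Set<String> allowed = new HashSet<>();

    /** Adds a type to the whitelist; only Serializable types can be listed. */
    void allow(Class<? extends Serializable> type) {
        allowed.add(type.getName());
    }

    /** Returns true only if the type is Serializable and was whitelisted. */
    boolean isAllowed(Class<?> type) {
        return Serializable.class.isAssignableFrom(type)
                && allowed.contains(type.getName());
    }

    public static void main(String[] args) {
        WhitelistPolicy policy = new WhitelistPolicy();
        policy.allow(String.class);
        System.out.println(policy.isAllowed(String.class));         // listed
        System.out.println(policy.isAllowed(java.util.Date.class)); // Serializable but not listed
    }
}
```

The point of the second check in main is that implementing java.io.Serializable is not sufficient on its own: a type must also appear in the policy.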

Summary
That is the story behind the somewhat strange GWT file names. There is indeed a method to the madness: the .nocache.js file performs the Deferred Binding resolution and selects a .cache.html file based on the execution context.