(See also common preferences panel help.)
Using the Grail Cache
About the cache
Grail uses a persistent disk cache to reduce the cost of loading
resources across the network. In many cases, a copy of the resource is
kept on your local disk so that future access can use the local copy
instead of making a network connection. Resources like HTML pages and
in-line images are good candidates for caching.
Grail is conservative about making caching decisions: it does not,
for example, cache resources that are created dynamically, and it
allows you to specify how long a resource should stay in the cache
before a new copy is retrieved. Under the factory default settings,
the cache stores 1 megabyte of recently used data in your .grail
directory.
A detailed discussion of how Grail caches items appears below. If
you suspect that your browser is displaying a cached resource that is
out of date, you can always use the Reload command to force Grail to
retrieve a new copy.
Preferences for controlling the cache
- Size
- The maximum amount of data that can be stored in the cache. If a
new resource is loaded and the cache is already full, an older
resource is deleted from the cache. (Also see the note below about the size
of the cache's log.)
- Directory
- The directory where cached resources are stored. If the
directory's path is not absolute, it is interpreted as being relative
to your .grail directory.
- When you change cache directories, the old directory and its cached
resources are preserved and the new directory is checked for an
existing cache. If no cache is found in the new directory, a new one
is created. (Thus, you could have two different caches and switch
between them.)
- Erase cache now
- Clicking this button will erase the entire contents of the current
cache directory and start a new cache.
- Repair cache
- Grail keeps a log of all the resources in the cache (in a text file
called 'LOG'). Under unusual circumstances, it is possible that a
resource will be stored in the cache directory but will not be listed
in the log; this wastes disk space, but shouldn't have any other ill
effects. Clicking the repair button will delete any file in the cache
directory that isn't listed in the log.
- This "lost file" problem is only known to occur when you run more
than one Grail process simultaneously. (But you probably don't want to
do that; see the note below.)
- Verify document
- The verify document preference controls what Grail does to ensure
that the cached resources are kept up-to-date. When Grail verifies a
document, it sends a standard HTTP request that uses the
'If-Modified-Since' header, which asks the server to return a copy of
the resource only if it has changed since the cached copy was made.
- Verify: Always
- Every time you look at a page, Grail checks with the server.
- Verify: Once per session
- A resource is checked the first time it is used during any
session. (A session starts when you start a new Grail process or when
you change cache directories.)
- Verify: Never
- Always assume that the cached copy is up-to-date. To get a new
copy of the resource from the server, you must use the Reload command.
- Verify: Every Q hours
- Check with the server if more than Q hours have elapsed since the
last time the server was asked about a particular resource. (Q can be a
fractional part of an hour.)
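The four verification settings above can be read as a small decision
function. The following is only a sketch, not Grail's actual code: the
policy names, the function names, and the idea of storing a "last
checked" timestamp per resource are all assumptions made for
illustration. It also shows how an 'If-Modified-Since' header could be
built from the time the cached copy was made.

```python
import time
from email.utils import formatdate

# Hypothetical policy names; Grail's real preference values may differ.
ALWAYS = "always"
ONCE_PER_SESSION = "session"
NEVER = "never"
EVERY_Q_HOURS = "hours"


def needs_verification(policy, last_checked, session_start,
                       q_hours=0.0, now=None):
    """Return True if the cached copy should be checked with the server.

    last_checked, session_start, and now are seconds since the epoch.
    """
    if now is None:
        now = time.time()
    if policy == ALWAYS:
        return True
    if policy == NEVER:
        return False
    if policy == ONCE_PER_SESSION:
        # Check only if the last check happened before this session began.
        return last_checked < session_start
    if policy == EVERY_Q_HOURS:
        # Q may be fractional, e.g. 0.5 for thirty minutes.
        return now - last_checked > q_hours * 3600
    raise ValueError("unknown verification policy: %r" % (policy,))


def conditional_headers(cached_at):
    """Build the header for a conditional request (cached_at is epoch seconds)."""
    return {"If-Modified-Since": formatdate(cached_at, usegmt=True)}
```

With "Verify: Never", needs_verification always returns False, which
matches the rule that only the Reload command fetches a new copy.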
Detailed explanation of caching decisions
Caching World Wide Web resources introduces a number of
complexities that can make it difficult to determine if the page you
see in your browser is the same as the page that is currently on the
original server. (Proxy servers that implement caching further
complicate the issue.) The following list provides a detailed
discussion of how Grail decides when to cache objects and what steps
it takes to verify that cached copies of resources are not
out-of-date.
When to cache an object?
- Only cache objects referenced by http, ftp, or hdl URLs.
- Do not cache a resource if the URL includes a query or if the
resource was retrieved with an HTTP post.
- Do not cache an item if the HTTP server includes a "Pragma:
no-cache" header in its response.
- An HTTP server can also attach a specific expiration date to a
resource, using the "Expires:" header. Expires headers are honored.
- Do not cache a resource that is larger than one quarter of the
total space allotted to the cache.
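The rules above can be sketched as a single predicate. This is an
illustrative reading of the list, not Grail's implementation: the
function name, its parameters, and the assumption that response headers
arrive as a dictionary with lower-case keys are all hypothetical. The
Expires check is left out here because it depends on date parsing.

```python
from urllib.parse import urlsplit

# Schemes named in the list above.
CACHEABLE_SCHEMES = {"http", "ftp", "hdl"}


def should_cache(url, method, response_headers, size, cache_capacity):
    """Return True if a freshly loaded resource is a candidate for caching.

    response_headers is assumed to be a dict with lower-case header names;
    size and cache_capacity are in bytes.
    """
    parts = urlsplit(url)
    if parts.scheme not in CACHEABLE_SCHEMES:
        return False
    if parts.query:
        # URLs with a query usually name dynamically created resources.
        return False
    if method.upper() == "POST":
        return False
    if "no-cache" in response_headers.get("pragma", "").lower():
        return False
    if size > cache_capacity / 4:
        # A single resource may not take more than a quarter of the cache.
        return False
    return True
```

Under the default 1 megabyte cache, for example, this sketch would
refuse to cache any single resource larger than 256 kilobytes.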
Warnings and bugs
- Running more than one copy of Grail. If you run
more than one Grail process simultaneously (but not if you have more
than one browser window open using a single process), the
cache will get very confused. The two processes will fight over the
cache, and the last one to write to disk will win.
- The size of the cache log. The cache size is
measured without considering the size of the cache's log. If you set
the cache size to 100KB, for example, it is possible for the cache
directory to contain 100KB of cached resources plus an
arbitrarily large log file. The log can become rather large during a
long session, but it is compacted at the end of each session.
- Does not verify in-line images. When a page
contains in-line images, they are not verified even when the
containing page is verified. However, using the Reload command will
load new copies -- and if a server returns a new copy of a resource
being verified, any in-line images it contains will be reloaded.
- Image cache leaks memory. Images are handled
differently than other Web resources, and this release of Grail still
has some problems with them. Grail keeps a special cache of Tk image
objects, but cached images are never deleted; so as you load more and
more images into Grail, the application uses more and more memory. We
intend to fix this problem.
- HTTP servers and bad dates. Grail tries to steer
clear of your personal life, so if you want to date an HTTP
server... Seriously, some HTTP servers produce poorly formatted date
strings in HTTP responses. Some can be parsed, but are ambiguous
(e.g. a two-digit year field), while others cannot even be parsed. If
a date string can't be parsed, it is interpreted as 0 seconds (after
the epoch), which will either prevent the resource from being cached
(if it is an Expires: date) or cause it to always be marked as
out-of-date.
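The fallback described in the last item can be sketched as follows.
This is a minimal illustration using Python's standard email.utils
module, not the parser Grail itself uses; the function names
parse_http_date and is_expired are hypothetical.

```python
from email.utils import mktime_tz, parsedate_tz


def parse_http_date(value):
    """Return seconds since the epoch, or 0 if the date cannot be parsed."""
    parsed = parsedate_tz(value)
    if parsed is None:
        return 0
    try:
        return mktime_tz(parsed)
    except (ValueError, OverflowError):
        # Fields were recognized but do not form a representable date.
        return 0


def is_expired(expires_value, now):
    """Treat a resource as expired if its Expires time is not in the future."""
    return parse_http_date(expires_value) <= now
```

Because an unparseable date becomes 0, is_expired reports True for it
at any current time, which is why a bad Expires: date keeps a resource
out of the cache.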