Writing File Type Plug-Ins for Grail


Introduction

Web servers can return files of many different types. The HTTP protocol uses the naming mechanism for file types that was pioneered by the MIME standard (RFC 1521): file types are of the form major/minor where major is a general "class" of file types, e.g. text, image or audio; and minor is a specific subtype of that class. Here are some common file types:

text/plain
a regular text file without mark-up
text/html
a text file with HTML mark-up
image/gif
an image file encoded using the GIF standard for image files
image/jpeg
an image file encoded using the JPEG standard
audio/basic
a sound file encoded using the digital telephony standard mu-LAW
application/octet-stream
a stream of bytes ("octets" in politically correct ISO-speak) with no further (known) interpretation implied

Grail has built-in support for a small number of file types: text/html, text/plain, and (as a special case) text/*, meaning any subtype of the text class. There may be other types for which you would like to see support. If Grail doesn't have support for a file's type, it will offer to save the file for you.

If a file type is best handled by passing it to an external application (e.g. an image viewer), you should use the MAILCAP mechanism (RFC 1524).

If a file type could be displayed in a multi-font text window using a suitable parser (like HTML), you can provide a parser yourself by writing a File Type Plug-In module.

Location of File Type Plug-In Modules

A File Type Plug-In module for a file type major/minor is imported the first time a file of that type is encountered. The module file, residing in the Grail file type plug-ins directory or in the Grail source directory, should be named major_minor.py, with any non-alphanumeric characters in the file type replaced by underscores. For example, to provide a parser for type text/sgml you'd create a module text_sgml.py.

It is also possible to create a module that provides a parser for all file types of a particular class (except for those subtypes for which an explicit parser has been provided). In this case the module should be called major.py. E.g. to provide a parser for all subtypes of class text, you'd create a module text.py.
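The naming rules above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not Grail's actual lookup code; the function name plugin_module_names is made up for this example.

```python
import re


def plugin_module_names(file_type):
    """Return candidate plug-in module names, most specific first.

    Hypothetical helper: it only illustrates the naming rule in the
    text (non-alphanumeric characters become underscores, and a
    class-wide module named after the major type is the fallback).
    """
    major, _, minor = file_type.partition("/")

    def clean(name):
        # Replace every non-alphanumeric character with an underscore.
        return re.sub(r"[^A-Za-z0-9]", "_", name)

    names = []
    if minor:
        names.append(clean(major + "/" + minor))  # e.g. text_sgml
    names.append(clean(major))                    # e.g. text
    return names


candidates = plugin_module_names("text/sgml")
```

For text/sgml the candidates are text_sgml followed by text, so a module text_sgml.py would take precedence over a class-wide text.py.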

The Grail file type plug-ins directory is a subdirectory named filetypes of the Grail user directory. On Unix, the Grail user directory is given by the environment variable $GRAILDIR, or, if that variable is undefined or empty, the subdirectory named .grail of the user's home directory.
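As a rough sketch of the Unix rule just described, the plug-ins directory could be computed as follows. The function name grail_filetypes_dir and the use of $HOME to find the home directory are assumptions made for this example; Grail's own code may locate the directory differently.

```python
import os


def grail_filetypes_dir(environ=None):
    """Return the file type plug-ins directory under the Unix rule.

    Hypothetical helper, not part of Grail. `environ` defaults to the
    real process environment; passing a dict makes testing easier.
    """
    if environ is None:
        environ = os.environ
    grail_dir = environ.get("GRAILDIR")
    if not grail_dir:
        # $GRAILDIR is undefined or empty: fall back to ~/.grail.
        # Reading $HOME here is an assumption of this sketch.
        home = environ.get("HOME") or os.path.expanduser("~")
        grail_dir = os.path.join(home, ".grail")
    return os.path.join(grail_dir, "filetypes")


explicit = grail_filetypes_dir({"GRAILDIR": "/tmp/g"})
fallback = grail_filetypes_dir({"GRAILDIR": "", "HOME": "/home/u"})
```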

Contents of File Type Plug-In Modules

The File Type Plug-In module should define a parser class named parse_major_minor, or parse_major if the module is called major.py. The parser class is instantiated as follows:
    parse_major_minor(viewer, reload)

where viewer is the Viewer object that will display the actual text, and reload is a flag that is true if the file is being parsed in response to a "Reload" command by the user. In that case, parsers for composite file types that contain references to other objects (like embedded images or applets in HTML) may want to pass this flag on to the mechanism for loading those embedded objects.
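To make the naming and instantiation convention concrete, here is a small sketch of how a loader might find and instantiate the parser class once a module has been imported. The helper instantiate_parser is hypothetical and is not Grail's loader; the module object is built in memory only so that the example is self-contained.

```python
import types


def instantiate_parser(module, viewer, reload):
    """Look up parse_<module name> in the module and instantiate it.

    Hypothetical helper: it assumes a top-level module such as
    text_sgml, whose parser class is named parse_text_sgml.
    """
    parser_class = getattr(module, "parse_" + module.__name__)
    return parser_class(viewer, reload)


class parse_text_sgml:
    """Skeleton parser class, as it might appear in text_sgml.py."""

    def __init__(self, viewer, reload):
        self.viewer = viewer  # object that will display the text
        self.reload = reload  # true when the user asked for a reload

    def feed(self, data):
        pass

    def close(self):
        pass


# Stand-in for an imported text_sgml.py, created in memory.
text_sgml = types.ModuleType("text_sgml")
text_sgml.parse_text_sgml = parse_text_sgml

# viewer=None is a placeholder; a real Viewer object would go here.
parser = instantiate_parser(text_sgml, viewer=None, reload=False)
```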

Parser Interface

The parser class should define the following methods:
feed(data)
This method is called any number of times to "feed" the parser with string data. The length of the data string passed in is arbitrary: it is determined by buffering algorithms in the network and cannot be predicted. The parser should therefore cope with a data string that ends in an incomplete token, and buffer any unconsumed characters internally until the next call to feed() or close().
close()
This method is called after all data has been fed to the parser, indicating the end of the data. No more interaction with the parser will take place after this call has been made.
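The following sketch shows one way a parser might satisfy this interface, treating a line of text as its "token" and buffering any incomplete line between calls. The Viewer methods are not described in this document, so the example uses a stand-in class RecordingViewer with an assumed method send_literal_data; a real plug-in would call whatever the actual Viewer object provides.

```python
class RecordingViewer:
    """Hypothetical stand-in for the Viewer; it only records text."""

    def __init__(self):
        self.lines = []

    def send_literal_data(self, text):  # assumed method name
        self.lines.append(text)


class parse_text_sgml:
    """Line-oriented parser that buffers incomplete lines."""

    def __init__(self, viewer, reload):
        self.viewer = viewer
        self.reload = reload
        self.buffer = ""  # unconsumed characters from earlier calls

    def feed(self, data):
        self.buffer += data
        # Everything before the last newline forms complete lines;
        # whatever follows it is kept for the next feed() or close().
        head, sep, tail = self.buffer.rpartition("\n")
        if sep:
            for line in head.split("\n"):
                self.viewer.send_literal_data(line + "\n")
            self.buffer = tail

    def close(self):
        # End of data: flush any final line that lacked a newline.
        if self.buffer:
            self.viewer.send_literal_data(self.buffer)
            self.buffer = ""


viewer = RecordingViewer()
parser = parse_text_sgml(viewer, False)
parser.feed("first li")       # ends in the middle of a line
parser.feed("ne\nsecond")     # completes it and starts another
parser.close()
```

After these calls the viewer has received "first line\n" followed by "second". The chunk boundaries chosen by the network do not affect what is displayed.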