Triggered cache manager for HTTP Server

This topic provides information about the triggered cache manager, page assembly, and wrappers.

Important: Information for this topic supports the latest PTF levels for HTTP Server for i5/OS. It is recommended that you install the latest PTFs to upgrade to the latest level of the HTTP Server for i5/OS. Some of the topics documented here are not available prior to this update. See http://www.ibm.com/servers/eserver/iseries/software/http/services/service.htm for more information.

The triggered cache manager (TCM) function provides a means to manage cached copies of static and dynamically produced Web pages, including those generated by CGI programs, Net.Data®, or Java™ servlets. The triggered cache manager function does not provide a cache, but rather a cache content manager. Web pages are actually cached in such places as local file systems, proxy servers, or caching network routers. When used in conjunction with an application trigger, the triggered cache manager function keeps caches synchronized with the most current data while reducing the frequency of unnecessary dynamic page generation and cache synchronization. See Trigger messages for triggered cache manager on HTTP Server (powered by Apache) for more information.

These Web pages can consist of a single object or they can consist of several objects (or page fragments). Dynamically produced Web pages that embed page fragments are updated anytime an embedded page fragment changes.

For example, several Web pages are constructed by a Java servlet using data extracted from a DB2® database. These Web pages are in turn cached in several caching network routers. Web pages are served from the network router caches rather than from the HTTP server running the Java servlet. When the database data is changed, an application sends a trigger message to the triggered cache manager function. Through the use of object dependency graphs, the triggered cache manager function determines which Web pages need to be updated. The triggered cache manager function then requests new versions of each affected Web page from the Java servlet and updates the cached copies in the caching network routers.

In this example, Web site performance is greatly increased by serving cached copies of dynamically produced Web pages rather than generating a new Web page for each request. Web site consistency and performance are further improved by using an application trigger, with the triggered cache manager function, to update cached Web pages only when the underlying data changes. This avoids unnecessary and costly cache updates and synchronization.

The triggered cache manager function is most effective for a Web site that has a large number of requests for content that is somewhat constant, but with variables that change frequently. An example of this might be a Web site that serves an on-line catalog that contains price and inventory information (the product information is static, but the price and inventory information changes frequently). One of IBM's first uses of the triggered cache manager function was to drive the 1998 Winter Olympic Games Web site.

Page assembly

Often, advanced Web sites contain information that appears on more than one page. If the information changes, all of the pages with that information need to be updated. There are several potentially difficult problems associated with this:

It can be extremely difficult to find and update all affected pages, especially as a Web site grows in complexity.
Information with a tendency to change also has a tendency to be expensive to maintain; for example, database activity might be required to effect an update.

One solution to these problems is to compose a Web site's pages from partial HTML fragments. Each fragment is unique, and any page that contains its information acquires it by embedding the fragment. This can lead to more flexible, diverse, and complex Web sites.

The triggered cache manager function provides a way to assemble Web pages from a set of fragments. If a fragment changes, the author of the fragment needs only to publish that fragment; the triggered cache manager finds all the affected pages (data sources), rebuilds them, and copies the updated pages to the configured delivery locations (cache targets).

To take advantage of this facility, HTML fragments must indicate which other fragments are to be embedded. This is done with a simple tag very similar to a Server Side Includes (SSI) directive. These tags are used for two purposes:

To determine the dependency relationships among HTML fragments (dependency parsing).
To physically construct pages from the fragments (page assembly).

A publish trigger handler is used to accomplish this task.

The tag used to specify that a fragment is to be included is specified in HTML as follows. The keyword %fragment is chosen to avoid conflicts with SSI.

<!-- %fragment (/source-name, /default-name) -->

Notes:

The source-name is the name of the embedded fragment, relative to the data source specified in the server configuration. The data source is searched in this order:
1. Among the objects triggered.
2. Among objects in the assembled directory within the repository of published objects maintained by the triggered cache manager function. These are objects that might have been fetched as a result of previous triggers and correspond to the assembled versions of fragments, intermediate results, and final publishable pages.
  If the data source is a file system, the source-name is a file name. If the data source is HTTP, the source-name is the file name portion of a URL.
The default-name is the name of another fragment that might be used as a default when a specified fragment cannot be found.
Nesting is supported. The triggered cache manager function uses the object dependency graph (ODG) specified in the server configuration to start the page assembler in the correct order to build pages.
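
The notes above can be illustrated with a small sketch of dependency parsing and page assembly. This is not the triggered cache manager's implementation: the names FRAGMENT_TAG, parse_dependencies, lookup, and assemble are hypothetical, the two dictionaries stand in for the triggered objects and the assembled directory, and the default-name argument is assumed to be optional. The sketch also expands nested fragments by simple recursion, whereas the triggered cache manager function uses the ODG to order assembly.

```python
import re

# Matches <!-- %fragment(source) --> and <!-- %fragment (source, default) -->.
FRAGMENT_TAG = re.compile(
    r"<!--\s*%fragment\s*\(\s*([^,()\s]+)\s*(?:,\s*([^,()\s]+)\s*)?\)\s*-->"
)


def parse_dependencies(html):
    """Return (source_name, default_name or None) for each %fragment tag."""
    return [(m.group(1), m.group(2)) for m in FRAGMENT_TAG.finditer(html)]


def lookup(name, triggered, assembled):
    """Search the triggered objects first, then the assembled repository."""
    if name in triggered:
        return triggered[name]
    return assembled.get(name)


def assemble(name, triggered, assembled, _stack=()):
    """Recursively replace each %fragment tag with the fragment it names."""
    if name in _stack:
        raise ValueError("cyclic embedding: " + " -> ".join(_stack + (name,)))
    body = lookup(name, triggered, assembled)
    if body is None:
        raise KeyError(name)

    def expand(match):
        source, default = match.group(1), match.group(2)
        # Fall back to the default fragment when the source cannot be found.
        for candidate in (source, default):
            if candidate is not None and lookup(candidate, triggered, assembled) is not None:
                return assemble(candidate, triggered, assembled, _stack + (name,))
        raise KeyError(source)

    return FRAGMENT_TAG.sub(expand, body)


# Hypothetical data: /B.html is missing, so its default /none.html is used.
triggered = {
    "/A.html": "<p>A</p><!-- %fragment (/B.html, /none.html) --><!-- %fragment(/C.html) -->"
}
assembled = {"/none.html": "<p>no B</p>", "/C.html": "<p>C</p>"}
page = assemble("/A.html", triggered, assembled)
```

Using one regular expression for both steps mirrors the point made earlier that the same tags drive dependency parsing and page assembly.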

For example, when one fragment, fragment A.html, embeds another fragment, fragment B.html, it is said that fragment A.html is dependent on fragment B.html. This is denoted as either B --> A or as the ordered pair (B, A). It is said that there is a dependency relationship between B and A. Suppose that A also embeds another fragment, fragment C.html, and that fragment C.html in turn embeds fragment D.html.

Suppose further that some other fragment, fragment E.html, also embeds fragment C.html. These relationships can be represented as a directed graph called the object dependency graph (ODG). The object dependency graph described here is for fragment A.html and fragment E.html. The HTML fragments that describe this look something like the following:

Fragment A.html:
<html> ... <!-- %fragment(B.html) --> <!-- %fragment(C.html) --> ... </html>

Fragment C.html:
<html> ... <!-- %fragment(D.html) --> ... </html>

Fragment E.html:
<html> ... <!-- %fragment(C.html) --> ... </html>

When the triggered cache manager function is instructed to publish fragment C.html, it determines that C.html embeds D.html (D --> C). Similarly, when A.html and E.html are published, it determines B --> A, C --> A, and C --> E. These relationships are automatically entered into the object dependency graph when the fragments are processed by the publish trigger handler. If D.html is republished at a later time, the publish trigger handler can determine, by examining the ODG, that C.html, A.html, and E.html must all be rebuilt but that the old copy of B.html can be used. Similarly, if only B.html changes, only A.html must be rebuilt; none of C.html, D.html, or E.html is affected. The entire process of discovering dependencies, updating the object dependency graph, and composing the correct pages, and only the correct pages, is performed automatically by the publish trigger handler.
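
The rebuild decision described above can be sketched as a reachability search over the dependency edges. The class and method names below (ObjectDependencyGraph, add_dependency, affected_by, rebuild_order) are hypothetical and are not part of the triggered cache manager function; the sketch assumes Python 3.9 or later for the standard graphlib module.

```python
from collections import defaultdict
from graphlib import TopologicalSorter


class ObjectDependencyGraph:
    def __init__(self):
        # Maps an embedded fragment to the fragments that embed it.
        self._dependents = defaultdict(set)

    def add_dependency(self, embedded, embedder):
        """Record the edge embedded --> embedder."""
        self._dependents[embedded].add(embedder)

    def affected_by(self, changed):
        """Return every fragment that directly or indirectly embeds `changed`."""
        affected, stack = set(), [changed]
        while stack:
            for dependent in self._dependents.get(stack.pop(), ()):
                if dependent not in affected:
                    affected.add(dependent)
                    stack.append(dependent)
        return affected

    def rebuild_order(self, changed):
        """Order the affected fragments so embedded ones are rebuilt first."""
        affected = self.affected_by(changed)
        sorter = TopologicalSorter()
        for node in affected:
            sorter.add(node)
        for embedded in affected:
            for embedder in self._dependents.get(embedded, ()):
                sorter.add(embedder, embedded)  # embedder is built after embedded
        return list(sorter.static_order())


# The edges from the example: B --> A, C --> A, D --> C, C --> E.
odg = ObjectDependencyGraph()
for embedded, embedder in [("B.html", "A.html"), ("C.html", "A.html"),
                           ("D.html", "C.html"), ("C.html", "E.html")]:
    odg.add_dependency(embedded, embedder)

changed_d = odg.affected_by("D.html")   # C.html, A.html, and E.html
changed_b = odg.affected_by("B.html")   # A.html only
order = odg.rebuild_order("D.html")     # C.html first, then A.html and E.html
```

Storing the edges in the direction embedded --> embedder makes the "what must be rebuilt" question a forward traversal from the changed fragment.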

Figures: graphical representation of fragment A.html; graphical representation of fragment E.html; graphical representation of the ODG for fragments A.html and E.html.

Wrappers

In some cases, most notably when HTML is generated automatically by HTML editors, it might be undesirable to use the full HTML content in a fragment. Such applications might insist on producing <html> and <body> tags, for example, which make the fragment nearly unusable. To manage this, fragments are also parsed for wrappers.

A wrapper directive begins with the following tag:

<!-- %begin-fragment( name ) -->

A wrapper directive ends with the following tag:

<!-- %end-fragment -->

When these are encountered within a document, all text before the initial tag and following the final tag is discarded. Only the text between these tags is actually used during page composition. The selected text can be given a name other than the name of the file in which it occurs. This permits multiple fragments to exist in the same file. That is, if multiple %begin-fragment and %end-fragment tags are found, the file is treated as multiple files for the purpose of composing the page and managing the object dependency graph.
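
As a rough illustration of the wrapper behavior described above, the following sketch splits one file into its named fragments. The names WRAPPER and split_wrapped_fragments are hypothetical, the sketch assumes that every %begin-fragment tag carries a name, and it keeps the text between the tags verbatim; the actual parser might treat whitespace or unnamed wrappers differently.

```python
import re

# Captures the fragment name and the text between the wrapper tags.
WRAPPER = re.compile(
    r"<!--\s*%begin-fragment\s*\(\s*([^()\s]+)\s*\)\s*-->"
    r"(.*?)"
    r"<!--\s*%end-fragment\s*-->",
    re.DOTALL,
)


def split_wrapped_fragments(file_name, text):
    """Return {fragment_name: body}; a file without wrappers is one fragment."""
    wrapped = {m.group(1): m.group(2) for m in WRAPPER.finditer(text)}
    return wrapped if wrapped else {file_name: text}


# Hypothetical editor output: the <html> and <body> tags are discarded.
editor_output = (
    "<html><body>\n"
    "<!-- %begin-fragment( header ) --><h1>Title</h1><!-- %end-fragment -->\n"
    "discarded text\n"
    "<!-- %begin-fragment( footer ) --><p>bye</p><!-- %end-fragment -->\n"
    "</body></html>"
)
fragments = split_wrapped_fragments("page.html", editor_output)
```

Each key in the result would then be treated as a separate fragment for page composition and for the object dependency graph.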