This topic describes the module mod_disk_cache for HTTP Server (powered by Apache).
The server may take each iteration of the disk cache maintenance process through one or two phases, depending on how much maintenance is needed. In the first phase, the server will examine the file system directories for the disk cache function and discard data that no longer complies with the current server configuration settings. It will also discard unused or unmodified data according to the criteria set by CacheGcClean or CacheGcUnused directives. File names and expiration times for the remaining data will be collected and the total amount of space allocated for them will be tallied. If the tally is above the maximum disk storage limit (set by CacheSize), the server will go into phase two. If the tally is at or below the maximum disk storage limit, the server will stop the current iteration of the maintenance process. If the server takes the current iteration into the second phase, information collected in the first phase for the remaining data is sorted according to cache expiry time. The server will then discard remaining data, by order of expiration (soonest to latest), until the amount of allocated space is at or below the maximum disk storage limit.
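The following steps summarize the disk cache maintenance process:
- Phase one: Discard data that no longer complies with the current server configuration settings, and data that meets the criteria set by the CacheGcClean or CacheGcUnused directives. Collect file names and expiration times for the remaining data and tally the space allocated for it. If the tally is at or below the limit set by CacheSize, stop.
- Phase two: Sort the collected information by expiration time and discard remaining data, soonest expiration first, until the allocated space is at or below the limit set by CacheSize.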
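The two phases can be sketched in Python as follows. This is only an illustration of the logic described above, not the server's implementation: the Entry class, its fields, and the run_maintenance function are hypothetical names, and sizes are plain integers rather than the kilobyte values used by CacheSize.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str            # file name of the cached item (hypothetical field)
    expires: float       # expiration time, in seconds since the epoch
    allocated: int       # space allocated for the item
    valid: bool = True   # False if phase one criteria say to discard it


def run_maintenance(entries, max_allocated):
    """Return (kept, discarded) after one maintenance iteration."""
    # Phase one: discard non-compliant data, then tally what remains.
    kept = [e for e in entries if e.valid]
    discarded = [e for e in entries if not e.valid]
    total = sum(e.allocated for e in kept)
    if total <= max_allocated:
        return kept, discarded  # no need for phase two

    # Phase two: discard by expiration, soonest first, until under the limit.
    kept.sort(key=lambda e: e.expires)
    while kept and total > max_allocated:
        victim = kept.pop(0)
        total -= victim.allocated
        discarded.append(victim)
    return kept, discarded
```

With three valid 400-unit entries and a limit of 800, phase two discards only the entry that expires soonest.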
Directives
Module: mod_disk_cache
Syntax: CacheDirLength length
Default: CacheDirLength 2
Context: server config, virtual host
Override: none
Origin: Apache
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheDirLength 4
The CacheDirLength directive specifies the number of characters in subdirectory names used by the disk cache function to store data.
- Parameter: length
- The length parameter specifies the number of characters in subdirectory names used by the disk cache function. The specified value multiplied by the value specified for the CacheDirLevels directive must be less than or equal to 20.
If the values specified for CacheDirLevels and CacheDirLength are changed once they have been used to cache data, the server will discard all existing cache data when it runs disk cache maintenance since the file paths used to store data no longer adhere to the new values. See the CacheGcDaily or CacheGcInterval directives for more details on disk cache maintenance.
Module: mod_disk_cache
Syntax: CacheDirLevels levels
Default: CacheDirLevels 3
Context: server config, virtual host
Override: none
Origin: Apache
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheDirLevels 3
The CacheDirLevels directive specifies the number of directory levels used by the disk cache function to store data.
- Parameter: levels
- The levels parameter specifies the number of directory levels used by the disk cache function. The specified value multiplied by the value specified for the CacheDirLength directive must be less than or equal to 20.
A hash algorithm is used to generate unique and seemingly random character strings from hash keys (or URLs) provided for data stored in cache. These character strings are used to build unique file system path names. Data is stored in the file system using these path names, relative to the directory root specified by the CacheRoot directive. This setting specifies how many directory levels are used, while the CacheDirLength directive specifies the length of each subdirectory name, with the remaining characters used for file names. The server uses the hash algorithm and directory levels to improve performance when working with a potentially large number of data files.
- Example 1
CacheRoot /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/MyCache
CacheDirLevels 3
CacheDirLength 1
The above example indicates that a hash key such as ftp://ibm.com/document.html may be used to build a directory path such as /x/3/_/9sj4t2svBA where x, 3, and _ are three subdirectory names (CacheDirLevels 3) each having a length of one character (CacheDirLength 1). The remaining characters, 9sj4t2svBA, are used for file names.
- Example 2
CacheRoot /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/MyCache
CacheDirLevels 5
CacheDirLength 2
The above example indicates that the same hash key described for example one (ftp://ibm.com/document.html) may be used to build a directory path such as /x3/_9/sj/4t/2s/vBA where x3, _9, sj, 4t, and 2s are five subdirectory names (CacheDirLevels 5) each having a length of two characters (CacheDirLength 2). The remaining characters, vBA, are used for file names.
Directory paths generated in this process are relative to the directory root defined by the CacheRoot directive. Therefore, for example one (above), two files, one named 9sj4t2svBA.data and the other named 9sj4t2svBA.header will be created to store data using the hash key ftp://ibm.com/document.html. Both files will reside within the /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/MyCache/x/3/_ directory. For example two (above), the two files will be named vBA.data and vBA.header and will reside within the /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/MyCache/x3/_9/sj/4t/2s directory using the same hash key.
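The way the two settings carve a hashed string into subdirectories and a file name can be sketched as below. The hash algorithm itself is not described here, so the function takes an already hashed string as input. The name cache_path is hypothetical, and the result is returned without a leading slash, relative to the CacheRoot directory.

```python
def cache_path(hashed, levels, length):
    """Split a hashed string into `levels` subdirectory names of `length`
    characters each; the remaining characters form the file name."""
    used = levels * length
    if used >= len(hashed):
        # Assumption: at least one character must remain for the file name.
        raise ValueError("hashed string too short for these settings")
    dirs = [hashed[i * length:(i + 1) * length] for i in range(levels)]
    return "/".join(dirs + [hashed[used:]])


# The hashed string from the examples above.
print(cache_path("x3_9sj4t2svBA", levels=3, length=1))  # x/3/_/9sj4t2svBA
print(cache_path("x3_9sj4t2svBA", levels=5, length=2))  # x3/_9/sj/4t/2s/vBA
```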
Directory length and level limits:
Since the hash algorithm generates an exponential number of directories using this schema, a limit must be set on the values that CacheDirLevels and CacheDirLength may have. The limit is as follows:
CacheDirLevels * CacheDirLength <= 20
The number of directory levels multiplied by the length of each subdirectory name must be less than or equal to 20. If it is not, the server will fail to activate at startup.
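A quick way to check a planned pair of values against this limit is sketched below. The function name is hypothetical; only the limit of 20 comes from the text above.

```python
MAX_PRODUCT = 20  # limit stated above: CacheDirLevels * CacheDirLength <= 20


def dir_settings_valid(levels, length):
    """Return True if the pair of settings satisfies the documented limit."""
    return levels * length <= MAX_PRODUCT


print(dir_settings_valid(3, 2))  # defaults: 3 * 2 = 6
print(dir_settings_valid(5, 2))  # example two: 5 * 2 = 10
print(dir_settings_valid(7, 3))  # 7 * 3 = 21, over the limit
```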
If the values specified for CacheDirLevels and CacheDirLength are changed once they have been used to cache data, the server will discard all existing cache data when it runs disk cache maintenance since the file paths used to store data no longer adhere to the new values. See the CacheGcDaily or CacheGcInterval directives for more details on disk cache maintenance.
Module: mod_disk_cache
Syntax: CacheGcClean hash-key-criteria period
Default: CacheGcClean * 2592000 (seconds, or 30 days)
Context: server config, virtual host
Override: none
Origin: iSeries™
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheGcClean http://www.ibm.com/* 1296000
The CacheGcClean directive specifies a complete URL or URL match expression and a maximum period value used to identify and remove data from cache that has not been updated (or written to cache) within the number of specified seconds. Multiple CacheGcClean directives are allowed. If disk cache maintenance is disabled, this setting has no effect and the cache may grow without bound, unless managed by some application or process other than the server.
This directive is similar to the CacheGcUnused directive. However, CacheGcClean considers when data was last written (or saved) to cache, not when it was last served from cache.
- Parameter One: hash-key-criteria
- The hash-key-criteria parameter accepts a complete URL or URL match expression used to identify cached data by hash key. Complete URLs do not contain asterisks (*) or question marks (?) and must match hash keys completely (see example two). URL match expressions contain one or more asterisks (*) or question marks (?) used as wildcards to match multiple hash keys. For example: http://ibm.com/*, *://ibm.com/*, or ftp://server?.ibm.com/* (see example one).
- Parameter Two: period
- The period parameter specifies the maximum amount of time (in seconds) that matched data may remain cached.
Cached data for the disk caching function is identified by comparing hash keys with the value specified for the hash-key-criteria parameter. Matched data that has not been updated (or written to cache) within the number of seconds specified by the corresponding period parameter is discarded by the server during phase one of the disk cache maintenance process. Matched data that has been updated within the number of specified seconds is not affected. Unmatched data is not affected. See Two Phase Disk Cache Maintenance for details concerning the disk cache maintenance process.
- Example 1: URL match expressions
CacheRoot serverCache
CacheGcClean *://ibm.com/* 2592000
CacheGcClean ftp://server?.ibm.com/* 1209600
For this example, the first CacheGcClean directive ensures that cached data with hash keys (or URLs) that match the expression *://ibm.com/* and that has not been updated within the past 2592000 seconds (or 30 days) is discarded during phase one of the disk cache maintenance process. The second CacheGcClean directive ensures that cached data with hash keys (or URLs) that match the expression ftp://server?.ibm.com/* and that has not been updated within the past 1209600 seconds (or 2 weeks) is discarded.
Example one uses CacheGcClean directives with URL match expressions to manage data stored in cache using the disk cache function (CacheRoot serverCache). For the expression *://ibm.com/*, the first wildcard (*) is used to match one or more characters in hash keys preceding the characters ://ibm.com/. The second wildcard (*) is used to match one or more characters succeeding the characters ://ibm.com/. Hash keys that match this expression include, for example, http://ibm.com/public/welcome.html and ftp://ibm.com/patch.zip. For the expression ftp://server?.ibm.com/*, the first wildcard (?) is used to match any single character between ftp://server and .ibm.com/. The second wildcard (*) is used to match one or more characters succeeding the characters .ibm.com/. Hash keys that match this expression include, for example, ftp://server1.ibm.com/whitepaper.pdf and ftp://server5.ibm.com/downloads/driver.exe.
- Example 2: Complete URL
CacheRoot serverCache
CacheGcClean ftp://server5.ibm.com/downloads/application.zip 432000
For this example, the CacheGcClean directive uses a complete URL to ensure that cached data with the hash key ftp://server5.ibm.com/downloads/application.zip is discarded during phase one of the disk cache maintenance process if it has not been updated within the past 432000 seconds (or 5 days). No other data will be matched since complete URLs identify a single hash key.
The server detects updates to cached data for the disk caching function by comparing the "Data change date/time" values of data file attributes. These are commonly referred to as last-modified times. When data is updated within cache, the corresponding last-modified times record the date and time that the last update was made.
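The wildcard rules described for hash-key-criteria can be sketched by translating an expression into a regular expression. This is an illustration only: the function names are hypothetical, and the sketch follows the wording above in treating * as one or more characters and ? as exactly one character. How the server actually evaluates expressions is not specified here.

```python
import re


def expression_to_regex(expr):
    """Translate a URL match expression into a compiled regular expression."""
    parts = []
    for ch in expr:
        if ch == "*":
            parts.append(".+")   # one or more characters, per the text above
        elif ch == "?":
            parts.append(".")    # any single character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts))


def matches(expr, hash_key):
    """Return True if the whole hash key matches the expression."""
    return expression_to_regex(expr).fullmatch(hash_key) is not None


print(matches("*://ibm.com/*", "ftp://ibm.com/patch.zip"))
print(matches("ftp://server?.ibm.com/*", "ftp://server1.ibm.com/whitepaper.pdf"))
print(matches("ftp://server?.ibm.com/*", "ftp://server12.ibm.com/whitepaper.pdf"))
```

A complete URL contains no wildcards, so under this translation it matches only the identical hash key.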
Module: mod_disk_cache
Syntax: CacheGcDaily time-of-day | off
Default: CacheGcDaily 03:00
Context: server config, virtual host
Override: none
Origin: iSeries
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheGcDaily 23
The CacheGcDaily directive specifies whether the server is to perform disk cache maintenance, at a particular time, when the disk cache function is enabled. If the disk cache function is disabled (the default), this setting has no effect and the server does not perform disk cache maintenance. The default value is 3:00 (3:00 am local system time).
- Parameter: time-of-day | off
- The time-of-day parameter accepts a value in the HH:MM:SS format (24 hour clock) where HH is an hour value (0 to 23), MM is a minute value (0 to 59), and SS is a second value (0 to 59). A minute (MM) or second (SS) value is not required. If a minute value is not specified, maintenance will commence at the beginning of the hour specified by the hour value (see example two). Likewise, if a second value is not specified, maintenance will commence at the specified number of minutes past the hour (see example one).
- If off is specified, maintenance will not be performed based on a particular time of day (see example three).
If off is not specified, the server will perform cache maintenance every day, starting at the specified local system time (if disk caching is enabled, see examples one and two). If off is specified, the server will not perform disk cache maintenance at a specific time of day, however it may perform disk cache maintenance at regular time intervals, if a maintenance period is set using the CacheGcInterval directive. If off is specified, and a maintenance period is not specified using CacheGcInterval, the server will never perform disk cache maintenance (see example three).
- Example 1
CacheRoot dataCache
CacheGcDaily 15:55
- Example 2
CacheRoot dataCache
CacheGcDaily 9
- Example 3
CacheRoot dataCache
CacheGcDaily off
For example one, the server will perform cache maintenance every day at 15:55 (or 3:55 pm local system time). For example two, the server will perform cache maintenance every day at 9:00 (or 9:00 am local system time). For example three, the server will not perform disk cache maintenance since CacheGcDaily is set to off, and CacheGcInterval is not specified.
See Two Phase Disk Cache Maintenance for details concerning the disk cache maintenance process.
CacheRoot dataCache
CacheGcDaily 01:30:00
<VirtualHost ...>
CacheGcDaily 02:30:00
</VirtualHost>
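The time-of-day format described above (HH, HH:MM, or HH:MM:SS on a 24 hour clock, or off) can be sketched with a small parser. The function name is hypothetical, and details such as whether the server accepts off in other letter cases are not covered by the text, so the sketch matches it exactly.

```python
def parse_time_of_day(value):
    """Parse HH[:MM[:SS]] into (hour, minute, second); 'off' returns None."""
    if value == "off":
        return None
    fields = [int(f) for f in value.split(":")]  # ValueError on non-numbers
    if not 1 <= len(fields) <= 3:
        raise ValueError("expected HH, HH:MM, or HH:MM:SS")
    # Missing minute or second values default to zero.
    hour, minute, second = fields + [0] * (3 - len(fields))
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time value out of range")
    return hour, minute, second


print(parse_time_of_day("15:55"))  # (15, 55, 0)
print(parse_time_of_day("9"))      # (9, 0, 0)
print(parse_time_of_day("off"))    # None
```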
Module: mod_disk_cache
Syntax: CacheGcInterval period
Default: none
Context: server config, virtual host
Override: none
Origin: Apache
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheGcInterval 8100
The CacheGcInterval directive specifies whether the server is to perform disk cache maintenance, at regular time intervals, when the disk cache function is enabled. Maintenance for this setting will commence at the time the server is started, and repeat every number of specified seconds, until the server is ended. If the disk cache function is disabled (the default), this setting has no effect and the server does not perform disk cache maintenance.
- Parameter: period
- The period parameter specifies a period for cache maintenance cycles, in seconds. The value may include a decimal to indicate fractional hours. For example, use CacheGcInterval 5400 to perform cache maintenance every 5400 seconds (every 90 minutes).
If this directive is not used (not specified), the server will not perform disk cache maintenance at regular time intervals, however it may at a particular time of day, if such a time is specified using the CacheGcDaily directive. If this directive is not used (not specified), and CacheGcDaily is set to off, the server will never perform disk cache maintenance (see example two).
- Example 1
CacheRoot dataCache
CacheGcInterval 9900
- Example 2
CacheRoot dataCache
CacheGcDaily off
For example one, the server will perform disk cache maintenance every 9900 seconds (every 2 hours and 45 minutes), starting from the time the server is started. For example two, the server will not perform disk cache maintenance since CacheGcDaily is set to off, and CacheGcInterval is not specified.
See Two Phase Disk Cache Maintenance for details concerning the disk cache maintenance process.
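How CacheRoot, CacheGcDaily, and CacheGcInterval combine to decide when maintenance runs can be sketched as below. The function and its return format are hypothetical. The defaults mirror the documented ones (CacheGcDaily 03:00, no CacheGcInterval), and the sketch assumes both triggers are active when both are set, which the text implies but does not state outright.

```python
def maintenance_triggers(cache_root, gc_daily="03:00", gc_interval=None):
    """List the conditions under which disk cache maintenance would run."""
    triggers = []
    if cache_root is None:
        return triggers  # disk cache function disabled: no maintenance
    if gc_daily != "off":
        triggers.append(f"daily at {gc_daily}")
    if gc_interval is not None:
        triggers.append(f"every {gc_interval} seconds")
    return triggers


print(maintenance_triggers("dataCache", gc_interval=9900))
print(maintenance_triggers("dataCache", gc_daily="off"))  # [] (never runs)
```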
Module: mod_disk_cache
Syntax: CacheGcMemUsage size
Default: CacheGcMemUsage 5000000
Context: server config, virtual host
Override: none
Origin: iSeries
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheGcMemUsage 3000000
The CacheGcMemUsage directive specifies the maximum amount of system memory, in bytes, the server is to use to collect information for phase two of the disk cache maintenance process. See Two Phase Disk Cache Maintenance for details concerning the disk cache maintenance process.
- Parameter: size
- The size parameter specifies, in bytes, the amount of main store memory that the server may use for phase two of the disk cache maintenance process.
When the amount of system memory consumed for phase two of the disk cache maintenance process reaches the value specified for the size parameter, the server stops collecting information for remaining data in cache but continues to do the other tasks for phase one until finished. If the server takes disk cache maintenance into phase two, only the information collected in phase one is used. This will not include information for all remaining cached data if the size parameter is not large enough.
- Example
CacheRoot dataCache
CacheGcDaily 5:00
CacheGcMemUsage 200000
For this example, the server will perform disk cache maintenance every day at 5:00 (CacheGcDaily 5:00). During phase one maintenance, the server records file names and expiration times for data remaining cached, until it consumes 200000 bytes of memory (CacheGcMemUsage 200000). After this limit is reached, the server continues to perform the other phase one tasks. After all phase one tasks are complete, the server performs phase two maintenance (if needed) using whatever information it was able to collect in phase one.
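The effect of the memory limit on phase one can be sketched as follows. The function name and the fixed per-entry memory cost are assumptions made for illustration; the actual amount of memory the server uses per cached item is not documented here.

```python
def phase_one_collect(entries, mem_limit, entry_cost):
    """Examine every entry, but record (name, expires) pairs only while
    the memory budget allows. Returns (collected, examined_count)."""
    collected = []
    used = 0
    examined = 0
    for name, expires in entries:
        examined += 1  # other phase one tasks continue for every entry
        if used + entry_cost <= mem_limit:
            collected.append((name, expires))
            used += entry_cost
    return collected, examined


entries = [("a", 10), ("b", 20), ("c", 30), ("d", 40), ("e", 50)]
collected, examined = phase_one_collect(entries, mem_limit=250, entry_cost=100)
print(collected)  # only the first two entries fit in the budget
print(examined)   # all five entries were still examined
```

Phase two can then only sort and discard the collected entries, which is why a size value that is too small leaves some cached data out of consideration.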
Module: mod_disk_cache
Syntax: CacheGcUnused hash-key-criteria period
Default: CacheGcUnused * 1209600 (seconds, or 2 weeks)
Context: server config, virtual host
Override: none
Origin: iSeries
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheGcUnused http://www.ibm.com/* 432000
The CacheGcUnused directive specifies a complete URL or URL match expression and a maximum period value used to identify and remove data from cache that has not been used (or served from cache) within the number of specified seconds. Multiple CacheGcUnused directives are allowed. If disk cache maintenance is disabled (see CacheGcDaily or CacheGcInterval), this setting has no effect and the cache may grow without bound, unless managed by some application or process other than the server itself.
This directive is similar to the CacheGcClean directive. However, CacheGcClean considers when data was last written (or saved) to cache, whereas CacheGcUnused considers when data was last served from cache.
- Parameter One: hash-key-criteria
- The hash-key-criteria parameter accepts a complete URL or URL match expression used to identify cache data by hash key. Complete URLs do not contain asterisks (*) or question marks (?) and must match hash keys completely (see example two). URL match expressions contain one or more asterisks (*) or question marks (?) as wildcards to match multiple hash keys. For example, http://* or ftp://server?.ibm.com/* (see example one).
- Parameter Two: period
- The period parameter specifies the maximum amount of time (in seconds) that matched data may remain cached.
Cached data for the disk caching function is identified for this setting by comparing hash keys with the value specified for the hash-key-criteria parameter. Matched data that has not been used (or served from cache) within the number of seconds specified by the corresponding period parameter is discarded by the server during phase one of the disk cache maintenance process. Matched data that has been used within the number of specified seconds is not affected. Unmatched data is not affected. See Two Phase Disk Cache Maintenance for details concerning the disk cache maintenance process.
- Example 1: URL match expressions
CacheRoot serverCache
CacheGcUnused http://* 2592000
CacheGcUnused ftp://server?.ibm.com/* 1209600
For this example, the first CacheGcUnused directive ensures that cached data with hash keys (or URLs) that match the expression http://* and that has not been used within the past 2592000 seconds (or 30 days) is discarded during phase one of the disk cache maintenance process. The second CacheGcUnused directive ensures that cached data with hash keys (or URLs) that match the expression ftp://server?.ibm.com/* and that has not been used within the past 1209600 seconds (or 2 weeks) is discarded.
Example one uses CacheGcUnused directives with URL match expressions to manage data stored in cache using the disk caching function (CacheRoot serverCache). For the expression http://*, the wildcard (*) is used to match one or more characters in hash keys succeeding the characters http://. This expression matches all hash keys starting with the characters http://. For the expression ftp://server?.ibm.com/*, the first wildcard (?) is used to match any single character in hash keys between ftp://server and .ibm.com/. The second wildcard (*) is used to match one or more characters in hash keys succeeding the characters .ibm.com/. Hash keys that match this expression include, for example, ftp://server1.ibm.com/whitepaper.pdf and ftp://server5.ibm.com/downloads/driver.exe.
- Example 2: Complete URL
ProxyRequests on
CacheRoot serverCache
CacheGcUnused ftp://server5.ibm.com/downloads/application.zip 432000
For this example, the CacheGcUnused directive uses a complete URL to ensure that cached data with the hash key ftp://server5.ibm.com/downloads/application.zip is discarded during phase one of the disk cache maintenance process if it has not been requested within the past 432000 seconds (or 5 days). No other data will be matched since complete URLs identify a single hash key.
The server detects requests for cached data for the disk caching function by comparing the "Last access date/time" values of data file attributes. These are commonly referred to as last-accessed times. When data is served from cache, the corresponding last-accessed times record the date and time that the last request was served.
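The difference between the two criteria comes down to which file time is compared against the period. A minimal sketch, assuming a POSIX-style file system where os.stat reports st_atime (last access) and st_mtime (last modification); the function name is hypothetical and this is not the server's code:

```python
import os
import time


def is_stale(path, period, attribute="st_atime", now=None):
    """Return True if the chosen file time is more than `period` seconds old.

    attribute="st_atime" mirrors the CacheGcUnused criterion (last served);
    attribute="st_mtime" mirrors the CacheGcClean criterion (last written).
    """
    now = time.time() if now is None else now
    last = getattr(os.stat(path), attribute)
    return now - last > period
```

A file that was written long ago but served recently would be stale by the st_mtime test and fresh by the st_atime test, so the two directives can reach different decisions for the same cached item.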
Module: mod_disk_cache
Syntax: CacheRoot directory
Default: none
Context: server config, virtual host
Override: none
Origin: Apache
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statement should be as follows: LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheRoot webProxyCache
The CacheRoot directive enables the disk cache function and specifies the name of the file system directory root. Setting this directive also enables disk cache maintenance for the CacheGcDaily directive, by default, and the CacheGcInterval directive. See the CacheGcDaily or CacheGcInterval directives for more details on disk cache maintenance.
- Parameter: directory
- The directory parameter accepts a file system path name to specify the file system directory root for the disk cache function (see directory root limits below).
The disk cache function provides underlying cache support for a local proxy cache and user written modules, using local file system space (disk space). The server must have *RWX data authorities and *ALL object authorities to the specified directory.
A hash algorithm is used to generate unique and seemingly random path names based on hash keys (or URLs) provided for data stored in cache (see CacheDirLength and CacheDirLevels). Data is stored in the local file system using these path names, relative to the specified directory root. Since the algorithm generates case sensitive path names, CacheRoot must specify a directory within the QOpenSys (case sensitive) file system. For this reason the following limits are placed on the directory root:
- Directory root limits:
- If the directory parameter specifies an absolute path it must start with /QOpenSys/QIBM/UserData/HTTPA/CacheRoot, otherwise the proxy will fail to activate at startup.
- If the directory parameter does not specify an absolute path (does not start with a '/'), it will be assumed to be relative to the following: /QOpenSys/QIBM/UserData/HTTPA/CacheRoot
The directory will be created if it does not exist prior to server startup. Only the last directory in the path will be created. All other directories in the path must already exist. For example, if "CacheRoot abc/def" is configured, the server will create directory "/QOpenSys/QIBM/UserData/HTTPA/CacheRoot/abc/def", provided that "/QOpenSys/QIBM/UserData/HTTPA/CacheRoot/abc" already exists.
Example 1: Absolute Path
CacheRoot /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/proxyCache
ProxyRequests on
Example 2: Relative Path
CacheRoot proxyCache
ProxyRequests on
Example 3: Relative Path (with disk cache function unavailable for proxy data)
CacheRoot cache
ProxyRequests on
Example 4: Bad Path
CacheRoot /MyServerCache
For example one, CacheRoot enables the disk cache function (CacheRoot /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/proxyCache), and ProxyRequests specifies that the proxy function is enabled to handle forward proxy requests (ProxyRequests on). With these directive settings, HTTP proxy response data is cached and maintained within the /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/proxyCache directory using disk cache function support. See the ProxyRequests directive for more information on handling proxy requests and caching HTTP proxy response data.
For example two, the disk cache function is enabled (CacheRoot proxyCache), the proxy function is enabled (ProxyRequests on), and the local proxy cache is enabled. With these directive settings, HTTP proxy response data is cached and maintained within the proxyCache directory, relative to the /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/ directory. This directory is the same one described in example one, simply specified as a relative path name rather than an absolute path name. Either specification is acceptable.
For example three, the disk cache function is enabled (CacheRoot cache), and the proxy function is enabled (ProxyRequests on), however the local proxy cache is disabled. With these directive settings, the disk cache function is not used to cache data for the proxy function, but may be used to cache data for user written modules.
For example four, the directory specified for CacheRoot is not valid since an absolute path within /QOpenSys/QIBM/UserData/HTTPA/CacheRoot/ is not specified. With this configuration the server will generate an error message(s) at startup and fail to activate.
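The directory root limits can be sketched as a small resolver. The function name is hypothetical, and the check follows the wording above literally (an absolute path must start with the fixed prefix); the server may apply stricter validation than this.

```python
import posixpath

BASE = "/QOpenSys/QIBM/UserData/HTTPA/CacheRoot"


def resolve_cache_root(directory):
    """Return the absolute directory root for a CacheRoot value."""
    if directory.startswith("/"):
        # Absolute paths must start with the fixed prefix.
        if not directory.startswith(BASE):
            raise ValueError("CacheRoot must be within " + BASE)
        return directory
    # Relative paths are taken relative to the fixed prefix.
    return posixpath.join(BASE, directory)


print(resolve_cache_root("proxyCache"))
print(resolve_cache_root(BASE + "/proxyCache"))  # same directory as above
```

Under this sketch, the value from example four (/MyServerCache) raises an error, which corresponds to the server failing to activate at startup.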
Module: mod_disk_cache
Syntax: CacheSize size
Default: CacheSize 1000000
Context: server config, virtual host
Override: none
Origin: Apache
Usage Considerations: A LoadModule is required in the configuration file prior to using the directive. The statements should be as follows:
LoadModule cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
LoadModule disk_cache_module /QSYS.LIB/QHTTPSVR.LIB/QZSRCORE.SRVPGM
Example: CacheSize 8550
The CacheSize directive specifies the maximum amount of system storage space allocated for the disk cache function (in kilobytes). Although actual usage may exceed this setting, the server will discard data when it runs disk cache maintenance until the total allocated cache space is at or below this setting. If disk cache maintenance is disabled, this setting has no effect and the cache may grow without bound, unless managed by some application or process other than the server itself. See CacheGcDaily or CacheGcInterval for more details on the disk cache maintenance process.
- Parameter: size
- The size parameter specifies the maximum number of kilobytes allocated for the disk cache function. Depending on the expected server traffic volume, and values set for CacheGcInterval or CacheGcDaily, use a size value that is at least twenty to forty percent lower than the available space.
The disk cache function uses the local file system to store data. Therefore, space allocated for this cache is used to maintain directory structures and file attributes as well as to store cache data. It also includes unused space within file system storage blocks allocated to files and directories. Therefore, the total amount of system storage allocated for the cache will always be greater than the total amount of actual cache data. This setting sets a limit for the total amount of allocated space, not a limit for the total amount of actual cache data.
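The gap between actual cache data and allocated space can be illustrated with block rounding. This is a rough sketch under stated assumptions: a storage block size of 4096 bytes and 1024 bytes per kilobyte are both assumed, the function names are hypothetical, and space used by directories and file attributes is ignored.

```python
def allocated_bytes(data_bytes, block_size=4096):
    """Round a file's data size up to a whole number of storage blocks."""
    blocks = -(-data_bytes // block_size)  # ceiling division
    return blocks * block_size


def cache_allocation_kb(file_sizes, block_size=4096):
    """Total space allocated for the given files, in kilobytes (rounded up)."""
    total = sum(allocated_bytes(size, block_size) for size in file_sizes)
    return -(-total // 1024)


# Two files holding 5100 bytes of data occupy 12 KB of allocated space here.
print(cache_allocation_kb([100, 5000]))  # 12
```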