Understanding disk consumption by Collection Services

The amount of disk resource Collection Services consumes varies greatly depending on the settings that you use.

For illustration purposes, assume that Collection Services is used daily and cycles at midnight, causing each *MGTCOL object to contain one day's worth of data collection. Next, establish a base size for one day's worth of data collection by using the default properties for Collection Services. A standard plus protocol profile with an interval value of 15 minutes can collect 500 MB of data in a *MGTCOL object. The size actually collected per day using the default properties can vary greatly depending on system size and usage. The 500 MB example might represent a higher-end system that is heavily used.

Interval rate   Intervals per collection   Multiplier   Size in MB
15 minutes      96                         1            500

The size of one day's worth of data is directly proportional to the number of intervals collected per collection period. For example, changing the interval rate from 15 minutes to 5 minutes increases the number of intervals by a factor of 3 and increases the size by the same factor.

Interval rate   Intervals per collection   Multiplier   Size in MB
15 minutes      96                         1            500
5 minutes       288                        3            1500

To continue this example, the following table shows the size of one *MGTCOL object produced each day by Collection Services at each interval rate, using the default standard plus protocol profile.

Interval rate   Intervals per collection   Multiplier   Size in MB
15 minutes      96                         1            500
5 minutes       288                        3            1500
1 minute        1440                       15           7500
30 seconds      2880                       30           15000
15 seconds      5760                       60           30000

In this example, the size of a *MGTCOL object varies from 500 MB to 30 GB depending on the collection interval. To predict disk consumption for one day's collection on a specific system, first observe the size of the *MGTCOL objects that are created with the default collection interval of 15 minutes and the standard plus protocol profile. Use that size as the base, and then apply the multiplier from the table above to estimate the disk consumption at other collection intervals. For example, if the *MGTCOL object for one day's collection is 50 MB at 15-minute intervals, you can expect Collection Services to produce *MGTCOL objects of about 3 GB (50 MB x 60) when collecting data at 15-second intervals.
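The scaling described above can be sketched in a few lines of Python. This is an illustrative calculation only, not part of Collection Services: the function names are hypothetical, and it assumes that size scales linearly with the number of intervals, as the table suggests.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BASE_INTERVAL_SECONDS = 15 * 60  # default 15-minute collection interval


def intervals_per_day(interval_seconds):
    """Number of intervals collected in one day (one collection per day)."""
    return SECONDS_PER_DAY // interval_seconds


def size_multiplier(interval_seconds):
    """Multiplier relative to the 15-minute base interval."""
    return BASE_INTERVAL_SECONDS / interval_seconds


def estimate_daily_size_mb(observed_base_mb, interval_seconds):
    """Estimate one day's *MGTCOL size from the size observed at 15 minutes.

    Assumption: size is directly proportional to the number of intervals.
    """
    return observed_base_mb * size_multiplier(interval_seconds)


# Rebuild the example table for a 500 MB base, then the 50 MB case.
for label, seconds in [("15 minutes", 900), ("5 minutes", 300),
                       ("1 minute", 60), ("30 seconds", 30),
                       ("15 seconds", 15)]:
    print(label, intervals_per_day(seconds),
          size_multiplier(seconds), estimate_daily_size_mb(500, seconds))

print(estimate_daily_size_mb(50, 15))  # 50 MB base at 15-second intervals
```

Treat the result as a planning estimate. The actual size depends on system size and usage, so measure the base size on the system in question.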

Note: Use caution when considering a collection interval as frequent as 15 seconds. Frequent collection intervals can adversely impact system performance.

Retention period

The retention period also plays a significant role in the amount of disk resource that Collection Services consumes. The default retention period is one day. In practice, however, given the default values, a *MGTCOL object is deleted on the third day of collection after the day on which it was created. Thus, on the third day of collection, the system holds two days' worth of previously collected data plus the current day's data. Using the table above, this translates into between 1 GB and 1.5 GB of disk consumption at 15-minute intervals, and between 60 GB and 90 GB at 15-second intervals, on the third day and beyond.

The formula to calculate disk consumption based on the retention period value is:

(Retention period in days + 2.5) * Size of one day's collection = Total disk consumption

Note: 2.5 corresponds to two days of previous collection data, and an average of the current day (2 days + 1/2 day).

Using the above tables and formula, a retention period of 2 weeks gives you a disk consumption of 8.25 GB at 15-minute intervals and 495 GB at 15-second intervals for the example system.
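As a quick check of these figures, the formula can be written as a small Python function. This is a sketch for illustration: the function name is hypothetical, and the conversion assumes 1 GB = 1000 MB, which matches the rounding used in this example.

```python
def total_disk_consumption_mb(retention_days, daily_size_mb):
    """Apply (retention period in days + 2.5) * size of one day's collection."""
    return (retention_days + 2.5) * daily_size_mb


MB_PER_GB = 1000  # assumption: decimal gigabytes, as in the example figures

# Two-week retention on the example system.
at_15_minutes = total_disk_consumption_mb(14, 500)    # 500 MB per day
at_15_seconds = total_disk_consumption_mb(14, 30000)  # 30000 MB per day

print(at_15_minutes / MB_PER_GB, "GB at 15-minute intervals")
print(at_15_seconds / MB_PER_GB, "GB at 15-second intervals")
```

Substituting a daily size observed on your own system gives a first estimate of the disk space to reserve for a chosen retention period.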

Understanding how much disk space Collection Services consumes helps you choose an acceptable collection interval and retention period for a given system, so that disk consumption does not cause system problems. Remember that a system monitor or a job monitor can override a category's collection interval to graph data for a monitor. A system administrator must ensure that monitors do not inadvertently collect data at intervals that cause excess disk consumption.