Introduction to server performance

The performance characteristics of any computing environment may be described in the following terms.

Response time: The amount of time that is required for a request to be processed
Utilization: The percentage of resources that are used when processing requests
Throughput: The volume of requests (per unit of time) that are being processed
Capacity: The maximum amount of throughput that is possible
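
As a rough illustration of the four terms above, the following sketch derives each of them from a small set of request timings. The Request record, the summarize function, and the sample figures are all hypothetical, and the sketch assumes a single server that processes one request at a time. Capacity is only estimated here, as the throughput the server would reach if it stayed busy for the whole interval at the same average service time.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical timing record for one request (all times in seconds)."""
    arrival: float  # when the request reached the server
    start: float    # when the server began working on it
    finish: float   # when the response was complete


def summarize(requests, window_seconds):
    """Derive the four characteristics for one server over a measurement window.

    Assumes the server handles one request at a time, so busy time is the
    sum of the individual service times.
    """
    completed = len(requests)
    busy_seconds = sum(r.finish - r.start for r in requests)
    return {
        # Mean time from arrival to completion, including any waiting
        "response_time": sum(r.finish - r.arrival for r in requests) / completed,
        # Fraction of the window during which the server was busy
        "utilization": busy_seconds / window_seconds,
        # Requests completed per second of the window
        "throughput": completed / window_seconds,
        # Estimated maximum throughput: requests per second of busy time
        "capacity": completed / busy_seconds,
    }


# Made-up sample: three requests observed over a 10-second window
sample = [
    Request(arrival=0.0, start=0.0, finish=1.0),
    Request(arrival=0.5, start=1.0, finish=2.0),  # waited 0.5 s in a queue
    Request(arrival=5.0, start=5.0, finish=6.0),
]
metrics = summarize(sample, window_seconds=10.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the second request has a longer response time than the other two even though its service time is the same, because it had to wait for the server to become free.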

Response time is typically the critical performance issue for users of a server, while utilization is frequently what matters to its administrators. Maximum throughput reflects the performance bottleneck, and may not be a concern. All of these characteristics are interrelated, but server performance can be summarized as follows:

Every server has a bottleneck that governs its performance by limiting throughput.
When server utilization increases, response time degrades.
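
The relationship between utilization and response time can be sketched with a simple queueing model. The example below assumes a single-server queue with random arrivals and service times (the textbook M/M/1 model), in which the mean response time is the service time divided by (1 - utilization). This model is an assumption made for illustration; real servers may behave differently, and the function name and the 0.1-second service time are hypothetical.

```python
def response_time(service_time, utilization):
    """Mean response time under the assumed M/M/1 model.

    service_time: average seconds of work per request
    utilization:  fraction of time the server is busy, in [0, 1)
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be at least 0 and less than 1")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.1  # hypothetical: each request needs 0.1 s of server time

for utilization in (0.10, 0.50, 0.80, 0.90, 0.95):
    seconds = response_time(SERVICE_TIME, utilization)
    print(f"{utilization:.0%} utilized -> {seconds:.2f} s response time")
```

Under this model the degradation is not linear: at 50% utilization the response time is twice the service time, and at 90% it is ten times the service time.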

In many servers, capacity is ample and is not an issue for users; in others, it is the primary performance concern. In either case, response time is critical. One of the most important questions for administrators is: how far can the server be degraded (by adding users and so increasing utilization) before users begin to object?
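
One hedged way to approach that question is to fix the longest response time that users will accept and work backwards to a utilization ceiling. The sketch below assumes the textbook single-server M/M/1 queueing model, in which response time equals service time divided by (1 - utilization), so the ceiling is 1 - service time / response limit. The function names and the sample figures are hypothetical, and a measured workload may not follow this model.

```python
def max_utilization(service_time, response_limit):
    """Highest utilization that keeps mean response time within the limit.

    Assumes the M/M/1 relation: response = service_time / (1 - utilization).
    """
    if service_time <= 0:
        raise ValueError("service_time must be positive")
    if response_limit < service_time:
        raise ValueError("limit is shorter than the service time itself")
    return 1.0 - service_time / response_limit


def max_throughput(service_time, response_limit):
    """Requests per second the server can sustain at that utilization."""
    return max_utilization(service_time, response_limit) / service_time


SERVICE_TIME = 0.1    # hypothetical: 0.1 s of server work per request
RESPONSE_LIMIT = 0.5  # hypothetical: users object beyond 0.5 s

ceiling = max_utilization(SERVICE_TIME, RESPONSE_LIMIT)
rate = max_throughput(SERVICE_TIME, RESPONSE_LIMIT)
print(f"Keep utilization at or below {ceiling:.0%}")
print(f"That supports roughly {rate:.1f} requests per second")
```

With these sample figures the server could run at about 80% utilization before the assumed limit is reached; a tighter response limit lowers that ceiling.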