Normalization

Several available design methods allow you to design technically correct databases, and effective relational database structure. Some of these methods are based on a design approach called normalization. Normalization refers to the reduction or elimination of storing redundant data.

The primary objective of normalization is to avoid problems that are associated with updating redundant data.

However, this design approach of normalization (for example, 3NF–3rd Normal Form), may result in large numbers of tables. If there are numerous table join operations, SQL performance may be reduced. Consider overall SQL performance when you design databases. Balance® the amount of redundant data with the number of tables that are not fully normalized.

The following graphic illustrates that the proportion of redundant data to the number of tables affects performance:


Balancing redundant data and number of tables

Minimize the use of code tables when little is gained from their use. For example, an employee table contains a JOBCODE column, with data values 054, 057, and so forth. This table must be joined with another table to translate the codes to Programmer, Engineer, and so on. The cost of this join could be quite high compared to the savings in storage and potential update errors resulting from redundant data.

For example:

Figure 1. Normalized data form:
Normalized data form
Figure 2. Redundant data form:
Redundant data form

The set level (or mass operation) nature of SQL significantly lessens the danger of a certain redundant data form. For example, the ability to update a set of rows with a single SQL statement greatly reduces this risk. In the following example, the job title Engineer must be changed to Technician for all rows that match this condition.

Use SQL to update JOBTITLE:

         UPDATE EMPLOYEE
           SET JOBTITLE = "Technician"
           WHERE JOBTITLE = "Engineer"