Tips for improving the performance of join queries

If you are looking at a join query which is performing poorly or you are about to create a new application which uses join queries, these tips may be useful.

Start of change
Table 1. Checklist for Creating an Application that Uses Join Queries
What to Do How It Helps
Check the database design. Make sure that there are indexes available over all of the join columns and row selection columns or both. The optimizer provides index advise in several places to aid in this process. Use either the index advisor under iSeries™ navigator - Database, the advised information under visual explain, or the advised information in the 3020 record in the database monitor. This gives the query optimizer a better opportunity to select an efficient access method because it can determine the average number of duplicate values. Many queries may be able to use the existing index to implement the query and avoid the cost of creating a temporary index or hash table.
Check the query to see whether some complex predicates should be added to other dials to allow the optimizer to get a better idea of the selectivity of each dial. Since the query optimizer does not add predicates for predicates connected by OR or non-isolatable predicates, or predicate operators of LIKE, modifying the query by adding these predicates may help.
Specify ALWCPYDTA(*OPTIMIZE) or ALWCPYDTA(*YES) If the query is creating a temporary index or hash table, and you feel that the processing time may be better if the optimizer only used the existing index or hash table, specify ALWCPYDTA(*YES).

If the query is not creating a temporary index or hash table, and you feel that the processing time may be better if a temporary index was created, specify ALWCPYDTA(*OPTIMIZE).

Alternatively, specify the OPTIMIZE FOR n ROWS to inform the optimizer of the application has intention to read every resulting row. To do this set n to a large number. You can also set n to a small number before ending the query.

For OPNQRYF, specify OPTIMIZE(*FIRSTIO) or OPTIMIZE(*ALLIO) Specify the OPTIMIZE(*FIRSTIO) or OPTIMIZE(*ALLIO) option to accurately reflect your application. Use *FIRSTIO, if you want the optimizer to optimize the query to retrieve the first block of rows most efficiently. This biases the optimizer toward using existing objects. If you want to optimize the retrieval time for the entire answer set, use *ALLIO. This may cause the optimizer to create temporary objects such as temporary indexes or hash tables in order to minimize I/O.
Star join queries A join in which one table is joined with all secondary tables consecutively is sometimes called a star join. In the case of a star join where all secondary join predicates contain a column reference to a particular table, there may be performance advantages if that table is placed in join position one. In Example A, all tables are joined to table EMPLOYEE. The query optimizer can freely determine the join order. For SQE, the optimizer uses Look Ahead Predicate generation to determine the optimal join order. For CQE, the query should be changed to force EMPLOYEE into join position one by using the query options file (QAQQINI) FORCE_JOIN_ORDER parameter of *YES. Note that in these examples the join type is a join with no default values returned (this is an inner join.). The reason for forcing the table into the first position is to avoid random I/O processing. If EMPLOYEE is not in join position one, every row in EMPLOYEE can be examined repeatedly during the join process. If EMPLOYEE is fairly large, considerable random I/O processing occurs resulting in poor performance. By forcing EMPLOYEE to the first position, random I/O processing is minimized.

Example A: Star join query

DECLARE C1 CURSOR FOR 
  SELECT * FROM DEPARTMENT, EMP_ACT, EMPLOYEE,
  PROJECT
  WHERE DEPARTMENT.DEPTNO=EMPLOYEE.WORKDEPT 
  AND EMP_ACT.EMPNO=EMPLOYEE.EMPNO
  AND EMPLOYEE.WORKDEPT=PROJECT.DEPTNO

Example B: Star join query with order forced via FORCE_JOIN_ORDER

  DECLARE C1 CURSOR FOR 
  SELECT * FROM EMPLOYEE, DEPARTMENT, EMP_ACT,
  PROJECT
  WHERE DEPARTMENT.DEPTNO=EMPLOYEE.WORKDEPT 
  AND EMP_ACT.EMPNO=EMPLOYEE.EMPNO
  AND EMPLOYEE.WORKDEPT=PROJECT.DEPTNO
  
Specify ALWCPYDTA(*OPTIMIZE) to allow the query optimizer to use a sort routine. In the cases where ordering is specified and all key columns are from a single dial, this allows the query optimizer to consider all possible join orders.
Specify join predicates to prevent all of the rows from one table from being joined to every row in the other table. This improves performance by reducing the join fan-out. Every secondary table should have at least one join predicate that references on of its columns as a 'join-to' column.
End of change