Hash table scan

During a Hash Table Scan operation, the entire temporary hash table is scanned and all of the entries contained within the hash table will be processed.

The optimizer considers a hash table scan when the data values need to be collated together, but the sequence of the data is not required. The use of a hash table scan will allow the optimizer to generate a plan that can take advantage of any non-join selection while creating the temporary hash table. An additional benefit of using a hash table scan is that the data structure of the temporary hash table will typically cause the table data within the hash table to remain resident within main memory after creation, thus reducing paging on the subsequent hash table scan operation.

Table 1. Hash table scan attributes
Data access method Hash table scan
Description Read all of the entries in a temporary hash table. The hash table may perform distinct processing to eliminate duplicates or takes advantage of the temporary hash table to collate all of the rows with the same value together.
Advantages
  • Reduces the random I/O to the table generally associated with longer running queries that may otherwise use an index to collate the data
  • Selection can be performed before generating the hash table to subset the number of rows in the temporary object
Considerations Generally used for distinct or group by processing. Can perform poorly when the entire hash table does not stay resident in memory as it is being processed.
Likely to be used
  • When the use of temporary results is allowed by the query environmental parameter (ALWCPYDTA)
  • When the data is required to be collated based upon a column or columns for distinct or grouping
Example SQL statement
SELECT COUNT(*), FirstNme FROM Employee
WHERE WorkDept BETWEEN 'A01' AND 'E01'
GROUP BY FirstNme
Messages indicating use There are multiple ways in which a hash scan can be indicated through the messages. The messages in this example illustrate how the SQL Query Engine will indicate a hash scan was used.
  • Optimizer Debug:
    CPI4329 -- Arrival sequence was used for file
               EMPLOYEE.
  • PRTSQLINF:
    SQL4010 -- Table scan access for table 1.
    SQL4029 -- Hashing algorithm used to process
               the grouping.
SMP parallel enabled Yes
Also referred to as Hash Scan, Preload

Hash Table Scan Distinct

Hash Table Scan Distinct, Preload

Visual Explain icon
Hash table scan icon