Functions

A function is an operation denoted by a function name followed by one or more operands that are enclosed in parentheses. It represents a relationship between a set of input values and a set of result values. The input values to a function are called arguments. For example, a function can be passed two input arguments that have date and time data types and return a value with a timestamp data type as the result.

Types of functions

There are several ways to classify functions. One way to classify functions is as built-in, user-defined, or generated user-defined functions for distinct types.

Built-in functions are functions that come with the database manager. These functions provide a single-value result. Built-in functions include operator functions such as "+", aggregate functions such as AVG, and scalar functions such as SUBSTR. For a list of the built-in aggregate and scalar functions and information on these functions, see Built-in functions.
The built-in functions are part of schema QSYS2. ³³
User-defined functions are functions that are created using the CREATE FUNCTION statement and registered to the database manager in catalog table QSYS2.SYSROUTINES and catalog view QSYS2.SYSFUNCS. For more information, see CREATE FUNCTION. These functions allow users to extend the function of the database manager by adding their own or third party vendor function definitions.
A user-defined function is either an SQL, external, or sourced function. An SQL function is defined to the database using only SQL statements. An external function is defined to the database with a reference to an external program or service program that is executed when the function is invoked. A sourced function is defined to the database with a reference to a built-in function or another user-defined function. Sourced functions can be used to extend built-in aggregate and scalar functions for use on distinct types.

A user-defined function resides in the schema in which it was created. The schema cannot be QSYS, QSYS2, or QTEMP.
Generated user-defined functions for distinct types are functions that the database manager automatically generates when a distinct type is created using the CREATE DISTINCT TYPE statement. These functions support casting from the distinct type to the source type and from the source type to the distinct type. The ability to cast between the data types is important because a distinct type is compatible only with itself.
The generated cast functions reside in the same schema as the distinct type for which they were created. The schema cannot be QSYS, QSYS2, or QTEMP. For more information about the functions that are generated for a distinct type, see CREATE DISTINCT TYPE.

Another way to classify functions is as aggregate, scalar, or table functions, depending on the input data values and result values.

An aggregate function receives a set of values for each argument (such as the values of a column) and returns a single-value result for the set of input values. Aggregate functions are sometimes called column functions. Built-in functions and user-defined sourced functions can be aggregate functions.

A scalar function receives a single value for each argument and returns a single-value result. Built-in functions and user-defined functions can be scalar functions. Generated user-defined functions for distinct types are also scalar functions.

A table function returns a table for the set of arguments it receives. Each argument is a single value. A table function can only be referenced in the FROM clause of a subselect. A table function can be defined as an external function or as an SQL function, but a table function cannot be a sourced function.
Table functions can be used to apply SQL language processing power to data that is not DB2® data or to convert such data into a DB2 table. For example, a table function can take a file and convert it to a table, get data from the Web and tabularize it, or access a Lotus® Notes® database and return information about email messages.

Function invocation

Each reference to a scalar or aggregate function (either built-in or user-defined) conforms to the following syntax: ³⁴

function-invocation:
 
                                  (1)
|--function-name--(--+----------+-------+----------------+--)---|
                     +-ALL------+       | .-,----------. |
                     '-DISTINCT-'       | V            | |
                                        '---expression-+-'

Notes:

The ALL or DISTINCT keyword can be specified only for an aggregate function or a user-defined function that is sourced on an aggregate function.

Each reference to a table function conforms to the following syntax:

>>-TABLE--(--function-name--(--+----------------+--)--)--correlation-clause-><
                               | .-,----------. |
                               | V            | |
                               '---expression-+-'

In the above syntax, expression is the same as it is for a scalar or aggregate function. See Expressions for other rules for expression.

When the function is invoked, the value of each of its parameters is assigned, using storage assignment, to the corresponding parameter of the function. Control is passed to external functions according to the calling conventions of the host language. When execution of a user-defined aggregate or scalar function is complete, the result of the function is assigned, using storage assignment, to the result data type. For details on the assignment rules, see Assignments and comparisons.

Table functions can be referenced only in the FROM clause of a subselect. For more details on referencing a table function, see the description of the FROM clause in from-clause.

Function resolution

A function is invoked by its function name, which is implicitly or explicitly qualified with a schema name, followed by parentheses that enclose the arguments to the function. Within the database, each function is uniquely identified by its function signature, which is its schema name, function name, the number of parameters, and the data types of the parameters. Thus, a schema can contain several functions that have the same name but each of which have a different number of parameters, or parameters with different data types. Or, a function with the same name, number of parameters, and types of parameters can exist in multiple schemas. When any function is invoked, the database manager must determine which function to execute. This process is called function resolution.

Function resolution is similar for functions that are invoked with a qualified or unqualified function name with the exception that for an unqualified name, the database manager needs to search more than one schema.

Qualified function resolution: When a function is invoked with a function name and a schema name, the database manager only searches the specified schema to resolve which function to execute. The database manager selects candidate functions based on the following criteria:
- The name of the function instance matches the name in the function invocation.
- The number of input parameters in the function instance matches the number of arguments in the function invocation.
- The authorization ID of the statement must have the EXECUTE privilege to the function instance.
- The data type of each input argument of the function invocation matches or is promotable to the data type of the corresponding parameter of the function instance.
If no function in the schema meets these criteria, an error is returned. If a function is selected, its successful use depends on it being invoked in a context in which the returned result is allowed. For example, if the function returns an integer data type where a character data type is required, or returns a table where a table is not allowed, an error is returned.
Unqualified function resolution: When a function is invoked with only a function name, the database manager needs to search more than one schema to resolve the function instance to execute. The SQL path contains the list of schemas to search. For each schema in the SQL path (see SQL path), the database manager selects candidate functions based on the following criteria:
- The name of the function instance matches the name in the function invocation.
- The number of input parameters in the function instance matches the number of function arguments in the function invocation.
- The authorization ID of the statement must have the EXECUTE privilege to the function instance.
- The data type of each input argument of the function invocation matches or is promotable to the data type of the corresponding parameter of the function instance.
If no function in the schema meets these criteria, an error is returned. If a function is selected, its successful use depends on it being invoked in a context in which the returned result is allowed. For example, if the function returns an integer data type where a character data type is required, or returns a table where a table is not allowed, an error is returned.

After the database manager identifies the candidate functions, it selects the candidate with the best fit as the function instance to execute (see Determining the best fit). If more than one schema contains the function instance with the best fit (the function signatures are identical except for the schema name), the database manager selects the function whose schema is earliest in the SQL path.

Function resolution applies to all functions, including built-in functions. Built-in functions logically exist in schema QSYS2. If schema QSYS2 is not explicitly specified in the SQL path, the schema is implicitly assumed at the front of the path. Therefore, when an unqualified function name is specified, ensure that the path is specified so that the intended function is selected.

In a CREATE VIEW statement, function resolution occurs at the time the view is created. If another function with the same name is subsequently created, the view is not affected, even if the new function is a better fit than the one chosen at the time the view was created.

Determining the best fit

There might be more than one function with the same name that is a candidate for execution. In that case, the database manager determines which function is the best fit for the invocation by comparing the argument and parameter data types. Note that the data type of the result of the function or the type of function (aggregate, scalar, or table) under consideration does not enter into this determination.

If the data types of all the parameters for a given function are the same as those of the arguments in the function invocation, that function is the best fit. If there is no exact match, the database manager compares the data types in the parameter lists from left to right, using the following method:

Compare the data type of the first argument in the function invocation to the data type of the first parameter in each function. (Any length, precision, scale, and CCSID attributes of the data types are not considered in the comparison.)
For this argument, if one function has a data type that fits the function invocation better than the data types in the other functions, that function is the best fit. The precedence list for the promotion of data types in Promotion of data types shows the data types that fit each data type in best-to-worst order.
If the data type of the first parameter for more than one candidate function fits the function invocation equally well, repeat this process for the next argument of the function invocation. Continue for each argument until a best fit is found.

The following examples illustrate function resolution.

Example 1: Assume that MYSCHEMA contains two functions, both named FUNA, that were created with these partial CREATE FUNCTION statements.

  CREATE FUNCTION MYSCHEMA.FUNA (VARCHAR(10), INT, DOUBLE) ...
  CREATE FUNCTION MYSCHEMA.FUNA (VARCHAR(10), REAL, DOUBLE) ...

Also assume that a function with three arguments of data types VARCHAR(10), SMALLINT, and DECIMAL is invoked with a qualified name:

   MYSCHEMA.FUNA( VARCHARCOL, SMALLINTCOL, DECIMALCOL ) ...

Both MYSCHEMA.FUNA functions are candidates for this function invocation because they meet the criteria specified in Function resolution. The data types of the first parameter for the two function instances in the schema, which are both VARCHAR, fit the data type of the first argument of the function invocation, which is VARCHAR, equally well. However, for the second parameter, the data type of the first function (INT) fits the data type of the second argument (SMALLINT) better than the data type of second function (REAL). Therefore, the database manager selects the first MYSCHEMA.FUNA function as the function instance to execute.

Example 2: Assume that functions were created with these partial CREATE FUNCTION statements:

   1. CREATE FUNCTION SMITH.ADDIT (CHAR(5), INT, DOUBLE) ...
   2. CREATE FUNCTION SMITH.ADDIT (INT, INT, DOUBLE) ...
   3. CREATE FUNCTION SMITH.ADDIT (INT, INT, DOUBLE, INT) ...
   4. CREATE FUNCTION JOHNSON.ADDIT (INT, DOUBLE, DOUBLE) ...
   5. CREATE FUNCTION JOHNSON.ADDIT (INT, INT, DOUBLE) ...
   6. CREATE FUNCTION TODD.ADDIT (REAL) ...
   7. CREATE FUNCTION TAYLOR.SUBIT (INT, INT, DECIMAL) ...

Also assume that the SQL path at the time an application invokes a function is "TAYLOR", "JOHNSON", "SMITH". The function is invoked with three data types (INT, INT, DECIMAL) as follows:

   SELECT  ... ADDIT(INTCOL1, INTCOL2, DECIMALCOL) ...

Function 5 is chosen as the function instance to execute based on the following evaluation:

Function 6 is eliminated as a candidate because schema TODD is not in the SQL path.
Function 7 in schema TAYLOR is eliminated as a candidate because it does not have the correct function name.
Function 1 in schema SMITH is eliminated as a candidate because the INT data type is not promotable to the CHAR data type of the first parameter of Function 1.
Function 3 in schema SMITH is eliminated as a candidate because it has the wrong number of parameters.
Function 2 is a candidate because the data types of its parameters match or are promotable to the data types of the arguments.
Both Function 4 and 5 in schema JOHNSON are candidates because the data types of their parameters match or are promotable to the data types of the arguments. However, Function 5 is chosen as the better candidate because although the data types of the first parameter of both functions (INT) match the first argument (INT), the data type of the second parameter of Function 5 (INT) is a better match of the second argument (INT) than the data type of Function 4 (DOUBLE).
Of the remaining candidates, Function 2 and 5, the database manager selects Function 5 because schema JOHNSON comes before schema SMITH in the SQL path.

Example 3: Assume that functions were created with these partial CREATE FUNCTION statements:

   1. CREATE FUNCTION BESTGEN.MYFUNC (INT, DECIMAL(9,0)) ...
   2. CREATE FUNCTION KNAPP.MYFUNC (INT, NUMERIC(8,0))...
   3. CREATE FUNCTION ROMANO.MYFUNC (INT, FLOAT) ...

Also assume that the SQL path at the time an application invokes a function is "ROMANO", "KNAPP", "BESTGEN". The function is invoked with two data types (SMALLINT, DECIMAL) as follows:

   SELECT  ... MYFUNC(SINTCOL1, DECIMALCOL) ...

Function 2 is chosen as the function instance to execute based on the following evaluation:

All three functions are candidates for this function invocation because they meet the criteria specified in Function resolution.
Function 3 in schema ROMANO is eliminated because the second parameter (FLOAT) is not as good a fit for the second argument (DECIMAL) as the second parameter of either Function 1 (DECIMAL) or Function 2 (NUMERIC).
The second parameters of Function 1 (DECIMAL) and Function 2 (NUMERIC) are equally good fits for the second argument (DECIMAL).
Function 2 is finally chosen because "KNAPP" precedes "BESTGEN" in the SQL path.

Best fit considerations

Once the function is selected, there are still possible reasons why the use of the function may not be permitted. Each function is defined to return a result with a specific data type. If this result data type is not compatible within the context in which the function is invoked, an error will occur. For example, given functions named STEP defined with different data types as the result:

   STEP(SMALLINT) RETURNS CHAR(5)
   STEP(DOUBLE) RETURNS INTEGER

and the following function reference (where S is a SMALLINT column):

   SELECT ... 3 +STEP(S)

then, because there is an exact match on argument type, the first STEP is chosen. An error occurs on the statement because the result type is CHAR(5) instead of a numeric type as required for an argument of the addition operator.

In cases where the arguments of the function invocation were not an exact match to the data types of the parameters of the selected function, the arguments are converted to the data type of the parameter at execution using the same rules as assignment to columns (see Assignments and comparisons). This includes the case where precision, scale, length, or CCSID differs between the argument and the parameter.

An error also occurs in the following examples:

The function is referenced in the TABLE clause of a FROM clause, but the function selected by the function resolution step is a scalar or aggregate function.
The function referenced in an SQL statement requires a scalar or aggregate function, but the function selected by the function resolution step is a table function.

33.

Built-in functions are implemented internally by the database manager, so an associated program or service program object does not exist for a built-in function. Furthermore, the catalog does not contain information about built-in functions. However, built-in functions can be treated as if they exist in QSYS2 and a built-in function name can be qualified with QSYS2.

34.

A few functions allow keywords instead of expressions. For example, the CHAR function allows a list of keywords to indicate the desired date format. A few functions use keywords instead of commas in a comma separated list of expressions. For example, the EXTRACT, TRIM, and POSITION functions use keywords.

[ Top of Page | Previous Page | Next Page | Contents | Index ]