This tutorial explains how to use the Able Agent Editor, how to open an example provided with ABLE, and how to create agents to approve mortgages using either neural network or rule-based application techniques.
First, go to the /able/bin directory and start the Able Agent Editor with the appropriate method for your environment.
Windows or OS/2 | UNIX |
runnit | runnit.sh |
After the Able Agent Editor main window appears, you may choose to read through the entire tutorial or select just one lesson at a time from the contents frame.
In this lesson, you are shown how to create, change, connect, inspect, and move the beans that make up an agent. You may wish to review the Able Agent Editor component descriptions or open it for reference before proceeding further.
To construct a new agent, go to the File pulldown menu and select New... A cascade menu will appear with a list of available AbleAgents. Select the type of agent you want to instantiate and one will be created and set as the base container agent for further editing and customization.
To add an Able bean component to an agent, turn to the desired page on the toolbar's bean palette and then click on the desired Able bean icon. To identify each component, just place and pause the cursor over the icon. Popup help text will show the underlying name of the Able bean. The Able beans are divided into several categories including Data, Learning, Rules, Agents, and Samples.
When you click on the icon, an instance of the Able bean will be created and added to the Able Agent. The icon will appear in the Agent canvas, and a new row will appear in the bean tree. At this point you can configure the Able bean, or create buffer, property, or event connections to the other Able beans contained by the agent.
Agents can contain both beans and other agents. When you select an agent in the tree, its content members are displayed in the canvas to the right. The name of the agent whose contents are displayed should appear above the canvas.
To change the state or configuration of a container agent you can select an action from the context or popup menu by right clicking on the agent name in the tree root or in the white space on the canvas. Open the customizer by choosing the Properties... item from the context menu or by double-clicking on the tree root or canvas white space.
After an AbleBean is created, there may be properties needed to complete its definition. The attributes themselves depend on the type of bean. For example, an Import bean needs an attribute specified to locate and define the format of the data to be imported. Move the mouse to the bean's entry in the tree or canvas areas. Double click on the bean or click the right mouse button to show the pop-up or context menu. Select Properties... to display an appropriate bean-specific customizer. Change properties as desired. Click on the OK push button to return to the workplace.
You can change the state or configuration of any AbleBean that is contained in an AbleAgent by clicking the right mouse button over the AbleBean icon in the canvas pane or on its name in the tree. This will bring up the context or popup menu. To open the customizer, choose the Properties... item from the context menu. While in the Properties customizer, some beans provide buttons to initialize, generate, or otherwise configure themselves which can change the bean's state. A bean that is not initialized will appear to be greyed-out or disabled on the editor canvas.
AbleAgents manage a list of User-defined Functions that can serve as sensors and effectors. They provide an easy way for AbleBeans to invoke methods on adaptor beans, access external application data, and invoke actions. To define a user function, open the context menu on the Agent (right click on white space in the Canvas pane or right click on the agent itself in the tree view) and select User functions...
A dialog box will appear with a list of the currently defined user functions in a list box. You can edit existing definitions, or add and remove them from the list.
Change an agent name by opening its properties. On the General tab, enter the name and press OK. You may wish to set your object names before saving. Once a bean or agent has been initialized, you cannot change its name.
Save an agent by selecting Save or Save as from the File pull-down menu. All beans contained in the agent as well as any open inspectors will be written to the serialized file with the provided name property.
There are several methods available for creating AbleBeans. To create a single bean such as an Import bean, place the pointer on its icon on the Data palette and click the left mouse button. When you release the mouse button, the bean is created and placed on the canvas.
Another method used to create beans is to copy an existing bean. First, select a bean already on the canvas by placing the pointer over it and clicking the left mouse button. Next, select the Copy choice from the Edit pull-down menu, and then select Paste from the Edit pull-down menu. Copy and Paste are also available from the context menu.
Each bean has an associated pop-up or context menu that is available by clicking the right mouse button on any of the object icons or in the bean tree. The container agent's context menu, in addition to its availability from the bean tree, is also available by right clicking on the white space in the canvas pane.
To move a bean, position the pointer on the desired object. Press the left mouse button, and hold it down while you move the mouse to drag the object to its new location in the Able Editor canvas. Release the button when the object is in its desired new location.
There are three types of connections supported in Able: buffer or data connections, property connections, and event connections. Each type of connection is made in the same manner. The View menu determines which type of connection is visually displayed.
To connect two AbleBeans, right click on the source bean to bring up the context menu, move the cursor over the Connect... item (a cascade menu will appear) and select the desired type of connection. A rubberband line will appear from the source AbleBean. Move the mouse over the desired target or destination AbleBean and click the left mouse button. A connection will be drawn between the two beans. If it is not visible, change the View selection to the type of connection created.
To cancel a connection, press the Escape key or click on the canvas instead of a bean.
To remove connections from a bean, open its customizer by selecting Properties... from the context menu. Select the Connections tab which displays a list of each of the connection types. Select the connections to be removed and press the Remove button.
The Able Editor allows you to customize settings so that you can change defaults to suit yourself. You do this with the Preferences dialog. To change or view your current choices, select File from the menu bar, and then Preferences... from the pull-down menu. This is where you set logging and tracing levels.
To see an example of a simple neural network application, load the animal.ser recognition demonstration from the examples/neural directory.
After the agent has loaded, notice that the inspector windows are prearranged and presized on your screen. This is a feature of the Able Editor that allows you to end and save a session exactly the way you left it. When you start the Able Editor again, you can pick up where you left off.
Press the step button to read a record from the active data source and
pass it through the filter to the network. Examine the InFilter
inspector window to see the field values for the record obtained,
and the result of the network's classification followed by the
target classification in the OutFilter inspector.
To interactively provide your own data,
To retrain the agent, go to the Neural Classifier page in the property editor. Select the Reset Beans button, and then press Start. The agent will train its neural backpropagation bean, and during training the BackPropagation Inspector will show the accuracy it achieves as it trains. It should eventually reach 100% accuracy and stop of its own accord.
To see an example of a simple ruleset application, load the animalRules.ser recognition demonstration from the examples/rules directory.
After the agent has loaded, notice that the inspector window shows the inputs to and outputs from the ruleset bean.
Press the step button to read a record from the active data source and pass it through the ruleset. Examine the inspector window to see the inputs to the ruleset. The first element of the ruleset's outputBuffer is the value determined by the ruleset. The second value is the animal's actual type, which also happens to be present in this ruleset. This value is not required by the ruleset; it is used by machine learning beans. As you step through the data, however, you should notice that the ruleset does not always calculate the same value as the actual type. If you look at the ruleset by double-clicking it, you will see that it is incomplete: it does not include all the animals in the data source, and it knows neither ostriches nor deer.
Add the rule:
: if ((swims == "no") && (color == "black/white") && legs == 2) then result = "ostrich";
Add the clause && legs == 4 to the rule resulting in zebra. The ruleset should now call an ostrich an ostrich rather than a zebra. What's needed to differentiate between a deer and a lion? Hint: eats_grass.
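The effect of these two changes can be sketched outside ABLE. The following Python sketch is not ARL, and the zebra rule's conditions are an assumption (the tutorial does not list the original rule). It only illustrates why a legs test in each rule stops an ostrich record from matching the zebra rule.

```python
def classify(record):
    """Return the animal named by the first matching rule, or None."""
    # Assumed zebra rule: same swims/color tests as the ostrich rule,
    # plus the added clause legs == 4.
    if (record["swims"] == "no" and record["color"] == "black/white"
            and record["legs"] == 4):
        return "zebra"
    # Ostrich rule from the tutorial text.
    if (record["swims"] == "no" and record["color"] == "black/white"
            and record["legs"] == 2):
        return "ostrich"
    return None  # no rule fired


ostrich_record = {"swims": "no", "color": "black/white", "legs": 2}
zebra_record = {"swims": "no", "color": "black/white", "legs": 4}
print(classify(ostrich_record), classify(zebra_record))
```

Without the legs clause on the zebra rule, both records would satisfy the first rule and the ostrich would be reported as a zebra.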
The animal recognition demonstration is an example of a saved agent. This section of the tutorial describes how the agent saved in the animal.ser file was created. To create an agent for neural classification, follow these steps:
Agents can also be created by selecting beans from the toolbar palette using the Agent Editor or within a program. Some samples are provided on the Samples toolbar palette.
As you have seen in the animal recognition demonstration, inspectors are a powerful part of the Agent Editor. After the beans have been created and configured, you may want to view what each object does during training or processing. To do this, open an Inspector window on the bean of interest.
Click the right mouse button on the Network object to open its pop-up menu, and select the Inspect item. A Text inspector is created, and it displays the Input Buffer and Output Buffer by default. Click on the Data item on the menu bar, and select Parameters... from the pull-down menu. You may want to remove the default values. Page down until you see the parameter named Max RMS Error. Click on that parameter, and choose a few of your own. When you exit the pull-down menu, the value for each of the parameters you selected will be displayed.
Both text and graphic inspectors are provided. These include variations of bar, line, and scatter plots. Display options are provided for setting titles, legends, axis options, and other graphic options.
The inspector refresh rate can be changed as well. This controls the number of events that the inspector must count before refreshing the screen. Increasing the time between refreshes can improve training performance for neural network beans.
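The idea of counting events between refreshes can be illustrated with a small sketch. The class below is hypothetical and is not part of ABLE; it only shows how redrawing once every N events reduces display work during a long training run.

```python
class ThrottledInspector:
    """Hypothetical inspector that redraws once every `refresh_every` events."""

    def __init__(self, refresh_every):
        if refresh_every < 1:
            raise ValueError("refresh_every must be at least 1")
        self.refresh_every = refresh_every
        self.events_seen = 0
        self.refresh_count = 0
        self.displayed = None

    def on_event(self, value):
        # Count every event, but update the display only on every Nth one.
        self.events_seen += 1
        if self.events_seen % self.refresh_every == 0:
            self.displayed = value
            self.refresh_count += 1


inspector = ThrottledInspector(refresh_every=10)
for epoch in range(1, 101):
    inspector.on_event(epoch)
print(inspector.refresh_count)  # far fewer redraws than events
```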
If you select View from the pull-down menu, you will see a list of other kinds of inspectors. You can view other types of objects with Inspector windows also. Several Inspector windows can be open at the same time, each on different objects, or on the same object displaying different parameter combinations or views. Some beans have custom inspectors - the Neural Network beans have a Network Graphic inspector, for example.
In this lesson, you are shown how to create, change, parse, and debug rulesets. You may wish to review the Able Ruleset Editor component descriptions before proceeding.
When the Ruleset Editor opens, a default ruleset is loaded. You can use this as a skeleton and type in entries, use the insert menu to stub out new blocks with additional instructions in comments, or load a previously existing ruleset and make changes. Enter text in the space provided as you would any text editor. You can also view the Able Rule Language description from the help pulldown.
Before you can run a ruleset that has changed, it must be verified or parsed: press the toolbar's verify button. Parsing errors will be displayed in a dialog box. After the rules are verified, process the rules by pressing the toolbar's run button, and any outputs or errors will appear in a dialog box.
Not every element of the Able Rule Language is supported by every inferencing method. For instance, inserting a fuzzy set or fuzzy variable is only useful when the fuzzy inferencing method is specified. Menu and toolbar tasks are: new, open, save, properties, debug, cut/copy/paste, undo/redo, insert a variable or ruleset, parse, view fuzzy sets, run, and help. To see an interesting fuzzy set demonstration, open the ruleset HedgeDemo.rs and view its fuzzy sets to observe how linguistic hedges alter fuzzy set definitions.
The Insert... pulldown menu in the Ruleset Editor is used to insert a template entry in the ruleset. Templates are available for each kind of variable, rule, and fuzzy set as well as for a rule block. Some entries may only be useful with specific inference engines. Since you can switch between inference engines, all templates are available.
The Properties... item on the Agent Editor's context menu opens the Ruleset Editor. In order to change processing options or remove connections, use the File pulldown menu in the Ruleset Editor and select the Properties... option there.
You initialize a ruleset when you parse or verify its contents.
The ruleset name is the first word that appears after the word "RuleSet" in the rule text. To save a ruleset, go to the file menu and select the Save or SaveAs option as desired.
After a ruleset has been verified, it can be converted from Able Rule Language to an XML representation. Rulesets written in XML to the Able DTD can be read too.
Creating a ruleset within the ARL Editor is intended to be similar to developing a Java class. You create a file with the extension .arl, enter ARL constructs, and save the ruleset to parse or compile it. Syntax errors will be displayed in the task list.
Run the ruleset using either the run or debug buttons on the frame of the Rule Console window. Variables listed in the outputs section of the ruleset will appear in the Rule Console. Runtime errors will appear in the Rule Engine view. With trace turned on, the Rule Engine view shows the rule firing sequence.
You can define inputs within the ruleset, and process them entirely within the Ruleset Editor. For example, open the ruleset examples\rules\MortgageB.rs. Press the verify button to parse the ruleset. Press the run button. A dialog box labelled Output should appear that includes the values of each output variable defined in the ruleset. In this case, the result shows whether a mortgage request was approved or denied, and the results of each check performed on the application.
Change one of the values such as YearsEmployed. Set it to a value of 1. Now press verify and run the ruleset. The first time you ran the ruleset, the mortgage request was granted. Now, it is rejected because this change to the data caused the EmploymentCheck to fail. The rules used to provide these values are called unconditional rules or assertions.
To run a ruleset within the editor, make sure the list of InputVariables is empty, and that you assert each variable that would have been provided in the inputBuffer to a value somewhere within the ruleset.
To provide data from an Able data source such as a text or database import as input to the ruleset, provide the list of variables with the InputVariables keyword. The order in which they are listed is the order in which they are assigned from the input buffer. The values in the input buffer will override any values provided in assertions.
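The positional assignment and override behavior described above can be sketched as follows. This is an illustration only, not ABLE code: the function name and the sample values are hypothetical.

```python
def bind_inputs(assertions, input_variables, input_buffer):
    """Overlay positional input-buffer values onto asserted values.

    assertions: dict of variable name -> value set by unconditional rules.
    input_variables: names in the order given by InputVariables.
    input_buffer: values in the same order as input_variables.
    """
    if len(input_buffer) < len(input_variables):
        raise ValueError("input buffer has fewer values than input variables")
    values = dict(assertions)
    # The i-th buffer element is assigned to the i-th listed variable,
    # replacing any asserted value of the same name.
    for name, value in zip(input_variables, input_buffer):
        values[name] = value
    return values


asserted = {"Income": 5000, "YearsEmployed": 5}
bound = bind_inputs(asserted, ["Income", "YearsEmployed"], [3200, 1])
print(bound)
```

With an empty InputVariables list nothing is overridden, which matches running the ruleset entirely inside the editor.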
Once you have a ruleset generally performing as desired, connecting it to a data source is a fast way of using it to calculate results for batch processing or deployment as an agent. You can still provide data to the ruleset interactively, either through the properties Data tab for the import object, or by temporarily commenting out the input variable list and providing the values you wish as unconditional rules or assertions.
To save you time we have provided an InputVariables list with the field names but they are commented out. Uncomment the InputVariable lines that define the income and other variables, and comment out the empty InputVariables statement. Press Parse.
Now go back to the Agent Editor, select the Data tab on the toolbar palette, and click on the Import bean icon. Double-click on the icon in the canvas, and go to the Import properties tab. Set the definition file to examples\datafiles\Mortgage.dfn, and press the Open File button. Connect the data buffer of the Import bean to the Rules bean. The connection will be green if data flow is enabled on each end. Check the General tab to ensure the Import bean has data flow enabled. Use the rule editor's File->Properties... option to set DataFlow enabled for the ruleset bean. Open an inspector on the BooleanUnderwriter ruleset to see the data flow. Press the step button, and the inspector should display the record input from the mortgage data file in the inputBuffer and the ruleset conclusions in the outputBuffer.
Able provides several aids in developing and testing rulesets. From the Ruleset Editor's Ruleset pulldown menu, you can set trace levels for parsing, inferencing, and the ruleset itself. You can inspect rulesets. Finally, a debug console may be enabled within your ruleset to allow you to step through rules and clauses within rules. For more information on debugging rulesets than is provided in the tutorial, see the ruleset debug directions.
There are three debug options available from the Classic Ruleset Editor's Ruleset pulldown menu: Parser, Ruleset, and Inferencing. To debug in the ARL Editor plugin, simply use the Rule Console view and the buttons to debug similar to debugging a Java class.
As you are writing rules and defining variables in the classic editor, occasionally press the verify button to provide syntax checking. Should a parsing error occur, a dialog will appear showing the line number responsible for the error. When you dismiss the error dialog, the cursor will be positioned on the line responsible. More information on a parsing error will appear in a dialog box if you select Ruleset->Debug->Parser and the level of detail to be displayed.
If you have embedded your ruleset in an agent and connected it to import and export beans for data buffer processing, select the Ruleset->Debug->Ruleset and the level of detail to be displayed to see what values are read from, and written to, the external buffers.
Finally, if you'd like to see which rules fire in what sequence, you may enable inferencing trace with the menu sequence Ruleset->Debug->Inferencing and the level of detail to be displayed.
You can use Inspectors to display some valuable ruleset information. Two parameters in particular of interest are variables, which lists all variables named in the ruleset and their current values, and inferenceContext, which shows the last RuleBlock, Rule, and Clause evaluated.
Ruleset and Inferencing debug output can be viewed in the Agent Editor by opening an inspector on the rules object. Select the data parameters baseTraceObject and inferTraceObject, respectively.
Able also provides a debugConsole in the classic editor which allows you to step through rules and clauses within rules as the RuleSet is evaluated. You turn on result debugging with entries in your RuleSet. First, import the debug library:
import com.ibm.able.rules.AbleDebugLib;
Define an arbitrary List variable to assign as the result in your block of Variables:
List() rc = new List();
At the spot within the ruleset where you want debugging to start, add a rule something like the following:
DebugStart: rc = startDebugConsole(this);
Use the panel buttons Clause and Rule to step through each item in the ruleset and display intermediate results in the right-hand pane.
Debug rulesets in the ARL Editor plugin from the Rule Console just as you would Java classes.
Suppose you work in the information processing department of a large bank. One line of business in the bank is to issue mortgages to home buyers. Currently, the bank handles about 1000 mortgage applications every month. A mortgage underwriter examines each application and either approves it, rejects it, or requests more information. The bank would like to handle more applications each month but would rather not hire more mortgage underwriters. Your manager has asked you to develop a computer program to pre-evaluate the mortgage applications so the underwriters can make decisions more quickly. Your manager also hints at improving the quality of the underwriting process by decreasing the number of bad debts. This is clearly a classification problem. The underwriter examines the application data and classifies it as either approved or rejected.
This section presents three approaches to solving the mortgage underwriting problem. First, you will use a neural network back propagation model, which is typically used to solve classification problems. Next, you will contrast that approach with one using fuzzy rules. Finally, we also provide a ruleset example using forward chaining.
This section describes the neural network approach to developing a mortgage underwriting application. These steps are to be discussed:
For auditing purposes, the bank has archived all mortgage applications handled in the past seven years. You decide to use this information to train a neural network to mimic a mortgage underwriter. The database contains the following information about the applicant, the property, and the loan:
Data | Descriptions |
Employment data | Monthly income, years employed |
Finances | Total savings |
Credit rating | Number of good and poor ratings |
Property | Appraised value, location, structure value |
Loan | Amount, monthly payment |
Underwriter response | Approval or rejection reason codes |
You could try to use this data as it is to train the neural network. The desired output of the neural network is the underwriter's response and the input data is the rest of the application information. However, it is useful to understand how the data is used to make decisions so you can explicitly present the most important information to the neural network. Otherwise the neural network takes a long time to train, and it may make decisions based on irrelevant features in the data.
According to a mortgage underwriter, every application is different and many factors influence the decision. As "rules of thumb", though, an application is usually rejected in the following cases:
In these rules, the underwriter is making use of some ratios that are calculated from the raw data. To present this information explicitly to the neural network, create a logical view of the database to add these ratios:
Since the relevant information about credit rating is coded in the CreditScore field, you decide to omit the actual number of good and poor ratings from the database view. You have now defined the source data.
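A view like this can be sketched as a function that derives the ratio fields from a raw record. The formulas below are assumptions inferred from the field names (the tutorial does not spell them out here), and the sample record is made up.

```python
def add_ratios(record):
    """Return a copy of the record with the derived ratio fields added.

    Assumed formulas, inferred from the field names:
      LoanToValueRatio      = LoanAmount / AppraisedValue
      PaymentToIncomeRatio  = MonthlyPayment / Income
      StructureToValueRatio = StructureValue / AppraisedValue
    A zero Income or AppraisedValue raises ZeroDivisionError.
    """
    view = dict(record)
    view["LoanToValueRatio"] = record["LoanAmount"] / record["AppraisedValue"]
    view["PaymentToIncomeRatio"] = record["MonthlyPayment"] / record["Income"]
    view["StructureToValueRatio"] = (
        record["StructureValue"] / record["AppraisedValue"]
    )
    return view


raw = {
    "Income": 4000,
    "MonthlyPayment": 800,
    "LoanAmount": 80000,
    "AppraisedValue": 100000,
    "StructureValue": 70000,
}
view = add_ratios(raw)
print(view["LoanToValueRatio"], view["PaymentToIncomeRatio"])
```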
The next step is to subset the data into two or more files that can be used for training and testing the neural network. Too many records in the training file can make training cumbersome, but if too few are used, they may not be a representative sample.
The records in the testing file are used as a "control group". You use the test records to determine how well the neural network performs on data it has not seen before. This is an indication of how well the neural network has generalized from the training records.
You decide to take a random sample from the bank's database to create a training file and testing file with 20 records each. In a real application, of course, it would be better to have larger files than this. The files could be database or spreadsheet, but this tutorial assumes the format is space-delimited text.
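One way to draw such a sample is sketched below. The function name, the fixed seed, and the stand-in record list are illustrative assumptions; the point is that the training and testing records are drawn at random and do not overlap.

```python
import random


def split_sample(records, train_size=20, test_size=20, seed=0):
    """Draw disjoint random training and testing samples from records."""
    needed = train_size + test_size
    if needed > len(records):
        raise ValueError("not enough records for the requested sample sizes")
    rng = random.Random(seed)  # seeded so the split is repeatable
    sample = rng.sample(records, needed)  # sampling without replacement
    return sample[:train_size], sample[train_size:]


all_records = list(range(1000))  # stand-in for the archived applications
train_records, test_records = split_sample(all_records)
print(len(train_records), len(test_records))
```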
Each of these files has the following fields:
Database Field | Description |
Income | Whole number up to $10 000 |
YearsEmployed | Whole number up to 30 |
Savings | Whole number up to $10 000 |
AppraisedValue | Whole number up to $1 000 000 |
Location | Character, one of RURAL, SUBURB, URBAN |
StructureValue | Whole number up to $1 000 000 |
LoanAmount | Whole number up to $1 000 000 |
MonthlyPayment | Whole number up to $10 000 |
LoanToValueRatio | Real number 0.00 to 1.00 |
PaymentToIncomeRatio | Real number 0.00 to 1.00 |
StructureToValueRatio | Real number 0.00 to 1.00 |
CreditScore | Real number 0.00 to 1.00 |
Approved | Character, either YES or NO |
InsufficientIncome | Character, either YES or NO |
UnstableEmployment | Character, either YES or NO |
PoorCredit | Character, either YES or NO |
LowPropertyValue | Character, either YES or NO |
LowStructureValue | Character, either YES or NO |
The last six fields represent the underwriter's responses. If the loan is approved, all the reason fields have the value NO. If the loan is not approved, at least one reason field will have the value YES.
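This relationship between the Approved field and the reason fields makes a convenient sanity check on the data files. The following sketch assumes each record is available as a dictionary keyed by field name; the helper name is hypothetical.

```python
REASON_FIELDS = (
    "InsufficientIncome",
    "UnstableEmployment",
    "PoorCredit",
    "LowPropertyValue",
    "LowStructureValue",
)


def is_consistent(record):
    """Check the Approved field against the five reason fields."""
    any_reason = any(record[field] == "YES" for field in REASON_FIELDS)
    if record["Approved"] == "YES":
        return not any_reason  # approved: every reason must be NO
    return any_reason  # not approved: at least one reason must be YES


approved = {"Approved": "YES", **{f: "NO" for f in REASON_FIELDS}}
rejected = {**approved, "Approved": "NO", "PoorCredit": "YES"}
print(is_consistent(approved), is_consistent(rejected))
```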
The record format for the data file is provided in the definition file examples/datafiles/Mortgage.dfn. You could use your favorite editor to create a definition file or use the one we provide. If you create your own, the fields labelled Approved and continuing through LowStructureValue are the output fields. These fields represent the mortgage underwriter's decision and are the fields that the network predicts based on the other data values.
If you have not already done so, start the Able Editor interactive development environment. The following sections show how to create an AbleNeuralClassifierAgent consisting of five objects: an Import bean for training data, an Import bean for testing data, a Filter bean for input, a neural network bean, and a Filter bean for output.
Note: In this tutorial, fields that are not discussed should not be changed from their initial values.
In this case it may be legitimate to reduce the accuracy threshold. Other options would be to customize the filter values, re-examine the data for inaccuracies and inconsistencies, or fine-tune the network. You could change the network architecture to include 10 units in the first hidden layer, for instance.
Change the number of units in the first hidden layer from 0 to 10. Press Generate Beans to update the network bean's architecture. Warning: with the current implementation, pressing Generate Beans will remove the Inspectors. Press the Start button. Does the average RMS error drop faster this time? With this data, the additional weights in the hidden layer are needed to exactly match over 60% of the records in the test data set. If you compare only the final result, you will see that the only incorrect record was index 4, which was approved by the Mortgage Underwriter agent but rejected according to the data. Analyzing that record may cast some doubt on the ability of the human mortgage underwriter!
You've used Able's autotrain feature to train the network - now we'll explain how it works in more detail.
From the property page labelled Neural Classifier, you used the Start button to both train and test the network. While in TRAIN mode, the network adjusts its weights until the desired accuracy specified by Minimum Percent Correct is attained. It then internally switches to TEST mode: the test data source becomes the active data source, and each record in it is classified and the number of correct calculations recorded. If the test data source then meets the specified Minimum Percent Correct value, training stops. If not, the network mode is changed back to TRAIN, and the train dataset becomes active once again. The Train/Test ratio value specifies the number of passes or epochs which are used to continue to train the network. When that value is reached, the network again switches to TEST mode and calculates the percentage of records correctly classified in the test data source. The cycle continues until the network meets the Minimum Percent Correct for the test data, or until the Maximum Passes value is reached.
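The cycle described above can be summarized in a short sketch. This is a reading of the description, not ABLE's implementation: the function, its parameters, and the FakeNet stand-in are all hypothetical, and details such as how passes are counted are assumptions.

```python
def autotrain(train_one_pass, test_percent_correct,
              min_percent_correct, train_test_ratio, max_passes):
    """Alternate TRAIN and TEST phases; return (passes_used, succeeded).

    train_one_pass(): runs one epoch, returns percent correct on train data.
    test_percent_correct(): returns percent correct on test data.
    """
    if train_test_ratio < 1:
        raise ValueError("train_test_ratio must be at least 1")
    passes = 0
    # TRAIN until the training accuracy reaches the threshold.
    while passes < max_passes:
        passes += 1
        if train_one_pass() >= min_percent_correct:
            break
    while True:
        # TEST: stop as soon as the test data meets the threshold.
        if test_percent_correct() >= min_percent_correct:
            return passes, True
        if passes >= max_passes:
            return passes, False
        # TRAIN for train_test_ratio more passes, then test again.
        for _ in range(train_test_ratio):
            if passes >= max_passes:
                break
            train_one_pass()
            passes += 1


class FakeNet:
    """Stand-in network whose accuracy grows with each epoch."""

    def __init__(self):
        self.epochs = 0

    def train(self):
        self.epochs += 1
        return min(100, self.epochs * 10)

    def test(self):
        return min(100, self.epochs * 5)


net = FakeNet()
result = autotrain(net.train, net.test, min_percent_correct=60,
                   train_test_ratio=2, max_passes=100)
capped_net = FakeNet()
capped = autotrain(capped_net.train, capped_net.test, min_percent_correct=60,
                   train_test_ratio=2, max_passes=7)
print(result, capped)
```

The second call shows the other exit from the cycle: the pass limit is reached before the test data meets the threshold.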
To validate the network, you establish a criterion for the acceptable classification accuracy of the neural network on the test data set file. These are records the neural network has not seen during training. In this example we decided the network was trained sufficiently if it could produce the exact desired response for at least 60% of the test records. Then we experimented with another network architecture by adding a layer of hidden units. With this data, this architecture produced a higher classification accuracy and the network learned faster.
You can validate or test the network manually. Perhaps you would like to use another parameter such as average RMS error or PercentIncorrect as the primary validation criterion instead of PercentCorrect. Make the test data set the active data set by turning its data flow on. Switch the mode of the back propagation bean to TEST so the network's weights are not adjusted, i.e., so it doesn't train from the test data. Press the Cycle button on the Agent Editor toolbar to process all the records in the test data set. Use inspectors to view the results. Create text inspectors to monitor the average RMS error and PercentIncorrect parameters. Check to see that the values are within the limits you set. If not, turn on data flow for the train data set, set the network mode to TRAIN, and press the run button until you decide to repeat the manual test steps listed.
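To make the validation measures concrete, here is a minimal sketch of how a percent-correct figure and an average RMS error could be computed from predicted and target output vectors. The exact definitions ABLE uses for these parameters are not given in this tutorial, so the formulas below are assumptions.

```python
import math


def percent_correct(predicted, target):
    """Percentage of records whose outputs match the target exactly."""
    matches = sum(1 for p, t in zip(predicted, target) if p == t)
    return 100.0 * matches / len(target)


def average_rms_error(predicted, target):
    """Mean over records of the RMS error across each record's outputs."""
    total = 0.0
    for p, t in zip(predicted, target):
        squared = sum((pi - ti) ** 2 for pi, ti in zip(p, t))
        total += math.sqrt(squared / len(t))
    return total / len(target)


predicted_outputs = [[1, 0], [0, 1]]
target_outputs = [[1, 0], [1, 1]]
print(percent_correct(predicted_outputs, target_outputs))
print(average_rms_error(predicted_outputs, target_outputs))
```

PercentIncorrect would then simply be 100 minus the percent-correct value under the same assumption.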
Whether you validate your agent with Autotrain or manually, how well the neural network performs on the test data should indicate how it will perform when it is deployed as an application.
In this section the mortgage underwriting problem described previously is solved by using only fuzzy rules. The problem definition and the available data and information are the same. However, the underwriting decision is to be made by a set of fuzzy rules so no training is involved, and the data identified as 'output' for the neural network is not used. As you can see, solving problems with rules is a very different process than with neural networks. Rather than using the "rules of thumb" provided by the mortgage underwriter purely as guidance in analyzing the data, these rules of thumb are used directly in the form of fuzzy rules.
To fully illustrate the differences in application development using fuzzy rules, a summary of the problem analysis steps is repeated. The first step is to identify the fuzzy variables needed.
Referring to the available source data in Source Data Definition, and the related discussion in Defining the Source Data, the Income field values range from 0 to $10 000 per month, and will be defined as a fuzzy variable to avoid arbitrary distinct thresholds. Next, define three fuzzy sets or membership functions to reason about the Income variable. Low, Medium, and High are the fuzzy sets for Income.
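As an illustration of what the three Income fuzzy sets might look like, the sketch below uses two shoulder functions and a triangle. The breakpoints (2000, 5000, 8000) are invented for the example and are not taken from the tutorial's ruleset.

```python
def left_shoulder(x, full, zero):
    """Membership 1.0 up to `full`, falling linearly to 0.0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def right_shoulder(x, zero, full):
    """Membership 0.0 up to `zero`, rising linearly to 1.0 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangle(x, left, peak, right):
    """Membership rising from `left` to `peak`, then falling to `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def income_memberships(income):
    # Hypothetical breakpoints over the $0 to $10 000 monthly income range.
    return {
        "Low": left_shoulder(income, 2000, 5000),
        "Medium": triangle(income, 2000, 5000, 8000),
        "High": right_shoulder(income, 5000, 8000),
    }


print(income_memberships(3500))
```

An income of 3500 belongs partly to Low and partly to Medium, which is how fuzzy sets avoid an arbitrary hard threshold between the two.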
The variable YearsEmployed ranges from 0 to 30. However, in the neural network problem analysis, it was important only to differentiate between five states: less than 1 year, 1 year, 2 years, 3 years, and 4 or more. This can be represented as a fuzzy variable with five crisp fuzzy sets, or, the data can be pre-processed and represented by a discrete variable with five states.
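The pre-processing alternative can be sketched as a simple mapping from years employed to one of the five states. The state labels are arbitrary choices for this example.

```python
def employment_state(years_employed):
    """Map years employed onto five discrete states."""
    if years_employed < 0:
        raise ValueError("years_employed cannot be negative")
    if years_employed < 1:
        return "<1"
    if years_employed >= 4:
        return ">=4"
    return str(int(years_employed))  # "1", "2", or "3"


states = [employment_state(y) for y in (0, 1, 2, 3, 4, 30)]
print(states)
```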
Savings ranges from 0 to $10 000 and is represented as a fuzzy variable with Low, Medium, and High as fuzzy sets over that range.
AppraisedValue and LoanAmount both cover a large range from 0 to a million dollars. Represent these as fuzzy variables with three fuzzy sets, and manage the large range by defining our fuzzy set membership functions to map to an intuitive notion of High, Medium, and Low. There is no need for scaling as there is with neural networks.
The Location is a categorical variable with 3 possible values, RURAL, SUBURB, or URBAN.
The StructureValue is ignored since it is part of the StructureToValueRatio.
The LoanAmount is a fuzzy variable with fuzzy sets of LOW, MEDIUM, and HIGH and ranges from 0 to $1 000 000.
The MonthlyPayment can be ignored since it is included in the PaymentToIncomeRatio.
The five computed Ratios can be represented as fuzzy variables with a single fuzzy set over them. The fuzzy set can be segmented with a 0 value at 0, and a 1.0 membership when the ratio is equal to 1.0. Later we will show how some of these ratios can be computed inside the ruleset rather than requiring them to be included in the data.
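Finally, the response has six discrete outcomes: Approved, Insufficient income, Unstable employment, Poor credit, Low property value, and Low structure value. In the variable definitions that follow, these are represented by a Response variable (Approved or Rejected) plus one check variable for each of the five rejection reasons.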
Variable definitions are as follows:
Fuzzy Variable Definition Summary
Database Field | Representation |
Income | Fuzzy $0 to $10 000 (Low, Medium, High) |
YearsEmployed | Fuzzy 0 to 30 years (<1, <2, <3, <4, >=4) |
Savings | Fuzzy $0 to $10 000 (Low, Medium, High) |
AppraisedValue | Fuzzy $0 to $1 000 000 (Low, Medium, High) |
Location | Categorical ("RURAL", "SUBURB", "URBAN") |
StructureValue | Ignored |
LoanAmount | Fuzzy $0 to $1 000 000 (Low, Medium, High) |
MonthlyPayment | Ignored |
LoanToValueRatio | Continuous 0.0 to 1.0 |
PaymentToIncomeRatio | Continuous 0.0 to 1.0 |
StructureToValueRatio | Continuous 0.0 to 1.0 |
CreditScore | Continuous 0.0 to 1.0 |
Response | Categorical ("Approved", "Rejected") |
IncomeCheck | Categorical ("OK", "InsufficientIncome") |
EmploymentCheck | Categorical ("OK", "UnstableEmployment") |
CreditCheck | Categorical ("OK", "PoorCredit") |
PropertyCheck | Categorical ("OK", "LowPropertyValue") |
StructureCheck | Categorical ("OK", "LowStructureValue") |
Continue to work with the module developed in the first part of the tutorial by adding a Fuzzy Ruleset as an alternate processing method to the back propagation neural network approach.
Select the Rules tab and click on the RuleSet icon to add a new object to the canvas. Edit the ruleset by double-clicking it or by selecting Properties... from the context menu. The rule editor is opened with a prototype ruleset. At any time you may press the parse button to verify the rule syntax.
Here is the fuzzy rule-base to be defined; subsequent tutorial sections will discuss its main constituents. You can load the ruleset from examples\rules\MortgageF.arl, copy and paste the rules from below, or type them in from the keyboard. The ruleset file will be the most up to date.
ruleset MortgageUnderwriter {

  variables {
    Fuzzy Income = new Fuzzy(0, 10000) {
      Linear    HIGH   = new Linear   (4000, 10000, ARL.Up);
      Trapezoid MEDIUM = new Trapezoid(1000, 3000, 7000, 9000);
      Linear    LOW    = new Linear   (0, 4000, ARL.Down);
    };
    Fuzzy YearsEmployed = new Fuzzy(0, 30) {
      Segments MORETHAN2 = new Segments(2, 0, 2.1, 1, 30, 1);
      Segments LESSTHAN2 = new ~Segments(MORETHAN2);
    };
    Fuzzy Savings = new Fuzzy(0, 10000) {
      Linear HIGH   = new Linear(4000, 10000, ARL.Up);
      Pi     MEDIUM = new Pi    (5000, 4000);
      Linear LOW    = new Linear(0, 4000, ARL.Down);
    };
    Fuzzy AppraisedValue = new Fuzzy(0, 1000000) {
      Sigmoid   HIGH   = new Sigmoid  (120000, 200000, 300000, ARL.Up);
      Trapezoid MEDIUM = new Trapezoid(40000, 85000, 120000, 300000);
      Linear    LOW    = new Linear   (0, 85000, ARL.Down);
    };
    Categorical Location = {"RURAL", "SUBURB", "URBAN"};
    Fuzzy StructureValue = new Fuzzy(0, 1000000) {
      Linear Unused = new Linear(0.0, 1.0, ARL.Up);
    };
    Fuzzy LoanAmount = new Fuzzy(0, 1000000) {
      Linear    HIGH   = new Linear   (120000, 200000, ARL.Up);
      Trapezoid MEDIUM = new Trapezoid(40000, 85000, 120000, 200000);
      Linear    LOW    = new Linear   (0, 85000, ARL.Down);
    };
    Fuzzy MonthlyPayment = new Fuzzy(0, 3000) {
      Linear Unused = new Linear(0.0, 1.0, ARL.Up);
    };
    Fuzzy LoanToValueRatio = new Fuzzy(0.0, 1.0) {
      Linear HIGH = new Linear(0.85, 0.851, ARL.Up);
    };
    Fuzzy PaymentToIncomeRatio = new Fuzzy(0.0, 1.0) {
      Linear LOW       = new Linear(0.25, 0.35, ARL.Down);
      Linear HIGH      = new ~Linear(LOW);
      Linear EXCESSIVE = new Linear(0.35, 0.45, ARL.Up);
    };
    Fuzzy StructureToValueRatio = new Fuzzy(0.0, 1.0) {
      Linear  LOWRURAL = new Linear (0.4, 0.6, ARL.Down);
      Sigmoid LOW      = new Sigmoid(0.65, 0.75, 0.85, ARL.Down);
    };
    Fuzzy CreditScore = new Fuzzy(0.0, 1.0) {
      Linear POOR = new Linear(0.8, 0.801, ARL.Down);
    };
    Categorical Response        = {"Approved", "Rejected"};
    Categorical IncomeCheck     = {"OK", "InsufficientIncome"};
    Categorical EmploymentCheck = {"OK", "UnstableEmployment"};
    Categorical CreditCheck     = {"OK", "PoorCredit"};
    Categorical PropertyCheck   = {"OK", "LowPropertyValue"};
    Categorical StructureCheck  = {"OK", "LowStructureValue"};

    //----------------------------------------------------------
    // Fields in the input buffer for supervised learning by
    // neural networks; not needed for rule processing but can
    // be used to confirm rules arrive at expected conclusion
    //----------------------------------------------------------
    Categorical HumanApproved           = {"YES", "NO"};
    Categorical HumanInsufficientIncome = {"YES", "NO"};
    Categorical HumanUnstableEmployment = {"YES", "NO"};
    Categorical HumanPoorCredit         = {"YES", "NO"};
    Categorical HumanLowPropertyValue   = {"YES", "NO"};
    Categorical HumanLowStructureValue  = {"YES", "NO"};
  }

  // Use the following inputs{} statement if variable values are
  // provided by assertions in the "preProcess" ruleblock below.
  inputs {};

  // Use the following inputs{} statement if variable values are
  // to come from the ruleset's inputBuffer (which must be
  // primed by some external process before the ruleset is
  // invoked).
  /*
  inputs {Income, YearsEmployed, Savings, AppraisedValue, Location,
          StructureValue, LoanAmount, MonthlyPayment, LoanToValueRatio,
          PaymentToIncomeRatio, StructureToValueRatio, CreditScore,
          HumanApproved, HumanInsufficientIncome, HumanUnstableEmployment,
          HumanPoorCredit, HumanLowPropertyValue, HumanLowStructureValue};
  */

  // Values to place into the output buffer when inferencing is
  // complete.
  outputs {Response, IncomeCheck, EmploymentCheck, CreditCheck,
           PropertyCheck, StructureCheck};

  void preProcess() using Fuzzy {
    //-----------------------------------------------------------------------
    // Optionally initialize the InputVariables here for
    // running the ruleset within the ruleset editor.
    // Delete these values if the InputBuffer is to provide
    // values!
    //
    // For each pair of statements below, the first one shown
    // should be OK; the commented second statement should cause
    // the check to fail.
    //-----------------------------------------------------------------------
    : Income = 945;
    // : Income = 315;            // causes InsufficientIncome rejection
    : YearsEmployed = 5;
    // : YearsEmployed = 1;       // causes UnstableEmployment rejection
    : Savings = 10000;
    // : Savings = 2;             // causes InsufficientIncome rejection
    : AppraisedValue = 30000;
    : Location = "URBAN";
    : StructureValue = 27000;
    // : StructureValue = 10000;  // causes LowStructureValue rejection
    : LoanAmount = 20000;
    // : LoanAmount = 30000;      // causes LowPropertyValue rejection
    : MonthlyPayment = 315;
    : CreditScore = 0.9;
    // : CreditScore = 0.5;       // causes PoorCredit rejection
  }

  void process() using Fuzzy {
    : setControlParameter(ARL.InferenceMethod, ARL.FuzzyAdd);   // { MinMax | FuzzyAdd | ProductOr }
    : setControlParameter(ARL.CorrelationMethod, ARL.Product);  // { Product | Minimum }
    : setControlParameter(ARL.DefuzzifyMethod, ARL.Centroid);   // { Centroid | MaxHeight }
    : setControlParameter(ARL.AlphaCut, 0.10);                  // 0.0 < n < 1.0

    : Response = "Rejected";
    : IncomeCheck = "OK";
    : EmploymentCheck = "OK";
    : CreditCheck = "OK";
    : PropertyCheck = "OK";
    : StructureCheck = "OK";

    R1: LoanToValueRatio = LoanAmount / AppraisedValue;
    R2: PaymentToIncomeRatio = (MonthlyPayment / Income);
    R3: StructureToValueRatio = (StructureValue / AppraisedValue);

    I1: if (PaymentToIncomeRatio is EXCESSIVE)
          then IncomeCheck = "InsufficientIncome";
    I2: if (PaymentToIncomeRatio is HIGH and Savings is not HIGH)
          then IncomeCheck = "InsufficientIncome";
    E1: if (YearsEmployed is LESSTHAN2)
          then EmploymentCheck = "UnstableEmployment";
    C1: if (CreditScore is POOR)
          then CreditCheck = "PoorCredit";
    P1: if (LoanToValueRatio is HIGH)
          then PropertyCheck = "LowPropertyValue";
    S1: if (Location == "RURAL" and StructureToValueRatio is LOWRURAL)
          then StructureCheck = "LowStructureValue";
    S2: if (Location == "URBAN" and StructureToValueRatio is LOW)
          then StructureCheck = "LowStructureValue";
    S3: if (Location == "SUBURB" and StructureToValueRatio is LOW)
          then StructureCheck = "LowStructureValue";

    Result: if (IncomeCheck == "OK" and
                EmploymentCheck == "OK" and
                CreditCheck == "OK" and
                PropertyCheck == "OK" and
                StructureCheck == "OK")
              then Response = "Approved";
  }
}
The ruleset for this tutorial has these major sections:
Each of the variables shown in Fuzzy Variable Definition needs to be defined, and its fuzzy set membership functions specified. For example, Income is to be a fuzzy variable with a minimum of 0 and a maximum of 10000, with fuzzy sets of low, medium, and high income. This is defined like this in our example:
Fuzzy Income = new Fuzzy(0, 10000) {
  Linear    HIGH   = new Linear   (4000, 10000, ARL.Up);
  Trapezoid MEDIUM = new Trapezoid(1000, 3000, 7000, 9000);
  Linear    LOW    = new Linear   (0, 4000, ARL.Down);
};
Each fuzzy set has a set definition which defines a shape. The predefined shape Linear takes 3 parameters: two points within the specified variable range, and a direction. Multiple combinations of linguistic hedges can be used to modify the sets; these modifiers include words such as above, below, very, not, and about.
We define LOW as income values between 0 (definitely LOW in any currency system!) and 4000. A value is a member of the LOW set if it is below the line drawn from (0,1) to (4000,0). The trapezoid for the MEDIUM set requires 4 inputs, and results in a space outlined by the points (1000,0) to (3000,1) to (7000,1) to (9000,0).
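To make the geometry concrete, here is a minimal Python sketch of the Linear and Trapezoid shapes as described above. It is not ABLE code: the function names are hypothetical, and the sketch assumes that membership is clamped to 0.0 or 1.0 outside the two points of a Linear set, which the tutorial does not state explicitly.

```python
def linear_down(x, lo, hi):
    """Membership 1.0 at or below lo, falling linearly to 0.0 at or above hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def linear_up(x, lo, hi):
    """Membership 0.0 at or below lo, rising linearly to 1.0 at or above hi."""
    return 1.0 - linear_down(x, lo, hi)


def trapezoid(x, a, b, c, d):
    """Membership rising from (a, 0) to (b, 1), flat to (c, 1), falling to (d, 0)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


income = 2000
low = linear_down(income, 0, 4000)                   # Income LOW
medium = trapezoid(income, 1000, 3000, 7000, 9000)   # Income MEDIUM
high = linear_up(income, 4000, 10000)                # Income HIGH
print(low, medium, high)  # 0.5 0.5 0.0
```

An income of 2000 belongs partly to LOW and partly to MEDIUM at the same time, which is exactly the overlap that a crisp threshold cannot express.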
You can view your fuzzy set definitions graphically. First verify your ruleset by pressing the parse button. Then view the fuzzy set definitions by selecting the corresponding toolbar button, or by opening the RuleSet menu and selecting View fuzzy sets. Each variable defined will be listed in the left column. Each fuzzy set will be displayed in the right column, with a composite of all fuzzy sets for a given variable adjacent to the variable chart.
In the composite for the Income variable, Income is plotted on the X axis and set membership on the Y axis. The LOW set is under the green line, MEDIUM under the blue, and HIGH under the red. Note the overlap between the sets, which is characteristic of fuzzy logic.
To create a categorical variable such as the Location, enter a list of values like this:
Categorical Location = {"RURAL", "SUBURB", "URBAN"};
Define the remaining variables as shown in Fuzzy Variable Definition by looking at the fuzzy ruleset, or copy/paste them from another source. Note that we have tried to use many of the sets supported by Able, and included the use of the Complement hedge.
Fuzzy rules are in the following RuleLabel: If ... Then ... form:
void main() {
  I1: if (PaymentToIncomeRatio is EXCESSIVE)
        then IncomeCheck = "InsufficientIncome";
}
Each of the underwriters' "rules of thumb" can readily be expressed in this format. Hedges such as SLIGHTLY or VERY may be used to alter a fuzzy set boundary. When hedges are used, their impact can be seen in the Fuzzy Set definition; each hedge will be graphed to the right of the column containing the User-defined Set.
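As a rough illustration of how rules I1 and I2 of the ruleset behave, the following Python sketch evaluates them by hand. It is not how ABLE implements inferencing: it assumes the common fuzzy operators (minimum for "and", one minus membership for "not" and for the complement), and it assumes a rule only takes effect when its truth value exceeds the alpha cut of 0.10. The function names are hypothetical.

```python
ALPHA_CUT = 0.10  # mirrors the ARL.AlphaCut setting in the ruleset


def linear_up(x, lo, hi):
    """Membership 0.0 at or below lo, rising linearly to 1.0 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def linear_down(x, lo, hi):
    """Membership 1.0 at or below lo, falling linearly to 0.0 at or above hi."""
    return 1.0 - linear_up(x, lo, hi)


def income_check(income, monthly_payment, savings):
    """Evaluate rules I1 and I2 under the assumptions stated above."""
    ratio = monthly_payment / income                 # rule R2
    excessive = linear_up(ratio, 0.35, 0.45)         # PaymentToIncomeRatio EXCESSIVE
    high = 1.0 - linear_down(ratio, 0.25, 0.35)      # HIGH is the complement of LOW
    savings_high = linear_up(savings, 4000, 10000)   # Savings HIGH

    i1 = excessive                                   # I1: ratio is EXCESSIVE
    i2 = min(high, 1.0 - savings_high)               # I2: ratio is HIGH and Savings is not HIGH
    if i1 > ALPHA_CUT or i2 > ALPHA_CUT:
        return "InsufficientIncome"
    return "OK"


print(income_check(945, 315, 10000))  # OK
print(income_check(945, 315, 2))      # InsufficientIncome
```

With the sample values from the preProcess block, the payment-to-income ratio is about 0.33, which is fairly HIGH, but savings of 10000 are fully HIGH, so I2 does not apply. Dropping savings to 2 lets I2 take effect, matching the comment in the ruleset.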
Save the ruleset from the File pulldown menu.
Create a data buffer connection between the Import and the MortgageUnderwriter beans. Open an inspector on the MortgageUnderwriter bean. Press the Step button on the Agent Editor panel. Compare the output from the ruleset to the approval values from a real mortgage underwriter to see if they make the same decisions.
You may notice that the human mortgage underwriter frequently approved mortgages even though the rules of thumb rejected the application. Often with this data, it is the StructureToValueRatio that is apparently given less credence.
You can verify which rule is causing the LowStructureValue response by turning debug on. Go to the ruleset editor panel and, from the Ruleset pulldown, select the Debug -> Inferencing -> High option. Now open an inspector on the MortgageUnderwriter ruleset object. Select Data from the pulldown menu, and then Parameters. Remove any existing parameters, and select the inferTraceObject. You should be able to see the rule that fired and rejected the application because of LowStructureValue.
Experiment with ways to obtain a better match to the decisions made by the human underwriter:
If your bank wishes to relax its rules in order to issue more mortgages, you could use these same techniques to modify the rules. The rules can, of course, also be tightened in order to reject borderline applications and issue lower-risk mortgages.
The ruleset MortgageB.arl is simpler, and enforces the underwriters' rules of thumb more rigidly than the fuzzy solution just provided. The rules themselves are straightforward and are presented here and contrasted with the fuzzy implementation.
Recall the rule that the PaymentToIncomeRatio must be no more than 0.28, or no more than 0.35 if the applicant has "high" savings. We used a fuzzy set to define the savings level earlier. In this case, use the 0.35 limit if the applicant has twelve months of mortgage payments saved.
This example uses mathematical operations to calculate the LoanToValueRatio, PaymentToIncomeRatio, and StructureToValueRatio rather than requiring them to be provided in the data. Functions can be provided by static libraries such as java.lang.Math, Java classes such as java.util.Vector, or custom Java classes.
ruleset BooleanUnderwriter {

  // if you modify the inference method to Backward, import this library
  // library com.ibm.able.rules.AbleGUILib;

  variables {
    Continuous  Income                = new Continuous(0.0, 10000.0);
    Continuous  YearsEmployed         = new Continuous(0.0, 30.0);
    Continuous  Savings               = new Continuous(0.0, 10000.0);
    Continuous  AppraisedValue        = new Continuous(0.0, 1000000.0);
    Categorical Location              = {"RURAL", "SUBURB", "URBAN"};
    Continuous  StructureValue        = new Continuous(0.0, 1000000.0);
    Continuous  LoanAmount            = new Continuous(0.0, 1000000.0);
    Continuous  MonthlyPayment        = new Continuous(0.0, 3000.0);
    Continuous  LoanToValueRatio      = new Continuous(0.0, 1.0);
    Continuous  PaymentToIncomeRatio  = new Continuous(0.0, 1.0);
    Continuous  StructureToValueRatio = new Continuous(0.0, 1.0);
    Continuous  CreditScore           = new Continuous(0.0, 1.0);
    Categorical Response        = {"Approved", "Rejected"};
    Categorical IncomeCheck     = {"OK", "InsufficientIncome"};
    Categorical EmploymentCheck = {"OK", "UnstableEmployment"};
    Categorical CreditCheck     = {"OK", "PoorCredit"};
    Categorical PropertyCheck   = {"OK", "LowPropertyValue"};
    Categorical StructureCheck  = {"OK", "LowStructureValue"};

    //----------------------------------------------------------
    // Fields in the input buffer for supervised learning by
    // neural networks; not needed for rule processing but can
    // be used to confirm rules arrive at expected conclusion
    //----------------------------------------------------------
    Categorical HumanApproved           = {"YES", "NO"};
    Categorical HumanInsufficientIncome = {"YES", "NO"};
    Categorical HumanUnstableEmployment = {"YES", "NO"};
    Categorical HumanPoorCredit         = {"YES", "NO"};
    Categorical HumanLowPropertyValue   = {"YES", "NO"};
    Categorical HumanLowStructureValue  = {"YES", "NO"};
  } // end of Variables block

  inputs {}; // use if inputs provided by assertions in init ruleblock

  // use next block if inputs provided from inputBuffer
  /*
  inputs{Income, YearsEmployed, Savings, AppraisedValue, Location,
         StructureValue, LoanAmount, MonthlyPayment, LoanToValueRatio,
         PaymentToIncomeRatio, StructureToValueRatio, CreditScore,
         HumanApproved, HumanInsufficientIncome, HumanUnstableEmployment,
         HumanPoorCredit, HumanLowPropertyValue, HumanLowStructureValue};
  */

  outputs {Response, IncomeCheck, EmploymentCheck, CreditCheck,
           PropertyCheck, StructureCheck};

  void process() using Forward {   // using Backward {
    // if you modify the inference method to Backward, set a goal
    // setControlParameter(ARL.Goal, "Response") ;

    //-----------------------------------------------------------------------
    // Optionally initialize the InputVariables here for
    // running the ruleset within the editor.
    // Delete these values if InputBuffer is to provide values!
    // First value given should be OK, commented second should fail check.
    //-----------------------------------------------------------------------
    : Income = 945.0;
    // : Income = 315;            // causes InsufficientIncome rejection
    : YearsEmployed = 5;
    // : YearsEmployed = 1;       // causes UnstableEmployment rejection
    : Savings = 10000;
    // : Savings = 2;             // causes InsufficientIncome rejection
    : AppraisedValue = 30000;
    : Location = "URBAN";
    : StructureValue = 27000;
    // : StructureValue = 10000;  // causes LowStructureValue rejection
    : LoanAmount = 20000;
    // : LoanAmount = 30000;      // causes LowPropertyValue rejection
    : MonthlyPayment = 315;
    : CreditScore = 0.9;
    // : CreditScore = 0.5;       // causes PoorCredit rejection

    : Response = "Rejected";
    : IncomeCheck = "InsufficientIncome";
    : EmploymentCheck = "UnstableEmployment";
    : CreditCheck = "PoorCredit";
    : PropertyCheck = "LowPropertyValue";
    : StructureCheck = "LowStructureValue";

    R1: LoanToValueRatio = LoanAmount / AppraisedValue;
    R2: PaymentToIncomeRatio = MonthlyPayment / Income;
    R3: StructureToValueRatio = StructureValue / AppraisedValue;

    I1: if (PaymentToIncomeRatio <= 0.28)
          then IncomeCheck = "OK";
    I2: if (PaymentToIncomeRatio > 0.28 and
            PaymentToIncomeRatio <= 0.35 and
            Savings >= 12 * MonthlyPayment)
          then IncomeCheck = "OK";
    E1: if (YearsEmployed >= 2)
          then EmploymentCheck = "OK";
    C1: if (CreditScore >= 0.8)
          then CreditCheck = "OK";
    P1: if (LoanToValueRatio <= 0.85)
          then PropertyCheck = "OK";
    S1: if (Location == "RURAL" and StructureToValueRatio >= 0.5)
          then StructureCheck = "OK";
    S2: if (Location == "URBAN" and StructureToValueRatio >= 0.85)
          then StructureCheck = "OK";
    S3: if (Location == "SUBURB" and StructureToValueRatio >= 0.85)
          then StructureCheck = "OK";

    Result: if (IncomeCheck == "OK" and
                EmploymentCheck == "OK" and
                CreditCheck == "OK" and
                PropertyCheck == "OK" and
                StructureCheck == "OK")
              then Response = "Approved";
  }
}
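For readers who want to check the Boolean thresholds by hand, the same logic can be sketched in plain Python. This is an illustration only, not ABLE code: the function name and the dictionary layout are hypothetical, while the thresholds and the sample values are taken from the BooleanUnderwriter ruleset.

```python
def underwrite(income, years_employed, savings, appraised_value, location,
               structure_value, loan_amount, monthly_payment, credit_score):
    """Apply the crisp rules R1-R3, I1-I2, E1, C1, P1, S1-S3 and Result."""
    # R1-R3: ratios are computed rather than supplied in the data
    loan_to_value = loan_amount / appraised_value
    payment_to_income = monthly_payment / income
    structure_to_value = structure_value / appraised_value

    # Every check starts out failed and is set to "OK" only when a rule passes
    checks = {
        "IncomeCheck": "InsufficientIncome",
        "EmploymentCheck": "UnstableEmployment",
        "CreditCheck": "PoorCredit",
        "PropertyCheck": "LowPropertyValue",
        "StructureCheck": "LowStructureValue",
    }

    if payment_to_income <= 0.28:                                            # I1
        checks["IncomeCheck"] = "OK"
    if 0.28 < payment_to_income <= 0.35 and savings >= 12 * monthly_payment:  # I2
        checks["IncomeCheck"] = "OK"
    if years_employed >= 2:                                                  # E1
        checks["EmploymentCheck"] = "OK"
    if credit_score >= 0.8:                                                  # C1
        checks["CreditCheck"] = "OK"
    if loan_to_value <= 0.85:                                                # P1
        checks["PropertyCheck"] = "OK"

    # S1-S3: the minimum structure-to-value ratio depends on the location
    min_structure_ratio = {"RURAL": 0.5, "URBAN": 0.85, "SUBURB": 0.85}
    if location in min_structure_ratio and \
            structure_to_value >= min_structure_ratio[location]:
        checks["StructureCheck"] = "OK"

    # Result: approve only when all five checks are OK
    all_ok = all(value == "OK" for value in checks.values())
    return ("Approved" if all_ok else "Rejected"), checks


response, checks = underwrite(945, 5, 10000, 30000, "URBAN",
                              27000, 20000, 315, 0.9)
print(response)  # Approved
```

Replacing any one of the sample values with its commented-out alternative from the ruleset (for example a StructureValue of 10000) flips the corresponding check and the overall response to Rejected.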
Now that you've created an agent that has three different ways to calculate whether a mortgage request should be granted, you'd like to embed this capability in an application. Without actually writing an application, you can simulate this function as follows:
Typically you would deploy an agent in an application program. The import bean acts like an event generator, and the underwriter agent listens for and handles mortgage request events. The agent calculates a response, puts that response in its output buffer, and sends a dataChanged event which is received by the inspector. The inspector shows the result of the mortgage request as well as the input values in its outputBuffer and inputBuffer, respectively.