ABLE 2.0.0 07/02/2003 10:25:01 | ||||||||||
java.lang.Object
  |
  +--com.ibm.able.AbleObject
        |
        +--com.ibm.able.beans.AbleAbstractImport
This abstract class provides common interfaces to import data sources for Able Beans.
An Import bean's primary function is to read data from a data source and parse each record into the outputBuffer array when processed. AbleAbstractImport can load all the records into memory or, optionally, cache a fixed quantity of records. The number of records cached is specified by bufferSize. This object handles the caching. Each time all records in the data source have been processed, it sends an end-of-file event and increments the numEpochs value.
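The caching and epoch counting described above can be illustrated with a minimal, self-contained sketch. It does not use the ABLE classes: EpochCountingSource and its in-memory List source are hypothetical stand-ins, and restarting at the first record after the last one is an assumption. Only the bufferSize convention (0 loads everything, a positive value reads that many records per block) and the numEpochs counter follow the description in this document.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an import bean; not part of the ABLE API. */
final class EpochCountingSource {
    private final List<String> source;  // stands in for the external data source
    private final int bufferSize;       // 0 = load everything, >0 = records per block
    private List<String> buffer = new ArrayList<>();
    private int bufferRecordIndex = 0;  // position inside the current block
    private int recordsRead = 0;        // records pulled from the source in this pass
    private long numEpochs = 0;         // completed passes over the source

    EpochCountingSource(List<String> source, int bufferSize) {
        if (source.isEmpty() || bufferSize < 0) {
            throw new IllegalArgumentException("need records and bufferSize >= 0");
        }
        this.source = source;
        this.bufferSize = bufferSize;
    }

    /** Read the next block of records, or all of them when bufferSize is 0. */
    private void loadNextBlock() {
        int blockLen = (bufferSize == 0) ? source.size() : bufferSize;
        int end = Math.min(recordsRead + blockLen, source.size());
        buffer = new ArrayList<>(source.subList(recordsRead, end));
        recordsRead = end;
        bufferRecordIndex = 0;
    }

    /** Return the next record and count an epoch once the last record is handed out. */
    String process() {
        if (bufferRecordIndex >= buffer.size()) {
            loadNextBlock();
        }
        String record = buffer.get(bufferRecordIndex++);
        boolean blockDone = bufferRecordIndex >= buffer.size();
        if (blockDone && recordsRead >= source.size()) {
            numEpochs++;      // the documented bean also sends an end-of-file event here
            recordsRead = 0;  // assumption: the next call starts again at the first record
        }
        return record;
    }

    long getNumEpochs() {
        return numEpochs;
    }
}
```

With five records and a bufferSize of 2, the sketch reads blocks of 2, 2, and 1 records, and the epoch counter advances on the fifth call to process().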
An Import uses an AbleImportData object to handle the I/O. Meta-data must be provided in order for an AbleImportData to create field variables. When the data source is first opened, it is scanned to determine the number of records. On this first pass it also computes min/mean/max values for continuous fields, creates symbol-to-index mappings for categorical fields, and creates number-to-index mappings for discrete fields. To force additional data sources within the same agent to use the same definition, set computeStatistics to false.
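The first-pass statistics can be pictured with the following sketch. It is not the AbleImportData implementation: FirstPassStats and both method names are hypothetical, and assigning indices in order of first appearance is an assumption, since this page does not say how indices are ordered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper illustrating first-pass field statistics; not part of the ABLE API. */
final class FirstPassStats {

    /** Returns {min, mean, max} for the values of one continuous field. */
    static double[] minMeanMax(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double min = values[0];
        double max = values[0];
        double sum = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        return new double[] { min, sum / values.length, max };
    }

    /** Maps each distinct symbol of a categorical field to an index (assumed: first-seen order). */
    static Map<String, Integer> symbolIndex(String[] symbols) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String s : symbols) {
            index.putIfAbsent(s, index.size());
        }
        return index;
    }
}
```

A number-to-index mapping for discrete fields could be built the same way with numeric keys in place of String keys.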
An AbleAbstractImport can be used to generate an AbleFilter bean which will translate the data in the manner specified in the meta-data definition file. Field usage can be initialized from a data definition file (a *.dfn file) for text import beans. It can also be specified interactively on the customizer's data panel for import objects such as database imports whose metadata does not include field usage.
When an Import is processed, it populates the outputBuffer array with elements from the data source. If the data consists solely of continuous fields, a double array is used; otherwise, a String array is populated. Records may be processed sequentially from the data source, or in random sequence. When buffering is used, the records are randomly retrieved from within each buffer. After all records in the buffer have been processed, the next buffer of records is retrieved.
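Random retrieval within each buffer can be sketched as below. This is an illustration only: BlockShuffle and its methods are hypothetical names, and the use of a Fisher-Yates shuffle is an assumption, since the page states only that records are randomly retrieved from within each buffer before the next buffer is read.

```java
import java.util.Random;

/** Hypothetical illustration of per-buffer random access; not part of the ABLE API. */
final class BlockShuffle {

    /** Returns a random permutation of 0..n-1 using a Fisher-Yates shuffle. */
    static int[] randomIndices(int n, Random rng) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
        }
        return idx;
    }

    /** Order in which records are visited: block by block, shuffled inside each block. */
    static int[] visitOrder(int numRecords, int bufferSize, Random rng) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive in this sketch");
        }
        int[] order = new int[numRecords];
        int out = 0;
        for (int start = 0; start < numRecords; start += bufferSize) {
            int len = Math.min(bufferSize, numRecords - start);
            for (int k : randomIndices(len, rng)) {
                order[out++] = start + k;
            }
        }
        return order;
    }
}
```

Every record is still visited exactly once per pass. Only the order inside a block changes, so a record in the second block is never produced before the first block is exhausted.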
Field Summary | | |
---|---|---|
protected int | bufferRecordIndex | Current record in the buffer file being processed. |
protected int | bufferSize | The maximum number of records to read in a block from this data source. |
protected boolean | computeStatistics | A boolean indicating that metadata is to be opened and field statistics are to be computed when the data source is opened. |
protected boolean | cycleRelative | A flag indicating the cycleSize is relative to the file size, i.e., a multiplier. |
protected double | cycleSize | When cycleRelative is false, cycleSize is the raw number of records to process in a cycle. |
static java.lang.String | defaultName | Value assigned to name by default. |
protected boolean | eof | When the last record in the file has been processed, eof is true. |
protected java.util.Vector | fieldList | A Vector of AbleField objects describing the data source. |
protected AbleImportData | importData | The AbleImportData object referenced by this import. |
protected long | numEpochs | The number of times this data source has processed all records it contains. |
protected java.util.Vector | numericData | A Vector of double arrays containing records from the database table. |
protected int | numFieldsPerRecord | The number of fields in a record from a data source. |
protected long | numRecords | The total number of records in this data source. |
double[] | outNum | A double array used in calculating the output buffer. |
java.lang.Object[] | outSym | A String array used in calculating the output buffer. |
protected int[] | randomIndices | An array of indices used when records are randomly accessed. |
protected boolean | randomizeData | Determines whether to output records from the data source in random or sequential order. |
protected long | recordIndex | Current record in the entire data file being processed. |
protected long | recordsRead | The number of records read from the start of the data source. |
protected java.util.Vector | textData | A Vector of String arrays containing records from the database table. |
Fields inherited from class com.ibm.able.AbleObject |
---|
changed, chgSupport, comment, dataFlowEnabled, destBufferConnections, eventQueue, fileName, inputBuffer, listeners, logger, name, outputBuffer, parent, propertyConnectionMgr, sourceBufferConnections, state, stateChgSupport, trace |
Constructor Summary | |
---|---|
AbleAbstractImport() | Construct a default AbleAbstractImport object. |
AbleAbstractImport(java.lang.String name) | Construct an AbleAbstractImport object with specified name. |
Method Summary | | |
---|---|---|
void | close() | Close the data source, disable data flow, and set its state to Uninitiated. |
static java.lang.String | Copyright() | Determine the copyright of this class. |
void | endOfFile() | Notify any listeners that we are at the end of the file, and increment the epoch count numEpochs. |
boolean | eof() | Return whether the data source is at end of file. |
protected java.util.Vector | getAgentFieldList() | Get the default fieldList for this object's container agent. |
int | getBufferSize() | Return the buffer size. |
boolean | getComputeStatistics() | Return the value of the computeStatistics setting. |
long | getCurrentRecordIndex() | Get the index of the last record in the entire data file processed. |
double | getCycleSize() | Return the raw cycle size setting. |
java.lang.String | getCycleSizeAsString() | Return the raw cycle size formatted appropriately for the cycleRelative flag. |
java.util.Vector | getFieldList() | Return a Vector of AbleField objects defining each field in the data source. |
java.util.Vector | getFieldList(java.lang.String usageType) | Return a Vector of AbleField objects with the specified usage. |
void | getNextRecordBlock() | Read the next bufferSize records from the data source. |
int | getNormalizedRecordSize() | Return the size of the record after categorical and discrete fields are expanded. |
int | getNumberOfOutputFields() | Return the number of fields per record in the data source. |
long | getNumEpochs() | Retrieve the number of passes over the data, or epochs. |
long | getNumRecords() | Return the number of records in the data source. |
long | getRecordsRead() | Return the current count of records read from the beginning of the data source. |
long | getStepsPerCycle() | Calculate and return the number of steps in a cycle from the raw cycle size, using the cycleRelative flag. |
void | init() | Open the data source. |
boolean | isAllNumericData() | Return true if all fields are "continuous", and false if any are "discrete" or "categorical". |
boolean | isCycleRelative() | Return whether the raw cycle size is to be interpreted as a factor of the number of records in the data source, or as an absolute number of records. |
boolean | isRandomizeData() | Return whether records are processed in random sequence or not. |
boolean | isReady() | Indicate whether the importData is ready to provide data. |
void | open() | Open the data source if it is ready. |
void | process() | Get the next record from the data source and place its contents in the outputBuffer. |
void | processAbleEvent(AbleEvent e) | Process an AbleEvent sent by another Able bean. |
void | processTimerEvent() | Process a timer expiration event synchronously; that is, on the same thread as the caller. |
void | quitAll() | Close an open data source. |
void | reset() | Set processing options to default values, and re-initialize (reopen) the data source. |
void | setBufferSize(int size) | Set the buffer size, which determines whether to load the entire data source (=0) or just pieces of it (>0) into memory. |
void | setComputeStatistics(boolean computeStatistics) | Set the value of the computeStatistics flag. |
void | setCycleSize(double cycleSize, boolean relative) | Set the cycle size and definition for its use. |
protected void | setDefaults() | Set processing options to default values. |
void | setFieldList(java.util.Vector fieldList) | Set a Vector of AbleField objects defining each field in the data source. |
void | setRandomizeData(boolean state) | Set the randomize flag so records are processed in random sequence. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface com.ibm.able.AbleBean |
---|
getComment, getLogger, getName, getParent, getState, getTraceLogger, init, isChanged, process, removeAllConnections, resumeAll, setChanged, setComment, setLogger, setName, setParent, setState, setTraceLogger, suspendAll |
Methods inherited from interface com.ibm.able.AbleEventListener |
---|
handleAbleEvent |
Methods inherited from interface com.ibm.able.AbleEventListenerManager |
---|
addAbleEventListener, dataChanged, getAbleEventListeners, notifyAbleEventListeners, removeAbleEventListener |
Methods inherited from interface com.ibm.able.AbleEventQueueProcessor |
---|
processNoEventProcessingEnabledSituation |
Methods inherited from interface com.ibm.able.AblePropertyChangeManager |
---|
addPropertyChangeListener, addPropertyConnection, getPropertyConnectionManager, removeAllPropertyConnections, removePropertyChangeListener, removePropertyConnection |
Methods inherited from interface com.ibm.able.AbleSerializable |
---|
getFileName, restoreFromFile, restoreFromFile, saveToFile, saveToFile, setFileName |
Field Detail |
public static final java.lang.String defaultName
protected int numFieldsPerRecord
protected int bufferSize
protected long numRecords
protected long recordsRead
protected long numEpochs
protected AbleImportData importData
protected boolean randomizeData
protected int[] randomIndices
protected boolean computeStatistics
protected java.util.Vector fieldList
protected transient java.util.Vector textData
protected transient java.util.Vector numericData
protected long recordIndex
protected int bufferRecordIndex
protected boolean eof
protected double cycleSize
protected boolean cycleRelative
public transient double[] outNum
public transient java.lang.Object[] outSym
Constructor Detail |
public AbleAbstractImport() throws AbleException
public AbleAbstractImport(java.lang.String name) throws AbleException
Parameters: name - A String containing the name used to identify this bean.
Method Detail |
public void init() throws AbleException
Specified by: init in interface AbleBean
Overrides: init in class AbleObject
See Also: open()
public void open() throws AbleException
See Also: init()
protected java.util.Vector getAgentFieldList() throws AbleException
The result is the fieldList from the active data source if there is one, or the first opened data source otherwise. It will be an empty Vector if the container has no open data sources. If this object is not in a container, return the object's current fieldList.
public void process() throws AbleException
Specified by: process in interface AbleBean
Overrides: process in class AbleObject
Following copied from interface: com.ibm.able.AbleBean
Throws: AbleException - If an error occurs.
public void processTimerEvent() throws AbleException
This method is called by our AbleEventQueue whenever the following conditions are all true:
This method calls process to populate the output buffer with the next data record.
Specified by: processTimerEvent in interface AbleEventQueueProcessor
Overrides: processTimerEvent in class AbleObject
See Also: AbleObject.setSleepTime(long), AbleObject.setTimerEventProcessingEnabled(boolean), AbleObject.startEnabledEventProcessing()
public void reset() throws AbleException
Specified by: reset in interface AbleBean
Overrides: reset in class AbleObject
Following copied from interface: com.ibm.able.AbleBean
Throws: AbleException - If an error occurs.
protected void setDefaults() throws AbleException
public void quitAll() throws AbleException
Specified by: quitAll in interface AbleBean
Overrides: quitAll in class AbleObject
See Also: close()
public void close() throws AbleException
public int getNumberOfOutputFields()
Specified by: getNumberOfOutputFields in interface AbleDataSource
public long getNumRecords()
Specified by: getNumRecords in interface AbleDataSource
public void getNextRecordBlock() throws AbleException
public void setBufferSize(int size) throws AbleException
public int getBufferSize()
public long getRecordsRead()
public long getCurrentRecordIndex()
public void processAbleEvent(AbleEvent e) throws AbleException
Specified by: processAbleEvent in interface AbleEventQueueProcessor
Overrides: processAbleEvent in class AbleObject
Following copied from interface: com.ibm.able.AbleEventQueueProcessor
Parameters: theAbleEvent - The event to process.
Throws: AbleException - If an error occurs.
public long getNumEpochs()
Specified by: getNumEpochs in interface AbleDataSource
public java.util.Vector getFieldList()
Specified by: getFieldList in interface AbleDataSource
public void setFieldList(java.util.Vector fieldList)
Specified by: setFieldList in interface AbleDataSource
Parameters: fieldList - A Vector of fields. The Vector may be empty, but not null.
public java.util.Vector getFieldList(java.lang.String usageType) throws AbleException
See AbleData.UsageType(String).
Parameters: usageType - A String denoting usage type.
public int getNormalizedRecordSize() throws AbleException
A categorical field, for example, is encoded in 1-of-N format in which one field in a boolean vector is used to indicate the value present. The expanded field thus is the same length as the number of unique categorical values.
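The 1-of-N expansion can be made concrete with a short sketch. OneOfN and its methods are hypothetical and are not the ABLE implementation. Two points are assumptions: continuous fields are taken to occupy a single slot, and discrete fields are taken to expand the same way as categorical ones.

```java
/** Hypothetical illustration of 1-of-N expansion; not part of the ABLE API. */
final class OneOfN {

    /** Encodes the category at position `index` out of `n` unique values as a 1-of-N vector. */
    static double[] encode(int index, int n) {
        if (index < 0 || index >= n) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        double[] v = new double[n];  // all zeros
        v[index] = 1.0;              // exactly one slot marks the value present
        return v;
    }

    /**
     * Size of a record after expansion. Each entry gives the number of unique
     * values of a categorical or discrete field, or 0 for a continuous field
     * (assumed to take one slot).
     */
    static int normalizedRecordSize(int[] uniqueValuesPerField) {
        int size = 0;
        for (int n : uniqueValuesPerField) {
            size += (n == 0) ? 1 : n;
        }
        return size;
    }
}
```

Under these assumptions, a record with two continuous fields, one categorical field with three unique values, and one with four unique values expands from 4 fields to 1 + 3 + 1 + 4 = 9 values, whatever the field order.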
public boolean isAllNumericData()
Specified by: isAllNumericData in interface AbleDataSource
public boolean eof()
public boolean isReady() throws AbleException
Specified by: isReady in interface AbleDataSource
public void endOfFile()
public void setRandomizeData(boolean state)
public boolean isRandomizeData()
public double getCycleSize()
public java.lang.String getCycleSizeAsString()
public boolean isCycleRelative()
public long getStepsPerCycle()
Specified by: getStepsPerCycle in interface AbleDataSource
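How the raw cycle size, the cycleRelative flag, and the record count combine into a number of steps can be sketched as follows. CycleSize.stepsPerCycle is a hypothetical helper, not the ABLE method, and rounding to the nearest whole step is an assumption, since this page does not say how fractional results are handled.

```java
/** Hypothetical illustration of the cycle-size calculation; not part of the ABLE API. */
final class CycleSize {

    /**
     * If relative is true, cycleSize is a multiplier of numRecords;
     * otherwise it is the absolute number of records per cycle.
     */
    static long stepsPerCycle(double cycleSize, boolean relative, long numRecords) {
        double steps = relative ? cycleSize * numRecords : cycleSize;
        return Math.round(steps);  // assumption: round to the nearest whole step
    }
}
```

For a data source of 200 records, a relative cycle size of 1.5 gives 300 steps per cycle, while an absolute cycle size of 50 gives 50 steps regardless of the record count.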
public void setCycleSize(double cycleSize, boolean relative)
Parameters:
cycleSize - A double value.
relative - A boolean indicating how to interpret the cycleSize. If true, the number of steps in a cycle is the cycleSize multiplied by the number of records in the data source. If false, the number is the absolute number of records to process in a cycle.
public boolean getComputeStatistics()
public void setComputeStatistics(boolean computeStatistics)
public static java.lang.String Copyright()