CHAIMS: Compiling High-level Access Interfaces for Multi-site Software
Towards the Science of Component Engineering

Incremental result extraction and progress monitoring in other protocols
written by Dorothea Beringer, March 1999

Incremental result extraction and progress monitoring
in JointFlow
JointFlow is the Joint Workflow Management Facility of CORBA [JointFlow98]. It is an implementation of interface 4 (I4) of the WfMC workflow reference model [WfMC94] on top of CORBA. JointFlow adopts an object-oriented view of workflow management: processes, activities, requesters, resources, process managers, event audits etc. are distributed objects that collaborate to get the overall job done. Each of these objects can be accessed over an ORB; the JointFlow specification defines their interfaces in IDL.
Starting execution of work
Simplified, work is started in the following way (see the sketch below):
-
A requester (an instance of WfRequester) gets the reference of a process manager (an instance of WfProcessMgr) from somewhere, e.g. over a naming service. The requester then invokes the create_process operation of the process manager, thus prompting it to create a new process (an instance of WfProcess). Both the requester and the process get references to each other, allowing future communication. Finally, the requester sets the context attributes in the process and invokes the start operation of the process. Context attributes not only contain the data, or pointers to the data, on which work should be performed; they can also determine which results are desired and what kind of work should be done (this can replace the notion of having several methods to choose from in a megamodule).
-
The process may be a wrapper of legacy code or of a physical device, or it may contain several execution steps encapsulated in activities (instances of WfActivity) that are instantiated during the execution of the workflow represented by this process instance. An activity might need external (human) help. This is achieved by assigning a resource (an instance of WfResource) to the activity, either by the activity itself or in conjunction with some resource manager that may or may not be implemented as an instance of a WfProcess. An activity may also itself act as a requester and start some other process via a process manager to do the work for it. The process contains references to its activities; an activity contains references to its resource assignments.
Prior to requesting the creation of a new process, a requester can get the signatures of the context as well as the result attributes (ProcessDataInfo) from the process manager. Yet this information only contains a list of pairs of attribute name and type name (strings denoting IDL types). If the type name is a complex type, no further information about its structure is available. Nor is any constraint information available, e.g. for determining which context attributes have to be set for specific results (for a travel reservation service, for instance, name information is always necessary, whereas the remaining information differs depending on whether a hotel room is reserved, a car rented, or a flight booked).
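A minimal sketch of this startup sequence, written in Java against the standard org.omg.CORBA and CosNaming classes, might look as follows. Only create_process and start are operation names taken from the text above; the stub and helper types (WfProcessMgr, WfProcess, WfRequester and their helpers) would be generated from the JointFlow IDL, and the operations result_signature and set_process_context as well as the NameValue pairs are assumptions made purely for illustration.

    import org.omg.CORBA.ORB;
    import org.omg.CosNaming.NamingContextExt;
    import org.omg.CosNaming.NamingContextExtHelper;

    // Sketch: a requester creating and starting a JointFlow process.
    // WfProcessMgr, WfProcess, WfRequesterImpl, NameValue, result_signature()
    // and set_process_context() are assumed names; only create_process() and
    // start() are taken from the description above.
    public class StartWork {
      public static void main(String[] args) throws Exception {
        ORB orb = ORB.init(args, null);

        // 1. Get the process manager reference, e.g. over the naming service.
        NamingContextExt nc = NamingContextExtHelper.narrow(
            orb.resolve_initial_references("NameService"));
        WfProcessMgr mgr = WfProcessMgrHelper.narrow(nc.resolve_str("TravelReservation"));

        // Optionally inspect the signatures of the context and result attributes
        // (ProcessDataInfo: pairs of attribute name and IDL type name):
        // ProcessDataInfo[] resultSig = mgr.result_signature();   // assumed name

        // 2. Create a requester object so that the process can later call back
        //    via receive_event().
        WfRequester requester = new WfRequesterImpl(orb);          // assumed servant

        // 3. Ask the manager to create a new process; requester and process
        //    now hold references to each other.
        WfProcess process = mgr.create_process(requester);

        // 4. Set the context attributes: the data to work on and the selection
        //    of the desired results / kind of work.
        NameValue[] context = { new NameValue("customer_name", "A. Smith") };
        process.set_process_context(context);                      // assumed name

        // 5. Start the work.
        process.start();
      }
    }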
Monitoring the progress of work
Both processes and activities are in one of the following states: running, not_running.not_started, not_running.suspended, completed (successfully), terminated (unsuccessfully), aborted (unsuccessfully). Assignments are either in the state potential (assignment not yet accepted by the resource) or accepted. A requester can query the state of a process, the states of the activities of the process (by querying and navigating the links from processes to activities), and the states of assignments (by querying and navigating the links from activities to assignments).
If the requester knows the workflow model with all its different steps implemented by the process, the requester might even be able to interpret the state information and figure out how far the process has progressed. Yet for a requester not familiar with the internal workflow logic of the process, this status information is not of great help: the requester can only determine whether the process is complete or not (though if the process is complete the requester would be notified via its receive_event operation anyway), and whether the process is running or suspended. Without intimate knowledge of the workflow logic and model, the requester has no way of determining how far a process has advanced or how much more time it might take. The same is true whenever a process either does not have sub-activities, or these sub-activities are hidden, as would be the case for autonomous processes located in other organizations that care about privacy.
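A sketch of such state polling from the requester's side is given below (Java). The navigation and accessor operations state(), name(), get_activities() and get_assignments() are names assumed for this illustration; the actual operations are defined by the JointFlow IDL.

    // Sketch: polling the state of a process, its activities and assignments.
    // state(), name(), get_activities() and get_assignments() are assumed names.
    void printStates(WfProcess process) {
      System.out.println("process state: " + process.state());
      for (WfActivity activity : process.get_activities()) {
        System.out.println("  activity " + activity.name() + ": " + activity.state());
        for (WfAssignment assignment : activity.get_assignments()) {
          // potential = not yet accepted by the resource, otherwise accepted
          System.out.println("    assignment: " + assignment.state());
        }
      }
    }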
Comparison to CPAM:
-
CPAM supports the notion that certain services may offer progress information (e.g. 40% done) that can be monitored. This information is more detailed than just running or complete, yet more aggregated and better suited for autonomous services than detailed information about component activities.
-
JointFlow signals completion of work to the requester and to the container process, whereas in CPAM this information has to be polled for by repeated progress monitoring.
There is a possible workaround in JointFlow for getting progress information: a process can have a special result attribute for progress information, and the process is free to update that attribute regularly. It can then send a WfEventAudit with the old and new value of the progress indicator to its requester after each update. Yet this result attribute cannot be polled by a requester (in contrast to CPAM and SWAP), because get_result only returns results if all of them are available at least as intermediate results.
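Seen from the process implementation, this workaround could look roughly as follows (Java sketch). Only receive_event(WfEventAudit) is an operation named above; the helper methods getProgressAttribute(), setProgressAttribute() and makeDataChangedEvent() are assumptions of this sketch.

    // Sketch of the workaround: the process keeps a special result attribute
    // "progress" up to date and pushes a data-change event to its requester.
    void reportProgress(WfProcessImpl process, WfRequester requester, int percentDone)
        throws Exception {
      int oldValue = process.getProgressAttribute();        // assumed helper
      process.setProgressAttribute(percentDone);            // e.g. 40 for "40% done"

      // build a WfEventAudit carrying the old and the new value of the attribute
      WfEventAudit event =
          process.makeDataChangedEvent("progress", oldValue, percentDone);

      // push it to the requester; the requester cannot poll for this attribute,
      // since get_result only answers once all results are at least intermediate
      requester.receive_event(event);
    }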
Extracting results incrementally
Both processes and activities have an operation get_result():ProcessData (returning a list of name-value pairs). Get_result does not take any input parameter and thus returns all the results. The get_result operation may be used to request intermediate result data, which may or may not be provided depending upon the details of the work being performed. If the results cannot yet be obtained, get_result raises an exception; any data returned in that case is meaningless. The results are not final until the unit of work is completed, resulting in a state change to the state complete and a notification of the container process via the operation complete() or of the requester via the operation receive_event(WfEventAudit). This kind of extraction of intermediate results corresponds to the progressive extraction of all result attributes in CPAM.
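From the requester's side, progressive extraction with get_result thus reduces to a call that either yields all (possibly intermediate) results or raises an exception, as in the following Java sketch; the exception name ResultNotAvailable and the representation of ProcessData as an array of name-value pairs are assumptions.

    // Sketch: requesting all (possibly intermediate) results of a process.
    // ResultNotAvailable and the NameValue[] representation of ProcessData
    // are assumed for this sketch.
    NameValue[] pollResults(WfProcess process) {
      try {
        // returns ALL result attributes, at least in an intermediate version
        return process.get_result();
      } catch (ResultNotAvailable e) {
        // not even intermediate results are available yet
        return null;
      }
    }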
The following features are not available in JointFlow:
-
Partial extraction with get_result: get_result extracts either all or none of the result values; there is no mechanism to return only some of the values and raise an exception only for the others.
-
Progressive extraction with get_result of just one result attribute when not all other results are yet ready for intermediate or final extraction.
-
There is no accuracy information for intermediate results, unless it is kept in a separate result attribute. There is no way to find out the accuracy, or whether intermediate results are ready at all, without actually requesting these results. The same is true for getting result updates over a WfEventAudit. Especially for large amounts of data this might be quite costly.
Whenever a process or an activity undergoes a state change, a change in the context data, or a change in the result data, the process or activity creates a WfEventAudit containing the old and the new information (for context and result data, only those attributes that have changed are listed). A process can send this event to its requester via the receive_event operation. Though no such operation is mentioned for sending events from an activity to its container process, some mechanism for it must exist, because a process is required to store all the events from itself and its activities in its history log. Drawback: in the case of large data, this messaging mechanism could result in huge amounts of traffic, especially if many increments of intermediate results are made available.
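On the requester's side, receive_event then has to distinguish the different kinds of events; a rough sketch is given below. The accessors event_type(), source() and changed_result_data() as well as the event-type strings are assumptions made for this illustration; the JointFlow specification defines its own audit-event structure.

    // Sketch of a requester reacting to incoming WfEventAudit events.
    // event_type(), source(), changed_result_data() and the event-type strings
    // are assumed names; JointFlow defines the actual audit-event structure.
    public void receive_event(WfEventAudit event) {
      if ("processCompleted".equals(event.event_type())) {
        // the final results can now be fetched with get_result()
        onCompleted(event.source());
      } else if ("processResultChanged".equals(event.event_type())) {
        // partial, progressive extraction: the event contains only the
        // changed result attributes (old and new values)
        for (NameValue changed : event.changed_result_data()) {
          updateLocalCopy(changed);
        }
      }
    }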
General question: how, when and by whom are process and activity objects
deleted?
Using WfEventAudit for partial and progressive result extraction:
-
A process can notify its requester of all changes in result and context
data, in addition to state changes.
-
Such events only contain the data (name-value pairs of old and new values) that has changed, thus having the same effect as partial progressive extraction.
-
Drawback: it is the process that determines which events are to be sent, not the requester. Unless the process has a special context attribute that tells it whether and which data-change events to send, there is no way for the requester to inform a process about its preferences.
-
It is not clear which operations would be used to notify a process about
changed result data in activities.
Given the fact that there exist no descriptions of scenarios for partial and progressive extraction, that partial extraction is mentioned nowhere, and that progressive extraction is only mentioned in the context of defining the effect of get_result when a process or activity is not yet completed, the use of events for partial and progressive extraction seems to be incidental. It becomes possible because a process is required to log events, events for changes in data therefore exist, and the receive_event operation of the requester can receive all the different kinds of events, not only state-change events. Therefore, though partial as well as progressive extraction is possible in JointFlow, JointFlow has not been designed for it, and the way of doing it is rather inconsistent and incidental. Furthermore, a specific process must provide additional context and result attributes to fine-tune partial and progressive extraction (determining notification by events, progress indicator, accuracy indicator). Without special protocol support, i.e. without being an integral part of the syntax and semantics of the JointFlow specification and of the system designs based on it, it is doubtful that any services would provide partial and progressive result extraction or high-level progress information.
CORBA notification service
All objects in JointFlow can use the CORBA notification service for WfEventAudits as well as for additional events. Thus it is possible to implement any notification between any objects in a particular implementation. Drawback: it is not specified who should receive which events via the notification service, so implementations that use the notification service for communication between the objects of JointFlow are no longer compatible with each other. Notification via the CORBA notification service is mainly intended for the integration of other, outside systems.
Incremental result extraction and progress monitoring
in SWAP
SWAP (Simple Workflow Access Protocol) [SWAP98] is a proposal for a workflow protocol based on extending HTTP. It mainly implements I4 (and to some extent also I2 and I3) of the WfMC reference model. The different components of a workflow system are internet resources that implement one or several of the interfaces defined in SWAP. The three main interfaces are ProcessInstance, ProcessDefinition and Observer. The messages exchanged between these resources are extended HTTP messages with methods like PROPFIND, CREATEPROCESSINSTANCE etc. The data is encoded as text/xml in the body of the message.
Starting work
-
Somebody, e.g. a resource implementing the interface Observer, knows the URI of the ProcessDefinition it is interested in. With PROPFIND it can ask for information about the resource; this includes information about the names and types of context and result data. As the response is in XML, the protocol itself does not limit the amount and depth of the type information given. The SWAP specification does not specify the syntax and semantics of type information.
-
A process instance is created and started by sending a CREATEPROCESSINSTANCE message to the appropriate ProcessDefinition resource (see the sketch after this list). This message also contains the context data to be set and the URI of an observer resource that should be notified about completion and other events. The response returns the URI of the created process instance resource, which implements the interface ProcessInstance. Context data can also be set by sending PROPPATCH messages to the process instance. The process is started either automatically by the ProcessDefinition resource if the CREATEPROCESSINSTANCE message contains the startImmediately flag, or by sending a PROPPATCH message to the process instance with the new state running. Additional Observer resources can subscribe to a process instance at any time (the SWAP specification does not specify whether they will receive only state-change events or also data-change and role-change events).
-
A process instance resource can delegate work to other resources by creating ActivityObserver resources (a specialization of the Observer interface), which create new process instances via process definition resources or give work to some human being or legacy system.
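To illustrate the HTTP-level mechanics, the following Java sketch sends a CREATEPROCESSINSTANCE message to a process-definition URI over a plain socket. The host, the path and the shape of the XML body are purely illustrative; the concrete XML vocabulary for context data, observer URI and the startImmediately flag is defined by the SWAP draft and only hinted at in the comments.

    import java.io.*;
    import java.net.Socket;

    // Sketch: sending a SWAP CREATEPROCESSINSTANCE message as an extended HTTP
    // request. Host, path and body are illustrative placeholders; the concrete
    // XML elements are defined by the SWAP internet draft.
    public class SwapCreate {
      public static void main(String[] args) throws IOException {
        String body =
            "<?xml version=\"1.0\"?>\r\n" +
            "<!-- context data, observer URI and startImmediately flag go here, -->\r\n" +
            "<!-- using the XML vocabulary of the SWAP draft                    -->\r\n";

        try (Socket s = new Socket("workflow.example.com", 80)) {
          Writer out = new OutputStreamWriter(s.getOutputStream(), "ISO-8859-1");
          out.write("CREATEPROCESSINSTANCE /definitions/travel HTTP/1.1\r\n");
          out.write("Host: workflow.example.com\r\n");
          out.write("Content-Type: text/xml\r\n");
          out.write("Content-Length: " + body.length() + "\r\n");
          out.write("\r\n");
          out.write(body);
          out.flush();

          // The response carries the URI of the new process-instance resource.
          BufferedReader in = new BufferedReader(
              new InputStreamReader(s.getInputStream(), "ISO-8859-1"));
          for (String line; (line = in.readLine()) != null; ) {
            System.out.println(line);
          }
        }
      }
    }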
Result extraction and result monitoring
Results are extracted from a process instance by sending it a PROPFIND message. This message either returns all available results or, if it contains a list of result attributes to be returned, only the selected ones. Only result attributes that are available are returned. If requested attributes are not yet available, presumably an exception should be returned. SWAP does not specify whether the results returned by PROPFIND have to be final or not, though I rather assume they have to be final.
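A corresponding sketch for extracting selected result attributes with PROPFIND is shown below; again the path and the body listing the requested attributes are illustrative placeholders, and the real property-list syntax is defined by the SWAP draft.

    import java.io.*;
    import java.net.Socket;

    // Sketch: extracting selected result attributes from a process instance
    // with PROPFIND. Path and XML body are illustrative; the concrete
    // property-list syntax is defined by the SWAP internet draft.
    public class SwapPropfind {
      public static void main(String[] args) throws IOException {
        String body =
            "<?xml version=\"1.0\"?>\r\n" +
            "<!-- list of requested result attributes, e.g. hotelConfirmation, -->\r\n" +
            "<!-- encoded in the property-list vocabulary of the SWAP draft    -->\r\n";

        try (Socket s = new Socket("workflow.example.com", 80)) {
          Writer out = new OutputStreamWriter(s.getOutputStream(), "ISO-8859-1");
          out.write("PROPFIND /instances/travel-4711 HTTP/1.1\r\n" +
                    "Host: workflow.example.com\r\n" +
                    "Content-Type: text/xml\r\n" +
                    "Content-Length: " + body.length() + "\r\n\r\n" + body);
          out.flush();

          // The XML response contains only those requested attributes that are
          // currently available (plus the state of the process instance).
          BufferedReader in = new BufferedReader(
              new InputStreamReader(s.getInputStream(), "ISO-8859-1"));
          for (String line; (line = in.readLine()) != null; ) {
            System.out.println(line);
          }
        }
      }
    }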
Completion of work of a process instance or another resource is signaled to an observer with the COMPLETE message. This message also contains the result data: all the name-value pairs that represent the final set of data as of the time of completion. After sending the COMPLETE message the resource no longer has to exist.
A process instance can also send NOTIFY messages to an observer resource. These messages transmit state-change events, data-change events, and role-change events. Data-change events contain the names and values of the data items that have changed. Who determines whether an observer is notified about all or only some of the possible state, data (result and context) and role changes? Requiring notification of data changes as a default or as mandatory seems to be overkill, as all result attributes would be sent to the observer at least twice, once by NOTIFY and once by COMPLETE, or even more often if PROPFIND messages are used before a COMPLETE is received.
Process instances receive result data from other processes or legacy systems over the ActivityObserver interface. As activity observers they can receive COMPLETE messages and PROPPATCH messages. Both contain a list of result attributes as name-value pairs, though in the case of PROPPATCH this can be a partial list. SWAP does not specify whether the results may also be intermediate or not.
Process progress monitoring
PROPFIND not only returns all available result values, it also returns the state of the process instance and additional descriptive information about the process. Possible states can be specified by the process itself; PROPFIND also returns the list of all possible state values, yet in most cases this would probably just be not_yet_running, running, suspended, completed, terminated. A process instance can be asked for all the activities it contains (the URIs of the activity observers it contains), and these activity observers can then be asked for their state information, which mirrors the state of the process instance or legacy system they are observing. Drawbacks: see the section on JointFlow.
Overall progress information is not specified by SWAP, but it could be implemented by a special result attribute, assuming that result attributes can be changed over time. Such a result attribute could be extracted at any time by PROPFIND, independent of the availability of other result attributes. Drawback: PROPFIND always returns all possible information about a process instance; the returned result attribute values can be restricted to a selection, but returning them cannot be turned off entirely.
Summary
These mechanisms allow the following kind of result extraction and progress
monitoring:
-
Partial result extraction: Either pushing results via NOTIFY messages or pulling results via PROPFIND messages is possible. NOTIFY sends all new result data; PROPFIND returns all available result data, whether or not they have already been returned by a previous PROPFIND. Notification of result changes without also sending the changes, or asking for the status of results without also receiving the results, is not possible.
-
Progressive result extraction: After reading the SWAP specification it is not entirely clear whether progressive result updates in a process instance are allowed or not. If not, the result attributes would not be available until their values are final. If yes, then progressive results can be extracted either by pushing results via NOTIFY messages or by pulling results via PROPFIND messages. NOTIFY sends all changed/updated result data; PROPFIND returns all available result data, whether or not they have changed since the last PROPFIND. Accuracy indication is not provided; it would have to be implemented via additional result attributes. The same is true for a simple complete/not_yet_final status of individual result attributes.
-
Monitoring the state of processes: Using PROPFIND to query the status of the process instance and the states of sub-activities and sub-process instances.
-
Monitoring overall progress: Introducing an additional result attribute.
SWAP presumably does not inhibit incremental result extraction and progress monitoring. Partial result extraction is even very straightforward and supported quite well by PROPFIND as well as NOTIFY. Some issues around progressive result extraction are not clear. Also, the monitoring part for incremental result extraction and the overall progress information is quite weak. This is clearly due to the fact that incremental result extraction is not a main objective of SWAP, if it has been an objective at all.
Incremental result extraction and progress monitoring
in CORBA-DII
CORBA offers two modes for interaction between a client and remote servers: the static and the dynamic interface to an ORB. For the static interface, an IDL definition must exist that is compiled into stub code that can be linked with the client. The client then executes remote procedure calls as if the remote methods were local.
The dynamic invocation interface (DII) offers dynamic access where no stub code is necessary. The client somehow gets the reference to a remote object (e.g. from another method call), and the client somehow has to know the IDL of the remote object, i.e., the names of the methods and the parameters they take. The client then creates a request for a method of that object. Creating a request for a specific object instance takes the following parameters: the method name as a string, a pointer to a list of named values for all the IN, INOUT and OUT parameters of the method, a named value for the return value of the method, and some flags. A named value is a structure containing the name of the parameter, the value as type any (or a pointer to the value and a CORBA type code), the length of the parameter, and some flags. The ORB needs all the information in the named values to make sure the parameters are the ones the server expects. As this is not checked at compile time, it is checked at run time using information such as the type codes. Creating a request has many similarities to how method IDs are created in Java JNI for calling Java methods out of C code.
Once the request is created, the method can be invoked. This is done either synchronously with invoke or asynchronously with send (in fact, some flags allow more elaborate settings). Invoke returns after the invocation has finished, and the client can then read all OUT parameters in the named value list. In the case of a send, the client is not blocked. In order to figure out when the invocation has finished, the client can use get_response, either in a blocking mode (it waits until the invocation is done) or in a non-blocking mode. When/if the return status of get_response indicates that the invocation is done, the client can read the OUT parameters from the named value list.
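The following Java sketch shows such a deferred (asynchronous) invocation using the DII as mapped to Java, where the asynchronous call is named send_deferred and the non-blocking check for completion is poll_response. The stringified object reference, the operation name compute and its parameter are illustrative.

    import org.omg.CORBA.*;

    // Sketch: deferred (asynchronous) invocation via the DII, Java mapping.
    // The IOR, the operation name "compute" and its parameter are illustrative.
    public class DiiDeferredCall {
      public static void main(String[] args) throws Exception {
        ORB orb = ORB.init(args, null);

        // reference obtained "somehow", e.g. from a previous call or a file
        org.omg.CORBA.Object target = orb.string_to_object(args[0]);

        // build the request: operation name, named IN argument, return type
        Request request = target._request("compute");
        request.add_named_in_arg("problem_size").insert_long(1000);
        request.set_return_type(orb.get_primitive_tc(TCKind.tk_double));

        // asynchronous start of the invocation
        request.send_deferred();

        // progress monitoring is limited to done / not done
        while (!request.poll_response()) {
          System.out.println("not done yet ...");
          Thread.sleep(1000);
        }

        // collect the result; only now are return value and OUT parameters valid
        request.get_response();
        double result = request.return_value().extract_double();
        System.out.println("result: " + result);
      }
    }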
With the DII, asynchronous invocation of methods is thus supported in CORBA. The progress of an invocation can be monitored only as far as done/not done is concerned; no further progress information is returned (e.g. how much has been done). Incremental extraction of results (i.e. of OUT parameters as well as the return value of a method) is not supported by the DII. When creating a request, the parameters for the method can be inserted into the request step by step using add_arg on the request object, yet this only concerns the creation of the request on the client side and cannot be compared to SETPARAM in CHAIMS.
In order to mimic the incremental result extraction of CHAIMS, one could use asynchronous method invocation with the DII coupled with the event service of CORBA. The client could be implemented as a PullConsumer for a special event channel CHAIMSresults; the servers could push results into that channel as soon as they are available, together with accuracy information. Though event channels could be used for that purpose (we could require that every megamodule uses event channels for this), an integration of incremental result extraction and invocation progress monitoring into the access protocol itself is definitely more adequate if we consider this functionality to be an integral part of the protocol.
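A sketch of the client (pulling) side of this idea is given below, assuming an ORB that provides the standard CosEventChannelAdmin and CosEventComm Java mappings. The channel name CHAIMSresults comes from the text above; registering the channel in the naming service and the way result name, value and accuracy are packed into the event Any are assumptions of this sketch.

    import org.omg.CORBA.Any;
    import org.omg.CORBA.BooleanHolder;
    import org.omg.CORBA.ORB;
    import org.omg.CosEventChannelAdmin.EventChannel;
    import org.omg.CosEventChannelAdmin.EventChannelHelper;
    import org.omg.CosEventChannelAdmin.ProxyPullSupplier;
    import org.omg.CosNaming.NamingContextExt;
    import org.omg.CosNaming.NamingContextExtHelper;

    // Sketch: a CHAIMS client pulling incremental results from a CORBA event
    // channel named "CHAIMSresults". The channel registration in the naming
    // service and the encoding of result name, value and accuracy in the
    // event Any are assumptions of this sketch.
    public class ResultPuller {
      public static void main(String[] args) throws Exception {
        ORB orb = ORB.init(args, null);
        NamingContextExt nc = NamingContextExtHelper.narrow(
            orb.resolve_initial_references("NameService"));
        EventChannel channel = EventChannelHelper.narrow(nc.resolve_str("CHAIMSresults"));

        // connect as a pure pull-style consumer (a nil consumer reference is allowed)
        ProxyPullSupplier supplier = channel.for_consumers().obtain_pull_supplier();
        supplier.connect_pull_consumer(null);

        BooleanHolder hasEvent = new BooleanHolder();
        while (true) {
          // non-blocking check for the next incremental result event
          Any event = supplier.try_pull(hasEvent);
          if (hasEvent.value) {
            // the megamodule is assumed to pack result name, value and accuracy
            // into the Any; decoding is omitted here
            System.out.println("received incremental result event: " + event);
          } else {
            Thread.sleep(500);
          }
        }
      }
    }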
References
[JointFlow98] Workflow Management Facility, Revised Submission, OMG Document Number bom/98-06-07, July 1998.
[WfMC94] Workflow Management Coalition: The Workflow Reference Model, Document Number TC00-1003, November 1994.
[SWAP98] Keith Swenson: Simple Workflow Access Protocol (SWAP), IETF Internet Draft, August 1998.