The Collaboratory Interoperability Framework Common Application Programming Interface

Deb Agarwal, Peter Schabert, Nitya Narasimhan, Karlo Berket, Ian Foster, Steve Tuecke


1.0 Introduction

Scientific collaboratories are software systems designed to support scientific collaboration. Such systems couple diverse components, including videoconferencing, experiment monitoring/control, file sharing, and electronic notebooks. Traditionally these components have been developed in isolation without regard for interoperability. This use of distinct technical solutions might seem reasonable since each component has different communication requirements regarding reliability of delivery, number of recipients, and ordering of messages. However, these components need to be plug-and-play and to interoperate in the collaboratory environment. For example, an experiment control system may be required to communicate with an electronic notebook for automatic storage of collected data. We believe that an important first step towards such interoperability is to define a common communication application programming interface (API).

The common communication API defined in this document is intended to provide a simple, uniform interface to different low-level protocols, including reliable and unreliable unicast, different forms of multicast, and vendor-specific protocols. We focus on simplicity because we believe that the task of programming collaboratory components will often fall to people inexperienced in communication protocols. The currently available APIs for communication programming (e.g., TCP/IP sockets) are difficult to comprehend and use; our proposal is considerably simpler and will, we hope, facilitate development of collaboratory components by non-computer scientists. We emphasize uniformity in the interface because this allows low-level implementations to evolve without requiring reprogramming of applications.

Simplicity rather than performance is our primary design goal. The API is unlikely to provide the same level of performance as a professional programmer could get by using the various protocols natively. However, it is our intention to provide reasonable performance.

2.0 Background

It is anticipated that collaboratory components will require access to a suite of communication protocols that includes, at a minimum: unreliable unordered unicast (UDP), reliable source-ordered unicast (TCP), unreliable unordered multicast (IP Multicast), unreliable ordered multicast, reliable source-ordered multicast (XTP), and reliable ordered multicast (Totem). Although these protocols all exist, each has its own unique and complicated interface. Most of the protocols support the sockets interface, but this interface is relatively low level and requires a great deal of configuration and error handling on the part of the user.

The current sockets interface is used to support both stream-based and datagram-based connections and provides a variety of message send and receive functions, each with disjoint sets of arguments. The setsockopt function is the user's only means of customizing a socket connection, and as new protocols have been built they have made extensive use of the setsockopt function to set the protocol parameters. Creating an IP Multicast socket now takes a significant quantity of code to set all the necessary options. Emerging reliable multicast protocols support a richer set of functionality than can be represented with the sockets interface, and many have abandoned any pretense of supporting the sockets interface.

So, although the sockets interface appears to give a single general unifying API, in practice it encourages wide divergence in its use. Once a program has been built with a particular communication protocol as its underlying means of passing messages, it is a time-consuming task to convert the program to use an alternate communication protocol. This difficulty in conversion is generally not because the application is exploiting special features of the protocol but is instead caused by the differences in the functions utilized for communication over the protocol.

At first glance differences between communication protocol APIs appear to be necessary. Multicast communication needs representation mechanisms for membership and to join process groups, and TCP connections are stream rather than datagram oriented. It is our contention that the apparent differences in the API needs of the protocols are largely the product of attempts to conform to the sockets interface. The unicast connection is in fact a simplified multicast connection that has at most two members. Membership messages at the API of a TCP connection indicate when the connection is completed.

Of primary importance in designing the generic communication API is ease of use to allow relatively inexperienced computer programmers to build distributed programs. It is not, however, our intention to sacrifice functionality that is needed by the experienced computer programmer. In our experience most programs have relatively simple needs from the communication protocol and it is our expectation that a reasonable set of defaults for each of the communication protocols will in fact suffice for most programmers. An additional priority in design of the API is ease of conversion of a program to use a different protocol. This will allow a program to be developed and debugged using a unicast communication protocol and then easily expanded to allow multicast at a later date.

3.0 The Common API

This document describes the generic communication API proposed for the DOE2000 Collaboratory Interoperability Framework (CIF) project. The API is structured to meet the requirements of a message-passing, event-driven communication paradigm. The actual design will be developed in several phases. The first phase will provide basic communication functionality without event handling. The second phase will provide the ability for the application to register event handlers to be called to handle incoming messages and events on connections. During the third phase we will consider expanding the send and receive functions to provide data type specific functions. We will also incorporate the DOE2000 security project work so that a connection initiator can be authenticated and the proper authorizations checked prior to granting access. In addition, encryption capabilities will be provided.

The objective of this API is to obtain a uniform object-oriented interface that implements the common functionality required to interface to unicast and multicast communication. Four basic functions will be provided:

A connection is a communication session that is identified by a remote machine end-point or a multicast group. Once a connection has been opened, the application can use it to send and receive messages within the communication session. We also provide the ability to listen for connection requests. When a connection request comes in, a connection is created. Communication is assumed to be message-based rather than stream-based. The application (user) can set a small number of protocol-specific parameters before opening a connection, but the details of the underlying protocol are transparent to the user when operating on the connection. The application can thus switch between different underlying protocols with little effort, varying in particular

according to what is desired for the current communication session.

Protocols supported by the Common API:
Implementations of the following protocols are planned in order to achieve the levels of ordering and reliability guarantees supported by the API.

UUM : Unreliable Unordered Multicast (e.g., IP Multicast)
UOM : Unreliable Ordered Multicast (IP Multicast with out-of-order packets filtered)
UDP : User Datagram Protocol (unreliable unordered unicast)
TCP : Transmission Control Protocol (reliable source-ordered unicast)
ROM : Reliable Ordered Multicast (e.g., the Totem protocol)
RSM : Reliable Source-Ordered Multicast (e.g., the XTP protocol)
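
The UOM entry obtains ordering by filtering out-of-order packets from an unordered multicast stream. A minimal sketch of such a filter follows, assuming that each packet carries a sender identifier and an increasing per-sender sequence number. The class name OrderFilter, the string sender identifier, and the sequence-number field are hypothetical and are not part of the API described here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: drops packets that arrive out of order, per sender.
// Assumes every packet carries a sender id and an increasing sequence number.
class OrderFilter {
    // Highest sequence number delivered so far for each sender.
    private final Map<String, Long> highestSeen = new HashMap<>();

    // Returns true if the packet should be delivered, false if it is late
    // (or a duplicate) and must be discarded to preserve ordering.
    boolean accept(String sender, long seq) {
        Long last = highestSeen.get(sender);
        if (last != null && seq <= last) {
            return false;
        }
        highestSeen.put(sender, seq);
        return true;
    }
}
```

Because late packets are discarded rather than buffered, a filter of this kind gives ordered but still unreliable delivery.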


A pictorial representation of the CIF API is shown in the Figure above. The rest of this section gives the object-oriented description of the API design. We first give an overview of each of the classes and its methods (functions), followed by detailed descriptions. The classes do not define constructors because the CIFFactory class is used to construct these objects. The declarations in this document use Java syntax.

Connection Class -

The Connection class describes a communication connection and provides the methods for using and administering the connection. The Connection methods pass as arguments only parameters that are common to all the underlying protocols. Additional protocol-specific parameters have therefore to be obtained from the user by some other means.

The default parameters used for a connection are those appropriate to the particular connection type. The user can set or get the parameters by accessing the Parameters object. Users who choose not to set any parameters can proceed using the default parameters specified.

send() : to transmit a message (blocking)
receive() : to receive a message (this is a blocking call with a timeout)
close() : to terminate the opened connection
poll() : to check for the receipt of messages at the local connection endpoint
membership() : to get the membership of the connection

Additional Methods for querying a connection object about its state

isMember() : ask whether a process is a member of the connection
getLossStatistics() : get the current loss rate for the connection
getRTT() : get the current estimate of the round trip time
getThroughput() : get the current real throughput for the connection

Message Class -

Each message is received by the application as an object. The data is the original message as sent by the sending application process. The info block contains information about the message that has been added by the communication protocol including: sender identifier, timestamp, and message type.

data() : return the data portion of the message
len() : return the length of the data portion of the message
info() : return the user info block for the message

Listener Class -

In the case where the application wants to run a server that listens for multiple remote sites to connect in to a particular port using unicast, there is a listener class. The listener class will return a connection class object when a remote site connects to this port. The remote sites are not limited to connecting in one at a time.

getURL() : retrieve information for use in connecting to the listener
accept() : listen for new connection requests and create a connection object for the new connection when it is requested.

Parameters Class -

The default parameter settings for connections are expected to be correct for most applications. We realize, however, that there will inevitably be applications that require particular settings for the connection parameters. All settable parameters are accessible through objects of the Parameters class.

setFlowControl() : set the parameters related to flow control
setSecurity() : set the parameters related to security
setGeneral() : set the general parameters
setQoS() : set the priority request (quality of service) parameters

CIFFactory Class -

The CIFFactory provides a means of getting objects of the Connection, Listener, and Parameters classes without the application having to know or care about the specific protocol implementation. The user instead simply thinks of the returned object as being one of the generic class.

getConnection() : to construct a connection object and initiate a connection
getListener() : listener object constructor
getParameters() : get a new Parameters class object

The classes and their methods are described in greater detail in the following sections.


The Connection class


Sending a message

void send(msgtype, byte[], len) throws IOException { .. }

void send(byte[], len) throws IOException { .. }

Description : This is the method used to transmit messages. The send method accepts messages of variable length. However, message size limitations may be imposed either by the definition of the underlying protocol or by the network routers. The user can optionally define message types (msgtypes) and specify one on the send. The msgtype will be indicated in the message header when the message is received.
Arguments : The user can optionally specify an integer indicating a user-defined message type. Currently the message must be a byte array. The last argument defines the length of the byte array (message) being passed in.

Returns : Void
Exceptions : Throws all exceptions up (to be handled at a higher level), including standard IOExceptions and any method-specific exceptions defined.
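
As an illustration of how the two forms of send might relate, the sketch below has the two-argument form delegate to the three-argument form with a default message type. The class name SketchSender, the constant DEFAULT_MSGTYPE and its value, and the argument types are all assumptions; this document does not fix any of them. The sketch records messages rather than transmitting them.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the two send() forms; records messages instead of
// transmitting them so the behaviour can be inspected.
class SketchSender {
    static final int DEFAULT_MSGTYPE = 0; // assumed default, not from the spec

    final List<Integer> sentTypes = new ArrayList<>();
    final List<byte[]> sentData = new ArrayList<>();

    // Send the first len bytes of data, tagged with a user-defined type.
    void send(int msgtype, byte[] data, int len) throws IOException {
        if (data == null || len < 0 || len > data.length) {
            throw new IOException("invalid message length: " + len);
        }
        sentTypes.add(msgtype);
        // Copy so the caller may reuse its buffer after send() returns.
        sentData.add(Arrays.copyOf(data, len));
    }

    // Send without a message type: falls back to the assumed default type.
    void send(byte[] data, int len) throws IOException {
        send(DEFAULT_MSGTYPE, data, len);
    }
}
```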


Receiving a message

Message receive( ) throws IOException { .. }

Description : This is the method used to receive messages. It normally operates in blocking mode with a time-out. The buffer is dynamically allocated when required by the receive method. Information regarding the length of the message, the sender of the message, the time of transmission, and the order of the message is contained in the message header (see the Message class for the header description).

Arguments : None.
Returns : The message received (header included) as an object (or structure in C).
Exceptions : Throws all exceptions up (to be handled at a higher level), including standard IOExceptions and any method-specific exceptions (such as message-truncated or buffer too small).


Closing a connection

void close() throws IOException { .. }

Description : This is the method used to request the closing of a connection. The close method results in the application leaving any group of which it is a member before terminating the connection. After the close has been called there may be additional messages received on the connection to allow clean exits in the case of reliable multicast or unicast connections.
Arguments : None required. The connection closed is the same one previously initiated by the constructor of this object.
Returns : Void
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).


Check if there are any messages on a Connection

boolean poll() throws IOException { .. }

Description : Checks whether there are any messages on the incoming queue that are ready to be received on this connection.
Arguments : None required.
Returns : true if there is a message ready for receipt, or false if there is nothing ready to read.
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).
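
The relationship between the blocking receive and the non-blocking poll can be sketched with an in-memory queue. This is only an illustration under stated assumptions: the class name InboundQueue and the deliver method are hypothetical, messages are reduced to raw byte arrays, and signalling an expired time-out with java.net.SocketTimeoutException is our choice for the sketch, since this document does not say how a time-out is reported.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory stand-in for the incoming queue of a connection.
class InboundQueue {
    private final LinkedBlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final long timeoutMillis;

    InboundQueue(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called by the (imaginary) protocol layer when a message arrives.
    void deliver(byte[] message) {
        queue.add(message);
    }

    // Non-blocking check: is a message ready to be received?
    boolean poll() {
        return !queue.isEmpty();
    }

    // Blocks until a message arrives or the time-out expires.
    // Reporting the time-out as SocketTimeoutException is an assumption.
    byte[] receive() throws IOException {
        try {
            byte[] message = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (message == null) {
                throw new SocketTimeoutException("receive timed out");
            }
            return message;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IOException("receive interrupted", e);
        }
    }
}
```

An application that must not block can call poll first and call receive only when poll reports that a message is waiting.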


Check the current membership of a connection

Membership membership() throws IOException { .. }

Description : Find out the current membership of this connection. In the case of a unicast connection there will be at most 2 members. With a multicast connection it will be the membership as determined by the underlying communication protocol.
Arguments : None required.
Returns : A membership object containing the membership and number of members.
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).
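
One possible shape for the returned membership object is sketched below. It is an assumption made for illustration: the class name SketchMembership, the representation of members as strings, and the method names members, count, and contains are hypothetical, since this document specifies only that the object holds the membership and the number of members.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical shape for the object returned by membership().
class SketchMembership {
    private final Set<String> members;

    SketchMembership(Set<String> members) {
        // Defensive copy so later changes by the caller do not alter this snapshot.
        this.members = Collections.unmodifiableSet(new LinkedHashSet<>(members));
    }

    // The member identifiers in this snapshot (read-only).
    Set<String> members() {
        return members;
    }

    // Number of members; at most 2 for a unicast connection.
    int count() {
        return members.size();
    }

    // A Connection's isMember() could delegate to a lookup like this one.
    boolean contains(String id) {
        return members.contains(id);
    }
}
```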


The Message class


Get the Message Data

byte[] data() throws IOException { .. }

Description : This method returns the original byte array sent by the application at the other end.
Arguments : None required. The data returned is the data in this particular message.
Returns : The byte array
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).


Get the Message Length

long len() throws IOException { .. }

Description : This method returns the length of the original byte array sent by the application at the other end.
Arguments : None required. The length returned is the length of the data in this particular message.
Returns : The length of the data
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).


Get the Message Info Block

UserInfo info() throws IOException { .. }

Description : This method returns all the meta-data (not including the data) contained in the message. This information includes message type, timestamp, sender identifier, and data length. There will also be methods provided in the implementation for retrieving specific header meta-data items individually.
Arguments : None required. The user info returned is the user info block in this particular message.
Returns : The user info block associated with the message
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).
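
The three Message methods described above could be backed by a simple immutable object, as in the sketch below. The class names SketchMessage and SketchUserInfo, the constructor, and the field types are assumptions. Only the set of meta-data items (message type, timestamp, sender identifier, data length) comes from the description above. The sketch omits the IOException declarations because nothing in it performs I/O.

```java
import java.util.Arrays;

// Hypothetical info block holding the meta-data items listed above.
class SketchUserInfo {
    final int msgType;
    final long timestamp;
    final String sender;
    final long length;

    SketchUserInfo(int msgType, long timestamp, String sender, long length) {
        this.msgType = msgType;
        this.timestamp = timestamp;
        this.sender = sender;
        this.length = length;
    }
}

// Hypothetical message object: payload plus protocol-supplied meta-data.
class SketchMessage {
    private final byte[] data;
    private final SketchUserInfo info;

    SketchMessage(int msgType, long timestamp, String sender, byte[] payload) {
        this.data = Arrays.copyOf(payload, payload.length);
        this.info = new SketchUserInfo(msgType, timestamp, sender, payload.length);
    }

    // Returns a copy so callers cannot modify the stored message.
    byte[] data() {
        return Arrays.copyOf(data, data.length);
    }

    long len() {
        return data.length;
    }

    SketchUserInfo info() {
        return info;
    }
}
```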


The Listener class


Get the Listener's URL

String getURL() throws IOException { .. }

Description : This method returns the URL of the Listener. If the URL was not fully specified when the Listener was constructed this method can be used to obtain the fully specified URL in use by the listener. This allows the application to determine the port in use if the port was left unspecified to the constructor on a unicast listener. Other sites that want to be able to connect to the server will need this information.
Arguments : None required.
Returns : The URL associated with the Listener
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).


Accept a client connection request

Connection accept( timeout ) throws IOException { .. }

Description : Listen for the next client request for a connection to the server, accept the connection, and spawn a new Connection object associated with the request.
Arguments : The timeout specifies how long to wait for a connection request.
Returns : The Connection object that was constructed in accepting the connection request.
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).
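
For the TCP case, the Listener behaviour described above maps naturally onto java.net.ServerSocket. The sketch below is a hypothetical illustration, not the CIF implementation: the class name TcpListenerSketch is invented, a plain java.net.Socket stands in for the Connection object, the constructor takes a port number rather than a URL, and reporting the host as localhost is an assumption.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical TCP-backed listener. The "TCP://host:port" URL form follows
// the example used in this document; everything else is an assumption.
class TcpListenerSketch implements AutoCloseable {
    private final ServerSocket server;

    // Port 0 asks the operating system to choose a free port.
    TcpListenerSketch(int port) throws IOException {
        this.server = new ServerSocket(port);
    }

    // Report the fully specified URL, including the port actually in use.
    String getURL() {
        return "TCP://localhost:" + server.getLocalPort();
    }

    // Wait up to timeoutMillis for the next connection request.
    // Throws java.net.SocketTimeoutException if none arrives in time.
    Socket accept(int timeoutMillis) throws IOException {
        server.setSoTimeout(timeoutMillis);
        return server.accept();
    }

    @Override
    public void close() throws IOException {
        server.close();
    }
}
```

A server would typically call accept in a loop and hand each returned connection to its own thread, so that remote sites are not limited to connecting one at a time.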


The Parameters class


setFlowControl : set the parameters related to flow control
setSecurity : set the parameters related to security
setGeneral : set the general parameters
setQoS : set the priority request parameters

Getting and setting parameters
getFlowControl( ... ) throws IOException { .. }
getSecurity( ... ) throws IOException { .. }
getQoS( ... ) throws IOException { .. }
getGeneral( ... ) throws IOException { .. }
setFlowControl( ... ) throws IOException { .. }
setSecurity( ... ) throws IOException { .. }
setQoS( ... ) throws IOException { .. }
setGeneral( ... ) throws IOException { .. }
Description : The parameter manipulation methods are still under construction. The above list of methods is a guess at what these will look like. A decision regarding whether to break the parameters into groups that are set as a unit or into individual parameters has not been made.
Arguments : A list of parameters and requested settings.
Returns : still to be determined
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).

Some likely specific parameters include:
notifymembership : a flag set if the user wants to be notified of membership changes
packetsize : a numeric value for default size of transmitted packets
ttl : an integer value that may be set by the user to limit the scope of multicasts
timeout : default timeout to use for getConnection and receive method calls
encryption key : the key, if set, will be used to encrypt all messages sent on the connection
directions : whether this connection is send only, receive only, or full-duplex
setPacketMax : set the maximum size the packetizer should use for packets. The function of the packetizer is to fragment large messages into smaller packets at the sender to accommodate communication protocol and network limits on packet size.
The above list is expected to grow primarily when the user community requests access to additional functionality within the protocols.
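
The packetizer mentioned under setPacketMax fragments large messages at the sender. A minimal sketch of that fragmentation, together with the matching reassembly at the receiver, is given below. The class name PacketizerSketch and both method names are hypothetical, and the sketch assumes that packets arrive complete and in order, which the unreliable protocols do not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical packetizer: splits a message into packets no larger than
// packetMax bytes and joins them again at the receiver.
class PacketizerSketch {
    // Fragment the message in order; an empty message yields no packets.
    static List<byte[]> fragment(byte[] message, int packetMax) {
        if (packetMax <= 0) {
            throw new IllegalArgumentException("packetMax must be positive");
        }
        List<byte[]> packets = new ArrayList<>();
        int off = 0;
        while (off < message.length) {
            int size = Math.min(packetMax, message.length - off);
            packets.add(Arrays.copyOfRange(message, off, off + size));
            off += size;
        }
        return packets;
    }

    // Concatenate packets back into the original message.
    // Assumes the list is complete and already in sending order.
    static byte[] reassemble(List<byte[]> packets) {
        int total = 0;
        for (byte[] p : packets) {
            total += p.length;
        }
        byte[] out = new byte[total];
        int off = 0;
        for (byte[] p : packets) {
            System.arraycopy(p, 0, out, off, p.length);
            off += p.length;
        }
        return out;
    }
}
```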


The CIFFactory class


Open a Connection (construct a Connection object instance)

Connection getConnection( Parameters, URL, timeout) throws IOException { .. }

Description : This is the constructor for the connection class and is used to initiate a connection. The Parameters is an object of the class Parameters and contains any special connection characteristics the user would like set. The URL is a string specifying the protocol and connection destination; in the case of unicast or multicast, this URL will contain a protocol designation (e.g. TCP) and an address and port. This constructor performs checking to see if the specified URL is a valid destination. If the current protocol supports the concept of groups (whereby a processor needs to join a group in order to communicate), the join method is invoked as part of the constructor, making it transparent to the user. In TCP the connect and bind procedures are invoked when opening a connection making them transparent to the user. An example of a URL for a TCP connection to george.lbl.gov on port 2345 would be TCP://george.lbl.gov:2345.
Arguments : The Parameters object specifying any special connection attributes required. If the Parameters object is NULL then the default values will be used. The URL is a text string indicating the connection type and what to connect to. The timeout indicates the time-out set for establishing the connection.
Returns : A connection object that represents the opened communication connection
Exceptions : Throws all exceptions up (to be handled at a higher level). Exceptions include the standard IOExceptions (such as SocketException, BindException, etc.) as well as defined method-specific exceptions (e.g., incorrect port or address specified).
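
The URL format used by getConnection can be illustrated with a small parser for the TCP://george.lbl.gov:2345 example given above. This is a sketch under assumptions: the class name CifUrlSketch is hypothetical, only the PROTOCOL://host:port form is handled, and malformed input is reported with IllegalArgumentException rather than with one of the method-specific exceptions, which this document does not yet define.

```java
import java.util.Locale;

// Hypothetical parser for connection URLs of the form PROTOCOL://host:port.
class CifUrlSketch {
    final String protocol;
    final String host;
    final int port;

    private CifUrlSketch(String protocol, String host, int port) {
        this.protocol = protocol;
        this.host = host;
        this.port = port;
    }

    static CifUrlSketch parse(String url) {
        int sep = url.indexOf("://");
        if (sep <= 0) {
            throw new IllegalArgumentException("missing protocol: " + url);
        }
        // Normalise the protocol name so "tcp" and "TCP" are treated alike.
        String protocol = url.substring(0, sep).toUpperCase(Locale.ROOT);
        String rest = url.substring(sep + 3);
        int colon = rest.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing port: " + url);
        }
        String host = rest.substring(0, colon);
        int port;
        try {
            port = Integer.parseInt(rest.substring(colon + 1));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("bad port: " + url, e);
        }
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + url);
        }
        return new CifUrlSketch(protocol, host, port);
    }
}
```

A factory could use the parsed protocol field to select the protocol-specific Connection implementation while returning it to the caller as the generic class.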


Construct a Listener object

Listener getListener( Parameters, URL ) throws IOException { .. }

Description : The listener object is provided to allow the application to create a server that listens for connection requests. Objects of this class are expected to only be needed for unicast servers. The Listener class will accept connection requests and create a connection object to correspond to the new connection. The listener class is only required in cases where the remote side of the connection is unknown or multiple simultaneous connection requests need to be serviced. All connections spawned from this listener will use the parameters provided in the constructor. The URL contains only a port in the unicast (UDP/TCP) server case and would look like (::1234).
Arguments : The Parameters object, if the application would like all connections created from this Listener to have special settings. If this argument is NULL then the default parameters are used. The URL is a string specifying the address the server should listen on (normally a port). If the URL is NULL then a free port is chosen.
Returns : A new Listener object
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).


Parameters constructor

Parameters getParameters( ) throws IOException { .. }

Parameters getParameters( Parameters ) throws IOException { .. }

Description : The Parameters object allows the application to set protocol specific parameters and optional parameters for the connection. If getParameters is called with no arguments then a new Parameters object is created with the default settings. If getParameters is called with an existing Parameters object as its argument then it copies the parameter settings in the given Parameters object into the new Parameters object before returning.
Arguments : None, or an existing Parameters object.
Returns : A new Parameters object
Exceptions : Throws the standard IOExceptions up (to be handled at a higher level).
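
The two forms of getParameters can be sketched as follows. The class names ParametersSketch and FactorySketch are hypothetical, the four fields are taken from the list of likely parameters given earlier, and every default value shown is an assumption made only so that the sketch runs.

```java
// Hypothetical parameter holder; field names follow the likely parameters
// listed earlier, and all default values are assumptions.
class ParametersSketch {
    boolean notifyMembership = false;
    int packetSize = 1024;
    int ttl = 1;
    long timeoutMillis = 5000;
}

// Hypothetical factory showing the two getParameters forms.
class FactorySketch {
    // No argument: a new object holding the default settings.
    static ParametersSketch getParameters() {
        return new ParametersSketch();
    }

    // Existing object given: copy its settings into a new, independent object.
    static ParametersSketch getParameters(ParametersSketch source) {
        ParametersSketch copy = new ParametersSketch();
        copy.notifyMembership = source.notifyMembership;
        copy.packetSize = source.packetSize;
        copy.ttl = source.ttl;
        copy.timeoutMillis = source.timeoutMillis;
        return copy;
    }
}
```

Because the copy is independent, an application can derive several variants from one tuned Parameters object without the variants affecting each other.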


4.0 Implementation

At present, we have a rudimentary implementation of the API developed in Java. It implements the basic methods of the Connection class (send(), receive(), and close()); additional functionality will be implemented soon.

The objectives set out for this generic API have a lot in common with the goals set out for the Nexus runtime environment developed at Argonne National Laboratory. We are currently developing a preliminary implementation of the API that will use services provided by the Nexus communication library, extended with reliable multicast functionality provided by XTP and Totem. This implementation will be in C++ and will achieve portability across a wide range of Unix systems. Ports to Windows systems are planned in the near future. Development of the API in C and in Java over the Nexus Java version is slated for future work.

5.0 Discussion and Future Directions

The idea of creating a generic communication interface has been evolving within our group over the last 2 years, and simple implementations have already proven exceedingly beneficial. Our time to develop the communication in a distributed program is dramatically reduced when using a generic API. Our decision to provide all communication as datagram services is based on our own experience in building applications, which indicated that stream-based communication was generally used as if it were a datagram service. Further discussions with the various application programmers are required to determine whether this decision meets the needs of the DOE 2000 applications.

One issue that still needs to be resolved is what the set of standard exceptions will be. The exceptions need to provide comparable information for each of the protocols so that they can be handled generically in the application. We are currently considering defining two classes of exceptions. The first class will indicate errors encountered during execution of the protocol. The second class will indicate warnings generated during execution. The error exceptions will be required to be handled by the application.

Handling of flow control is not yet covered in this document. We anticipate using the RTP protocol with the IP Multicast traffic to obtain packet loss rate information; many of the other protocols have built-in flow control.
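
The two classes of exceptions under consideration could be given a common base so that applications can handle them generically. The sketch below is only one possible arrangement: every class name in it is hypothetical, the protocol field is an assumed example of comparable information, and the set of standard exceptions remains undecided.

```java
import java.io.IOException;

// Hypothetical common base: carries the same information for every protocol
// so that applications can handle failures generically.
class CifException extends IOException {
    final String protocol; // e.g. "TCP" or "ROM"

    CifException(String protocol, String message) {
        super(protocol + ": " + message);
        this.protocol = protocol;
    }
}

// First class under consideration: errors met while executing the protocol.
class CifErrorException extends CifException {
    CifErrorException(String protocol, String message) {
        super(protocol, message);
    }
}

// Second class under consideration: warnings generated during execution.
class CifWarningException extends CifException {
    CifWarningException(String protocol, String message) {
        super(protocol, message);
    }
}

class ExceptionHandlingSketch {
    // Example of protocol-independent handling in the application.
    static String describe(CifException e) {
        if (e instanceof CifErrorException) {
            return "error on " + e.protocol;
        }
        return "warning on " + e.protocol;
    }
}
```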

A significant portion of message-based communication is asynchronous in nature. Through the use of an event handler the application will be able to register handlers for each type of message expected on a connection and then do other processing or go to sleep waiting on an event. All events are triggered by the arrival of messages on the connection. The type of an event is determined by the message type field. The event types include system reserved types and user defined types. The system reserved types are used for notifying the application of connection events such as membership changes. The user defined types are used by the application to differentiate between data messages and register handlers for each specific data type.
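
The handler registration described above can be sketched as a table that maps message types to handlers. The class name EventDispatcherSketch, the use of java.util.function.Consumer as the handler type, and the value chosen for the reserved MEMBERSHIP_CHANGE type are all assumptions. The event-handling interface itself belongs to a later phase of the design and is not yet defined.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical dispatcher: routes each arriving message to the handler
// registered for its message type.
class EventDispatcherSketch {
    // Assumed value for a system-reserved type; user types would avoid it.
    static final int MEMBERSHIP_CHANGE = -1;

    private final Map<Integer, Consumer<byte[]>> handlers = new HashMap<>();

    // Register (or replace) the handler for one message type.
    void register(int msgType, Consumer<byte[]> handler) {
        handlers.put(msgType, handler);
    }

    // Called when a message arrives. Returns true if a handler was
    // registered for the type and has been invoked, false otherwise.
    boolean dispatch(int msgType, byte[] data) {
        Consumer<byte[]> handler = handlers.get(msgType);
        if (handler == null) {
            return false;
        }
        handler.accept(data);
        return true;
    }
}
```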

Many application programs that could benefit from the API described in this document have been built using the Common Object Request Broker Architecture (CORBA). We will define the API as a service available in CORBA. In addition, future versions of send and receive may support numeric, character, and object datatype formats, and may allow the message object to be allocated by the application.



For more details :
CIF Homepage
http://www.mcs.anl.gov/cif/ or
http://www-itg.lbl.gov/CIF/
Contact:
Deb Agarwal (DAAgarwal@lbl.gov)
Peter Schabert (schabert@george.lbl.gov)
Nitya Narasimhan (nitya@george.lbl.gov)
Karlo Berket (kberket@george.lbl.gov)
Ian Foster (itf@mcs.anl.gov)
Steve Tuecke (tuecke@mcs.anl.gov)

This document can be found at http://www-itg.lbl.gov/CIF/Reports/GcommonAPI.html. We welcome your feedback regarding the API. Please feel free to contact us at cif@injector.ca.sandia.gov.