[SAGA-RG] SAGA Message API Extension

John Shalf JShalf at lbl.gov
Wed Jan 17 12:18:09 CST 2007


On Jan 17, 2007, at 4:23 AM, Andre Merzky wrote:

> Hi John!
>
> Quoting [John Shalf] (Jan 16 2007):
>>
>>
>> Hi Andre,
>> In Chapter 2, the paragraphs explaining the message transfer
>> requirements are a bit confusing.
>> 	1) it says it supports multicast (which is inherently unreliable).
>> I'm sure you mean to say a "message bus" (sort of like the Linux DBus
>> concept), which would not specifically call out a particular network
>> standard (multicast is more specific than a "message bus").
>
> You are right: using 'multicast' here is probably
> misleading.  Thilo made a similar comment, so I'll change it
> to "message bus" or so.
>
>
>> 	2) It says the message must be received completely and correctly, or
>> not at all.  This leaves some aspects of the reliability uncertain.
>> For instance
>> 		a) does this mean it must guarantee delivery of a message?
>> 		Or only  that a delivered message does not contain errors?
>
> It probably needs a more verbose explanation.  The current
> idea is:
>
>   - the implementation MUST ensure that the messages are
>     complete and error free
>
>   - whether message delivery is guaranteed is up to the
>     implementation and the protocol used.  The
>     implementation MUST document that aspect, and the URL
>     (scheme...) allows choosing between different
>     reliability modes.

I agree.  I think the reliability semantics should be a property of  
the channel (and hence underlying protocol) and not something to  
switch at runtime.  However, I think the clients that use the  
connection should be able to query what the channel properties are  
(or define the minimum requirements and throw an exception if they  
cannot be met).  This is just a little introspection (nothing deep or
fancy in the code underneath).
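
To make the introspection idea concrete, here is a minimal C++ sketch.  It
assumes the reliability class is fixed by the URL scheme at construction
time.  The Endpoint class, the Reliability enum, and the scheme names
(besteffort, atleastonce, exactlyonce) are all invented for illustration and
are not part of any SAGA draft.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical reliability classes, ordered from weakest to strongest.
enum class Reliability { Unreliable = 0, SemiReliable = 1, Reliable = 2 };

inline const char* reliability_name(Reliability r) {
    switch (r) {
        case Reliability::Unreliable:   return "unreliable";
        case Reliability::SemiReliable: return "semi-reliable";
        case Reliability::Reliable:     return "reliable";
    }
    assert(false && "unhandled Reliability value");
    return "unknown";
}

// Hypothetical mapping from URL scheme to what the channel provides.
inline Reliability reliability_of_scheme(const std::string& scheme) {
    if (scheme == "besteffort")  return Reliability::Unreliable;
    if (scheme == "atleastonce") return Reliability::SemiReliable;
    if (scheme == "exactlyonce") return Reliability::Reliable;
    throw std::invalid_argument("unknown scheme: " + scheme);
}

class Endpoint {
public:
    // The channel's reliability is decided once, here, and never changes.
    // A caller may state a minimum; construction fails if it cannot be met.
    explicit Endpoint(const std::string& url,
                      Reliability minimum = Reliability::Unreliable)
        : reliability_(reliability_of_scheme(scheme_of(url))) {
        if (reliability_ < minimum) {
            throw std::runtime_error(
                std::string("channel is ") + reliability_name(reliability_) +
                ", but at least " + reliability_name(minimum) +
                " was required");
        }
    }

    // Introspection: what does this channel actually provide?
    Reliability reliability() const { return reliability_; }

private:
    static std::string scheme_of(const std::string& url) {
        const auto pos = url.find("://");
        if (pos == std::string::npos)
            throw std::invalid_argument("URL has no scheme: " + url);
        return url.substr(0, pos);
    }

    Reliability reliability_;
};
```

With this shape, a client that needs a guarantee passes it as the minimum
and gets an exception on mismatch, while a client that does not care can
still call reliability() to see what it was given.  A channel stronger than
the stated minimum is accepted.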

> I wanted to avoid a 'reliability' attribute on the endpoint,
> as I don't see a reasonable easy way to switch reliability
> in mid stream.  It is most likely that changes in
> reliability policy would require different protocols (only
> few protocols will be able to serve both ends), and would
> hence need a new connection setup anyway.  Not sure if that
> is reasonable though.
>
> Anyway, that would leave us with specification of
> reliability on connection setup (i.e. endpoint construction)
> - right now that is done via the scheme part of the URL.
> Would a flag be more appropriate/explicit?  Probably...
>
>
>> 		b) what if the message arrives more than once? (e.g.
>> 		redundant copies of messages can occur in practice for a number
>> 		of unreliable or semi-reliable messaging protocols).
>
> Good point.  IMHO we should specify that messages arrive 'AT
> MOST ONCE' (unreliable) or EXACTLY ONCE (reliable).  I don't
> think that we need to support additional modes, nor the
> complete spectrum of (un/)reliable transmission modes.  Or?

I cannot remember the document, but there was a spec defining
reliability for another kind of messaging bus.  It had
	unreliable: the message may or may not arrive (typical UDP).
	semi-reliable: the message must arrive (it will be retransmitted if
not acked).  The source keeps resending until it sees the ack.  It is
possible that the ack is lost, in which case the message may arrive
at the client twice (so the message will arrive at least once, and
possibly more than once).  This protocol is useful since the message
destination does not need to maintain complex message state... it
just acks messages that arrive (very simple).
	reliable: the message must arrive, and it will arrive only once.
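
The difference between the semi-reliable and reliable classes can be shown
with a small in-memory simulation.  This is only a sketch: the Message
struct, the two receiver classes, and send_until_acked() are hypothetical
names, and the "network" is just a list saying which acks get lost.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

struct Message {
    std::uint64_t id;    // assumed: the sender tags every message with an id
    std::string   body;
};

// Semi-reliable destination: keeps no state, delivers (and acks) every copy.
class AtLeastOnceReceiver {
public:
    void receive(const Message& m) { delivered.push_back(m.body); }
    std::vector<std::string> delivered;
};

// Reliable destination: remembers the ids it has seen and drops duplicates.
class ExactlyOnceReceiver {
public:
    void receive(const Message& m) {
        if (seen_.insert(m.id).second)   // true only for a new id
            delivered.push_back(m.body);
    }
    std::vector<std::string> delivered;

private:
    std::set<std::uint64_t> seen_;
};

// The source resends until it sees an ack.  ack_lost[i] says whether the
// ack for transmission i is lost on the way back.  Returns the number of
// transmissions that were needed.
template <class Receiver>
int send_until_acked(const Message& m, Receiver& rx,
                     const std::vector<bool>& ack_lost) {
    // The last attempt must be acked, otherwise the source never stops.
    assert(!ack_lost.empty() && !ack_lost.back());
    int attempts = 0;
    for (bool lost : ack_lost) {
        rx.receive(m);       // the message itself arrives every time
        ++attempts;
        if (!lost) break;    // ack seen: stop resending
    }
    return attempts;
}
```

If the first ack is lost, the stateless receiver ends up with two copies of
the message while the deduplicating receiver ends up with one.  The extra
state in the second receiver is exactly the cost the semi-reliable class
avoids.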

>
>> 		c) It says this document will not address things at a
>> 		protocol level, but I think these issues are semantic and
>> 		therefore must be addressed by the API.
>
> IC, good point.  Well, flags on the endpoint construction
> would solve that.
>
>> 		d) I think there are a number of attributes that users
>> 		should be able to supply or query when opening a message service
>> connection.  As in JMS (Java Message Service), we should be able to
>> specify whether this is a point-to-point (message queue) or a
>> publish-subscribe (message bus) like interface.  The API should not require
>> any work to support both since point-to-point is a sub-category of
>> the message-bus, but it should be an attribute of the message
>> interface that a user can force/specify in the opening of the
>> connection.
>
> Hmm, point to point could be simply established on
> application level, by just setting up a single connection.
>
> Uhm, problem: serve() can only be started, not stopped.  An
> int argument to serve(), giving the number of clients to
> wait for, would solve that.
>
>   // point to point:
>   saga::message  msg;
>   saga::endpoint ep;
>   ep.serve (1);     // allow one client
>   ep.recv  (&msg);  // expect one message
>   exit;
>
>   // publish/subscribe
>   saga::message  msg;
>   saga::endpoint ep;
>   ep.serve ();      // allow n clients
>   while ( 1 ) {
>     ep.recv  (&msg);  // expect m messages
>   }
>
> Does that make sense?

That does address one of the cases (not exactly the one I was  
thinking of).
We should also allow -1 to specify *any-number-of-clients* at the
underlying protocol's discretion.
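
As a rough sketch of those semantics (not the actual SAGA endpoint; the
Endpoint class, accept(), and the kAnyNumber constant are made up here, and
only serve() with a client count comes from the example above):

```cpp
#include <cassert>

class Endpoint {
public:
    // Hypothetical sentinel: leave the client limit to the protocol.
    static constexpr int kAnyNumber = -1;

    // Start serving.  max_clients is either a non-negative limit or
    // kAnyNumber for "any number of clients".
    void serve(int max_clients = kAnyNumber) {
        assert(max_clients >= -1);
        max_clients_ = max_clients;
        serving_     = true;
    }

    // Called when a client tries to connect; returns whether it is admitted.
    bool accept() {
        if (!serving_) return false;
        if (max_clients_ != -1 && clients_ >= max_clients_) return false;
        ++clients_;
        return true;
    }

    int clients() const { return clients_; }

private:
    bool serving_     = false;
    int  max_clients_ = -1;
    int  clients_     = 0;
};
```

Here serve(1) admits a single client and refuses the rest, which gives the
point-to-point setup above, while serve() or serve(-1) never refuses on
count.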

But now for a semantic thicket....

Case 1: I connect to a named destination and I am joined together
with everyone else who has connected to that port.  In this case, any
message I send will be broadcast to everyone else and vice versa
(message bus).  This is the equivalent of a publish-subscribe service.
Case 2: I connect to a port and I own the service.  This appears to  
be the case you are setting up above as it only allows one client to  
join the service.
Case 3: I connect to the port and fork a sub-process that does not
share the messages with the other clients that have connected to that
port (like an HTTP server).  I don't quite see how that kind of
point-to-point message service is supported.

I think we need an attribute for the message port that says whether
it is a bus or point-to-point.  In addition, I like the idea of
setting the message queue length (as you have above).  I was not
thinking of that, but I can see the value of creating a
first-come-first-served message port as well (if that is not too
complicated).
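
To illustrate what such an attribute would change, here is a toy in-memory
model of Case 1 versus Case 3.  Everything in it (the Port class, the
Topology enum, the inbox accessors) is hypothetical and only meant to pin
down the semantics, not to suggest an API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical attribute fixed when the port is opened.
enum class Topology { Bus, PointToPoint };

class Port {
public:
    explicit Port(Topology topology) : topology_(topology) {}

    // A new client joins the port; returns its client index.
    std::size_t connect() {
        client_inbox_.emplace_back();
        service_inbox_.emplace_back();
        return client_inbox_.size() - 1;
    }

    // A client sends a message through the port.
    void send(std::size_t from, const std::string& text) {
        assert(from < client_inbox_.size());
        if (topology_ == Topology::Bus) {
            // Case 1: every other connected client sees the message.
            for (std::size_t i = 0; i < client_inbox_.size(); ++i)
                if (i != from) client_inbox_[i].push_back(text);
        } else {
            // Case 3: only the service side of this client's private
            // channel sees it, as with one HTTP connection among many.
            service_inbox_[from].push_back(text);
        }
    }

    // Messages a client has received from other clients.
    const std::vector<std::string>& client_inbox(std::size_t c) const {
        return client_inbox_.at(c);
    }

    // Messages the service has received on one client's private channel.
    const std::vector<std::string>& service_inbox(std::size_t c) const {
        return service_inbox_.at(c);
    }

private:
    Topology topology_;
    std::vector<std::vector<std::string>> client_inbox_;
    std::vector<std::vector<std::string>> service_inbox_;
};
```

The calling code is the same in both modes (connect, then send); only the
attribute given at open time decides who gets to see the message.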

>> Another thing to support is specification or query of
>> the message service reliability (something that deserves at least a
>> subsection to define).  That is, the document should define classes of
>> reliability (just as done with other XML-based messaging APIs that
>> even define intermediate cases for unreliable messaging that
>> guarantee message arrival, but do not ensure that duplicate messages
>> will not arrive).  The API should allow a user to specify these
>> semantic attributes and should not allow a connection to be built if
>> the underlying protocol for the message connection cannot meet those
>> attributes.
>
> Again, that could be done by using flags on the ep creation.
> I guess that a 'more reliable' connection than requested
> would be possible?  E.g.,
>
>   saga::endpoint ep (saga::message::Unreliable);
>
> could actually set up an unreliable or a reliable
> connection, as both would fulfill the user's request?

Yes, I think it is sufficient to define these things at the setup of  
the endpoint (not something you would change at runtime).

-john




