[SAGA-RG] SAGA Message API Extension
Pascal Kleijer
k-pasukaru at ap.jp.nec.com
Thu Jan 18 22:48:35 CST 2007
Hello all,
I am jumping into the conversation a bit later than expected. It seems that
Andre always chooses a time in space-time that is inconvenient for me :p
Here is some food:
A. Typos
I cannot access the CVS so I post some of my stuff here.
1. Copyright at 2006? Should it not be 2007? See page 1 and 23
2. Most notes of the detailed spec should be "Notes: - see notes *on*
memory management."
3. Section 2.1, "[ibidem]" ???
4. Section 2.1.3, paragraph 2 "A message sent by *an* endpoint"
5. Page 17 "Format: serve" instead of " Format: connect"
There might be more, but these just caught my eye.
B. API
1. I would call the method "receive()" and not "recv()"; in section
2.1.3 you have it right ;)
2. The "test()" method should be called "available()" because "test"
sounds more like test code than API code. The method should be
non-blocking and return -1 if no message is available. The timeout
should be ignored IMO.
3. Overuse of the NotImplemented exception. Most methods are
mandatory anyway; cleaning up the spec would be nice.
4. Metric RemoteDisConnect should be RemoteDisconnect.
5. The "msg" class should be called "message"
Andre, you are too C (hacker) centric :p Make it simple for us folks with
full names.
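
As a rough illustration of item 2, a non-blocking "available()" could
look like the Java sketch below. The class name MessageQueueEndpoint,
the in-memory queue and the deliver() helper are hypothetical and not
part of the SAGA draft. Returning the size of the next message when one
is waiting is also an assumption; the only behaviour taken from the text
above is "non-blocking, -1 if no message is available".

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical endpoint that keeps received messages in memory (not the SAGA API).
class MessageQueueEndpoint {
    private final ConcurrentLinkedQueue<byte[]> inbox = new ConcurrentLinkedQueue<>();

    // Stand-in for the transport handing a message to this endpoint.
    void deliver(byte[] message) {
        inbox.add(message);
    }

    // Non-blocking: size in bytes of the next message, or -1 if none is waiting.
    int available() {
        byte[] next = inbox.peek();
        return next == null ? -1 : next.length;
    }

    // Non-blocking receive for this sketch: returns null when nothing is queued.
    byte[] receive() {
        return inbox.poll();
    }
}
```

With -1 reserved for "nothing there", an empty message (size 0) can
still be told apart from no message at all.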
C. Others
Here are several more complex things.
1. State machine
Why do you not include the failed states? Depending on how the message
API is handled the failure might be permanent and the object remains in
the given state. We can imagine that a dropped connection can be
recovered for some time: you have two attributes, attempts and delay,
and the object is in the dropped state until it can reconnect. If the
number of attempts is consumed the state is permanent; otherwise it
will sleep (delay) for some time and try a new connection.
I have this problem in the NAREGI project: for the moment I only allow
one more attempt when the connection is dropped or fails (time out,
others). After that a fatal exception is thrown.
This might be an optional feature, but it would make implementations
more resilient. If the failure is permanent it must be handled at
application level. Of course the two attributes can be modified anytime
during the lifetime of the object.
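
The retry behaviour described above could be sketched in Java roughly
as follows. All names (EndpointState, ReconnectingEndpoint,
onConnectionDropped) are hypothetical, the tryConnect callback stands in
for the real transport, and thread safety is left out to keep it short.

```java
import java.util.function.BooleanSupplier;

// Hypothetical states; DROPPED and FAILED are the additions discussed above.
enum EndpointState { OPEN, DROPPED, FAILED }

class ReconnectingEndpoint {
    private int attempts;       // remaining reconnect attempts
    private long delayMillis;   // sleep before each attempt
    private EndpointState state = EndpointState.OPEN;

    ReconnectingEndpoint(int attempts, long delayMillis) {
        this.attempts = attempts;
        this.delayMillis = delayMillis;
    }

    // Both attributes may be changed during the lifetime of the object.
    void setAttempts(int attempts) { this.attempts = attempts; }
    void setDelayMillis(long delayMillis) { this.delayMillis = delayMillis; }
    EndpointState getState() { return state; }

    // Called when the connection is lost; tryConnect returns true on success.
    EndpointState onConnectionDropped(BooleanSupplier tryConnect) {
        state = EndpointState.DROPPED;
        while (attempts > 0) {
            attempts--;
            try {
                Thread.sleep(delayMillis);          // wait, then try a new connection
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted: give up
                break;
            }
            if (tryConnect.getAsBoolean()) {
                state = EndpointState.OPEN;
                return state;
            }
        }
        state = EndpointState.FAILED;               // permanent: the application must handle it
        return state;
    }
}
```

Once FAILED is reached the sketch never leaves it on its own, which
matches the idea that a permanent failure is the application's problem.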
2. Silently Discarded & Reliability
See section 2.1.2, second paragraph! Gasp, no: in a (un)reliable system
this cannot be! You should at least return a flag from "send()" in the
synchronous case, telling whether the message was sent and possibly
acked. In the asynchronous case it would be a monitorable field, like a
ticket for the "send()" call or the task that does the job.
Whatever Reliability, Correctness and Order settings are applied, there
should be a way to tell the application (possibly as an optional
implementation) that the message was sent. If the underlying API does
not support reliability, the application can then implement its own. In
a reliable system this should not happen anyway.
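
One possible shape for this, as a Java sketch only: the synchronous
call returns a status flag, and the asynchronous call returns a future
that plays the role of the ticket. SendStatus, ReportingSender and the
Predicate used as transport are hypothetical names, not SAGA API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

// ACKED would be reported by transports that support acknowledgements;
// this sketch never produces it.
enum SendStatus { SENT, ACKED, FAILED }

class ReportingSender {
    // Stand-in for the transport: returns true if the message was handed to the wire.
    private final Predicate<byte[]> transport;

    ReportingSender(Predicate<byte[]> transport) {
        this.transport = transport;
    }

    // Synchronous: the caller learns immediately whether the message went out.
    SendStatus send(byte[] message) {
        return transport.test(message) ? SendStatus.SENT : SendStatus.FAILED;
    }

    // Asynchronous: the returned future is the "ticket" the application can monitor.
    CompletableFuture<SendStatus> sendAsync(byte[] message) {
        return CompletableFuture.supplyAsync(() -> send(message));
    }
}
```

An application that needs reliability on top of an unreliable transport
could resend whenever the status is FAILED.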
3. Correctness
I would rather see this as an attribute you can set. If the message is
received I am not always interested in knowing if the content is
correct. Just getting it is enough. The overhead for correctness depends
on the implementation, but in most cases it means extra CPU usage that
I don’t want. Think about MPEG streams: if we lose some part or
some of it is corrupted, I don’t care; we go to the next frame. It would
be even more powerful if you could set correctness for a given direction
specifically.
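
A per-direction correctness switch might look like this Java sketch.
The names Direction and CorrectnessPolicy are hypothetical, checking is
assumed to be off by default, and CRC32 is only used here as an example
of a check that costs CPU time.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.zip.CRC32;

enum Direction { SEND, RECEIVE }

class CorrectnessPolicy {
    private final Map<Direction, Boolean> enabled = new EnumMap<>(Direction.class);

    void setCorrectness(Direction direction, boolean on) {
        enabled.put(direction, on);
    }

    // Assumption of this sketch: checking is off unless switched on.
    boolean isCorrectnessEnabled(Direction direction) {
        return enabled.getOrDefault(direction, false);
    }

    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Decide whether a received payload is handed to the application.
    boolean accept(byte[] payload, long expectedCrc) {
        if (!isCorrectnessEnabled(Direction.RECEIVE)) {
            return true;                          // skip the check and its CPU cost
        }
        return checksum(payload) == expectedCrc;  // verify only when asked to
    }
}
```

A video receiver would leave RECEIVE checking off and simply move on to
the next frame.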
4. Order
Same as correctness. In most cases I don’t care about order. I sometimes
prefer to have control over it at the application level. This
dramatically simplifies the underlying implementation, and having it at
application level can be rather easy. In the NAREGI project I have such
a mechanism. I don’t care about the order for 95% of the messages; just
some must be ordered. It would be even more powerful if you could set
order for a given direction specifically.
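
To show why ordering at application level can be rather easy, here is a
minimal Java sketch of a reorder buffer based on sequence numbers. It
is not the NAREGI mechanism; ReorderBuffer and its offer() method are
made-up names, and sequence numbers are assumed to start at 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class ReorderBuffer {
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private long nextSeq = 0;   // next sequence number to hand to the application

    // Accept one message; return every message that is now deliverable in order.
    List<String> offer(long seq, String payload) {
        List<String> ready = new ArrayList<>();
        if (seq < nextSeq) {
            return ready;       // already delivered: ignore the duplicate
        }
        pending.put(seq, payload);
        // Drain the buffer while the lowest pending number is the expected one.
        while (!pending.isEmpty() && pending.firstKey() == nextSeq) {
            ready.add(pending.pollFirstEntry().getValue());
            nextSeq++;
        }
        return ready;
    }
}
```

Only the few messages that must be ordered need to go through such a
buffer; the rest can be handed over as they arrive.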
5. Opaque messages
In the NAREGI project we have already implemented a messaging API.
Actually we have several (Socket, File, SOAP, and HTTP), and all use
the same message container. The API for each implementation isn’t
unified yet. We do use opaque message content because when an
application component wants to send a piece of something it just drops
in its content. That can be any object. When the message moves down to
the messaging API it passes through filters that transform it into byte
arrays. The same is true for incoming messages: filters transform
the content as it moves up into the application.
The message class has an extra method called "getBytes()" which returns
a byte array whenever one can be produced. If the content cannot be
serialized to a byte array, then nothing happens and null is returned.
The "getSize()" method returns the size of the usable block of the
array. The "getData()" method might return an array, but that is not
guaranteed. In addition the message container has an ID flag to hint to
the receiving application how the filters must be applied. It also has
other attributes, but they are of no interest so far for SAGA.
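
The container described above could be approximated by the following
Java sketch. It is not the NAREGI class: OpaqueMessage and getFilterId()
are hypothetical names, the two built-in conversions stand in for the
filter chain, and returning 0 from getSize() for non-serializable
content is an assumption.

```java
import java.nio.charset.StandardCharsets;

class OpaqueMessage {
    private final int filterId;   // hint for the receiver on which filters to apply
    private final Object data;    // opaque content: any object

    OpaqueMessage(int filterId, Object data) {
        this.filterId = filterId;
        this.data = data;
    }

    int getFilterId() { return filterId; }

    // Raw content; may or may not be an array.
    Object getData() { return data; }

    // Byte view of the content, or null if it cannot be serialized.
    byte[] getBytes() {
        if (data instanceof byte[]) {
            return (byte[]) data;
        }
        if (data instanceof String) {
            return ((String) data).getBytes(StandardCharsets.UTF_8);
        }
        return null;              // no conversion known for this content
    }

    // Size of the usable block; 0 when no byte view exists (assumption).
    int getSize() {
        byte[] bytes = getBytes();
        return bytes == null ? 0 : bytes.length;
    }
}
```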
6. Memory Management
In general, for the sending part I totally agree: the management is up
to the application. For the receiving part I am not entirely satisfied.
When the management is done by the implementation I am fine; there is
not much to say. But in the case of application-side management I have
issues, mainly concerning efficient memory usage of the message buckets.
With message-intensive applications the message container should be
handled by the application to avoid memory burns/leaks. This is the case
for the NAREGI project.
For example, the application creates a pool of buckets (messages) of,
let’s say, 512 bytes each. Over 90% of the incoming messages are
smaller than 512 bytes, so we have no trouble with the buckets since the
array capacity is larger than the size of the message. Now what happens
when we have a larger message and we didn’t do the available test
because we want to block the current thread in the receive? In the
current proposal my message gets truncated. That is not good. In the
current implementation, the implementation replaces the array by
reallocating it.
The message container has two fields: the size of the actual message,
and the capacity of the current array buffer. In the next version of the
API for NAREGI I will add an offset in case we want to use larger array
blocks, typically from network buffers, and allow sub-blocks to be used.
In a C implementation this is not necessary because you can just shift
the pointer; in Java you cannot.
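
A bucket with separate size and capacity, which grows instead of
truncating, might be sketched in Java as follows. MessageBucket and
fill() are hypothetical names; the offset field mentioned above is left
out.

```java
class MessageBucket {
    private byte[] buffer;   // capacity = buffer.length
    private int size;        // bytes of the current message actually in use

    MessageBucket(int capacity) {
        buffer = new byte[capacity];
    }

    int getCapacity() { return buffer.length; }
    int getSize() { return size; }
    byte[] getBytes() { return buffer; }

    // Copy an incoming message into the bucket, growing it rather than truncating.
    void fill(byte[] incoming) {
        if (incoming.length > buffer.length) {
            buffer = new byte[incoming.length];   // reallocate for the rare large message
        }
        System.arraycopy(incoming, 0, buffer, 0, incoming.length);
        size = incoming.length;
    }
}
```

For the common small message the pooled 512-byte array is reused as is;
only the occasional oversized message pays for a reallocation.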
7. Serve time out and shutdown
I think that we should allow, as an optional implementation or usage,
the serve method to time out. This would either stop the internal thread
or, for example with a TCP/IP messaging solution, set the SO_TIMEOUT
option. We need this in order to check what is happening with processes.
In some cases the job submission tool reports the process as running but
no IO connection is active; a timeout might help us release the hook and
do something different before making another attempt.
We should also be able to properly stop the service with a "disconnect"
or "shutdown", in case the close doesn’t do it. That means that the
server must stop and all the connections handled by the class must be
terminated. That is important for us to free the underlying memory and
allocated objects. Typically we should be able to close the server
sockets in a proper way.
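
For the TCP/IP case, both points can be sketched with plain Java
sockets. ServerSocket.setSoTimeout() sets SO_TIMEOUT, so accept() gives
up with a SocketTimeoutException. The TimedServer class and its serve()
and shutdown() methods are hypothetical names, not SAGA API.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

class TimedServer {
    private final ServerSocket server;
    private final List<Socket> connections = new ArrayList<>();

    TimedServer(int timeoutMillis) throws IOException {
        server = new ServerSocket(0);         // bind to any free port
        server.setSoTimeout(timeoutMillis);   // SO_TIMEOUT: bounds the blocking accept()
    }

    int getPort() { return server.getLocalPort(); }

    // Wait for one client; returns null if nobody connected within the timeout.
    Socket serve() throws IOException {
        try {
            Socket client = server.accept();
            connections.add(client);
            return client;
        } catch (SocketTimeoutException e) {
            return null;                      // caller can do something else and retry
        }
    }

    // Terminate every accepted connection, then the listening socket itself.
    void shutdown() throws IOException {
        for (Socket client : connections) {
            client.close();
        }
        connections.clear();
        server.close();
    }

    boolean isShutdown() { return server.isClosed(); }
}
```

A timeout of 0 means "wait forever" for SO_TIMEOUT, so the blocking
behaviour stays available.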
8. Endpoint reuse
The current semantics do not allow reusing the same object after a
close. It might be handy to support a reconnect via open if the
implementation does not support reconnection. If the application
must maintain a stable connection it has to create a new object each
time, which might be a memory burden. If the URL is not modified and the
states are correct I don’t see why we cannot open a connection again
after a close.
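
In state-machine terms the request is just one extra transition from
closed back to open. A minimal Java sketch, with hypothetical names
(ReusableEndpoint, State) and no real transport behind it:

```java
class ReusableEndpoint {
    enum State { NEW, OPEN, CLOSED }

    private final String url;   // unchanged between open() calls
    private State state = State.NEW;
    private int openCount;      // how many times the endpoint has been opened

    ReusableEndpoint(String url) {
        this.url = url;
    }

    String getUrl() { return url; }
    State getState() { return state; }
    int getOpenCount() { return openCount; }

    // Allowed from NEW and from CLOSED, so the same object can be reused.
    void open() {
        if (state == State.OPEN) {
            throw new IllegalStateException("already open");
        }
        // A real implementation would (re)establish the connection to url here.
        state = State.OPEN;
        openCount++;
    }

    void close() {
        if (state == State.OPEN) {
            state = State.CLOSED;
        }
    }
}
```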
OK, that is all for now. I might have other things popping up.
--
Best regards,
Pascal Kleijer
----------------------------------------------------------------
HPC Marketing Promotion Division, NEC Corporation
1-10, Nisshin-cho, Fuchu, Tokyo, 183-8501, Japan.
Tel: +81-(0)42/333.6389 Fax: +81-(0)42/333.6382