[saga-rg] SAGA spec proof read

Fri Jun 9 01:32:58 CDT 2006

Hi Pascal, 

comments to the comments... :-)

(removed some topics, agreed with them, and will change
accordingly)

Quoting [Pascal Kleijer] (Jun 09 2006):
> 
> Hi,
> 
> some in-line comments...
> 
> 
> >>Some extensions:
> >>Purpose: "Reads up to 'size' bytes of data from the stream into an array 
> >>of bytes. An attempt is made to read as many as 'size' bytes, but a 
> >>smaller number may be read. The number of bytes actually read is 
> >>returned as 'nbytes'. If 'size' is zero, then no bytes are read and 
> >>nbytes is 0; otherwise, there is an attempt to read at least one byte."
> >
> >That should already be the case - that is why 'nbytes' is
> >returned, indicating the actual number of bytes copied.
> >nbytes can be smaller than size w/o posing an error
> >condition (as in POSIX).  Is that what you propose?
> 
> Well, I think the whole description should be redone based on whether 
> the call blocks or not. Also, the returned 'nbytes' should carry, as in 
> POSIX, the error code (negative). If a new read is made after a fatal 
> error, then the exception should kick in. See below.

See below.

> >>3. In this implementation we have no choice but to wait for an exception 
> >>to be thrown when the stream has dried up (use the read method). It 
> >>would be nice to have nbytes set to -1 the first time the end of stream 
> >>is reached (if applicable); the next call to read would then throw an 
> >>exception.
> >
> >What is the error condition the exception is supposed to
> >catch:
> >
> >  - stream died
> >  - no data available
> >
> >The first one is available via the state of the stream.
> >
> >The second one is what you refer to, right?  But I think
> >that this is not an error condition which should raise an
> >exception: data might be available again a second later.
> >
> >Whether data availability poses an error condition or not should
> >be decided at the application level IMHO.  
> >
> >Am I missing something?
> 
> Basically, the exception should be thrown if the last read returned 
> with an error. If the error was rectified in the meantime, the read 
> should go on as a new call. Therefore the state machine is important to 
> have, because a change of state will flush the last error.

I am not sure if I would agree with this one.

Assume the following pseudo code:

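 char cache[10] = "";
 int  nbytes;

 // keep reading until read returns 0
 while ( (nbytes = socket.read (cache, 10)) )
 {
   if ( 0 < nbytes )
   { 
     print (cache);   // got some data: use it
   }
   else
   { 
     sleep (1);       // nothing read this time: wait, then retry
   }  
 }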

Adding an exception on the second failed read breaks that
schema, and makes code more complex.  Why?  Because the
application has to explicitly keep track of the stream
state.

However, POSIX and most other stream APIs I can think of
would fully support the schema above.
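
For comparison, here is a minimal C sketch of the same polling
schema against plain POSIX read(2), assuming a non-blocking
descriptor. The helper names set_nonblocking and drain are
hypothetical (neither SAGA nor POSIX defines them), and the
idle-poll limit exists only so the sketch terminates. Note
that POSIX reports "no data yet" as -1 with errno set to
EAGAIN, and end of stream as 0; neither stops the caller from
simply calling read again.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: print whatever arrives on fd.
 * Returns the number of bytes consumed once EOF is seen or once
 * max_idle_polls consecutive polls found no data; -1 on a real error. */
static long drain(int fd, unsigned max_idle_polls)
{
    char cache[10];
    long total = 0;
    unsigned idle = 0;
    struct timespec pause = { 0, 1000000 };  /* 1 ms between polls */

    assert(fd >= 0);
    for (;;) {
        ssize_t nbytes = read(fd, cache, sizeof cache);
        if (nbytes > 0) {                    /* got data: use it */
            fwrite(cache, 1, (size_t)nbytes, stdout);
            total += nbytes;
            idle = 0;
        } else if (nbytes == 0) {            /* end of stream */
            return total;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            if (++idle > max_idle_polls)     /* no data yet: give up */
                return total;
            nanosleep(&pause, NULL);         /* ... or wait and retry */
        } else if (errno != EINTR) {
            return -1;                       /* genuine error */
        }
    }
}
```

A caller would create a pipe or socket, call set_nonblocking
on the reading end, and then call drain on it. No state has to
be tracked between the calls; the return value and errno of
each read carry everything the loop needs.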

Also, what is the application supposed to do after the first
non-read?  It is bound to eventually try again, and raise an
exception.

However, that is related to the next point you raise...

(PS.: new data would not change the state, really, in the
sense of state diagram...)

> >>4. There is no way to check the size of the data waiting for read. It 
> >>would be nice if we can query the stream on how many bytes in the pipe 
> >>are waiting. This is important if the client must allocate a buffer to 
> >>read this data. So an additional method like 'available' should be 
> >>added. Note that this will slightly overlap with the 'wait' method for 
> >>'read' or 'any' mask, but will return a more accurate figure. Another 
> >>solution is to extend the wait method.
> >
> >I would have no idea how to implement that on a stream:
> >without an additional protocol, you have no chance to know
> >how many bytes will arrive.
> >
> >However, that is one of the reasons why the message API was
> >proposed: that implies an additional protocol on the wire,
> >but will allow the application to query the message size
> >before performing the message read.
> 
> I don't say how many will arrive, but how many 'have' arrived. That means 
> how many are currently stored somewhere within the stream and are 
> guaranteed to be returned when the next call to read is made. In that case 
> we can test for a possible block (if blocking) and size of the buffer to 
> allocate (if necessary) as well as to avoid a catch block.

Even that would be difficult to implement.  Assume you
implement that on a BSD socket.  If you do that in a
straightforward way, i.e. just map the read to a socket read
etc., you don't have that test call.

If you want to provide the test call, you have to
continuously READ the data from the socket, to push it into a
cache, and to return the current size of the cache on the
test call.

  - How much do you cache?  1kB?  1MB? 1GB?
  - How do you know that the app actually wants all the
    data?
  - you imply the existence of a thread in the SAGA
    implementation which continuously reads on the stream
  - you lose zero copy, so degrade performance

These are again exactly the problems the message API is
supposed to solve - there the cache size is given by the
message size, and it is assumed that the application is
interested in one complete message, at least.  And zero copy
should be preserved as the message size will be known in
advance.
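
To illustrate what "an additional protocol on the wire" could
mean, here is a minimal C sketch of length-prefixed framing
over a POSIX descriptor. It is not the proposed message API:
the 4-byte big-endian header and the names msg_send,
msg_recv_size and msg_recv_body are assumptions made up for
this example. The point is only that the receiver learns the
size first and can then read straight into a buffer of exactly
that size.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; -1 on error or if the stream ends early. */
static int read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical: send a 4-byte big-endian size, then the payload. */
static int msg_send(int fd, const void *data, uint32_t size)
{
    unsigned char hdr[4];
    hdr[0] = (unsigned char)((size >> 24) & 0xFF);
    hdr[1] = (unsigned char)((size >> 16) & 0xFF);
    hdr[2] = (unsigned char)((size >> 8) & 0xFF);
    hdr[3] = (unsigned char)(size & 0xFF);
    if (write_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    return write_all(fd, data, size);
}

/* Hypothetical: read the header so the caller knows how much to allocate. */
static int msg_recv_size(int fd, uint32_t *size)
{
    unsigned char hdr[4];
    assert(size != NULL);
    if (read_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    *size = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16)
          | ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    return 0;
}

/* Hypothetical: read the payload directly into the caller's buffer. */
static int msg_recv_body(int fd, void *buf, uint32_t size)
{
    return read_all(fd, buf, size);
}
```

The receiver calls msg_recv_size, allocates a buffer of that
size, and passes it to msg_recv_body. The library needs no
intermediate cache and no reader thread, because the
application's own buffer is the read target.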

Cheers, Andre.

-- 
"So much time, so little to do..."  -- Garfield