[saga-rg] Re: proposal for extended file IO - summary

Andre Merzky andre at merzky.net
Fri Jun 17 14:34:51 CDT 2005


Hi List, 

I went through the IO thread again, and also had a chat with
John Shalf, and I'd like to summarize the outcome of the
discussion.  Please consider that as a joint proposal of
John and me for inclusion in the file IO methods.

  Observations:

   - normal read/write has severe drawbacks on remote IO, if
     used extensively, both sync and async

   - external preprocessing of data for read can be accomplished
     by spawning preprocessing jobs

   - async is well covered by the task model

   - there exist various approaches to improve throughput
     for IO intensive apps, amongst them:

     - (A) gather/scatter (see readv(2))
     - (B) FALLS (regular patterns on binary data)
     - (C) eRead (see ERET/ESTO in gridftp)

  Remarks:

   - the options A, B and C are increasingly expressive,
     but also require increasing coordination between
     client and server side.

   - A is, being POSIX, well known

   - B maps to hyperslabs pretty well, a seemingly common
     access pattern

   - C maps to GridFTP, a commonly used protocol, very well
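
To make the link between regular patterns and hyperslabs concrete,
here is a minimal Python sketch of a strided read.  The descriptor
(start, block_len, stride, count) and the name read_pattern are
illustrative assumptions only; they are neither the FALLS notation
nor a proposed SAGA type, and an in-memory buffer stands in for the
remote file.

```python
import io

def read_pattern(f, start, block_len, stride, count):
    """Read `count` blocks of `block_len` bytes, `stride` bytes apart."""
    chunks = []
    for i in range(count):
        f.seek(start + i * stride)
        chunks.append(f.read(block_len))
    return b"".join(chunks)

# A 4x6 row-major field of single bytes: row r holds bytes r*6 .. r*6+5.
field = io.BytesIO(bytes(range(24)))

# Hyperslab of rows 1..2, columns 2..4:
# start = 1*6 + 2 = 8, block_len = 3, stride = 6 (one row), count = 2.
slab = read_pattern(field, start=8, block_len=3, stride=6, count=2)
print(list(slab))
```

A single descriptor covers the whole sub-block, so a server that
understands the pattern could answer it in one round trip.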

  Proposal:

   - There seem to be advantages to A, B and C.  Also, the need
     for more than simple read seems obvious.  Hence we
     propose to include A, B and C into the SAGA API.

     void readV       (in  array<ivec>     ivec,
                       out array<string>   buffers  );
     void writeV      (in  array<ivec>     ivec,
                       in  array<string>   buffers  );

     void readP       (in  pattern         pattern,
                       out string          buffer, 
                       out long            len_out  );
     void writeP      (in  pattern         pattern,
                       in  string          buffer,
                       out long            len_out  );

     void lsEModes    (out array<string,1> emodes   );
     void readE       (in  string          emode,
                       in  string          spec,
                       out string          buffer, 
                       out long            len_out  );
     void writeE      (in  string          emode,
                       in  string          spec,
                       in  string          buffer,
                       out long            len_out  );

We think that adding the 7 calls does not bloat the API (although it increases
the number of file methods significantly), but will make the API much more
usable for the targeted use cases.
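
As a rough illustration of the intended readV semantics, here is a
minimal Python sketch.  It assumes that each ivec entry carries an
(offset, length) pair, which the signatures above do not spell out,
and it uses an in-memory buffer in place of a remote file; the name
read_v is made up for this example.

```python
import io

def read_v(f, ivecs):
    """Return one buffer per (offset, length) entry, in request order."""
    buffers = []
    for offset, length in ivecs:
        f.seek(offset)
        buffers.append(f.read(length))
    return buffers

data = io.BytesIO(b"HEADERxxxxBODY-AyyyyBODY-B")

# Three disjoint ranges, described in a single call.
parts = read_v(data, [(0, 6), (10, 6), (20, 6)])
print(parts)
```

Locally this is just a loop over seek/read.  The gain would come from
a remote implementation shipping the whole vector in one request
instead of one request per range.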

Please comment :-)
     
Cheers, Andre.




Quoting [Andre Merzky] (Jun 12 2005):
> 
> Hi again, 
> 
> consider the following use case for remote IO.  Given a large
> binary 2D field on a remote host, the client wants to access
> a 2D sub portion of that field.  Depending on the remote
> file layout, that usually requires more than one read
> operation, since the standard read (offset, length) is
> agnostic to the 2D layout.
> 
> For more complex operations (subsampling, get a piece of a
> jpg file), the number of remote operations grows very fast.
> Latency then strongly discourages that type of remote IO.
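
The cost can be made concrete with a small Python sketch.  It assumes
a row-major layout, and the field size, element size and 50 ms round
trip are made-up figures chosen only for illustration; the helper
name subregion_reads is hypothetical.

```python
def subregion_reads(width, elem_size, x0, y0, w, h):
    """List the (offset, length) reads for a w x h block at (x0, y0)."""
    row_bytes = width * elem_size
    return [((y0 + row) * row_bytes + x0 * elem_size, w * elem_size)
            for row in range(h)]

# 512 x 512 block out of a field that is 4096 elements wide, 8 bytes each.
reads = subregion_reads(width=4096, elem_size=8, x0=100, y0=200, w=512, h=512)

latency_s = 0.05  # assumed round-trip time, illustration only
print(len(reads), "reads,", len(reads) * latency_s, "s of pure latency")
```

Each row of the block needs its own read (offset, length), so the
number of round trips grows with the block height no matter how
little data each read returns.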
> 
> For that reason, I think that the remote file IO as
> specified by SAGA's Strawman as is will only be usable for a
> limited and trivial set of remote I/O use cases.
> 
> There are three (basic) approaches:
> 
>   A) get the whole thing, and do ops locally
>      Pro: - one remote op, 
>           - simple logic
>           - remote side doesn't need to know about file
>             structure
>           - easily implementable on application level
>      Con: - getting the header info of a 1GB data file comes
>             with, well, some overhead ;-)
> 
>   B) clustering of calls: do many reads, but send them as a
>      single request.
>      Pro: - transparent to application
>           - efficient
>      Con: - need to know about dependencies of reads
>             (a header read needed to determine size of
>             field), or include explicit 'flushes'
>           - need a protocol to support that
>           - the remote side needs to support that
> 
>   C) data specific remote ops: send a high level command,
>      and get exactly what you want.
>      Pro: - most efficient
>      Con: - need a protocol to support that
>           - the remote side needs to support that _specific_
>             command
> 
> The last approach (C) is what I have the best experience with.
> Also, that is what GridFTP as a common file access protocol
> supports via ERET/ESTO operations.
> 
> I want to propose to include a C-like extension to the File
> API of the strawman, which basically maps well to GridFTP,
> but should also map to other implementations of C.
> 
> That extension would look like:
> 
>       void lsEModes    (out array<string,1> emodes   );
>       void eWrite      (in  string          emode,
>                         in  string          spec,
>                         in  string          buffer,
>                         out long            len_out  );
>       void eRead       (in  string          emode,
>                         in  string          spec,
>                         out string          buffer, 
>                         out long            len_out  );
> 
>       - hooks for gridftp-like opaque ERET/ESTO features
>       - spec:  string for pattern as in GridFTP's ESTO/ERET
>       - emode: string for ident.  as in GridFTP's ESTO/ERET
> 
> EMode:        a specific remote I/O command supported
> lsEModes:     list the EModes available in this implementation
> eRead/eWrite: read/write data according to the emode spec
> 
> Example (in perl for brevity):
> 
>   my $file   = SAGA::File->new ("http://www.google.com/intl/en/images/logo.gif");
>   my @emodes = $file->lsEModes ();
> 
>   if ( grep (/^jpeg_block$/, @emodes) )
>   {
>     my ($buff, $len) = $file->eRead ("jpeg_block", "22x4+7+8");
>   }
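
The same check-then-read flow can be tried locally with a small Python
stand-in.  Everything in it is hypothetical: the class LocalField, the
emode name "block", and the reading of the spec as "WxH+X+Y" are
assumptions for illustration, since the spec string is meant to be
opaque to the API.

```python
import re

class LocalField:
    """Hypothetical stand-in for a file offering lsEModes/eRead."""

    def __init__(self, data, width):
        self.data = data      # row-major bytes, one byte per element
        self.width = width    # elements per row

    def ls_emodes(self):
        return ["block"]

    def e_read(self, emode, spec):
        if emode not in self.ls_emodes():
            raise ValueError("unsupported emode: " + emode)
        # Assumed spec syntax "WxH+X+Y"; a real emode defines its own.
        m = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", spec)
        if m is None:
            raise ValueError("bad spec: " + spec)
        w, h, x, y = map(int, m.groups())
        rows = []
        for r in range(h):
            begin = (y + r) * self.width + x
            rows.append(self.data[begin:begin + w])
        buf = b"".join(rows)
        return buf, len(buf)

f = LocalField(bytes(range(24)), width=6)
if "block" in f.ls_emodes():
    buff, length = f.e_read("block", "3x2+2+1")
    print(list(buff), length)
```

The caller only passes two strings, and all knowledge of the data
layout stays on the side that implements the emode.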
> 
> I would discourage support for B, since I do not know of any
> protocol supporting that approach efficiently, and also it
> needs approximately the same infrastructure setup as C.
> 
> As A is easily implementable on application level, or within
> any SAGA implementation, there is no need for support on API
> level -- however, A is insufficient for all but some trivial
> cases.
> 
> Comments welcome :-))
> 
> Cheers, Andre.
-- 
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+




