[saga-rg] proposal for extended file IO

Thorsten Schuett schuett at zib.de
Tue Jun 14 09:21:50 CDT 2005


On Monday 13 June 2005 19:28, John Shalf wrote:
> Hi Andre,
And here are options B.2 and E.

B.2:
You could use pitfalls to describe the clustering of reads (google for  
"Remote Partial File Access Using Compact Pattern Descriptions"). It is a 
compact language for describing regular subsets of files and it should at 
least address one of your cons: need a protocol...
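
To illustrate the idea only (this is not the notation from the paper, just a
minimal sketch): a regular subset of a file can be described by four numbers
and expanded into explicit reads on the remote side. The names StridedPattern
and read_pattern are made up for this example.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class StridedPattern:
    """Hypothetical compact description of a regular file subset."""
    start: int    # byte offset of the first block
    length: int   # bytes per block
    stride: int   # distance between the starts of consecutive blocks
    count: int    # number of blocks

    def ranges(self) -> Iterator[Tuple[int, int]]:
        """Expand the pattern into explicit (offset, length) reads."""
        for i in range(self.count):
            yield (self.start + i * self.stride, self.length)


def read_pattern(data: bytes, pattern: StridedPattern) -> bytes:
    """Gather all blocks selected by the pattern from an in-memory file."""
    return b"".join(data[off:off + n] for off, n in pattern.ranges())
```

The client sends one small pattern instead of `count` separate read requests,
so the number of round trips no longer depends on the number of blocks.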

E (I already wrote that on the gat-devel-list):

You could submit a process to your archive which extracts the data for you and 
registers the result as a new logical file. On the client side you could wrap 
it in a nice library hiding the job submission stuff and on the 
server/archive side you would prepare some executables for your tasks:
extract_hyperslab_from_hdf5 <logical_file_name> <hyperslab> <new_logical_file_name>
compress_file <logical_file_name> <compression_method> <new_logical_file_name>
...

It shouldn't be that hard to prevent users from executing other executables on 
the server.

This method is async and you can use the job interface to check for the status 
of your conversion job.
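
A rough sketch of what such a client-side wrapper could look like. The submit
callable stands in for whatever job interface is available (it is an
assumption, not an existing API), and the whitelist is one simple way to keep
users from running other executables.

```python
from typing import Callable, List, Sequence

# Executables the archive is willing to run (names taken from the examples above).
ALLOWED_EXECUTABLES = frozenset({"extract_hyperslab_from_hdf5", "compress_file"})

# Hypothetical job-submission hook: takes an argv list, returns a job id.
Submit = Callable[[List[str]], str]


def build_job(executable: str, args: Sequence[str]) -> List[str]:
    """Return the argv for an archive job, rejecting unknown executables."""
    if executable not in ALLOWED_EXECUTABLES:
        raise PermissionError(f"executable not allowed: {executable}")
    return [executable, *args]


def extract_hyperslab(submit: Submit, logical_file: str, hyperslab: str,
                      new_logical_file: str) -> str:
    """Submit an extraction job; the caller polls the returned job id."""
    return submit(build_job("extract_hyperslab_from_hdf5",
                            [logical_file, hyperslab, new_logical_file]))
```

The application only sees extract_hyperslab(); the job id it returns is what
you would hand to the job interface to check the status.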



> I think there is a 4th possibility.  If each of the I/O operations can
> be requested asynchronously, then you can get the same net effect as
> the ERET/ESTO functionality of the GridFTP.  The advantage of simply
> embedding that functionality into the higher-level concept of
> asynchronous calls is that if the underlying library does *not* support
> the async operations (or some subset of the operations cannot be
> performed asynchronously), you can always perform the operations
> synchronously and still be able present
>
> I do not like plan A or B for the reasons you state.  I do not like
> Plan C because it is too tightly tied to a specific data transfer
> system implementation. I would propose a Plan D that simply augments
> the Task interface of SAGA.  For example, if you allowed the user to
> fire off a number of async read operations
> 	Task handle1 = channel.read();
> 	Task handle2 = channel.read();
> 	container.addTask(handle1);
> 	container.addTask(handle2);
> 	container.waitAll();
>
> The read operations in this example can be submitted as an eRead
> operation or they can be in separate threads, or they can simply be
> executed synchronously when you call waitAll()  (this is in fact how
> some of the async MPI I/O was done on the first SGI origin machines...
> it was meant to look asynchronous, but in fact the calls did not
> initiate until you did a "Wait" for them).
>
> Anyways, using the task interface provides more degrees of freedom for
> implementing async I/O than simply supporting the GridFTP way of doing
> things and it meshes gracefully with I/O implementations that do *not*
> offer an underlying async execution model.
>
> The only modification that would be useful to add to the tasking
> interface is a notion of "readFrom()" and "writeTo()" which allows you
> to specify the file offset together with the read.  Otherwise, the
> statefulness of the read() call would make the entire "task" interface
> useless with respect to file I/O.
>
> -john
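
A minimal sketch of the Plan D idea, assuming the simplest possible backend:
tasks that do nothing until you wait for them, as in the MPI I/O anecdote. The
class and function names are invented for illustration and are not the SAGA
Task interface. Because read_from() carries its own offset, the tasks do not
depend on the file position left behind by other reads.

```python
import io
from typing import BinaryIO, Callable, List, Optional


class Task:
    """A deferred operation; it runs only when wait() is called."""

    def __init__(self, operation: Callable[[], bytes]) -> None:
        self._operation = operation
        self.result: Optional[bytes] = None
        self.done = False

    def wait(self) -> None:
        if not self.done:
            self.result = self._operation()
            self.done = True


class TaskContainer:
    def __init__(self) -> None:
        self._tasks: List[Task] = []

    def add_task(self, task: Task) -> None:
        self._tasks.append(task)

    def wait_all(self) -> None:
        # Synchronous fallback: run the tasks one after another.
        for task in self._tasks:
            task.wait()


def read_from(stream: BinaryIO, offset: int, length: int) -> Task:
    """Create a read task that seeks to its own offset before reading."""
    def operation() -> bytes:
        stream.seek(offset)
        return stream.read(length)
    return Task(operation)


# Usage with an in-memory stream standing in for a remote channel.
channel = io.BytesIO(b"0123456789")
handle1 = read_from(channel, 6, 3)
handle2 = read_from(channel, 0, 2)
container = TaskContainer()
container.add_task(handle1)
container.add_task(handle2)
container.wait_all()  # both reads execute here
```

A backend with real async support could start the reads in add_task() or batch
them into one request in wait_all() without changing the calling code.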
>
> On Jun 12, 2005, at 11:02 AM, Andre Merzky wrote:
> > Hi again,
> >
> > consider the following use case for remote IO.  Given a large
> > binary 2D field on a remote host, the client wants to access
> > a 2D sub portion of that field.  Depending on the remote
> > file layout, that usually requires more than one read
> > operation, since the standard read (offset, length) is
> > agnostic to the 2D layout.
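
To make the cost concrete, here is a small sketch that lists the reads needed
for a sub-block. It assumes a row-major field with fixed-size elements and no
header, which the original mail does not specify; subblock_reads is a made-up
helper name.

```python
from typing import List, Tuple


def subblock_reads(field_width: int, elem_size: int,
                   x0: int, y0: int, width: int, height: int
                   ) -> List[Tuple[int, int]]:
    """Return the (offset, length) reads for a width x height block whose
    top-left element is (x0, y0), in a row-major field without a header."""
    row_bytes = field_width * elem_size
    return [((y0 + row) * row_bytes + x0 * elem_size, width * elem_size)
            for row in range(height)]
```

One read per row of the sub-block is needed, so a block that is 1000 rows high
costs 1000 round trips with a plain read (offset, length).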
> >
> > For more complex operations (subsampling, get a piece of a
> > jpg file), the number of remote operations grows very fast.
> > Latency then strongly discourages that type of remote IO.
> >
> > For that reason, I think that the remote file IO as
> > specified by SAGA's Strawman as is will only be usable for a
> > limited and trivial set of remote I/O use cases.
> >
> > There are three (basic) approaches:
> >
> >   A) get the whole thing, and do ops locally
> >      Pro: - one remote op,
> >           - simple logic
> >           - remote side doesn't need to know about file
> >             structure
> >           - easily implementable on application level
> >      Con: - getting the header info of a 1GB data file comes
> >             with, well, some overhead ;-)
> >
> >   B) clustering of calls: do many reads, but send them as a
> >      single request.
> >      Pro: - transparent to application
> >           - efficient
> >      Con: - need to know about dependencies of reads
> >             (a header read needed to determine size of
> >             field), or include explicit 'flushes'
> >           - need a protocol to support that
> >           - the remote side needs to support that
> >
> >   C) data specific remote ops: send a high level command,
> >      and get exactly what you want.
> >      Pro: - most efficient
> >      Con: - need a protocol to support that
> >           - the remote side needs to support that _specific_
> >             command
> >
> > The last approach (C) is what I have had the best experience with.
> > Also, that is what GridFTP as a common file access protocol
> > supports via ERET/ESTO operations.
> >
> > I want to propose to include a C-like extension to the File
> > API of the strawman, which basically maps well to GridFTP,
> > but should also map to other implementations of C.
> >
> > That extension would look like:
> >
> >       void lsEModes    (out array<string,1> emodes   );
> >       void eWrite      (in  string          emode,
> >                         in  string          spec,
> >                         in  string          buffer,
> >                         out long            len_out  );
> >       void eRead       (in  string          emode,
> >                         in  string          spec,
> >                         out string          buffer,
> >                         out long            len_out  );
> >
> >       - hooks for gridftp-like opaque ERET/ESTO features
> >       - spec:  string for pattern as in GridFTP's ESTO/ERET
> >       - emode: string for ident.  as in GridFTP's ESTO/ERET
> >
> > EMode:        a specific remote I/O command supported
> > lsEModes:     list the EModes available in this implementation
> > eRead/eWrite: read/write data according to the emode spec
> >
> > Example (in perl for brevity):
> >
> >   my $file   = SAGA::File->new
> >                  ("http://www.google.com/intl/en/images/logo.gif");
> >   my @emodes = $file->lsEModes ();
> >
> >   if ( grep (/^jpeg_block$/, @emodes) )
> >   {
> >     my ($buff, $len) = $file->eRead ("jpeg_block", "22x4+7+8");
> >   }
> >
> > I would discourage support for B, since I do not know of any
> > protocol supporting that approach efficiently, and also it
> > needs approximately the same infrastructure setup as C.
> >
> > As A is easily implementable on application level, or within
> > any SAGA implementation, there is no need for support on API
> > level -- however, A is insufficient for all but some trivial
> > cases.
> >
> > Comments welcome :-))
> >
> > Cheers, Andre.
> >
> >
> > --
> > +-----------------------------------------------------------------+
> > | Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
> > | Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
> > | Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
> > | De Boelelaan 1083a                | www:  http://www.merzky.net |
> > | 1081 HV Amsterdam, Netherlands    |                             |
> > +-----------------------------------------------------------------+




