[saga-rg] proposal for extended file IO

Andre Merzky andre at merzky.net
Tue Jun 14 12:52:40 CDT 2005


Ah, by the way: (Grid)RPC is close to that model: just
exchange a remote job with a remote procedure :-)  Or remote
service request :-))

In fact, I think the RPC lies just between eread as in (C)
and remote jobs as in (E).

Cheers, Andre.


Quoting [Andre Merzky] (Jun 14 2005):
> 
> I think E is a good model for large operations.  Basically,
> you do data preprocessing on request.  It is good for cases
> where you can accept a round trip time of some seconds to
> minutes.  I think it won't help for interactive visualization
> though...
> 
> Andre.
> 
> Quoting [Andrei Hutanu] (Jun 14 2005):
> > 
> > I support this model in the more generic way I proposed ..
> > It's a more accurate model because you explicitly
> > submit a job when you are doing (potentially) complex operations
> > and it allows for better performance.
> > 
> > I think John is right when saying that eread is tricky and hard to use,
> > I also agree with Andre that a simpler model is even more limiting than 
> > eread.
> > Any other opinions?
> > 
> > Andrei
> > 
> > >E (I already wrote that on the gat-devel-list):
> > >
> > >You could submit a process to your archive which extracts the data for you 
> > >and registers the result as a new logical file. On the client side you 
> > >could wrap it in a nice library hiding the job submission stuff and on the 
> > >server/archive side you would prepare some executables for your tasks:
> > >extract_hyperslab_from_hdf5 <logical_file_name> <hyperslab> <new_logical_file_name>
> > >compress_file <logical_file_name> <compression_method> <new_logical_file_name>
> > >...
> > >
> > >It shouldn't be that hard to prevent users from executing other 
> > >executables on the server.
> > >
> > >This method is async and you can use the job interface to check for the 
> > >status of your conversion job.
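A minimal sketch of what such a client-side wrapper around model E could look like, in Python. Everything here is hypothetical: the `Archive` class is an in-memory stand-in for the archive plus its job interface, the logical file catalog is a plain dict, and the two whitelisted "executables" are ordinary functions. It only illustrates the shape of the idea (whitelisted server-side tasks, async submission, result registered under a new logical name), not any real GAT or SAGA call.

```python
import zlib
from concurrent.futures import Future, ThreadPoolExecutor


def extract_hyperslab(catalog, src, hyperslab, dst):
    """Toy 'executable': copy a byte range and register it under a new name."""
    start, stop = hyperslab
    catalog[dst] = catalog[src][start:stop]
    return dst


def compress_file(catalog, src, method, dst):
    """Toy 'executable': compress a logical file (only zlib in this sketch)."""
    if method != "zlib":
        raise ValueError("unsupported compression method: " + method)
    catalog[dst] = zlib.compress(catalog[src])
    return dst


# Server-side whitelist: nothing outside this table can be executed.
ALLOWED = {
    "extract_hyperslab": extract_hyperslab,
    "compress_file": compress_file,
}


class Archive:
    """Hypothetical stand-in for an archive that accepts conversion jobs."""

    def __init__(self):
        self.catalog = {}  # logical file name -> bytes
        self._pool = ThreadPoolExecutor(max_workers=2)

    def submit(self, executable, *args) -> Future:
        if executable not in ALLOWED:
            raise PermissionError("not a registered task: " + executable)
        # The returned Future plays the role of the job handle.
        return self._pool.submit(ALLOWED[executable], self.catalog, *args)


archive = Archive()
archive.catalog["field.raw"] = bytes(range(100))

job = archive.submit("extract_hyperslab", "field.raw", (10, 20), "field.slab")
new_name = job.result()  # blocks; job.done() would poll the status instead
```

The caller only sees a job handle and a new logical file name; the extracted data stays on the archive side until it is actually read.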
> > >
> > >>I think there is a 4th possibility.  If each of the I/O operations can
> > >>be requested asynchronously, then you can get the same net effect as
> > >>the ERET/ESTO functionality of the GridFTP.  The advantage of simply
> > >>embedding that functionality into the higher-level concept of
> > >>asynchronous calls is that if the underlying library does *not* support
> > >>the async operations (or some subset of the operations cannot be
> > >>performed asynchronously) , you can always perform the operations
> > >>synchronously and still be able present
> > >>
> > >>I do not like plan A or B for the reasons you state.  I do not like
> > >>Plan C because it is too tightly tied to a specific data transfer
> > >>system implementation. I would propose a Plan D that simply augments
> > >>the Task interface of SAGA.  For example, if you allowed the user to
> > >>fire off a number of async read operations
> > >>	Task handle1 = channel.read();
> > >>	Task handle2 = channel.read();
> > >>	container.addTask(handle1);
> > >>	container.addTask(handle2);
> > >>	container.waitAll();
> > >>
> > >>The read operations in this example can be submitted as an eRead
> > >>operation or they can be in separate threads, or they can simply be
> > >>executed synchronously when you call waitAll()  (this is in fact how
> > >>some of the async MPI I/O was done on the first SGI origin machines...
> > >>it was meant to look asynchronous, but in fact the calls did not
> > >>initiate until you did a "Wait" for them).
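To illustrate the "looks asynchronous, runs on wait" fallback described above, here is a small Python sketch. The `Task` and `TaskContainer` classes are hypothetical stand-ins, not the SAGA task interface: creating a task only records the call, and nothing is executed until `wait_all()` is invoked.

```python
import io


class Task:
    """Records a call; the work happens only when run() is invoked."""

    def __init__(self, fn, *args):
        self._fn = fn
        self._args = args
        self.done = False
        self.result = None

    def run(self):
        if not self.done:
            self.result = self._fn(*self._args)
            self.done = True


class TaskContainer:
    def __init__(self):
        self._tasks = []

    def add_task(self, task):
        self._tasks.append(task)

    def wait_all(self):
        # Synchronous fallback: execute every pending task in submission order.
        for task in self._tasks:
            task.run()


channel = io.BytesIO(b"abcdefgh")  # stand-in for a remote channel

handle1 = Task(channel.read, 4)
handle2 = Task(channel.read, 4)

container = TaskContainer()
container.add_task(handle1)
container.add_task(handle2)

started_early = handle1.done  # False: nothing has been read yet
container.wait_all()
```

A backend with real async support could start the reads in the constructor instead; the calling code would not change.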
> > >>
> > >>Anyways, using the task interface provides more degrees of freedom for
> > >>implementing async I/O than simply supporting the GridFTP way of doing
> > >>things and it meshes gracefully with I/O implementations that do *not*
> > >>offer an underlying async execution model.
> > >>
> > >>The only modification that would be useful to add to the tasking
> > >>interface is a notion of "readFrom()" and "writeTo()" which allows you
> > >>to specify the file offset together with the read.  Otherwise, the
> > >>statefulness of the read() call would make the entire "task" interface
> > >>useless with respect to file I/O.
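The point about statefulness can be sketched in Python as follows. The function name `read_from` is only an assumed spelling of the proposed "readFrom()", and a local temporary file stands in for the remote one. Because each call carries its own offset, the tasks do not depend on a shared file position and may complete in any order.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_from(path, offset, length):
    """Stateless read: the offset travels with the request."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Local temporary file as a stand-in for the remote file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(64)))
    path = tmp.name

with ThreadPoolExecutor(max_workers=4) as pool:
    # Deliberately submitted "out of order"; the results do not depend on it.
    handle2 = pool.submit(read_from, path, 32, 8)
    handle1 = pool.submit(read_from, path, 0, 8)
    first = handle1.result()
    second = handle2.result()

os.remove(path)
```

With a plain stateful `read()`, the same two tasks would return different data depending on which one happened to run first.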
> > >>
> > >>-john
> > >>
> > >>On Jun 12, 2005, at 11:02 AM, Andre Merzky wrote:
> > >>   
> > >>
> > >>>Hi again,
> > >>>
> > >>>consider the following use case for remote IO.  Given a
> > >>>large binary 2D field on a remote host, the client wants to
> > >>>access a 2D sub portion of that field.  Depending on the
> > >>>remote file layout, that usually requires more than one
> > >>>read operation, since the standard read (offset, length) is
> > >>>agnostic to the 2D layout.
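A small Python sketch of why one sub-block turns into several reads. It assumes a row-major field with fixed-size elements and no header, none of which is stated in the use case; the function name and parameters are made up for illustration.

```python
def subblock_reads(ncols, itemsize, row0, col0, sub_rows, sub_cols):
    """Return the (offset, length) pairs needed to fetch a 2D sub-block.

    Assumes a row-major layout: element (r, c) lives at byte offset
    (r * ncols + c) * itemsize.
    """
    reads = []
    for r in range(row0, row0 + sub_rows):
        offset = (r * ncols + col0) * itemsize
        length = sub_cols * itemsize
        reads.append((offset, length))
    return reads


# A 22 x 4 block starting at column 7, row 8, in a field that is
# 1000 columns wide with 8-byte elements.
reads = subblock_reads(ncols=1000, itemsize=8, row0=8, col0=7,
                       sub_rows=4, sub_cols=22)
```

Each row of the sub-block is contiguous, but consecutive rows are a full field row apart, so the client needs one read (and one round trip) per row unless the remote side understands the layout.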
> > >>>
> > >>>For more complex operations (subsampling, get a piece of a
> > >>>jpg file), the number of remote operations grows very fast.
> > >>>Latency then strongly discourages that type of remote IO.
> > >>>
> > >>>For that reason, I think that the remote file IO as
> > >>>specified by SAGA's Strawman as is will only be usable for a
> > >>>limited and trivial set of remote I/O use cases.
> > >>>
> > >>>There are three (basic) approaches:
> > >>>
> > >>> A) get the whole thing, and do ops locally
> > >>>    Pro: - one remote op,
> > >>>         - simple logic
> > >>>         - remote side doesn't need to know about file
> > >>>           structure
> > >>>         - easily implementable on application level
> > >>>    Con: - getting the header info of a 1GB data file comes
> > >>>           with, well, some overhead ;-)
> > >>>
> > >>> B) clustering of calls: do many reads, but send them as a
> > >>>    single request.
> > >>>    Pro: - transparent to application
> > >>>         - efficient
> > >>>    Con: - need to know about dependencies of reads
> > >>>           (a header read needed to determine size of
> > >>>           field), or include explicit 'flushes'
> > >>>         - need a protocol to support that
> > >>>         - the remote side needs to support that
> > >>>
> > >>> C) data specific remote ops: send a high level command,
> > >>>    and get exactly what you want.
> > >>>    Pro: - most efficient
> > >>>    Con: - need a protocol to support that
> > >>>         - the remote side needs to support that _specific_
> > >>>           command
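The dependency problem listed under B can be made concrete with a short Python sketch. The `BatchedReader` class is hypothetical, and an in-memory buffer stands in for the remote side; one `flush()` is counted as one round trip. The size of the second read depends on the header, so an explicit flush is needed in between.

```python
import io
import struct


class BatchedReader:
    """Queues reads and sends them as one request per flush()."""

    def __init__(self, backend):
        self._backend = backend  # file-like stand-in for the remote side
        self._pending = []
        self.round_trips = 0

    def read(self, offset, length):
        # Nothing is transferred yet; the slot is filled on flush().
        slot = {"offset": offset, "length": length, "data": None}
        self._pending.append(slot)
        return slot

    def flush(self):
        if not self._pending:
            return
        self.round_trips += 1  # the whole batch travels as one request
        for slot in self._pending:
            self._backend.seek(slot["offset"])
            slot["data"] = self._backend.read(slot["length"])
        self._pending.clear()


# Toy file: 4-byte little-endian length header, then the payload.
payload = struct.pack("<I", 5) + b"hello world"
reader = BatchedReader(io.BytesIO(payload))

header = reader.read(0, 4)
reader.flush()  # required: the next read depends on the header
(size,) = struct.unpack("<I", header["data"])

part_a = reader.read(4, size)
part_b = reader.read(4 + size, 6)
reader.flush()  # the two independent reads share one round trip
```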
> > >>>
> > >>>The last approach (C) is what I have the best experience with.
> > >>>Also, that is what GridFTP as a common file access protocol
> > >>>supports via ERET/ESTO operations.
> > >>>
> > >>>I want to propose to include a C-like extension to the File
> > >>>API of the strawman, which basically maps well to GridFTP,
> > >>>but should also map to other implementations of C.
> > >>>
> > >>>That extension would look like:
> > >>>
> > >>>     void lsEModes   (out array<string,1> emodes   );
> > >>>     void eWrite      (in  string          emode,
> > >>>                       in  string          spec,
> > >>>                       in  string          buffer,
> > >>>                       out long            len_out  );
> > >>>     void eRead       (in  string          emode,
> > >>>                       in  string          spec,
> > >>>                       out string          buffer,
> > >>>                       out long            len_out  );
> > >>>
> > >>>     - hooks for gridftp-like opaque ERET/ESTO features
> > >>>     - spec:  string for pattern as in GridFTP's ESTO/ERET
> > >>>     - emode: string for ident.  as in GridFTP's ESTO/ERET
> > >>>
> > >>>EMode:        a specific remote I/O command supported
> > >>>lsEModes:     list the EModes available in this implementation
> > >>>eRead/eWrite: read/write data according to the emode spec
> > >>>
> > >>>Example (in perl for brevity):
> > >>>
> > >>> my $file   = SAGA::File->new
> > >>>   ("http://www.google.com/intl/en/images/logo.gif");
> > >>> my @emodes = $file->lsEModes ();
> > >>>
> > >>> if ( grep (/^jpeg_block$/, @emodes) )
> > >>> {
> > >>>   my ($buff, $len) = $file->eRead ("jpeg_block", "22x4+7+8");
> > >>> }
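For comparison, a toy Python rendering of the same three calls. This is only a sketch under assumptions: the class `ToyFile`, the emode name `byte_block`, and the reading of the spec string as `<width>x<height>+<x>+<y>` are all made up for illustration, and the "remote" data is a local byte string laid out row by row. No bounds checking is done.

```python
import re


class ToyFile:
    """In-memory stand-in for a file object offering lsEModes/eRead."""

    def __init__(self, data, ncols):
        self._data = data
        self._ncols = ncols
        # emode name -> handler; a real backend would advertise its own set.
        self._emodes = {"byte_block": self._read_block}

    def ls_emodes(self):
        return sorted(self._emodes)

    def e_read(self, emode, spec):
        if emode not in self._emodes:
            raise ValueError("unsupported emode: " + emode)
        buffer = self._emodes[emode](spec)
        return buffer, len(buffer)

    def _read_block(self, spec):
        # Assumed spec format: <width>x<height>+<x>+<y>
        match = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", spec)
        if match is None:
            raise ValueError("bad spec: " + spec)
        width, height, x, y = map(int, match.groups())
        rows = []
        for r in range(y, y + height):
            start = r * self._ncols + x
            rows.append(self._data[start:start + width])
        return b"".join(rows)


data = bytes(i % 256 for i in range(32 * 32))  # 32 x 32 byte field
f = ToyFile(data, ncols=32)

if "byte_block" in f.ls_emodes():
    buff, length = f.e_read("byte_block", "22x4+7+8")
```

The client issues a single call; how many low-level reads the block costs is decided on the side that knows the file layout.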
> > >>>
> > >>>I would discourage support for B, since I do not know of
> > >>>any protocol supporting that approach efficiently, and it
> > >>>also needs approximately the same infrastructure setup as C.
> > >>>
> > >>>As A is easily implementable on application level, or within
> > >>>any SAGA implementation, there is no need for support on API
> > >>>level -- however, A is insufficient for all but some trivial
> > >>>cases.
> > >>>
> > >>>Comments welcome :-))
> > >>>
> > >>>Cheers, Andre.
> > >>>
> > >>>
> > >>>--
> > >>>+-----------------------------------------------------------------+
> > >>>
> > >>>| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
> > >>>| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
> > >>>| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
> > >>>| De Boelelaan 1083a                | www:  http://www.merzky.net |
> > >>>| 1081 HV Amsterdam, Netherlands    |                             |
> > >>>
> > >>>+-----------------------------------------------------------------+




