[saga-rg] proposal for extended file IO

Tue Jun 14 12:47:07 CDT 2005

I think E is a good model for large operations.  Basically,
you do data preprocessing on request.  It is good for cases
where you can accept a round trip time of some seconds to
minutes.  I think it won't help for interactive visualization
though...

Andre.

Quoting [Andrei Hutanu] (Jun 14 2005):
> 
> I support this model in the more generic way I proposed ..
> It's a more accurate model because you explicitly
> submit a job when you are doing (potentially) complex operations
> and it allows for better performance.
> 
> I think John is right when saying that eread is tricky and hard to use,
> I also agree with Andre that a simpler model is even more limiting than 
> eread.
> Any other opinions?
> 
> Andrei
> 
> >E (I already wrote that on the gat-devel-list):
> >
> >You could submit a process to your archive which extracts the data for you 
> >and registers the result as a new logical file. On the client side you 
> >could wrap it in a nice library hiding the job submission stuff and on the 
> >server/archive side you would prepare some executables for your tasks:
> >  extract_hyperslab_from_hdf5 <logical_file_name> <hyperslab> <new_logical_file_name>
> >  compress_file <logical_file_name> <compression_method> <new_logical_file_name>
> >...
> >
> >It shouldn't be that hard to prevent users from executing other 
> >executables on the server.
> >
> >This method is async and you can use the job interface to check for the 
> >status of your conversion job.
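A minimal sketch of the client-side wrapper idea described above, assuming an in-memory stand-in for the archive. `FakeArchive`, `ALLOWED`, `extract_hyperslab`, and the `lfn:` names are hypothetical; only the executable names come from the mail. It shows the whitelist check, the job submission, and the status polling hidden behind one call.

```python
import itertools

# Hypothetical whitelist of server-side executables; the names follow the
# examples in the mail, everything else here is an assumption.
ALLOWED = {"extract_hyperslab_from_hdf5", "compress_file"}


class FakeArchive:
    """In-memory stand-in for an archive with a job interface."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}              # job id -> job state
        self.logical_files = set()   # registered logical file names

    def submit(self, executable, *args):
        # Refuse anything that is not an explicitly prepared operation.
        if executable not in ALLOWED:
            raise PermissionError(f"{executable} is not an allowed operation")
        job_id = next(self._ids)
        # By the convention above, the last argument is the new logical file.
        self._jobs[job_id] = {"polls_left": 2, "result": args[-1]}
        return job_id

    def status(self, job_id):
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return "running"
        self.logical_files.add(job["result"])
        return "done"


def extract_hyperslab(archive, src, hyperslab, dst):
    """Client-side wrapper hiding job submission and polling."""
    job_id = archive.submit("extract_hyperslab_from_hdf5", src, hyperslab, dst)
    while archive.status(job_id) != "done":
        pass  # a real client would sleep or do other work here
    return dst
```

A caller would only see `extract_hyperslab(archive, "lfn:big.h5", "0:10,0:10", "lfn:slab.h5")` and then open the new logical file as usual.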
> >
> >>I think there is a 4th possibility.  If each of the I/O operations can
> >>be requested asynchronously, then you can get the same net effect as
> >>the ERET/ESTO functionality of the GridFTP.  The advantage of simply
> >>embedding that functionality into the higher-level concept of
> >>asynchronous calls is that if the underlying library does *not* support
> >>the async operations (or some subset of the operations cannot be
> >>performed asynchronously) , you can always perform the operations
> >>synchronously and still be able present
> >>
> >>I do not like plan A or B for the reasons you state.  I do not like
> >>Plan C because it is too tightly tied to a specific data transfer
> >>system implementation. I would propose a Plan D that simply augments
> >>the Task interface of SAGA.  For example, if you allowed the user to
> >>fire off a number of async read operations
> >>	Task handle1 = channel.read();
> >>	Task handle2 = channel.read();
> >>	container.addTask(handle1);
> >>	container.addTask(handle2);
> >>	container.waitAll();
> >>
> >>The read operations in this example can be submitted as an eRead
> >>operation or they can be in separate threads, or they can simply be
> >>executed synchronously when you call waitAll()  (this is in fact how
> >>some of the async MPI I/O was done on the first SGI origin machines...
> >>it was meant to look asynchronous, but in fact the calls did not
> >>initiate until you did a "Wait" for them).
> >>
> >>Anyways, using the task interface provides more degrees of freedom for
> >>implementing async I/O than simply supporting the GridFTP way of doing
> >>things and it meshes gracefully with I/O implementations that do *not*
> >>offer an underlying async execution model.
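A minimal sketch of how a task interface can sit on top of a purely synchronous backend, assuming deferred execution at wait time. `Task`, `TaskContainer`, `add_task`, and `wait_all` are hypothetical Python names modelled on the pseudocode above, not the SAGA API.

```python
class Task:
    """Deferred call; the work runs only when wait() is first invoked."""

    def __init__(self, fn, *args):
        self._fn = fn
        self._args = args
        self._done = False
        self._result = None

    def wait(self):
        if not self._done:
            # No async support underneath: execute synchronously now.
            self._result = self._fn(*self._args)
            self._done = True
        return self._result


class TaskContainer:
    """Collects tasks and waits for all of them in submission order."""

    def __init__(self):
        self._tasks = []

    def add_task(self, task):
        self._tasks.append(task)

    def wait_all(self):
        return [task.wait() for task in self._tasks]
```

An implementation with real async support could start the work in the `Task` constructor instead; the calling code would not change.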
> >>
> >>The only modification that would be useful to add to the tasking
> >>interface is a notion of "readFrom()" and "writeTo()" which allows you
> >>to specify the file offset together with the read.  Otherwise, the
> >>statefulness of the read() call would make the entire "task" interface
> >>useless with respect to file I/O.
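A minimal sketch of why an explicit offset matters for concurrent read tasks, assuming a local file and a POSIX system (`os.pread` is not available on Windows). `Channel`, `read_from`, and `demo` are hypothetical names; the returned futures stand in for SAGA tasks.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


class Channel:
    """Hypothetical channel whose reads carry their own file offset."""

    def __init__(self, path, workers=4):
        self._fd = os.open(path, os.O_RDONLY)
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def read_from(self, offset, length):
        # os.pread does not move the shared file position, so tasks
        # running in any order cannot disturb each other.
        return self._pool.submit(os.pread, self._fd, length, offset)

    def close(self):
        self._pool.shutdown(wait=True)
        os.close(self._fd)


def demo():
    """Issue two out-of-order reads against a small temporary file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789abcdef")
        path = f.name
    channel = Channel(path)
    try:
        t1 = channel.read_from(10, 4)
        t2 = channel.read_from(0, 4)
        return t1.result(), t2.result()
    finally:
        channel.close()
        os.unlink(path)
```

With a stateful `read()` the result of each task would depend on which one happened to run first; with the offset in the call, each task is self-contained.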
> >>
> >>-john
> >>
> >>On Jun 12, 2005, at 11:02 AM, Andre Merzky wrote:
> >>   
> >>
> >>>Hi again,
> >>>
> >>>consider the following use case for remote IO.  Given a large
> >>>binary 2D field on a remote host, the client wants to access
> >>>a 2D sub portion of that field.  Depending on the remote
> >>>file layout, that usually requires more than one read
> >>>operation, since the standard read (offset, length) is
> >>>agnostic to the 2D layout.
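To make the use case concrete, here is a small sketch that lists the plain (offset, length) reads needed for a 2D sub-field, assuming a headerless, row-major file with fixed-size elements. The function name and parameters are hypothetical.

```python
def subregion_reads(field_width, elem_size, x0, y0, sub_w, sub_h):
    """Return (offset, length) byte ranges, one per row of the sub-field.

    Assumes a headerless, row-major layout with fixed-size elements.
    """
    row_bytes = field_width * elem_size   # bytes per full row of the field
    length = sub_w * elem_size            # bytes per row of the sub-field
    return [((y0 + row) * row_bytes + x0 * elem_size, length)
            for row in range(sub_h)]
```

For a field 1000 elements wide with 8-byte elements, a 22x4 sub-field at column 7, row 8 needs four separate reads of 176 bytes each, starting at byte offsets 64056, 72056, 80056, and 88056.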
> >>>
> >>>For more complex operations (subsampling, get a piece of a
> >>>jpg file), the number of remote operations grows very fast.
> >>>Latency then strongly discourages that type of remote IO.
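A rough cost model for the latency argument, assuming one round trip per remote operation and no pipelining. The function and the numbers in the note below are illustrative assumptions, not measurements.

```python
def transfer_time(n_ops, rtt_s, total_bytes, bandwidth_bps):
    """Crude estimate in seconds: one round trip per op plus payload time.

    Ignores pipelining, TCP slow start, and any server-side work.
    """
    return n_ops * rtt_s + total_bytes * 8 / bandwidth_bps
```

With an assumed 100 ms round trip and 100 Mbit/s link, fetching 176000 bytes in 1000 separate reads costs about 100 seconds, while the same payload in a single request costs about 0.11 seconds. The payload term is negligible; the round trips dominate.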
> >>>
> >>>For that reason, I think that the remote file IO as
> >>>specified by SAGA's Strawman as is will only be usable for a
> >>>limited and trivial set of remote I/O use cases.
> >>>
> >>>There are three (basic) approaches:
> >>>
> >>> A) get the whole thing, and do ops locally
> >>>    Pro: - one remote op,
> >>>         - simple logic
> >>>         - remote side doesn't need to know about file
> >>>           structure
> >>>         - easily implementable on application level
> >>>    Con: - getting the header info of a 1GB data file comes
> >>>           with, well, some overhead ;-)
> >>>
> >>> B) clustering of calls: do many reads, but send them as a
> >>>    single request.
> >>>    Pro: - transparent to application
> >>>         - efficient
> >>>    Con: - need to know about dependencies of reads
> >>>           (a header read needed to determine size of
> >>>           field), or include explicit 'flushes'
> >>>         - need a protocol to support that
> >>>         - the remote side needs to support that
> >>>
> >>> C) data specific remote ops: send a high level command,
> >>>    and get exactly what you want.
> >>>    Pro: - most efficient
> >>>    Con: - need a protocol to support that
> >>>         - the remote side needs to support that _specific_
> >>>           command
> >>>
> >>>The last approach (C) is what I have had the best experience with.
> >>>Also, that is what GridFTP as a common file access protocol
> >>>supports via ERET/ESTO operations.
> >>>
> >>>I want to propose to include a C-like extension to the File
> >>>API of the strawman, which basically maps well to GridFTP,
> >>>but should also map to other implementations of C.
> >>>
> >>>That extension would look like:
> >>>
> >>>     void lsEModes    (out array<string,1> emodes   );
> >>>     void eWrite      (in  string          emode,
> >>>                       in  string          spec,
> >>>                       in  string          buffer,
> >>>                       out long            len_out  );
> >>>     void eRead       (in  string          emode,
> >>>                       in  string          spec,
> >>>                       out string          buffer,
> >>>                       out long            len_out  );
> >>>
> >>>     - hooks for gridftp-like opaque ERET/ESTO features
> >>>     - spec:  string for pattern as in GridFTP's ESTO/ERET
> >>>     - emode: string for ident.  as in GridFTP's ESTO/ERET
> >>>
> >>>EMode:        a specific remote I/O command supported
> >>>lsEModes:     list the EModes available in this implementation
> >>>eRead/eWrite: read/write data according to the emode spec
> >>>
> >>>Example (in perl for brevity):
> >>>
> >>> my $file   = SAGA::File->new
> >>>   ("http://www.google.com/intl/en/images/logo.gif");
> >>> my @emodes = $file->lsEModes ();
> >>>
> >>> if ( grep (/^jpeg_block$/, @emodes) )
> >>> {
> >>>   my ($buff, $len) = $file->eRead ("jpeg_block", "22x4+7+8");
> >>> }
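A minimal sketch of how the lsEModes/eRead pair could dispatch to named server-side operations, assuming an in-memory byte field. `ExtendedFile`, the `block_2d` emode, and the `WxH+X+Y` reading of the spec string are all hypothetical; the proposal leaves the spec opaque, as in GridFTP's ERET/ESTO.

```python
import re


class ExtendedFile:
    """Hypothetical in-memory file with pluggable extended read modes."""

    def __init__(self, data, width):
        self._data = data      # bytes, one byte per element, row-major
        self._width = width    # elements per row
        self._emodes = {"block_2d": self._read_block}

    def ls_emodes(self):
        """List the extended modes this implementation supports."""
        return sorted(self._emodes)

    def e_read(self, emode, spec):
        """Run the named mode and return (buffer, length)."""
        if emode not in self._emodes:
            raise ValueError(f"unsupported emode: {emode}")
        buf = self._emodes[emode](spec)
        return buf, len(buf)

    def _read_block(self, spec):
        # "WxH+X+Y" is an assumed geometry syntax, not GridFTP's.
        m = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", spec)
        if m is None:
            raise ValueError(f"bad spec: {spec}")
        w, h, x, y = map(int, m.groups())
        rows = (self._data[(y + r) * self._width + x:
                           (y + r) * self._width + x + w]
                for r in range(h))
        return b"".join(rows)
```

The client first checks `ls_emodes()` and falls back to plain reads when the mode it wants is missing, which mirrors the `grep` test in the Perl example.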
> >>>
> >>>I would discourage support for B, since I do not know any
> >>>protocol supporting that approach efficiently, and also it
> >>>needs approximately the same infrastructure setup as C.
> >>>
> >>>As A is easily implementable on application level, or within
> >>>any SAGA implementation, there is no need for support on API
> >>>level -- however, A is insufficient for all but some trivial
> >>>cases.
> >>>
> >>>Comments welcome :-))
> >>>
> >>>Cheers, Andre.
> >>>
> >>>
> >>>--
> >>>+-----------------------------------------------------------------+
> >>>| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
> >>>| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
> >>>| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
> >>>| De Boelelaan 1083a                | www:  http://www.merzky.net |
> >>>| 1081 HV Amsterdam, Netherlands    |                             |
> >>>+-----------------------------------------------------------------+
