[saga-rg] proposal for extended file IO

Tue Jun 14 12:12:46 CDT 2005

I support this model in the more generic way I proposed.
It's a more accurate model because you explicitly
submit a job when you are doing (potentially) complex operations,
and it allows for better performance.

I think John is right when saying that eRead is tricky and hard to use.
I also agree with Andre that a simpler model is even more limiting than
eRead.
Any other opinions?

Andrei

>E (I already wrote that on the gat-devel-list):
>
>You could submit a process to your archive which extracts the data for you and 
>registers the result as a new logical file. On the client side you could wrap 
>it in a nice library hiding the job submission stuff and on the 
>server/archive side you would prepare some executables for your tasks:
>extract_hyperslab_from_hdf5 <logical_file_name> <hyperslab> 
><new_logical_file_name>
>compress_file <logical_file_name> <compression_method> <new_logical_file_name>
>...
>
>It shouldn't be that hard to prevent users from executing other executables on 
>the server.
>
>This method is async and you can use the job interface to check for the status 
>of your conversion job.
>
>>I think there is a 4th possibility.  If each of the I/O operations can
>>be requested asynchronously, then you can get the same net effect as
>>the ERET/ESTO functionality of the GridFTP.  The advantage of simply
>>embedding that functionality into the higher-level concept of
>>asynchronous calls is that if the underlying library does *not* support
>>the async operations (or some subset of the operations cannot be
>>performed asynchronously), you can always perform the operations
>>synchronously and still be able present
>>
>>I do not like plan A or B for the reasons you state.  I do not like
>>Plan C because it is too tightly tied to a specific data transfer
>>system implementation. I would propose a Plan D that simply augments
>>the Task interface of SAGA.  For example, if you allowed the user to
>>fire off a number of async read operations
>>	Task handle1 = channel.read();
>>	Task handle2 = channel.read();
>>	container.addTask(handle1);
>>	container.addTask(handle2);
>>	container.waitAll();
>>
>>The read operations in this example can be submitted as an eRead
>>operation or they can be in separate threads, or they can simply be
>>executed synchronously when you call waitAll()  (this is in fact how
>>some of the async MPI I/O was done on the first SGI origin machines...
>>it was meant to look asynchronous, but in fact the calls did not
>>initiate until you did a "Wait" for them).
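A minimal sketch of the deferred variant John describes, in Python. The names Task, TaskContainer, add_task and wait_all are hypothetical stand-ins, not the SAGA API. Here a task does no work until something waits for it, so the caller sees an async-looking interface even though execution is synchronous:

```python
from typing import Callable, List, Optional


class Task:
    """Hypothetical task: the work runs only when wait() is called."""

    def __init__(self, work: Callable[[], bytes]) -> None:
        self._work = work
        self._result: Optional[bytes] = None
        self.done = False

    def wait(self) -> Optional[bytes]:
        # Deferred execution: nothing has happened before this call.
        if not self.done:
            self._result = self._work()
            self.done = True
        return self._result


class TaskContainer:
    """Hypothetical container that waits for all of its tasks in order."""

    def __init__(self) -> None:
        self._tasks: List[Task] = []

    def add_task(self, task: Task) -> None:
        self._tasks.append(task)

    def wait_all(self) -> List[Optional[bytes]]:
        return [task.wait() for task in self._tasks]


data = b"0123456789"  # stands in for the remote file
handle1 = Task(lambda: data[0:4])
handle2 = Task(lambda: data[4:8])

container = TaskContainer()
container.add_task(handle1)
container.add_task(handle2)
started_early = handle1.done or handle2.done  # False: nothing ran yet
results = container.wait_all()
```

An implementation with real async support could start the work in Task.__init__ instead, without changing the calling code.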
>>
>>Anyways, using the task interface provides more degrees of freedom for
>>implementing async I/O than simply supporting the GridFTP way of doing
>>things and it meshes gracefully with I/O implementations that do *not*
>>offer an underlying async execution model.
>>
>>The only modification that would be useful to add to the tasking
>>interface is a notion of "readFrom()" and "writeTo()" which allows you
>>to specify the file offset together with the read.  Otherwise, the
>>statefulness of the read() call would make the entire "task" interface
>>useless with respect to file I/O.
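To illustrate why the offset has to travel with the request, here is a small Python sketch built on os.pread (a POSIX call, so this assumes a Unix-like system). The helper name read_from is hypothetical and only mirrors the "readFrom()" idea above. Because each call carries its own offset, the requests do not depend on a shared file position and can be issued in any order:

```python
import os
import tempfile


def read_from(fd: int, offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset` without moving the file position."""
    return os.pread(fd, length, offset)


# Create a small scratch file to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abcdefghij")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
try:
    # Issued "out of order" on purpose: each request is self-contained.
    second = read_from(fd, 5, 3)
    first = read_from(fd, 0, 3)
    # The descriptor's own position is untouched by either read.
    position = os.lseek(fd, 0, os.SEEK_CUR)
finally:
    os.close(fd)
    os.remove(path)
```

With a stateful read(), the result of the second task would depend on whether the first had already run.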
>>
>>-john
>>
>>On Jun 12, 2005, at 11:02 AM, Andre Merzky wrote:
>>
>>>Hi again,
>>>
>>>consider the following use case for remote IO.  Given a large
>>>binary 2D field on a remote host, the client wants to access
>>>a 2D sub portion of that field.  Depending on the remote
>>>file layout, that usually requires more than one read
>>>operation, since the standard read (offset, length) is
>>>agnostic to the 2D layout.
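A small worked example of that read count, in Python, assuming a row-major layout with fixed-size elements and no header (the function name and parameters are made up for illustration). Each row of the sub portion is contiguous in the file, but consecutive rows are not, so a block of h rows needs h separate (offset, length) reads:

```python
from typing import List, Tuple


def subblock_reads(field_width: int, elem_size: int,
                   x0: int, y0: int, w: int, h: int) -> List[Tuple[int, int]]:
    """Return one (offset, length) pair per row of a w x h block at (x0, y0)."""
    reads = []
    for row in range(y0, y0 + h):
        # Byte offset of element (row, x0) in a row-major file.
        offset = (row * field_width + x0) * elem_size
        reads.append((offset, w * elem_size))
    return reads


# A 4 x 3 block out of a field that is 1000 elements wide, 8 bytes each.
requests = subblock_reads(field_width=1000, elem_size=8, x0=10, y0=2, w=4, h=3)
```

Three rows give three remote reads of 32 bytes each; a 1000-row block would give a thousand round trips.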
>>>
>>>For more complex operations (subsampling, get a piece of a
>>>jpg file), the number of remote operations grows very fast.
>>>Latency then strongly discourages that type of remote IO.
>>>
>>>For that reason, I think that the remote file IO as
>>>currently specified by SAGA's Strawman will only be usable
>>>for a limited and trivial set of remote I/O use cases.
>>>
>>>There are three (basic) approaches:
>>>
>>>  A) get the whole thing, and do ops locally
>>>     Pro: - one remote op,
>>>          - simple logic
>>>          - remote side doesn't need to know about file
>>>            structure
>>>          - easily implementable on application level
>>>     Con: - getting the header info of a 1GB data file comes
>>>            with, well, some overhead ;-)
>>>
>>>  B) clustering of calls: do many reads, but send them as a
>>>     single request.
>>>     Pro: - transparent to application
>>>          - efficient
>>>     Con: - need to know about dependencies of reads
>>>            (a header read needed to determine size of
>>>            field), or include explicit 'flushes'
>>>          - need a protocol to support that
>>>          - the remote side needs to support that
>>>
>>>  C) data specific remote ops: send a high level command,
>>>     and get exactly what you want.
>>>     Pro: - most efficient
>>>     Con: - need a protocol to support that
>>>          - the remote side needs to support that _specific_
>>>            command
>>>
>>>The last approach (C) is what I have had the best experiences with.
>>>Also, that is what GridFTP as a common file access protocol
>>>supports via ERET/ESTO operations.
>>>
>>>I want to propose to include a C-like extension to the File
>>>API of the strawman, which basically maps well to GridFTP,
>>>but should also map to other implementations of C.
>>>
>>>That extension would look like:
>>>
>>>      void lsEModes   (out array<string,1> emodes   );
>>>      void eWrite      (in  string          emode,
>>>                        in  string          spec,
>>>                        in  string          buffer,
>>>                        out long            len_out  );
>>>      void eRead       (in  string          emode,
>>>                        in  string          spec,
>>>                        out string          buffer,
>>>                        out long            len_out  );
>>>
>>>      - hooks for gridftp-like opaque ERET/ESTO features
>>>      - spec:  string for pattern as in GridFTP's ESTO/ERET
>>>      - emode: string for ident.  as in GridFTP's ESTO/ERET
>>>
>>>EMode:        a specific remote I/O command supported
>>>lsEModes:     list the EModes available in this implementation
>>>eRead/eWrite: read/write data according to the emode spec
>>>
>>>Example (in perl for brevity):
>>>
>>>  my $file   = SAGA::File->new
>>>    ("http://www.google.com/intl/en/images/logo.gif");
>>>  my @emodes = $file->lsEModes ();
>>>
>>>  if ( grep (/^jpeg_block$/, @emodes) )
>>>  {
>>>    my ($buff, $len) = $file->eRead ("jpeg_block", "22x4+7+8");
>>>  }
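The same query-then-read pattern can be sketched in Python against an in-memory stand-in. Everything here is hypothetical: the ExtendedFile class, the "byte_range" emode and its "offset+length" spec format are invented for illustration and are neither GridFTP modes nor part of the strawman. The point is only that the client discovers the available emodes first and passes an opaque spec string to the one it picks:

```python
from typing import Callable, Dict, List


class ExtendedFile:
    """Hypothetical in-memory file offering lsEModes/eRead-style calls."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        # Map each emode name to the handler that interprets its spec.
        self._emodes: Dict[str, Callable[[str], bytes]] = {
            "byte_range": self._byte_range,
        }

    def _byte_range(self, spec: str) -> bytes:
        # Made-up spec format: "<offset>+<length>".
        offset, length = (int(part) for part in spec.split("+"))
        return self._data[offset:offset + length]

    def ls_emodes(self) -> List[str]:
        return sorted(self._emodes)

    def e_read(self, emode: str, spec: str) -> bytes:
        if emode not in self._emodes:
            raise ValueError(f"unsupported emode: {emode}")
        return self._emodes[emode](spec)


remote = ExtendedFile(b"0123456789")
chunk = b""
if "byte_range" in remote.ls_emodes():
    chunk = remote.e_read("byte_range", "2+5")
```

A real implementation would forward the emode and spec to the server rather than interpret them on the client.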
>>>
>>>I would discourage support for B, since I do not know of any
>>>protocol supporting that approach efficiently, and also it
>>>needs approximately the same infrastructure setup as C.
>>>
>>>As A is easily implementable on application level, or within
>>>any SAGA implementation, there is no need for support on API
>>>level -- however, A is insufficient for all but some trivial
>>>cases.
>>>
>>>Comments welcome :-))
>>>
>>>Cheers, Andre.
>>>
>>>
>>>--
>>>+-----------------------------------------------------------------+
>>>
>>>| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
>>>| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
>>>| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
>>>| De Boelelaan 1083a                | www:  http://www.merzky.net |
>>>| 1081 HV Amsterdam, Netherlands    |                             |
>>>
>>>+-----------------------------------------------------------------+