[ogsa-d-wg] Draft Charter for Data Movement Interface Standardization WG

Ravi Madduri madduri at mcs.anl.gov
Tue Sep 20 17:25:35 CDT 2005


Bill,
It is 2:00 pm on Monday, right?

On Sep 20, 2005, at 3:21 PM, William E. Allcock wrote:

> The BOF has been approved:
>
> 2:00 pm - 3:30 pm
>
> Data Movement Interface Standardization
> (Data Movement Interface Standardization)  Charter-Discussion BOF
>
> Many services have the need to move data (as opposed to messages
> invoking the service). The characteristics / semantics required can vary
> greatly. There are several existing interfaces (RFT, FTS, ...), all file
> based, that are similar but incompatible. A working group spawned by
> this BOF would work towards a standardized set of WSDL that could invoke
> a service that met the requirements of existing services as well as
> non-file-based sources.
>
> Location: Imperial Ballroom
>
> See you there!
>
> Bill
>
> Malcolm Atkinson wrote:
>
>> Dave, this is a good way of posing the question.
>> One can also relate it to the proposal a couple of years ago by Peter
>> Kunszt for grid data handles as a generic concept for naming data.
>> I think the problem arises when the selected data is an arbitrary subset
>> of the stored data or a derivative of the stored data derived via any of
>> a wide range of languages: XQuery, XPath, SQL, LDAP, semi-structured
>> QLs, statistical languages, FFT, datacutter, ....
>> These all make sense.  It is just hard to understand how to compose
>> them.  It is hard to make general rules.  The derived data has to be
>> evaluated to some point where it can be moved - a RAM buffer - can the
>> movement be part of the standard and the derivation processes be
>> strictly corralled in some other specs?
>> Malcolm
>>  >-----Original Message-----
>>  >From: Dave Berry
>>  >Sent: 14 September 2005 21:00
>>  >To: Malcolm Atkinson; William E. Allcock; ogsa-d-wg at ggf.org;
>>  >gsm-wg at ggf.org; byte-io-wg at ggf.org; Peter Kunszt; James Casey;
>>  >Ravi Madduri
>>  >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>>  >Interface Standardization WG
>>  >
>>  >Hi Malcolm,
>>  >
>>  >I'd rather ask, what are the characteristics of a file that
>>  >make these file transfer mechanisms tractable?  Then we can
>>  >ask, to what extent can we generalise the mechanism?
>>  >
>>  >For example, if the key characteristics are that a file can be
>>  >named and supports random access, then we might generalise the
>>  >mechanism to include data in RAM (which would avoid unnecessary
>>  >copying to disk).  This case would be analogous to some
>>  >operating systems which allow entities in RAM to be addressed as
>>  >part of the file system.
>>  >
>>  >Conversely, if the mechanism can handle any named sequence of
>>  >bytes, then it could presumably handle streaming data as well.
>>  >Or if it requires other operations that are specific to the
>>  >location of bytes on a disk (or tape), then the WG will restrict
>>  >its attention to those cases.
>>  >
>>  >I would expect this group to place a strong requirement on the
>>  >OGSA WG to provide a naming system that can specify whatever
>>  >data sets this WG wants to move.
>>  >
>>  >Dave.
>>  >
>>  >
>>  >-----Original Message-----
>>  >From: owner-ogsa-d-wg at ggf.org
>>  >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of Malcolm Atkinson
>>  >Sent: 14 September 2005 17:20
>>  >To: William E. Allcock; ogsa-d-wg at ggf.org; gsm-wg at ggf.org;
>>  >byte-io-wg at ggf.org; Peter Kunszt; James Casey; Ravi Madduri
>>  >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>>  >Interface Standardization WG
>>  >
>>  >
>>  >Hi Bill
>>  >
>>  >I agree that such a standard interface is needed.
>>  >When you look at files I presume you consider files wherever
>>  >they are, secondary or tertiary storage at least.
>>  >
>>  >When you say any data, then there is the possibility of
>>  >trivial or large amounts of data between RAM, as well as data
>>  >from files and databases with an enormous set of possibilities
>>  >of the way it may be selected and identified.  Eventually it
>>  >gets close to Byte-IO, Streams
>>  >(BoF) and InfoD etc.
>>  >
>>  >So I'm agreeing with you that if you go beyond files then scope
>>  >control is difficult.
>>  >
>>  >Would it be better to do the standardisation of file movement
>>  >first and look at other forms of data movement later?
>>  >
>>  >Malcolm
>>  >  >
>>  > >-----Original Message-----
>>  > >From: owner-ogsa-d-wg at ggf.org
>>  > >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of William E. Allcock
>>  > >Sent: 14 September 2005 17:04
>>  > >To: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byte-io-wg at ggf.org;
>>  > >Peter Kunszt; James Casey; Ravi Madduri
>>  > >Subject: [ogsa-d-wg] Draft Charter for Data Movement
>>  > >Interface Standardization WG
>>  > >
>>  > >Sorry for the re-send, I typoed the byte-io mail list.
>>  > >
>>  > >All,
>>  > >
>>  > >Sorry for the SPAM, but I sent this to the "likely suspects" who might
>>  > >be interested.  I have a proposed BOF (waiting for AD approval) to
>>  > >discuss standardizing an interface for invoking data movement.  There
>>  > >are several of them out there already.  CERN has the File Transfer
>>  > >System (FTS), the gsm-wg has SRM copy, Globus has the Reliable File
>>  > >Transfer (RFT) service, etc.  I don't think there will be any argument
>>  > >that there is a need for such standardization; the hard part will be
>>  > >scoping the extent of what we will work on.  For instance, all the
>>  > >examples above are file based, but ideally, this interface would work
>>  > >for any data that can be addressed.
>>  > >
>>  > >I expect that the BOF will be centered around scoping the working
>>  > >group, but I think we should be (and approval of the BOF depends on)
>>  > >getting some initial discussion around the scope.  So... here it goes:
>>  > >
>>  > >I think the obvious thing is that it needs to be able to have the basic
>>  > >functionality presented by FTS, RFT, and SRM-copy; however, the devil is
>>  > >in the details, so I will break this up into "blocks of functionality":
>>  > >
>>  > >Let's start with naming.  What will this service accept as valid names
>>  > >for entities that it will move?  URLs? EPRs?  Will logical file names be
>>  > >accepted or should they be translated outside this service?
>>  > >
>>  > >Related to the naming is what type of data will this service move?
>>  > >Files? Video streams? The output of simulations? The output of database
>>  > >queries?  Can we make this a service that any service that wants to move
>>  > >data can simply invoke?  Note that I am differentiating data from
>>  > >messages.  You would not use this to send the result from a service that
>>  > >summed a bunch of numbers; that would simply be a SOAP
>>  > >response... IMHO :-).
>>  > >
>>  > >Can we make a generic module that would allow this functionality to be
>>  > >applied to any service that exposes the byte-io interface?  Does that
>>  > >affect the interface or is it just an implementation issue?
>>  > >
>>  > >Can we make this service transport mechanism agnostic?  Both application
>>  > >transport (GridFTP vs HTTP vs ...) as well as network transport (TCP vs
>>  > >UDP vs UDT vs ...).  My concern here is that I am not sure SOAP has the
>>  > >functionality we need.  To do this, I wonder if we need the equivalent
>>  > >of a union in C, so that the parameters specified are based on the
>>  > >transport(s) chosen.  For instance, if you use TCP you need to specify a
>>  > >buffer size, but not for UDP.  GridFTP specifies streams and data
>>  > >channel authentication, but HTTP does not.
>>  > >
>>  > >What about security / authorization?  This is a broad category and we
>>  > >should push as much as possible outside of scope via callouts and Policy
>>  > >Enforcement Points (PEPs), but what about delivery guarantees such as AT
>>  > >MOST ONCE, AT LEAST ONCE, EXACTLY ONCE, non-repudiation, etc.?  I know
>>  > >Dieter has a set of use cases that require some of this type of delivery
>>  > >guarantee functionality.
>>  > >
>>  > >A potentially contentious issue is whether or not these services will
>>  > >use WSRF and notifications to expose (push from the service) or methods
>>  > >to query the state (pull from the service).  Hopefully, we can find a
>>  > >way to make each optional.
>>  > >
>>  > >If we start making many optional parts to the interface, what is
>>  > >exposed as service metadata for brokering will become more
>>  > >important.  I would propose that we should make at a minimum a
>>  > >recommendation for what facts about the service should be exposed.
>>  > >
>>  > >All of the existing services accept "bulk" inputs, i.e., move these 100
>>  > >files.  This can be a problem when the requests become very large due to
>>  > >de-serialization.  Should we provide a "chunking" interface so that
>>  > >requests can be of unlimited size?
>>  > >
>>  > >Please feel free to make comments on the above and, more importantly,
>>  > >suggest other important issues we need to address.
>>  > >
>>  > >btw, once we have a mail list of our own we will quit spamming the other
>>  > >lists :-).
>>  > >
>>  > >Bill
>>  > >--
>>  > >William E. Allcock
>>  > >Argonne National Laboratory
>>  > >Bldg 221, Office C-115A
>>  > >9700 South Cass Ave
>>  > >Argonne, IL 60439-4844
>>  > >Office Phone:  +1-630-252-7573
>>  > >Office Fax:      +1-630-252-1997
>>  > >Cell Phone:      +1-630-854-2842
>>  > >
>>  > >
>>  > >
>>  > >
>>  >
>>  >
>>
>
> -- 
> William E. Allcock
> Argonne National Laboratory
> Bldg 221, Office C-115A
> 9700 South Cass Ave
> Argonne, IL 60439-4844
> Office Phone:  +1-630-252-7573
> Office Fax:      +1-630-252-1997
> Cell Phone:      +1-630-854-2842
>
>

--
Ravi K Madduri
The Globus Alliance | Argonne National Laboratory
http://www-unix.mcs.anl.gov/~madduri





