[ogsa-d-wg] Draft Charter for Data Movement Interface Standardization WG

William E. Allcock allcock at mcs.anl.gov
Tue Sep 20 15:21:04 CDT 2005


The BOF has been approved:

2:00 pm - 3:30 pm

Data Movement Interface Standardization
Charter-Discussion BOF

Many services need to move data (as opposed to the messages that
invoke the service). The characteristics and semantics required can vary
greatly. There are several existing interfaces (RFT, FTS, ...), all file
based, that are similar but incompatible. A working group spawned by
this BOF would work towards a standardized set of WSDL that could invoke
a service meeting the requirements of existing services as well as
non-file-based sources.

Location: Imperial Ballroom

See you there!

Bill

Malcolm Atkinson wrote:
> Dave, this is a good way of posing the question.
> One can also relate it to the proposal a couple of years ago by Peter
> Kunszt for grid data handles as a generic concept for naming data.
> 
> I think the problem arises when the selected data is an arbitrary subset
> of the stored data or a derivative of the stored data derived via any of
> a wide range of languages: XQuery, XPath, SQL, LDAP, semi-structured
> QLs, statistical languages, FFT, datacutter, ....
> 
> These all make sense.  It is just hard to understand how to compose
> them.  It is hard to make general rules.  The derived data has to be
> evaluated to some point where it can be moved, such as a RAM buffer.
> Can the movement be part of the standard and the derivation processes
> be strictly corralled in some other specs?
> 
> Malcolm
> 
> 
>  >-----Original Message-----
>  >From: Dave Berry 
>  >Sent: 14 September 2005 21:00
>  >To: Malcolm Atkinson; William E. Allcock; ogsa-d-wg at ggf.org; 
>  >gsm-wg at ggf.org; byte-io-wg at ggf.org; Peter Kunszt; James 
>  >Casey; Ravi Madduri
>  >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement 
>  >Interface Standardization WG
>  >
>  >Hi Malcolm,
>  >
>  >I'd rather ask, what are the characteristics of a file that 
>  >make these file transfer mechanisms tractable?  Then we can 
>  >ask, to what extent can we generalise the mechanism?  
>  >
>  >For example, if the key characteristics are that a file can 
>  >be named and supports random access, then we might generalise 
>  >the mechanism to include data in RAM (which would avoid 
>  >unnecessary copying to disk).  This case would be analogous 
>  >to some operating systems which allow entities in RAM to be 
>  >addressed as part of the file system.
>  >
>  >Conversely, if the mechanism can handle any named sequence of 
>  >bytes, then it could presumably handle streaming data as 
>  >well.  Or if it requires other operations that are specific 
>  >to the location of bytes on a disk (or tape), then the WG 
>  >will restrict its attention to those cases.
>  >
>  >I would expect this group to place a strong requirement on 
>  >the OGSA WG to provide a naming system that can specify 
>  >whatever data sets this WG wants to move.
>  >
>  >Dave.
>  >
>  >
>  >-----Original Message-----
>  >From: owner-ogsa-d-wg at ggf.org 
>  >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of Malcolm Atkinson
>  >Sent: 14 September 2005 17:20
>  >To: William E. Allcock; ogsa-d-wg at ggf.org; gsm-wg at ggf.org; 
>  >byte-io-wg at ggf.org; Peter Kunszt; James Casey; Ravi Madduri
>  >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement 
>  >Interface Standardization WG
>  >
>  >
>  >Hi Bill
>  >
>  >I agree that such a standard interface is needed.
>  >When you look at files I presume you consider files wherever 
>  >they are, secondary or tertiary storage at least.
>  >
>  >When you say any data, then there is the possibility of 
>  >trivial or large amounts of data between RAM, as well as data 
>  >from files and databases with an enormous set of 
>  >possibilities of the way it may be selected and identified.  
>  >Eventually it gets close to Byte-IO, Streams (BoF), InfoD, etc.
>  >
>  >So I'm agreeing with you that if you go beyond files then 
>  >scope control is difficult.
>  >
>  >Would it be better to do the standardisation of file movement 
>  >first and look at other forms of data movement later?
>  >
>  >Malcolm
>  > 
>  >
>  > >-----Original Message-----
>  > >From: owner-ogsa-d-wg at ggf.org 
>  > >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of William E. Allcock
>  > >Sent: 14 September 2005 17:04
>  > >To: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byte-io-wg at ggf.org; 
>  > >Peter Kunszt; James Casey; Ravi Madduri
>  > >Subject: [ogsa-d-wg] Draft Charter for Data Movement 
>  > >Interface Standardization WG
>  > >
>  > >Sorry for the re-send, I typoed the byte-io mail list.
>  > >
>  > >All,
>  > >
>  > >Sorry for the SPAM, but I sent this to the "likely suspects" who
>  > >might be interested.  I have a proposed BOF (waiting for AD
>  > >approval) to discuss standardizing an interface for invoking data
>  > >movement.  There are several of them out there already.  CERN has
>  > >the File Transfer System (FTS), the gsm-wg has SRM copy, Globus has
>  > >the Reliable File Transfer (RFT) service, etc.  I don't think there
>  > >will be any argument that there is a need for such standardization;
>  > >the hard part will be scoping the extent of what we will work on.
>  > >For instance, all the examples above are file based, but ideally,
>  > >this interface would work for any data that can be addressed.
>  > >
>  > >I expect that the BOF will be centered around scoping the working
>  > >group, but I think we should get (and approval of the BOF depends
>  > >on getting) some initial discussion around the scope.  So... here
>  > >it goes:
>  > >
>  > >I think the obvious thing is that it needs to have the basic
>  > >functionality presented by FTS, RFT, and SRM-copy; however, the
>  > >devil is in the details, so I will break this up into "blocks of
>  > >functionality":
>  > >
>  > >Let's start with naming.  What will this service accept as valid
>  > >names for entities that it will move?  URLs? EPRs?  Will logical
>  > >file names be accepted or should they be translated outside this
>  > >service?
>  > >
>  > >Related to the naming is what type of data this service will move.
>  > >Files? video streams? the output of simulations? the output of
>  > >database queries?  Can we make this a service that any service
>  > >that wants to move data can simply invoke?  Note that I am
>  > >differentiating data from messages.  You would not use this to
>  > >send the result from a service that summed a bunch of numbers;
>  > >that would simply be a SOAP response... IMHO :-).
>  > >
>  > >Can we make a generic module that would allow this functionality
>  > >to be applied to any service that exposes the byte-io interface?
>  > >Does that affect the interface or is it just an implementation
>  > >issue?
>  > >
>  > >Can we make this service transport mechanism agnostic?  both
>  > >application transport (GridFTP vs HTTP vs ...) as well as network
>  > >transport (TCP vs UDP vs UDT vs ...).  My concern here is that I
>  > >am not sure SOAP has the functionality we need.  To do this, I
>  > >wonder if we need the equivalent of a union in C, so that the
>  > >parameters specified are based on the transport(s) chosen.  For
>  > >instance, if you use TCP you need to specify a buffer size, but
>  > >not for UDP.  GridFTP specifies streams and data channel
>  > >authentication, but HTTP does not.
>  > >
>  > >What about security / authorization?  This is a broad category
>  > >and we should push as much as possible outside of scope via
>  > >callouts and Policy Enforcement Points (PEPs), but what about
>  > >delivery guarantees such as AT MOST ONCE, AT LEAST ONCE, EXACTLY
>  > >ONCE, non-repudiation, etc.?  I know Dieter has a set of use cases
>  > >that require some of this type of delivery guarantee
>  > >functionality.
>  > >
>  > >A potentially contentious issue is whether or not these services
>  > >will use WSRF and notifications to expose (push from the service)
>  > >or methods to query the state (pull from the service).  Hopefully,
>  > >we can find a way to make each optional.
>  > >
>  > >If we start making many parts of the interface optional, what is
>  > >exposed as service metadata for brokering will become more
>  > >important.  I would propose that we should make at a minimum a
>  > >recommendation for what facts about the service should be exposed.
>  > >
>  > >All of the existing services accept "bulk" inputs, i.e., move
>  > >these 100 files.  This can be a problem when the requests become
>  > >very large due to de-serialization.  Should we provide a
>  > >"chunking" interface so that requests can be of unlimited size?
>  > >
>  > >Please feel free to make comments on the above and more
>  > >importantly suggest other important issues we need to address.
>  > >
>  > >btw, once we have a mail list of our own we will quit spamming the
>  > >other lists :-).
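On the "chunking" question raised above, one possible client-side shape is to split a bulk request into bounded pieces, so that no single message has to be de-serialized whole. This is a minimal sketch with hypothetical names (Transfer, chunk_requests, chunk_size); it does not describe how RFT, FTS, or SRM copy behave.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical representation of one transfer: (source name, destination name).
Transfer = Tuple[str, str]


def chunk_requests(
    transfers: Iterable[Transfer], chunk_size: int
) -> Iterator[List[Transfer]]:
    """Yield successive lists of at most chunk_size transfers.

    The input may be any iterable, including a generator, so the full
    request never has to be held in memory at once.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[Transfer] = []
    for transfer in transfers:
        chunk.append(transfer)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Final, possibly shorter, chunk.
        yield chunk
```

Each yielded list would then be submitted as its own bounded request; how the service correlates the pieces back into one logical request is left open here.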
>  > >
>  > >Bill
>  > >-- 
>  > >William E. Allcock
>  > >Argonne National Laboratory
>  > >Bldg 221, Office C-115A
>  > >9700 South Cass Ave
>  > >Argonne, IL 60439-4844
>  > >Office Phone:  +1-630-252-7573
>  > >Office Fax:      +1-630-252-1997
>  > >Cell Phone:      +1-630-854-2842
>  > >
>  > >
>  > >
>  > >
>  >
>  >
> 
> 

-- 
William E. Allcock
Argonne National Laboratory
Bldg 221, Office C-115A
9700 South Cass Ave
Argonne, IL 60439-4844
Office Phone:  +1-630-252-7573
Office Fax:      +1-630-252-1997
Cell Phone:      +1-630-854-2842




