[byteio-wg] Re: Draft Charter for Data Movement Interface Standardization WG

neil p chue hong N.ChueHong at epcc.ed.ac.uk
Thu Sep 15 05:30:13 CDT 2005


 

-----Original Message-----
From: owner-ogsa-d-wg at ggf.org [mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of
Peter Kunszt
Sent: 14 September 2005 22:21
To: William E. Allcock
Cc: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byteio-wg at ggf.org; James Casey; Ravi
Madduri
Subject: [ogsa-d-wg] Re: Draft Charter for Data Movement Interface
Standardization WG

hi bill

i agree that standardization on interfacing to file transfer services is
important. i would like to put in another aspect in addition to the ones you
mention, which is the scheduling aspect: i.e. there will be 'local'
schedulers which move data on a single network link as opposed to
superschedulers which also take the network topology and network
optimizations into account. so this will relate to GRAAP and GHPN and
probably also the job description language groups (forgot the acronym). it
will actually be an extremely interesting showcase of how all these efforts
can be 'harmonized' into an easy and useful spec or not.

peter

On Wed, 2005-09-14 at 11:14 -0500, William E. Allcock wrote:
> Sorry for the re-send, I typoed the byte-io mail list, yet again.
> 
> All,
> 
> Sorry for the SPAM, but I sent this to the "likely suspects" who might 
> be interested.  I have a proposed BOF (waiting for AD approval) to 
> discuss standardizing an interface for invoking data movement.  There 
> are several of them out there already.  CERN has the File Transfer 
> System (FTS), the gsm-wg has SRM copy, Globus has the Reliable File 
> Transfer (RFT) service, etc.  I don't think there will be any 
> argument that there is a need for such standardization; the hard part 
> will be scoping the extent of what we will work on.  For instance, all 
> the examples above are file based, but ideally, this interface would 
> work for any data that can be addressed.
> 
> I expect that the BOF will be centered around scoping the working 
> group, but I think we should get some initial discussion around the 
> scope going now (approval of the BOF depends on it).  So... here it goes:
> 
> I think the obvious thing is that it needs to have the basic 
> functionality presented by FTS, RFT, and SRM-copy; however, the 
> devil is in the details, so I will break this up into "blocks of 
> functionality":
> 
> Let's start with naming.  What will this service accept as valid names 
> for entities that it will move?  URLs? EPRs? Will logical file names 
> be accepted or should they be translated outside this service?
> 
> Related to the naming is what type of data will this service move?
> Files? Video streams?  The output of simulations? The output of 
> database queries?  Can we make this a service that any service that 
> wants to move data can simply invoke?  Note that I am 
> differentiating data from messages.  You would not use this to send 
> the result from a service that summed a bunch of numbers; that would 
> simply be a SOAP response... IMHO :-).
> 
> Can we make a generic module that would allow this functionality to be 
> applied to any service that exposes the byte-io interface?  Does that 
> affect the interface or is it just an implementation issue?
> 
> Can we make this service transport mechanism agnostic, for both 
> application transport (GridFTP vs HTTP vs ...) and network 
> transport (TCP vs UDP vs UDT vs ...)?  My concern here is that I am 
> not sure SOAP has the functionality we need.  To do this, I wonder if 
> we need the equivalent of a union in C, so that the parameters 
> specified are based on the transport(s) chosen.  For instance, if you 
> use TCP you need to specify a buffer size, but not for UDP.  GridFTP 
> specifies streams and data channel authentication, but HTTP does not.
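
A minimal sketch of the "union in C" idea from the paragraph above, written in Python as a tagged union of per-transport parameter records. All class and field names here (TcpParams, GridFtpParams, parallel_streams, and so on) are hypothetical and chosen only for illustration; they are not taken from FTS, RFT, SRM copy, or any proposed interface.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TcpParams:
    buffer_size_bytes: int  # TCP requires a buffer size


@dataclass(frozen=True)
class UdpParams:
    pass  # no buffer size parameter for UDP in this sketch


@dataclass(frozen=True)
class GridFtpParams:
    parallel_streams: int    # hypothetical name for "streams"
    data_channel_auth: bool  # hypothetical name for data channel authentication


@dataclass(frozen=True)
class HttpParams:
    pass  # HTTP carries neither of the GridFTP parameters


# The "unions": exactly one member applies, selected by the chosen transport.
NetworkParams = Union[TcpParams, UdpParams]
ApplicationParams = Union[GridFtpParams, HttpParams]


@dataclass(frozen=True)
class TransferRequest:
    source: str
    destination: str
    application: ApplicationParams
    network: NetworkParams


def describe(request: TransferRequest) -> str:
    """Render only the parameters that the chosen transports define."""
    parts = []
    app = request.application
    if isinstance(app, GridFtpParams):
        parts.append(
            f"GridFTP streams={app.parallel_streams} dcau={app.data_channel_auth}"
        )
    elif isinstance(app, HttpParams):
        parts.append("HTTP")
    net = request.network
    if isinstance(net, TcpParams):
        parts.append(f"TCP buffer={net.buffer_size_bytes}")
    elif isinstance(net, UdpParams):
        parts.append("UDP")
    return ", ".join(parts)
```

Whether a SOAP/XML schema can express the same thing (for example through a choice between alternative elements) is exactly the open question raised above; the sketch only shows the shape of the data.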
> 
> What about security / authorization?  This is a broad category and we 
> should push as much as possible outside of scope via callouts and 
> Policy Enforcement Points (PEPs), but what about delivery guarantees 
> such as AT MOST ONCE, AT LEAST ONCE, EXACTLY ONCE, non-repudiation, 
> etc.?  I know Dieter has a set of use cases that require some of this 
> type of delivery guarantee functionality.
> 
> A potentially contentious issue is whether these services will use 
> WSRF and notifications to expose their state (push from the service) 
> or methods to query the state (pull from the service).  Hopefully, we 
> can find a way to make each optional.
> 
> If we start making many parts of the interface optional, what is 
> exposed as service metadata for brokering will become more 
> important.  I would propose that we should make, at a minimum, a 
> recommendation for what facts about the service should be exposed.
> 
> All of the existing services accept "bulk" inputs, i.e., move these 
> 100 files.  This can be a problem when the requests become very large 
> due to de-serialization.  Should we provide a "chunking" interface so 
> that requests can be of unlimited size?
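
One way to picture the "chunking" interface asked about above is a client that streams a large bulk request as a sequence of bounded batches, so that no single message has to be de-serialized in full. This is only a sketch under assumptions: the function name chunk_requests, the (source, destination) pair representation, and the chunk size are all hypothetical and not part of any existing service.

```python
from typing import Iterable, Iterator, List, Tuple

# A single transfer is modelled as a (source, destination) pair of names.
Transfer = Tuple[str, str]


def chunk_requests(
    transfers: Iterable[Transfer], chunk_size: int
) -> Iterator[List[Transfer]]:
    """Yield the transfers in batches of at most chunk_size entries.

    The input is consumed lazily, so the full request never has to be
    held in memory (or serialized) as one message.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[Transfer] = []
    for transfer in transfers:
        chunk.append(transfer)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Final, possibly shorter, batch.
        yield chunk
```

Each yielded batch would then be submitted as its own request; how the service ties the batches back into one logical transfer job is left open here, as it is in the question above.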
> 
> Please feel free to make comments on the above and more importantly 
> suggest other important issues we need to address.
> 
> btw, once we have a mail list of our own we will quit spamming the 
> other lists :-).
> 
> Bill
> --
> William E. Allcock
> Argonne National Laboratory
> Bldg 221, Office C-115A
> 9700 South Cass Ave
> Argonne, IL 60439-4844
> Office Phone:  +1-630-252-7573
> Office Fax:      +1-630-252-1997
> Cell Phone:      +1-630-854-2842
> 
> 
> 
--
------
CERN, 1211 Geneva 23, Switzerland
