[ogsa-d-wg] Draft Charter for Data Movement Interface Standardization WG
William E. Allcock
allcock at mcs.anl.gov
Tue Sep 20 15:21:04 CDT 2005
The BOF has been approved:
2:00 pm - 3:30 pm
Data Movement Interface Standardization
(Data Movement Interface Standardization) Charter-Discussion BOF
Many services need to move data (as opposed to the messages that
invoke the service). The characteristics / semantics required can vary
greatly. There are several existing interfaces (RFT, FTS, and others),
all file based, that are similar but incompatible. A working group
spawned by this BOF would work towards a standardized set of WSDL for
invoking a data movement service that meets the requirements of the
existing services as well as non-file-based sources.
Location: Imperial Ballroom
See you there!
Bill
Malcolm Atkinson wrote:
> Dave, this is a good way of posing the question.
> One can also relate it to the proposal a couple of years ago by Peter
> Kunszt for grid data handles as a generic concept for naming data.
>
> I think the problem arises when the selected data is an arbitrary subset
> of the stored data or a derivative of the stored data derived via any of
> a wide range of languages: XQuery, XPath, SQL, LDAP, semi-structured
> QLs, statistical languages, FFT, datacutter, ....
>
> These all make sense. It is just hard to understand how to compose
> them. It is hard to make general rules. The derived data has to be
> evaluated to some point where it can be moved, such as a RAM buffer.
> Can the movement be part of the standard and the derivation processes
> be strictly corralled in some other specs?
>
> Malcolm
>
>
> >-----Original Message-----
> >From: Dave Berry
> >Sent: 14 September 2005 21:00
> >To: Malcolm Atkinson; William E. Allcock; ogsa-d-wg at ggf.org;
> >gsm-wg at ggf.org; byte-io-wg at ggf.org; Peter Kunszt; James
> >Casey; Ravi Madduri
> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
> >Interface Standardization WG
> >
> >Hi Malcolm,
> >
> >I'd rather ask, what are the characteristics of a file that
> >make these file transfer mechanisms tractable? Then we can
> >ask, to what extent can we generalise the mechanism?
> >
> >For example, if the key characteristics are that a file can
> >be named and supports random access, then we might generalise
> >the mechanism to include data in RAM (which would avoid
> >unnecessary copying to disk). This case would be analogous
> >to some operating systems which allow entities in RAM to be
> >addressed as part of the file system.
> >
> >Conversely, if the mechanism can handle any named sequence of
> >bytes, then it could presumably handle streaming data as
> >well. Or if it requires other operations that are specific
> >to the location of bytes on a disk (or tape), then the WG
> >will restrict its attention to those cases.
> >
> >I would expect this group to place a strong requirement on
> >the OGSA WG to provide a naming system that can specify
> >whatever data sets this WG wants to move.
> >
> >Dave.
> >
> >
> >-----Original Message-----
> >From: owner-ogsa-d-wg at ggf.org
> >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of Malcolm Atkinson
> >Sent: 14 September 2005 17:20
> >To: William E. Allcock; ogsa-d-wg at ggf.org; gsm-wg at ggf.org;
> >byte-io-wg at ggf.org; Peter Kunszt; James Casey; Ravi Madduri
> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
> >Interface Standardization WG
> >
> >
> >Hi Bill
> >
> >I agree that such a standard interface is needed.
> >When you look at files I presume you consider files wherever
> >they are, secondary or tertiary storage at least.
> >
> >When you say any data, then there is the possibility of
> >trivial or large amounts of data between RAM, as well as data
> >from files and databases with an enormous set of
> >possibilities of the way it may be selected and identified.
> >Eventually it gets close to Byte-IO, Streams
> >(BoF) and InfoD etc.
> >
> >So I'm agreeing with you that if you go beyond files then
> >scope control is difficult.
> >
> >Would it be better to do the standardisation of file movement
> >first and look at other forms of data movement later?
> >
> >Malcolm
> >
> >
> > >-----Original Message-----
> > >From: owner-ogsa-d-wg at ggf.org
> > >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of William E. Allcock
> > >Sent: 14 September 2005 17:04
> > >To: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byte-io-wg at ggf.org;
> > >Peter Kunszt; James Casey; Ravi Madduri
> > >Subject: [ogsa-d-wg] Draft Charter for Data Movement
> > >Interface Standardization WG
> > >
> > >Sorry for the re-send, I typoed the byte-io mail list.
> > >
> > >All,
> > >
> > >Sorry for the SPAM, but I sent this to the "likely suspects" who
> > >might be interested. I have a proposed BOF (waiting for AD
> > >approval) to discuss standardizing an interface for invoking data
> > >movement. There are several of them out there already. CERN has
> > >the File Transfer System (FTS), the gsm-wg has SRM copy, Globus
> > >has the Reliable File Transfer (RFT) service, etc. I don't think
> > >there will be any argument that there is a need for such
> > >standardization; the hard part will be scoping the extent of what
> > >we will work on. For instance, all the examples above are file
> > >based, but ideally, this interface would work for any data that
> > >can be addressed.
> > >
> > >I expect that the BOF will be centered around scoping the working
> > >group, but I think we should get (and approval of the BOF depends
> > >on getting) some initial discussion around the scope. So... here
> > >it goes:
> > >
> > >I think the obvious thing is that it needs to have the basic
> > >functionality presented by FTS, RFT, and SRM-copy. However, the
> > >devil is in the details, so I will break this up into "blocks of
> > >functionality":
> > >
> > >Let's start with naming. What will this service accept as valid
> > >names for entities that it will move? URLs? EPRs? Will logical
> > >file names be accepted or should they be translated outside this
> > >service?
> > >
> > >Related to the naming is what type of data will this service move?
> > >Files? Video streams? The output of simulations? The output of
> > >database queries? Can we make this a service that any service
> > >that wants to move data can simply invoke? Note that I am
> > >differentiating data from messages. You would not use this to
> > >send the result from a service that summed a bunch of numbers;
> > >that would simply be a SOAP response... IMHO :-).
> > >
> > >Can we make a generic module that would allow this functionality
> > >to be applied to any service that exposes the byte-io interface?
> > >Does that affect the interface or is it just an implementation
> > >issue?
> > >
> > >Can we make this service transport mechanism agnostic, both for
> > >application transport (GridFTP vs HTTP vs ...) and for network
> > >transport (TCP vs UDP vs UDT vs ...)? My concern here is that I
> > >am not sure SOAP has the functionality we need. To do this, I
> > >wonder if we need the equivalent of a union in C, so that the
> > >parameters specified are based on the transport(s) chosen. For
> > >instance, if you use TCP you need to specify a buffer size, but
> > >not for UDP. GridFTP specifies streams and data channel
> > >authentication, but HTTP does not.
> > >
> > >What about security / authorization? This is a broad category and
> > >we should push as much as possible outside of scope via callouts
> > >and Policy Enforcement Points (PEPs), but what about delivery
> > >guarantees such as AT MOST ONCE, AT LEAST ONCE, EXACTLY ONCE,
> > >non-repudiation, etc.? I know Dieter has a set of use cases that
> > >require some of this type of delivery guarantee functionality.
> > >
> > >A potentially contentious issue is whether these services will use
> > >WSRF and notifications to expose state (push from the service) or
> > >methods to query the state (pull from the service). Hopefully, we
> > >can find a way to make each optional.
> > >
> > >If we start making many parts of the interface optional, what is
> > >exposed as service metadata for brokering will become more
> > >important. I would propose that we should make, at a minimum, a
> > >recommendation for what facts about the service should be exposed.
> > >
> > >All of the existing services accept "bulk" inputs, i.e., move
> > >these 100 files. This can be a problem, due to de-serialization,
> > >when the requests become very large. Should we provide a
> > >"chunking" interface so that requests can be of unlimited size?
> > >
> > >Please feel free to make comments on the above and, more
> > >importantly, suggest other important issues we need to address.
> > >
> > >btw, once we have a mail list of our own we will quit spamming the
> > >other lists :-).
> > >
> > >Bill
> > >--
> > >William E. Allcock
> > >Argonne National Laboratory
> > >Bldg 221, Office C-115A
> > >9700 South Cass Ave
> > >Argonne, IL 60439-4844
> > >Office Phone: +1-630-252-7573
> > >Office Fax: +1-630-252-1997
> > >Cell Phone: +1-630-854-2842