[ogsa-d-wg] Draft Charter for Data Movement Interface Standardization WG
Ravi Madduri
madduri at mcs.anl.gov
Tue Sep 20 17:25:35 CDT 2005
Bill,
It is 2:00 pm on Monday, right?
On Sep 20, 2005, at 3:21 PM, William E. Allcock wrote:
> The BOF has been approved:
>
> 2:00 pm - 3:30 pm
>
> Data Movement Interface Standardization
> (Data Movement Interface Standardization) Charter-Discussion BOF
>
> Many services have the need to move data (as opposed to messages
> invoking the service). The characteristics / semantics required can
> vary greatly. There are several existing interfaces (RFT, FTS, ...),
> all file based, that are similar, but incompatible. A working group
> spawned by this BOF would work towards a standardized set of WSDL
> that could invoke a service that met the requirements of existing
> services as well as non-file-based sources.
>
> Location: Imperial Ballroom
>
> See you there!
>
> Bill
>
> Malcolm Atkinson wrote:
>
>> Dave, this is a good way of posing the question.
>> One can also relate it to the proposal a couple of years ago by Peter
>> Kunszt for grid data handles as a generic concept for naming data.
>> I think the problem arises when the selected data is an arbitrary
>> subset of the stored data or a derivative of the stored data derived
>> via any of a wide range of languages: XQuery, XPath, SQL, LDAP,
>> semi-structured QLs, statistical languages, FFT, datacutter, ....
>> These all make sense. It is just hard to understand how to compose
>> them. It is hard to make general rules. The derived data has to be
>> evaluated to some point where it can be moved, such as a RAM buffer.
>> Can the movement be part of the standard and the derivation
>> processes be strictly corralled in some other specs?
>> Malcolm
>> >-----Original Message-----
>> >From: Dave Berry
>> >Sent: 14 September 2005 21:00
>> >To: Malcolm Atkinson; William E. Allcock; ogsa-d-wg at ggf.org;
>> >gsm-wg at ggf.org; byte-io-wg at ggf.org; Peter Kunszt; James Casey;
>> >Ravi Madduri
>> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>> >Interface Standardization WG
>> >
>> >Hi Malcolm,
>> >
>> >I'd rather ask, what are the characteristics of a file that
>> >make these file transfer mechanisms tractable? Then we can
>> >ask, to what extent can we generalise the mechanism?
>> >
>> >For example, if the key characteristics are that a file can be
>> >named and supports random access, then we might generalise the
>> >mechanism to include data in RAM (which would avoid unnecessary
>> >copying to disk). This case would be analogous to some
>> >operating systems which allow entities in RAM to be addressed as
>> >part of the file system.
>> >
>> >Conversely, if the mechanism can handle any named sequence of
>> >bytes, then it could presumably handle streaming data as well.
>> >Or if it requires other operations that are specific to the
>> >location of bytes on a disk (or tape), then the WG will restrict
>> >its attention to those cases.
>> >
>> >I would expect this group to place a strong requirement on the
>> >OGSA WG to provide a naming system that can specify whatever
>> >data sets this WG wants to move.
>> >
>> >Dave.
>> >
>> >
>> >-----Original Message-----
>> >From: owner-ogsa-d-wg at ggf.org
>> >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of Malcolm Atkinson
>> >Sent: 14 September 2005 17:20
>> >To: William E. Allcock; ogsa-d-wg at ggf.org; gsm-wg at ggf.org;
>> >byte-io-wg at ggf.org; Peter Kunszt; James Casey; Ravi Madduri
>> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>> >Interface Standardization WG
>> >
>> >
>> >Hi Bill
>> >
>> >I agree that such a standard interface is needed.
>> >When you look at files I presume you consider files wherever
>> >they are, secondary or tertiary storage at least.
>> >
>> >When you say any data, then there is the possibility of
>> >trivial or large amounts of data between RAM, as well as data
>> >from files and databases with an enormous set of possibilities
>> >of the way it may be selected and identified. Eventually it
>> >gets close to Byte-IO, Streams (BoF) and InfoD etc.
>> >
>> >So I'm agreeing with you that if you go beyond files then scope
>> >control is difficult.
>> >
>> >Would it be better to do the standardisation of file movement
>> >first and look at other forms of data movement later?
>> >
>> >Malcolm
>> > >
>> > >-----Original Message-----
>> > >From: owner-ogsa-d-wg at ggf.org
>> > >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of William E. Allcock
>> > >Sent: 14 September 2005 17:04
>> > >To: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byte-io-wg at ggf.org;
>> > >Peter Kunszt; James Casey; Ravi Madduri
>> > >Subject: [ogsa-d-wg] Draft Charter for Data Movement
>> > >Interface Standardization WG
>> > >
>> > >Sorry for the re-send, I typoed the byte-io mail list.
>> > >
>> > >All,
>> > >
>> > >Sorry for the SPAM, but I sent this to the "likely suspects"
>> > >who might be interested. I have a proposed BOF (waiting for
>> > >AD approval) to discuss standardizing an interface for
>> > >invoking data movement. There are several of them out there
>> > >already. CERN has the File Transfer System (FTS), the gsm-wg
>> > >has SRM copy, Globus has the Reliable File Transfer (RFT)
>> > >service, etc. I don't think there will be any argument
>> > >that there is a need for such standardization; the hard part
>> > >will be scoping the extent of what we will work on. For
>> > >instance, all the examples above are file based, but ideally,
>> > >this interface would work for any data that can be addressed.
>> > >
>> > >I expect that the BOF will be centered around scoping
>> > >the working group, but I think we should get (and approval of
>> > >the BOF depends on getting) some initial discussion around the
>> > >scope. So... here it goes:
>> > >
>> > >I think the obvious thing is that it needs to be able to have
>> > >the basic functionality presented by FTS, RFT, and SRM-copy;
>> > >however, the devil is in the details, so I will break this up
>> > >into "blocks of functionality":
>> > >
>> > >Let's start with naming. What will this service accept as
>> > >valid names for entities that it will move? URLs? EPRs? Will
>> > >logical file names be accepted or should they be translated
>> > >outside this service?
>> > >
>> > >Related to the naming: what type of data will this service
>> > >move? Files? Video streams? The output of simulations? The
>> > >output of database queries? Can we make this a service that any
>> > >service that wants to move data can simply invoke? Note that I
>> > >am differentiating data from messages. You would not use this to
>> > >send the result from a service that summed a bunch of numbers;
>> > >that would simply be a SOAP response... IMHO :-).
>> > >
>> > >Can we make a generic module that would allow this
>> > >functionality to be applied to any service that exposes the
>> > >byte-io interface? Does that affect the interface or is it
>> > >just an implementation issue?
>> > >
>> > >Can we make this service transport mechanism agnostic? Both
>> > >application transport (GridFTP vs HTTP vs ...) as well as
>> > >network transport (TCP vs UDP vs UDT vs ...). My concern here
>> > >is that I am not sure SOAP has the functionality we need. To do
>> > >this, I wonder if we need the equivalent of a union in C, so
>> > >that the parameters specified are based on the transport(s)
>> > >chosen. For instance, if you use TCP you need to specify a
>> > >buffer size, but not for UDP. GridFTP specifies streams and
>> > >data channel authentication, but HTTP does not.
>> > >
>> > >What about security / authorization? This is a broad category
>> > >and we should push as much as possible outside of scope via
>> > >callouts and Policy Enforcement Points (PEPs), but what about
>> > >delivery guarantees such as AT MOST ONCE, AT LEAST ONCE,
>> > >EXACTLY ONCE, non-repudiation, etc.? I know Dieter has a set of
>> > >use cases that require some of this type of delivery guarantee
>> > >functionality.
>> > >
>> > >A potentially contentious issue is whether or not these
>> > >services will use WSRF and notifications to expose state (push
>> > >from the service) or methods to query the state (pull from the
>> > >service). Hopefully, we can find a way to make each optional.
>> > >
>> > >If we start making many parts of the interface optional, what
>> > >is exposed as service metadata for brokering will become more
>> > >important. I would propose that we should make at a minimum a
>> > >recommendation for what facts about the service should be
>> > >exposed.
>> > >
>> > >All of the existing services accept "bulk" inputs, i.e., move
>> > >these 100 files. This can be a problem when the requests become
>> > >very large due to de-serialization. Should we provide a
>> > >"chunking" interface so that requests can be of unlimited size?
>> > >
>> > >Please feel free to make comments on the above and more
>> > >importantly suggest other important issues we need to address.
>> > >
>> > >btw, once we have a mail list of our own we will quit spamming
>> > >the other lists :-).
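
The "equivalent of a union in C" idea raised in the message above, where the parameters a request carries depend on the transport(s) chosen, could be sketched roughly as below. This is only an illustration under stated assumptions: every class, field, and function name here (TcpParams, buffer_size_bytes, parallel_streams, describe, and so on) is hypothetical and is not taken from RFT, FTS, SRM, or any existing WSDL.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TcpParams:
    """Network-transport options for TCP (hypothetical field name)."""
    buffer_size_bytes: int


@dataclass(frozen=True)
class UdpParams:
    """UDP takes no buffer size in this sketch, so it has no fields."""


@dataclass(frozen=True)
class GridFtpParams:
    """Application-transport options for GridFTP (hypothetical fields)."""
    parallel_streams: int
    data_channel_authentication: bool


@dataclass(frozen=True)
class HttpParams:
    """HTTP has neither streams nor data channel authentication here."""


# The "union": exactly one alternative is present per transport layer.
NetworkParams = Union[TcpParams, UdpParams]
ApplicationParams = Union[GridFtpParams, HttpParams]


@dataclass(frozen=True)
class TransferRequest:
    source: str
    destination: str
    application: ApplicationParams
    network: NetworkParams


def describe(request: TransferRequest) -> str:
    """Summarise only the parameters that apply to the chosen transports."""
    parts = []
    app = request.application
    if isinstance(app, GridFtpParams):
        parts.append(
            f"gridftp streams={app.parallel_streams} "
            f"dcau={app.data_channel_authentication}"
        )
    elif isinstance(app, HttpParams):
        parts.append("http")
    net = request.network
    if isinstance(net, TcpParams):
        parts.append(f"tcp buffer={net.buffer_size_bytes}")
    elif isinstance(net, UdpParams):
        parts.append("udp")
    return "; ".join(parts)
```

With this shape, a request that picks HTTP simply has no place to put a stream count, which is the property the union analogy is after. Whether SOAP/WSDL can express the same constraint is exactly the open question in the message.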
>> > >
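
The "chunking" question in the message above (bulk requests that become too large to de-serialize in one piece) might look roughly like the following on the client side. This is a minimal sketch, not a proposal for the actual interface: the name chunk_requests, the chunk_size parameter, and the (source, destination) pair representation are all assumptions made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# One transfer is a (source, destination) pair in this sketch.
Transfer = Tuple[str, str]


def chunk_requests(
    transfers: Iterable[Transfer], chunk_size: int
) -> Iterator[List[Transfer]]:
    """Yield the transfers in lists of at most chunk_size entries.

    The input is consumed lazily, so the full request never has to be
    held (or de-serialized) in one piece.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(transfers)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Each yielded list would be submitted as its own bounded-size message. How the service would tie the chunks back into one logical request is left open here, as it is in the message.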
>> > >Bill
>> > >--
>> > >William E. Allcock
>> > >Argonne National Laboratory
>> > >Bldg 221, Office C-115A
>> > >9700 South Cass Ave
>> > >Argonne, IL 60439-4844
>> > >Office Phone: +1-630-252-7573
>> > >Office Fax: +1-630-252-1997
>> > >Cell Phone: +1-630-854-2842
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>>
>
> --
> William E. Allcock
> Argonne National Laboratory
> Bldg 221, Office C-115A
> 9700 South Cass Ave
> Argonne, IL 60439-4844
> Office Phone: +1-630-252-7573
> Office Fax: +1-630-252-1997
> Cell Phone: +1-630-854-2842
>
>
--
Ravi K Madduri
The Globus Alliance | Argonne National Laboratory
http://www-unix.mcs.anl.gov/~madduri