[ogsa-d-wg] Draft Charter for Data Movement Interface Standardization WG
William E. Allcock
allcock at mcs.anl.gov
Tue Sep 20 17:31:47 CDT 2005
Sorry, yes, on Monday :-)
Ravi Madduri wrote:
> Bill,
> it is 2:00 pm on monday right ?
>
> On Sep 20, 2005, at 3:21 PM, William E. Allcock wrote:
>
>> The BOF has been approved:
>>
>> 2:00 pm - 3:30 pm
>>
>> Data Movement Interface Standardization
>> (Data Movement Interface Standardization) Charter-Discussion BOF
>>
>> Many services have the need to move data (as opposed to messages
>> invoking the service). The characteristics / semantics required can vary
>> greatly. There are several existing interfaces (RFT, FTS, ...), all file
>> based, that are similar, but incompatible. A working group spawned by
>> this BOF would work towards a standardized set of WSDL that could invoke
>> a service that met the requirements of existing services as well as
>> non-File based sources.
>>
>> Location: Imperial Ballroom
>>
>> See you there!
>>
>> Bill
>>
>> Malcolm Atkinson wrote:
>>
>>> Dave this is a good way of posing the question.
>>> One can also relate it to the proposal a couple of years ago by Peter
>>> Kunszt for grid data handles as a generic concept for naming data.
>>> I think the problem arises when the selected data is an arbitrary
>>> subset of the stored data or a derivative of the stored data derived
>>> via any of a wide range of languages: XQuery, XPath, SQL, LDAP,
>>> semi-structured QLs, statistical languages, FFT, datacutter, ....
>>> These all make sense. It is just hard to understand how to compose
>>> them. It is hard to make general rules. The derived data has to be
>>> evaluated to some point where it can be moved - a RAM buffer - can the
>>> movement be part of the standard and the derivation processes be
>>> strictly corralled in some other specs?
>>> Malcolm
>>> >-----Original Message-----
>>> >From: Dave Berry
>>> >Sent: 14 September 2005 21:00
>>> >To: Malcolm Atkinson; William E. Allcock; ogsa-d-wg at ggf.org;
>>> >gsm-wg at ggf.org; byte-io-wg at ggf.org; Peter Kunszt; James Casey;
>>> >Ravi Madduri
>>> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>>> >Interface Standardization WG
>>> >
>>> >Hi Malcolm,
>>> >
>>> >I'd rather ask, what are the characteristics of a file that
>>> >make these file transfer mechanisms tractable? Then we can ask,
>>> >to what extent can we generalise the mechanism?
>>> >
>>> >For example, if the key characteristics are that a file can be
>>> >named and supports random access, then we might generalise the
>>> >mechanism to include data in RAM (which would avoid unnecessary
>>> >copying to disk). This case would be analogous to some operating
>>> >systems which allow entities in RAM to be addressed as part of the
>>> >file system.
>>> >
>>> >Conversely, if the mechanism can handle any named sequence of
>>> >bytes, then it could presumably handle streaming data as well.
>>> >Or if it requires other operations that are specific to the
>>> >location of bytes on a disk (or tape), then the WG will restrict
>>> >its attention to those cases.
>>> >
>>> >I would expect this group to place a strong requirement on the
>>> >OGSA WG to provide a naming system that can specify whatever data
>>> >sets this WG wants to move.
>>> >
>>> >Dave.
>>> >
>>> >
>>> >-----Original Message-----
>>> >From: owner-ogsa-d-wg at ggf.org
>>> >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of Malcolm Atkinson
>>> >Sent: 14 September 2005 17:20
>>> >To: William E. Allcock; ogsa-d-wg at ggf.org; gsm-wg at ggf.org;
>>> >byte-io-wg at ggf.org; Peter Kunszt; James Casey; Ravi Madduri
>>> >Subject: RE: [ogsa-d-wg] Draft Charter for Data Movement
>>> >Interface Standardization WG
>>> >
>>> >
>>> >Hi Bill
>>> >
>>> >I agree that such a standard interface is needed.
>>> >When you look at files I presume you consider files wherever
>>> >they are, secondary or tertiary storage at least.
>>> >
>>> >When you say any data, then there is the possibility of trivial
>>> >or large amounts of data between RAM, as well as data from files
>>> >and databases with an enormous set of possibilities of the way it
>>> >may be selected and identified. Eventually it gets close to
>>> >Byte-IO, Streams (BoF) and InfoD etc.
>>> >
>>> >So I'm agreeing with you that if you go beyond files then scope
>>> >control is difficult.
>>> >
>>> >Would it be better to do the standardisation of file movement
>>> >first and look at other forms of data movement later?
>>> >
>>> >Malcolm
>>> > >
>>> > >-----Original Message-----
>>> > >From: owner-ogsa-d-wg at ggf.org
>>> > >[mailto:owner-ogsa-d-wg at ggf.org] On Behalf Of William E. Allcock
>>> > >Sent: 14 September 2005 17:04
>>> > >To: ogsa-d-wg at ggf.org; gsm-wg at ggf.org; byte-io-wg at ggf.org;
>>> > >Peter Kunszt; James Casey; Ravi Madduri
>>> > >Subject: [ogsa-d-wg] Draft Charter for Data Movement
>>> > >Interface Standardization WG
>>> > >
>>> > >Sorry for the re-send, I typoed the byte-io mail list.
>>> > >
>>> > >All,
>>> > >
>>> > >Sorry for the SPAM, but I sent this to the "likely suspects" who
>>> > >might be interested. I have a proposed BOF (waiting for AD
>>> > >approval) to discuss standardizing an interface for invoking data
>>> > >movement. There are several of them out there already. CERN has
>>> > >the File Transfer System (FTS), the gsm-wg has SRM copy, Globus
>>> > >has the Reliable File Transfer (RFT) service, etc. I don't think
>>> > >there will be any argument that there is a need for such
>>> > >standardization; the hard part will be scoping the extent of what
>>> > >we will work on. For instance, all the examples above are file
>>> > >based, but ideally, this interface would work for any data that
>>> > >can be addressed.
>>> > >
>>> > >I expect that the BOF will be centered around scoping the working
>>> > >group, but I think we should (and approval of the BOF depends on)
>>> > >getting some initial discussion around the scope. So... here it
>>> > >goes:
>>> > >
>>> > >I think the obvious thing is that it needs to be able to have the
>>> > >basic functionality presented by FTS, RFT, and SRM-copy, however
>>> > >the devil is in the details, so I will break this up into "blocks
>>> > >of functionality":
>>> > >
>>> > >Let's start with naming. What will this service accept as valid
>>> > >names for entities that it will move? URLs? EPRs? Will logical
>>> > >file names be accepted or should they be translated outside this
>>> > >service?
>>> > >
>>> > >Related to the naming is what type of data will this service
>>> > >move? Files? video streams? the output of simulations? the output
>>> > >of database queries? Can we make this a service that any service
>>> > >that wants to move data can simply invoke? Note that I am
>>> > >differentiating data from messages. You would not use this to
>>> > >send the result from a service that summed a bunch of numbers,
>>> > >that would simply be a SOAP response... IMHO :-).
>>> > >
>>> > >Can we make a generic module that would allow this functionality
>>> > >to be applied to any service that exposes the byte-io interface?
>>> > >Does that affect the interface or is it just an implementation
>>> > >issue?
>>> > >
>>> > >Can we make this service transport mechanism agnostic? both
>>> > >application transport (GridFTP vs HTTP vs ...) as well as network
>>> > >transport (TCP vs UDP vs UDT vs ...). My concern here is that I
>>> > >am not sure SOAP has the functionality we need. To do this, I
>>> > >wonder if we need the equivalent of a union in C, so that the
>>> > >parameters specified are based on the transport(s) chosen. For
>>> > >instance, if you use TCP you need to specify a buffer size, but
>>> > >not for UDP. GridFTP specifies streams and data channel
>>> > >authentication, but HTTP does not.
>>> > >
>>> > >What about security / authorization? This is a broad category and
>>> > >we should push as much as possible outside of scope via callouts
>>> > >and Policy Enforcement Points (PEPs), but what about delivery
>>> > >guarantees such as AT MOST ONCE, AT LEAST ONCE, EXACTLY ONCE,
>>> > >non-repudiation, etc.? I know Dieter has a set of use cases that
>>> > >require some of this type of delivery guarantee functionality.
>>> > >
>>> > >A potentially contentious issue is whether or not these services
>>> > >will use WSRF and notifications to expose (push from the service)
>>> > >or methods to query the state (pull from the service). Hopefully,
>>> > >we can find a way to make each optional.
>>> > >
>>> > >If we start making many optional parts to the interface, what is
>>> > >exposed as service metadata for brokering will become more
>>> > >important. I would propose that we should make at a minimum a
>>> > >recommendation for what facts about the service should be
>>> > >exposed.
>>> > >
>>> > >All of the existing services accept "bulk" inputs, i.e., move
>>> > >these 100 files. This can be a problem when the requests become
>>> > >very large due to de-serialization. Should we provide a
>>> > >"chunking" interface so that requests can be of unlimited size?
>>> > >
>>> > >Please feel free to make comments on the above and more
>>> > >importantly suggest other important issues we need to address.
>>> > >
>>> > >btw, once we have a mail list of our own we will quit spamming
>>> > >the other lists :-).
>>> > >
>>> > >Bill
>>> > >--
>>> > >William E. Allcock
>>> > >Argonne National Laboratory
>>> > >Bldg 221, Office C-115A
>>> > >9700 South Cass Ave
>>> > >Argonne, IL 60439-4844
>>> > >Office Phone: +1-630-252-7573
>>> > >Office Fax: +1-630-252-1997
>>> > >Cell Phone: +1-630-854-2842
>>> > >
>>> > >
>>> > >
>>> > >
>>> >
>>> >
>>>
>>
>> --
>> William E. Allcock
>> Argonne National Laboratory
>> Bldg 221, Office C-115A
>> 9700 South Cass Ave
>> Argonne, IL 60439-4844
>> Office Phone: +1-630-252-7573
>> Office Fax: +1-630-252-1997
>> Cell Phone: +1-630-854-2842
>>
>>
>
> --
> Ravi K Madduri
> The Globus Alliance | Argonne National Laboratory
> http://www-unix.mcs.anl.gov/~madduri
>
>
>
--
William E. Allcock
Argonne National Laboratory
Bldg 221, Office C-115A
9700 South Cass Ave
Argonne, IL 60439-4844
Office Phone: +1-630-252-7573
Office Fax: +1-630-252-1997
Cell Phone: +1-630-854-2842
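
The quoted charter draft wonders whether a transport-agnostic interface
needs "the equivalent of a union in C", so that the parameters supplied
depend on the transport(s) chosen. A minimal sketch of that idea follows,
written in Python as a tagged union of small record types. Every class
and field name here (TcpParams, GridFtpParams, TransferRequest, and so
on) is hypothetical and chosen only for illustration; none of it comes
from a proposed WSDL or from any existing service.

```python
from dataclasses import dataclass
from typing import Union


# Network-transport parameter sets (hypothetical names).
@dataclass(frozen=True)
class TcpParams:
    buffer_size_bytes: int  # the thread notes TCP needs a buffer size


@dataclass(frozen=True)
class UdpParams:
    pass  # no buffer size in this sketch, per the thread


# Application-transport parameter sets (hypothetical names).
@dataclass(frozen=True)
class GridFtpParams:
    parallel_streams: int
    data_channel_auth: bool


@dataclass(frozen=True)
class HttpParams:
    pass  # neither streams nor data channel authentication


# The "union": exactly one alternative is supplied per layer.
NetworkParams = Union[TcpParams, UdpParams]
ApplicationParams = Union[GridFtpParams, HttpParams]


@dataclass(frozen=True)
class TransferRequest:
    source: str
    destination: str
    application: ApplicationParams
    network: NetworkParams


def describe(req: TransferRequest) -> str:
    """Render a request, reading only the fields its variants define."""
    parts = [f"{req.source} -> {req.destination}"]
    app = req.application
    if isinstance(app, GridFtpParams):
        parts.append(
            f"GridFTP streams={app.parallel_streams} dcau={app.data_channel_auth}"
        )
    elif isinstance(app, HttpParams):
        parts.append("HTTP")
    net = req.network
    if isinstance(net, TcpParams):
        parts.append(f"TCP buffer={net.buffer_size_bytes}")
    elif isinstance(net, UdpParams):
        parts.append("UDP")
    return "; ".join(parts)
```

In a schema language the same shape would presumably be expressed as a
choice between alternative elements. The point of the sketch is only
that a caller supplies one parameter set per layer and the service
reads just the fields that set defines.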
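
The quoted charter draft also asks whether a "chunking" interface should
be provided so that bulk requests can be of unlimited size without one
huge de-serialization step. One way to picture this, as a sketch only,
is a client that splits a long list of (source, destination) pairs into
fixed-size pieces and submits each piece separately. The function name
chunk_requests and the chunk_size parameter are assumptions made for
this example, not part of FTS, RFT, or SRM copy.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

TransferPair = Tuple[str, str]  # (source, destination), hypothetical shape


def chunk_requests(
    pairs: Iterable[TransferPair], chunk_size: int
) -> Iterator[List[TransferPair]]:
    """Yield successive lists of at most chunk_size transfer pairs.

    The input may be any iterable, including a lazy one, so the full
    request never has to be held in memory at once. The size check runs
    when iteration starts, because this is a generator function.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(pairs)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

With the "move these 100 files" example from the thread and a chunk size
of 30, this yields three chunks of 30 pairs and a final chunk of 10.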