[jsdl-wg] Question on file stage-in (fwd)

Andrew Grimshaw grimshaw at cs.virginia.edu
Thu Jun 9 08:27:42 CDT 2005


Michel,

Your questions:
" I am not objecting in general, but I do wonder
- how is "different" defined? By creation date? By modification date? 
Size? MD5 hash?
- Who carries out the assessment and decides if source and target are 
different or the same?

I think a more prominent use case for a performance enhancement is 
partial file transfers (in your example, this would only transfer the 
changed bits of your data set).

But both partial file transfers and conditional file transfers can 
already be realised today using extensions to JSDL. Note that, referring 
to JSDL 1.0 draft 19 (http://tinyurl.com/cvfmk), both source and target 
elements in a JSDL document instance do not have to have a jsdl:URI 
child element. You can add as many child elements and attributes as you 
want."

"Different" may depend on semantics. We've used the last update time from
stat in the past ... just see if they are different. Of course, we kept
track in a separate place of the mod time of the cached copy. In general,
"difference" depends on the semantics of the "file".
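
To make the mod-time approach concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions, not how any existing
system does it: the function names and the JSON side record holding the
source mod time are hypothetical, and both files are assumed to be reachable
as local paths.

```python
import json
import os
import shutil


def load_record(record_path):
    """Return the {target: source_mtime_ns} map, or an empty map if absent."""
    if not os.path.exists(record_path):
        return {}
    with open(record_path) as f:
        return json.load(f)


def stage_in_if_changed(source, target, record_path):
    """Copy source to target only when the source's mod time differs from
    the one recorded at the last stage-in. Returns True if a copy was made.

    The record file (hypothetical) is the "separate place" that remembers
    the mod time of the source behind each cached copy.
    """
    record = load_record(record_path)
    source_mtime = os.stat(source).st_mtime_ns
    if os.path.exists(target) and record.get(target) == source_mtime:
        return False  # cached copy is treated as current, skip the transfer
    shutil.copyfile(source, target)
    record[target] = source_mtime
    with open(record_path, "w") as f:
        json.dump(record, f)
    return True
```

Calling stage_in_if_changed twice in a row on an unchanged source copies
only the first time. A size or hash comparison could be swapped in for the
mod-time test if the semantics of the "file" call for it.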

The problem with partial file transfers is that they may require extensive
book-keeping on the server side of writes, or diffs between different
versions. I think that in the long term we'll end up with different "types"
of file services that can, in fact, do differential synchronization. But
that is beyond where we are now.

I'm not sure how I would use the extensibility option you mention.

Thanks for the reply.

Andrew





-----Original Message-----
From: owner-jsdl-wg at ggf.org [mailto:owner-jsdl-wg at ggf.org] On Behalf Of
Michel Drescher
Sent: Wednesday, June 08, 2005 10:55 AM
To: Ali Anjomshoaa
Cc: JSDL WG
Subject: Re: [jsdl-wg] Question on file stage-in (fwd)

Hi Andrew,

first off, welcome to JSDL. :-)

> In the data staging elements there is a creation flag that indicates 
> whether
> to over-write or append.
> Often, one only wants to overwrite if the target and the source are
> different, for example if I am using a large data set that changes
> infrequently or several jobs on the same resource will use the "same" 
> input
> file. Is there a way to do conditional stage in?

Not at the moment. This may be different in the future.

> But we're talking about a major performance optimization.

I am not objecting in general, but I do wonder
- how is "different" defined? By creation date? By modification date? 
Size? MD5 hash?
- Who carries out the assessment and decides if source and target are 
different or the same?

I think a more prominent use case for a performance enhancement is 
partial file transfers (in your example, this would only transfer the 
changed bits of your data set).

But both partial file transfers and conditional file transfers can 
already be realised today using extensions to JSDL. Note that, referring 
to JSDL 1.0 draft 19 (http://tinyurl.com/cvfmk), both source and target 
elements in a JSDL document instance do not have to have a jsdl:URI 
child element. You can add as many child elements and attributes as you 
want.

Cheers,
Michel




