[SAGA-RG] missing(?) method reporting last modification time

Andre Merzky andre at merzky.net
Sat May 30 11:25:36 CDT 2009


John, Thilo, 

allow me to quote from the Holy Book of SAGA, Scripture 2.8
"Execution Semantics and Consistency Model":

  SAGA API calls on a single service or server can occur
  concurrently with (a) other tasks from the same SAGA
  application, (b) tasks from other SAGA applications, or
  also (c) calls from other, independently developed
  (non-SAGA) applications.  This means that the user of
  the SAGA API should not rely on any specific execution
  order of concurrent API calls.  However,
  implementations MUST guarantee that a synchronous
  method is indeed finished when the method returns, and
  that an asynchronous method is indeed finished when the
  task instance representing this method is in a final
  state.  Further control of execution order, if needed,
  has to be enforced via separate concurrency control
  mechanisms, preferably provided by the services
  themselves, or on application level.

  [... at most once ...]

  Beyond this, the SAGA API specification does not
  prescribe any consistency model for its operations, as we
  feel that this would be very hard to implement across
  different middleware platforms.  A SAGA implementation MAY
  specify some consistency model, which MUST be documented.
  A SAGA implementation SHOULD always allow for application
  level consistency enforcement, for example by use of
  application level locks and mutexes.
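
To make the completion guarantee concrete, here is a minimal sketch in
Python. It is not SAGA code: concurrent.futures stands in for the SAGA
task model, a local temporary file stands in for a remote entry, and
the helper names (write_file, mtime_ns, timestamps_around_write) are
made up for illustration. The point is only that the second time stamp
is read after the asynchronous write has reached a final state, so the
order is enforced by the application rather than assumed.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_file(path, data):
    """Stand-in for a remote write; here it is just a local file write."""
    with open(path, "w") as handle:
        handle.write(data)


def mtime_ns(path):
    """Last modification time in nanoseconds since the epoch (st_mtime_ns)."""
    return os.stat(path).st_mtime_ns


def timestamps_around_write(path, data):
    """Return (before, after) time stamps with the write ordered in between."""
    before = mtime_ns(path)
    with ThreadPoolExecutor(max_workers=1) as pool:
        task = pool.submit(write_file, path, data)  # asynchronous call
        task.result()  # block until the task has reached a final state
        assert task.done()
    after = mtime_ns(path)  # read only after the write has finished
    return before, after


# Hypothetical demo on a local temporary file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
before, after = timestamps_around_write(demo_path, "new contents")
with open(demo_path) as handle:
    contents = handle.read()
os.remove(demo_path)
```

On a local file system one would expect after >= before here. Whether
a remote backend gives the same result depends on the consistency
model that the implementation documents.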


Related to that is Scripture 2.6.4 "Concurrency Control":

  Although limited, SAGA defines a de-facto concurrent
  programming model, via the task model and the asynchronous
  notification mechanism. Sharing of object state among
  concurrent units (e.g. tasks) is intentional and necessary
  for addressing the needs of various use cases. Concurrent
  use of shared state, however, requires concurrency control
  to avoid unpredictable behavior. 

  (Un)fortunately, a large variety of concurrency control
  mechanisms exist, with different programming languages
  lending themselves to certain flavors, like object locks
  and monitors in Java, or POSIX mutexes in C-like
  languages. For some use cases of SAGA, enforced
  concurrency control mechanisms might be both unnecessary
  and counterproductive, leading to increased programming
  complexity and runtime overheads. 

  Because of these constraints, SAGA does not enforce
  concurrency control mechanisms on its implementations.
  Instead, it is the responsibility of the application
  programmer to ensure that her program will execute
  correctly in all possible orderings and interleavings of
  the concurrent units. The application programmer is free
  to use any concurrency control scheme (like locks,
  mutexes, or monitors) in addition to the SAGA API.
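
As an illustration of such application-level concurrency control, the
following is a minimal sketch in Python. It assumes plain threads as
the concurrent units and an in-memory counter as the shared state;
SharedCounter and run_workers are hypothetical names, not part of any
SAGA binding. The lock makes the read-modify-write step safe under any
interleaving of the workers.

```python
import threading


class SharedCounter:
    """Shared state updated by several concurrent units."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()  # application-level lock

    def increment(self):
        # The read-modify-write below must not interleave with other threads.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_workers(counter, workers=4, steps=1000):
    """Run `workers` threads that each call increment() `steps` times."""
    def work():
        for _ in range(steps):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


total = run_workers(SharedCounter())
```

With the lock in place, total is workers * steps regardless of how the
threads are scheduled. Without it, updates could be lost.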


Again related, Commandment 2.6.5 calls for thread safety
for all implementations which want to obtain the blessings
of the church of SAGA.

So, that may be enough from the big story book for today, I
think.

Best, Andre.


Quoting [Thilo Kielmann] (May 29 2009):
> 
> Well, the intended semantics was reporting the remote state.
> Of course, in the presence of asynchronous operations (or simply operations
> done by other, e.g. non-SAGA,  processes), all kinds of problems and race 
> conditions can occur. But that's just plain normal. 
> 
> I see nothing special with time stamps that we would not have with anything
> else we already have.
> 
> Thilo
> 
> On Fri, May 29, 2009 at 06:39:08AM -0700, John Shalf wrote:
> > Cc: saga-rg at ogf.org
> > From: John Shalf <JShalf at lbl.gov>
> > To: Thilo Kielmann <kielmann at cs.vu.nl>
> > Subject: Re: [SAGA-RG] missing(?) method reporting last modification time
> > 
> > 
> > It just makes problems with an undefined consistency model clearer  
> > because ordering of requests may result in incorrect timestamps.  The  
> > problem is more difficult when dealing with timestamps because that  
> > hits the metadata server, which is a different data path than data  
> > storage. Users will assume POSIX-like consistency, where after
> > 	a=check_timestamp(file foo)
> > 	write(file foo)
> > 	b=check_timestamp(file foo)
> > b will always be greater than a.  For remote connections, this is  
> > more difficult to guarantee unless you are explicit about the  
> > underlying consistency model.  Are you going to claim POSIX  
> > consistency?  If so, then it doesn't have anything to say about the  
> > async case.
> > 
> > I brought this up years ago, but reading the current spec I don't see  
> > that info. I'm not sure if it has been fully addressed.  It will be  
> > easier to end up with absurd or seemingly incorrect situations when  
> > relying on timestamp data.
> > 
> > -john
> > 
> > On May 29, 2009, at 6:29 AM, Thilo Kielmann wrote:
> > >John,
> > >
> > >what would be special about a timestamp, compared to everything else
> > >like file size, content, permissions... that we all have already?
> > >
> > >Thilo
> > >
> > >On Fri, May 29, 2009 at 06:21:40AM -0700, John Shalf wrote:
> > >>Cc: saga-rg at ogf.org
> > >>From: John Shalf <jshalf at lbl.gov>
> > >>To: Thilo Kielmann <kielmann at cs.vu.nl>
> > >>Subject: Re: [SAGA-RG] missing(?) method reporting last modification time
> > >>
> > >>For the async version of the SAGA interface, what consistency model
> > >>do you propose for the modification time information?  POSIX
> > >>semantics do not address this, which is precisely why POSIX is so
> > >>damned slow on
> > >>distributed/remote filesystems.  It seems we'd need to at least
> > >>propose an unambiguous consistency model for the time-stamps and how
> > >>this would interact with concurrent async read/write calls that might
> > >>be in progress.
> > >>
> > >>(just making the consistency model based on current state at the
> > >>remote side is fine, but from the client side, you might end up with
> > >>absurd situations when you do an async timestamp request concurrent
> > >>with an async file open for example.)
> > >>
> > >>-john
> > >>
> > >>On May 29, 2009, at 6:15 AM, Thilo Kielmann wrote:
> > >>>Folks,
> > >>>
> > >>>within our group we are currently delving into issues with accessing
> > >>>remote file systems. What strikes us is that such access is SLOW.
> > >>>As such, it would be very beneficial if one could find out when (and
> > >>>thus whether) a remote file or directory has been modified.
> > >>>
> > >>>While returning this piece of information sounds "trivial", it
> > >>>strikes us that the SAGA spec has no such call in the name space
> > >>>package (where files reside).
> > >>>
> > >>>In POSIX terms, this is the info returned by the stat system call
> > >>>(see: man 2 stat), in the st_mtime field.
> > >>>
> > >>>In Java, files have a method lastModified().
> > >>>
> > >>>Both report the time since 01/01/1970 (epoch): POSIX st_mtime in
> > >>>seconds, Java lastModified() in milliseconds.
> > >>>
> > >>>
> > >>>Of course, it looks like nobody has ever been thinking about such a
> > >>>use case, but here we are! Our feeling is that the last modification
> > >>>time is essential metadata about files, so such a call should
> > >>>certainly be there. Our current problem is certainly not the only use
> > >>>case for finding out how old/new a given file or directory is...
> > >>>
> > >>>Our favourite proposal is to add a method that returns the last
> > >>>modification time to both ns_entry and ns_directory, as this makes
> > >>>sense with physical as well as with logical (replicated) files.
> > >>>
> > >>>
> > >>>Any reactions/objections ???
> > >>>
> > >>>
> > >>>Thilo Kielmann
> > >>>-- 
> > >>>Thilo Kielmann
> > >>>http://www.cs.vu.nl/~kielmann/
> > >>>--
> > >>>saga-rg mailing list
> > >>>saga-rg at ogf.org
> > >>>http://www.ogf.org/mailman/listinfo/saga-rg
> > >>
> > >
> > >
> > >
> > >-- 
> > >Thilo Kielmann                                 
> > >http://www.cs.vu.nl/~kielmann/
-- 
Nothing is ever easy.

