[SAGA-RG] Final Call: advert extension

'Andre Merzky' andre at merzky.net
Tue Mar 9 14:24:53 CST 2010


Hi Hartmut, 

great feedback, thanks!

Some comments inlined below.

Quoting [Hartmut Kaiser] (Mar 09 2010):
> 
> > Hmm, not sure what went wrong - opens fine for me with Acrobat and
> > Preview, on MacOS.  I simply post it again - maybe the attachement
> > process mangled the doc, or whatever.  Please let me know if the
> > problem persists, and I look into it again.
> 
> Here are my comments:
> 
> Page 1
> - Copyright (update for 2010)
> 
> Page 4:
> - saga::advert_directory --> saga::advert::directory 
> - saga::logical_directory --> saga::replica::logical_directory
> 
> Page 6:
> - undefined reference [?]
> - the the --> the
> - ...second advert directory /this/? acts as a...
> 
> Page 7:
> - how to ensure atomicity of setting a workitem to be 'accepted'?
> - ...directories adverts... --> directory adverts
> 
> Page 8:
> - _SAGA_LOCK is not sufficient if set to 1, it needs to contain the
> information about what object holds the lock (id of corresponding object?)
> otherwise it's useless for the implementation of atomic access (or, how
> should atomic locking be implemented)

Good point - storing the objects uuid seems sensible and portable.


> - ...define their on... --> define their own
> 
> Page 9:
> - I'm feeling a bit uneasy with the requirement to throw IncorrectState for
> expired entries. If this is generally enforced I see no way to delete those.
> Furthermore, couldn't it be necessary to access those entries even after
> they expired (atleast as long as they still exist)? Debugging, archiving,
> etc. come to mind here.

Good point.   I guess the IncorrectState exception would only be
sensible for backends which guarantee that the adverts get purged
automatically.  In all other cases, I agree a backend should not 
deny access to an advert which is expired, in order to allow manual
garbage collection.

Does that make sense to you?


> Page 10:
> - advert_directory is missing a constructor allowing to reuse default
> session, it is not possible to inherit constructors (well, at least not in
> C++, but I doubt it works in other languages)

The SAGA core spec describes that the session parameter on all
c'tors is optional, even if that is not explicitely rendered in the
IDL.  How that is implemented (overloading, inheritance, ...) is up
to the language bindings.  So, this is no different than for the
packages in the core spec I think.



> Page 11:
> - advert is missing a constructor allowing to reuse default session

same here.


> - ...uses the same /ag/? semantics...

Uhm, not sure what you mean?


> - How can Truncate reset TTL counter if it is supposed to empty the
> attributes? Do you mean _SAGA_MODIFIED should be set to the current time and
> _SAGA_TTL shouldn't be touched?

Very good point!  I would expect that no advert ever looses its
_SAGA_* attributes.  So, the advert c'tor would automatically create
those state attributes, and truncate would either re-use them, or
delete the old and re-create new ones.

Does that make sense?  I agree this needs clarification.


> Page 12:
> - does Truncate on a directory purge everything below in the hierachy?

Uhm, good question.  I would not think though, as truncation is,
IMHO, focusing on the adverts associated with the entity.  Also, we
allow trauncate on the namespace::directory (or at least do not
forbid it), but do not define any action for that, and in particular
not a recursive remove.

> - ...it has two additional method... --> it has two additional methods
> - directorie's --> directory's
> - directories --> directory's
> - PostCond: ...of the context use... --> of the context used
>  
> Page 13:
> - 'if the Tuncate flag is given, the returned object MUST NOT have an
> associated object': what 'associated object' is meant?

The SAGA object which has been added with store_object().  I'll
clarify.


> - The way the spec describes the TTL timer implies that each entry must have
> a 'hidden' entry representing either the remaining TTL or the starting point
> in time when the TTL has been reset. Wouldn't it be benefitial to make that
> explicit, i.e. add another attribute defining the point in time when the
> entry expires?

We have a last_modification attribute (see page 8) - updating that
should allow to leave the TTL timer constant, and still extend the
expiry date I think.  So, expiry_date = last_modified + ttl.

Makes sense?


> Page 14:
> - what's the preferred method of getting a list of all expired items (I'm
> asking as find() shouldn't return those)?

Hmm, that sounds stupid on second though - again, that would only
make sense for backends which guarantee garbage collection.  All
other backends should return the expired entries just as usual I
guess...


> - ...manage manage... --> manage
> - ...set of user define attributes... --> set of user defined attributes
> 
> Page 17:
> - ...semantics as ... --> semantics as
> 
> Page 18:
> - how can I remove an associated object from an entry (de-associate the
> object)?

Duh.  We can't right now - only change it.   So, one would have to
truncate the advert, or to re-create it from scratch.  Both would
destroy the attributes.

How does 'delete_object()' sound to you?  ;-)


> Pages 19-22: 
> - note: I did not compile/run the code, but looks ok

:-)  This reminds me: I'll probably add some simplier examples to
the document.  Thilo pointed out that this one might be a wee bit
long...


> Page 25:
> - I'd suggest to change the way names for those multiple keys are formed:
> _SAGA_CONTEXT_<N>_URL --> _SAGA_CONTEXT_URL_<N>, as this makes it easier to
> generate those, allows simpler search patterns, and even allows to make
> _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0 equivalent by definition
> (which unifies the attribute naming scheme)

I agree about the generation, and simpler patterns - but what do you
mean with 'make _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0
equivalent'?  I think you mean that _SAGA_ANY_ATTRIBUTE_0 should not
be used, but rather _SAGA_ANY_ATTRIBUTE instead?

In any case: yes, that makes sense - will change.  Attribute names
will not align that nicely anymore though, which will hurt my eyes
for sure :-P


> Page 26: 
> - how is the _SAGA_OBJECT_TYPE attribute supposed to be stored? as string,
> as integer?

As Enum as defined in the core spec:

  "Values of Enum type attributes are expressed as strings, and have
  the literal value of the respective enums as defined in this
  document. For example, the initial task states would have the
  values ’New’, ’Running’, ’Done’, etc."
  

> - 'For all classes inheriting saga::object, the serialization SHOULD contain
> the two attributes, too.' What two attributes?

'type' and 'session' - that is confusing though, I agree, since
session can be comprised of multiple attributes.  Will fix.


> - I believe it might be useful to store th euuid of the original object as
> well, allowing to make persistent object relations independently of a single
> application instance

I had that in there originally, but removed it because, IMHO,
advert::store_object()/retrieve_object() should behave like
object::clone(), which happens to assign a new UUID.  Also, one
could call 'retrieve_object()' multiple times on the same advert,
and would end up with the same uuid for multiple objects -- this is
exactly why we required to change the uuid on object.clone() I
think, as that would put quite some burden on the implementation to
track those cases, with no appearent benefit to the end user.

I'd be happy to be convinced otherwise though - let me know what you
think, please.


> - How can a saga::object have associated context information stored? It
> should be sufficient to refer to the session, which in turn refers to the
> contexts.

Right, but that would require that the session is actually stored in
the advert service.  That could be either done by the backend
automatically, which creates a session entry along with the original
entry, or is done by the SAGA application explicitely.  

However, in both cases the advert service will get swamped with
session and context adverts.  Alowing to fold the session incl.
contexts into the advert itself avoids that - for the price of
bloated adverts.  Bulk ops or caching are a must in either case I
guess...


> - ...If no session information are serialized... --> If no session
> information is serialized
> 
> Page 27:
> - strings should be always uuencode'd, otherwise we loose interoperability

We don't require that for replica attribute strings either, nor in
fact for any attributes in the SAGA specs anywhere.  IMHO, we should
leave this as simple as it is, and fix it once people start
complaining.  uuencode in user space is also always available for
the fringe use cases.  Makes sense?


> - How vector attributes of a job description have to be serialized

Ouch, good one - no idea.  Possibly similar to the context list, so
via

  _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_NUM = 2
  _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_0   = "localhost"
  _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_1   = "remotehost"

?  That would be at least somewhat consistent, but again bloats the
attribute table...


> - the job service id should not be serialized along with the jobid to avoid
> ambiuities/incompatibilities between the two, parsing the jobid is not a big
> deal in the end...

What ambiguities would you expect?

Yes, parsing is not the big deal, but the job-id format is not a
hard requirement in the Core spec (SHOULD instead of MUST).  Also,
our own adaptors don't always follow that format specification IIRC
;-)


> Page 28:
> - are permissions serialized too?

I don't think so, as those are backend properties.  Like, file
permissions are stored on the backend - as soon as one restores the
file object, the adaptor can query the backend for actual state
information, including permissions.  Only client side state should
get serialized: the file pointer being a prominent example.


> Page 29:
> - logical\_file --> logical_file
> - logical\_directory --> logical_directory
> 
> Page 30:
> - what attributes have advert and advert_directory when serialized?

Doh.  That is not even specified :-)  

I would expect:
  session
  url
  openmode

The rest is again backend state, IMHO.

Thanks again, thats really great feedback!

Best, Andre.


-- 
Nothing is ever easy.


More information about the saga-rg mailing list