[SAGA-RG] Final Call: advert extension

'Andre Merzky' andre at merzky.net
Sun Apr 4 12:16:54 CDT 2010


Hi Hartmut, all, 

I went through your list, and the comments from OGF (see below).
Attached is the next draft iteration.  The major changes are:

  - time to live must be enforced by the backend (expired entries
    are treated as non-existing)
  - serialization scheme is not part of the document anymore, but is
    moved out into a separate document
  - a 'delete_object()| method has been added
  
A number of clarifications and minor changes have been made, too.  
Although there are some minor things still missing (more examples,
some missing bib entries), I would like to put the document in final
call on the list now: the remaining work should not change the API
in any form.  So, please all provide feedback within the next 2
weeks.


Best, Andre.


addressed comments from OGF28
------------------------------

  - ttl mandatory, need to be respected
    - done
  - backend should complain on invalid ttl settings
    - done, value is now also unsigned int
  - add the clone() reference/statement as discussed on list
    - done
  - rediscuss exclusion/limitation of session/context serialization
    - done
  - make refs to GFD.90 clearer
    - done
  - add interop section which explains what level of interop we expect
    (cross application/adaptor/backend/language/...)
    - done



addressed comments from Hartmut:
--------------------------------


Quoting [Hartmut Kaiser] (Mar 09 2010):
> 
> > Hmm, not sure what went wrong - opens fine for me with Acrobat and
> > Preview, on MacOS.  I simply post it again - maybe the attachement
> > process mangled the doc, or whatever.  Please let me know if the
> > problem persists, and I look into it again.
> 
> Here are my comments:
> 
> Page 1
> - Copyright (update for 2010)

done


> Page 4:
> - saga::advert_directory --> saga::advert::directory 
> - saga::logical_directory --> saga::replica::logical_directory

done


> Page 6:
> - undefined reference [?]
> - the the --> the
> - ...second advert directory /this/? acts as a...

done


> Page 7:
> - how to ensure atomicity of setting a workitem to be 'accepted'?

only one worker is operating on a item - they are assigned by the
server.

> - ...directories adverts... --> directory adverts

done


> Page 8:
> - _SAGA_LOCK is not sufficient if set to 1, it needs to contain the
> information about what object holds the lock (id of corresponding object?)
> otherwise it's useless for the implementation of atomic access (or, how
> should atomic locking be implemented)

| Good point - storing the objects uuid seems sensible and portable.

now left out of the spec though.


> - ...define their on... --> define their own
> 
> Page 9:
> - I'm feeling a bit uneasy with the requirement to throw IncorrectState for
> expired entries. If this is generally enforced I see no way to delete those.

No way to delete on user level - has to be done by backend.


> Furthermore, couldn't it be necessary to access those entries even after
> they expired (atleast as long as they still exist)? Debugging, archiving,
> etc. come to mind here.

| Good point.   I guess the IncorrectState exception would only be
| sensible for backends which guarantee that the adverts get purged
| automatically.  In all other cases, I agree a backend should not 
| deny access to an advert which is expired, in order to allow manual
| garbage collection.
| 
| Does that make sense to you?

That should not show up on API level.



> Page 10:
> - advert_directory is missing a constructor allowing to reuse default
> session, it is not possible to inherit constructors (well, at least not in
> C++, but I doubt it works in other languages)

| The SAGA core spec describes that the session parameter on all
| c'tors is optional, even if that is not explicitely rendered in the
| IDL.  How that is implemented (overloading, inheritance, ...) is up
| to the language bindings.  So, this is no different than for the
| packages in the core spec I think.



> Page 11:
> - advert is missing a constructor allowing to reuse default session

| same here.


> - ...uses the same /ag/? semantics...

| Uhm, not sure what you mean?

done


> - How can Truncate reset TTL counter if it is supposed to empty the
> attributes? Do you mean _SAGA_MODIFIED should be set to the current time and
> _SAGA_TTL shouldn't be touched?

| Very good point!  I would expect that no advert ever looses its
| _SAGA_* attributes.  So, the advert c'tor would automatically create
| those state attributes, and truncate would either re-use them, or
| delete the old and re-create new ones.
| 
| Does that make sense?  I agree this needs clarification.

_SAGA_* attribs are out of scope now...


> Page 12:
> - does Truncate on a directory purge everything below in the hierachy?

| Uhm, good question.  I would not think though, as truncation is,
| IMHO, focusing on the adverts associated with the entity.  Also, we
| allow trauncate on the namespace::directory (or at least do not
| forbid it), but do not define any action for that, and in particular
| not a recursive remove.

clarified


> - ...it has two additional method... --> it has two additional methods
> - directorie's --> directory's
> - directories --> directory's
> - PostCond: ...of the context use... --> of the context used

done


> Page 13:
> - 'if the Tuncate flag is given, the returned object MUST NOT have an
> associated object': what 'associated object' is meant?

| The SAGA object which has been added with store_object().  I'll
| clarify.


> - The way the spec describes the TTL timer implies that each entry must have
> a 'hidden' entry representing either the remaining TTL or the starting point
> in time when the TTL has been reset. Wouldn't it be benefitial to make that
> explicit, i.e. add another attribute defining the point in time when the
> entry expires?

| We have a last_modification attribute (see page 8) - updating that
| should allow to leave the TTL timer constant, and still extend the
| expiry date I think.  So, expiry_date = last_modified + ttl.
| 
| Makes sense?

clarified.


> Page 14:
> - what's the preferred method of getting a list of all expired items (I'm
> asking as find() shouldn't return those)?

| Hmm, that sounds stupid on second though - again, that would only
| make sense for backends which guarantee garbage collection.  All
| other backends should return the expired entries just as usual I
| guess...

backend problem now.


> - ...manage manage... --> manage
> - ...set of user define attributes... --> set of user defined attributes

done


> Page 17:
> - ...semantics as ... --> semantics as

?


> Page 18:
> - how can I remove an associated object from an entry (de-associate the
> object)?

| Duh.  We can't right now - only change it.   So, one would have to
| truncate the advert, or to re-create it from scratch.  Both would
| destroy the attributes.
| 
| How does 'delete_object()' sound to you?  ;-)

added


> Pages 19-22: 
> - note: I did not compile/run the code, but looks ok

| :-)  This reminds me: I'll probably add some simplier examples to
| the document.  Thilo pointed out that this one might be a wee bit
| long...

to be done...


> Page 25:
> - I'd suggest to change the way names for those multiple keys are formed:
> _SAGA_CONTEXT_<N>_URL --> _SAGA_CONTEXT_URL_<N>, as this makes it easier to
> generate those, allows simpler search patterns, and even allows to make
> _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0 equivalent by definition
> (which unifies the attribute naming scheme)

| I agree about the generation, and simpler patterns - but what do you
| mean with 'make _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0
| equivalent'?  I think you mean that _SAGA_ANY_ATTRIBUTE_0 should not
| be used, but rather _SAGA_ANY_ATTRIBUTE instead?
| 
| In any case: yes, that makes sense - will change.  Attribute names
| will not align that nicely anymore though, which will hurt my eyes
| for sure :-P

out of scope now.


> Page 26: 
> - how is the _SAGA_OBJECT_TYPE attribute supposed to be stored? as string,
> as integer?

| As Enum as defined in the core spec:
| 
|   "Values of Enum type attributes are expressed as strings, and have
|   the literal value of the respective enums as defined in this
|   document. For example, the initial task states would have the
|   values ’New’, ’Running’, ’Done’, etc."
  
out of scope now.


> - 'For all classes inheriting saga::object, the serialization SHOULD contain
> the two attributes, too.' What two attributes?

| 'type' and 'session' - that is confusing though, I agree, since
| session can be comprised of multiple attributes.  Will fix.

out of scope now.


> - I believe it might be useful to store th euuid of the original object as
> well, allowing to make persistent object relations independently of a single
> application instance

| I had that in there originally, but removed it because, IMHO,
| advert::store_object()/retrieve_object() should behave like
| object::clone(), which happens to assign a new UUID.  Also, one
| could call 'retrieve_object()' multiple times on the same advert,
| and would end up with the same uuid for multiple objects -- this is
| exactly why we required to change the uuid on object.clone() I
| think, as that would put quite some burden on the implementation to
| track those cases, with no appearent benefit to the end user.
| 
| I'd be happy to be convinced otherwise though - let me know what you
| think, please.

clone() reference added. 
otherwise out of scope now.


> - How can a saga::object have associated context information stored? It
> should be sufficient to refer to the session, which in turn refers to the
> contexts.

| Right, but that would require that the session is actually stored in
| the advert service.  That could be either done by the backend
| automatically, which creates a session entry along with the original
| entry, or is done by the SAGA application explicitely.  
| 
| However, in both cases the advert service will get swamped with
| session and context adverts.  Alowing to fold the session incl.
| contexts into the advert itself avoids that - for the price of
| bloated adverts.  Bulk ops or caching are a must in either case I
| guess...

out of scope now.


> - ...If no session information are serialized... --> If no session
> information is serialized
> 
> Page 27:
> - strings should be always uuencode'd, otherwise we loose interoperability

| We don't require that for replica attribute strings either, nor in
| fact for any attributes in the SAGA specs anywhere.  IMHO, we should
| leave this as simple as it is, and fix it once people start
| complaining.  uuencode in user space is also always available for
| the fringe use cases.  Makes sense?

out of scope now.


> - How vector attributes of a job description have to be serialized

| Ouch, good one - no idea.  Possibly similar to the context list, so
| via
| 
|   _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_NUM = 2
|   _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_0   = "localhost"
|   _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_1   = "remotehost"
| 
| ?  That would be at least somewhat consistent, but again bloats the
| attribute table...

out of scope now.


> - the job service id should not be serialized along with the jobid to avoid
> ambiuities/incompatibilities between the two, parsing the jobid is not a big
> deal in the end...

| What ambiguities would you expect?
| 
| Yes, parsing is not the big deal, but the job-id format is not a
| hard requirement in the Core spec (SHOULD instead of MUST).  Also,
| our own adaptors don't always follow that format specification IIRC
| ;-)

out of scope now.


> Page 28:
> - are permissions serialized too?

| I don't think so, as those are backend properties.  Like, file
| permissions are stored on the backend - as soon as one restores the
| file object, the adaptor can query the backend for actual state
| information, including permissions.  Only client side state should
| get serialized: the file pointer being a prominent example.

out of scope now.


> Page 29:
> - logical\_file --> logical_file
> - logical\_directory --> logical_directory

gone


> Page 30:
> - what attributes have advert and advert_directory when serialized?

| Doh.  That is not even specified :-)  
| 
| I would expect:
|   session
|   url
|   openmode
| 
| The rest is again backend state, IMHO.

out of scope now.


Thanks again, thats really great feedback!

Best, Andre.


-- 
Nothing is ever easy.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: saga_adverts.pdf
Type: application/pdf
Size: 153942 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/saga-rg/attachments/20100404/46b8948e/attachment-0001.pdf 


More information about the saga-rg mailing list