[gfs-wg] Proposed Final Draft for RNS v1.0

Fri Jun 10 19:11:54 CDT 2005

Hi Ted,

Thank you for the detailed review!  Please reference the latest revision 
when reviewing my response.

Latest revision: 
https://forge.gridforum.org/projects/gfs-wg/document/RNS-Proposed_Final_Draft-v1.2/en/3

> Sec 1.1.2.2: Since virtualized reference junctions are a key component
> in the three-level namespace architecture, it is rather dissatisfying
> that the description here is so abstract.  At the very least, could we
> make it clear that the target contains two parts: the logical name
> itself and a reference to a resolver for that name?  Maybe an example
> drawn from the earlier figure would be useful.  Ideally, it would be
> nice to factor out the resolver reference, since typically it would be
> shared by most virtualized reference junctions in a repository.

This section has been revised.

> Sec 1.1.2.4: I'm afraid I still have problems with the alias
> description.  It is misleading to say that an entry MUST NOT be deleted,
> but then in the following section explain how they can be deleted after
> all.  Better to explain that there are two alternatives up front and
> give the two permissible alternatives.

Good suggestion.  I did revise this in response to Osamu's note, but I 
have revised it again in response to your suggestion.

> The confusion related to identity of entries and aliases remains.  As
> far as I can tell this is the only explanation of what an alias does or
> how it works.  Later in the document there are only details of
> properties and types.  How does lookup of an alias work, specifically,
> how does it differ from looking up the target itself?  In other words,
> is an alias visible to the casual user (i.e. lookup as opposed to create
> or delete) and in what ways?

The doc states: "an alias junction is a junction that references another 
entry within the same service instance."  An alias is an entry, more 
specifically a junction entry, so lookup() behaves the same as it would 
with any other junction.  I have revised the text to better explain the 
behavior of the lookup operation for aliases.  Basically, you can request 
to have the target properties embedded in the response message of the 
lookup request of an alias.  I have revised the description of this in 
section 1.2.2.2.1 and provided an example message in section 1.3.1.3. 
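
To make the alias behavior concrete, here is a minimal sketch in Python.  It 
is not taken from the specification: the Entry and Namespace classes, the 
embed_target flag, and the response keys are hypothetical names chosen for 
illustration.  It only shows the idea that an alias is looked up like any 
other junction, and that the target's properties are embedded in the 
response only when the requester asks for them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    name: str
    properties: dict = field(default_factory=dict)
    # Path of the referenced entry when this entry is an alias junction.
    alias_target: Optional[str] = None


class Namespace:
    """Toy single-instance namespace keyed by absolute path (hypothetical)."""

    def __init__(self):
        self.entries = {}

    def add(self, path, entry):
        self.entries[path] = entry

    def lookup(self, path, embed_target=False):
        # An alias is an ordinary entry, so the basic lookup is unchanged.
        entry = self.entries[path]
        response = {"name": entry.name, "properties": dict(entry.properties)}
        if entry.alias_target is not None:
            response["target"] = entry.alias_target
            if embed_target:
                # Only on request: copy the target's properties into the reply.
                target = self.entries[entry.alias_target]
                response["target_properties"] = dict(target.properties)
        return response


ns = Namespace()
ns.add("/data/report", Entry("report", {"size": 42}))
ns.add("/home/link", Entry("link", {"type": "alias"}, alias_target="/data/report"))

plain = ns.lookup("/home/link")
embedded = ns.lookup("/home/link", embed_target=True)
```

In this sketch, plain carries only the alias's own properties and the target 
path, while embedded additionally carries the target's properties.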

> Sec 1.2.1: It may be burdensome for the server to implement a
> point-in-time result-set over multiple messages to deliver a large
> directory to a requester.  How can the server bound the resource demands
> this imposes?  What if the requester makes a request for a large
> directory and never responds with another request, but does use the
> lifetime management to keep the IteratorContext state active?  What if
> it very slowly reads the directory in tiny segments?  Can the directory
> destroy the iterator context if resource demands get too much and force
> the requester to start over?  Maybe this shouldn't be a "MUST"
> requirement and instead include a modification timestamp or version
> number with each returned listing segment?

The key here is that this is an implementation specific issue.  The 
specification does facilitate the necessary messaging to report a fault 
and offers resource lifetime management of the IteratorContext via WSRF. 
The reasoning for "MUST" is that, without mandating how this is implemented, 
it is the responsibility of this specification to describe the required 
semantics.  My understanding is that the majority opinion is to ensure that 
the result set of a list query is immutable for the lifetime of the 
IteratorContext, so that updates to the namespace between iterations 
do not affect the segmented list being returned.
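
As an illustration of the intended semantics only, here is a minimal Python 
sketch of a point-in-time result set.  The class below is a hypothetical 
stand-in, not the IteratorContext message defined in the specification, and 
copying the whole listing up front is just one possible implementation 
strategy; it is exactly the cost Ted is concerned about.

```python
class IteratorContext:
    """Hypothetical stand-in: a listing frozen when the context is created."""

    def __init__(self, directory, segment_size):
        # Copy the entry names once, so later namespace updates are not seen.
        self._snapshot = tuple(sorted(directory))
        self._segment_size = segment_size
        self._position = 0
        # Known as soon as the snapshot exists, before any segment is read.
        self.child_count = len(self._snapshot)

    def next_segment(self):
        """Return the next slice of the snapshot (empty list when exhausted)."""
        start = self._position
        self._position = min(start + self._segment_size, self.child_count)
        return list(self._snapshot[start:self._position])


directory = {"a", "b", "c"}
ctx = IteratorContext(directory, segment_size=2)

first = ctx.next_segment()
# The namespace changes between iterations...
directory.add("d")
directory.discard("a")
# ...but the remaining segments still come from the original snapshot.
second = ctx.next_segment()
```

Here first is ["a", "b"] and second is ["c"]; the entry added between the 
two requests never appears, and the removed entry was still reported.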

> Sec 1.2.2.1: Maybe the ChildCount should be optional
> (i.e. minOccurs="0") and the server SHOULD provide one.  This would
> facilitate situations where the server produces the directory listing as
> requested and because it didn't have resources to create and store the
> whole listing on the first list request.

This section is only describing the IteratorContext message, which can be 
considered analogous to the description of a class; hence the requirement 
for "childCount" as a member that must have a value.  The values of these 
properties are always populated by the service and made available to the 
client.  This message is not used as input to any operation.

> What is the path separator character?  What about character restrictions
> for entry names?

This is briefly addressed in the new section 1.5.1 and subsections.

> Sec 1.3.3.2: In rename, what is the rationale for providing the Path &
> Name approach since the Path only mechanism seems perfectly sufficient.
> In fact rename would seem to be a trivial subcase of move, described in
> the previous section.  Maybe the Path & Name naming mechanism should be
> eliminated all together.  The "Name" attribute is clearly necessary for
> the "list" operation, but for others it would seem redundant.  Its use
> is inconsistent anyway: why create but not delete?

This is a good argument.  The only rationale I can see is that in all 
cases where Name is not optional the absolute path is attainable.  However, 
in both the create and rename cases, they are describing something new. Java 
offers something similar in its File constructor and it makes creating new 
files/directories easy since the client does not have to parse paths.
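
The convenience argument can be sketched as follows.  This is Python rather 
than Java, and the function names, the set used as a namespace, and the "/" 
separator are all assumptions made for illustration; none of them come from 
the specification.  The point is only that with a Path & Name form the 
client never has to split or assemble a path string itself.

```python
import posixpath

SEPARATOR = "/"  # assumption; the separator is not stated in this message


def create_path_only(namespace, path):
    """Path-only style: the caller supplies the full path of the new entry."""
    namespace.add(path)
    return path


def create_path_and_name(namespace, parent_path, name):
    """Path & Name style: the caller supplies a parent path and a bare name."""
    if SEPARATOR in name:
        raise ValueError("name must be a single path component")
    return create_path_only(namespace, posixpath.join(parent_path, name))


def rename_path_and_name(namespace, path, new_name):
    """Rename in place: only the last path component changes."""
    if SEPARATOR in new_name:
        raise ValueError("new_name must be a single path component")
    namespace.remove(path)  # raises KeyError if the entry does not exist
    new_path = posixpath.join(posixpath.dirname(path), new_name)
    namespace.add(new_path)
    return new_path


namespace = set()
created = create_path_and_name(namespace, "/projects/gfs", "draft-v1")
renamed = rename_path_and_name(namespace, created, "draft-v2")
```

As Ted notes, the Path & Name forms reduce to the path-only form plus a 
join, so they add convenience rather than capability.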

> Sec 1.3.5.4: Perhaps when inserting a new property, the DataType should
> be mandatory.  Otherwise how can the property be given a value?  Is
> there any way defining a new property could be useful without the
> capability to giving it a value?  [Osamu mentioned this also.]

Done.

> Sec 1.4: Given that it sounds like you are requiring the very demanding
> perfect consistency for updates to a distributed namespace repository, I
> think you need to be very clear.  This quote "This means that operations
> like create, delete, and update MUST guarantee synchronized processing
> that prevents update contingencies based on concurrent execution."  is
> not at all clear.  What are "update contingencies"?  Are you really
> demanding (MUST) strong consistency?  What is the escape clause for RNS
> implementations that cannot provide this level of consistency?  Only
> non-conformance?

I agree MUST is quite demanding, and I originally did not include such 
language; however, after the issue was raised in one of our meetings this 
was suggested.  I changed the demand to SHOULD in both sentences.  If 
anyone has any objection please make it known.

The sentence seems perfectly understandable to me; what isn't clear?

> Sec 2.3.2.5: How does updateEndpointReference work?  You provide an EPR
> to update and the only new properties to specify is EPR.  Does an EPR
> have a name in addition to its value or is this like a replace
> operation?  Can one EPR be replaced with a list of EPRs?  Does an EPR
> have any other properties?  Perhaps a description?  There doesn't seem
> to be an insertEndpointReference operation, so I guess they are inserted
> as a side-effect of other operations.

It is like a replace function.  One EPR cannot be replaced by a list.  The 
only time an EPR makes sense is when it is referenced, and this is 
accomplished via other operations.

> You say "Replication of RNS ... is indispensable. The consistency model
> required by RNS needs to be investigated."  But in section 1.4 the
> specification appears to call for strong, single system image
> consistency.

Byproduct of progressive revisions; has since been removed.

> Appendix GFS Profile:
> 
> Checksum is a required attribute to support, but does that mean it is
> required of all entries?  I would imagine that some data sources would
> not have checksums.  Perhaps in this case the ChecksumType could be
> "none"?  Maybe it should say "if available" as you do for the Version
> property.

This table strictly describes all of the properties that must be 
"supported", meaning what properties must be defined and made available. 
The profile needs to be expanded to describe further the conditions of 
when each property is required and so forth.

> What does ReplicaCopy mean to the client?  Does it imply ReadOnly?  Does
> it control the validity of the Timestamp and Version properties
> (i.e. are these properties invalid if ReplicaCopy is false)?  Is there a
> way to find out the source from which the replica was obtained?

Again, property relationships need to be expanded in the GFS Namespace 
Profile.  Excellent point regarding the source of a replica copy; that 
needs to be added.

Best regards,
Manuel Pereira
IBM Almaden Research Center
1-408-927-1935  [T/L 457]
mpereira at us.ibm.com