[orep-wg] WS-ReplicaCatalog specification

Sat Aug 6 13:43:53 CDT 2005

Hi Peter,

Excellent points!

| very good! this was not what i understood from ann.
| and if you do speak of a WS-ReplicaCatalog interface,
| this implies standardization at least to me. otherwise
| it is simply a WSRF interface on top of RLS, so excuse
| my confusion.

Point taken. The title is confusing. It probably should not be titled
"WS-"ReplicaCatalog, but rather something more indicative of its early
outgrowth from RLS.

In the Globus context, we term things "ws" and "prews" to distinguish
the nature of our components. So that's where it came from. It made
things unclear though.

Still, we would like, through discussions in this forum, to explore the
possibility of generalizing an interface to catalogs.

| 
| i actually did read the document carefully ;-)

*heh* You know I have to give you a hard time. :)

| 
| to be constructive, what someone (probably this group)
| needs to do - upon which the rest of OGSA data is relying -
| is a WS-ReplicaCatalog interface as you correctly mention
| in the document. it needs to be done such that
| it fits seamlessly into the puzzle of the OGSA design,
| and in this doc i can see no mention or sign of that.
| 
| i actually don't agree that it is a spec that anyone could
| implement - it does not fit well on top of our existing catalogs
| for instance. let me delve a bit deeper technically:

I too was thinking about this a bit more. It seems like there are some
conceptual differences between various replica catalog instances. I
objected to your earlier comment about it, but then again the
'conceptual' differences may prevent others from implementing the spec.
Resolving these conceptual differences may be much more challenging than
resolving normal 'implementation' differences.

| 
| there is a schema defined in the document for how the objects
| defined are tied together. this also specifies the implementation
| to a large extent, which i believe is the wrong approach.
| i am referring to the diagram on page 8.
| 
| i believe that
|  - there should be no such schema given at all,
|    just an interface specification

OK. The idea for the schema is to give some sense of how to query the
catalog without defining a rigid set of 'lookup' or 'find' operations.
Perhaps it is too ambitious to generalize on this.
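
To illustrate what I mean by querying without a fixed set of 'lookup'
or 'find' operations, here is a rough sketch. It is not the schema from
the document; the function name `query` and the dict-of-attributes
layout are assumptions made up for this example only.

```python
def query(entries, **criteria):
    """Return the names whose attributes match every key=value criterion.

    `entries` is a hypothetical stand-in for the catalog:
    a dict mapping each name to a dict of its attributes.
    """
    return sorted(
        name
        for name, attrs in entries.items()
        if all(attrs.get(key) == value for key, value in criteria.items())
    )


# One generic operation covers what would otherwise be several
# dedicated lookup calls (find-by-site, find-by-type, ...).
entries = {
    "lfn:a": {"type": "file", "site": "anl"},
    "lfn:b": {"type": "file", "site": "cern"},
}
by_site = query(entries, site="cern")
```

The trade-off is the one you point out: callers need to know something
about how the entries are laid out before they can write a query.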

| 
|  - we need to address the specific issue
|    of data replication, not just file replication
|    and this has to be reflected well in the design
| 
|  - it is not a well-designed schema for the
|    purpose at hand since it is too generic
|    for a replica catalog functionality. it
|    is basically a mapping catalog - there are
|    'strings' which may have attributes and
|    which may relate one to another. i believe
|    we must be more specific for it to be useful.

This is where the conceptual differences lie. Our approach has been to
stick to abstract concepts: names, mappings, attributes. We hardly imply
any more semantics than that. Personally, I'd agree; like you say, it is
basically a mapping catalog.

This is tough. I'm not entirely sure how to merge these two concepts or
make them more compatible.
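
For concreteness, the abstract model I have in mind (names, mappings,
attributes) could be sketched roughly as below. This is a toy in-memory
illustration, not the RLS implementation or the interface in the
document; the class and method names are hypothetical.

```python
from collections import defaultdict


class MappingCatalog:
    """Toy catalog holding names, mappings between names, and attributes."""

    def __init__(self):
        # name -> set of names it maps to (e.g. logical -> replica locations)
        self._mappings = defaultdict(set)
        # name -> {attribute name: value}
        self._attributes = defaultdict(dict)

    def add_mapping(self, name, target):
        self._mappings[name].add(target)

    def targets(self, name):
        # .get() avoids creating an empty entry for unknown names
        return sorted(self._mappings.get(name, ()))

    def set_attribute(self, name, key, value):
        self._attributes[name][key] = value

    def attributes(self, name):
        return dict(self._attributes.get(name, {}))


catalog = MappingCatalog()
catalog.add_mapping("lfn:run42", "gsiftp://host1/data/run42")
catalog.add_mapping("lfn:run42", "gsiftp://host2/data/run42")
catalog.set_attribute("lfn:run42", "size", 1024)
```

Nothing here says that "lfn:run42" is a data entry or that its targets
are replicas; that meaning lives entirely in how the names are used,
which is the gap you are describing.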

| 
|  - the interface needs methods that address
|    clearly the creation of data entries, of
|    replicas thereof, and also a unique ID
|    as it is present in the RNS and GFS specs.
|    this notion is completely missing. stating that
|    a GUID may be just one of the strings in
|    the mapping catalog is missing the point
|    of why a unique ID was defined in the first place.

I'm not 100% clear why these can't be names and required attributes. Is
it so that the semantics can be understood by other automated systems?
It seems to me that if a user were to define a required attribute called
'creationDateTime' associated with a name, that would effectively be the
same thing.
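
A rough sketch of what I mean by a required attribute, assuming a plain
dict as the catalog. The names `REQUIRED_ATTRIBUTES` and `create_entry`
are invented for this example and do not come from the document.

```python
# Hypothetical set of attributes every entry must carry.
REQUIRED_ATTRIBUTES = {"creationDateTime"}


def create_entry(catalog, name, attributes):
    """Register `name` only if every required attribute is supplied."""
    missing = REQUIRED_ATTRIBUTES - set(attributes)
    if missing:
        raise ValueError(f"missing required attributes: {sorted(missing)}")
    if name in catalog:
        raise KeyError(f"name already registered: {name}")
    catalog[name] = dict(attributes)
    return catalog[name]


catalog = {}
create_entry(catalog, "lfn:run42", {"creationDateTime": "2005-08-06T13:43:53"})
```

The open question is whether other systems can rely on the attribute
being there under that exact name, which is where a standardized set
would differ from a locally configured one.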

I'm not sure I understand your comment about GUIDs. Maybe we're just
using different terms to refer to the same thing...? I'm thinking of a
logical 'name' as being unique. A 'GUID' is a specific type of unique ID
(or name). So the user and the system would respect that the 'name' is
to be unique. We're just not mandating what type of unique name/id it
must be. Maybe I'm confused.
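
To show how I am picturing it, here is a small sketch in which the
catalog only insists that names are unique, and a GUID is just one way
of producing such a name. The `NameRegistry` class is hypothetical; the
GUID is generated with Python's standard `uuid` module.

```python
import uuid


class NameRegistry:
    """Enforces uniqueness of names without mandating their format."""

    def __init__(self):
        self._names = set()

    def register(self, name=None):
        """Register a caller-chosen name, or mint a GUID if none is given."""
        if name is None:
            name = str(uuid.uuid4())
        if name in self._names:
            raise ValueError(f"name is not unique: {name}")
        self._names.add(name)
        return name


registry = NameRegistry()
chosen = registry.register("lfn:run42")   # user-supplied unique name
minted = registry.register()              # system-generated GUID
```

If the point of the unique ID in RNS and GFS is something beyond
uniqueness (immutability, say, or being opaque to the user), then this
sketch does not capture it, and that may be what I am missing.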

| 
|  - some of the attributes need to be predefined
|    for a replica catalog to be semantically useful.
|    like creation date, lifetime, etc. there are
|    system attributes and user-defined attributes,
|    and i'm not sure that user-defined ones really
|    belong into a replica catalog interface.

On 'predefined' attributes: I'd like to better understand why they need
to be predefined. I'm not saying they shouldn't be, but I don't
understand why.

On 'user-defined' attributes: We don't like them either, but it might be
difficult to get our users to live without them.

| 
| 
| so my question to you: do you intend to do a
| WS-ReplicaCatalog spec or do you just intend to put
| a WSRF-compatible interface on top of the RLS?

We would like to move from our starting position, 'a WSRF-compatible
interface on top of the RLS', to a 'WS-ReplicaCatalog spec'.

| 
| if it is the latter, i suggest you drop the mention
| of WS-ReplicaCatalog in the document.

Sure.

Excellent. Thanks for the thoughtful and thorough comments.

Cheers,

rob