[occi-wg] OCCI Model UML, XSD, XML, Code...

Tue May 26 04:55:58 CDT 2009

Inline as usual ☺

From: Sam Johnston [mailto:samj at samj.net]
Sent: 26 May 2009 09:40
To: Edmonds, AndrewX
Cc: occi-wg at ogf.org
Subject: Re: OCCI Model UML, XSD, XML, Code...

On Mon, May 25, 2009 at 7:53 PM, Edmonds, AndrewX <andrewx.edmonds at intel.com> wrote:
Thanks for your efforts - I find it easiest to look at this rendering<http://forge.ogf.org/sf/wiki/do/viewAttachment/projects.occi-wg/wiki/NounsVerbsAndAttributes/occi-instance.xml> to work out what it boils down to and I can't help but notice there are already some strong similarities with Atom (as well as some serious limitations courtesy of the "simplification"):

[AE: ] Compare atom to many other models and I’m sure there’s a reasonably easy mapping. The rendering may not be perfect, I’ll be the first to admit it ☺ If there are other limitations do please list them. I’d be keen to hear.
Creating our own protocol, from scratch, with our own ideas as to how things should be represented, categorised, associated etc. significantly raises the risk of failure as well as the barrier to entry - that's reason enough in itself not to do it, particularly when perfectly good alternatives already exist. My point is that your current proposal is very nearly Atom anyway so why not just use it? Your "simplification" complicates.

 [AE: ] the Atom issue: that’s why I asked you to take the most basic model and use it in an Atom context (inheritance or composition). Update: the example on the wiki is interesting!

 *   How do I tell the difference between resources without a unique identifier (Atom "id")

[AE: ] The current OCCI nouns/verbs/attributes do not specify this, hence a requirement. I’ve only taken what’s on the wiki and worked with that alone.
Surely the requirement for a unique identifier for each resource is so bleeding obvious that it's implicit... you can already add an "id" field.

[AE: ] stating that something is implicitly obvious is not enough for a specification

 *   How do I tell when the resource was updated (Atom "updated") and/or what version it is (ETag)

[AE: ] Same as above
This is an implementation issue - one of many that implementors will trip up on as they try to turn our spec into reality. Actually it's worse than that in that it may not appear until users try to deploy the resulting systems and find that they do not perform adequately. Caching information also falls squarely into the "obvious" category for me and this stuff is very hard to get right.

Using HTTP as the meta-model for individual resources and Atom for collections gives us immediate access to an entire Internet worth of implementations and optimisations - am I the only one here who sees significant benefit in being able to put reverse proxies in front of your API servers for example? I doubt I'm the only person here implementing systems at scale... this approach of moving away from a hard-wired RPC style API to a loosely coupled document based system paves the way for extremely scalable architectures - for example where nodes periodically PUT their state for clients to GET resulting in the command & control being little more than a repository.

[AE: ] Agreed and nor am I arguing against HTTP and do see its value. My response was in the context of the model in question.
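
To make the caching point concrete, here is a minimal sketch of the "nodes PUT their state, clients GET it" idea with ETag-style validation. Everything in it is an assumption for illustration: the StateRepository class, the paths and the choice of a truncated SHA-256 hash as the ETag are hypothetical, and a real deployment would sit behind an actual HTTP server or reverse proxy.

```python
import hashlib


class StateRepository:
    """In-memory stand-in for an HTTP resource store (hypothetical)."""

    def __init__(self):
        self._docs = {}  # path -> (etag, body)

    def put(self, path, body):
        # A content hash serves as the ETag of the stored representation.
        etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
        self._docs[path] = (etag, body)
        return etag

    def get(self, path, if_none_match=None):
        # Returns (status, etag, body), mirroring a conditional GET.
        if path not in self._docs:
            return 404, None, b""
        etag, body = self._docs[path]
        if if_none_match == etag:
            return 304, etag, b""  # the client's cached copy is still current
        return 200, etag, body


repo = StateRepository()
first_etag = repo.put("/compute/1", b"state=running")
```

A client that remembers first_etag and sends it back gets a 304 with an empty body until the node PUTs a different state, which is the behaviour a cache or reverse proxy relies on.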

 *   How do I find one or more resources by search/category/tag/etc.?

[AE: ] See occi:Tag
One global namespace for rudimentary tagging is not at all what I had in mind - again this is a problem that is long since solved and one that gets exponentially complicated with the number of items.

 *   How do I represent a single object?

[AE: ] See occi:Compute or indeed occi:Resource
So these things can exist independently of groups? I'd have known that already if it were a common format but here I have to resort to reading the XSD.

[AE: ] You have to read something to understand a specification – some will read things raw, others higher level formalisms. Personally I got more out of reading OVF’s XSD than I did reading a rendering of it or the IIOP spec doc than decoding CDR PDUs.

 *   How do I differentiate between tag vocabularies - "small" and "virtualmachine" are talking about two completely different concepts (Atom "category")

[AE: ] ☺ tag == folksonomy. I guess you are referring to ontology?
Taxonomy - a subtle but important difference.

[AE: ] Ok so sounds like another requirement for the OCCI registry, if you deem necessary a managed set of hierarchical terms. Do we want to manage something like this or should we let each provider manage their own set of terms?
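
As a rough illustration of the vocabulary question, the sketch below tags one Atom entry with two atom:category elements whose scheme attribute keeps "small" and "virtualmachine" in separate vocabularies. The scheme URIs are made up for the example; nothing here is an agreed OCCI registry.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical scheme URIs; a provider or an OCCI registry would define real ones.
SIZE_SCHEME = "http://example.com/occi/size"
KIND_SCHEME = "http://example.com/occi/kind"


def categorise(entry, scheme, term):
    """Attach an atom:category with the given scheme and term to the entry."""
    category = ET.SubElement(entry, f"{{{ATOM}}}category")
    category.set("scheme", scheme)
    category.set("term", term)
    return category


def terms_in(entry, scheme):
    """List the terms the entry carries within one vocabulary."""
    return [
        c.get("term")
        for c in entry.findall(f"{{{ATOM}}}category")
        if c.get("scheme") == scheme
    ]


entry = ET.Element(f"{{{ATOM}}}entry")
categorise(entry, SIZE_SCHEME, "small")
categorise(entry, KIND_SCHEME, "virtualmachine")
```

Whether the schemes are managed centrally or per provider is the open question above; the mechanism is the same either way.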

 *   How do I create generic links to supporting resources e.g. screenshots, consoles, etc. (Atom "link")

[AE: ] See my first answer. Also occi:Link – oops my bad this is there but should be xs:complexType not xs:attributeGroup. I’ll update this.
This is yet another reimplementation of an existing, well established concept that developers will have to learn.
[AE: ] Again, it would be great if you could show us how to reuse atom:linkType and extend it with the requirements listed at the nouns/verbs/attributes wiki page.

 *   How do I create associations with arbitrary resources e.g. users (Atom "link")

[AE: ] See first answer. Extend atom:linkType or simply import it. Arbitrary? I thought OCCI was focused on the specific domain of Infrastructure and we spent time on extracting core nouns.
Just because you don't care who owns or has access to a resource doesn't mean that I don't - rich ACLs are something that will need to be supportable (if not directly supported) and I'm sure there will be many other examples. Generic links are going to work better here, and we already have them for free.

 *   How do I carry critical information that we nonetheless consider out of scope (e.g. OVF)

[AE: ] can introduce xsd:anyType into the model easily
Ok so I've got an anyType and I want to know what it's for (relation) and how to read it (content type) - how will that be expressed? Yet again, all roads lead to Atom.
[AE: ] How does Atom do it? From what I can see atom:entry allows for atom:contentType, which is just a sequence of xs:any – problem you describe still remains with Atom.
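
One possible reading of the "what is it for, how do I read it" question: atom:link carries a rel attribute (the relation) and a type attribute (the media type), so a client can pick supporting resources without opening them. The sketch below parses a hand-written entry; the rel IRIs, hrefs and the application/xml type used for the descriptor are placeholders, not proposed values.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written sample entry; all IRIs and hrefs are hypothetical.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:00000000-0000-0000-0000-000000000001</id>
  <link rel="http://example.com/occi/rel/screenshot" type="image/png"
        href="http://example.com/compute/1/screen.png"/>
  <link rel="http://example.com/occi/rel/descriptor" type="application/xml"
        href="http://example.com/compute/1/descriptor.ovf"/>
</entry>"""


def links_by_rel(entry, rel):
    """Return (type, href) pairs for every atom:link with the given rel."""
    return [
        (link.get("type"), link.get("href"))
        for link in entry.iter(ATOM + "link")
        if link.get("rel") == rel
    ]


entry = ET.fromstring(ENTRY_XML)
screenshots = links_by_rel(entry, "http://example.com/occi/rel/screenshot")
```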

 *   How do we extend the spec? (e.g. rinse & repeat)

[AE: ] Extend the XSD.
As I said. Anyway I've thus far managed to avoid having to touch XSD so I guess we'll have to schedule an XSD vs RELAX NG schema war at some point too... in any case better schema than no schema.
[AE: ] Careful not to feed the trolls :p This is a potential flame indeed but if someone wants to propose something contra to XSD then code it up and offer it for consideration.

 *   How do vendors/users extend the spec? (e.g. namespaces?)

[AE: ] As above plus some notions of governance from OCCI, yet to be discussed.
The point is the problem is already solved [by Atom] and we're going to have to cover the same ground.
[AE: ] Problem of governance? If it’s that have you more details? Issue of XSD import and entity extension are dealt with.

 *   How do I import/export resource(s) (e.g. OVA)

[AE: ] Now we’re into application specific functionality/features… something of an orthogonal concern for modelling activities. Perhaps closest here is model transforms.
I'd say that this requirement is extremely pertinent for model discussions - it obviously needs to support this functionality somehow or it will be impossible to get anything in/out of an implementation (and therefore to achieve portability between them). Transforms are something else entirely.
[AE: ] Just to be clear, the requirement in general is the need to support multiple existing resource representations, OVA being one along with others such as OVF?

 *   etc.

[AE: ] The more requirements the better. In fact OCCI does not define the base units of attribute values that are part of the model. I don’t think these can be found within Atom. Perhaps there’s an XSD out there that we could import and reuse those base units (e.g. MB, MHz etc). Also it would be nice to be able to add arbitrary infrastructure related parameters to any entity in the model.
Your "nice" is my "necessary". Units are optional - at the very least we need to specify the smallest units (e.g. MB for memory) but after that making the protocol pretty [to humans] makes it harder to interpret [for computers] - and therefore less interoperable. Jury's still out on this one for me.

[AE: ] Let’s mark the need for arbitrary infrastructure parameters in the wiki. Tell any scientist that units are optional! ☺ Unfortunate things (NASA’s Mars Climate Orbiter) happen when there’s confusion on units.
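
A small sketch of how both positions could coexist: values always carry an explicit unit on the wire, and implementations normalise to the smallest agreed unit before comparing. The unit table, the "number space unit" syntax and the binary multiples (1 GB taken as 1024 MB) are all assumptions for the example, not anything OCCI has specified.

```python
# Hypothetical unit table: factors relative to the smallest agreed unit (MB).
# Binary multiples are assumed here (1 GB = 1024 MB).
_MEMORY_FACTORS_MB = {"MB": 1, "GB": 1024, "TB": 1024 * 1024}


def memory_in_mb(value):
    """Convert a string such as '2 GB' to an integer number of megabytes."""
    number, _, unit = value.strip().partition(" ")
    if unit not in _MEMORY_FACTORS_MB:
        # Refuse bare numbers rather than guess which unit was meant.
        raise ValueError(f"memory value needs an explicit unit: {value!r}")
    return int(number) * _MEMORY_FACTORS_MB[unit]


same_size = memory_in_mb("2 GB") == memory_in_mb("2048 MB")
```

Rejecting a bare "2048" is the design choice that matters: the computer-friendly form stays a plain integer, but only after the unit has been stated once.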

Following a (long) conversation over (numerous) beers with Alexis on Friday I have come to the conclusion that the meta-model is quite separate from the representation, which is to say that collections, organisation, association, etc. can be done generically while providing a simple "OCCI" representation of the various resources is a domain-specific task.

[AE: ] Meta-model constrains the model. The model defines how a model instance can be created. Representation is the model instance and can be rendered. So in my view they are related, albeit through 2 levels of indirection.
How does the proposed meta-model constrain the model?
[AE: ] It does so by defining the common relationships between entities. Granted however it is not as rigorous as what you would get from a model driven architecture approach but I’m not advocating that here.

For the former (meta-model) HTTP and/or Atom clearly make the most sense but for the latter we need to be careful not to end up reinventing CIM. Gary's list of network attributes gave me pause over the weekend and I think the best way to filter this is to consider what functionality is supplied rather than what functionality is demanded. That is to say, today public cloud providers do IPv4 so we provide attributes for IPv4. If multiple vendors do IPv6 we add IPv6. If nobody does IPX/SPX then too bad for those who need it. This stuff still needs to go somewhere though, which is why multiple representations are essential.

[AE: ] I agree on this (being careful not to end up reinventing CIM). In fact this is one of OVF’s strengths. It’s quite easy to include elements and attributes defined in the RASD schema in OVF, allowing people extending OVF to reach into a huge library of entity definitions. Regarding IP issues and hence specific technology dependencies, I remember you describing a solution where we do not concentrate on such details but rather assign compute resources (VM) to networks by tag. Whatever came of that?
Something better appeared, along with the realisation that if we don't give folks like SNIA a way to play then they'll blaze their own trail. In reality it's not much more complicated - networks just became first class citizens which allows for extensibility. Ditto for storage.

[AE: ] Fair point

If it wasn't already clear, I'm very anti the approach of modeling the problem and blatting out one or more representations, irrespective of the delimiters used. A lot of good work has already been done on HTTP and Atom meta-models and giving everyone else the forks by creating yet another domain-specific language is not something I plan to be in any way associated with... the result will almost certainly be an interop train wreck not unlike WS-*

[AE: ] The modelling approach through to XSD is just one that was followed by the Atom folks in one way or another to arrive at a domain-specific model for the publishing and manipulation of content. If you don’t model the problem – even on paper, napkin or beer mat how do you expect to fully understand it? /joke If you wish not to be domain-specific then might I introduce you to my little friend void* :-D
Atom is quite clearly not domain-specific (as evidenced by its rapid, widespread adoption in the cloud space for a myriad applications)... the concepts apply to any information organisation activity.

[AE: ] My point here was that we need to specify what goes inside the atom:entry element should Atom be used.

With Atom, imho, if you want to use it for anything other than publishing blog articles then you will have to create something more domain specific – I wouldn’t be attempting to cout my void* to atom:entry and hope someone understands the content within atom:entry.
We do need to have a simple "OCCI" format to represent functionality that has multiple implementations, agreed. That said, if we palm the metadata (id, updated, etc.) and metamodel (links) off to HTTP/Atom as suggested by Alexis last week then the resulting format ends up being a lot more like what Chris and Richard from ElasticHosts were pushing for. In fact it gets so simple that key-value pairs in $PREFERRED_FORMAT makes the most sense, and having multiple formats available for interop becomes feasible. This fits with the HTML forms model too, which is also highly advantageous - creating and updating resources is as simple as a HTML form submission with no other formatting required.
[AE: ] Sounds interesting – even more interesting is this very approach without the dependency on Atom should someone choose not to use it.
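
To show how little is needed once the metadata is left to HTTP, here is a sketch of a compute resource as flat key-value pairs, encoded the way an HTML form submission would be (application/x-www-form-urlencoded) and decoded again. The attribute names follow the thread (cores, speed, memory); treating every value as a string is an assumption of the example.

```python
from urllib.parse import parse_qsl, urlencode


def encode_resource(attributes):
    """Encode flat key-value pairs as an HTML form body."""
    return urlencode(attributes)


def decode_resource(body):
    """Decode a form body back into a dict of string values."""
    return dict(parse_qsl(body))


compute = {"cores": "2", "speed": "2400", "memory": "2048"}
body = encode_resource(compute)  # what a form POST or PUT would carry
round_tripped = decode_resource(body)
```

Nothing in this depends on Atom, which is the "without the dependency" case mentioned above.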

In summary I'd like to see us use Atom where absolutely necessary (e.g. collections) and raw HTTP with multiple representations everywhere else. One of those representations can be a restricted syntax based on features with multiple deployments (but users would also be free to send/receive OVF or even their own home grown guff if they want).

[AE: ] That’s perfectly fine but what will those representations look like? Who will define them? How will you verify them? If we’re not using even Atom, what will it be?
The only representation we care about is the simplest representation necessary to create resources - cores, speed, memory etc. for compute, size for storage and ip, netmask and gateway for network, etc. The others will use existing standards:
[AE: ] If we want to reuse existing standards, why not then just reuse CIM entity types for the above attributes (core, speed, memory etc)?

 *   OVF for portable workloads
 *   VMX et al for proprietary workloads
 *   PNG/JPG/etc. for screenshots
 *   VNC/RDP/SSH/etc. for consoles
 *   PDF/ODF for documentation
 *   Etc
[AE: ] Embedding OVF/VMX will then contain duplicate and redundant data. If that data differs which is authoritative?
Given this representation will then end up being key-value pairs we can easily specify (and implement) text, json and xml renderings for interop, while others who care about performance can use things like protocol buffers for talking to the nodes (XML parsing overhead can be non-negligible for automated workloads at scale even if not a problem for most tasks).
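
As a sketch of what "text, json and xml renderings" of the same key-value pairs might look like, the functions below render one flat attribute set three ways. The element name "compute", the "key=value" text layout and the absence of any namespace are illustrative assumptions, not a proposed wire format.

```python
import json
import xml.etree.ElementTree as ET


def render_text(attrs):
    """One key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in attrs.items())


def render_json(attrs):
    """A flat JSON object with keys in sorted order."""
    return json.dumps(attrs, sort_keys=True)


def render_xml(attrs, tag="compute"):
    """One child element per attribute under a single root element."""
    root = ET.Element(tag)
    for key, value in attrs.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


compute = {"cores": 2, "memory": 2048}
renderings = (render_text(compute), render_json(compute), render_xml(compute))
```

Because the model is flat, each rendering is a few lines; a binary format such as protocol buffers would be one more function of the same shape.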

Sam

On Mon, May 25, 2009 at 5:57 PM, Edmonds, AndrewX <andrewx.edmonds at intel.com> wrote:

I've spent some time on the proposed meta-model [1] and model [1] UML that is up on the wiki along with the inputs of others. Following some discussions with Roger and Andre, I decided to take the model in its UML form and create the corresponding XSD. In fact Roger might have an RDF version of the XSD sometime. This XSD can be found here [2]. Now I don't class myself as an XSD expert so any comments, updates etc are appreciated. From this XSD I was able to generate an API to create and manipulate an in-memory representation of a model instance. This example code can be found here [3]. It's Java and uses XMLBeans[4], but there are many other libraries out there that will allow the generation of an API in languages such as Python[5], Ruby[6], CSharp[7] (typically any language that has SOAP libraries will have an XSD class generator). After rendering the in-memory model the XML rendering is specific but adheres to the OCCI model schema and can be viewed here[8]. Again how things appear in the XML can be tweaked via iteration of the schema. For those JSON peeps out there, there are already XSLT transform sheets that will convert this XML to JSON using different conventions [9][10].

This model is abstract in a sense it does not import any dependencies, such as Atom. @Sam perhaps you could take what I've done and either by way of model inheritance or composition show how Atom would integrate with this model?

My view on Atom is that it is useful as a wrapper to renderings of OCCI model instances. If I wanted to publish an OCCI model via Atom then I'd insert a model instance rendering into an atom:entry element (note we could include JSON via CDATA but this is where JSON perhaps gets a little ugly). This would be OCCI using Atom by way of composition and would allow both Atom and OCCI to live harmoniously together or separately and not require Atom dependencies should that be the choice of a provider. What's important is that the OCCI model, be it via Atom or something else, could be understood by a receiving system. Using Atom with OCCI by way of inheritance makes the previous "with or without" approach a little more difficult as there's a hard-dependency on Atom.
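
The composition idea can be sketched as follows: an OCCI instance is built on its own, then dropped into the atom:content of an entry, and can be lifted out again unchanged. The OCCI namespace URI and element names are placeholders (the real ones would come from the XSD at [2]), and the entry is deliberately minimal; for instance it omits the author element a standalone Atom entry would need.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
OCCI = "http://example.com/occi"  # placeholder namespace, not the real schema's


def wrap_in_entry(instance, entry_id, title, updated):
    """Compose: place an OCCI instance inside atom:content of a new entry."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    content = ET.SubElement(entry, f"{{{ATOM}}}content", type="application/xml")
    content.append(instance)
    return entry


def unwrap(entry):
    """Recover the OCCI instance; it carries no Atom elements of its own."""
    return entry.find(f"{{{ATOM}}}content")[0]


compute = ET.Element(f"{{{OCCI}}}compute")
ET.SubElement(compute, f"{{{OCCI}}}cores").text = "2"
entry = wrap_in_entry(
    compute,
    "urn:uuid:00000000-0000-0000-0000-000000000001",
    "compute 1",
    "2009-05-25T17:57:00Z",
)
```

The instance inside the entry is the same document a provider could serve without Atom, which is the "with or without" property; an inheritance-style design would instead put OCCI attributes on Atom's own types.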

Atom reminds me somewhat of WS-Agreement, SOAP etc. It doesn't solve your domain-specific model problems (i.e. what you are trying to represent - OCCI==Infrastructure); it only provides a means to send this content about, and mechanisms supporting this, but these perhaps do not apply directly to the model contained as content. We still have to understand and represent our noun/verb/attribute triplets. Atom doesn't really help out bar the features it offers by way of sitting on top of HTTP. Now of course you can extend Atom base types (in fact AFAIK they're just W3C XSD xs:complexTypes) but the point is you have to extend them to _something_. That something is your domain specific model. There is value however in Atom, not from the model point of view - it's a very simple model. The value in Atom is in the direction the specification gives pertaining to Atom service behaviour. Another way to view Atom is as the interoperability glue in between the now almost commonly accepted cloud stack of Infrastructure -> Platform -> Service. This is where OCCI could learn quite an amount but when you consider that the basis of these behaviours is deeply entrenched in REST design principles it's easy to see 1) why Atom is attractive 2) that you don't necessarily need Atom to have correct and/or standard service behaviour - just adopt good, sound REST design principles!

On more specifics and related to the attributes of each noun, we are still rather under-specified especially in the Storage and Network nouns. @Richard and @Gary (in separate threads) started a useful discussion on attributes for these nouns. Perhaps you two guys could add your findings and suggestions to the wiki?

Andy

PS: for those of you wanting to run the code, I've exported the Eclipse project and it can be found here [11].

[1] http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/NounsVerbsAndAttributes
[2] http://forge.ogf.org/sf/wiki/do/viewAttachment/projects.occi-wg/wiki/NounsVerbsAndAttributes/occi.xsd
[3] http://forge.ogf.org/sf/wiki/do/viewAttachment/projects.occi-wg/wiki/NounsVerbsAndAttributes/OCCITest.java
[4] http://xmlbeans.apache.org
[5] http://www.rexx.com/~dkuhlman/generateDS.html
[6] http://dev.ctor.org/soap4r/browser/trunk/bin/xsd2ruby.rb
[7] http://xsd2code.codeplex.com/Wiki/View.aspx
[8] http://forge.ogf.org/sf/wiki/do/viewAttachment/projects.occi-wg/wiki/NounsVerbsAndAttributes/occi-instance.xml
[9] http://code.google.com/p/xml2json-xslt/
[10] http://www.bramstein.com/projects/xsltjson/
[11] http://forge.ogf.org/sf/wiki/do/viewAttachment/projects.occi-wg/wiki/NounsVerbsAndAttributes/OCCIModel-eclipse-project.zip

Andy Edmonds
skype: andy.edmonds
tweets: @dizz
tel: +353 (0)1 6069232

IT Research - IT Innovation Centre - Intel Ireland Ltd.
Intel Ireland Limited (Branch)
Collinstown Industrial Park, Leixlip, County Kildare, Ireland
Registered Number: E902934

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
