[occi-wg] Is OCCI the HTTP of Cloud Computing?

Sam Johnston samj at samj.net
Wed May 6 08:25:15 CDT 2009


Richard,

Apologies for the tardy response - I have been busy organising our presence
at the Cloud Computing Expos in Prague and London on 18-19 and 20-21 May
respectively (hope to see some of you there).

On Tue, May 5, 2009 at 3:44 PM, Richard Davies <
richard.davies at elastichosts.com> wrote:

> > Amazon use a simple but sprawling XML based API
> ...
> > there are a raft of intellectual property issues as well:
>
> We definitely agree with you that OCCI can produce a better API than Amazon,
> both in terms of IP issues and also a cleaner design. If I were Amazon and
> wanted to play hardball, I would (a) allow Eucalyptus, etc. to copy my API
> for now (since it only helps me gain traction as a de facto standard), (b)
> remain vague about IP for now, (c) not support any other API standards while
> I remain the de facto standard and (d) later halt or charge fees to any
> competition which becomes serious.
>
> (a), (b) and (c) seem well executed to date ;-)
>

Indeed, and until such time as Amazon work out what they want to do with
their APIs they're essentially a no-go-zone (for us at least).

> > I am now 100% convinced that the best results are to be had with a variant
> > of XML over HTTP
> ...
> > support alternative formats including HTML, JSON and TXT via XML
> > Stylesheets
>
> As you're well aware, we're less in favour of XML. However support for
> alternative formats with automatic cross-conversion makes all the versions
> equivalently first-class citizens, which is good enough for us.
>
> The XSLT converters start with XML and convert to the other formats.
> In practice, ElasticHosts will likely start with TXT, and convert from there
> to JSON, XML, etc. - it would be great to see automatic converters in
> the opposite direction too, to validate that it can be done. Writing these
> will also impose discipline and prevent creation of unnecessarily complex
> data structures, which is always a risk in XML.
>

Agreed, but one of the primary drivers for XML is the ease with which one can
transform from it into $PREFERRED_FORMAT - the same cannot necessarily be
said in reverse. That's not to say I won't have a crack at it, nor that it
would necessarily be all that difficult, but there's no generic counterpart to
xsltjson <http://www.bramstein.com/projects/xsltjson/> (or things like
BadgerFish <http://ajaxian.com/archives/badgerfish-translating-xml-to-json>)
to do that mechanically, for example.

> > You can see the basics in action thanks to my Google App Engine reference
> > implementation at http://occitest.appspot.com/ (as well as HTML, JSON and
> > TXT versions of same)
> >
> > http://occitest.appspot.com/
> > http://www.w3.org/2005/08/online_xslt/xslt?xslfile=http%3A%2F%2Focci.googlecode.com%2Fsvn%2Ftrunk%2Fxml%2Focci-to-html.xsl&xmlfile=http%3A%2F%2Foccitest.appspot.com
> > http://www.w3.org/2005/08/online_xslt/xslt?xslfile=http%3A%2F%2Focci.googlecode.com%2Fsvn%2Ftrunk%2Fxml%2Focci-to-json.xsl&xmlfile=http%3A%2F%2Foccitest.appspot.com
> > http://www.w3.org/2005/08/online_xslt/xslt?xslfile=http%3A%2F%2Focci.googlecode.com%2Fsvn%2Ftrunk%2Fxml%2Focci-to-text.xsl&xmlfile=http%3A%2F%2Foccitest.appspot.com
>
> I'm not going to comment on the XML, since Enterprise XML design is not our
> forte. I will go through the TEXT and JSON versions in some detail.
>

No problem - it's not that difficult to understand but I'm guessing a fairly
introductory explanation would be a useful addition to the spec.


> Here's a sample excerpt in XML:
>
> > <entry>
> >   <id>urn:uuid:47bb7df8-587e-47fa-bd89-6f2f81c14b19</id>
> >   <title>Virtual Machine #1</title>
> >   <summary>Sample Compute Resource</summary>
> >   <updated>2009-05-04T09:52:37Z</updated>
> >   <link href="/47bb7df8-587e-47fa-bd89-6f2f81c14b19" />
> >   <link rel="http://purl.org/occi/storage#device"
> >     href="urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df" title="Hard Drive"/>
> >   <link rel="http://purl.org/occi/network#interface"
> >     href="urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9" title="Internet Connection"/>
> >   <link rel="http://purl.org/occi/network#interface"
> >     href="urn:uuid:253d83dd-e417-4e1f-9958-8c0a63120475" title="Private Network"/>
> > </entry>
>
> in JSON:
> [I assume the '\n' and '\/' in the strings are mistakes. It'd also be good
> if the XSLT-converted JSON had nice indentation, like I've produced below]
>

Yes, I was assuming that would go unnoticed but apparently people are paying
attention... I was using TextMate to clean it up but the same could easily
enough be done in the XSLT.
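
The two escapes are not equally harmless, which a quick check makes visible.
The following is only a sketch using Python's standard json module; the sample
string is a cut-down, hypothetical stand-in for the converter output. It
suggests that "\/" decodes to an ordinary slash, while "\n" replaces the space
in a title with a real newline:

```python
import json

# Cut-down stand-in for the XSLT output, escapes left exactly as generated.
raw = r'{"rel":"http:\/\/purl.org\/occi\/storage#device","title":"Hard\nDrive"}'

doc = json.loads(raw)

# "\/" is a legal JSON escape for "/", so the URL survives intact.
print(doc["rel"])

# "\n" decodes to a newline character, so the title has lost its space.
print(repr(doc["title"]))

# Re-serialising with indent produces the readable layout without an editor.
pretty = json.dumps(doc, indent=2)
print(pretty)
```

So the slash escapes are cosmetic, but the newline escapes change the data and
are the ones that really need fixing in the stylesheet.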


> > {
> >   "id":"urn:uuid:47bb7df8-587e-47fa-bd89-6f2f81c14b19",
> >   "title":"Virtual\nMachine\n#1",
> >   "summary":"Sample\nCompute\nResource",
> >   "updated":"2009-05-04T09:52:37Z",
> >   "link":[
> >     {
> >       "href":"\/47bb7df8-587e-47fa-bd89-6f2f81c14b19"
> >     },
> >     {
> >       "rel":"http:\/\/purl.org\/occi\/storage#device",
> >       "href":"urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df",
> >       "title":"Hard\nDrive"
> >     },
> >     {
> >       "rel":"http:\/\/purl.org\/occi\/network#interface",
> >       "href":"urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9",
> >       "title":"Internet\nConnection"
> >     },
> >     {
> >       "rel":"http:\/\/purl.org\/occi\/network#interface",
> >       "href":"urn:uuid:253d83dd-e417-4e1f-9958-8c0a63120475",
> >       "title":"Private\nNetwork"
> >     }
> >   ]
> > }
>
> and in TXT:
>
> > [47bb7df8-587e-47fa-bd89-6f2f81c14b19]
> > title|Virtual Machine #1
> > summary|Sample Compute Resource
> > updated|2009-05-04T09:52:37Z
> > link|||/47bb7df8-587e-47fa-bd89-6f2f81c14b19
> > link|http://purl.org/occi/storage#device|Hard<http://purl.org/occi/storage#device%7CHard>Drive|urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df
> > link|http://purl.org/occi/network#interface|Internet<http://purl.org/occi/network#interface%7CInternet>Connection|urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9
> > link|http://purl.org/occi/network#interface|Private<http://purl.org/occi/network#interface%7CPrivate>Network|urn:uuid:253d83dd-e417-4e1f-9958-8c0a63120475
> > etag|
>
> Going through the text version...
>
> > [47bb7df8-587e-47fa-bd89-6f2f81c14b19]
>
> To simplify parsing even further, I'd write this as:
>
> id|47bb7df8-587e-47fa-bd89-6f2f81c14b19
>

Right, Chris and I discussed this before and agreed that an INI format
wasn't ideal - changing it is easy enough to do (even for an XSLT newbie
like myself).

We also spoke about putting the ID on every line à la tinydns-data format
(which is nice for multithreaded/asynchronous implementations which may need
to pull data from billing systems, performance monitor counters, etc)...
this is trivial to parse into a hash [of hashes] and a pleasure to work
with.
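
To illustrate the "hash of hashes" point, here is a minimal Python sketch. It
assumes a hypothetical line layout of id|key|value (the exact layout has not
been agreed anywhere in this thread), and the function name parse_records is
made up for the example. Because every line carries its own ID, lines may
arrive in any order:

```python
from collections import defaultdict

def parse_records(text):
    """Parse 'id|key|value' lines into {id: {key: value}}."""
    records = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two pipes only, so a value may itself contain '|'.
        obj_id, key, value = line.split("|", 2)
        records[obj_id][key] = value
    return dict(records)

# Lines for two objects, deliberately interleaved.
sample = """\
47bb7df8-587e-47fa-bd89-6f2f81c14b19|title|Virtual Machine #1
47bb7df8-587e-47fa-bd89-6f2f81c14b19|updated|2009-05-04T09:52:37Z
4cc8cf62-69a4-4650-9e8c-7d4c516884df|title|Hard Drive
47bb7df8-587e-47fa-bd89-6f2f81c14b19|summary|Sample Compute Resource
"""

machines = parse_records(sample)
print(machines["47bb7df8-587e-47fa-bd89-6f2f81c14b19"]["title"])
```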

Another thing to note is the use of the pipe (|) character as a separator
since colons (:) are present in URLs... we may also be able to move to CSV
which is something I believe GoGrid are already supporting. This is a detail
that makes little difference though, and probably something we can agree on
later.


> and have the blank line also as the separator between objects
>
> > title|Virtual Machine #1
> > summary|Sample Compute Resource
> > updated|2009-05-04T09:52:37Z
>
> These look good - simple key-value pairs.
>
> Presumably we'll also add some actual parameters of the virtual machine,
> e.g.
>
> smp|2
> cpu|2000
> mem|1024
>

Right, and then there's just the question of whether these go into the top
level namespace or e.g. "compute:cores", "compute:arch", "compute:speed",
"memory:size", etc. (I'm thinking burning a few more characters is worth it,
and the mechanical transformations are easier then too... avoiding confusion
over e.g. whether "cpu" means speed or cores).
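
As a rough illustration of why prefixed keys keep the transformations
mechanical, the Python sketch below groups "namespace:name" keys into nested
dictionaries. The helper name nest and the mapping of smp/cpu/mem onto
compute:cores, compute:speed and memory:size are assumptions made for the
example, not agreed names:

```python
def nest(flat):
    """Group 'namespace:name' keys into nested dicts; bare keys stay top-level."""
    nested = {}
    for key, value in flat.items():
        namespace, sep, name = key.partition(":")
        if sep:
            # Prefixed key: file it under its namespace.
            nested.setdefault(namespace, {})[name] = value
        else:
            # No prefix: keep it in the top-level namespace.
            nested[key] = value
    return nested

flat = {
    "title": "Virtual Machine #1",
    "compute:cores": "2",
    "compute:speed": "2000",
    "memory:size": "1024",
}
nested = nest(flat)
print(nested)
```

The sketch does not handle a bare key that collides with a namespace name
(e.g. both "compute" and "compute:cores"); a real format would need a rule for
that case.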


> It would be good to work these into the examples, TXT, JSON and XML.
>
> > link|||/47bb7df8-587e-47fa-bd89-6f2f81c14b19
>
> It's good that we have an end-point for editing the object directly. I'd
> make this a simple key-value rather than having blank fields:
>
> link|/47bb7df8-587e-47fa-bd89-6f2f81c14b19
>

The blank fields are not [really] intentional, but there's a good reason for
their existence. "link" is a generic Atom construct (see
here <http://tools.ietf.org/html/rfc4287#section-4.2.7> for details)
that can point back to the object, to an alternate (e.g. PDF)
representation of it, to another object (e.g. a parent, child, sibling,
etc.) or pretty much whatever we want (e.g. think screenshots, snapshots,
build documentation, etc. - again, where a lot of the innovation will take
place).
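
To make the generic nature of "link" concrete, here is a small Python sketch
that groups an entry's links by their rel value. It uses a cut-down copy of the
sample entry, which (like the excerpt in this thread) omits the Atom namespace
declaration; a real feed would need namespace-qualified lookups. The function
name links_by_rel is hypothetical. Per RFC 4287, a link with no rel attribute
is treated as "alternate":

```python
import xml.etree.ElementTree as ET

ENTRY = """
<entry>
  <id>urn:uuid:47bb7df8-587e-47fa-bd89-6f2f81c14b19</id>
  <link href="/47bb7df8-587e-47fa-bd89-6f2f81c14b19" />
  <link rel="http://purl.org/occi/storage#device"
        href="urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df" title="Hard Drive"/>
  <link rel="http://purl.org/occi/network#interface"
        href="urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9" title="Internet Connection"/>
</entry>
"""

def links_by_rel(entry_xml):
    """Return {rel: [(href, title), ...]} for every <link> in the entry."""
    grouped = {}
    for link in ET.fromstring(entry_xml.strip()).findall("link"):
        # RFC 4287: an absent rel is interpreted as "alternate".
        rel = link.get("rel", "alternate")
        grouped.setdefault(rel, []).append((link.get("href"), link.get("title")))
    return grouped

grouped = links_by_rel(ENTRY)
for rel, targets in grouped.items():
    print(rel, targets)
```

A client that only understands storage can pick out the storage#device links
and ignore every rel it does not recognise, which is what leaves room for new
link types later.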

Perhaps we can move the URL to the front and only show the empty fields when
there are gaps... need to get me a good XSLT book (if anyone has a PDF or
even a recommendation I'd be happy to hear it).


> > link|http://purl.org/occi/storage#device|Hard Drive|urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df
> > link|http://purl.org/occi/network#interface|Internet Connection|urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9
> > link|http://purl.org/occi/network#interface|Private Network|urn:uuid:253d83dd-e417-4e1f-9958-8c0a63120475
>
> There are four fields here: 'link', a schema, a name and an object's UUID.
> However there's an important one missing - the specification of _how_ the
> other object is bound to the virtual machine (e.g. is the drive bound as IDE
> or SCSI, and to which bus? Which network is which virtual NIC?).
>

Again "link" is a generic Atom construct that can apply to anything...
usually it is used to point at the content of the object where none is
embedded (remembering you can embed XML e.g. OVF or any other data including
binary if need be).


> This extra field will be needed in the XML too, e.g. as 'type':
>
> <link type="ide:0:0" rel="http://purl.org/occi/storage#device"
> href="urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df" title="Hard Drive"/>
>
> Once this is specified, the first three fields aren't actually necessary in
> the TXT version, which can just be:
>
> ide:0:0|4cc8cf62-69a4-4650-9e8c-7d4c516884df
> nic:0|dc88b244-145f-49e4-be7c-0880dcad42e9
> nic:1|253d83dd-e417-4e1f-9958-8c0a63120475
>
> There's no need to specify the schema, since this is uniquely determined by
> the link type (and in any case only relevant to the XML), no need to specify
> 'link', since it always is for this key, and no need to specify the name,
> since that's the name of the object with the given UUID.
>

Yes, this makes a lot of sense, but it also means you've got logic in your
transforms which is something I would *very much* like to avoid. We also
start to impose on the format of device identifiers which doesn't
necessarily work in large, complex and/or heterogeneous environments (you
say eth0, I say en1 and others say "Local Area Connection"... and that's
before you start talking about paravirtualised block devices and so on) and
short of revising the spec we're at a dead end if implementors need to carry
other information such as block or frame size. Human friendly "title"s are
incredibly useful too by the way, and while one could use the title of the
storage or network resource that doesn't always work (think shared networks
and cluster storage).

I admit there's probably a better way to do it and to be honest I've not had
a chance to spend a lot of time on the problem (nor am I an XSLT wizard,
yet).


> > etag|
>
> What's this doing? Can it be deleted?
>

Nothing at the moment because I haven't worked out a sensible way to
generate ETags <http://en.wikipedia.org/wiki/HTTP_ETag>; however, this is
part of the magic fairy dust that Google added to Atom to make GData and
it's very important for
performance/caching/proxying/gatewaying/concurrency/conflict resolution/etc.
(but also usually safe to ignore for those who don't care about such
things).
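
One common way to produce an ETag is to hash the exact bytes being served.
The Python sketch below shows that approach together with a simplified
conditional GET; it is an assumption for illustration, not what the reference
implementation does (which currently emits an empty etag). The names make_etag
and respond are hypothetical, and the If-None-Match handling ignores lists of
tags and "*":

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Strong ETag: the quoted hex digest of the exact bytes served."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's cached copy is current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

body = b"title|Virtual Machine #1\n"

# First request: full response plus an ETag to remember.
status, headers, payload = respond(body)

# Second request presents the ETag and gets an empty 304 back.
status_again, _, payload_again = respond(body, headers["ETag"])
print(status, status_again)
```

Hashing ties the tag to one representation, so the TXT, JSON and XML renderings
of the same resource would each get a different ETag under this scheme.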


> Putting these together, the original:
>
> > [47bb7df8-587e-47fa-bd89-6f2f81c14b19]
> > title|Virtual Machine #1
> > summary|Sample Compute Resource
> > updated|2009-05-04T09:52:37Z
> > link|||/47bb7df8-587e-47fa-bd89-6f2f81c14b19
> > link|http://purl.org/occi/storage#device|Hard Drive|urn:uuid:4cc8cf62-69a4-4650-9e8c-7d4c516884df
> > link|http://purl.org/occi/network#interface|Internet Connection|urn:uuid:dc88b244-145f-49e4-be7c-0880dcad42e9
> > link|http://purl.org/occi/network#interface|Private Network|urn:uuid:253d83dd-e417-4e1f-9958-8c0a63120475
> > etag|
>
> Turns into:
>
> id|47bb7df8-587e-47fa-bd89-6f2f81c14b19
> title|Virtual Machine #1
> summary|Sample Compute Resource
> updated|2009-05-04T09:52:37Z
> smp|2
> cpu|2000
> mem|1024
> link|/47bb7df8-587e-47fa-bd89-6f2f81c14b19
> ide:0:0|4cc8cf62-69a4-4650-9e8c-7d4c516884df
> nic:0|dc88b244-145f-49e4-be7c-0880dcad42e9
> nic:1|253d83dd-e417-4e1f-9958-8c0a63120475
>
> Which is shorter, simpler and describes the virtual server more fully (3
> extra properties + the link attachment points).
>

Sure this looks nicer/neater for us humans but it's less flexible and makes
more work for the machines (and more importantly, programmers thereof). We
need to find a balance and I'll certainly be working with you and Chris to
do that.


> We can also render this same thing in JSON:
>
> {
>  "id": "47bb7df8-587e-47fa-bd89-6f2f81c14b19",
>  "title": "Virtual Machine #1",
>  "summary": "Sample Compute Resource",
>  "updated": "2009-05-04T09:52:37Z",
>  "smp": 2,
>  "cpu": 2000,
>  "mem": 1024,
>  "link": "/47bb7df8-587e-47fa-bd89-6f2f81c14b19",
>  "ide:0:0": "4cc8cf62-69a4-4650-9e8c-7d4c516884df",
>  "nic:0": "dc88b244-145f-49e4-be7c-0880dcad42e9",
>  "nic:1": "253d83dd-e417-4e1f-9958-8c0a63120475"
> }
>
> Which again is shorter, simpler and more descriptive than the original.
>

Agreed.
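
For the simplified key|value form, the TXT-to-JSON direction looks mechanical.
The Python sketch below is one hedged way to do it; the function name
txt_to_json is made up, and treating all-digit values as integers is an
assumption taken from the unquoted numbers in the JSON example:

```python
import json
import re

def txt_to_json(text):
    """Convert 'key|value' lines to a JSON object; all-digit values become ints."""
    obj = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first pipe only; keys such as "ide:0:0" contain colons.
        key, _, value = line.partition("|")
        obj[key] = int(value) if re.fullmatch(r"[0-9]+", value) else value
    return json.dumps(obj, indent=1)

sample = """\
id|47bb7df8-587e-47fa-bd89-6f2f81c14b19
title|Virtual Machine #1
smp|2
cpu|2000
mem|1024
ide:0:0|4cc8cf62-69a4-4650-9e8c-7d4c516884df
"""

converted = txt_to_json(sample)
print(converted)
```

This only works because every key is unique; with the original format, where
"link" appears on several lines, a later line would overwrite an earlier one.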


> Sam: please can you take these changes on board:
> - update the XML example with the 3 extra properties, the link attachment
>  points and probably some actuators too (like start and stop)
> - update the XSLT conversions to produce the improved TXT and JSON formats
>

No problem. I wanted to have the actuators in place already but also wanted
to make them responsive (that is, so you can actually test your clients
against this and so the general public can kick the tires).

Sam