[occi-wg] Is OCCI the HTTP of Cloud Computing?

Sam Johnston samj at samj.net
Mon May 25 11:12:30 CDT 2009


Reposting with comments, esp: "*I'm not an XML or JSON expert, but I think
this is not the time for the OCCI guys to discuss it; their work is already
on top of XML and they should focus on it, especially on its simplicity.*"

Is OCCI the HTTP of Cloud Computing?
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html>

The Web is built on the Hypertext Transfer Protocol (HTTP)
<http://en.wikipedia.org/wiki/HTTP>, a client-server protocol that simply
allows client user agents to retrieve and manipulate resources stored on a
server. It follows that a single protocol could prove similarly critical for
Cloud Computing <http://en.wikipedia.org/wiki/Cloud_computing>, but what
would that protocol look like?

The first place to look for the answer is limitations in HTTP itself. For a
start the protocol doesn't care about the payload it carries (beyond its
Internet media type <http://en.wikipedia.org/wiki/Internet_media_type>, such
as text/html), which doesn't bode well for realising the vision
<http://www.w3.org/2001/sw/Activity.html> of the [Semantic
<http://en.wikipedia.org/wiki/Semantic_Web>] Web
<http://en.wikipedia.org/wiki/World_Wide_Web> as a "universal medium for the
exchange of data". Surely it should be possible to add some structure to that
data in the simplest way possible, without having to resort to carrying
complex, opaque file formats (as is the case today)?

Ideally any such scaffolding added would be as light as possible, providing
key attributes common to all objects (such as updated time) as well as basic
metadata such as contributors, categories, tags and links to alternative
versions. The entire web is built on hyperlinks so it follows that the
ability to link between resources would be key, and these links should be
flexible such that we can describe relationships in some amount of detail.
The protocol would also be capable of carrying opaque payloads (as HTTP does
today) and for bonus points transparent ones that the server can seamlessly
understand too.
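
As a rough illustration of how light such scaffolding can be, here is a
minimal sketch that builds a single Atom entry (RFC 4287) with Python's
standard library. The element names come from the Atom format; the
identifiers, URLs and values are placeholders invented for this example, and
the result is not validated against the Atom schema.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # Atom namespace defined by RFC 4287
ET.register_namespace("", ATOM)       # serialise Atom as the default namespace


def q(name: str) -> str:
    """Return the namespace-qualified tag for an Atom element."""
    return f"{{{ATOM}}}{name}"


def build_entry() -> ET.Element:
    """Build one entry carrying only the common scaffolding."""
    entry = ET.Element(q("entry"))
    # All identifiers, names and URLs below are hypothetical placeholders.
    ET.SubElement(entry, q("id")).text = "urn:example:resource-1"
    ET.SubElement(entry, q("title")).text = "Example resource"
    ET.SubElement(entry, q("updated")).text = "2009-05-25T11:12:30Z"
    author = ET.SubElement(entry, q("author"))
    ET.SubElement(author, q("name")).text = "Example Author"
    contributor = ET.SubElement(entry, q("contributor"))
    ET.SubElement(contributor, q("name")).text = "Example Contributor"
    ET.SubElement(entry, q("category"), {"term": "example-tag"})
    # Links describe relationships: an alternative version and a related resource.
    ET.SubElement(entry, q("link"), {
        "rel": "alternate",
        "type": "text/html",
        "href": "http://example.com/resource-1.html",
    })
    ET.SubElement(entry, q("link"), {
        "rel": "related",
        "href": "http://example.com/resource-2",
    })
    # The payload itself; here plain text, but it could be any media type.
    ET.SubElement(entry, q("content"), {"type": "text"}).text = "payload"
    return entry


if __name__ == "__main__":
    print(ET.tostring(build_entry(), encoding="unicode"))
```

Everything other than the content element is generic metadata, which is the
point: the same envelope works regardless of what the payload is.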

Like HTTP this protocol would not impose restrictions on the type of data it
could carry but it would be seamlessly (and safely) extensible so as to
support everything from contacts to contracts, biographies to books (or
entire libraries!). Messages should be able to be serialised for storage
and/or queuing as well as signed and/or encrypted to ensure security.
Furthermore, despite significant performance improvements introduced in HTTP
1.1 it would need to be able to stream many (possibly millions of) objects
as efficiently as possible in a single request too. Already we're asking a
lot from something that must be extremely simple and easy to understand.

XML

It doesn't take a rocket scientist to work out that this "new" protocol is
going to be XML based, building on top of HTTP in order to take advantage of
the extensive existing infrastructure. Those of us who know even a little
about XML will be ready to point out that the "X" in XML means "eXtensible"
so we need to be specific as to the schema for this assertion to mean
anything. This is where things get interesting. We could of course go down
the WS-* route and try to write our own but surely someone else has crossed
this bridge before - after all, organising and manipulating objects is one
of the primary tasks for computers.

Who better to turn to for inspiration than a company whose mission
<http://www.google.com/corporate/> it is to "organize the world's information
and make it universally accessible and useful", Google. They use a single
protocol for almost all of their APIs, GData
<http://code.google.com/apis/gdata/>, and while people don't bother to look
under the hood (no doubt thanks to the myriad client libraries
<http://code.google.com/apis/gdata/clientlibs.html> made available under the
permissive Apache 2.0 license), when you do you may be surprised at what you
find: everything from contacts to calendar items, and pictures to videos is a
feed (with some extensions for things like searching
<http://code.google.com/apis/gdata/docs/2.0/basics.html#Searching> and
caching
<http://code.google.com/apis/gdata/docs/2.0/reference.html#ResourceVersioning>).

OCCI

Enter the OGF's Open Cloud Computing Interface (OCCI)
<http://www.occi-wg.org/> whose (initial) goal it is to provide an extensible
interface to Cloud Infrastructure Services (IaaS). To do so it needs to allow
clients to enumerate and manipulate an arbitrary number of server side
"resources" (from one to many millions) all via a single entry point. These
compute, network and storage resources need to be able to be created,
retrieved, updated and deleted (CRUD) and links need to be able to be formed
between them (e.g. virtual machines linking to storage devices and network
interfaces). It is also necessary to manage state (start, stop, restart) and
retrieve performance and billing information, among other things.
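
To make the CRUD requirement concrete, the sketch below maps the four
operations onto HTTP methods in the style of the Atom Publishing Protocol
(RFC 5023). It only constructs the requests and sends nothing. The entry
point URL and resource names are hypothetical, and nothing here is taken
from an OCCI draft; state changes (start, stop, restart) are left out
because the post does not say how they would be expressed.

```python
from urllib.request import Request

ENTRY_POINT = "http://cloud.example.com/"         # hypothetical single entry point
ATOM_ENTRY = "application/atom+xml;type=entry"    # media type from RFC 5023


def create(entry_xml: bytes) -> Request:
    """Create: POST a new entry to the entry point (collection)."""
    return Request(ENTRY_POINT, data=entry_xml,
                   headers={"Content-Type": ATOM_ENTRY}, method="POST")


def retrieve(resource_id: str) -> Request:
    """Retrieve: GET the entry for one resource."""
    return Request(ENTRY_POINT + resource_id, method="GET")


def update(resource_id: str, entry_xml: bytes) -> Request:
    """Update: PUT a replacement entry to the resource's own URL."""
    return Request(ENTRY_POINT + resource_id, data=entry_xml,
                   headers={"Content-Type": ATOM_ENTRY}, method="PUT")


def delete(resource_id: str) -> Request:
    """Delete: DELETE the resource's URL."""
    return Request(ENTRY_POINT + resource_id, method="DELETE")


if __name__ == "__main__":
    for req in (create(b"<entry/>"), retrieve("vm-1"),
                update("vm-1", b"<entry/>"), delete("vm-1")):
        print(req.get_method(), req.full_url)
```

Passing any of these objects to urllib.request.urlopen would issue the
request against a real server.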

The OCCI working group basically has two options now in order to deliver an
implementable draft this month as promised: follow Amazon or follow Google
(the whole while keeping an eye on other players including Sun and VMware).
Amazon use a simple but sprawling XML based API with a PHP style flat
namespace and while there is growing momentum around it, it's not without
its problems. Not only do I have my doubts about its scalability outside of
a public cloud environment (calls like 'DescribeImages' would certainly
choke with anything more than a modest number of objects and we're talking
about potentially millions) but there are a raft of intellectual property
issues as well:

   - *Copyrights* (specifically section 3.3 of the Amazon Software License
   <http://aws.amazon.com/asl/>) prevent the use of Amazon's "open source"
   clients with anything other than Amazon's own services.
   - *Patents* pending like #20070156842
   <http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220070156842%22.PGNR.&OS=DN/20070156842&RS=DN/20070156842>
   cover the Amazon Web Services APIs and Amazon have been known to use
   patents offensively
   <http://en.wikipedia.org/wiki/1-Click#Barnes_.26_Noble> against
   competitors.
   - *Trademarks* like #3346899
   <http://tarr.uspto.gov/servlet/tarr?regser=serial&entry=77054011> prevent
   us from even referring to the Amazon APIs by name.

While I wish the guys at Eucalyptus <http://open.eucalyptus.com/> and
Canonical <http://news.zdnet.com/2100-9595_22-292296.html> well and don't
have a bad word to say about Amazon Web Services, this is something I would
be bearing in mind while actively seeking alternatives, especially as Amazon
haven't worked out
<http://www.tbray.org/ongoing/When/200x/2009/01/20/Cloud-Interop> whether
the interfaces are IP they should protect. Even if these issues were
resolved via royalty free licensing it would be very hard as a single vendor
to compete with truly open standards (RFC 4287: Atom Syndication Format
<http://tools.ietf.org/html/rfc4287> and RFC 5023: Atom Publishing Protocol
<http://tools.ietf.org/html/rfc5023>) which were developed at IETF by the
community based on loose consensus and running code.

So what does all this have to do with an API for Cloud Infrastructure
Services (IaaS)? In order to facilitate future extension my initial designs
for OCCI have been as modular as possible. In fact the core protocol is
completely generic, describing how to connect to a single entry point,
authenticate, search, create, retrieve, update and delete resources, etc.
all using existing standards including HTTP, TLS, OAuth and Atom. On top of
this are extensions for compute, network and storage resources as well as
state control (start, stop, restart), billing, performance, etc. in much the
same way as Google have extensions for different data types (e.g. contacts
vs YouTube movies).
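
The layering described above can be sketched with XML namespaces: a generic
client reads only the Atom core, while an extension-aware client also reads
elements from the extension's namespace. The compute namespace URI, the
element names and the values below are assumptions made up for this
illustration and do not come from any OCCI document.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"         # Atom namespace (RFC 4287)
COMPUTE = "http://example.com/occi/compute"  # hypothetical extension namespace

# A core entry carrying two elements from the hypothetical compute extension.
DOC = f"""<entry xmlns="{ATOM}" xmlns:c="{COMPUTE}">
  <id>urn:example:vm-1</id>
  <title>vm-1</title>
  <updated>2009-05-25T11:12:30Z</updated>
  <c:cpu>2</c:cpu>
  <c:memory>2048</c:memory>
</entry>"""


def fields_in(entry: ET.Element, namespace: str) -> dict:
    """Collect direct child elements that belong to one namespace."""
    prefix = f"{{{namespace}}}"
    return {
        child.tag[len(prefix):]: child.text
        for child in entry
        if child.tag.startswith(prefix)
    }


if __name__ == "__main__":
    entry = ET.fromstring(DOC)
    print("core:", fields_in(entry, ATOM))        # what a generic client sees
    print("compute:", fields_in(entry, COMPUTE))  # what a compute-aware client adds
```

A client that knows nothing about the compute extension still gets a usable
entry, and a second extension in its own namespace would not collide with
the first.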

Simply by standardising at this level OCCI may well become the HTTP of Cloud
Computing.

10 comments:

bb <http://b.b3k.us/> said...

Sam,

I'm hoping there is more to the various decisions than is outlined in this
post. "Everyone is doing it" is illustrative, but not convincing.

XML drags along an awful lot of baggage, which has resulted in many folks
using lighter-weight formats like JSON. Atom, in turn, lards still more
baggage into the mix, again, without any clear advantage in this
application. Finally, OAuth is very much in flux, as the recent security
incident makes painfully clear, and, once again, its compelling advantage in
this scenario is not covered above.

The Sun Cloud API work is extremely solid and goes towards simplicity in
almost every choice. Why are you ignoring it?


Ben
  Tuesday, 5 May 2009 4:28:00 AM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1241490480000#c6419411439335971468>
Sam Johnston <http://www.blogger.com/profile/13816529874906993705> said...

G'day Ben,

Au contraire, we are paying *very* close attention to the Sun Cloud APIs and
originally wanted to draw from them heavily were it not for a disagreement
with the OGF powers that be over Creative Commons licensing. Nonetheless we
have learnt a lot from them about high level concepts such as having a
single entry point and "buttons" for state changes and the like. We are also
looking at simple text based APIs like those used at ElasticHosts and GoGrid
(and of course Amazon).

Sun's decision to use JSON no doubt stems from the fact that the API was
previously consumed almost exclusively by the Q-Layer web interface - it is
not so clear that it is the best choice when you stray from the web and
native JavaScript environments, particularly when you want to (safely)
support extensibility... XML allows users to extend the API as necessary in
a fashion that can be validated and will not break other extensions.

Another reasonably firm requirement is that it be possible/trivial to script
the interface for e.g. sysadmin tasks... for this we need a simple text
based protocol anyway so we already have to support more than one output
format (which is increasingly common today, even if not particularly
elegant).

Finally, given that technology is only half the battle (the other half being
marketing/politics), it's necessary to cater to the needs of enterprise
users who are well used to horrors like WS-* and who will find a lightweight
XML layer refreshingly simple in comparison to the status quo. More
importantly though, enterprise requirements like validation, signing,
encryption, etc. are natively supported by XML and not (easily) supported
for the alternatives.

Oh and we are not at all wedded to OAuth - any HTTP authentication mechanism
will do (though it may make sense for us to limit this somewhat for
interoperability).

For more specific details check out the wiki
<http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/HomePage>
and/or get involved yourself.

Sam
  Tuesday, 5 May 2009 9:33:00 AM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1241508780000#c4300728670229206261>
bb <http://b.b3k.us/> said...

Sam,

The onus is on you to justify the additional complexity. Rejecting simpler
solutions because they are unfamiliar is dangerous.

While JSON might have initially entered the protocol via Q-Layer, the
current state of the work is miles from those origins. JSON has similarly
moved well beyond its origins as something used by Javascript (see CouchDB
for a great example). Finally, it is a simple, text-based system, far
simpler than XML, and I have recent, painful experience in working in JSON
and XML simultaneously for systems management. It's a no brainer: JSON wins
in that space because it is easier to write, easier to read, and a lot
harder to make poor, structural decisions using it.

Enterprises currently use a lot of XML because that is what vendors like
Microsoft foisted on them in the name of "standards". If you adopt all the
existing validation and security stack for XML, you are so close to WS-* as
to be indistinguishable. This work will fail if all it does is produce
"WS-Cloud", and that is very much the path on which you are placing it with
these choices. You can do better.


Ben
  Tuesday, 5 May 2009 11:30:00 AM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1241515800000#c7113412609436775580>
Sam Johnston <http://www.blogger.com/profile/13816529874906993705> said...

Ben,

Thanks for your ongoing interest in this topic - your opinion is valued (and
not just by me) so I would very much like us to be on the same wavelength.

I've since been looking at CouchDB which is a very interesting project given
one of the design patterns I've been keeping in mind is a database driven
controller (that is, OCCI being a direct interface to the database) with
triggers firing off e.g. AMQP messages for state changes et al. Erlang is
the obvious choice for such infrastructure and it seems like CouchDB has
done a lot of the work already. Of course it's well out of scope for the
spec itself but it's useful to consider how it might be realised.

In terms of the XML vs JSON bikeshed <http://www.bikeshed.com/> (and I say
bikeshed because once you have to reach for a library, as is the case for
both XML and JSON, you may as well use ASN.1 on the wire for all it matters),
I have many reasons for preferring XML, not the least of which is the ease
with which it can be converted to $PREFERRED_FORMAT using tools and
expertise that already exist (the same cannot necessarily be said for the
reverse, or between other formats). Very few enterprise developers have JSON
exposure, let alone experience, and it's not surprising given that essential
features such as schemas <http://www.json.com/json-schema-proposal/> and
validation are still in a state of flux. Incidentally one thing I *do* like
about YAML (and to a lesser extent, JSON) is the relative ease with which
even complex data structures can be represented - this does provide more
rope with which to hang oneself though.

This brings us to the specifics of the proposal... XML can be [ab]used in so
many different ways it's worth pointing out that the proposal uses it in the
simplest way possible. Atom gives us just enough structure to link objects
together, categorise them and provide metadata (such as owners, etags for
caching, etc.) - as Andy observed it's more of a meta-model than anything
else and we found that much of what we were trying to do was reinventing the
[Atom] wheel. The actual content itself (cpu, memory, etc.) is simple,
easily readable, and in all cases so far, flat... as such it's more a
question of whether we use angle brackets or curly braces :)
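
To illustrate the "angle brackets or curly braces" point for flat content, a
minimal sketch follows that round-trips the same fields between XML and
JSON using only the Python standard library. The attribute names cpu and
memory are the ones mentioned above; the wrapper element name and the values
are invented for the example, and the approach only holds while the content
stays flat.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical flat content; the "compute" wrapper and the values are made up.
FLAT_XML = "<compute><cpu>2</cpu><memory>2048</memory></compute>"


def flat_xml_to_dict(xml_text: str) -> dict:
    """Read a flat XML element into a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def dict_to_flat_xml(tag: str, fields: dict) -> str:
    """Write a flat dict back out as one XML element per field."""
    root = ET.Element(tag)
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    fields = flat_xml_to_dict(FLAT_XML)
    as_json = json.dumps(fields)                             # curly braces
    as_xml = dict_to_flat_xml("compute", json.loads(as_json))  # angle brackets
    print(as_json)
    print(as_xml)
```

Once nesting, attributes or mixed namespaces enter the picture the mapping
stops being this mechanical, which is where the choice of format starts to
matter.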

So why go for angle brackets (XML) over curly braces (JSON) given most of
the action is going on in the latter camp? In addition to the reasons given
above, Google. That is, if we essentially rubber stamp GData (a well proven
protocol) in the form of OCCI core then it's just a matter of changing (or
supporting multiple) namespaces and we may well be able to get Google on
board along with a raft of running code. I've already been discussing this
with Google's API gurus and product managers this week and while there's no
commitment there is some amount of interest.

As for WS-Cloud, I assure you this is nothing like it... I wonder when
reading the WS-[death]* specs whether they weren't deliberately obfuscated
in some cases and even the better ones like WS-Agreement
<http://samj.net/2009/05/www.ogf.org/documents/GFD.107.pdf> are
[over]complicated, in this case running out to dozens of pages.

Sam
  Wednesday, 6 May 2009 11:49:00 AM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1241603340000#c7503506562813624552>
William Vambenepe <http://stage.vambenepe.com/> said...

Speaking of WS-*, I don't think it's a selfish plug for me to point you
towards the post in which I examine how previous efforts (mainly around
WS-*) can inform Cloud standards going forward. It is very relevant to what
you write here. When I read this paragraph:

"Enter the OGF's Open Cloud Computing Interface (OCCI) whose (initial) goal
it is to provide an extensible interface to Cloud Infrastructure Services
(IaaS). To do so it needs to allow clients to enumerate and manipulate an
arbitrary number of server side "resources" (from one to many millions) all
via a single entry point. These compute, network and storage resources need
to be able to be created, retrieved, updated and deleted (CRUD) and links
need to be able to be formed between them (e.g. virtual machines linking to
storage devices and network interfaces). It is also necessary to manage
state (start, stop, restart) and retrieve performance and billing
information, among other things."

... I am a bit worried because you describe the whole field of IT modeling
and interoperability as we've tried to capture it for the last decade. So
when I see that you plan to "deliver an implementable draft this month" I
get a little worried.

Anyway, here is the "lessons to learn" post, titled "A post-mortem on the
previous IT management revolution":
http://stage.vambenepe.com/archives/700

Among other things, beware of focusing on protocols rather than models,
beware of working without clear use cases and beware of trying to address
broad problems rather than constrained problems. BTW, I am not saying that
you are doing these things and I haven't looked at the working documents on
the OCCI site. These are just thoughts that come to mind from reading this
blog entry.
  Wednesday, 6 May 2009 11:59:00 AM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1241603940000#c8765972353953633859>
James Broberg <http://www.metacdn.org/> said...

Interesting post Sam. As someone spectating the process it seems that there
is so much duplicate work going on in cloud standards. Just to name a few of
the many ongoing “open” standards efforts:
1. Cloud Computing Interoperability Forum
2. Open Grid Forum (OGF) Open Cloud Computing Interface Working Group
3. DMTF Open Cloud Standards Incubator
4. GoGrid API (CC licensed)
5. Sun Cloud API (CC licensed)
6. Amazon Web Services as “de-facto” (i.e. as Euc. and Nimbus have proceeded)

Someone more cynical than I could take the next step, creating another
standard / working group that sits on top of them all :)
  Monday, 11 May 2009 4:21:00 PM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1242051660000#c3790762555822944522>
pcalcada <http://www.cloudviews.org/author/pcalcada/> said...

What James has just said is definitely a result of the growing interest in
Cloud Computing, and is something we have all seen happen with other
emerging technologies.

It would only be a problem if those working on the projects lost focus, and
especially if they ended up on some kind of collision course :). And that, I
think, is not happening with the work promoted here. I'm not an XML or JSON
expert, but I think this is not the time for the OCCI guys to discuss it;
their work is already on top of XML and they should focus on it, especially
on its simplicity. Regarding the OAuth security problems, I leave here a
reference to an interesting document by Nat Sakimura about them, and about
how OpenID CX could be a "perfect" solution:
http://www.sakimura.org/en/modules/wordpress/index.php?p=77
  Sunday, 17 May 2009 1:17:00 PM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1242559020000#c2803257747027186479>
Sam Johnston <http://www.blogger.com/profile/13816529874906993705> said...

@William: Thanks for the feedback and for taking the time to write your
excellent post-mortem <http://stage.vambenepe.com/archives/700>.

I'm quite convinced that one of the primary causes for WS-[death]*'s failure
was the formation of various groups trying to deal with problems in a
vacuum. If you put the blinkers on you're going to end up with a myriad
disparate and/or overlapping standards, all of which need to be understood
and implemented for the resulting system to succeed.

The concern I have now is that by seeking to stick to the [hastily created]
charter word for word and ignoring the context of other working groups we
could be setting the wheels in motion for an equally catastrophic
CC-[death]*, only with different people and the formats du jour.

Granted, the [over-]complexity of SOAP et al raised the barriers to entry
(some [vendors] would say that's a GoodThing™) but over-simplification can
just as easily result in the same outcome.

I'd very much like to see us build on work already done (e.g. AtomPub) but
if there's a simpler way that doesn't involve rolling a new protocol for
each itch that needs scratching then I'm all ears. Watch this space...

Sam
  Sunday, 24 May 2009 12:07:00 PM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1243159648073#c8385478345760581504>
Sam Johnston <http://www.blogger.com/profile/13816529874906993705> said...

@James: There has been some talk about creating a meta-standard group of
some sort - what could possibly go wrong? *cough*WS-I*cough*

My preference is to forge links between working groups as I have done
already with SNIA (the last OCCI call was joint with an SNIA face-to-face
meeting). And by links I don't mean worthless MoU style documents, rather
real intergeek working relationships.

I'd happily do this myself were it not for the cumulative expense/effort of
joining the various SSOs, but I will anyway for the stuff I care about (e.g.
OCCI).

Sam
  Sunday, 24 May 2009 12:12:00 PM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1243159937685#c5428022229359583026>
Sam Johnston <http://www.blogger.com/profile/13816529874906993705> said...

@Paulo: The XML-xenophobes/JSON-junkies made it perfectly clear in London
last week that their position is immutable. I'm busying myself this weekend
working out a solution so this unnecessary religious war doesn't jeopardise
the whole project. XML is certainly well established and an obvious choice
[to me], but the stuff I'm looking at now may be able to dispense with it
altogether... watch this space.

Looking forward to exploring the identity issues in Portugal this week.

Sam
  Sunday, 24 May 2009 12:40:00 PM CEST
<http://samj.net/2009/05/is-occi-http-of-cloud-computing.html?showComment=1243161624114#c3640945663217746787>