The Bigger Picture.

Mario Antonioletti mario at epcc.ed.ac.uk
Mon May 23 06:14:02 CDT 2005


Hi,
   Word of warning - the diagram I posted out was not normative
in any way. It was my take on what was in the data architecture
document at the time. The links may thus not be associated with
the right terms. It was supposed to engender discussion ... like
this ... I'll try to provide answers to some of your points below.

>  - DataSet is data externalised from the data resource:
>    do you mean that what determines a dataset is not residing
>    in the data resource itself but somewhere external?
>    i seem to be missing something..

I think the concept - as we use it - was originally introduced by
Malcolm and adopted in DAIS. The current definition there is:

A data set is an encoding of data suitable for externalization outside a
data service, for example, as an XML document or as a binary stream. The
concept of a data set is introduced to describe data as it appears in
the messages passing to and from data services, i.e. between the
consumer and the data service.

Does that make it any clearer? I think Malcolm's original conception was
much richer. It allowed for provenance to be included in the data set
envelope and the payload could be an instruction on how to generate the
data - not the data itself. This was all being talked about a long time
ago and has not carried through. The ability to talk about an
externalised piece of data was useful though and was adopted in DAIS and
is now being used in the OGSA-D WG. Cc'ing Malcolm and Simon in case
they wish to pass comment.
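
To make the envelope idea a bit more concrete, here is a rough Python
sketch. It is only my illustration, not anything defined in DAIS or the
OGSA-D document: the class name, the field names and the materialise()
method are all made up. It just shows a data set carrying provenance
alongside a payload that is either the encoded data or an instruction
for generating it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class DataSet:
    """Hypothetical envelope for data externalised from a data service."""

    # Encoding of the payload, e.g. an XML document or a binary stream.
    encoding: str
    # Either the encoded data itself, or a recipe that produces it on demand.
    payload: Union[bytes, Callable[[], bytes]]
    # Free-form provenance notes travelling with the envelope (assumed shape).
    provenance: List[str] = field(default_factory=list)

    def materialise(self) -> bytes:
        """Return the encoded data, running the recipe if the payload is one."""
        if callable(self.payload):
            return self.payload()
        return self.payload


# A data set that carries its data directly.
inline = DataSet("text/xml", b"<row id='1'/>", ["result of a query"])

# A data set whose payload is an instruction for generating the data.
deferred = DataSet("text/xml", lambda: b"<row id='2'/>", ["generated on demand"])
```

In both cases the consumer only sees the envelope; whether the data was
already there or gets generated on request is hidden behind
materialise().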

> - composite data service: what do you mean by the unbounded link? 
> (sorry if i missed some discussions here)

Not all of these things are my definitions :-)
A simple data service can represent more than one data resource or
data service. Did not think that there was a limit hence I put
unbounded.
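
As a rough sketch of what I mean by the unbounded link (again purely
illustrative: the class names and the resources() helper are invented
for this email and do not come from the document), a data service that
fronts other things simply holds a list with no upper limit on its
length, and the members can be data resources or other data services:

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataResource:
    """Hypothetical stand-in for a single data resource."""

    name: str


@dataclass
class DataService:
    """Hypothetical data service representing any number of members."""

    name: str
    # No upper bound on the list length: this is the "unbounded" link.
    members: List[Union["DataService", DataResource]] = field(default_factory=list)

    def resources(self) -> List[DataResource]:
        """Collect every data resource reachable through this service."""
        found: List[DataResource] = []
        for member in self.members:
            if isinstance(member, DataResource):
                found.append(member)
            else:
                # A nested data service: descend into its own members.
                found.extend(member.resources())
        return found


inner = DataService("inner", [DataResource("db-a"), DataResource("db-b")])
outer = DataService("outer", [inner, DataResource("files-c")])
names = [r.name for r in outer.resources()]
```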

>  - i don't understand how data virtualization can provide
>    data management and data access. i understand how it provides
>    data creation (i think) but also then shouldn't it be described
>    the other way round? i.e. data virtualization can trigger
>    data creation? again i seem to be missing something..
>    so i still have problems with understanding what different people
>    mean by 'virtualization'. do we have a final definition of this
>    yet? i may have an outdated version of the doc..

I've had problems with the virtualization term myself. We phased it out
of DAIS because it was such an overloaded term and everyone had their
own view as to what it meant. I did not quite see what the association
was in the document. I think if you present a modified (or updated)
version of the concept map as to how you think it should look then that
would be cool.
 
>  - looking at your picture, it reads that data capabilities are
>    access,federation,location management and transfer. the problem
>    here is that this is easily misunderstood since 'data capabilities'
>    refers to the grid architecture capabilities, not to capabilities
>    of actual data being stored. so having this in the same
>    image may be confusing. it may help to talk about architectural
>    capabilities and models in a separate image.

I don't disagree. I was just trying to get a picture as to what the
conceptual relationships in the document were at the time. It needs to
be reviewed and where it's still not clear then the document needs to
be clarified.
 
>  - i agree with your annotations in the cmap! i think the data
> description is metadata to be stored either with the data or in a data
> catalog; and dataset is just like that too, its metadata of the actual
> data, not data itself. for transfers you would need to look up the
> metadata and you're right, it doesn't necessarily materialize except on
> the actual target. 

If you want to take a shot at updating the CMAP then that would be cool.
Would not mind iterating with someone/others. Should be clear though as
to what comes from the document and what comes from a perception as to
what it should be (so that it is later clarified in the document). My
initial take was to derive the cmap from what was in the document. I
should try to update it but if you want to have a go first then please
feel free :-). Thanks Peter,

			Mario
 
> 
> 
> 
> peter
> 
> On Tue, 2005-05-03 at 13:54 +0100, Mario Antonioletti wrote:
> > Hi,
> >    I think one thing that has been lacking a bit in the OGSA-D WG so far
> > has been the bigger picture trying to tie some of the concepts together.
> > A long time ago I remember seeing the concept maps that were produced
> > for the WS architecture (see: http://www.w3.org/TR/ws-arch/) and I quite
> > liked these. I don't know what tool they used but I found something
> > similar, if not the same. See:
> > 
> > 	http://cmap.ihmc.us/
> > 
> > It's a java application and I think, though I've not looked too closely,
> > is free. I have tried to create a concept map for the data architecture
> > and have included an exported image as an attachment to this email as
> > well as the cmap file that contains the source to the diagram in case
> > anyone wants to play with it. It was very easy to generate and change
> > and I believe the cmaps can be shared so they could be developed in a
> > collaborative fashion. I think it helps to elicit detail from the
> > model, e.g. to what does the data description bind to: the
> > virtualization or the service? Does the idea of having a data set
> > defined in the data transfer make sense? i.e. I think of a data set as a
> > disconnected piece of data that has been materialised outside the data
> > resource and this would only make sense if it were staged. Do we benefit
> > at all from making the distinction between a simple and aggregated data
> > service? etc, etc, etc. 
> > 
> > Note that I am not advocating that these should be included in the
> > document, nor that everything should go in one diagram (as is
> > currently). I think that the diagrams can be nested (maybe). Is this a
> > worthwhile thing to do? If the answer is yes can it be done so the
> > diagrams are shared? Thoughts?
> > 
> > I will try to distribute something on data access by the close of day
> > today (UK time) albeit it'll be far from complete.
> > 
> > 			Mario
> > 
> > +-----------------------------------------------------------------------+
> > |Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   | 
> > |Tel:0131 650 5141|mario at epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
> > +-----------------------------------------------------------------------+
> -- 
> ------
> CERN, 1211 Geneva 23, Switzerland
> 
> 

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario at epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+




