[dais-wg] Telcon Minutes from 19/04/05.

Wed Apr 20 06:00:57 CDT 2005

DAIS Telcon - 19/04/05
======================

Chair: Norman Paton
Notes: Mario Antonioletti

Attendees:
----------
	   Dave Pearson, Oracle
	   Norman Paton, University of Manchester
	   Mario Antonioletti, EPCC
	   Simon Laws, IBM
	   Savas Parastatidis, University of Newcastle
	   Amy Krause, EPCC
	   Dave Berry, NeSC
	   Susan Malaika, IBM
	   Allen Luniewski, IBM

Agenda:
-------
	Specification Issues; 
	OGSA Data Architecture; 
	Mappings

Actions:
--------

[Norman] Create a list of options for update of disconnected data sets.

[Simon] Write up the top level IO operation to go in the core spec.

[Mario/Simon/Amy] Look at DAIS issue 1200 and either decide what needs
                  to done to resolve this and respond accordingly on
                  Grid Forge or flag it as an issue that needs to be
                  discussed at some future telcon.

[Mario] Continue working on the DAIS contribution to the data
        architecture.

[Susan] Look the names and what influence the DMTF have on the XML
	attributes and elements being used for the resource properties
	in the CIM model and report back to the group.

Completed Actions
-----------------

[Simon] Post note to list on refined GGF13 proposal and relationship to 
        mapping approach (4).

	Done.

[Norman] Seek to ensure that relevant OGSA/WSRF people are following the 
         process.

	 Done.

+---

Specification Issues:

Want to ensure that there are no big open issues that we are failing
to address. There are 6 weeks to the document submission deadline.

In WS-DAI have 3 issues worthy of discussion:

977: Semantics of disconnected data sets

also 1370 - can you push data back in to the data resource. With
disconnected data sets we do not provide a mechanism to write back to
them.

Simon: that's not necessarily true. We don't prevent people from
       implementing any old interface.

Norman: ok, people can use extensibility but we don't specify
	operations for pushing data.

Dave Pearson: but we don't allow disconnected data sets to be reflected
	     back on to the data resource.

Norman: that's fair enough as that could be complicated. The other
	thing that we don't provide is the opportunity to push it
	somewhere else - as in a SaveInTableOfGivenName type of
	operation, or to save itself in a given location.

	The ability to write to a disconnected data set - can we
	specify operations analogous to the the get data or should we
	leave as is and allow other things to write to the data
	resource?

Simon: in theory - in the way you construct disconnected data sets -
       you could create an interface that could do this ...

Norman: but then it will not be standardised...

Simon: you could extend the service and use what ever you like or you
       could use interfaces that we already provide - you could use
       SQLExecuteFactory that allows you to pose a SQL select which is
       then disconnected but that would also have a SQLExecute on that
       that which would then provide you with the ability to do an
       update.

Dave Pearson: I was thinking along the lines... 

Mario: can we express the source of data that is separate from the
       messages?

Norman: no, we can't.

Simon: yes, we do not have a way of specifying this.

Dave Pearson: so we cannot support a bulk load procedure then?

Norman: yes, you could send a collection of insert operations - now,
	if we are seeing the disconnected data sets as a way of doing
	bulk transfer out of a data service then the obvious solution
	is to use this the other way round.  In which case what does
	one do? We allow data to go out but we don't provide any way
	of moving that data to the database. As a client you would
	have to extract the data and then do a bunch of inserts which
	is not very satisfactory.

Susan: JSR 114 specifies a way of updating a disconnected data set
       and to send them back. When people want to do major bulk loads
       they will generally use comma-separated-values so there are two
       or three ways that one could go about doing this.

Simon: I think we should note it and not put it in, as it's linked in
       with a bunch of stuff that we've looked at in the past...

Norman: there is the question about disconnected data sets and then
	there is another question about pushing the data back. Let's
	look at these.  A solution for some of these may be
	straightforward ... there is a lack of symmetry at the moment.

...

	We can let people use whatever they want - that's there
	already or we provide operations to put tuples into the data
	resource - that would be quite an easy thing ....

Susan: might be easy if you don't allow deltas, otherwise it might be
       difficult.

Dave Pearson: have to be careful as to what we mean by updates and
	     inserts - there may be issues about referential
	     integrity. The underlying data may be changed in the
	     meantime ... I don't think we should go down the path of
	     updates ....

Norman: this is one of the reasons why I'm trying to deal with things
	separately. Given that we are interacting with a disconnected
	artifact what are the analogues of the get operations - the
	put tuples? ...then there's the question, assuming that is
	fairly straightforward, how do you associate that with the
	database.  You can reflect back ... Susan says that JR114
	provides a way of doing this.

Susan: you can tell which are inserts/changes/deletes ... you can
       implement it how you like - how you lock is your business.

Norman: we could provide a reflect back operation which merely states
	that the reflect operation has the semantics as in JSR114 and
	then not specify much else. Then there is the StoreInSomePlace
	type of operation that effectively does a bunch of inserts -
	you give it the collection name and the table name ... that's
	another possibility and that too does not sound too difficult

Simon: so you are taking one data resource and tell it to combine
       itself with another data resource?

Norman: yes, avoids unnecessary communication.

Simon: it sounds like a simple and general operation that would
       provide real value. Not sure how we do it in DAIS - whether we
       do something specific to DAIS or propose something more
       general. It's tied into the transport mechanism....

Norman: so my take is that it should not be tied to the transport
        mechanism.

...

       We have a number of possibilities. The relevant functionalities
       are missing. Have looked at this before but shied away because
       the bulk loads tend to be database specific but maybe we can
       use the disconnected data set idea ....

Dave Pearson: if you wanted to do an update on disconnected data sets
	     you might use a bulk update ... you would want to have
	     the same update to be consistent across the whole data
	     set.

Norman: if one supports change on a derived artifact then you want to
	know what those are like ... if you use predicates then it
	begins to become a more sophisticated interface ... another
	approach is to have something quite simple. ... then that
	raises the question about delete as well.....

Dave Pearson: if people want to update data then they should do it
	     within the context of SQL within a database at least for
	     this initial release and if you want to do updates then
	     you should use the update syntax.

Norman: we could leave as is or look at the different
	capabilities. Can post issues to the list .....

*Norman takes an action to do this*

Susan: one of the things that compounds this is that any
       implementation would not have to parse the request and in this
       case we would have to do that ...

Norman: I did not have it in mind to do any parsing ...

Simon: if we came up with a list of the options then that would be
       useful.

Norman: next one is, top level default access... are we going to try
	to put the operation in the spec?

Simon: I think we should write it up. That would then go on the top
       level spec.

*Simon takes an action to do this*

Norman: Issue 1200 - inconsistent representation of properties in the
        specs.

Simon: the answer is probably yes, and we should have done it before.

* Simon, Mario and Amy take an action on this*

Norman: there are other things in the relational context there are two
	that we should touch base on ... 1997 the nature of the
	properties as place holders for the CIM name .... what is
	there may not eventually go in the DAIS specs.

Mario: We could specify the name space and not enumerate as we are
       beginning to do this ...

Susan: the XML for the CIM model is currently being generated...

Mario: we could just use the name space ...

Susan: that might work.

*Susan takes an action on this*

Norman: 1198 ... we can just fix this. On the XML spec there are 5
	outstanding issues .... are any of these closed?

Amy: ... there is 1086

Mario: this was me ... what are the semantics when you associate an
       XML schema with an existing collection of XML documents - do
       these have to validate? Moreover if this type of functionality
       is not supported by the underlying data resource do you then
       have to support it at the service level? The get out clause is
       that these operations are optional ...

Norman: it could fault if that capability is not provided. So it just
	needs to be added to the document and the issue closed. What
	about XUpdate?

Amy: we decided to keep that ...

Norman: inconsistent factory portTypes ...

Amy: that can be closed now.

Norman: any other issues Amy?

Amy: No.

Norman: Moving on to data architecture. Is there anything that DAIS
        can do for the data architecture? Dave Berry has asked for
        some text on DAIS.

Mario: I wrote some text which may have been too DAIS specific...

Dave Berry: would like to see something about the patterns used - what
	   interfaces are required for reading/writing - an overview
	   and then the specifics.

Norman: maybe you could use DAIS as a case study for identifying the
	level of detail that you require for the data architecture
	... it has properties identifying some of the semantics but
	not operations ...

Dave Berry: have an action to give more guidance on how to do
	   this. There are various people working on this at the
	   moment. So not quite accurate that WS-DAI would be the
	   template ....

Norman: ... just trying to be helpful. The fact that DAIS is silent on
	third party delivery means that there is a hook there that
	needs to be filled ... the data architecture document only has
	text and the only point where it is specific is that it lists
	properties but it does not clump behaviours in a diagrammatic
	fashion ... could use our experience for that ...

Dave Berry: that would be useful ...

Norman: that would require more detail in the DAIS section - to what
	level of depth you describe the properties and what do you do
	for the messages and the WSDL ....

Dave Berry: would be helpful to go round the loop a couple of times...

Allen: we do not want to copy masses of text from the other specs...

Norman: and there are specs that are missing, need to establish the
	level of detail that is going to be used .... Mario can keep
	the token ... Mario will evolve that and circulate round the
	usual suspects ... 

*Mario takes an action on this*

	are there any other things that should be doing?

Dave Berry: need people to contribute to the sections and then get
	   people to comment ... there's a meeting in London on May
	   26th - we can get more people to attend - a follow-on from
	   OGSA working group ... if anyone is interested please come
	   along.

...

Norman: move on to mappings ... since last week Simon posted a more
	detailed description between options 3 and 4 and Savas cobbled up
	some WSDL ... when I look through the WSDL I was looking for
	the relationship between option 3 and 4 but I did not see
	resource names being made available as opeartion parameters ...

Savas: there is a message called SQLQueryStorebyref .... this returns
       a string which is a reference to the stored web rowset. there
       webrowsetcount which accepts a name which is a name for the web
       rowset which was previously stored ....

Norman: there is no explicit name for the data resources though...

Savas: you could do that ....

Norman: what I'd like to get is an understanding the extent to which
	we can write the same spec for options 3 and 4 with the
	variation of where you put the MUSTs and SHOULDs in the specs
	so that they have largely the same message body .... with
	option 3 imposing some prioritisation on what you do in the
	implementation.

	If we were to do a type 3 proposal and did it in such a way
	that you could have type 4 - people preferring
	interoperability to flexibility - then this would mean a
	minimum change in the MUSTs and SHOULDs...

Simon: this hinges of the availability of the name in the message and
       whether you make this optional .... Savas can map a name to an
       EPR - have called this the conduit before. ...  These things
       used in combination could solve the problem but we've not put
       that in the same WSDL ....

Dave Berry: this notion of getting together at the ogsa face-to-face on
	   May 25th ... use WSRF and DAIS folks to discuss this topic.

Norman: we will need to decide very soon so we are likely to have made
	a decision before May but aspects of detail could be looked at
	in mid-May.  ...  It would be a good opportunity to air our
	approach.

*Discussion as to who could make it and whether this could be done*

...

        The question: given the explicit naming of resource how close
	can we make the message bodies.

Savas: all the DAIS messages can be identical whether you are using a
       WSRF/non-WSRF solution. The difference is in how you identify
       the resources in these messages. The name is going to be
       optional....

Norman: understand the name optionality ... but let's take the
	response in the factory case - you have defined it to be a
	string but in reality it is a choice of something - a string
	in one context or an EPR in another ...

Savas: you could have a choice or you could have an extra set of
       messages to convert the name into an EPR.

Norman: or the other way round ...

Savas: but if you do the other way round you break the WS-I Basic
       Profile assumption.

...

Norman: you are drawing the line further up stream by not including
        ws-addressing than I thought... what is the status of
        ws-addressing?

Savas: they are in the last call. Will become a recommendation in
       weeks or a couple of months if there are not major last call
       issues ... a recommendation and not a standard
       .... recommendations are considered stable in W3C.

Norman: have been slightly careless, there is the WSRF people and the
	non-WSRF people but I assumed the latter people would also use
	WS-Addressing ....

Savas: don't get me wrong - I like WS-Addressing but WS-I Basic
       Profile does not have WS-Addressing.

Norman: so we have pure WS-I, WS-I + WS-Addressing or WS-RF.

Simon: there is a subtlety in that point as we are returning something
       to return a disconnected data set ... we need something to
       abstract away ...

Dave Pearson: Tom Maguire suggested that you might need 2 trips to do
	     that resolution ... even if you are using WSRF.

...

Norman: the likely definition of a response type - what is that likely
        to be?

Simon: good question, we need to do some work. Dangerous to specify a
       new type - will take you a long time to standardise it. Don't
       want something arbitrary either though ...

Savas: the issue that Simon identified - given an EPR not knowing the
       underlying data resource, if you want to identify that, whether
       you use an EPR or a name, requires additional semantics - could
       use a URI that specifies the type but then you will have to
       identify the type - this is not possible with an EPR as that is
       opaque. You can however do it by having an additional message
       exchange ....

Simon: also attracted by having a structure that allows you to hold
       other things such as the type ... to avoid the naming scheme
       .... maybe that is something that we should revisit.

Savas: as it is a DAIS spec then you can do what you like as long as
       you don't expect other services to use it.

Norman: our hope is to produce a model that others feel comfortable
        with - on the way in we can define messages bodies with
        optional names - that seems ok ....  then can you have as
        little pain in the results in the context of the factory
        pattern ...

Savas: have my proposal, then you have your possibility with the
       compound element and then there is the choice but choices 
       are tricky programmatically ...

...

Dave Pearson: does the top level operation produce complications?

Simon: if we include the name as optional then we would have to
       ... this is a message which is generic across lots of different
       types of data resources .... it's not the same as your resolve
       operations - accept lots of messages to access the data ...

...

Norman: we need to get ourselves into a position where we understand
	the implications of the specs for defining a type but have a
	prioritisation for these messages - Savas has generated some
	WSDL and has shied away about putting in the optional names in
	the body ... now we are exploring the factory pattern - the
	result types for the factory pattern.

Simon: and how we can handle the optional names. Easy to put these
       into the specs if we know what these look like or how we avoid
       using a structure for them...

Savas: if it's a URI ....

...

Norman: a URI sounds ok for me ...

Savas: but it still does not give you any semantics ..

Norman: but we don't need any semantics ...

Savas: but if you don't need semantics you can just have a string ...

Simon: if we accept the name then it's resolving whether we have a two
       call structure for the factory or we have a complex
       data structure...

Savas: you could have something from a name space that is not
       standard...

Simon: my concern is about the 2 call operation is where you call to
       perform the resolve operation - the same service> ....

Savas: I assume that there is only one service .... I want to indicate
       that there is only one service and all you have to know the end
       point ....

Simon: I don't agree with that ... have to take into account where
       these do not have the same end point ... resilience for
       instance...

Savas: but then you have replication ...

Simon: or you could have versioning ....

Savas: I don't understand ... 

Norman: but the factory pattern allows flexibility as to which end
	point handles your results ...

Simon: or which interface handles your results... the end point
       required to handle the disconnected data resource could be
       different from that which was used to generate it ...

Norman: that sounds sensible......

Dave Pearson: does WSRF not require you to have a different end point?

Simon: you could have a different end point but the same service - it
       depends on what you mean by service.

...

Simon: it sounds like the thing to do is to work these issues into the
       WSDL and go round the loop again ... there is a danger in
       extending the WSDL into something that you cannot understand
       but then you can then reduce it to something you can understand
       ....

Savas: can add things to the WSDL - I can produce the WSDL if you let
       me know what you want ...

Norman: optionality and then whether you are dealing with a bridge or
	a complex type. This affects the types and what the bridging
	operations are

Savas: also the issue of resource properties ...

Norman: don't think that is tricky - when you are in WSRF you use the
	ResourceProperties and you use analogous messages when you are
	not ... in WS-I case you have to define some extra messages
	for lifetime and managing the resource properties ....

Simon: useful to use the same terminology ...

Norman: if DAIS is going to have a twin track want to make them as
	similar as possible.

        It would be useful to have examples the extended WSDL and to
	have a note of what the options are in the WSDL.

*Savas and Simon will continue the evolution of the WSDL and options
with Norman in the loop*

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario at epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+