[DAIS-WG] Notes from DAIS session at OGF22

neil p chue hong N.ChueHong at epcc.ed.ac.uk
Wed Feb 27 16:36:26 CST 2008


Notes from the DAIS Working Group Session at OGF22
==================================================

February 27th, 2008. Cambridge, Massachusetts.

Status Update (Mario Antonioletti)
----------------------------------

Dave Pearson resigned at last OGF. Mario Antonioletti and Isao Kojima are
the new chairs.

Mario summarised the history of the group.

It was noted that the rechartering process to include the RDF work had not
completed. Mario will raise this with the Data Area Directors; it was noted
that a Data AD was not present in the room.

Mario summarised the current status of implementations of the DAI
specifications:
DAIX in OGSA-DAI and at Ohio State University
DAIR in OGSA-DAI, GRelC, and AMGA


OSU implementation of DAIX (Amy Krause)
---------------------------------------

Amy Krause presented some slides on behalf of the group at Ohio State
University who attempted to implement the DAIX specification using their
Introduce tool to create services which plugged into their Mobius database.

Issues included problems with the way that Introduce expected the datatypes
to be defined, and the fact that the faults do not extend WS-BaseFaults.
However, the main issue was a lack of available effort.

A discussion ensued around where the datatype definitions should be placed,
and it was agreed that Mario would get in touch with OSU to determine why
they did not simply try altering the WSDL files to see if that worked.


A WS-DAI implementation with OGSA-DAI (Elias Theocharopoulos)
-------------------------------------------------------------

Elias presented the work done on a DAIX implementation. He summarised the
WS-DAI operations supported and contrasted the OGSA-DAI and WS-DAI
architectures. He noted memory overhead and JDBC row retrieval issues.
Are these major flaws in the spec which prevent efficient implementations?

Mario asked whether you could count the number of results without
retrieving them, e.g. by executing SELECT COUNT(*) FROM ... on the database
to get the number of rows in the result set. But that would mean running
the query a second time, which could be expensive.
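The trade-off Mario described can be sketched using Python's sqlite3 module
as a stand-in for a JDBC-backed relational resource (the table and column
names here are illustrative, not from the spec):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (v INTEGER)")
conn.executemany("INSERT INTO results VALUES (?)", [(i,) for i in range(1000)])

# Option 1: ask the database for the size without transferring any rows.
# This runs a second query, which may be expensive for a complex SELECT.
(count,) = conn.execute("SELECT COUNT(*) FROM results").fetchone()

# Option 2: retrieve everything and count client-side -- no second query,
# but the full memory/transfer cost is paid up front.
rows = conn.execute("SELECT v FROM results").fetchall()
assert count == len(rows)
```

Neither option gives the result-set size for free, which is the nub of the
scalability concern raised in the session.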

Would it make sense to make some of the elements (e.g. result size)
optional rather than mandatory, to allow better scalability? Are these
issues rooted in the specification, or are they limitations of JDBC
(e.g. can you open multiple JDBC connections with the same state)? Access
to multiple result sets from the same prepared statement is not allowed
when using JDBC. It was agreed that this needs to be clarified and a
resolution agreed.
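The JDBC limitation mentioned above (only one open result set per
statement) has a close analogue in Python's sqlite3 cursors, which can
illustrate the problem without a Java stack; this is a sketch, not the
spec's behaviour:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

cur.execute("SELECT x FROM t ORDER BY x")
first = cur.fetchone()          # start reading the first result set

# Re-executing on the same cursor (here, to count the rows) discards the
# rest of the first result set -- analogous to JDBC's rule that only one
# ResultSet per Statement may be open at a time.
cur.execute("SELECT COUNT(*) FROM t")
count = cur.fetchone()[0]

leftover = cur.fetchall()       # the half-read first result set is gone
```

A service that needs both the row count and the rows from one prepared
statement therefore has to either buffer results or open a second
statement/connection, which is the cost being debated.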

There was a discussion of how potential changes would be fed back: via
experience documents, via errata or via a spec revision.

There was a discussion of where the NotAuthorizedFault is thrown (it may be
raised before the actual operation is invoked, depending on the container's
implementation of security) and whether the fault may be obsolete. This
should be discussed to see if the fault needs to become optional, and a
tracker item should be raised.

Everyone appears to violate the opacity of the reference parameters of a
WS-EPR.

Elias noted that there were ambiguities in the SubcollectionName URI over
whether it should be absolute or relative. We need to see how others would
choose to implement this. There was a discussion of whether things should
be kept consistent, or whether they should reflect the expected behaviour.

Elias noted an issue with subcollection operations when parent collections
do not exist: should the parent tree be created automatically, or should a
fault be thrown? This is not clear from the spec. Mario felt that a fault
should be thrown if intermediate collections do not exist, as this is the
more sensible default; better not to have implicit creation.

On RemoveSubcollection faults, there isn't enough scope in the
InvalidCollectionFault to indicate which collection caused the fault. Is it
a problem in practice for the client that you can't express which
collection didn't exist? Mario gave a counter-example where you would not
want to be told that something had been "deleted" when in fact it hadn't,
e.g. if sensitive information was there.

There was a disagreement between "spec purists", who felt there was enough
information in the fault, and developers, who felt that in practice this
meant you might have to repeatedly retry operations to identify what went
wrong. It was agreed that the fault structure should be discussed on the
mailing list and on the telcon.

On the AddDocument operation there was ambiguity over one of the responses:
the missing parameter in the InvalidCollectionFault does not allow more
information to be provided about the fault. There was also an issue over
overwriting documents: it is not clear in the spec what the flag means. The
group agreed that the first reading, "the document WAS overwritten", is the
correct response.
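The agreed reading of the flag can be sketched as a return value meaning
"an existing document was overwritten" (as opposed to "overwriting is
permitted"); the function and dict-based collection here are illustrative
only, not the spec's interface:

```python
def add_document(collection: dict, name: str, document: str) -> bool:
    """Store the document; per the agreed reading, the returned flag is
    True only when an existing document WAS overwritten."""
    overwritten = name in collection
    collection[name] = document
    return overwritten

coll = {}
first = add_document(coll, "doc1", "<a/>")   # new document: flag is False
second = add_document(coll, "doc1", "<b/>")  # replacement: flag is True
```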

Elias went over a number of issues related to the use of OGSA-DAI.


RDF(S) Querying (Isao Kojima + Mirza Said)
------------------------------------------

Isao and Said summarised the work that had been done on the querying of
RDF(S) data.

It was noted that these use cases could serve as the basis of use cases to
be provided to SAGA.

Mario asked whether the OGSA Glossary editor (Jem Treadwell) had been kept
informed of the glossary work by UPM and AIST. This is important now that
the OGSA Data effort has reduced. The glossary will be published in the
motivational document for the RDF work; Mario suggested that they should
also contribute it to OGSA.

Mario was confused about what is meant when things are considered synonyms.
This will be clarified offline. 

It was noted that there is a bigger issue with the adoption of WS-Naming and
the use of EPIs.  This needs to be clarified with the OGSA group.

RDF(S) Ontology (Miguel Esteban Gutierrez)
-----------------------------------------

Miguel summarised the work that had been done on the ontology.

Mario had an issue with UPM coming up with a set of write operations on
their own (the read capabilities are provided by SPARQL).

Mario asked whether, for every RDF data resource, you have to map your API
to the underlying way of updating the RDF store.

Mario noted the confusion arising from the use of the term "native
interfaces" on the slides, which actually refers to an abstraction over
native interfaces (e.g. the APIs of Jena and Sesame). Perhaps "primitive
interfaces" would be a better term.

A discussion arose about why, when this work on profiles seems very good,
there weren't more people at the session. One issue is that the proposal
was not mature enough when it was put to the semantic web community. The
main added value is that we are providing service interfaces to RDF and
also an abstract API over many RDF implementations. This latter work, Mario
felt, could be very important to the community, but it really needs better
socialisation by the group. Could these operations be implemented as an
API, rather than as web service operations?

The group agreed that there was a lot still to be discussed!

-- 
Neil P Chue Hong            | T: [+44] (0)131 650 5957
Director, OMII-UK           | F: [+44] (0)131 650 6555
Rm 2409, JCMB, Mayfield Rd. | E: N.ChueHong at epcc.ed.ac.uk
Edinburgh, EH9 3JZ, UK      | W: http://www.omii.ac.uk      

  "A film is like a battleground. It's love, hate, action,
   violence, death - in a word, emotion." - Sam Fuller 
