[saga-rg] Comments on Strawman API

Thu Feb 2 06:32:22 CST 2006

Quoting [Graeme Pound] (Feb 02 2006):
> 
> Andre & Shantenu,
> 
> Thanks for the reply. I have added comments to some of the points below.
> 
> It is interesting that several of the issues that I raise are specific 
> to Java [and other languages]. I think that it is valuable to tease out 
> these distinctions. It would be useful to define where the language 
> bindings may (or are expected to) differ from the SAGA specification.

Yes, very useful! :-)

> At present I am [fairly] faithful to the SAGA specification in my 
> preliminary Java bindings. I will continue development informed by your 
> comments, and attempt to capture any deviation from the spec in a document.

That would be very welcome!!

More below, 

  Andre.

> Graeme
> 
> 
> >>  -2.1 What is the purpose of NSEntry? This is not documented.
> >
> >Right, that should be better documented.
> >The purposes are:
> >
> >  - NSEntry should be a class, not an interface.  So
> >    Physical and Logical files can both be handled as
> >    NSEntries.
> >
> >  - There will be methods on NSEntry: get_parent, get_name,
> >    is_file etc.  
> 
> In the spec NSDir is described as implementing NSEntry. Is that also 
> correct?

Yes.  The idea is that a directory has entries.  These
entries can be directories.
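
A rough sketch of that relationship, in case it helps.  Only
get_parent, get_name and is_file are taken from the list
above; the class names ns_entry and ns_dir, the constructor
signature and the add()/size() helpers are made up for
illustration and are not what the spec says:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class ns_dir;  // forward declaration: an entry points to its parent dir

// Hypothetical base class: every name space entry has a name and a parent
class ns_entry {
public:
  ns_entry(std::string name, const ns_dir* parent = nullptr)
    : name_(std::move(name)), parent_(parent) {}
  virtual ~ns_entry() = default;

  const std::string& get_name()   const { return name_; }
  const ns_dir*      get_parent() const { return parent_; }
  virtual bool       is_file()    const { return true; }

private:
  std::string   name_;
  const ns_dir* parent_;
};

// A directory is itself an entry, and owns further entries,
// some of which may again be directories
class ns_dir : public ns_entry {
public:
  using ns_entry::ns_entry;  // reuse the (name, parent) constructor

  bool is_file() const override { return false; }

  // add() is an illustrative helper, not a SAGA call
  ns_entry& add(std::unique_ptr<ns_entry> e) {
    entries_.push_back(std::move(e));
    return *entries_.back();
  }

  std::size_t size() const { return entries_.size(); }

private:
  std::vector<std::unique_ptr<ns_entry>> entries_;
};
```

Since ns_dir derives from ns_entry, a directory can be stored
in another directory like any other entry.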

> >>  -3.2 There should be a specified exception model for the elements of 
> >>the SAGA API that are not provided by an implementation. This is 
> >>described in the introduction of the API. May I suggest the SIDL 
> >>specification specify all methods that may throw a 'NotImplemented' 
> >>exception ('NotImplemented' is more consistent stylistically than 
> >>'NOT_IMPLEMENTED').
> >
> >It is difficult to specify which methods can be left unimplemented, as
> >that highly depends on the middleware your implementation builds upon.
> >E.g. middleware one might not have logical files, but middleware two
> >might have no support for streams.  Or MW-1 can copy files, but not
> >obtain their size (ftp), while others can get size of the file, but
> >can't write (http in some configurations).
> >
> >So, we say in the Intro that every implementable call must be
> >implemented.  Now, implementable means a lot, of course - what we mean
> >(which possibly should be made more clear), is "every operation
> >natively supported by the middleware".  Does that make sense?
> 
> I still think that this exception is important. When providing an
> implementation the developers will very quickly encounter methods that
> cannot be provided. It is necessary to handle this in a sensible 
> fashion, particularly when the client is not developing for a specific 
> implementation. My thinking on this subject stems from an interest in 
> selecting the SAGA implementation at runtime (depending on the resource 
> selected by the user).
> 
> My naive solution would be to indicate that [almost?] every method may 
> throw a 'NotImplemented' exception. When an implementation cannot 
> implement a method this exception should be thrown by that method; in 
> addition to documenting the fact that the method is not supported.

Yes, right, that is the way it should be I think.

Hmm, that is also the way the spec should state the affair,
obviously it does a bad job :-)  I'll check that.

> The developer writing client code against a specific implementation 
> would use the documentation to determine which methods are supported. 
> Generic client code written against the SAGA API would have to catch and 
> handle the NotImplemented exception in an appropriate manner.
> 
> Perhaps this is another Java specific problem. I define the Java 
> bindings of the API as interfaces which specify all of the methods which 
> an implementation must provide. Therefore the implemented classes will 
> end up providing methods that cannot work, at which point an error 
> should be thrown.

No, I don't think it's Java specific.  In C++, we provide
header files as language bindings, and implementations need
to, well, implement the classes defined in these headers -
so they need to be able to throw a NotImplemented on some
methods as well.  Same situation I think.
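
To make that concrete, a minimal sketch under some
assumptions: the names not_implemented, file, ftp_file and
size_or_unknown are all hypothetical, and the "ftp can copy
but not obtain the size" backend just mirrors the example
further up.  It is not code from our implementation:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type for calls the middleware cannot support
struct not_implemented : std::runtime_error {
  explicit not_implemented(const std::string& method)
    : std::runtime_error(method + ": not implemented") {}
};

// Stand-in for a class declared in the language binding header
class file {
public:
  virtual ~file() = default;
  virtual long get_size() = 0;
  virtual void copy(const std::string& target) = 0;
};

// Made-up ftp-like backend: it can copy, but cannot report a size
class ftp_file : public file {
public:
  long get_size() override { throw not_implemented("file::get_size"); }
  void copy(const std::string&) override { ++copies_; }  // pretend to copy
  int  copies() const { return copies_; }

private:
  int copies_ = 0;
};

// Generic client code, written against the binding only:
// it reports -1 when the backend cannot provide the size
long size_or_unknown(file& f) {
  try {
    return f.get_size();
  } catch (const not_implemented&) {
    return -1;
  }
}
```

The client never names ftp_file, so the same code would work
against an implementation selected at runtime.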

> >>These convenience methods would expose the semantics of the interface and 
> >>would provide a typesafe way to handle these pre-defined attributes 
> >>(which have a specified type). For example;
> >>     void getExitCode(out integer exitcode);
> >>     void setExitCode(in integer exitcode);
> >>Although perhaps setting values should be protected or handled by the 
> >>constructor.
> >>	This approach would also reduce the occurrence of runtime errors 
> >>resulting from typos in the attribute names (since these would be 
> >>detected at compile time). I had several of these whilst playing with my 
> >>toy implementation.
> >
> >I think the bottom line is that we are aware of the shortcomings of
> >the attribute approach, but think it's the most simple and flexible way
> >right now (but not the safest for sure).
> >
> >Also, we added introspection to the interface, which should allow us to
> >catch some of the cases you list (but not all, and it does not make the
> >interface simpler, really).
> 
> I understand the desire to keep the API simple, however my preference 
> would certainly be to extend the API to keep the client's code simple. 
> Casting to and from strings is a pain, and any additional complexity in 
> the client's code is an opportunity for [unnecessary] bugs to be introduced.
> 
> The get and set methods could exist over the Attribute interface to 
> provide some structure in addition to the existing flexibility. 

You mean that both the attribute interface and the get/set
version should be provided?

> Constructors allow type-safe setting of attributes, without the 
> transparent ability to retrieve the value.

I certainly understand (and share) your points.  However, we
have had that discussion in earlier meetings, and opted for
the attribute way, well aware of the shortcomings.  So it
would need quite some consensus in the group to change that
approach.  Maybe that is something to re-discuss in
Athens...
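
For the discussion, here is a sketch of how I read the
"both" variant: typed get/set methods layered on top of the
string based attribute interface.  Everything in it is an
assumption made for illustration - the class names, the
method names, and the attribute key "ExitCode":

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Simplified stand-in for the string based attribute interface
class attribute {
public:
  void set_attribute(const std::string& key, const std::string& value) {
    attrs_[key] = value;
  }

  std::string get_attribute(const std::string& key) const {
    auto it = attrs_.find(key);
    if (it == attrs_.end())
      throw std::out_of_range("no such attribute: " + key);
    return it->second;
  }

private:
  std::map<std::string, std::string> attrs_;
};

// Typed convenience methods on top: the key is spelled once, here,
// and the string conversion is hidden from the client
class job_info : public attribute {
public:
  void set_exit_code(int code) { set_attribute("ExitCode", std::to_string(code)); }
  int  get_exit_code() const   { return std::stoi(get_attribute("ExitCode")); }
};
```

A typo in get_exit_code would then be a compile time error,
while a typo in the key passed to get_attribute still only
shows up at runtime.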

> >>  -3.17 The replication of the enumerations openDirFlags and openFlags 
> >>between the packages SAGA.Namespace, SAGA.File and SAGA.LogicalFile is a 
> >>little confusing. I understand that File and LogicalFile extend/override 
> >>these enumerations, is there a neat solution to this problem? - Ignore 
> >>this otherwise.
> >
> >We do not know a good solution to the various enums, which are
> >distributed over various name spaces.  In cpp this is somewhat
> >painful, really, I guess it's worse in C.  Any idea?
> >
> 
> In an earlier email you indicated that you had removed the enumerations 
> from SAGA.NameSpace, this is sufficient to resolve the problem for me.

Uhm, well, you know... Right now they are back in namespace,
due to the fact that namespace was moved from interface to
class.  At least those flags which are used for methods
(e.g. Recursive for delete) need to be there, but probably
also the open flags.

So, the issue is actually surfacing again I'm afraid.
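
To show where it hurts in C++, one possible layout (a sketch
only, not what our code does): Recursive is the flag
mentioned above, all other flag names, the numeric values,
and the sketch/name_space/file name spaces are assumptions.

```cpp
namespace sketch {

namespace name_space {
  // flags needed by the name space methods themselves
  enum flags { None = 0, Overwrite = 1, Recursive = 2 };
}

namespace file {
  // C++ enums cannot be extended, so the file package has to
  // repeat the name space flags with identical values ...
  enum flags {
    None      = name_space::None,
    Overwrite = name_space::Overwrite,
    Recursive = name_space::Recursive,
    // ... before it can add its own
    Truncate  = 4,
    Append    = 8
  };
}

// Takes an int, so that or-ed combinations from either enum can be passed
inline bool is_recursive(int f) {
  return (f & name_space::Recursive) != 0;
}

}  // namespace sketch
```

The repeated values have to be kept in sync by hand, which is
exactly the duplication Graeme found confusing.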

> >>  -3.18 How should permission denied exceptions be modelled by 
> >>SAGA.File.Directory? A new exception type required; 'PermissionDenied'.
> >
> >Right, thanks.  A remark: we left permissions mostly out of the spec
> >right now, as we don't know which security paradigms should be
> >reflected in the API at all (what is a user again?).  We wait for the
> >security area to give us input on that...
> >
> 
> There are other areas in which authentication and authorisation 
> exceptions should be expected.

Good point!  Actually, that might be the most generic way to
handle security for now: allow all methods to throw an
authentication/authorisation exception...

> >>  -3.20 I am very dubious about the value of the SAGA.Task API within 
> >>the context of high-level languages. Where asynchronous method 
> >>invocation exists, supported either within the language (i.e. delegates 
> >>in C#) or documented design patterns (i.e. inner classes in Java), I 
> >>suggest that the semantics of the language should take precedence over 
> >>the semantics of the SAGA API (this also applies to the Exception handling 
> >>in high-level languages). I am not certain whether asynchronous method 
> >>invocation should simply be left to the language bindings, or be 
> >>overridden in the language bindings when accepted solutions already 
> >>exist. (see 3.15)
> >
> >Ah, java speaking ;-)  Believe me, in other languages you DO want an
> >async API, and DON'T want to code threads all the time...
> >
> >I am not sure if we should allow some languages not to implement tasks
> >- async notification would then be gone as well, and also
> >task_containers!  Opinions?
> >
> 
> It may be better for the language bindings to specify a preferred 
> solution for languages where an existing solution exists.

Out of curiosity (and please excuse my Java ignorance), how
would the async copy of multiple files look in Java?  

In C++ it would be:

  ----------------------------------------------------------
  saga::directory      d ("gsiftp://ftp.remote.net//data/");
  saga::task_container tc;

  saga::task t_1 = d.copy <saga::ASync> ("file_1.dat", ".");
  saga::task t_2 = d.copy <saga::ASync> ("file_2.dat", ".");
  saga::task t_3 = d.copy <saga::ASync> ("file_3.dat", ".");

  tc.add (t_1);
  tc.add (t_2);
  tc.add (t_3);

  tc.wait (-1);

  // we are done here
  ----------------------------------------------------------

  or somewhat shorter:

  ----------------------------------------------------------
  saga::directory      d ("gsiftp://ftp.remote.net//data/");
  saga::task_container tc;

  tc.add (d.copy <saga::ASync> ("file_1.dat", "."));
  tc.add (d.copy <saga::ASync> ("file_2.dat", "."));
  tc.add (d.copy <saga::ASync> ("file_3.dat", "."));

  tc.wait (-1);

  // we are done here
  ----------------------------------------------------------

> >>  -3.21 JobService.runJob() returns Job and stdin, stdout, stderr. 
> >>However stdin, stdout, stderr are available from Job.getJobInfo()? I 
> >>understand that this is a convenience function but this is a prime 
> >>example of the multiple return values problem. Can runJob() simply 
> >>return an object Job?
> >
> >Yes, but the convenience of that convenience call is exactly that it
> >simplifies the most common case: running a remote job, and controlling
> >its STDIO.  Otherwise, we could use submit_job...
> >
> >It's the only obvious call with multiple output parameters I think,
> >apart from the reads (no chance of change there I guess).  
> 
> This is not a big issue for me. I have further comments on the semantics 
> of the read() methods that may resolve this problem.

ok *uff* :-)

> >>  -3.22 I dislike the capitalisation of namespaces in the API, in 
> >>particular where the namespace is the same as a class that it contains, 
> >>for example;
> >>	SAGA.File.File
> >>          SAGA.LogicalFile.LogicalFile
> >>          SAGA.Stream.Stream
> >>          SAGA.Task.Task
> >>This is merely a stylistic point, however it would be clearer if the 
> >>namespace elements were changed to lowercase, for example;
> >>	SAGA.file.File
> >
> >I think that's a language issue - in CPP we have all name spaces
> >lowercase, as that seems the natural way to do it: saga::file
> >
> >OTOH, we have _not_ implemented the SIDL packages as name spaces in
> >cpp.  Hmm, I am not sure if that should be done, really: as you point
> >out, this will usually merely increase naming conflicts.
> >
> >I am not sure about this point though...
> 
> I think that I will alter the Java language bindings to meet the 
> stylistic expectations of Java users ;)

Good! :-)

> >>  -5.5 Implementations could be provided for generic classes to prevent 
> >>developers from needing to re-implement, for example; Attribute, 
> >>Context. Obviously implementation specific subclasses of these would 
> >>have to be defined as 'abstract' classes. Rather I think that an 
> >>interface only solution is simpler and more flexible in the long term, 
> >>default implementations of generic classes could still be provided.
> >
> >See above: the spec talks about classes by now.  Do you think that
> >this limits flexibility overly much?  In our opinion, the implementors
> >of SAGA have the same flexibility as before, and the users of SAGA
> >don't need it, really.  What do you think?
> 
> I am not clear about the interface/class distinction in SIDL.
> 
> In the Java world interfaces are more *fashionable* than class 
> inheritance. There is little to be gained from abstract classes apart 
> from tying developers to code defined by the language bindings. It is 
> also harder to introduce a bug into a Java interface.
> 
> The purpose of the SAGA API is to define an interface. The 
> implementation developers are not tied to an implementation of the 
> utility classes, and the users should not care.

I am not sure about the implications either, but one thing
which keeps popping up is that Java interfaces can't have
constructors.  However, the API specifies a number of
constructors, so, in Java, that would defy the interface
only approach - how do you handle that?

And further we tried to label as interface where we did not
expect an instance to appear in the application API.  E.g.
attribute is an interface implemented by a number of classes
- but there won't ever be an instance of type Attribute in
the API.

That might be a somewhat simplistic approach, however, it
seemed a useful distinction for the API spec.
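
In C++ terms the distinction would look roughly like the
sketch below.  Attribute and Context are the names from
point 5.5; the method names, the "Type" key and the
constructor argument are assumptions for illustration:

```cpp
#include <map>
#include <string>
#include <type_traits>

// "interface" in the spec sense: no constructor in the API,
// and no instance of this type ever appears on its own
class attribute {
public:
  virtual ~attribute() = default;
  virtual void        set_attribute(const std::string& key,
                                    const std::string& value) = 0;
  virtual std::string get_attribute(const std::string& key) const = 0;
};

// "class" in the spec sense: has a constructor, can be instantiated,
// and implements the attribute interface
class context : public attribute {
public:
  explicit context(const std::string& type) { attrs_["Type"] = type; }

  void set_attribute(const std::string& key,
                     const std::string& value) override {
    attrs_[key] = value;
  }

  // returns an empty string for unknown keys (a simplification)
  std::string get_attribute(const std::string& key) const override {
    auto it = attrs_.find(key);
    return it == attrs_.end() ? std::string() : it->second;
  }

private:
  std::map<std::string, std::string> attrs_;
};

// The compiler enforces the "never an instance of type Attribute" rule
static_assert(std::is_abstract<attribute>::value,
              "attribute must not be instantiable");
```

Application code would construct a context, and only ever see
attribute as a base of some concrete class.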

Cheers, Andre.

-- 
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+