[saga-rg] Comments on Strawman API

Graeme Pound G.E.POUND at soton.ac.uk
Thu Dec 15 11:38:40 CST 2005


Andre,

Here are my comments on the SAGA strawman API v0.2. I would be very 
interested to hear your opinions about my comments on the API.

I have in-lined the text below as well as attaching the document.

Thanks,
Graeme

--------------------------------------------------------
######### General comments

  -1.1 The comments in this document stem from my experiences in 
producing complete Java language bindings of the API (based upon the 
strawman API 0.2), and a trivial Java implementation of SAGA.NameSpace, 
SAGA.File and SAGA.JobManagement packages. Whilst this exercise was 
reasonable straightforward, the comments in this document indicate 
numerous points for discussion/clarification. I also expect that 
implementations of the remaining namespaces (SAGA.LogicalFile, 
SAGA.Stream and SAGA.Task) would raise additional comments about the API 
in these areas. I believe that trial implementations will prove the best 
way to debug this API before the draft is finalised.

  -1.2 The API is inconsistent between different sections of the 
document. For example, the interface for Session is defined as a 
pre-requisite for the creation of all SAGA objects, however this is not 
referred to in any other part of the API. Similarly strawman_task.txt 
says that each object in the SAGA API must define a createTaskFactory() 
method, but this is not included in the SIDL definition. Perhaps there 
should be an individual with editorial control/responsibility for the 
entire API?

  -1.3 I understand that issues relating to multiple return values are 
being deferred to the language bindings. However since this is an issue 
for many language I suggest that the effects are minimised by tackling 
it where appropriate in the language independent specification, for 
examples see 3.21 and 5.7.

######### Queries

  -2.1 What is the purpose of NSEntry? This is not documented.

  -2.2 What is the purpose of Async? This is not documented.

  -2.3 Are all relative paths relative to the user's home directory? 
For example, JobDefinition.SAGA_JobCwd is described as being relative to 
the home directory. Is this the same for physical (and logical) files?


######### Specific comments on the API

  -3.1 Constructors involving Session objects should be defined for all 
relevant SAGA classes. These classes should also specify an operation to 
return a session object (i.e. getSession())?

  -3.2 There should be a specified exception model for the elements of 
the SAGA API that are not provided by an implementation. This is 
described in the introduction of the API. May I suggest the SIDL 
specification specify all methods that may throw a 'NotImplemented' 
exception ('NotImplemented' is more consistent stylistically than 
'NOT_IMPLEMENTED').

  -3.3 There is no close() method defined for the File interface. The 
user should be able to explicitly release a file.

  -3.4 There are no exceptions defined for any of the interfaces in the 
SAGA.JobManagement package. There are numerous possible exceptions 
during the job submission process (aside from failed jobs), for example 
if the requirements defined by JobDefinition cannot be met, or bad 
user-parameters are provided. The SAGA API *MUST* define a basic set of 
possible job management errors through which the implementation specific 
exceptions that arise can be rethrown.

  -3.5 Interfaces extending SAGA.Attribute (i.e. Context, JobDefinition, 
JobExitStatus, JobInfo, Stream, StreamServer) pre-define attributes of a 
specific type. However, the SAGA.Attribute class only supports values of 
type String. At present users will have to cast the values themselves. 
Should SAGA.Attribute be extended to support attribute values of 
different types (a complementary/alternative solution is provided by 
3.6)? In this case an exception type of IncorrectAttributeType should be 
defined for the methods setAttribute() and setVectorAttribute() (?).

  -3.6 It may be appropriate to define attribute specific get/set 
methods for known attributes of interfaces extending SAGA.Attribute 
(i.e. Context, JobDefinition, JobExitStatus, JobInfo, Stream, 
StreamServer). For example JobExitStatus defines three known attributes:
     SAGA_ExitCode
     SAGA_Signaled
     SAGA_Termsig
These convenience methods would expose the semantics of interface and 
would provide a typesafe way to handle these pre-define attributes 
(which have a specified type). For example;
     void getExitCode(out integer exitcode);
     void setExitCode(in integer exitcode);
Although perhaps setting values should be protected or handled by the 
constructor.
	This approach would also reduce the occurrence of runtime errors 
resulting from typos in the attribute names (since these would be 
detected at compile time). I had several of these whilst playing with my 
toy implementation.

  -3.7 SAGA.NameSpace.NSDir defines method changeDir(), since the 
directory can change there should be a method to query the current 
directory (i.e. pwd()).

  -3.8 A specific exception should be defined for Multiplexer.watch()

  -3.9 Could the enumerations be defined as integers in the range 
0-MAXVALUE? This makes it simpler to validate the values, for example 
where contextType is defined as:
         public final class contextType {
             public final static int X509            = 0;
             public final static int MyProxy         = 1;
             public final static int SSH             = 2;
             public final static int Kerberos        = 3;
             public final static int UserPass        = 4;
             public final static int MAXVALUE        = 5;
         }
     a supplied integer value could be validated with the code:
         if (type>=0 && type<SAGA.contextType.MAXVALUE)
     This would require the following enumerations to be altered: 
File.openFlags, LogicalFile.openFlags, NameSpace.copyFlags, 
NameSpace.linkFlags, NameSpace.makeDirFlags, NameSpace.moveFlags, 
NameSpace.removeFlags, Stream.ActivityType.

  -3.10 "Any SAGA operation can throw a IncorrectSession exception" the 
SIDL should specify method which may throw this exception.

  -3.11 All of the exceptions that may be thrown by methods must be 
specified in the SIDL specification for that method.

  -3.12 The Session objects should provide some method to allow them to 
be serialised/deserialised. This would allow the user to submit jobs, 
quit the application and resume at a later date. At the simplest save() 
and load() methods could be defined.
        In the world of Globus 2.4 all that you needed was the GRAM job 
handle string, whilst this information is resource specific the SAGA API 
should strive for the same simplicity. For example, imagine the 
following scenario required to regain control of a job object (please 
excuse the Java syntax):

            Session mysession = new Session(args); 
//Create session with specific arguments
            JobService js = new JobService(mysession); 
//Create JobService with config defined by mysession
            Job myjob = js.runJob( "localhost", "\bin\date" ); 
//Submit job
            String myjobid = js.getJobID();                      //Get jobID
            mysession.save("somefile.xml");                      //Save 
the session to file

            %%% Quit the application, and restart
            Session mysession = Session.load("somefile.xml");    //Load 
the session from file
            JobService js = new JobService(mysession); 
//Create JobService with original config defined by mysession
            Job myjob = js.getJob(myjobid); 
//Create Job object corresponding to my job
            ...                                                  //Do stuff

         Of course in this example if the JobService had been 
instantiated without the optional Session argument a default 
configuration would have been used, and all that would be required to 
recreate the Job object would be the jobID string.

  -3.13 Could the Job interface provide the convenience function wait() 
(or similar, i.e. waitUntilComplete() to prevent confusion with 
Task.wait())? This would save user effort of scripting a polling 
function, or similar. This would also be useful in situations where 
explicit job status is not available until completion, for example when 
forking a job using java.lang.Runtime you may only call waitFor() on a 
process (there is no other simple way to determine whether it is complete).

  -3.14 Methods of SAGA.Attibute throw the NoSuchKey exception if a key 
is not present (rather than returning null, see 4.6), would it be 
possible to define a "boolean = hasKey(String key)" method?  Actually, 
it soon got pretty tedious handling NoSuchKey exceptions, perhaps it 
better for getAttribute() to return null (or perhaps not).

  -3.15 It is not clear to me how the ErrorHandler interface corresponds 
to the Java exception handling model. Which errors are intended to be 
handled via this interface? Any serious errors I will throw immediately, 
in my trial interface the error handler became the repository for crud 
exceptions that could otherwise be ignored, which is wasted effort. I 
suggest that the ErrorHandler interface is redundant in Java because it 
is supplanted by the natural way to handle exceptions in this language. 
The ErrorHandler may be appropriate in other languages, however in many 
cases I would expected there already to be a tried and tested method for 
dealing with errors. Rather than reinvent the wheel I would remove this 
section of the API and leave this to the language bindings to handle 
errors in the manner appropriate for each language. (see 3.20)

  -3.16 Should NSDir.makeDir() return an object NSdir corresponding to 
the created directory?

  -3.17 The replication of the enumerations openDirFlags and openFlags 
between the packages SAGA.Namespace, SAGA.File and SAGA.LogicalFile is a 
little confusing. I understand that File and LogicalFile extend\override 
these enumerations, is there a neat solution to this problem? - Ignore 
this otherwise.

  -3.18 How should permission denied exceptions be modelled by 
SAGA.File.Directory? A new exception type required; 'PermissionDenied'.

  -3.19 NSDir.list(dir) should return a String array rather than a 
String. Also would it not be preferable to have a NSDir.list() method to 
return the contents of the directory?

  -3.20 I am very dubious about the value of the SAGA.Task API within 
the context of high-level languages. Where asynchronous method 
invocation exists, supported either within the language (i.e. delegates 
in C#) or documented design patterns (i.e. inner classes in Java), I 
suggest that the semantics of the language should take precedence over 
the semantics of the SAGA API (this also applies the Exception handling 
in high-level languages). I am not certain whether asynchronous method 
invocation should simply be left to the language bindings, or be 
overridden in the language bindings when accepted solutions already 
exist. (see 3.15)

  -3.21 JobService.runJob() returns Job and stdin, stdout, stderr. 
However stdin, stdout, stderr are available from Job.getJobInfo()? I 
understand that this is a convenience functions but this is a prime 
example of the multiple return values problem. Can runJob() simply 
return an object Job?

  -3.22 I dislike that capitalisation of namespaces in the API, in 
particular where the namespace is the same as a class that it contains, 
for example;
	SAGA.File.File
          SAGA.LogicalFile.LogicalFile
          SAGA.Stream.Stream
          SAGA.Task.Task
This is merely a stylistic point, however it would be clearer is the 
namespace elements were changed to lowercase, for example;
	SAGA.file.File


######### Comments on the Strawman API document

  -4.1 The following methods of File are not documented in 
strawman_files.txt:
         void readP       (in  pattern         pattern,
                           out string          buffer,
                           out long            len_out  );
         void writeP      (in  pattern         pattern,
                           in  string          buffer,
                           out long            len_out  );
         void lsEModes    (out array<string,1> emodes   );
         void readE       (in  string          emode,
                           in  string          spec,
                           out string          buffer,
                           out long            len_out  );
         void writeE      (in  string          emode,
                           in  string          spec,
                           in  string          buffer,
                           out long            len_out  );

  -4.2 The interface Exception does not need to extend 
sidl.SIDLException. You do not intend to build language bindings to be 
built with Babel.

  -4.3 In the details of Job.signal() signum is incorrectly listed as an 
output

  -4.4 The following methods are not described in strawman_namespace.txt:
         void getURL      (out string               name   );
         void getName     (out string               name   );

  -4.5 Details of Multiplexer.wait() differs from SIDL spec.

  -4.6 Details of Attribute.getVectorAttribute() specifies that it will 
return NULL if key is not present, but also specifies NoSuchKey 
exception. This should be consistent with the other methods of this class.

  -4.7 Enumeration values of JobState are not provided

  -4.8 JobState enumeration values are inconsistently capitalised within 
the API documentation. For example, DoneOk is referred to as DONE_OK.

  -4.9 Exceptions are not described in the SIDL interface, however 
Exception types are listed in long description of the methods (for 
example ReadOnlyAttribute and NoSuchKey). This information should be in 
the SIDL description as it is fundamental to the interface.

  -4.10 Enumerations are inconsistently capitalised. Capitalise ivec, 
openDirFlags, openFlags?

  -4.11 LogicalFile argument types not specified for addLocation, 
removeLocation, replicate. These should be 'string'

  -4.12 The interfaces Stream and StreamServer implements Attributes, 
however Attributes does not exist. This should be Attribute (although 
perhaps Attributes is a better name for this interface)?

  -4.15 File.readP() and File.writeP() define input argument of type 
'pattern'. This type is not defined by the SIDL API.

  -4.16 In SAGA.Stream.Stream
    void getContext (out context info);
  should read
    void getContext (out Context info);

  -4.17 strawman_error.txt: "All objects in SAGA implement 
ErrorHandler...", the SIDL for these objects should indicate that they 
implement the ErrorHandler interface. The following objects:
    Directory
    File
    Job
    JobService
    LogicalDirectory
    LogicalFile
    Multiplexer
    Stream
    StreamServer
    Task
    TaskContainer
    ... any others ?


######### Comments on Java language bindings

  -5.1 JobInfo; the Java stream classes InputStream and OutputStream 
used in place of opaque

  -5.2 Mapping SAGA exceptions into Java exception model:
         SAGA.Exception extends java.lang.Exception
     the constructor would allow the ExceptionCategory to be defined
         SAGA.Exception(String message, ExceptionCategory category)
     and other defined Exception types would then subclass SAGA.Exception
    (at present all exceptions are defined in the root SAGA package, 
with a little effort they could be categorised throughout the package 
structure)

  -5.3 Enumerations are supplied as a class containing constant field 
values as public static ints. This is the simplest solution, however a 
typesafe solution could be provided if required.

  -5.4 The API is defined as a collection of interfaces that may be 
implemented to provide access to Grid resources. Interfaces are the most 
suitable solution. However the inability to describe constructors in a 
Java interface is problematic since constructors are occasionally 
defined in SIDL specification. There is no way to make implementing 
classes honour a specific constructor signature, instead the 
documentation for the interface should suggest constructors that should 
be provided by any implementing class.

  -5.5 Implementations could be provided for generic classes to prevent 
developers from needing to re-implementing, for example; Attribute, 
Context. Obviously implementation specific subclasses of these would 
have to be defined as 'abstract' classes. Rather I think that an 
interface only solution is simpler and more flexible in the long term, 
default implementations of generic classes could still be provided.

  -5.6 Single out arguments become the method return type.

  -5.7 Mutiple out arguments must be worked around (hopefully there will 
be as few as possible in the API). For example, in defining the 
interface for the File class the following method:
          void read        (in  long            len_in,
                            out string          buffer,
                            out long            len_out  );
     became:
          java.lang.String read(long len_in)
     since it is straightforward to determine the length of a String.

  -5.8 Relative paths are relative to user's home directory, they should 
not be relative to the elusive Java current working directory. Therefore 
it is necessary to perform a mapping using java.io.File objects (see the 
article 'Java Applications and the "Current Directory"' 
http://www.devx.com/tips/Tip/13804).




-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: strawman_notes.txt
Url: http://www.ogf.org/pipermail/saga-rg/attachments/20051215/50cf8199/attachment-0003.txt 


More information about the saga-rg mailing list