[saga-rg] Comments on Strawman API
Graeme Pound
G.E.POUND at soton.ac.uk
Thu Dec 15 11:38:40 CST 2005
Andre,
Here are my comments on the SAGA strawman API v0.2. I would be very
interested to hear your opinions about my comments on the API.
I have in-lined the text below as well as attaching the document.
Thanks,
Graeme
--------------------------------------------------------
######### General comments
-1.1 The comments in this document stem from my experiences in
producing complete Java language bindings of the API (based upon the
strawman API 0.2), and a trivial Java implementation of SAGA.NameSpace,
SAGA.File and SAGA.JobManagement packages. Whilst this exercise was
reasonably straightforward, the comments in this document indicate
numerous points for discussion/clarification. I also expect that
implementations of the remaining namespaces (SAGA.LogicalFile,
SAGA.Stream and SAGA.Task) would raise additional comments about the API
in these areas. I believe that trial implementations will prove the best
way to debug this API before the draft is finalised.
-1.2 The API is inconsistent between different sections of the
document. For example, the interface for Session is defined as a
pre-requisite for the creation of all SAGA objects, however this is not
referred to in any other part of the API. Similarly strawman_task.txt
says that each object in the SAGA API must define a createTaskFactory()
method, but this is not included in the SIDL definition. Perhaps there
should be an individual with editorial control/responsibility for the
entire API?
-1.3 I understand that issues relating to multiple return values are
being deferred to the language bindings. However since this is an issue
for many languages, I suggest that the effects are minimised by tackling
it where appropriate in the language independent specification, for
examples see 3.21 and 5.7.
######### Queries
-2.1 What is the purpose of NSEntry? This is not documented.
-2.2 What is the purpose of Async? This is not documented.
-2.3 Are all relative paths relative to the user's home directory?
For example, JobDefinition.SAGA_JobCwd is described as being relative to
the home directory. Is this the same for physical (and logical) files?
######### Specific comments on the API
-3.1 Constructors involving Session objects should be defined for all
relevant SAGA classes. These classes should also specify an operation to
return a session object (i.e. getSession())?
-3.2 There should be a specified exception model for the elements of
the SAGA API that are not provided by an implementation. This is
described in the introduction of the API. May I suggest the SIDL
specification specify all methods that may throw a 'NotImplemented'
exception ('NotImplemented' is more consistent stylistically than
'NOT_IMPLEMENTED').
-3.3 There is no close() method defined for the File interface. The
user should be able to explicitly release a file.
-3.4 There are no exceptions defined for any of the interfaces in the
SAGA.JobManagement package. There are numerous possible exceptions
during the job submission process (aside from failed jobs), for example
if the requirements defined by JobDefinition cannot be met, or bad
user-parameters are provided. The SAGA API *MUST* define a basic set of
possible job management errors through which the implementation specific
exceptions that arise can be rethrown.
-3.5 Interfaces extending SAGA.Attribute (i.e. Context, JobDefinition,
JobExitStatus, JobInfo, Stream, StreamServer) pre-define attributes of a
specific type. However, the SAGA.Attribute class only supports values of
type String. At present users will have to cast the values themselves.
Should SAGA.Attribute be extended to support attribute values of
different types (a complementary/alternative solution is provided by
3.6)? In this case an exception type of IncorrectAttributeType should be
defined for the methods setAttribute() and setVectorAttribute() (?).
-3.6 It may be appropriate to define attribute specific get/set
methods for known attributes of interfaces extending SAGA.Attribute
(i.e. Context, JobDefinition, JobExitStatus, JobInfo, Stream,
StreamServer). For example JobExitStatus defines three known attributes:
SAGA_ExitCode
SAGA_Signaled
SAGA_Termsig
These convenience methods would expose the semantics of the interface and
would provide a typesafe way to handle these pre-defined attributes
(which have a specified type). For example:
void getExitCode(out integer exitcode);
void setExitCode(in integer exitcode);
Although perhaps setting values should be protected or handled by the
constructor.
This approach would also reduce the occurrence of runtime errors
resulting from typos in the attribute names (since these would be
detected at compile time). I had several of these whilst playing with my
toy implementation.
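To illustrate the idea, here is a minimal Java sketch of typed accessors layered over a string-only attribute store. The class names StringAttributes and TypedJobExitStatus, the constructor, and the use of IllegalArgumentException for a missing key are my own assumptions for illustration; only the three attribute names come from the API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for SAGA.Attribute: string keys and string values only.
class StringAttributes {
    private final Map<String, String> values = new HashMap<>();

    void setAttribute(String key, String value) {
        values.put(key, value);
    }

    String getAttribute(String key) {
        String value = values.get(key);
        if (value == null) {
            // Placeholder for the API's NoSuchKey exception.
            throw new IllegalArgumentException("NoSuchKey: " + key);
        }
        return value;
    }
}

// Hypothetical JobExitStatus: values are set once by the constructor and
// read back through typed accessors, so callers never cast or retype keys.
class TypedJobExitStatus extends StringAttributes {
    static final String EXIT_CODE = "SAGA_ExitCode";
    static final String SIGNALED = "SAGA_Signaled";
    static final String TERMSIG = "SAGA_Termsig";

    TypedJobExitStatus(int exitCode, boolean signaled, int termSig) {
        setAttribute(EXIT_CODE, Integer.toString(exitCode));
        setAttribute(SIGNALED, Boolean.toString(signaled));
        setAttribute(TERMSIG, Integer.toString(termSig));
    }

    int getExitCode() {
        return Integer.parseInt(getAttribute(EXIT_CODE));
    }

    boolean isSignaled() {
        return Boolean.parseBoolean(getAttribute(SIGNALED));
    }

    int getTermSig() {
        return Integer.parseInt(getAttribute(TERMSIG));
    }
}
```

A misspelt accessor such as getExitCod() fails at compile time, whereas a misspelt key passed to getAttribute() only fails at run time.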
-3.7 SAGA.NameSpace.NSDir defines method changeDir(), since the
directory can change there should be a method to query the current
directory (i.e. pwd()).
-3.8 A specific exception should be defined for Multiplexer.watch()
-3.9 Could the enumerations be defined as integers in the range
0-MAXVALUE? This makes it simpler to validate the values, for example
where contextType is defined as:
public final class contextType {
public final static int X509 = 0;
public final static int MyProxy = 1;
public final static int SSH = 2;
public final static int Kerberos = 3;
public final static int UserPass = 4;
public final static int MAXVALUE = 5;
}
a supplied integer value could be validated with the code:
if (type>=0 && type<SAGA.contextType.MAXVALUE)
This would require the following enumerations to be altered:
File.openFlags, LogicalFile.openFlags, NameSpace.copyFlags,
NameSpace.linkFlags, NameSpace.makeDirFlags, NameSpace.moveFlags,
NameSpace.removeFlags, Stream.ActivityType.
-3.10 "Any SAGA operation can throw a IncorrectSession exception": the
SIDL should specify which methods may throw this exception.
-3.11 All of the exceptions that may be thrown by methods must be
specified in the SIDL specification for that method.
-3.12 The Session objects should provide some method to allow them to
be serialised/deserialised. This would allow the user to submit jobs,
quit the application and resume at a later date. At the simplest save()
and load() methods could be defined.
In the world of Globus 2.4 all that you needed was the GRAM job
handle string, whilst this information is resource specific the SAGA API
should strive for the same simplicity. For example, imagine the
following scenario required to regain control of a job object (please
excuse the Java syntax):
// Create session with specific arguments
Session mysession = new Session(args);
// Create JobService with config defined by mysession
JobService js = new JobService(mysession);
// Submit job
Job myjob = js.runJob("localhost", "/bin/date");
// Get jobID
String myjobid = myjob.getJobID();
// Save the session to file
mysession.save("somefile.xml");

%%% Quit the application, and restart

// Load the session from file
Session mysession = Session.load("somefile.xml");
// Create JobService with original config defined by mysession
JobService js = new JobService(mysession);
// Create Job object corresponding to my job
Job myjob = js.getJob(myjobid);
// Do stuff
...
Of course in this example if the JobService had been
instantiated without the optional Session argument a default
configuration would have been used, and all that would be required to
recreate the Job object would be the jobID string.
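As a rough sketch of how simple save() and load() could be in Java, the following persists a set of string settings as XML using java.util.Properties. The class name SessionSnapshot and the idea that a session reduces to string key/value pairs are assumptions of mine; the strawman does not say what a Session actually holds.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Hypothetical session state: a flat set of string settings (an assumption).
class SessionSnapshot {
    private final Properties settings = new Properties();

    void set(String key, String value) {
        settings.setProperty(key, value);
    }

    String get(String key) {
        return settings.getProperty(key);
    }

    // Write the settings to an XML file.
    void save(File file) throws IOException {
        try (OutputStream out = new FileOutputStream(file)) {
            settings.storeToXML(out, "SAGA session (sketch)");
        }
    }

    // Rebuild a snapshot from a file written by save().
    static SessionSnapshot load(File file) throws IOException {
        SessionSnapshot snapshot = new SessionSnapshot();
        try (InputStream in = new FileInputStream(file)) {
            snapshot.settings.loadFromXML(in);
        }
        return snapshot;
    }
}
```

Credentials and open connections would clearly need more thought than this; the sketch only covers plain configuration values.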
-3.13 Could the Job interface provide the convenience function wait()
(or similar, i.e. waitUntilComplete() to prevent confusion with
Task.wait())? This would save the user the effort of scripting a polling
function, or similar. This would also be useful in situations where
explicit job status is not available until completion, for example when
forking a job using java.lang.Runtime you may only call waitFor() on a
process (there is no other simple way to determine whether it is complete).
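A minimal Java sketch of the two cases follows: a job whose state can be polled, and a forked job where java.lang.Process.waitFor() is the only completion signal. The names WaitableJob, PollingJob, ForkedJob and CountdownJob are hypothetical, and CountdownJob exists only to make the sketch testable.

```java
// Hypothetical interface carrying the proposed convenience method.
interface WaitableJob {
    void waitUntilComplete() throws InterruptedException;
}

// Case 1: job status can be queried, so the binding polls on the user's behalf.
abstract class PollingJob implements WaitableJob {
    private final long pollMillis;

    PollingJob(long pollMillis) {
        this.pollMillis = pollMillis;
    }

    abstract boolean isComplete();

    public void waitUntilComplete() throws InterruptedException {
        while (!isComplete()) {
            Thread.sleep(pollMillis);
        }
    }
}

// Case 2: a forked process, where blocking on waitFor() is the natural way
// to learn that the job has finished.
class ForkedJob implements WaitableJob {
    private final Process process;

    ForkedJob(Process process) {
        this.process = process;
    }

    public void waitUntilComplete() throws InterruptedException {
        process.waitFor();
    }
}

// Fake job that reports completion on the n-th status query.
class CountdownJob extends PollingJob {
    private int remaining;
    int polls = 0;

    CountdownJob(int pollsUntilDone) {
        super(1L);
        this.remaining = pollsUntilDone;
    }

    boolean isComplete() {
        polls++;
        return --remaining <= 0;
    }
}
```

In both cases the caller writes the same single line, job.waitUntilComplete(), regardless of how the implementation detects completion.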
-3.14 Methods of SAGA.Attribute throw the NoSuchKey exception if a key
is not present (rather than returning null, see 4.6). Would it be
possible to define a "boolean = hasKey(String key)" method? Actually,
it soon got pretty tedious handling NoSuchKey exceptions; perhaps it is
better for getAttribute() to return null (or perhaps not).
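For illustration, a hasKey() method might sit alongside the throwing getter as in the Java sketch below. AttributeStore is a hypothetical class, and modelling NoSuchKey as a checked exception is my assumption about the Java binding.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed to be a checked exception in the Java binding.
class NoSuchKey extends Exception {
    NoSuchKey(String key) {
        super("No such key: " + key);
    }
}

// Hypothetical attribute holder offering both styles of lookup.
class AttributeStore {
    private final Map<String, String> values = new HashMap<>();

    void setAttribute(String key, String value) {
        values.put(key, value);
    }

    // Lets the caller test for a key without catching an exception.
    boolean hasKey(String key) {
        return values.containsKey(key);
    }

    String getAttribute(String key) throws NoSuchKey {
        if (!values.containsKey(key)) {
            throw new NoSuchKey(key);
        }
        return values.get(key);
    }
}
```

With a checked exception the caller still needs a try/catch around getAttribute() even after calling hasKey(), which is part of why returning null may be the less tedious option.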
-3.15 It is not clear to me how the ErrorHandler interface corresponds
to the Java exception handling model. Which errors are intended to be
handled via this interface? Any serious errors I will throw immediately,
in my trial interface the error handler became the repository for crud
exceptions that could otherwise be ignored, which is wasted effort. I
suggest that the ErrorHandler interface is redundant in Java because it
is supplanted by the natural way to handle exceptions in this language.
The ErrorHandler may be appropriate in other languages, however in many
cases I would expect there already to be a tried and tested method for
dealing with errors. Rather than reinvent the wheel I would remove this
section of the API and leave this to the language bindings to handle
errors in the manner appropriate for each language. (see 3.20)
-3.16 Should NSDir.makeDir() return an NSDir object corresponding to
the created directory?
-3.17 The replication of the enumerations openDirFlags and openFlags
between the packages SAGA.Namespace, SAGA.File and SAGA.LogicalFile is a
little confusing. I understand that File and LogicalFile extend/override
these enumerations, is there a neat solution to this problem? - Ignore
this otherwise.
-3.18 How should permission denied exceptions be modelled by
SAGA.File.Directory? A new exception type is required: 'PermissionDenied'.
-3.19 NSDir.list(dir) should return a String array rather than a
String. Also would it not be preferable to have a NSDir.list() method to
return the contents of the directory?
-3.20 I am very dubious about the value of the SAGA.Task API within
the context of high-level languages. Where asynchronous method
invocation exists, supported either within the language (i.e. delegates
in C#) or documented design patterns (i.e. inner classes in Java), I
suggest that the semantics of the language should take precedence over
the semantics of the SAGA API (this also applies to exception handling
in high-level languages). I am not certain whether asynchronous method
invocation should simply be left to the language bindings, or be
overridden in the language bindings when accepted solutions already
exist. (see 3.15)
-3.21 JobService.runJob() returns Job and stdin, stdout, stderr.
However stdin, stdout, stderr are available from Job.getJobInfo()? I
understand that this is a convenience function, but this is a prime
example of the multiple return values problem. Can runJob() simply
return an object Job?
-3.22 I dislike the capitalisation of namespaces in the API, in
particular where the namespace is the same as a class that it contains,
for example;
SAGA.File.File
SAGA.LogicalFile.LogicalFile
SAGA.Stream.Stream
SAGA.Task.Task
This is merely a stylistic point, however it would be clearer if the
namespace elements were changed to lowercase, for example;
SAGA.file.File
######### Comments on the Strawman API document
-4.1 The following methods of File are not documented in
strawman_files.txt:
void readP (in pattern pattern,
out string buffer,
out long len_out );
void writeP (in pattern pattern,
in string buffer,
out long len_out );
void lsEModes (out array<string,1> emodes );
void readE (in string emode,
in string spec,
out string buffer,
out long len_out );
void writeE (in string emode,
in string spec,
in string buffer,
out long len_out );
-4.2 The interface Exception does not need to extend
sidl.SIDLException, since you do not intend the language bindings to be
built with Babel.
-4.3 In the details of Job.signal() signum is incorrectly listed as an
output
-4.4 The following methods are not described in strawman_namespace.txt:
void getURL (out string name );
void getName (out string name );
-4.5 Details of Multiplexer.wait() differ from the SIDL spec.
-4.6 Details of Attribute.getVectorAttribute() specifies that it will
return NULL if key is not present, but also specifies NoSuchKey
exception. This should be consistent with the other methods of this class.
-4.7 Enumeration values of JobState are not provided
-4.8 JobState enumeration values are inconsistently capitalised within
the API documentation. For example, DoneOk is referred to as DONE_OK.
-4.9 Exceptions are not described in the SIDL interface, however
Exception types are listed in long description of the methods (for
example ReadOnlyAttribute and NoSuchKey). This information should be in
the SIDL description as it is fundamental to the interface.
-4.10 Enumerations are inconsistently capitalised. Capitalise ivec,
openDirFlags, openFlags?
-4.11 LogicalFile argument types not specified for addLocation,
removeLocation, replicate. These should be 'string'
-4.12 The interfaces Stream and StreamServer implement Attributes,
however Attributes does not exist. This should be Attribute (although
perhaps Attributes is a better name for this interface)?
-4.15 File.readP() and File.writeP() define an input argument of type
'pattern'. This type is not defined by the SIDL API.
-4.16 In SAGA.Stream.Stream
void getContext (out context info);
should read
void getContext (out Context info);
-4.17 strawman_error.txt: "All objects in SAGA implement
ErrorHandler...", the SIDL for these objects should indicate that they
implement the ErrorHandler interface. The following objects:
Directory
File
Job
JobService
LogicalDirectory
LogicalFile
Multiplexer
Stream
StreamServer
Task
TaskContainer
... any others ?
######### Comments on Java language bindings
-5.1 JobInfo: the Java stream classes InputStream and OutputStream are
used in place of opaque
-5.2 Mapping SAGA exceptions into Java exception model:
SAGA.Exception extends java.lang.Exception
the constructor would allow the ExceptionCategory to be defined
SAGA.Exception(String message, ExceptionCategory category)
and other defined Exception types would then subclass SAGA.Exception
(at present all exceptions are defined in the root SAGA package,
with a little effort they could be categorised throughout the package
structure)
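A minimal sketch of this mapping is given below. The base class is named SagaException here only to avoid clashing with java.lang.Exception inside one file, and the ExceptionCategory values are placeholders of mine, not values taken from the API.

```java
// Placeholder categories; the real values would come from the SAGA spec.
enum ExceptionCategory { USAGE, RESOURCE, INTERNAL }

// Root of the hierarchy, standing in for SAGA.Exception.
class SagaException extends Exception {
    private final ExceptionCategory category;

    SagaException(String message, ExceptionCategory category) {
        super(message);
        this.category = category;
    }

    ExceptionCategory getCategory() {
        return category;
    }
}

// Specific exception types subclass the root and fix their own category.
class NotImplemented extends SagaException {
    NotImplemented(String operation) {
        super(operation + " is not implemented", ExceptionCategory.INTERNAL);
    }
}

class PermissionDenied extends SagaException {
    PermissionDenied(String path) {
        super("Permission denied: " + path, ExceptionCategory.RESOURCE);
    }
}
```

Callers can then catch SagaException for generic handling, or a specific subclass such as PermissionDenied where they can react to it.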
-5.3 Enumerations are supplied as a class containing constant field
values as public static ints. This is the simplest solution, however a
typesafe solution could be provided if required.
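For comparison, a typesafe version of contextType could look like the following sketch, which uses the enum construct available from Java 5. The type name ContextType and the helper fromInt() are hypothetical; the constants and their order follow the integer values listed in 3.9.

```java
// Typesafe alternative to public static int constants.
enum ContextType {
    X509, MyProxy, SSH, Kerberos, UserPass;

    // Maps an integer from the int-based binding onto the enum, rejecting
    // anything outside 0..(number of constants - 1).
    static ContextType fromInt(int value) {
        ContextType[] all = values();
        if (value < 0 || value >= all.length) {
            throw new IllegalArgumentException("Invalid contextType: " + value);
        }
        return all[value];
    }
}
```

A method that takes a ContextType parameter cannot be handed an out-of-range value, so the range check is only needed where integers enter from outside.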
-5.4 The API is defined as a collection of interfaces that may be
implemented to provide access to Grid resources. Interfaces are the most
suitable solution. However the inability to describe constructors in a
Java interface is problematic since constructors are occasionally
defined in SIDL specification. There is no way to make implementing
classes honour a specific constructor signature, instead the
documentation for the interface should suggest constructors that should
be provided by any implementing class.
-5.5 Implementations could be provided for generic classes, for example
Attribute and Context, to save developers from re-implementing them.
Obviously implementation specific subclasses of these would
have to be defined as 'abstract' classes. Rather I think that an
interface only solution is simpler and more flexible in the long term,
default implementations of generic classes could still be provided.
-5.6 Single out arguments become the method return type.
-5.7 Multiple out arguments must be worked around (hopefully there will
be as few as possible in the API). For example, in defining the
interface for the File class the following method:
void read (in long len_in,
out string buffer,
out long len_out );
became:
java.lang.String read(long len_in)
since it is straightforward to determine the length of a String.
-5.8 Relative paths are relative to the user's home directory, they should
not be relative to the elusive Java current working directory. Therefore
it is necessary to perform a mapping using java.io.File objects (see the
article 'Java Applications and the "Current Directory"'
http://www.devx.com/tips/Tip/13804).
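The mapping can be sketched in a few lines of Java. The class name HomeRelativePaths and the method resolve() are hypothetical; the sketch assumes the standard "user.home" system property identifies the home directory.

```java
import java.io.File;

// Resolves SAGA-style relative paths against the user's home directory
// instead of the JVM's current working directory.
final class HomeRelativePaths {
    private HomeRelativePaths() {
    }

    static File resolve(String path) {
        File file = new File(path);
        if (file.isAbsolute()) {
            // Absolute paths are passed through unchanged.
            return file;
        }
        return new File(System.getProperty("user.home"), path);
    }
}
```

For example, resolve("data/in.txt") yields a file under the home directory no matter where the JVM was started.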
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: strawman_notes.txt
Url: http://www.ogf.org/pipermail/saga-rg/attachments/20051215/50cf8199/attachment-0003.txt