[SAGA-RG] core spec errata
Andre Merzky
andre at merzky.net
Mon Mar 2 16:37:05 CST 2009
Here is some food for discussion at Wednesday's SAGA session:
We have addressed most of the reported errors and mistakes in
the core spec by now, apart from four. These are listed below.
Please excuse the length of this mail. I would appreciate
if people could read it anyway, so that we don't need to
fully describe all items in the SAGA sessions again.
1: - Page 33, figure 2: Picture has no block for URL, URL is nowhere
to be found. iovec and parameter are in the wrong block (should
not be in io, but in file and rpc)
This does not need any discussion, but simply needs to be done.
2: - job::service needs to have its rm URL as attribute, as
that URL is needed if more job services are to be
created in the same security domain, for example.
Otherwise, a job_service created with no URL can never
be reconnected to, e.g. to find previously run jobs.
A code example would be:
{
  saga::job::service js_1;
  js_1.run_job ("/bin/sleep 1000");
}
{
  saga::job::service js_2;
  std::list <std::string> ids = js_2.list_jobs ();
  // std::list has no operator[], so walk it with an iterator
  std::list <std::string>::iterator it;
  for ( it = ids.begin (); it != ids.end (); ++it )
  {
    saga::job::job j = js_2.get_job (*it);
  }
}
In this example, it is not guaranteed that the job from
the first code block is amongst the ones found in the
second block, as the default js constructor may very well
connect to a different backend. Further, the user has
no means to learn about the js URL, to reconnect later
on.
A fix *could* look like:
saga::url u;
{
  saga::job::service js_1;
  js_1.run_job ("/bin/sleep 1000");
  u = js_1.get_attribute ("url");
}
{
  saga::job::service js_2 (u);
  std::list <std::string> ids = js_2.list_jobs ();
  // std::list has no operator[], so walk it with an iterator
  std::list <std::string>::iterator it;
  for ( it = ids.begin (); it != ids.end (); ++it )
  {
    saga::job::job j = js_2.get_job (*it);
  }
}
Please note that the js URL SHOULD be part of the job
id, but that is not a hard requirement on SAGA
implementations.
Also, the example above is certainly, well, dumb. But
we met a couple of use cases which make it sufficiently
painful to keep track of jobs to ask for this to be
fixed.
3: - context c'tor should not call set_defaults: that
significantly complicates usage if default ctx cannot
be initialized.
You have probably seen the lengthy email exchange between
Ceriel and me on this list. No other opinions have been
forthcoming so far, but we need to get this item closed
ASAP.
So, a short summary:
The original behaviour (calling set_defaults in the
c'tor) failed for:
saga::context c ("globus");
c.set_attribute ("UserProxy", "...");
c.set_defaults ();
If no default globus context can be created, a globus
context cannot be created at all (set_defaults, and thus
the c'tor, would throw).
So, we resolved that by not calling set_defaults in the
c'tor, which got the above working as it should.
Ceriel (and to some extent also Steve) are arguing that
set_defaults is not needed in the API at all, but that
errors on context initialization should be thrown on the
first remote operation where that context is used.
I counter-argued that this is an ill-defined point, and
that I'd rather prefer to have a fixed point in the code
where I can expect the SAGA implementation to signal
errors in the context initialization. Yes, errors in
context usage can still occur later on, but those are
well defined.
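
To make the 'fixed point' argument concrete, here is a minimal
stand-alone sketch. It is a mock, not the SAGA API: sketch::context,
the std::runtime_error it throws, and the rule that a context is usable
only once "UserProxy" is set are all assumptions made for illustration.
The mock also pretends that no default proxy exists on the host.

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace sketch   // hypothetical mock, not the real saga namespace
{
  class context
  {
    public:
      // The c'tor only records the type. It does not look for defaults,
      // so it cannot fail because defaults are missing.
      explicit context (const std::string & type) : type_ (type) {}

      void set_attribute (const std::string & key, const std::string & val)
      {
        attribs_[key] = val;
      }

      // The one fixed point where initialization errors are signalled.
      // Assumption: this mock host has no default proxy, so the call
      // fails unless the application has set "UserProxy" itself.
      void set_defaults ()
      {
        if ( attribs_.count ("UserProxy") == 0 )
        {
          throw std::runtime_error ("no default " + type_ + " context available");
        }
      }

    private:
      std::string                         type_;
      std::map <std::string, std::string> attribs_;
  };
}
```

With this shape, creating the context always succeeds, and the
application knows that set_defaults () is the call to wrap in a try
block.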
4: - As suggested earlier, the algorithm to find the most
specific error has been changed to ignore the
NotImplemented error as long as there were other errors
thrown. NotImplemented will be reported only if there
are 'only' NotImplemented errors in the error list.
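
The changed rule can be sketched as below. This is only an illustration
under assumptions: error_kind, backend_error and pick_error are
hypothetical names, not taken from the spec, and the enum lists only the
three errors mentioned in this mail. Among the errors that are not
NotImplemented the sketch simply takes the first one collected, since
that ordering is exactly what is under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical error kinds; only the ones named in this mail.
enum error_kind { NotImplemented, DoesNotExist, PermissionDenied };

// Hypothetical record of one error reported by one backend.
struct backend_error
{
  error_kind  kind;
  std::string backend;   // e.g. "ftp" or "www"
  std::string message;
};

// Select the error to report: NotImplemented is ignored as long as any
// other error was collected. The choice among the remaining errors
// (here: first one wins) is an arbitrary placeholder.
backend_error pick_error (const std::vector <backend_error> & errors)
{
  assert (! errors.empty ());

  for ( std::size_t i = 0; i < errors.size (); i++ )
  {
    if ( errors[i].kind != NotImplemented )
    {
      return errors[i];
    }
  }

  // only NotImplemented errors were collected, so report that
  return errors[0];
}
```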
That errata item led to a lengthy discussion on several
mail threads. Below is a summary of those.
Issue:
------
The exception precedence list in the spec does not make sense:
(a) the NotImplemented exception is actually the least informative
one, and should be at the *end* of the list.
(b) for late binding implementations, and for implementations with
multiple backends in general, it is very difficult to determine
generically which exception is more interesting to the end user.
Problem example:
----------------
Assume an implementation of the SAGA file API binds (late) to HTTP
and FTP.
Assume the following setup: on host a.b.c, an http server (http
root = /var/www) and an ftp server (ftp root /var/ftp) are
running, using the same credentials for access.
The following files exist, and are owned by root (permissions in brackets):
  URL                   (rwx)
  /var/www/etc/         (x--)
  /var/www/etc/passwd   (xxx)
  /var/www/usw/         (xxx)
  /var/ftp/etc/         (xxx)
  /var/ftp/usw/         (x--)
  /var/ftp/usw/passwd   (xxx)
Assume a SAGA application wants to open any://a.b.c/etc/passwd
for reading. The WWW backend will throw PermissionDenied, the FTP
backend will throw DoesNotExist.
Both exceptions are correct. There are valid use cases for either
exception to be the more specific, and thus, in the spec's
argumentation, the more dominant one.
Further, upon accessing any://a.b.c/usw/passwd, the situation is exactly
reversed. Of course, the implementation will have no means to deduce the
intention of the application, and to decide that suddenly the exception from
the other backend is more useful.
Diagnosis:
----------
The root of the problem is the ability of SAGA to be implemented with late
binding. Any binding to a single middleware will result in exactly one
error condition, which is to be forwarded to the application. Also,
implementations with early bindings can (and indeed will) focus on
exceptions which originate from the middleware bound for that
specific object, and will again be able to report exactly one error
condition. (Note that for early binding implementations, the initial
operation which causes the implementation to bind to one specific
middleware is prone to the same exception ordering problem.)
So it is mostly for late binding implementations that this issue arises,
when several backends report errors concurrently, but the standard error
reporting mechanisms in most languages default to reporting exactly one error
condition.
Conclusion:
-----------
A global, predefined ordering of exceptions will be impossible, or at least
arbitrary. The native error reporting facilities of most languages will by
definition be inadequate to report the full error information of late
binding SAGA implementations.
That leaves SAGA language bindings with three possibilities:
(a) introduce potentially non-native error reporting mechanisms
saga::filesystem::file f ("any://a.b.c/etc/passwd");
std::list <saga::exception> el = f.get_exceptions ();
// handle all backend exceptions
// (std::list has no operator[], so walk it with an iterator)
std::list <saga::exception>::iterator it;
for ( it = el.begin (); it != el.end (); ++it )
{
  try
  {
    throw *it;
  }
  catch ( saga::exception::DoesNotExist )
  {
    // handle exception from ftp backend
  }
  catch ( saga::exception::PermissionDenied )
  {
    // handle exception from www backend
  }
}
(b) acknowledge the described limitation, document it, and stick to the
native error reporting mechanism
try
{
  saga::filesystem::file f ("any://a.b.c/etc/passwd");
}
catch ( saga::exception::DoesNotExist )
{
  // handle exception from ftp backend
}
catch ( saga::exception::PermissionDenied )
{
  // handle exception from www backend (which will not be forwarded in
  // our example, so this will never be called)
}
(c) a mixture of (a) and (b), with (b) as default.
try
{
  saga::filesystem::file f ("any://a.b.c/etc/passwd");
}
catch ( saga::exception::DoesNotExist )
{
  // handle top level exception
}
catch ( saga::exception e)
{
  // handle all backend exceptions
  // (std::list has no operator[], so walk it with an iterator)
  std::list <saga::exception> el = e.get_all_exceptions ();
  std::list <saga::exception>::iterator it;
  for ( it = el.begin (); it != el.end (); ++it )
  {
    try
    {
      throw *it;
    }
    catch ( saga::exception::DoesNotExist )
    {
      // handle exception from ftp backend
    }
    catch ( saga::exception::PermissionDenied )
    {
      // handle exception from www backend
    }
  }
}
Note that (c) may not be possible in all languages.
Discussion C++ bindings:
------------------------
C++ is actually able to implement (c). The C++ bindings would then
introduce a saga::exception class, and the respective sub classes, which
represent the 'most informative/specific' exception. How exactly the 'most
informative/specific' exception is selected from multiple concurrent
implementations is left to the implementation, and cannot sensibly be
prescribed by the specification nor the language binding, as discussed
above. (The spec could propose such a selection algorithm though).
However, the saga::exception class could have the additional ability to
expose the full set of backend exceptions, for example as a list:
std::list <saga::exception> saga::exception::get_all_exceptions ();
Further, it would be advisable (for all language bindings actually) to
include all error messages from all backend exceptions into the error
message of the top level exception (this is already implemented in the
CCT's C++ implementation):
catch ( saga::exception e )
{
  std::cerr << e.what ();
  // print the following message:
  //   exception (top level): DoesNotExist
  //   exception (ftp binding): DoesNotExist - /etc/passwd does not exist
  //   exception (www binding): PermissionDenied - access to /etc denied
}
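
For illustration, a top level exception that carries its backend
exceptions and folds their messages into what () could be sketched as
follows. This is a stand-alone mock under assumptions, not the CCT
implementation: sketch::exception, its constructor arguments and
add_backend_exception () are hypothetical; only get_all_exceptions ()
and the message layout are taken from the text above. It needs C++11
for noexcept/override, and relies on std::list accepting the enclosing
class as element type (guaranteed from C++17 on).

```cpp
#include <exception>
#include <list>
#include <string>

namespace sketch   // hypothetical mock, not the real saga namespace
{
  class exception : public std::exception
  {
    public:
      // origin: e.g. "top level" or "ftp binding"; kind: e.g. "DoesNotExist"
      exception (const std::string & origin,
                 const std::string & kind,
                 const std::string & msg)
        : text_ ("exception (" + origin + "): " + kind)
      {
        if ( ! msg.empty () )
        {
          text_ += " - " + msg;
        }
      }

      // Attach one backend exception, and append its message to ours.
      void add_backend_exception (const exception & e)
      {
        all_.push_back (e);
        text_ += "\n";
        text_ += e.what ();
      }

      // Full message: top level line, then one line per backend.
      const char * what () const noexcept override
      {
        return text_.c_str ();
      }

      // Expose the full set of backend exceptions.
      std::list <exception> get_all_exceptions () const
      {
        return all_;
      }

    private:
      std::string           text_;
      std::list <exception> all_;
  };
}
```

An application which only looks at what () still sees every backend
message, while one which cares can iterate over get_all_exceptions ().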
Best, Andre.
--
Nothing is ever easy.