[DRMAA-WG] New synchronization approach for DRMAA2
Andre Merzky
andre at merzky.net
Mon Aug 31 13:25:34 CDT 2009
FWIW, this matches the SAGA model, where we have a 'metric'
called job.state, for which callbacks can be registered.
Those fire whenever the job state changes.
Nice.
Best, Andre.
Quoting [Peter Tröger] (Aug 31 2009):
>
> Dear all,
>
> following up on the last phone conference (see the minutes), here
> is a possible realization of the new synchronization approach as an
> IDL snippet:
>
> --- snip ---
>
> enum DrmaaEvent {NEW_STATE_UNDETERMINED, NEW_STATE_QUEUED_ACTIVE,
> NEW_STATE_HOLD, NEW_STATE_RUNNING, NEW_STATE_SYSTEM_SUSPENDED,
> NEW_STATE_USER_SUSPENDED, NEW_STATE_USER_SYSTEM_SUSPENDED,
> NEW_STATE_DONE, NEW_STATE_FAILED, ... };
>
> interface DrmaaCallback {
>   void notify(in DrmaaEvent event, in Job job);
> };
>
> interface JobSession {
>   readonly attribute string contact;
>   void registerEventNotification(in DrmaaCallback callback)
>     raises (UnsupportedFeatureException, ...);
>   JobTemplate createJobTemplate();
>   void deleteJobTemplate(in DRMAA::JobTemplate jobTemplate);
>   Job runJob(in DRMAA::JobTemplate jobTemplate);
>   sequence<Job> runBulkJobs(in DRMAA::JobTemplate jobTemplate,
>     in long beginIndex, in long endIndex, in long step);
>   sequence<Job> waitAnyStarted(in sequence<Job> jobs,
>     in long long timeout);
>   sequence<Job> waitAnyTerminated(in sequence<Job> jobs,
>     in long long timeout);
>   ...
> };
>
> interface Job {
>   void suspend();
>   void resume();
>   void hold();
>   void release();
>   void terminate();
>   JobState getState(out native subState);
>   void waitStarted(in long long timeout);
>   void waitTerminated(in long long timeout);
>   JobInfo getInfo();
> };
>
> --- snip ---
>
> waitAnyStarted() would return as soon as any of the provided jobs is
> in one of the states RUNNING, SYSTEM_SUSPENDED, USER_SUSPENDED, or
> USER_SYSTEM_SUSPENDED. It returns the corresponding job(s) as result,
> which allows subsequent calls by the application with a reduced list.
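
To make the "reduced list" usage concrete, here is a minimal Python sketch.
Everything in it is an assumption for illustration: Job is a hypothetical
in-memory stand-in, poll is a hypothetical hook that refreshes job states,
and a bounded polling loop replaces the blocking wait (and timeout) a real
DRMAA library would implement.

```python
from dataclasses import dataclass

# States that count as "started", taken from the description above.
STARTED_STATES = {"RUNNING", "SYSTEM_SUSPENDED", "USER_SUSPENDED",
                  "USER_SYSTEM_SUSPENDED"}


@dataclass
class Job:
    # Hypothetical stand-in for a DRMAA2 Job; not a real binding class.
    name: str
    state: str = "QUEUED_ACTIVE"


def wait_any_started(jobs, poll, max_polls=100):
    """Return the jobs in `jobs` that are in a started state.

    `poll` is a hypothetical hook that refreshes the job states. A real
    library would block on the DRM system and honor a timeout instead
    of counting polls.
    """
    for _ in range(max_polls):
        started = [j for j in jobs if j.state in STARTED_STATES]
        if started:
            return started
        poll()
    return []


def drain(jobs, poll):
    """Call wait_any_started repeatedly, shrinking the list each time."""
    order = []
    pending = list(jobs)
    while pending:
        started = wait_any_started(pending, poll)
        if not started:
            break  # gave up: nothing started within max_polls
        order.extend(j.name for j in started)
        started_ids = {id(j) for j in started}
        pending = [j for j in pending if id(j) not in started_ids]
    return order
```

The application never has to track states itself: it just removes whatever
came back from the list and calls again.
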
>
> waitAnyTerminated() would return as soon as any of the provided jobs
> is in either the FAILED or the DONE state.
>
> waitStarted() and waitTerminated() on job level work in a similar way
> for only one job. The timeout parameter keeps the DRMAA1 semantics.
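
A rough Python sketch of a single-job wait with DRMAA1-style timeouts, in
case it helps the discussion. Assumptions: the timeout is in seconds with
the special values -1 (wait forever) and 0 (no wait), as in the DRMAA1 C
binding as far as I recall; get_state is a hypothetical callable that
returns the current job state as a string; ExitTimeout is a made-up
exception name; and the function returns the final state for convenience,
whereas the IDL above declares waitTerminated() as void.

```python
import time

# Special timeout values, assumed to match the DRMAA1 C binding.
TIMEOUT_WAIT_FOREVER = -1
TIMEOUT_NO_WAIT = 0

TERMINATED_STATES = {"DONE", "FAILED"}


class ExitTimeout(Exception):
    """Hypothetical exception raised when the timeout expires."""


def wait_terminated(get_state, timeout, interval=0.01):
    """Poll `get_state` until the job is DONE or FAILED.

    `timeout` is in seconds; TIMEOUT_WAIT_FOREVER never expires and
    TIMEOUT_NO_WAIT checks the state exactly once.
    """
    if timeout < 0 and timeout != TIMEOUT_WAIT_FOREVER:
        raise ValueError("timeout must be >= 0 or TIMEOUT_WAIT_FOREVER")
    if timeout == TIMEOUT_WAIT_FOREVER:
        deadline = None
    else:
        deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINATED_STATES:
            return state
        if deadline is not None and time.monotonic() >= deadline:
            raise ExitTimeout("job did not terminate in time")
        time.sleep(interval)
```

waitStarted() would look the same with the set of started states in place
of TERMINATED_STATES.
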
>
> JobSession::registerEventNotification() would accept the function
> pointer / object reference for the callback sink implemented by the
> application. This is the first time that we introduce an optional
> method in DRMAA, therefore we need the new UnsupportedFeatureException
> to signal that this function is not supported. The callback function
> signature is also standardized by the language binding
> (DrmaaCallback), so that all DRMAA libraries for one language can work
> with any application in a portable way. For the sake of portability,
> we also need to standardize the possible events then. New / other
> proposals for this enumeration (DrmaaEvent) are welcome.
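
For illustration, a small Python sketch of how the optional registration
could look from the application side. All names are hypothetical Python
renderings of the IDL above, not an existing binding: the supports_events
flag and the _state_changed hook are invented here to simulate a library
with and without the feature, and the enum only lists the events named in
the snippet, since the proposal leaves the list open.

```python
from enum import Enum, auto


class DrmaaEvent(Enum):
    # Only the events named in the IDL snippet; the list is still open.
    NEW_STATE_UNDETERMINED = auto()
    NEW_STATE_QUEUED_ACTIVE = auto()
    NEW_STATE_HOLD = auto()
    NEW_STATE_RUNNING = auto()
    NEW_STATE_SYSTEM_SUSPENDED = auto()
    NEW_STATE_USER_SUSPENDED = auto()
    NEW_STATE_USER_SYSTEM_SUSPENDED = auto()
    NEW_STATE_DONE = auto()
    NEW_STATE_FAILED = auto()


class UnsupportedFeatureException(Exception):
    """Raised when the library does not implement an optional method."""


class JobSession:
    def __init__(self, supports_events=True):
        # `supports_events` is an assumption used to simulate both
        # kinds of library in one sketch.
        self._supports_events = supports_events
        self._callbacks = []

    def register_event_notification(self, callback):
        """Register an object with a notify(event, job) method."""
        if not self._supports_events:
            raise UnsupportedFeatureException(
                "event notification is not supported")
        self._callbacks.append(callback)

    def _state_changed(self, event, job):
        # Hypothetical internal hook: a real library would call this
        # when the DRM system reports a job state change.
        for callback in self._callbacks:
            callback.notify(event, job)


class Recorder:
    """Example callback sink that just records what it is told."""

    def __init__(self):
        self.seen = []

    def notify(self, event, job):
        self.seen.append((event, job))
```

A portable application would catch UnsupportedFeatureException at
registration time and fall back to the wait functions.
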
>
> Please comment.
>
> Thanks,
> Peter.
--
Nothing is ever easy.