[DRMAA-WG] New synchronization approach for DRMAA2

Andre Merzky andre at merzky.net
Mon Aug 31 13:25:34 CDT 2009


FWIW, this matches the SAGA model, where we have a 'metric'
called job.state, for which callbacks can be registered.
Those fire whenever the job state changes.

Nice.

Best, Andre.


Quoting [Peter Tröger] (Aug 31 2009):
> 
> Dear all,
> 
> following up on the last phone conference (see the minutes), here
> is a possible realization of the new synchronization approach as an
> IDL snippet:
> 
> --- snip ---
> 
> enum DrmaaEvent {NEW_STATE_UNDETERMINED,   NEW_STATE_QUEUED_ACTIVE,  
> NEW_STATE_HOLD, NEW_STATE_RUNNING, NEW_STATE_SYSTEM_SUSPENDED,  
> NEW_STATE_USER_SUSPENDED, NEW_STATE_USER_SYSTEM_SUSPENDED,  
> NEW_STATE_DONE, NEW_STATE_FAILED, ... };
> 
> interface DrmaaCallback {
> 	void notify(in DrmaaEvent event, in Job job)
> 
> interface JobSession{
> 	readonly attribute string contact;
> 	void registerEventNotification(in DrmaaCallback callback) raises  
> UnsupportedFeatureExeption, ....
> 	JobTemplate createJobTemplate()
> 	void deleteJobTemplate(in DRMAA::JobTemplate jobTemplate)
> 	Job runJob(in DRMAA::JobTemplate jobTemplate)
> 	sequence<Job> runBulkJobs(in DRMAA::JobTemplate jobTemplate,in long  
> beginIndex,in long endIndex,in long step)
> 	sequence<Job> waitAnyStarted(in sequence<Job> jobs, in long long  
> timeout)
> 	sequence<Job> waitAnyTerminated(in sequence<Job> jobs, in long long  
> timeout)
> ...
> 
> interface Job{
> 	void suspend()
> 	void resume()
> 	void hold()
> 	void release()
> 	void terminate()
> 	JobState getState(out native subState)
> 	void waitStarted(in long long timeout)
> 	void waitTerminated(in long long timeout)
> 	JobInfo getInfo()
> 
> --- snip ---
> 
> waitAnyStarted() would return as soon as any of the provided jobs is in
> one of the states RUNNING, SYSTEM_SUSPENDED, USER_SUSPENDED, or
> USER_SYSTEM_SUSPENDED. It returns the matching job(s) as result, so the
> application can call it again with a reduced list.
> 
> waitAnyTerminated() would return as soon as any of the provided jobs is
> in either the FAILED or the DONE state.
> 
> waitStarted() and waitTerminated() on job level work in a similar way  
> for only one job. The timeout parameter keeps the DRMAA1 semantics.
> 
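
To make the wait semantics described above concrete, here is a rough
Python sketch. It is not a DRMAA binding: FakeJob, wait_any() and the
polling loop are made up for illustration, the state names are taken
from the enum in the snippet, and the timeout constants (-1 for "wait
forever", 0 for "do not wait") follow the DRMAA1 convention. Raising
Python's built-in TimeoutError on expiry is only a stand-in for
whatever exception a real binding would define.

```python
import time
from enum import Enum, auto


class JobState(Enum):
    UNDETERMINED = auto()
    QUEUED_ACTIVE = auto()
    HOLD = auto()
    RUNNING = auto()
    SYSTEM_SUSPENDED = auto()
    USER_SUSPENDED = auto()
    USER_SYSTEM_SUSPENDED = auto()
    DONE = auto()
    FAILED = auto()


# State groups as described in the proposal text.
STARTED = {JobState.RUNNING, JobState.SYSTEM_SUSPENDED,
           JobState.USER_SUSPENDED, JobState.USER_SYSTEM_SUSPENDED}
TERMINATED = {JobState.DONE, JobState.FAILED}

WAIT_FOREVER = -1  # DRMAA1 convention
NO_WAIT = 0        # DRMAA1 convention


class FakeJob:
    """Hypothetical stand-in for a Job; its state is set by hand."""

    def __init__(self, name, state):
        self.name = name
        self.state = state

    def get_state(self):
        return self.state


def wait_any(jobs, wanted, timeout, poll_interval=0.01):
    """Return every job currently in a wanted state, polling until at
    least one matches or the timeout (in seconds) expires."""
    deadline = None if timeout == WAIT_FOREVER else time.monotonic() + timeout
    while True:
        hits = [j for j in jobs if j.get_state() in wanted]
        if hits:
            return hits
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("no job reached the requested state")
        time.sleep(poll_interval)


def wait_any_started(jobs, timeout):
    return wait_any(jobs, STARTED, timeout)


def wait_any_terminated(jobs, timeout):
    return wait_any(jobs, TERMINATED, timeout)


# Usage: wait, then call again with the reduced list.
jobs = [FakeJob("a", JobState.RUNNING), FakeJob("b", JobState.QUEUED_ACTIVE)]
started = wait_any_started(jobs, NO_WAIT)
remaining = [j for j in jobs if j not in started]
```

A real implementation would presumably block on DRM events rather
than poll; the polling here only keeps the sketch self-contained.
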
> JobSession::registerEventNotification() would accept the function  
> pointer / object reference for the callback sink implemented by the  
> application. This is the first time that we introduce an optional  
> method in DRMAA, therefore we need the new UnsupportedFeatureExeption  
> to express if this function is supported or not. The callback function  
> signature is also standardized by the language binding  
> (DrmaaCallback), so that all DRMAA libraries for one language can work  
> with any application in a portable way. For the sake of portability,  
> we also need to standardize the possible events then. New / other  
> proposals for this enumeration (DrmaaEvent) are welcome.
> 
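
As an illustration of the callback registration described above, a
language binding might look roughly like this in Python. Everything
here is an assumption for the sake of the sketch: the class and method
names are Python-style renderings of the IDL (with the exception name
spelled UnsupportedFeatureException), the supports_events flag and the
_emit() helper are invented, and allowing several registered callbacks
is a choice the proposal does not make.

```python
from enum import Enum, auto


class DrmaaEvent(Enum):
    # Only the events named explicitly in the proposal; the list is open.
    NEW_STATE_UNDETERMINED = auto()
    NEW_STATE_QUEUED_ACTIVE = auto()
    NEW_STATE_HOLD = auto()
    NEW_STATE_RUNNING = auto()
    NEW_STATE_SYSTEM_SUSPENDED = auto()
    NEW_STATE_USER_SUSPENDED = auto()
    NEW_STATE_USER_SYSTEM_SUSPENDED = auto()
    NEW_STATE_DONE = auto()
    NEW_STATE_FAILED = auto()


class UnsupportedFeatureException(Exception):
    """Raised when the library does not implement an optional method."""


class JobSession:
    def __init__(self, supports_events=True):
        # supports_events is a hypothetical switch standing in for
        # "this DRMAA library implements event notification".
        self._supports_events = supports_events
        self._callbacks = []

    def register_event_notification(self, callback):
        if not self._supports_events:
            raise UnsupportedFeatureException("event notification")
        self._callbacks.append(callback)

    def _emit(self, event, job):
        # Hypothetical internal hook: the library would call this
        # whenever it observes a job state change.
        for callback in self._callbacks:
            callback.notify(event, job)


class RecordingCallback:
    """Application-side callback sink with the standardized signature."""

    def __init__(self):
        self.seen = []

    def notify(self, event, job):
        self.seen.append((event, job))


# Usage: register a sink, then let the library deliver an event.
session = JobSession()
sink = RecordingCallback()
session.register_event_notification(sink)
session._emit(DrmaaEvent.NEW_STATE_RUNNING, "job-1")

# A library without the optional feature refuses the registration.
try:
    JobSession(supports_events=False).register_event_notification(sink)
    supported = True
except UnsupportedFeatureException:
    supported = False
```

Because the notify() signature is fixed, the same RecordingCallback
would work with any library offering this binding, which is the
portability point made in the proposal.
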
> Please comment.
> 
> Thanks,
> Peter.
-- 
Nothing is ever easy.

