[DRMAA-WG] maxSlots attribute

Andre Merzky andre at merzky.net
Wed Mar 24 12:59:11 CDT 2010


Quoting [Daniel Templeton] (Mar 24 2010):
> 
> What do you mean by "retrieve them programmatically"?  What 
> tool are they using to do it?  What is the interface?


I mean retrieving them through an API such as DRMAA or SAGA, rather
than by some means outside the program (e.g. reading a web page and
typing the queue name into the code or a config file).
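
As a rough illustration, here is a minimal Python sketch of that idea.
All names in it (MonitoringSession, drms_queue_names, JobTemplate,
pick_queue) are hypothetical stand-ins and not an actual DRMAA or SAGA
binding; the queue list is injected instead of being fetched from a
real DRM system.

```python
from dataclasses import dataclass
from typing import List, Optional


class MonitoringSession:
    """Hypothetical stand-in for a DRMAAv2-style monitoring session."""

    def __init__(self, queue_names: List[str]):
        # Assumption: a real binding would query the DRM system here.
        self._queue_names = list(queue_names)

    def drms_queue_names(self) -> List[str]:
        # Opaque strings; the list may be empty for systems that have
        # no queue concept or cannot enumerate their queues.
        return list(self._queue_names)


@dataclass
class JobTemplate:
    """Hypothetical job template; queue_name=None leaves the choice to the DRM system."""
    remote_command: str
    queue_name: Optional[str] = None


def pick_queue(session: MonitoringSession, preferred: str) -> Optional[str]:
    """Return the preferred queue if it is listed, else the first listed one, else None."""
    names = session.drms_queue_names()
    if not names:
        return None
    return preferred if preferred in names else names[0]


session = MonitoringSession(["short", "long"])
template = JobTemplate("/bin/true", queue_name=pick_queue(session, "long"))
```

The program never needs a queue name typed in by hand, and an empty
list simply results in a template without a queue name.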

Cheers, Andre.


> Daniel
> 
> On 03/24/10 06:03, Andre Merzky wrote:
> >Hi,
> >
> >Quoting [Peter Tröger] (Mar 23 2010):
> >>Hi,
> >>
> >>>As long as the MonitoringSession::drmsQueueNames is nothing more
> >>>than an opaque set of strings that are the valid values for
> >>>JobTemplate::queueName, I can live with it.  I can see where that
> >>>would be useful for a portal.  I thought, however, that we had come
> >>>to the conclusion previously that portals and user interfaces were
> >>>not really our target applications.  (Anyone remember what feature
> >>>spawned that conclusion?)  I thought that DRMAA was specifically
> >>>focused on applications integrating with clusters.  If so, a list of
> >>>opaque strings is useless.
> >>
> >>We dropped the portal example, that's true. The most convincing DRMAA
> >>applications at the moment are high-level APIs and meta-schedulers on
> >>top of / with DRMAA support.
> >>
> >>I did some field study to get the picture right. LSF, PBS, SGE,
> >>LoadLeveler, SAGA, Globus and GridWay can submit jobs to particular
> >>queues. In LoadLeveler, queues are called "classes". Condor, JSDL and
> >>OGSA-BES seem to have no queue concept - correct me if I am wrong. The
> >>retrieval of the list of queue names is only supported in:
> >>
> >>LSF: bqueues (http://www.vub.ac.be/BFUCC/LSF/man/bqueues.1.html)
> >>PBS: qstat -q (http://linux.die.net/man/1/qstat)
> >>SGE: qstat -f
> >>LoadLeveler: llclass -l (http://www.ccs.ornl.gov/Cheetah/LL.html#Classes)
> >>
> >>So if we add the monitoring facility, an empty return value must
> >>still be valid.
> >>
> >>>By the way, you'll also have to give a little thought to reconciling
> >>>the 1:1 queue/host model with the 1:n and n:m models, as far as
> >>>identifying them in a list goes.
> >>
> >>This is the real counterargument. If DRMAA monitoring gives no
> >>additional hints here, invalid combinations of valid machine / queue
> >>names in the job template could occur.
> >>Let's wait and see whether any defender of queue list monitoring
> >>stands up. Otherwise, I propose to keep only the queue name attribute
> >>in the job template.
> >
> >It's not a hard requirement from our side, but I think it's extremely
> >useful.  In general, our end users complain more often than not if
> >they need to manually retrieve resource details like service contact
> >URLs or queue names from some obscure web page, instead of being able
> >to retrieve that information programmatically.  The manual way is
> >simply too error-prone, tedious and static.
> >
> >Cheers, Andre.
> >
> >
> >>
> >>/Peter.
> >>
> >>
> >>>Daniel
> >>>
> >>>On 3/23/10 10:27 AM, Peter Tröger wrote:
> >>>>>As I said in the email I just wrote, I'm willing to be convinced of
> >>>>>the value of adding queues to the job submission side of things.  I
> >>>>>am, however, fundamentally opposed to adding queues to the
> >>>>>monitoring side.
> >>>>
> >>>>I will strongly insist on queue support in DRMAAv2. This is a
> >>>>long-demanded feature, which also popped up again in the survey.
> >>>>
> >>>>>The various concepts of queues are too different for that to make
> >>>>>any sense.  There is absolutely no way we will be able to model both
> >>>>>LSF and SGE queues in a way that is abstract enough to be consistent
> >>>>>and still specific enough to be meaningful and accurate.  We'll talk
> >>>>>on the next call. :)
> >>>>
> >>>>The intention of the current model is that JobTemplate::queueName
> >>>>and MonitoringSession::drmsQueueNames act as counterparts. DRMAA
> >>>>would promise that all strings that show up in
> >>>>MonitoringSession::drmsQueueNames are valid input for
> >>>>JobTemplate::queueName. Nothing more.
> >>>>The use cases are DRMAA-based portals and command-line applications.
> >>>>The interpretation of what a queue is can be provided by the
> >>>>library implementation - in the end, the user has to reason about
> >>>>the meaning of queue names anyway.
> >>>>
> >>>>We could relax the conditions so that other values are also allowed
> >>>>in JobTemplate::queueName. This would allow
> >>>>MonitoringSession::drmsQueueNames to return nothing in SGE. This
> >>>>must be possible anyway - Condor has no queue concept at all.
> >>>>
> >>>>I could also agree to remove
> >>>>MonitoringSession::queueMaxWallclockTime and
> >>>>MonitoringSession::queueMaxSlotsAllowed, since these two attributes
> >>>>are the ones that demand a particular understanding of what a queue
> >>>>is.
> >>>>
> >>>>Best,
> >>>>Peter.
> 



-- 
Nothing is ever easy.

