[DRMAA-WG] maxSlots attribute

Peter Tröger peter at troeger.eu
Mon Mar 29 05:18:51 CDT 2010


I asked about this issue in some personal talks at OGF and D-Grid, and people agreed that a queue name listing is something they really want to see.

Dan, please note that the description of both attributes (listing and JT part) is already very relaxed. We only promise that if MonitoringSession::drmsQueueNames returns something, it must be an accepted input in JobTemplate::queueName. All other combinations (no list / custom queue name set; list available / custom queue name set; ...) are still allowed.
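
A minimal sketch of that relaxed contract, in Python. The classes are hypothetical stand-ins written only for illustration, not a real DRMAA binding, and the helper name choose_queue is also an assumption. It shows the single promise (a listed name is accepted as a queue name) and that the other combinations stay legal:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobTemplate:
    # Hypothetical stand-in for JobTemplate; only queueName is modeled.
    queue_name: Optional[str] = None


@dataclass
class MonitoringSession:
    # Hypothetical stand-in; a real implementation would ask the DRM system.
    queues: List[str] = field(default_factory=list)

    def drms_queue_names(self) -> List[str]:
        # An empty list is a valid answer, e.g. for a DRM without queues.
        return list(self.queues)


def choose_queue(session: MonitoringSession,
                 preferred: Optional[str] = None) -> Optional[str]:
    """Pick a value for JobTemplate.queue_name under the relaxed contract."""
    if preferred is not None:
        # A custom name is allowed whether or not a list is available.
        return preferred
    names = session.drms_queue_names()
    # With no preference, fall back to the first listed name, if any;
    # every listed name is promised to be accepted input.
    return names[0] if names else None
```

For example, a caller with no prior knowledge of the site could build JobTemplate(queue_name=choose_queue(session)) and leave the queue unset when the list is empty.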

We will finalize the discussion this week in the phone call.

Best,
Peter.



On 25.03.2010 at 03:47, Yves Caniou wrote:

> Hi,
> 
> This is also useful for blind users, such as application servers running on
> top of a DRM. The queue may not be the best one, but at least a submission
> may occur.
> 
> Cheers.
> 
> .Yves.
> 
> On Wednesday 24 March 2010 19:30:25, Andre Merzky wrote:
>> Quoting [Daniel Templeton] (Mar 24 2010):
>>> So the program is going to go read the queue names via the API and
>>> then present them to the user to pick from?  Doesn't the user
>>> still need to go reference a web page somewhere to figure out
>>> which queue is the right one for what he's doing?
>> 
>> Yes, the user needs to have some idea what the queue names mean.
>> Queue names like 'bigjob', 'preempt', 'checkpt', 'workq' are,
>> however, rather intuitive.  That's of course not always the case.
>> BTW: the above names are from our current systems on loni.org.
>> 
>> Anyway, as said: that is not a hard requirement, but our users seem
>> to find this useful.
>> 
>> Cheers, Andre.
>> 
>>> Daniel
>>> 
>>> On 03/24/10 10:59, Andre Merzky wrote:
>>>> Quoting [Daniel Templeton] (Mar 24 2010):
>>>>> What does that mean when you say "retrieve them programmatically?"  What
>>>>> tool are they using to do it?  What is the interface?
>>>> 
>>>> I mean retrieving via an API like DRMAA or SAGA, instead of
>>>> retrieving by some other means outside of the program (reading a web
>>>> page and entering the queue name in the code or some config file).
>>>> 
>>>> Cheers, Andre.
>>>> 
>>>>> Daniel
>>>>> 
>>>>> On 03/24/10 06:03, Andre Merzky wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Quoting [Peter Tröger] (Mar 23 2010):
>>>>>>> Hi,
>>>>>>> 
>>>>>>>> As long as the MonitoringSession::drmsQueueNames is nothing more
>>>>>>>> than an opaque set of strings that are the valid values for
>>>>>>>> JobTemplate::queueName, I can live with it.  I can see where that
>>>>>>>> would be useful for a portal.  I thought, however, that we had come
>>>>>>>> to the conclusion previously that portals and user interfaces were
>>>>>>>> not really our target applications.  (Anyone remember what feature
>>>>>>>> spawned that conclusion?)  I thought that DRMAA was specifically
>>>>>>>> focused on applications integrating with clusters.  If so, a list of
>>>>>>>> opaque strings is useless.
>>>>>>> 
>>>>>>> We dropped the portal example, that's true. The most convincing DRMAA
>>>>>>> applications at the moment are high-level APIs and meta-schedulers on
>>>>>>> top of / with DRMAA support.
>>>>>>> 
>>>>>>> I did some field study to get the picture right. LSF, PBS, SGE,
>>>>>>> LoadLeveler, SAGA, Globus and GridWay can submit jobs to particular
>>>>>>> queues. In LoadLeveler, queues are called "classes". Condor, JSDL and
>>>>>>> OGSA-BES seem to have no queue concept - correct me if I am wrong.
>>>>>>> The retrieval of the list of queue names is only supported in:
>>>>>>> 
>>>>>>> LSF: bqueues (http://www.vub.ac.be/BFUCC/LSF/man/bqueues.1.html)
>>>>>>> PBS: qstat -q (http://linux.die.net/man/1/qstat)
>>>>>>> SGE: qstat -f
>>>>>>> LoadLeveler: llclass -l (http://www.ccs.ornl.gov/Cheetah/LL.html#Classes)
>>>>>>> 
>>>>>>> So if we add the monitoring facility, an empty return value must
>>>>>>> still be valid.
>>>>>>> 
>>>>>>>> By the way, you'll also have to give a little thought to reconciling
>>>>>>>> the 1:1 queue/host model with the 1:n and n:m models, as far as
>>>>>>>> identifying them in a list goes.
>>>>>>> 
>>>>>>> This is the true counter argument. If DRMAA monitoring gives no
>>>>>>> additional hints here, invalid combinations of valid machine / queue
>>>>>>> names in the job template could occur.
>>>>>>> Let's wait and see whether any defender of queue list monitoring stands up.
>>>>>>> Otherwise, I propose to keep only the queue name attribute in the job
>>>>>>> template.
>>>>>> 
>>>>>> It's not a hard requirement from our side, but I think it's utterly
>>>>>> useful.  In general, our end users are complaining more often than
>>>>>> not if they need to manually retrieve resource details like service
>>>>>> contact URLs or queue names from some obscure web page, instead of
>>>>>> being able to retrieve that information programmatically.  The
>>>>>> manual way is simply too error-prone, tedious and static.
>>>>>> 
>>>>>> Cheers, Andre.
>>>>>> 
>>>>>>> /Peter.
>>>>>>> 
>>>>>>>> Daniel
>>>>>>>> 
>>>>>>>> On 3/23/10 10:27 AM, Peter Tröger wrote:
>>>>>>>>>> As I said in the email I just wrote, I'm willing to be convinced
>>>>>>>>>> of the value of adding queues to the job submission side of
>>>>>>>>>> things.  I am, however, fundamentally opposed to adding queues to
>>>>>>>>>> the monitoring side.
>>>>>>>>> 
>>>>>>>>> I will heavily insist on queue support in DRMAAv2. This is a
>>>>>>>>> long-demanded feature, which also popped up again in the survey.
>>>>>>>>> 
>>>>>>>>>> The various concepts of queues are too different for that to make
>>>>>>>>>> any sense.  There is absolutely no way we will be able to model
>>>>>>>>>> both LSF and SGE queues in a way that is abstract enough to be
>>>>>>>>>> consistent and still specific enough to be meaningful and
>>>>>>>>>> accurate.  We'll talk on the next call. :)
>>>>>>>>> 
>>>>>>>>> The intention of the current model is that JobTemplate::queueName
>>>>>>>>> and MonitoringSession::drmsQueueNames act as counterparts. DRMAA
>>>>>>>>> would promise that all strings that show up in
>>>>>>>>> MonitoringSession::drmsQueueNames are valid input for
>>>>>>>>> JobTemplate::queueName. Nothing more.
>>>>>>>>> The use cases are DRMAA-based portals and command-line applications.
>>>>>>>>> The interpretation of what a queue is can be provided by the
>>>>>>>>> library implementation - in the end, the user has to reason
>>>>>>>>> about the meaning of queue names anyway.
>>>>>>>>> 
>>>>>>>>> We could relax the conditions so that other values are also allowed
>>>>>>>>> in JobTemplate::queueName. This would allow
>>>>>>>>> MonitoringSession::drmsQueueNames to return nothing in SGE. This
>>>>>>>>> must be possible anyway - Condor has no queue concept at all.
>>>>>>>>> 
>>>>>>>>> I could also agree to remove
>>>>>>>>> MonitoringSession::queueMaxWallclockTime and
>>>>>>>>> MonitoringSession::queueMaxSlotsAllowed, since these two attributes
>>>>>>>>> are the ones that demand a particular understanding of what a
>>>>>>>>> queue is.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Peter.
> 
> -- 
> Yves Caniou
> Associate Professor at Université Lyon 1,
> Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
> Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
> * in Information Technology Center, The University of Tokyo,
>   2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
>   tel: +81-3-5841-0540
> * in National Institute of Informatics
>   2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
>   tel: +81-3-4212-2412 
> http://graal.ens-lyon.fr/~ycaniou/
