[DRMAA-WG] MonitoringSession

Peter Tröger peter at troeger.eu
Tue Nov 9 04:04:27 CST 2010


Hi,


On 08.11.2010 at 18:59, Mariusz Mamoński wrote:

> 2010/11/8 Daniel Gruber <daniel.x.gruber at oracle.com>:
>> On 11/08/10 16:58, Mariusz Mamoński wrote:
>>> 
>>> Can we split the discussion into two paths? ;-)
>>> 
>>> Does anyone vote against the first point (MachineInfo)?
>>> 
>> 
>> When there is a method "getMachineInfo(in string machineName)"
>> we would also require "getQueueInfo(in string queueName)" for consistency.
> agree.
>> Then in total we would have "getQueueNames", "getMachineNames",
>> "getMachineInfo", "getQueueInfo", and "getAllJobs". Which would result in a
>> pretty clear interface. But the current approach has its strengths in
>> simplicity when accessing the values. I would keep the interface simple and
>> portable in case of monitoring since this is IMHO not our core competence.
> what about efficiency? in some systems the cost of fetching one of the
> host attributes is equal to fetching all of them and IMHO the user is
> usually interested in at least a few of them.
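
To make the efficiency argument above concrete, here is a minimal Python sketch of the two interface shapes under discussion. It is not DRMAA API: the class names, method names, attribute keys, and the FakeBackend are all hypothetical, and the sketch assumes a backend where a single query returns every attribute of a host.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MachineInfo:
    """Hypothetical value type bundling the per-machine attributes."""
    name: str
    load: float
    sockets: int
    cores_per_socket: int
    threads_per_core: int


class FakeBackend:
    """Stand-in for a DRM system; one query returns all host attributes."""

    def __init__(self, hosts: Dict[str, dict]) -> None:
        self._hosts = hosts
        self.queries = 0  # counts round trips to the (fake) DRM system

    def query_host(self, name: str) -> dict:
        self.queries += 1
        return dict(self._hosts[name])


class PerAttributeSession:
    """One getter per attribute: each call costs a full backend query."""

    def __init__(self, backend: FakeBackend) -> None:
        self._backend = backend

    def get_load(self, name: str) -> float:
        return self._backend.query_host(name)["load"]

    def get_sockets(self, name: str) -> int:
        return self._backend.query_host(name)["sockets"]

    def get_cores_per_socket(self, name: str) -> int:
        return self._backend.query_host(name)["cores_per_socket"]


class StructSession:
    """A getMachineInfo-style call: all attributes for one backend query."""

    def __init__(self, backend: FakeBackend) -> None:
        self._backend = backend

    def get_machine_info(self, name: str) -> MachineInfo:
        raw = self._backend.query_host(name)
        return MachineInfo(name=name, **raw)


def make_backend() -> FakeBackend:
    """Build a fake backend with a single made-up host."""
    return FakeBackend({
        "node01": {
            "load": 0.5,
            "sockets": 2,
            "cores_per_socket": 4,
            "threads_per_core": 2,
        }
    })
```

Reading three attributes through PerAttributeSession issues three backend queries in this sketch, while StructSession.get_machine_info issues one. A per-attribute implementation could cache results to narrow that gap, at the price of possibly stale values.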

The whole discussion already took place during several phone calls, and Mariusz never managed to get his 'consistency' proposal through - even though he tried hard ;-). 
I vote against opening new API structure discussions at this point. There was enough time for such objections in the past.

>>>  We could make maxSlotsCount optional (like the threadsPerCore) and
>>> define it only as the upper bound (and only for the purpose of
>>> computing host/cluster "capacity"). Alternatively: what do you think
>>> about the idea of:
>>>  - changing threadsPerCore -> maxThreads
>>>  - relaxing meaning of the threadsPerCore/maxThreads: as "maximal
>>> number of threads that can run *simultaneously* on given machine"
>>> without saying if this is imposed by hardware configuration or system
>>> policy.
>> 
>> The other machine values (load, sockets, cores, threads, physical mem,
>> virtual mem, machine os, OS version, machine arch) are not user dependent
>> hence it would break consistency. Queue values should be user dependent.

Same counter argument from my side. We had long and painful slot-related discussions during the phone calls. The overall agreement was that DRMAA cannot apply any meaning to the slot concept, so we just treat it as opaque monitoring data. Check the meeting minutes.

Best,
Peter.


