[DRMAA-WG] Load average interval ?

Daniel Gruber D.Gruber at Sun.COM
Thu Mar 25 10:50:17 CDT 2010


I would also vote for reporting the total number of cores and sockets :)

We could also think about reporting the number of concurrent
threads supported by the hardware (hyperthreading in the case of
Intel, or chip multithreading in the case of Sun T2 processors).
This would save the user from puzzling out what is meant by
a core (is it a real one, or a hyperthreading/CMT thread?).

If not, we should at least define that a core is really a physical core.
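
To make that concrete, here is a rough sketch in Python (not DRMAA IDL).
The MachineInfo name follows Mariusz's proposal below; the field and
method names are only my assumptions for illustration, nothing we have
agreed on. It assumes a DRM that cannot report sockets returns 0 there:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineInfo:
    """Hypothetical record; field names are illustrative, not agreed API."""
    name: str
    sockets: int           # physical CPU packages; 0 if the DRM cannot report it
    cores: int             # total physical cores on the machine
    threads_per_core: int  # 1 without hyperthreading/CMT, 2 for HT, 8 on a T2

    def cores_per_socket(self):
        """Derive cores per socket, or None when the socket count is unknown."""
        if self.sockets <= 0:
            return None
        return self.cores // self.sockets

    def hardware_threads(self):
        """Number of threads the hardware can run concurrently."""
        return self.cores * self.threads_per_core


# A single-socket Sun T2 box: 8 physical cores with 8 CMT threads each.
t2 = MachineInfo(name="t2-host", sockets=1, cores=8, threads_per_core=8)
print(t2.cores_per_socket(), t2.hardware_threads())  # 8 64
```

With the total core count as the primary value, the per-socket figure
can still be derived when sockets are known, and nothing is lost when
they are not.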

Daniel


On 03/25/10 15:44, Daniel Templeton wrote:
> I would tend to agree that total core count is more useful.  SGE also 
> reports socket count as of 6.2u5, by the way.  (That's actually thanks 
> to our own Daniel Gruber.)
>
> Daniel
>
> On 03/25/10 07:03, Mariusz Mamoński wrote:
>   
>> Also for me. As we are talking about the monitoring interface, I propose
>> two more changes to the machine monitoring interface:
>>
>> 1. Having a new data struct called "MachineInfo" with attributes like
>> Load, PhysMemory, ... and a getMachineInfo(in String machineName) method
>> in the Monitoring interface. Rationale: the same as for JobInfo
>> (consistency; fetching all of a machine's attributes at once is more
>> natural in DRMS APIs than querying for each attribute separately).
>>
>> 2. Change machineCoresPerSocket to machinesCores. If one has
>> machineSockets, he or she can easily determine
>> machineCoresPerSocket. The problem with the current API is that if the
>> DRM does not support "machineSockets" (as far as I checked, only LSF
>> provides this two-level granularity, @see Google Doc), we lose the most
>> essential information: "how many single processing units do we have on
>> a single machine?"
>>
>> Cheers,
>>
>> On 23 March 2010 23:00, Daniel Templeton <daniel.templeton at oracle.com> wrote:
>>     
>>> That's fine with me.
>>>
>>> Daniel
>>>
>>> On 03/23/10 13:51, Peter Tröger wrote:
>>>       
>>>>> Any non-SGE opinion ?
>>>>>           
>>>> Here is mine:
>>>>
>>>> I could find only a single source that explains where the load
>>>> average comes from in Condor :)
>>>>
>>>> http://www.patentstorm.us/patents/5978829/description.html
>>>>
>>>> Condor provides only the 1-minute load average from the uptime command.
>>>>
>>>> Same holds for Moab:
>>>> http://www.clusterresources.com/products/mwm/docs/commands/checknode.shtml
>>>>
>>>> And PBS:
>>>> http://wiki.egee-see.org/index.php/Installing_and_configuring_guide_for_MonALISA
>>>>
>>>> And MAUI:
>>>> https://psiren.cs.nott.ac.uk/projects/procksi/wiki/JobManagement
>>>>
>>>> I vote for reporting only the 1-minute load average.
>>>>
>>>> /Peter.
>>>>
>>>>         
>>>>> And BTW, by using the uptime(1) load semantics, we lose Windows
>>>>> support. There is no such attribute there; load is measured as the
>>>>> percentage of non-idle time, and has no direct relationship to the
>>>>> ready queue lengths.
>>>>>
>>>>> Best,
>>>>> Peter.
>>>>>
>>>>> Am 22.03.2010 um 16:02 schrieb Daniel Templeton:
>>>>>
>>>>>           
>>>>>> SGE tends to look at the 5-minute average, although any can be
>>>>>> configured.  You could solve it the same way we did for SGE -- offer
>>>>>> three: machineLoadShort, machineLoadMed, machineLoadLong.
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>> On 03/22/10 06:05, Peter Tröger wrote:
>>>>>>             
>>>>>>> Hi,
>>>>>>>
>>>>>>> next remaining thing from OGF28:
>>>>>>>
>>>>>>> We support the determination of machineLoad average in the
>>>>>>> MonitoringSession interface. At OGF, we could not agree on which of
>>>>>>> the typical intervals (1/5/15 minutes) we want to use here. Maybe
>>>>>>> all of them ?
>>>>>>>
>>>>>>> Best,
>>>>>>> Peter.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>    drmaa-wg mailing list
>>>>>>>    drmaa-wg at ogf.org
>>>>>>>    http://www.ogf.org/mailman/listinfo/drmaa-wg
