[DRMAA-WG] Call for participation: DRMAA2 machine monitoring
Peter Tröger
peter at troeger.eu
Tue Nov 10 08:07:57 CST 2009
To trigger some discussion, here is my first quick evaluation for LSF,
based on:
http://www.zdv.uni-mainz.de/cms-extern/lsf/lsf6.0/pdf/manuals/lsf_ref_6.0.pdf
http://www.cisl.ucar.edu/docs/LSF/7.0.3/command_reference/lshosts.cmdref.html
http://ams.cern.ch/AMS/7/admin/troubleshooting.html
http://www.slac.stanford.edu/comp/unix/package/lsf/LSF6.1_doc/html/lsf6.1_admin/manage_hosts.html
> enum OperatingSystem {HPUX, LINUX, TRU64, DUNIX, OSF1, MACOS, SUNOS,
> WIN, WINNT, AIX, UNIXWARE, BSD, OTHER}
>
> enum CpuArchitecture {ALPHA, PA-RISC, X86, X64, IA-64, MIPS, PPC, PPC64,
> SPARC, SPARC64, OTHER}
>
> interface MonitoringSessions{
>
> readonly attribute string[] drmVersionString;
> readonly attribute string[] drmMachineNames;
> int machineSockets(in string machineName);
> int machineCoresPerSocket(in string machineName);
> int machineLoad(in string machineName, in long coreNumber);
> int machinePhysMemory(in string machineName);
> int machineVirtMemory(in string machineName);
> OperatingSystem machineOS(in string machineName);
> string machineOSVersion(in string machineName);
> CpuArchitecture machineArch(in string machineName);
>
> };
LSF supports the "lshosts" command, which can show (among other things) the
following machine information:
- host name (== machineName)
- type, e.g. CRAYJ, SUNSOL, ALPHA, RS6K, SGI6, HPPA, LINUX86 (== machineOS ???)
- model, e.g. Ultra2, SunSparc, DEC3000, IBM350, R10K, HP715, Intel_IA64,
Ultra5S, PowerPC_G4, HP300 (== machineArch)
- cpuf, the relative CPU performance factor (== ???)
- ncpus (== machineSockets * machineCoresPerSocket)
- nprocs (== machineSockets)
- ncores (== machineCoresPerSocket)
- maxmem (== machinePhysMemory)
- maxswp (== machineVirtMemory - machinePhysMemory)
- ndisks, the number of local disks (== ???)
- maxtmp, the maximum available temporary space (== ???)
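As a rough illustration of the mapping above, here is a minimal Python sketch.
The LshostsRecord class and the function names are hypothetical and not part
of LSF or DRMAA. It assumes the lshosts fields have already been parsed into
integers and that maxmem and maxswp are given in the same unit.

```python
from dataclasses import dataclass


@dataclass
class LshostsRecord:
    """One host as reported by lshosts, already parsed (hypothetical container)."""
    host_name: str
    nprocs: int  # mapped to machineSockets
    ncores: int  # mapped to machineCoresPerSocket
    maxmem: int  # mapped to machinePhysMemory
    maxswp: int  # swap space, same unit as maxmem (assumption)


def machine_sockets(rec: LshostsRecord) -> int:
    return rec.nprocs


def machine_cores_per_socket(rec: LshostsRecord) -> int:
    return rec.ncores


def machine_phys_memory(rec: LshostsRecord) -> int:
    return rec.maxmem


def machine_virt_memory(rec: LshostsRecord) -> int:
    # maxswp == machineVirtMemory - machinePhysMemory, solved for virtual memory
    return rec.maxmem + rec.maxswp


def expected_ncpus(rec: LshostsRecord) -> int:
    # ncpus == machineSockets * machineCoresPerSocket under the mapping above
    return machine_sockets(rec) * machine_cores_per_socket(rec)


# Illustrative values only, not taken from a real cluster.
rec = LshostsRecord("node01", nprocs=2, ncores=4, maxmem=16384, maxswp=8192)
print(machine_virt_memory(rec))  # 24576
print(expected_ncpus(rec))       # 8
```

Comparing expected_ncpus() with the ncpus value that lshosts reports would be
a cheap sanity check for the mapping on a real installation.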
This mapping is still incomplete, but at first glance our interface seems to
fit. How to map the machine load information is still unclear to me. LSF seems
to support many mainframe / Unix architectures that are missing from our
current list. The amount of available tmp space might be an interesting
addition.
Best,
Peter.
P.S.: In case nobody finds time to contribute, I will skip tomorrow's phone
call. We need to do the offline work first.
> --- snip
>
> Some rationales:
>
> The list of operating systems is a reduced version of the DMTF list I
> sent earlier, and currently only considers the supported OS types in
> Condor. The list of CPU architectures is a combination of the supported
> identifiers in Condor + Debian.
>
> It is assumed that each OS identifier only makes sense with an OS
> version number string, which is not standardized by DRMAA. It is
> tempting to derive this version number string from "uname -r" by
> default. However, this might be too much information for a DRMAA
> application. You would get the Darwin kernel version in MacOS, or the
> specific minor build revision with a Linux kernel. I think that such
> information is not really useful for job submission decisions. Instead,
> I favour the interpretation as true "operating system version",
> something that does not change when you do software updates on the
> machine. Some examples:
>
> Snow Leopard: "MACOS" + "10.6"
> Windows 7: "WINNT" + "6.1"
> Ubuntu Jaunty Jackalope: "LINUX" + "2.6"
> Solaris 10: "SUNOS" + "5.10"
>
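A minimal Python sketch of this interpretation, assuming a release string such
as the one "uname -r" prints is available: keep only the leading major.minor
part. The function name os_version is hypothetical, and the input strings are
illustrative. Plain truncation is not enough on MacOS, where "uname -r" yields
the Darwin kernel version rather than "10.6".

```python
import re


def os_version(release: str) -> str:
    """Return the leading 'major.minor' part of a release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"no major.minor prefix in {release!r}")
    return f"{match.group(1)}.{match.group(2)}"


# Illustrative release strings, not collected from real machines.
print(os_version("2.6.28-11-generic"))  # 2.6
print(os_version("5.10"))               # 5.10
print(os_version("6.1.7600"))           # 6.1
```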
> Things I am not sure about:
>
> - Do we need to distinguish the different BSD derivatives?
> - Do we really need support for non-NT Windows and OSF/1?
> - Is SCO OpenServer something different from SCO UnixWare? Is this a
> relevant separation?
> - Do we need to add mainframe operating systems?
> - Do we need a more fine-grained distinction between different Sparc
> processors?
> - What is missing for LSF / PBS / Globus / ...?
>
> Best,
> Peter.