[gin-info] Possible list of subset data for info interop
JP Navarro
navarro at mcs.anl.gov
Wed Mar 1 10:40:28 CST 2006
Thomas,
Our (TeraGrid) definition of a subcluster is a homogeneous set of
resources, so moving certain hardware characteristics to the
subcluster level will enable users to select subclusters based on
their characteristics without having to sift through the properties
of each node in a subcluster.
Users can submit to a subcluster using queues and/or node properties.
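To illustrate the point about selecting at the subcluster level, here is a minimal sketch. The `Subcluster` record, its field names, the units, and the sample values are all hypothetical and are not taken from the TeraGrid or GLUE schemas; the only assumption carried over from the text is that every node in a subcluster shares the same hardware characteristics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subcluster:
    """Hypothetical record describing one homogeneous set of nodes."""
    unique_id: str
    processor_type: str
    processor_speed_mhz: int
    memory_mb: int  # memory per node; identical across the subcluster
    smp_size: int
    total_nodes: int


def select_subclusters(subclusters, *, processor_type=None,
                       min_memory_mb=0, min_smp_size=1):
    """Return the subclusters whose shared characteristics match.

    Because a subcluster is homogeneous, one comparison per subcluster
    is enough; no per-node records are inspected.
    """
    return [
        s for s in subclusters
        if (processor_type is None or s.processor_type == processor_type)
        and s.memory_mb >= min_memory_mb
        and s.smp_size >= min_smp_size
    ]


# Made-up sample data for illustration only.
SUBCLUSTERS = [
    Subcluster("siteA.ia64-fast", "ia64", 1500, 4096, 2, 64),
    Subcluster("siteA.ia64-slow", "ia64", 1300, 4096, 2, 128),
    Subcluster("siteB.x86", "x86", 2400, 2048, 2, 32),
]

matches = select_subclusters(SUBCLUSTERS, processor_type="ia64",
                             min_memory_mb=4096)
print([s.unique_id for s in matches])
```

With per-node data only, the same query would have to examine all 224 node records in this made-up example instead of three subcluster records.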
JP
On Mar 1, 2006, at 10:00 AM, Jennifer M. Schopf wrote:
> Thomas-
>
> I specifically put that list out as a set of attributes not as
> anything bound to GLUE or CIM so we could pick the information to
> be communicated in a schema-neutral way. The pieces should be
> looked at in that way, not as being categorized or grouped really.
> If you'd like to pull pieces out or group them differently than I
> have below, that's fine.
>
> The total counts were something TG needed, and many users request
> them. In fact, we come up with them in MDS by adding up other
> attributes, not by having them reported natively. I'm trying to
> separate out the data we need from the implementation of how it's
> achieved.
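As a rough illustration of deriving totals rather than reporting them natively, the sketch below adds up per-host values. The record layout and key names (`cpus`, `free_cpus`) are assumptions made for the example; this is not how MDS itself is implemented.

```python
def total_counts(hosts):
    """Derive aggregate values from per-host records.

    The totals are computed on demand, so the information provider
    does not need to publish them as separate attributes.
    """
    return {
        "total_nodes": len(hosts),
        "total_cpus": sum(h["cpus"] for h in hosts),
        "free_cpus": sum(h["free_cpus"] for h in hosts),
    }


# Hypothetical per-host records.
hosts = [
    {"name": "node001", "cpus": 2, "free_cpus": 0},
    {"name": "node002", "cpus": 2, "free_cpus": 2},
    {"name": "node003", "cpus": 4, "free_cpus": 1},
]

totals = total_counts(hosts)
print(totals)
```

The consumer only sees the `totals` dictionary, which keeps the question of what data is needed separate from how it is obtained.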
>
> We grouped things into subcluster data because that's how people
> are using the data - no one wants to wade through 1,000 nodes of
> information. That's why those attributes are there for our work
> with TG. This group may instead decide to only have node data,
> which is a discussion we should have. I would argue quite strongly
> that if we want this data to be useful, listing subcluster
> attributes will help. But perhaps that's a second stage.
>
> The unique ids really ARE unique in the data we collect from TG -
> we preface local names with a resource-specific "unique-fying"
> attribute. We need a way to differentiate queue names that are the
> same but at different sites, for example. I would argue quite
> strongly we need unique identifiers associated with all the
> components.
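One way to picture the prefixing scheme described above is the sketch below. The separator, the function name, and the site names are hypothetical; the thread does not say which resource-specific attribute or format TeraGrid actually uses.

```python
def unique_id(resource_prefix, local_name, sep="/"):
    """Build an identifier by prefixing a local name with a
    resource-specific string (hypothetical format)."""
    if not resource_prefix or not local_name:
        raise ValueError("both a resource prefix and a local name are required")
    return f"{resource_prefix}{sep}{local_name}"


# Two sites that both run a queue called "batch".
queues = [
    ("siteA.example.org", "batch"),
    ("siteB.example.org", "batch"),
]

ids = [unique_id(site, name) for site, name in queues]
print(ids)
```

The local queue names collide, but the prefixed identifiers do not, which is what makes it possible to tell them apart once data from several sites is merged.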
>
> -jen
>
>
>
>
> At 14:54 01/03/2006, Dr. Thomas Soddemann wrote:
>> Hi Jen,
>>
>> just a few thoughts on your suggestions:
>>
>> The GLUE schema addresses ComputingElements (CEs) rather than
>> queues. IMHO, queues are bound to a specific LRMS. A grid resource
>> management system like Globus GRAM or UNICORE NJS/TSI can make use
>> of a queue but its details should not be in the queue definition
>> (a queue does not need to know that it is used by a GRAM). In the
>> GLUE case of CEs the definition makes sense since a CE can be seen
>> as a stand-alone unit.
>> I do not see that the GLUE schema expresses what we may need here:
>> a description of queues managed by local resource management
>> systems.
>>
>> - Another tiny thing, do we really need the total counts in the
>> cluster/subcluster data? A simple XPath expression would give it.
>> - SMPsize should be in the host definition, shouldn't it? Same for
>> OS and memory.
>> - Storage information should be allowed for host and cluster.
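For the remark that a simple XPath expression would give the total counts, here is a small sketch. The XML layout and the element and attribute names (`Cluster`, `SubCluster`, `Host`, `cpus`) are invented for the example and do not follow the GLUE schema. In full XPath 1.0 the totals would be `count(//Host)` and `sum(//Host/@cpus)`; Python's standard `xml.etree.ElementTree` supports only a subset of XPath without those functions, so the path selects the hosts and the aggregation is done in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; names are illustrative, not GLUE.
DOC = """
<Cluster name="example">
  <SubCluster name="sc1">
    <Host name="node001" cpus="2"/>
    <Host name="node002" cpus="2"/>
  </SubCluster>
  <SubCluster name="sc2">
    <Host name="node101" cpus="4"/>
  </SubCluster>
</Cluster>
"""

root = ET.fromstring(DOC)

# ".//Host" selects every Host element below the root.
host_elements = root.findall(".//Host")

total_nodes = len(host_elements)
total_cpus = sum(int(h.get("cpus")) for h in host_elements)

print(total_nodes, total_cpus)
```

This works when the consumer holds the full per-host document; a consumer that only receives cluster-level records would still need the totals to be published.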
>>
>> What do you (or the GLUE people) mean by the unique ids? If the
>> hostname is its DN or DNS entry name it certainly is unique. Does
>> a queue really have to have a unique id? Same for the clusters.
>>
>> Cheers,
>> Thomas
>>
>> P.S.: Maybe we should put these things into a UML diagram in order
>> to have a clearer view of the whole picture.
>>
>> Jennifer M. Schopf wrote:
>>> Hello-
>>>
>>> At the grid interop session at GGF I said I'd circulate a
>>> document describing the TeraGrid deployment of MDS4, which
>>> included a possible starting point for a list of attributes that
>>> we could ask resources to advertise. The document is attached
>>> (if you have any comments, or are interested in having your own
>>> MDS4 deployment, let me know!). The (possible) minimal list of
>>> attributes is:
>>>
>>> Queue Data:
>>> Queue name
>>> Unique queue ID
>>> GRAM version
>>> GRAM host name
>>> GRAM port/url
>>> LRMS type LL
>>> LRMS version 3.2.1
>>> Total CPUs 128
>>> Free CPUs 16
>>> Queue status ...
>>> Total jobs 4
>>> Running jobs 4
>>> Waiting jobs 0
>>> Policy max wall clock time
>>> Policy max CPU time
>>> Policy max total jobs
>>> Policy max running jobs
>>>
>>> cluster/subcluster data:
>>> Type (cluster/Subcluster)
>>> Cluster/subcluster Name
>>> cluster/subcluster Unique ID
>>> Processor type
>>> Processor speed
>>> Total memory
>>> Operating system
>>> SMP size
>>> Total nodes
>>> Storage device name
>>> Storage device size
>>> Storage device available space
>>>
>>> Host data
>>> Host Name
>>> Host Unique ID
>>> Node properties
>>>
>>> Note that about 95% of this is currently available as mapped into
>>> the GLUE schema; there are a couple of additions that aren't (total
>>> nodes and node properties, maybe one other, not sure).
>>>
>>> -jen
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Dr. Jennifer M. Schopf
>>> Scientist eInfrastructure Policy Advisor
>>> Distributed Systems Lab National eScience Centre and JISC
>>> Argonne National Laboratory The University of Edinburgh
>>> jms at mcs.anl.gov jms at nesc.ac.uk
>>> http://www.mcs.anl.gov/~jms http://homepages.nesc.ac.uk/~jms
>>
>>
>>
>
>