[jsdl-wg] Process Topology

Christopher Smith csmith at platform.com
Thu Apr 21 11:51:49 CDT 2005


On 19/4/05 22:05, "Andreas Savva" <andreas.savva at jp.fujitsu.com> wrote:
>> 
>> If we had a term called "TotalCPUCount" for the entire job, I could do:
>> 
>> 4. TotalCPUCount == 32
>>   -> LSF : "-n 32"
>>   -> PBS : "not sure how to express"
> 
> Using the current terminology (bear with me) I would translate it as
> 
> <Application>
>     ...
>     <jsdl-posix:POSIXApplication>
>         ...
>         <ProcessCount>32</ProcessCount>
>     </jsdl-posix:POSIXApplication>
> </Application>
> <Resource>
>    ...
>    <ResourceCount>
>       <LowerBoundedRange>1.0</LowerBoundedRange>
>    </ResourceCount>
> </Resource>
> 
This works ... I didn't think of using the range on ResourceCount. :-)

> I translate "No tiling constraints" as meaning TileSize=1, and since it
> is the default value I have omitted it.
> 
Nope ... "no tiling constraints" means "no tiling constraints" (i.e.
TileSize undefined). TileSize=1 is a tiling constraint.
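
To make the distinction concrete, here is a minimal Python sketch. The
describe_tiling helper and the choice of None to model "TileSize undefined"
are assumptions made for illustration only; neither is part of JSDL.

```python
from typing import Optional


def describe_tiling(tile_size: Optional[int]) -> str:
    """Describe the constraint implied by a (hypothetical) TileSize value.

    None models "TileSize undefined". Any integer, including 1, counts as
    an explicit tiling constraint.
    """
    if tile_size is None:
        return "no tiling constraint"
    if tile_size < 1:
        raise ValueError("TileSize must be a positive integer")
    return f"tiling constraint: TileSize={tile_size}"


print(describe_tiling(None))  # no tiling constraint
print(describe_tiling(1))     # tiling constraint: TileSize=1
```

The point is only that "undefined" and "1" have to stay distinguishable, so
a default of 1 cannot stand in for "no constraint".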

> 
> We need TileSize. I agree that the default ResourceCount=1 definition
> should be changed to 'undefined' as you mention in a subsequent email.
> 
> So there are two issues:
> 1. Whether the topology requirements should be in the Application
> section or not. If they are in the Application section then the terms
> used should not be resource-flavored, i.e., not TotalCPUCount but
> something else.
> 2. How to rename 'ProcessCount' to eliminate the confusion with the term
> 'process'.
> 
> My answer to (1) is to keep these in the Application section. I am not
> sure how to rename ProcessCount though.
> 
Ha ... so my answer to (1) is to put them somewhere else (near the Resource
section). My view on this is that TotalCPUCount and TileSize are resource
requirements on the global allocation, and not really tied to the
application at all (i.e. they apply equally to POSIX applications, a
clustered service instance, etc.).

I basically like to categorize things based on whether they are associated
with allocating resources, or whether they are associated with binding the
"work unit" to the allocation, since these are often two separate phases of
getting work done in a batch system or other execution management system.

You can also subcategorize allocation requirements based on whether they
apply globally to the entire allocation (e.g. TotalCPUCount) or whether they
apply at an individual resource level (e.g. CPUCount). I don't think we have
the notion of the former, do we?
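
A rough Python sketch of that split follows. The class names, field names
and the to_lsf_args helper are all hypothetical; they are not JSDL terms and
not any scheduler's API. The only translation encoded is the one from use
case 4 in this thread (TotalCPUCount == 32 -> LSF "-n 32"). Tiled cases are
deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlobalAllocation:
    """Requirements on the allocation as a whole (hypothetical grouping)."""
    total_cpu_count: Optional[int] = None  # None means undefined
    tile_size: Optional[int] = None        # None means no tiling constraint


@dataclass
class PerResourceAllocation:
    """Requirements on each individual resource (hypothetical grouping)."""
    cpu_count: Optional[int] = None


def to_lsf_args(alloc: GlobalAllocation) -> List[str]:
    """Translate only the untiled TotalCPUCount case into LSF's "-n N"."""
    if alloc.tile_size is not None:
        # Mapping tiling constraints is outside the scope of this sketch.
        raise NotImplementedError("tiling constraints are not handled here")
    if alloc.total_cpu_count is None:
        return []
    return ["-n", str(alloc.total_cpu_count)]


print(to_lsf_args(GlobalAllocation(total_cpu_count=32)))  # ['-n', '32']
```

Keeping the global requirements in their own structure means the same
description could be reused whether the work unit is a POSIX application or
something else.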

> Could I also ask you to let me know if the examples in section 8 of the
> spec actually make sense or not?
> 
Sort of? I won't be sure until we agree on the terminology changes.

I actually think that my 4 use cases cover it pretty well (from the
allocation point of view for parallel jobs), although some examples could be
used to illustrate the use of CPUCount ... perhaps in conjunction with an
"Exclusive" flag. 

-- Chris





