[ogsa-hpcp-wg] Minutes for OGSA HPC Profile telecon (Aug 25 2006)

Marty Humphrey humphrey at cs.virginia.edu
Sat Aug 26 10:00:59 CDT 2006


 

OGSA HPC Profile Minutes - Fri Aug 25 2006

 

Participants:

            Marty Humphrey            (University of Virginia)

            Marvin Theimer             (MSFT)

            Glenn Wasson               (University of Virginia)

            Michel Drescher            (Fujitsu Europe)

 

Minutes: Marty Humphrey

 

 

* Summary of Actions:

 

            AI-HPCP-0825a: Everyone will re-read Donal's version of the HPC
Profile Application

                        in the JSDL section of Gridforge before the Friday Sept 1
call.

 

            AI-HPCP-0825b: Check with BES doc and/or WG to determine the
"final word"

                                    on issues of idempotency and
subscriptions (and others).

 

            AI-HPCP-0825c: Marvin will suggest phrasing re: vectors to the
email

                                    group for consensus. 

 

            AI-HPCP-0825d: Marvin will modify HPC Profile doc to resolve
Chris' comments,  

                                    restrict to integer the total CPU count,
total resource count 

                                    and individual CPU count.

 

            AI-HPCP-0825e: Marty to send email to BES/JSDL/HPCP: "We are
looking 

                                    for prototype and interop participants."

 

 

* Previous minutes: Approved 

 

* Review of Actions

 

            AI-HPCP-0818a: Donal Fellows to review HPC Profile Application.

 

                        DONE. On Mon Aug-21, Donal Fellows uploaded a new
version to the JSDL space in 

                        Gridforge. The main modification is a more
comprehensive security section.

                        Our group discussion in today's telecon on this
subject resulted in 

                        AI-HPCP-0825a:

 

                        AI-HPCP-0825a: "Everyone" will re-read it (in the
JSDL section of 

                                    Gridforge). On the Sept 1 call, we will
determine (via vote) 

                                    if we believe that it's ready for
ratification via the JSDL 

                                    group and the OGF at large.

 

                        In short,  Michel suggested that we keep it in the
HPCP

                        gridforge section while it is being actively worked
on and transition it to 

                        the JSDL section when it's ready for "normative
ratification". Because it's 

                        currently in the JSDL forge section, we will only
"pull it back" to HPCP if, 

                        upon reading it, we decide (collectively) that it
needs modifications (see

                        AI-HPCP-0825a).

 

            AI-HPCP-0818b: Marvin Theimer will add paragraph to HPC Basic
Profile

                                    stating: "all other elements of the JSDL
1.0 MAY appear

                                    but MAY also result in a 'not
implemented' fault, but

                                    only the list here must be supported"

                        

                        DONE. This is discussed in this telecon, below.  

 

            AI-HPCP-0818c: Marvin Theimer will add first draft of section 4
to

                                    HPC Basic Profile that restricts BES
operations to

                                    singletons

 

                        DONE. This is discussed in this telecon, below.  

 

* Go over changes to HPC Profile that Marvin sent out 

-----------------------------------------------------

 

            Marvin: Before giving an overview of the modifications, there's
an issue/question 

                        to BES guys: what about idempotency and
subscriptions (and perhaps other

                        issues)? That is, this was either in a previous
version of the BES space *OR*

                        explicitly mentioned to be included in a future
version (but it's not mentioned

                        in the current BES spec). We understand that this
will appear as an option in

                        the BES spec, but this has implications on the HPC
Profile (e.g., we might like

                        to restrict this and/or claim that such
considerations are out-of-scope).

            Marty: Yes, we knew this going in -- that JSDL is "easier" in
that it's fixed in

                        stone, while BES is a moving target. This makes
profiling difficult.

 

                        AI-HPCP-0825b: Check with BES doc and/or WG to
determine the "final word"

                                    on this issue. 

 

            Marvin gives an overview of his modifications, beginning with
the proposed text

                        in the Intro paragraph to Section 3 (consensus:
wording is fine).

            Marvin described Mods to Section 4: changed the intro, and
vectors must be exactly 

                        length 1.

            The group discussed in depth the issue of the treatment of
vectors.

                        Initially, the group wanted to insert something akin
to Michel's 

                                    proposed text in email 8/25

                        Glenn: isn't there potential confusion: the client
sends a vector of 

                                    size greater than 1... what exactly does
the service do? Does 

                                    the service have a choice which one to
choose?

                        Marvin: BES needs to deal with this case, although
there are two different

                                    classes of faults. There is a BES fault
- "error in 

                                    inputs (the following inputs -- activity
IDs -- were unknown)" and 

                                    an HPC Profile fault ("the vector is too
large")

                        Glenn: suggests that the base profile should say 1
(and return a size of 1)

                                    and greater than 1 is a fault;
extension: here's how it maps 

                                    differing size inputs to differing size
outputs

                        Marvin: asks for clarification: extension to BES,
right? this idea that 

                                    "I don't return exactly what was
specified" is a BES concept       

                        Glenn: yes

                        Michel: saying that you'll return a vector of size 1
restricts composability;

                                    instead, return a vector of the same
size

                        Marvin: Good point -- it may handle things of size
greater than one -- if 

                                    so, then it must return a vector of the
same size. But it may fault. 

                                    It MUST handle a vector of size 1. If it
handles a vector of size 

                                    larger than one, then it would publish
an attribute indicating so.

                        Michel suggests wording that the client "SHOULD" try
to use a vector of size 1

                                    but the group consensus is (while the
intent is good) that this is not

                                    the best wording because it could be
interpreted to make clients ONLY

                                    work in terms of vectors of size 1 --
this is not the intent and might

                                    restrict clients.

 

                                    AI-HPCP-0825c: Marvin will suggest
phrasing and send it to the email

                                                group for consensus. Marty
to make this into a tracker.

 

                        This concluded the discussion of vectors.
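
A minimal sketch of the vector rule as discussed above, pending the agreed phrasing from AI-HPCP-0825c. Every name in it (VectorTooLargeFault, handle_activity_vector, max_vector_size) is hypothetical and does not come from the BES spec or the HPC Profile draft. It only illustrates the shape of the rule: a vector of size 1 must be handled, and a larger vector either faults or yields a response vector of the same size.

```python
from typing import Callable, List


class VectorTooLargeFault(Exception):
    """Hypothetical HPC Profile fault: the request vector is too large."""


def handle_activity_vector(
    activity_ids: List[str],
    handle_one: Callable[[str], str],
    max_vector_size: int = 1,
) -> List[str]:
    """Apply handle_one to each element, enforcing the size rule.

    max_vector_size stands in for the attribute a service would publish
    if it supports vectors larger than 1 (assumed name, not from any spec).
    """
    if max_vector_size < 1:
        # The base profile requires support for vectors of size 1.
        raise ValueError("a compliant service must handle vectors of size 1")
    if len(activity_ids) > max_vector_size:
        raise VectorTooLargeFault(
            f"vector of size {len(activity_ids)} exceeds {max_vector_size}"
        )
    # One response per request element, so input and output sizes match.
    return [handle_one(a) for a in activity_ids]
```

With the default limit, a request of size 2 raises VectorTooLargeFault; a service that advertises max_vector_size=2 returns two results for the same request.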

            

            Marty: What about total CPU count? "non-exact" clearly has
value, but it just 

                        seems too complicated for interop, for compliance

            Marvin: In addition, not all schedulers implement a range, so
let's make it exact. 

            Marty: Section 3.2.5.8. What is "total resource count"? 

            Michel: this is for "tiling" - e.g., 5x10 processes.

            Marty: Let's make this exact as well. Group consensus is that
this is reasonable. 

            Marvin: some of these issues can arise and be resolved during
implementations as well

 

            Marty: what about Chris' comments? Re: RAM... "total physical
memory" is certainly easier.

            Michel: JSDL had long discussions... it comes down to
requirements and capabilities. 

            Marvin: this must be clarified in the HPC profile.  in our JSDL
section, make these 

                        appear as requirements. That is, the phrasing of the
text in the JSDL section 

                        of the HPC Profile should clearly indicate that
they're *requirements* (not

                        capabilities). In the BES section of the HPC
profile, make the text clearly

                        reflect that these are capabilities (not
requirements). 

            Michel: Section 3 of HPCP is clear. Section 4 needs
clarification: if BES exposes this 

                        via JSDL terms, then we need to clarify that these
are capabilities, which 

                        are static values.

            Marvin: 3.2.5.6 and 3.2.5.7 need their clarifying text changed.

                        

            Glenn: we need to restrict to integer in the total CPU count,
total resource count 

                        and individual CPU count 

            Group consensus on this 

 

                        AI-HPCP-0825d: Marvin will resolve Chris' comments
by making clarifications that 

                                    these are requirements. In addition,
Marvin will change the document to 

                                    restrict to integer the total CPU count,
total resource count 

                                    and individual CPU count 
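
The integer-only, exact-value restriction agreed above can be sketched as a simple check. This is an illustration only: the function name and the dictionary keys are hypothetical, and the profile itself expresses these values as JSDL elements, not Python values.

```python
def check_exact_integer_counts(counts: dict) -> dict:
    """Accept only single integer values; reject ranges and fractions."""
    checked = {}
    for name, value in counts.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an exact integer, got {value!r}")
        checked[name] = value
    return checked
```

For example, {"total_cpu_count": 8} passes, while a range such as (4, 8) or a fractional value such as 2.5 is rejected.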

 

* Discuss prototyping and interop opportunities and next-steps.

---------------------------------------------------------------         

 

            The group agreed that the HPC Profile document is reasonably
close to completion, so it 

                        was suggested that we start developing prototypes
and perform interop testing 

                        to further disambiguate the document (and start
developing a compliance suite)

            Michel: this document should get into public comments first, so
we should wait before

                        creating implementations

            Marvin: I propose something slightly different... I propose that
we put up for 

                        public comment and in parallel do implementations...
this is primarily 

                        due to time constraints.

 

            Marty: Public comment is good, of course, but can we do anything
to ensure that 

                        the vendors are engaged/committed?

            Marvin: I will send it directly to Sun, Altair, Condor, Globus,
ESI, etc.... 

                        "please please review this carefully and
contribute..."

 

            Marty: if we get Marvin's changes in the BES (and "security
section"), we will 

                        SUSPEND development of the HPCP document

 

                        AI-HPCP-0825e: Marty to send email to BES/JSDL/HPCP:
"We are looking 

                                    for prototype and interop participants."
We will start to 

                                    discuss a compliance suite. This will be
the agenda for next week's

                                    call. The first round of interop will be
attempted BEFORE GGF18 

                                    (exactly what? we're not sure). Two
phases: phase 1: quick, 

                                    phase 2: for interop after GGF18.
Discussion of extensions 

                                    can be accommodated RIGHT NOW in parallel
to this -- to keep 

                                    the momentum going.

 

Meeting concluded. Next call: Friday Sep 1 2006 7am Pacific time.

 

 

 

 

 

 

 
