[ogsa-hpcp-wg] Minutes for OGSA HPC Profile telecon (Aug 25 2006)
Marty Humphrey
humphrey at cs.virginia.edu
Sat Aug 26 10:00:59 CDT 2006
OGSA HPC Profile Minutes - Fri Aug 25 2006
Participants:
Marty Humphrey (University of Virginia)
Marvin Theimer (MSFT)
Glenn Wasson (University of Virginia)
Michel Drescher (Fujitsu Europe)
Minutes: Marty Humphrey
* Summary of Actions:
AI-HPCP-0825a: Everyone will re-read Donal's version of the HPC
Profile Application
in the JSDL section of Gridforge before the Friday Sept 1
call.
AI-HPCP-0825b: Check with BES doc and/or WG to determine the
"final word"
on issues of idempotency and
subscriptions (and others).
AI-HPCP-0825c: Marvin will suggest phrasing re: vectors to the
email
group for consensus.
AI-HPCP-0825d: Marvin will modify the HPC Profile doc to resolve
Chris' comments and
restrict the total CPU count,
total resource count,
and individual CPU count to integers.
AI-HPCP-0825e: Marty to send email to BES/JSDL/HPCP: "We are
looking
for prototype and interop participants."
* Previous minutes: Approved
* Review of Actions
AI-HPCP-0818a: Donal Fellows to review HPC Profile Application.
DONE. On Mon Aug-21, Donal Fellows uploaded a new
version to the JSDL space in
Gridforge. The main modification is a more
comprehensive security section.
Our group discussion in today's telecon on this
subject resulted in
AI-HPCP-0825a:
AI-HPCP-0825a: "Everyone" will re-read it (in the
JSDL section of
Gridforge). On the Sept 1 call, we will
determine (via vote)
whether we believe that it's ready for
ratification via the JSDL
group and the OGF at large.
In short, Michel suggested that we keep it in the
HPCP
gridforge section while it is being actively worked
on and transition it to
the JSDL section when it's ready for "normative
ratification". Because it's
currently in the JSDL forge section, we will only
"pull it back" to HPCP if,
upon reading it, we decide (collectively) that it
needs modifications (see
AI-HPCP-0825a).
AI-HPCP-0818b: Marvin Theimer will add paragraph to HPC Basic
Profile
stating: "all other elements of the JSDL
1.0 MAY appear
but MAY also result in a 'not
implemented' fault, but
only the list here must be supported"
DONE. This is discussed in this telecon, below.
AI-HPCP-0818c: Marvin Theimer will add first draft of section 4
to
HPC Basic Profile that restricts BES
operations to
singletons
DONE. This is discussed in this telecon, below.
* Go over changes to HPC Profile that Marvin sent out
-----------------------------------------------------
Marvin: Before giving an overview of the modifications, there's
an issue/question
for the BES guys: what about idempotency and
subscriptions (and perhaps other
issues)? That is, these were either in a previous
version of the BES spec *OR*
explicitly mentioned to be included in a future
version (but they are not mentioned
in the current BES spec). We understand that this
will appear as an option in
the BES spec, but this has implications for the HPC
Profile (e.g., we might like
to restrict this and/or claim that such
considerations are out of scope).
Marty: Yes, we knew this going in -- that JSDL is "easier" in
that it's fixed in
stone, while BES is a moving target. This makes
profiling difficult.
AI-HPCP-0825b: Check with BES doc and/or WG to
determine the "final word"
on this issue.
Marvin gives an overview of his modifications, beginning with
the proposed text
in the Intro paragraph to Section 3 (consensus:
wording is fine).
Marvin described Mods to Section 4: changed the intro, and
vectors must be exactly
length 1.
The group discussed in depth the issue of the treatment of
vectors.
Initially, the group wanted to insert something akin
to Michel's
proposed text in email 8/25.
Glenn: isn't there potential confusion: the client
sends a vector of
size greater than 1... what exactly does
the service do? Does
the service have a choice which one to
choose?
Marvin: BES needs to deal with this case, although
there are two different
classes of faults. There is a BES fault
- "error in
inputs (the following inputs -- activity
IDs -- were unknown)" and
an HPC Profile fault ("the vector is too
large")
Glenn: suggests that the base profile should say 1
(and return a size of 1)
and greater than 1 is a fault;
extension: here's how it maps
differing size inputs to differing size
outputs
Marvin: asks for clarification: extension to BES,
right? this idea that
"I don't return exactly what was
specified" is a BES concept
Glenn: yes
Michel: saying that you'll return a vector of size 1
restricts composability;
instead, return a vector of the same
size
Marvin: Good point -- it may handle things of size
greater than one -- if
so, then it must return a vector of the
same size. But it may fault.
It MUST handle a vector of size 1. If it
handles a vector of size
larger than one, then it would publish
an attribute indicating so.
Michel suggests wording that the client "SHOULD" try
to use a vector of size 1
but the group consensus is (while the
intent is good) that this is not
the best wording because it could be
interpreted to make clients ONLY
work in terms of vectors of size 1 --
this is not the intent and might
restrict clients.
AI-HPCP-0825c: Marvin will suggest
phrasing and send it to the email
group for consensus. Marty
to make this into a tracker.
this concluded the discussion of vectors.
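The vector rule as Marvin summarized it can be sketched roughly as follows. This is only an illustration of the discussion, not agreed wording (the final phrasing is pending AI-HPCP-0825c). The class, the fault name, and the attribute name are hypothetical, and empty vectors, which the call did not address, are simply passed through.

```python
from dataclasses import dataclass
from typing import Callable, List


class VectorTooLargeFault(Exception):
    """Hypothetical HPC Profile fault: the input vector is too large."""


@dataclass
class ProfiledService:
    # Hypothetical published attribute: True if this service accepts
    # input vectors with more than one element.
    supports_multi_element_vectors: bool = False

    def apply(self, activity_ids: List[str], op: Callable[[str], str]) -> List[str]:
        # Every compliant service MUST handle a vector of size 1.
        # A larger vector MAY be rejected with a profile-level fault.
        if len(activity_ids) > 1 and not self.supports_multi_element_vectors:
            raise VectorTooLargeFault(
                f"vector of size {len(activity_ids)} is not supported"
            )
        # If the vector is handled, the result has the same size as the input.
        return [op(activity_id) for activity_id in activity_ids]
```

A client could check the published attribute before sending a larger vector, or send it anyway and fall back to size-1 requests if the fault comes back.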
Marty: What about total CPU count? "non-exact" clearly has
value, but it just
seems too complicated for interop, for compliance
Marvin: In addition, not all schedulers implement a range, so
let's make it exact.
Marty: Section 3.2.5.8. What is "total resource count"?
Michel: this is for "tiling" - e.g., 5x10 processes.
Marty: Let's make this exact as well. Group consensus is that
this is reasonable.
Marvin: some of these issues can arise and be resolved during
implementations as well
Marty: what about Chris' comments? Re: RAM... "total physical
memory" is certainly easier.
Michel: JSDL had long discussions... it comes down to
requirements and capabilities.
Marvin: this must be clarified in the HPC profile. in our JSDL
section, make these
appear as requirements. That is, the phrasing of the
text in the JSDL section
of the HPC Profile should clearly indicate that
they're *requirements* (not
capabilities). In the BES section of the HPC
profile, make the text clearly
reflect that these are capabilities (not
requirements).
Michel: Section 3 of HPCP is clear. Section 4 needs
clarification: if BES exposes this
via JSDL terms, then we need to clarify that these
are capabilities, which
are static values.
Marvin: 3.2.5.6 and 3.2.5.7 need to have clarifications changed.
Glenn: we need to restrict to integer in the total CPU count,
total resource count
and individual CPU count
Group consensus on this
AI-HPCP-0825d: Marvin will resolve Chris' comments
by making clarifications that
these are requirements. In addition,
Marvin will change the document to
restrict to integer the total CPU count,
total resource count
and individual CPU count
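As a rough illustration of the "exact integer" restriction agreed above, a service might validate the three counts like this. It is a sketch under assumptions: the helper name is hypothetical, the count is assumed to arrive already parsed as a number, and nothing here is taken from the profile text itself.

```python
def as_exact_integer(name: str, value: float) -> int:
    """Return value as an int, rejecting anything that is not a whole number.

    `name` is only used in error messages (e.g. "TotalCPUCount").
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{name} must be numeric, got {type(value).__name__}")
    # float.is_integer() is False for fractional values, inf, and nan.
    if not float(value).is_integer():
        raise ValueError(f"{name} must be an integer, got {value!r}")
    return int(value)


# Hypothetical request using the three counts discussed in the call.
request = {"TotalCPUCount": 8.0, "TotalResourceCount": 50, "IndividualCPUCount": 2}
counts = {name: as_exact_integer(name, value) for name, value in request.items()}
```

Ranges are deliberately not modeled here, since the group agreed to make these values exact.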
* Discuss prototyping and interop opportunities and next-steps.
---------------------------------------------------------------
The group agreed that the HPC Profile document is reasonably
close to completion, so it
was suggested that we start developing prototypes
and perform interop testing
to further disambiguate the document (and start
developing a compliance suite)
Michel: this document should get into public comments first, so
we should wait before
creating implementations
Marvin: I propose something slightly different... I propose that
we put it up for
public comment and in parallel do implementations...
this is primarily
due to time constraints.
Marty: Public comment is good, of course, but can we do anything
to ensure that
the vendors are engaged/committed?
Marvin: I will send it directly to Sun, Altair, Condor, Globus,
ESI, etc....
"please please review this carefully and
contribute..."
Marty: if we get Marvin's changes in the BES (and "security
section"), we will
SUSPEND development of the HPCP document
AI-HPCP-0825e: Marty to send email to BES/JSDL/HPCP:
"We are looking
for prototype and interop participants.
We will start to
discuss a compliance suite. This will be
the agenda for next week's
call. A first round of interop will be
attempted BEFORE GGF18
(exactly what? we're not sure). Two
phases: phase 1: quick,
phase 2: for interop after GGF18.
Discussion of extensions
can be accommodated RIGHT NOW in parallel
to this -- to keep
the momentum going."
Meeting concluded. Next call: Friday Sep 1 2006 7am Pacific time.