[Pgi-wg] Meeting - Tuesday, 8 Dec 2009, 16:00 (CET) - Notes

Johannes Watzl watzl at nm.ifi.lmu.de
Tue Dec 8 10:38:26 CST 2009


Dear all,

please see below the notes from today's meeting.

Best,
Johannes

Meeting - Tuesday, 8 Dec 2009, 16:00 (CET)

Participants:
Morris, Balazs, Etienne, Johannes, Aleksander

Agenda:
Discussion of the work of Etienne on the "OGF PGI - AGU Execution
Service Strawman Rendering"

Morris:
go through meeting notes from last meeting

would be good to have gLite participation

Balazs:
we need at least one more person from gLite
real work can only be done with a screensharing tool
just discuss the ideas behind Etienne's work

Balazs (chairing the strawman rendering document discussion):
doc is the latest version of the doc (version 37) by Aleksander on the
PGI GridForge page

page 4
synchronize, merge Etienne's changes with actual model

in createActivity (second sentence)

Etienne:
the initial state of a job is the submitted state

Balazs:
substate?

Etienne:
incoming, waiting, outgoing

input queue, has not been processed yet

submitted waiting: it is being processed
submitted outgoing: id or epr has been given to the job, ready for next
state

change that in a consistent way

Balazs:
combining substates in a certain way -> clarify more
not combine submitted outgoing with response

Etienne:
useful that each state will represent something
if you do not give job id at the end of submitted state, you have to
completely include the submitted state

Balazs:
only the first level states, second and third are optional?

Etienne:
important: at the end of submitted state - job id associated

Balazs:
why at the end?

Etienne:
at least at the end!
could be returned earlier also
it is useful for the execution service for checking if it is possible
(JSDL checking)

Balazs:
not validate JSDL here

when execution service returns response?

Morris:
page 4, comments

in terms of UNICORE it is always possible directly, you don't do brokering

not exactly sure how this relates to the state model
out of the scope of the session

Balazs:
not discussing staging of data
discussing in which state job should be

does not see this written down now

Morris:
it is in the job description document validation

Balazs:
missing the state model
how do you read the job description validation?

Morris:
matchmaking

if there is a must - there is a must

in gLite, first from the broker, then from execution service


Balazs:
first mandatory validation, then response, maybe validation errors
you have to validate first, to check for errors

Etienne:
first mandatory validations:
XML schema, semantic

some validation may be only possible later?!

Balazs:
all validation should be done before
then assign state

you submit a job, execution service does couple of things like validation
the first time you can check the state, when you have the job id

Etienne:
in the beginning did not write the internal second level submit state
added them for some request
ready to remove them

Balazs:
hold point is important

substate of the submitted which is a hold!

Etienne:
first sent on 13 Aug

Balazs:
changeState operation

resume like operations
any kind of hold states

Morris:
need a figure
not the state model figure

should have a step-by-step figure, like in a presentation

Etienne:
agree on the scope of the submitted state
inside the submitted state, as soon as it has a job id, the job should go to
the preprocessing state, where it can be held

submitted state a certain time, in the beginning already job id
can go to a hold substate

Balazs:
preprocessing: different data staging

Etienne:
preprocessing includes data staging
any other actions performed
that's why third level hold
we have another preprocessing hold suspended when user wants to hold
without relationship to staging

Morris:
i.e. preprocessing stage, only one CPU before using the whole machine
not related to data staging

Balazs:
preprocessing is a very complex state

Morris:
during the execution vs. pre and post processing

"before finished"
other requirement for putting it somewhere

Etienne:
postprocessing hold suspended state

Morris:
clearly identify it is postprocessing - job is finished

Etienne:
can go in any order at any time you want

Balazs:
pre, post processing very complex, dangerous
does not see a sequence how the job goes through

Balazs:
really would like to see a linear state model

Etienne:
agree with Balazs, inside the preprocessing hold there is no clear
indication in which order the third level substates are processed
implicitly suppose: can be processed in any order and any number of times
if achieved only by the implementation then it is ok

the order must be well known from the beginning, we have to renew the
state model

Balazs:
when exactly this will happen?
put the job in a hold before any kind of data staging takes place

Etienne:
if we keep model exactly as it is, it is not very precise,
implementations can do it more easily
if we create a too precise model, it will be only implemented by one
middleware
too many details, will limit the chance to have different middlewares
are you sure you want to do that?

Balazs:
just want to stop the job before preprocessing

Morris:
use case?
you don't do data staging then?

Balazs:
just to be sure, the job does no other things
for safety reason

Morris:
pre means before
preprocessing is not data staging

Balazs:
in ARC it is a safety measure to be able to stop the job before running

Morris:
does not really see the difference
user gets the feedback
-> submitted

submitted hold sounds strange

no distinction to preprocessing

Balazs:
submitted hold is the end of submitted

Morris:
seeking to understand the difference

consistency in the state model

automatic resubmission
people may want to specify certain points
use case maybe accounting problems

Etienne:
inside the submitted state, reformulate substates
one where the jobid is associated to the job after

second job already has the job id, user is able to perform an operation
on the job

the jobid is associated in the middle of the submitted state
there can be a submitted hold where the id has already been given to the job

we have to synchronize the state model with the messages exchanged

Aleksander:
how can it be an activity without an id

Morris:
validation steps
createActivity operations should only be successful if validation successful
if validation not successful, no job created, no hold

Etienne:
in figure
rename submitted waiting to submitted validating
add another substate, submitted hold

id must be associated to the job
corresponding messages have to be sent back to the user

validation, creation of id, then send id to user
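
Sketch of this ordering (validate, create the id, then return the id to
the user), in Python. Illustration only: the names create_activity,
ValidationError, the validate callback and the jobs table are assumptions,
not taken from the strawman document, and the substate labels were still
open at this point, so only the first level state is recorded.

```python
import uuid


class ValidationError(Exception):
    """Raised when the job description fails mandatory validation."""


def create_activity(job_description, validate, jobs):
    """Validate first; only a validated job gets an id and a state.

    `validate` is a hypothetical callable returning a list of error strings.
    `jobs` is a dict standing in for the execution service's job table.
    """
    errors = validate(job_description)
    if errors:
        # Validation failed: no job is created, no id, no hold.
        raise ValidationError("; ".join(errors))
    job_id = str(uuid.uuid4())
    # The id exists from here on, so the client can start querying the state.
    jobs[job_id] = {"description": job_description, "state": "submitted"}
    return job_id
```

A failed validation raises before anything is stored, which matches
"if validation not successful, no job created, no hold".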

Morris:
put still the waiting

Etienne:
rename waiting to validating

Balazs:
everything is done, job is ready to go to the next state
preprocessing, delegating, postprocessing

Etienne:
in which order are third level substates processed?

Balazs:
hold blocks
blocking before going to the next state

Etienne:
second level hold substate before outgoing substate?

Balazs:
yes
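
Sketch of a hold that blocks before the job goes to the next state, in
Python. Illustration only: the class Job, its fields and the advance/resume
methods are assumptions; the state names and their order are the ones
listed in these notes (submitted, preprocessing, delegating, postprocessing).

```python
from dataclasses import dataclass

# Linear order as listed in the notes; treating it as strictly linear
# is an assumption of this sketch.
STATES = ["submitted", "preprocessing", "delegating", "postprocessing"]


@dataclass
class Job:
    state: str = "submitted"
    hold_requested: bool = False  # user asked to stop before the next state
    held: bool = False            # job is currently in the hold substate

    def advance(self):
        """Move to the next state unless a hold blocks the transition."""
        if self.hold_requested:
            self.held = True      # enter the hold substate, stay in this state
            return self.state
        i = STATES.index(self.state)
        if i + 1 < len(STATES):
            self.state = STATES[i + 1]
        return self.state

    def resume(self):
        """Release the hold; the next advance() proceeds normally."""
        self.hold_requested = False
        self.held = False
```

With hold_requested set, advance() leaves the job in its current state
until resume() is called, i.e. the hold sits at the end of the state.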

Etienne:
understands the requirement
in fact in the three second level states
preproc, delegating, postproc

second level hold substate corresponds to what Balazs wishes

Balazs:
different states second level hold, third level hold?

Morris:
there might be middlewares without third level

Balazs:
needs preprocessing hold

Morris:
how is the third level flow?

Balazs:
optional: third, second level

Morris:
disagree
second level should be adopted from all

submitted, preprocessing, delegating, postprocessing

i.e. preprocessing hold failed recoverable
not many middlewares can do this

agree with Balazs: aligning the third level of states

Etienne:
first two levels are mandatory

inside submitted: add a hold substate

Morris:
submitted incoming is missing

comparing to the state model

submitted incoming is a state or is it no state?

Balazs:
job id already

Etienne:
no
initial substate of the submitted state
state without a name

remove submitted incoming
at this point no job id yet - cannot be handled
just internal state

Morris:
problems reported in logfiles

Etienne:
expose only states which can be used by the submitter

remove the submitted incoming label - it is not relevant to the submitter

agreement on what is going on in the submitted state.

Morris:
submitted validated

only job id if it is validated

Etienne:
job id is given at validated substate

return to the document
inside the submitted state, the execution service cannot always provide
the location

Balazs:
switch to the data staging issue

data staging on createActivity

pull approach

service can publish information

Etienne:
mail: three ways for message transfer

which way is the best?

Balazs:
client needs the info about the data staging location

Etienne:
have to repeatedly poll until the job comes to manual staging

if there is manual staging the job goes to the preprocessing state, hold

cannot guarantee the consistency of that - to be discussed

perhaps not so easy to link the return to a late state
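
Sketch of the repeated polling described here, in Python. Illustration
only: get_state is a hypothetical callable returning a (state, substate)
pair, and the ("preprocessing", "hold") label for manual staging is an
assumption based on this discussion, not an agreed name.

```python
import time


def wait_for_manual_staging(get_state, job_id, interval=1.0, max_polls=60):
    """Poll until the job reports the (assumed) preprocessing hold state.

    `get_state` is a hypothetical callable: job_id -> (state, substate).
    Returns True if the hold was observed, False if polling gave up.
    """
    for _ in range(max_polls):
        if get_state(job_id) == ("preprocessing", "hold"):
            return True
        time.sleep(interval)
    return False
```

The max_polls bound is there because, as noted above, it cannot be
guaranteed that the job ever reaches that state.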

Morris:
one figure not enough
maybe an animation

hopefully can work on this
urgently needed here!!!

Balazs:
strongly avoid notification


publish the staging location, not be coupled to job state

server needs time to figure out the staging location - publish after
some time
why should this be coupled to a job state??
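
Sketch of publishing the staging location independently of the job state,
in Python. Illustration only: JobInfo, its fields, the two helper
functions and the example URL are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobInfo:
    state: str = "submitted"
    # None until the service has figured out and published the location.
    staging_location: Optional[str] = None


def publish_staging_location(info, location):
    """Record the staging location without touching the job state."""
    info.staging_location = location


def staging_location_known(info):
    """What a client would check when pulling the published information."""
    return info.staging_location is not None


info = JobInfo()
publish_staging_location(info, "gsiftp://example.org/staging/job-1")
```

The client pulls the attribute when it needs it; the state field is left
unchanged by publication.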

Etienne:
in some cases the service has inner knowledge about the storage location
inside the submitted state

Balazs:
job should not be in preprocessing before it knows the staging location

Morris:
compared to job description validation

i.e. we need 2 GB, requirement
how could it still be valid to have it in submitted?

Etienne:
computing element which really owns storage
but if it is divided into sequential parts where submitted parts do not
know about physical storage but of batch systems or storage systems,
there is a problem

Etienne:
at validated an execution service may already have allocated storage

Balazs:
we should edit the text when everybody can see the document

Aleksander:
validation, allocation
data staging after allocation
id should be assigned as soon as job is validated
client receives information, polling or notification

Morris:
continue next time with desktop sharing tool

Etienne:
are these three ways from email consistent?
which ways are preferred?

Aleksander:
resend the mail!

Etienne:
sent on 26 Oct, 4 Dec

Morris:
continue next week







-- 
        _  _ _  _ _  _          Johannes Watzl
        |\/| |\ | |\/|          Institut für Informatik / Dept. of CS
        |  | | \| |  |          Ludwig-Maximilians-Universität München
     ======= TEAM =======       Oettingenstr. 67, 80538 Munich, Germany
                                Room E 005, Phone +49-89-2180-9162
Munich Network Management Team  Email: watzl at nm.ifi.lmu.de
Münchner Netz-Management Team   http://www.nm.ifi.lmu.de/~watzl

