[Pgi-wg] Meeting - Tuesday, 8 Dec 2009, 16:00 (CET) - Notes

Etienne URBAH urbah at lal.in2p3.fr
Tue Dec 8 13:29:00 CST 2009


Johannes and all,

Many thanks to Johannes for his notes.

Below, I have improved the wording of :

-  what I said,

-  what I understood other participants said.

Best regards.

Etienne URBAH


Meeting - Tuesday, 8 Dec 2009, 16:00 (CET)

Participants:
Morris, Balazs, Etienne, Johannes, Aleksander

Agenda:
   Discussion of the work of Etienne on the "OGF PGI - AGU Execution 
Service Strawman Rendering"

Morris:
   Go through meeting notes from last meeting
   Would be good to have gLite participation

Balazs:
   We need at least one more person from gLite
   Real work can only be done with a screensharing tool
   Just discuss the ideas behind Etienne's work

Balazs (chairing the strawman rendering document discussion):
   Doc is the latest version of the doc (version 37) by Aleksander on 
the PGI GridForge page
   Page 4
   Synchronize, merge Etienne's changes with the current model

   Inside createActivity (second sentence)

Etienne:
   The initial state of a job is the submitted state

Balazs:
   Substates ?

Etienne:
   Substates of 'Submitted' state are :  Incoming, Waiting, Outgoing
   - Incoming = Job inside input queue, has not been processed yet
   - Waiting  = JSDL is being processed
   - Outgoing = JobId or EPR has been given to the job, ready for next
state
   Any change must be performed in a consistent way

Balazs:
   Combining substates in a certain way -> clarify more
   Not combine submitted outgoing with response

Etienne:
   It is useful that each state really represents something
   If you do not give JobId at the end of 'Submitted' state, you have to
completely include the 'Submitted' state inside the 'Pre-processing' state.

Balazs:
   Only the first level states are mandatory, the second and third levels are optional ?

Etienne:
   Important: at the end of submitted state - JobId associated

Balazs:
   Why at the end ?

Etienne:
   At latest at the end !
   Could be returned earlier also
   It is useful for the execution service for checking if it is possible
(JSDL checking)

Balazs:
   Not validate JSDL here
   When does execution service return response ?

Morris:
   Page 4, comments
   - In terms of UNICORE it is always possible directly, you don't do 
brokering
   - Not exactly sure how this relates to the state model
   - Out of the scope of the session ?

Balazs:
   Now, we are not discussing staging of data
   We are discussing in which state job should be
   I do not see this written down now

Morris:
   It is in the job description document validation

Balazs:
   Missing the state model
   How do you read the job description validation?

Morris:
   Matchmaking
   If there is a must - there is a must
   In gLite, first from the broker, then from execution service

Balazs:
   First mandatory validation, then response containing maybe validation 
errors
   You have to validate first, to check for errors

Etienne:
   First mandatory validations:  XML, Schema, Semantic,
   Some validation may be only possible later ?

Balazs:
   All validation should be done before, then assign state
   You submit a job, execution service does couple of things like validation
   The first time the submitter can check the state is when he/she has 
the JobId

Etienne:
   At the beginning, I did not write the second level substates of 
'Submitted'.
   I added them after some request, and I am ready to remove them.

Balazs:
   Hold point is important :
   Substate of 'Submitted' which is a 'Hold' !

Etienne:
   Concerning the 'Submitted:Hold' substate, I sent a mail on 13 August

Balazs:
   changeState operation
   resume like operations
   any kind of hold states

Morris:
   We need a figure :
   - not only the state model figure
   - but also a step-by-step figure, like in a presentation

Etienne:
   We have to agree on the scope of the 'Submitted' state :
   Inside the 'Submitted' state, as soon as a job has its JobId, the job 
should go to the 'Pre-processing' state, where it can be put in a 'Hold' 
substate

Balazs:
   Is 'Pre-processing' different from data staging ?

Etienne:
   'Pre-processing' includes data staging
   Any other actions can be performed
   That's why there are third level 'Hold' substates
   There is 'Pre-processing:Hold:Suspended' when user wants 'Hold'
without relationship to staging

Morris:
   i.e. preprocessing stage, only one CPU before using the whole 
machine not related to data staging

Balazs:
   'Pre-processing' is very complex state

Morris:
   During the execution vs. pre and post processing
   "before finished"
   Other requirement for putting it somewhere

Etienne:
   There is also the 'Post-processing:Hold:Suspend' state

Morris:
   Clearly identify it is 'Post-processing' : Job is finished

Etienne:
   Inside second level 'Hold' substates, third level substates can occur 
in any order at any time you want

Balazs:
   Pre, post processing very complex, dangerous
   I do not see a sequence how the job goes through
   I really would like to see a linear state model

Etienne:
   I agree with Balazs, inside the preprocessing hold there is no clear 
indication in which order the third level substates are processed
   I implicitly suppose that they can be processed in any order and any 
number of times
   - If it is achieved only by the implementation, then it is OK
   - If the order must be well known from the beginning, we have to 
completely review the state model

Balazs:
   When exactly will this happen?
   Put the job in 'Hold' state before any kind of data staging takes place

Etienne:
   If we keep the model exactly as it is, it is not very precise, and 
several middlewares can implement it more easily.
   If we create a too precise model, it will be only implemented by one
middleware.
   Too many details will limit the chance to have different middlewares. 
Are you sure you want to do that ?

Balazs:
   Just want to stop the Job before 'Pre-processing'

Morris:
   Use case?
   You don't do data staging then?

Balazs:
   Use case :  For safety reasons, the Submitter just wants to be sure 
that the job does not perform anything.

Morris:
  'Pre' means before
  'Pre-processing' is not data staging

Balazs:
   Inside ARC it is a safety measure to be able to stop the job before 
running

Morris:
   I do not really see the difference
   User gets the feedback -> 'Submitted'
   'Submitted:Hold' sounds strange
   No distinction with 'Pre-processing'

Balazs:
   'Submitted:Hold' is the end of 'Submitted'

Morris:
   I am seeking to understand the difference ...
   Consistency in the state model
   In the case of 'automatic resubmission', people may want to specify 
certain 'Hold' points
   Use case maybe accounting problems

Etienne:
   Inside the submitted state, I propose to change the list of substates :
   Create one substate where the JobId is associated to the Job
   When the Job already has its JobId, the submitter is able to perform 
operations on the job
   If the JobId is associated to the Job in the middle of the 
'Submitted' state, there can be a 'Submitted:Hold' substate where the 
JobId has already been given to the Job
   We have to synchronize the state model with the messages exchanged

Aleksander:
   How can it be an activity without a JobId ?

Morris:
   Validation steps
   createActivity operations should only be successful if validation is 
successful
   If validation is not successful, no Job is created, no 'Hold' is possible
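
The rule stated here (createActivity succeeds only if validation succeeds, otherwise no Job exists) can be sketched in Python. This is only an illustration and not part of the strawman: the names create_activity, ValidationError and non_empty, the list of validator callables, and the JobId format are all hypothetical.

```python
import uuid


class ValidationError(Exception):
    """Raised when the job description fails a validation step (hypothetical)."""


def create_activity(job_description, validators):
    """Return a new JobId only if every validation step succeeds.

    `validators` is an ordered list of callables that raise ValidationError
    on failure. If any step fails, no Job is created and no JobId exists,
    so the job can never be put on 'Hold'.
    """
    for validate in validators:
        validate(job_description)  # may raise ValidationError
    # Validation succeeded: only now is a JobId associated to the Job.
    return str(uuid.uuid4())  # assumed JobId format, for illustration only


def non_empty(job_description):
    # Toy stand-in for the real XML / Schema / Semantic checks.
    if not job_description.strip():
        raise ValidationError("empty job description")
```

With this shape, a Submitter that receives a JobId knows that validation has already passed, and a failed call leaves nothing behind to hold or resume.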

Etienne:
   Inside the 'Submitted' state, I propose to :
   - rename 'Submitted:Waiting' to 'Submitted:Validating'
   - add another substate, 'Submitted:Hold'
   JobId must be associated to the Job
   Corresponding messages have to be sent back to the Submitter
   Sequence :  Validation, Creation of JobId, then send JobId to Submitter

Morris:
   Still keep the 'Waiting' substate

Etienne:
   No, rename 'Waiting' to 'Validating'

Balazs:
   When everything is done, the job is ready to go to the next state
   Pre-processing, Delegated, Post-processing

Etienne:
   In which order should the third level substates be processed?

Balazs:
   'Hold' blocks :  blocking before going to the next state

Etienne:
   Second level 'Hold' substate before outgoing substate ?

Balazs:
   Yes

Etienne:
   I understand the requirement
   In fact, inside the 3 first level states preproc, delegated, 
postproc, there already is a second level 'Hold' substate corresponding 
to what Balazs wishes

Balazs:
   Different states ?  Second level 'Hold', third level 'Hold' ?

Morris:
   There might be middlewares without third level

Balazs:
   needs preprocessing hold

Morris:
   How is the third level flow ?

Balazs:
   Third and second level are optional

Morris:
   I disagree :
   Second level should be adopted by all
   submitted, preprocessing, delegating, postprocessing
   i.e. preprocessing hold failed recoverable
   not many middlewares can do this
   I agree with Balazs: aligning the third level of states

Etienne:
   I had written that only the first level was mandatory, but I agree to 
write that the first two levels are mandatory
   Besides, inside 'Submitted', I will add a 'Hold' substate
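
The level convention discussed here can be illustrated with a small Python sketch. It assumes, as in these notes, that a state is written as colon-separated levels (for example 'Pre-processing:Hold:Suspended') and that the first two levels are mandatory. The function names and the constant are hypothetical.

```python
MANDATORY_LEVELS = 2  # assumption: the first two levels are mandatory


def split_levels(state):
    """Split 'Pre-processing:Hold:Suspended' into its levels."""
    return state.split(":")


def mandatory_part(state):
    """Keep only the levels that every middleware would have to report."""
    return ":".join(split_levels(state)[:MANDATORY_LEVELS])


print(mandatory_part("Pre-processing:Hold:Suspended"))  # Pre-processing:Hold
print(mandatory_part("Submitted"))                      # Submitted
```

A middleware without third level substates would then simply report the mandatory part of each state.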

Morris:
   'Submitted:Incoming' is missing when comparing to the state model
   Is 'Submitted:Incoming' a substate or is it not?

Balazs:
   For a substate, the JobId must already exist

Etienne:
   I agree with Balazs :
   The initial substate of the 'Submitted' state should have no name
   I will remove the 'Submitted:Incoming' label, because at this point 
there is no JobId yet - The Job cannot be handled by the Submitter, it 
is just an internal substate

Morris:
   But we should take into account problems reported in logfiles

Etienne:
   Such problems are implementation-specific.
   We should expose only states which can be used by the Submitter
   So I will remove the 'Submitted:Incoming' label, which is not 
relevant for the Submitter
   Is there now an agreement on what is going on in the submitted state ?

Morris:
   'Submitted:Validating' should be renamed as 'Submitted:Validated' 
because there is a JobId only if the JSDL has already been validated.

Etienne:
   I agree, the JobId is given at 'Submitted:Validated' substate

   Now, I propose to return to the document
   Inside the 'Submitted' state, the Execution Service cannot always 
provide a storage location

Balazs:
   So we are switching the subject to the data staging issue
   Data staging on createActivity
   Pull approach
   Service can publish information

Etienne:
   Please look at my mail, where I present 3 ways for message transfer
   Which way is the best ?

Someone:
   We can not suppose that the Submitter implements subscription to 
notifications.

Balazs:
   The Submitter needs the info about the data staging location

Etienne:
   Without notifications, the consistent solution is that the Submitter 
has to repeatedly poll until the job comes to manual staging

   If the Execution Service returns the JobId only at manual staging 
when the Job is inside the 'Pre-processing:Hold' substate, I cannot 
guarantee the consistency
   This has to be discussed, because I am afraid that it is not so easy 
to link the 'CreateActivity' return message to such a late state
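
The polling alternative described here could look like the following Python sketch. Everything in it is assumed for illustration: get_state stands for whatever status query the Execution Service would offer, and the target state, interval and timeout values are arbitrary.

```python
import time


def poll_until(get_state, job_id, target_prefix, interval=1.0, timeout=60.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll the job state until it starts with `target_prefix`.

    Returns the matching state, or raises TimeoutError if the job does
    not reach it within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        state = get_state(job_id)
        if state.startswith(target_prefix):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still in state {state!r}")
        sleep(interval)


# Fake Execution Service that advances by one state per query.
states = iter(["Submitted:Validated", "Pre-processing", "Pre-processing:Hold"])
result = poll_until(lambda job_id: next(states), "job-1",
                    "Pre-processing:Hold", interval=0.0)
print(result)  # Pre-processing:Hold
```

Note that such a loop presupposes that the Submitter already holds the JobId, which is why returning the JobId only at manual staging is hard to reconcile with polling.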

Morris:
   one figure not enough
   maybe an animation
   hopefully can work on this
   urgently needed here !!!

Balazs:
   I strongly wish to avoid notifications
   Publishing the staging location should not be coupled to job state
   Execution Service needs time to figure out the staging location - It 
publishes it after some time
   Why should this be coupled to a job state ??

Etienne:
   In some cases, the Execution Service has knowledge about the storage 
location already inside the submitted state, in some other cases it does 
not have this knowledge

Balazs:
   The Job should not go to 'Pre-processing' as long as the Execution 
Service does not know the staging location

Morris:
   Compared to job description validation
   i.e. we need 2 GB, requirement
   How could it still be valid to have it in submitted?

Etienne:
   A standard computing element really owns storage, so that at 
'Submitted:Validated', an Execution Service may already have allocated 
storage to the Job
   But if the Execution Service is divided into sequential parts where 
the 'Submitted' part does not know about physical storage but only about 
capabilities of batch systems and storage systems, there is a problem

Balazs:
   We should edit the text when everybody can see the document

Aleksander:
   validation, allocation
   data staging after allocation
   id should be assigned as soon as the job is validated
   client receives information, polling or notification

Morris:
   We will continue next time with desktop sharing tool

Etienne:
   Important questions :
   - Are these 3 ways described in my email consistent ?
   - Which ways are preferred?

Aleksander:
   Which mail ?

Etienne:
   Sent on 26 October, resent on 4 December

Morris:
   Continue next week


-----------------------------------------------------
Etienne URBAH         LAL, Univ Paris-Sud, IN2P3/CNRS
                       Bat 200   91898 ORSAY    France
Tel: +33 1 64 46 84 87      Skype: etienne.urbah
Mob: +33 6 22 30 53 27      mailto:urbah at lal.in2p3.fr
-----------------------------------------------------


On Tue, 08 Dec 2009, Johannes Watzl wrote:
> Dear all,
> 
> please see below the notes from today's meeting.
> 
> Best,
> Johannes
> 
> ...