[Pgi-wg] WG: OGF PGI Session 3, 14:00-15:30 Draft Minutes

Morris Riedel m.riedel at fz-juelich.de
Thu Sep 22 02:42:36 CDT 2011


Was blocked...

>-- -----Ursprüngliche Nachricht-----
>-- Von: Daniel S. Katz [mailto:dsk at ci.uchicago.edu]
>-- Gesendet: Donnerstag, 22. September 2011 09:39
>-- An: Steve Crouch
>-- Cc: pgi-wg at ogf.org
>-- Betreff: Re: [Pgi-wg] OGF PGI Session 3, 14:00-15:30 Draft Minutes
>-- 
>-- I wasn't at this session, unfortunately.  The name shortcuts under attendance makes it look like I was...
>-- 
>-- Dan
>-- 
>-- 
>-- On Sep 21, 2011, at 3:28 PM, Steve Crouch wrote:
>-- 
>-- > 14:00-13:30 PGI Session 3
>-- > -------------------------
>-- >
>-- > Chair: Andrew Grimshaw
>-- > Minutes: Steve Crouch
>-- >
>-- > Attendance:
>-- >
>-- > Name shortcuts:
>-- >  AG - Andrew Grimshaw
>-- >  EU - Etiennce Urbah
>-- >  DK - Daniel Katz
>-- >  OS - Oxana Smirnov
>-- >  JPN - JP Navarro
>-- >  KS - Katsushige Saga
>-- >  SC - Steve Crouch
>-- >  DM - David Meredith
>-- >
>-- > New actions:
>-- >
>-- > Minutes
>-- >
>-- > Discussion of state model - how can we accommodate additional states?
>-- >
>-- > EU: doc number is 16306 under OGSA-BES WG on GridForge.
>-- >
>-- > [Page 31, fig 6]
>-- >
>-- > EU: take same states as original BES, add in transitions. Add purge
>-- > finished -> [end], add failure transition pending -> failed. If
>-- > cancelled -> [end], it's purged.
>-- >
>-- > AG: this is change in underlying state model.
>-- >
>-- > EU: could drop it, it exists on one system. If it doesn't fit, we drop it.
>-- >
>-- > EU: next diagram for held states [fig 7, pg 34]. Based on original BES,
>-- > only added some transitions.
>-- >
>-- > AG: if specified to go into suspended state initially in JSDL, it starts
>-- > in suspend until told to resume ('proceeding'). Transitions from
>-- > suspended/proceeding to both cancelled and failed. Still need to add a
>-- > transition. Automatic resubmission perhaps not put in.
>-- >
>-- > AG: other one is extended job state [referring to session slide 3] from
>-- > RENKEI.
>-- >
>-- > KS: EU covered this in more detail yesterday. (Discussion of incoming
>-- > queues for pre-processing).
>-- >
>-- > JPN: other terms meaning same thing: pre-processing:pending,
>-- > pre-processing:complete.
>-- >
>-- > AG: are we trying to go to a different state model? We'e handled this as
>-- > specified in the spec, which has substates dealing with staging in and
>-- > out. e.g. staging-in is pre-processing state. Also running:executing is
>-- > a substate. Didn't we want to stay with the basic state model and extend
>-- > via substates? This state model is extended.
>-- >
>-- > EU: not compatible with basic state model?
>-- >
>-- > KS: mapping exists from this to basic state model substates [next slide].
>-- >
>-- > AG: good profile some of these substates - we need to know what these
>-- > substates actually mean. Are these proposed JSDL extensions?
>-- >
>-- > KS: this implementation is just for testing. We don't care deeply about
>-- > extensions.
>-- >
>-- > AG: you name a substate where you want it to stop?
>-- >
>-- > KS: yes.
>-- >
>-- > AG: thought something similar to EU's idea. In original state model, two
>-- > place for helds (need pre and post): in running (as EU did), in pending,
>-- > and terminated held/finished held failed/held. But this is subsumed into
>-- > EU's.
>-- >   Do we want to give advice for how to order additional substates? Do
>-- > we recommend these in proceeding or suspended? If we use substates as in
>-- > data staging, execution in running where to put the helds?
>-- >
>-- > SM: better to have a named state for data staging pre/post. Confusing
>-- > for client otherwise.
>-- >
>-- > AG: optional client initiated staging hold. Letting client do whatever
>-- > they want to do. Dont want to hold for all purposes.
>-- >
>-- > Mike: you know it's held for pre-processing.
>-- >
>-- > AG: what does delegated refer to? Delegated queue for queued, delegated
>-- > running for running.
>-- >
>-- > SM: want to have pre and post for delegated.
>-- >
>-- > EU: names not well chosen.
>-- >
>-- > AG: suggest change names for some of these states.
>-- >
>-- > SM: delgated - incoming is queued.
>-- >
>-- > AG: delegated should be processing. Delegated:incoming should be
>-- > delegated:queued-in. Processing:outgoing (renamed) not visible to
>-- > outside, used as transitory state.
>-- >
>-- > Mike: no error around it, it's mandatory right?
>-- >
>-- > AG: if somebody were to subscribe to notification about this, ...
>-- >
>-- > AG: processing:running, processing:hold.
>-- >   If no queueing system, just executes - can it go directly to running.
>-- >
>-- > SM: debugging problematic. Held not implemented for all middlewares.
>-- >
>-- > AG: not sure I like this (confusion with states of running jobs for
>-- > held/running for executing jobs).
>-- >
>-- > SM: have u seen EMI execution service? Can take some of these from there.
>-- >
>-- > AG: keep same 5 states.
>-- >
>-- > SM: have attributes for substates.
>-- >
>-- > AG: need to agree on these attributes. To avoid having to guess on
>-- > substate meanings.
>-- >
>-- > SM: many reasons for including more info in substates e.g. errors -
>-- > storage is down, etc.
>-- >
>-- > EU: doc - 10th March on pgi list. PGI execution specification.
>-- >
>-- > [Discussion about pg 20 and state definitions - pre-processing and
>-- > processing]
>-- >
>-- > AG: mapping of these states to our discussed states.
>-- >
>-- > SM discusses how this mapping works.
>-- >
>-- > AG: referring to current state model...
>-- >   Agreed yesterday not to change state model otherwise, have to change
>-- > spec. Profile substates, don't have to.
>-- >
>-- > SM: too simple to represent all things. Problems with users to
>-- > distinguish clearly with states.
>-- >
>-- > AG: use [substates] and no pushback from users.
>-- >
>-- > Mike: but we'd have to change the spec.
>-- >
>-- > AG: want to expedite the process. If it can be done in context of the
>-- > existing state model, it vastly simplifies things unless it's too ugly.
>-- >
>-- > SM having multiple states.
>-- >
>-- > AG: how do we feel about this.
>-- >
>-- > SM: many substates of failed - how do we identify that?
>-- >
>-- > SC: this would break state model, new model, should use existing spec
>-- > where possible.
>-- >
>-- > SM: make concept smoother - define running, focus on pre/main/post
>-- > processing.
>-- >
>-- > AG: we went around the state discussions before, with substates you can
>-- > model everything.
>-- >   Doing a whole new spec would take longer.
>-- >
>-- > SM: one more possible point - model from EU/KS they have this community
>-- > requirement.
>-- >
>-- > AG: NAREGI state model - all these collapse into running.
>-- >
>-- > SM: limiting, should find a better way.
>-- >
>-- > AG: can't go backwards and change the document.
>-- >
>-- > SM: staging as substate - what does this mean?
>-- >
>-- > Mike: it may need to change anyway. A cleaner way would be to redo this.
>-- >
>-- > SM: substates of failed, terminate...
>-- >
>-- > AG: legitimate to do this in model.
>-- >
>-- > MM: problem we found was no additional additional transition to failure
>-- > state. It can occur at any moment. We introduced artificial transition
>-- > to failure state.
>-- >   Afraid about too much substating of running state - other substates
>-- > have more informative meaning e.g. running: data-staging.
>-- >
>-- > AG: if we do a new version of the spec base it on existing spec, and
>-- > just change the bits we need e.g. FactoryAttributes, vector operations,
>-- > but keep it simple, otherwise it will escalate - genie out of the bottle.
>-- >   SM is right - various people have substated this in very similar ways
>-- > in implementations - substates in running. This is fine, can even have k
>-- > layers deep of substates. What are transitions introduced? To error states?
>-- >
>-- > AG: too much of a decision to reach today. If we redo the spec, we
>-- > timebox it. If not, we will profile. Otherwise it will get into a
>-- > problem of arguing about stuff as in PGI.
>-- >
>-- > JPN: use cases for these requirements?
>-- >
>-- > AG: held states got us here. They wanted client-initiated held states
>-- > for data staging, then can say 'go'. Then be able to stop it afterwards.
>-- > No requirement for us, since data staging is automated, or done manually.
>-- >
>-- > JPN: requirement for all BES to support thee states?
>-- >
>-- > AG: specified in JSDL, if you don't support it, you throw a fault.
>-- >
>-- > EU: not in favour of manual staging in complicated state model. Have
>-- > documented it in documentation so people can understand these
>-- > complexities. Advocate existing BES states, just adding few transitions
>-- > that are missing e.g. pending -> failed.
>-- >
>-- > AG: pending -> failed could be done as an addendum e.g. it was
>-- > forgotten, ok. Adding new states is different.
>-- >
>-- > SM: have given this requirement to PGI and EMI.
>-- >
>-- > EU: have a requirement to users. They should think about workflow model
>-- > before trying to send jobs.
>-- >
>-- > SM: our users are expecting this held, manual staging. We shoulnd't
>-- > perhaps introduce hold mechanism, just implement simple flags. But
>-- > doesn't have to be supported.
>-- >
>-- > EU: do we have to implement new state model.
>-- >
>-- > SM: no. Different issue.
>-- >
>-- > EU: related to post/pre processing.
>-- >
>-- > AG: hold and manual can be substated away. SM wants to restate
>-- > misleading substates.
>-- >
>-- > SM: e.g. running:queued.
>-- >
>-- > EU: just rename.
>-- >
>-- > DM: haven't heard a convincing argument for these pre/post processing
>-- > substates...
>-- >
>-- > AG: it's ugly, a bit of ugly at a time. Term for this: technical debt.
>-- > The easy path, some point in future clean it up, not today - fine with
>-- > me! Not me who will have to implement.
>-- >   SM from UNICORE. PGI and EMI had this as their requirements. PGI/WG
>-- > had this as a requirement driven from gLite and ARC. But now UNICORE?
>-- >
>-- > SM: yes.
>-- >
>-- > AG: how functionality differnet from telling them to copy data in first?
>-- >
>-- > SM: job starts, get EPR & storage service reference, use this to stage
>-- > data into. Can upload files/directory into this, once done, start job
>-- > physically.
>-- >
>-- > AG: 3 choices
>-- >  1) We stay on profile only path, incorporate reqs fully later on
>-- >  2) Bite bullet and do the spec rewrites
>-- >  3) Addendum - additional profile to stick changes in only
>-- >
>-- > AG: addendum to add state ok, changing state space wouldn't ba an
>-- > addendum as such.
>-- >   Other than UNICORE, others want to profile.
>-- >   UNICORE - do it properly, rewrite.
>-- > Need input from others - gLite, CREAM, SAGA, NorduGrid, etc. Need to get
>-- > this approach right.
>-- >
>-- > SM: first priority data staging profiling ok, state model - secondary.
>-- > Thirdly (personal) BES - meaning of substates from user's point of view.
>-- >   Rename running as processing.
>-- >
>-- > AG: addendum?
>-- >
>-- > Mike: can we do all this in profiles?
>-- >
>-- > AG: don't think vectors can be done this way. Is it important? Not sure
>-- > - may slide on important scale. Those that wanted it have walked away
>-- > from the table.
>-- >
>-- > EU: in EDGI, we're implementing handling of vectors. Tell users if you
>-- > want vector, you explain in separate file internally and generate
>-- > internally as many jobs as necessary using existing client interfaces.
>-- >
>-- > AG: support in our client implementation by generating multiple internal
>-- > JSDLs from a single one, returning single EPR. If you send list, we'll
>-- > accept and start them all.
>-- >   Param sweep spec never says what to return.
>-- >
>-- > SM: rename states.
>-- >
>-- > AG: perhaps not a bad idea. Notion of addendums to specs; very easily done.
>-- >   a) transition from pending to failed.
>-- >   b) rename running to processing.
>-- >
>-- > AG: can have conditional - support both running and processing (but they
>-- > mean the same thing).
>-- >   On to the substate model. Assuming processing, not running. Perhaps
>-- > not enough time. Back and forth transitions are problematic, not sure
>-- > it's way to go. Alternative way would be to have hold states within
>-- > processing; not clear how existing imps that break down running into
>-- > substates would handle this. Look at RENKEI's mapping.
>-- >
>-- > EU: see previews drawing.
>-- >
>-- > KS: we implement pre-processing/post-processing only for data staging.
>-- > Maybe original PGI specification state model comes from strawman. Also
>-- > doc says pre-post used for data staging.
>-- >
>-- > AG: think they are.
>-- >
>-- > KS: implement this state model. In our system we have workflow system,
>-- > with this system, we can transfer data directory from and to computer
>-- > resources. We implemented these states.
>-- >
>-- > EU: pre/post processing not only for manual but automatic data staging.
>-- >
>-- > AG: right.
>-- >   Not a strictly requirement to support suspend/resume.
>-- >
>-- > AG: consensus to take running renamed as processing, change from
>-- > running:stage-in/running:stage-out to new processing states.
>-- >
>-- > SM: how to do this resume in service?
>-- >
>-- > AG: discussed yesterday, new port type.
>-- >
>-- >
>-- >
>-- >
>-- > --
>-- > Dr Stephen Crouch,
>-- > Software Architect,
>-- > OMII-UK,
>-- > Electronics and Computer Science,
>-- > Room 4067, Level 4, Building 32,
>-- > University of Southampton,
>-- > Highfield, Southampton SO17 1BJ
>-- > Tel: +44 (0)23 8059 8787        EMail: s.crouch at software.ac.uk
>-- > Fax: +44 (0)23 8059 3045        WWW: http://www.ecs.soton.ac.uk/~stc
>-- > _______________________________________________
>-- > Pgi-wg mailing list
>-- > Pgi-wg at ogf.org
>-- > http://www.ogf.org/mailman/listinfo/pgi-wg
>-- 
>-- --
>-- Daniel S. Katz
>-- University of Chicago
>-- (773) 834-7186 (voice)
>-- (773) 834-6818 (fax)
>-- d.katz at ieee.org or dsk at ci.uchicago.edu
>-- http://www.ci.uchicago.edu/~dsk/
>-- 
>-- 
>-- 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3550 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/pgi-wg/attachments/20110922/a7211775/attachment-0001.bin 


More information about the Pgi-wg mailing list