[SAGA-RG] Workflow support in SAGA

Malcolm Illingworth malcolm at epcc.ed.ac.uk
Thu Jun 19 05:21:22 CDT 2008


Hi,

I think trying to define a workflow description language from scratch would
be very tricky to get right. It would have to be sufficiently general not to
be domain-specific, but at the same time verbose enough to actually be
useful.

I do like the idea of having a Workflow class defined in the API. At least
from the Java perspective, I think this would be easy to do. I can't remember
offhand if a TaskContainer is itself a Task, but it would be extremely
powerful if the Workflow class could accept TaskContainers as well as Tasks.
This would make dependencies easy to manage; for example you could have
tasks at multiple sites, with a TaskContainer representing each site, and a
single Workflow managing the set of TaskContainer instances.
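
A minimal sketch of that arrangement, in C++ to match the class sketch quoted
later in this thread. Every type here (Task, NamedTask, TaskContainer,
Workflow) is a hypothetical stand-in rather than the SAGA API, and it simply
assumes that a container is itself a task:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; these are NOT the SAGA API.
struct Task {
    virtual ~Task() = default;
    // "Running" a task here just appends its name to a log.
    virtual void run(std::vector<std::string>& log) = 0;
};

// A leaf task that records its own name when run.
class NamedTask : public Task {
public:
    explicit NamedTask(std::string name) : name_(std::move(name)) {}
    void run(std::vector<std::string>& log) override { log.push_back(name_); }
private:
    std::string name_;
};

// A container that is itself a Task, e.g. all tasks bound for one site.
class TaskContainer : public Task {
public:
    void add(std::shared_ptr<Task> task) {
        assert(task != nullptr);
        tasks_.push_back(std::move(task));
    }
    void run(std::vector<std::string>& log) override {
        for (auto& task : tasks_) task->run(log);
    }
private:
    std::vector<std::shared_ptr<Task>> tasks_;
};

// Because a container is a Task, the workflow can hold either kind.
class Workflow {
public:
    void add(std::shared_ptr<Task> step) {
        assert(step != nullptr);
        steps_.push_back(std::move(step));
    }
    // Runs the steps in insertion order and returns the resulting log.
    std::vector<std::string> run() {
        std::vector<std::string> log;
        for (auto& step : steps_) step->run(log);
        return log;
    }
private:
    std::vector<std::shared_ptr<Task>> steps_;
};

// Example: two per-site containers plus one stand-alone task.
inline std::vector<std::string> run_two_site_example() {
    auto site_a = std::make_shared<TaskContainer>();
    site_a->add(std::make_shared<NamedTask>("siteA/prepare"));
    site_a->add(std::make_shared<NamedTask>("siteA/compute"));
    auto site_b = std::make_shared<TaskContainer>();
    site_b->add(std::make_shared<NamedTask>("siteB/compute"));

    Workflow workflow;
    workflow.add(site_a);
    workflow.add(site_b);
    workflow.add(std::make_shared<NamedTask>("collect"));
    return workflow.run();
}
```

The sketch runs everything sequentially; a real implementation would presumably
run the per-site containers asynchronously and wait on them.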

(Within DESHL we took a simple but effective approach. Since we are
command-line based, some of my colleagues at EPCC have developed a workflow
application for ensemble computing based on bash scripts issuing DESHL
commands.)

Regards,
Malcolm. 

   

-----Original Message-----
From: saga-rg-bounces at ogf.org [mailto:saga-rg-bounces at ogf.org] On Behalf Of
Andre Merzky
Sent: 18 June 2008 21:46
To: SAGA RG
Subject: Re: [SAGA-RG] Workflow support in SAGA

Just a small comment for now:  I'd like to adjust this statement of mine

>>>>> That is correct.  What is missing in SAGA for basic workflow 
>>>>> support are job dependencies.

There are actually two ways to add workflow support in SAGA (not mutually
exclusive):

 (a) add task dependencies, and thus allow a workflow to be described in
terms of saga jobs and saga tasks;

 (b) add the ability to submit a workflow description (usually some XML
document or such) to SAGA, which will (directly or indirectly) take steps
to execute that workflow.
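
To make (a) a little more concrete, here is a rough sketch of what "task
dependencies" could mean in practice: record "X runs after Y" edges between
named tasks and derive a valid execution order with Kahn's algorithm. The
DependencyGraph class and the task names are assumptions made up for this
illustration; nothing like it exists in the SAGA API today.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of SAGA: stores dependency edges between
// named tasks and computes an order that respects them.
class DependencyGraph {
public:
    void add_task(const std::string& name) {
        // emplace() leaves existing entries untouched.
        successors_.emplace(name, std::vector<std::string>{});
        pending_.emplace(name, 0);
    }

    // 'task' may only start once 'prerequisite' has finished.
    void add_dependency(const std::string& task,
                        const std::string& prerequisite) {
        add_task(task);
        add_task(prerequisite);
        successors_[prerequisite].push_back(task);
        ++pending_[task];
    }

    // Kahn's algorithm: repeatedly emit a task with no unfinished
    // prerequisites. Throws if the dependencies contain a cycle.
    std::vector<std::string> execution_order() const {
        std::map<std::string, int> pending = pending_;
        std::queue<std::string> ready;
        for (const auto& entry : pending)
            if (entry.second == 0) ready.push(entry.first);

        std::vector<std::string> order;
        while (!ready.empty()) {
            const std::string current = ready.front();
            ready.pop();
            order.push_back(current);
            for (const auto& next : successors_.at(current)) {
                assert(pending[next] > 0);  // each edge is consumed once
                if (--pending[next] == 0) ready.push(next);
            }
        }
        if (order.size() != pending.size())
            throw std::runtime_error("dependency cycle detected");
        return order;
    }

private:
    std::map<std::string, std::vector<std::string>> successors_;
    std::map<std::string, int> pending_;  // unfinished prerequisites per task
};
```

For a chain built with add_dependency("compute", "stage_in") and
add_dependency("stage_out", "compute"), execution_order() yields stage_in,
compute, stage_out. Tasks that become ready at the same time are independent
of each other and could be started together.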

IMHO, (a) would be really cool, and very powerful.  (b) OTOH would be simple
to implement:

  class saga::workflow : public saga::async
  {
    public:
      workflow (std::string workflow_description);
      ~workflow (void);

      void run (void);
  };

Problems only start once we start to look inside that description... - as
Ashiq said, it's unlikely that there will be a single workflow description
language to make all people happy...
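
One way around picking a single language would be to dispatch on a language
name and hand the description to whichever enactment engine has been
registered for it. The sketch below is only an illustration of that idea:
EnactorRegistry, its methods and the language labels are all hypothetical,
and no such mechanism is defined in SAGA.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical registry, NOT part of SAGA: maps a workflow-language name
// to a callable that passes a description in that language to an engine.
class EnactorRegistry {
public:
    // An enactor receives the raw description and returns an
    // engine-specific identifier for the submitted workflow.
    using Enactor = std::function<std::string(const std::string&)>;

    void register_language(const std::string& language, Enactor enactor) {
        assert(enactor != nullptr);
        enactors_[language] = std::move(enactor);
    }

    bool supports(const std::string& language) const {
        return enactors_.count(language) != 0;
    }

    // Forwards the description untouched; it is never parsed here.
    std::string submit(const std::string& language,
                       const std::string& description) const {
        auto it = enactors_.find(language);
        if (it == enactors_.end())
            throw std::invalid_argument("no enactor registered for " + language);
        return it->second(description);
    }

private:
    std::map<std::string, Enactor> enactors_;
};
```

With this shape the API itself never has to look inside the description; only
the registered enactor needs to understand the language.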

Best, Andre.


Quoting [Andre Merzky] (Jun 18 2008):
> 
> [forwarded reply from Ashiq:]
> 
> Dear Andre,
> 
> Thanks for your detailed reply.
> 
> I believe applications really need workflow support in the SAGA 
> specification (and implementations). Such a feature would definitely 
> help applications adopt SAGA quickly; without it, they are 
> restricted in exploiting the full potential of Grids. We will be 
> happy to talk to you more on the subject and will help wherever 
> required.
> 
> I do not think that different communities will ever agree on a single 
> workflow language. The biomedical community predominantly uses SCUFL, 
> the HEP community has long used DAGs, and the business community is 
> quickly adopting BPEL. Some other groups use pi-calculus and Petri 
> nets as a more abstract formalism. I believe different communities 
> should have the freedom to use a language of their choice, and SAGA 
> should provide a mechanism through which different languages and 
> enactment engines can be supported. This may be a wish list, but I do 
> not know how difficult this goal is to achieve.
> 
> Moreover, what are your views on the workflow support in JSAGA? Does 
> it follow any standard, or do they plan to update the SAGA 
> specification retrospectively?
> 
> 
> 
> Quoting [Andre Merzky] (Jun 18 2008):
> > Date: Wed, 18 Jun 2008 22:33:26 +0200
> > From: Andre Merzky <andre at merzky.net>
> > To: SAGA RG <saga-rg at ogf.org>
> > Bcc: Andre Merzky <andre at merzky.net>
> > Subject: Workflow support in SAGA
> >
> > Hi folx,
> >
> > we recently started a discussion thread about workflow support in 
> > SAGA, off this list.  This mail tries to move that thread onto this 
> > list, in order to gauge interest and feedback from the wider 
> > community.  So, please feel free to jump in!
> >
> > Below are excerpts from the thread (re-sorted, reformatted, shortened),
> > as a starting point.
> >
> > Best regards,
> >
> >   Andre.
> >
> > >>>>>> On Fri, Jun 13, 2008, yasir mehmood wrote:
> > >>>>>>
> > >>>>>> A colleague of mine (Irfan Habib) is writing a pipeline 
> > >>>>>> service as a client to the glueing service, to which it passes 
> > >>>>>> (or plans to pass) a workflow instead of a job.  He is 
> > >>>>>> concerned about workflow support in the SAGA Java 
> > >>>>>> implementations.  Could you please confirm this?
> > >>>>>>
> > >>>>>>
> > >>>>> Thilo Kielmann wrote:
> > >>>>>
> > >>>>> Dear Yasir,
> > >>>>>
> > >>>>> The SAGA API (and thus also its implementation in
> > >>>>> Java) does not deal with workflows, only with individual jobs.  
> > >>>>> You have to construct workflows yourself, using SAGA jobs, 
> > >>>>> containers, etc.
> > >>>>
> > >>>>
> > >>>> Quoting [Irfan Habib] (Jun 13 2008):
> > >>>>
> > >>>> Dear Thilo,
> > >>>>
> > >>>> Thank you for your kind reply. May I ask one more 
> > >>>> question?
> > >>>>
> > >>>> As you have stated, the SAGA API does not deal with 
> > >>>> workflows, only with individual jobs. So if I want to 
> > >>>> execute a workflow through SAGA, do I have to handle the 
> > >>>> execution of the workflow externally and invoke SAGA only 
> > >>>> for the execution of individual jobs?
> > >>>>
> > >>>>
> > >>> -----Original Message-----
> > >>> From: Andre Merzky [mailto:andre at merzky.net]
> > >>>
> > >>> That is correct.  What is missing in SAGA for basic workflow 
> > >>> support are job dependencies.  That is what you need to maintain 
> > >>> manually.
> > >>>
> > >>> Please have a look at the saga::task::container class - that 
> > >>> class is supposed to make it easier to manage large numbers of 
> > >>> tasks/jobs.
> > >>>
> > >>> If you are interested in that topic, we could discuss drafting a 
> > >>> saga package which adds workflow-like dependencies to tasks...  
> > >>> This is on our roadmap anyway, but so far with low priority, due 
> > >>> to lack of interest and use cases.
> > >>>
> > >>> Best, Andre.
> > >>
> > >>
> > >> Quoting [Ashiq Anjum] (Jun 17 2008):
> > >>
> > >> Dear Andre,
> > >>
> > >> If this is the case then SAGA introduces a large
> > >> overhead: the applications (for example the Pipeline service and 
> > >> other services we intend to use) would need a middle 
> > >> component which takes the workflow, breaks it into its constituent 
> > >> parts, and invokes the API for each job separately. This component 
> > >> would in effect be an enactment engine. In this scenario 
> > >> the Grid middleware would at any given time be scheduling a 
> > >> single job, rather than an entire workflow.
> > >>
> > >> I think this is a bottleneck, and users cannot do fine-grained 
> > >> scheduling of compute- and data-intensive jobs without this 
> > >> feature. The purpose behind SAGA was to help users by 
> > >> enabling them to access the Grid transparently.  But if users 
> > >> have to code the workflow dependencies themselves, this will 
> > >> be much more complex for them and will go against the 20:80 
> > >> slogan for SAGA. I think it would be very helpful to include 
> > >> this feature in future releases.
> > >>
> > >> I was also under the impression that a gLite adapter is already 
> > >> available. Do you have any idea how far it is from a 
> > >> stable release?
> > >>
> > >> Moreover, has anyone submitted jobs (using the Java implementation 
> > >> of SAGA) to a real execution environment (e.g. Condor, SGE)? It 
> > >> looks like most of the testing has been done using fork-based 
> > >> execution, but submitting jobs to Condor/SGE will require a range 
> > >> of JSDL parameters to be sent from the client application through 
> > >> the SAGA API. Can we create JSDLs for complex jobs to be 
> > >> executed on remote execution environments (e.g. Condor, SGE)?
> > >>
> > >> Thanks,
> > >>
> > >> Best Regards Ashiq Anjum
> > >
> > > Dear Ashiq,
> > >
> > > we are aware (painfully so) of the fact that SAGA does not yet 
> > > cover the full scope of programming models used in Grid 
> > > environments.  And yes, workflows are an obvious gap, but so are 
> > > others: p2p, messaging, checkpoint/recovery, transactions, resource 
> > > discovery, service discovery, ...
> > >
> > > The list is long, and we are working on a number of those items.  
> > > As for workflows: we simply did not have enough input from the 
> > > workflow community in order to make an informed decision on what a 
> > > workflow-oriented job API should look like.  It might be enough to 
> > > simply accept a workflow description and to submit it to an 
> > > enactor - but which language to use?  At the time, it was _very_ 
> > > unclear if and what workflow language would dominate the scene (I 
> > > have no idea how the situation is now - is BPEL 'the one'?).  And 
> > > ideally we would also like to express workflows programmatically, 
> > > i.e. as dependencies between saga::job instances.
> > >
> > >
> > > I have two more comments slightly less defensive, and more 
> > > constructive I hope :-)
> > >
> > >
> > > First, writing a workflow enactor in SAGA would be cool, and 
> > > potentially useful for a range of people and projects!
> > > I understand that this is not on your / Yasir's roadmap, and 
> > > probably very distracting and time consuming.  But you may want to 
> > > check with other projects if they would be interested in joining 
> > > forces.
> > >
> > > Second, this might be good timing to actually add workflow support 
> > > to the SAGA standard!  We would very much appreciate your input on 
> > > that topic (although I would like to move this thread to the 
> > > saga-rg at ogf.org mailing list then).  Is that something you would 
> > > consider spending time on?
> > >
> > > These are my thoughts on the topic so far - I'd be happy to get 
> > > your feedback.
> > >
> > > Thanks, Andre.
--
Nothing is ever easy.
--
  saga-rg mailing list
  saga-rg at ogf.org
  http://www.ogf.org/mailman/listinfo/saga-rg


