[ogsa-wg] Paper proposing "evolutionary vertical design efforts"

Ian Foster foster at mcs.anl.gov
Tue Mar 21 12:53:20 CST 2006


Marty:

I wasn't trying to be philosophical, just commenting that at-most-once 
submission semantics are important. If a client can't be sure whether a 
job has been submitted or not, clients get very complicated.

Ian.


At 01:43 PM 3/21/2006 -0500, Marty Humphrey wrote:

>But this is not so simple. The knee-jerk reaction is to separate these two 
>concerns into implementation vs. interface, and develop each one 
>independently. But taken to the extreme, a system that appears to be rich 
>in its capabilities might not be so in reality for some time (if EVER!).
>
>
>
>Let's assume that we truly separate these concerns and build sophisticated 
>interfaces. But then what about the potential consumer of such services? 
>Building an overly complex interface to such a service (without any 
>practical implementations behind it) might promote further complicated 
>clients (which promotes further complexity upstream...). "Build the 
>interface and they will come with implementations" is a variation on a 
>theme that doesn't always come true. Arguably, complexity is what we're 
>trying to get away from.
>
>
>
>And no, I'm not advocating only an interface that matches existing 
>capabilities. I'm just saying that it's NOT obvious that the most effective 
>approach is to entirely decouple these two concerns.
>
>
>
>-- Marty
>
>
>
>----------
>From: Ian Foster [mailto:foster at mcs.anl.gov]
>Sent: Tuesday, March 21, 2006 1:34 PM
>To: Marvin Theimer; Carl Kesselman
>Cc: humphrey at cs.virginia.edu; ogsa-wg at ggf.org; Marvin Theimer
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Marvin:
>
>I think you are mixing two things together: the capabilities of the 
>scheduler and the capabilities of the remote submission interface. The 
>proposal that we support at-most-once submission is a proposal for 
>capabilities in the remote submission interface, not the scheduler. I 
>wouldn't expect existing schedulers to provide this capability, just as 
>they don't (for the most part) support Web Services interfaces. But once 
>we define a Web Services-based remote submission interface, at-most-once 
>submission capabilities become important.
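>
>To make this concrete, here is one possible client-side sketch (Python; 
>the service operations and parameter names are hypothetical, not taken 
>from any existing spec). The client mints a unique submission token up 
>front, so a retry after a failure is recognized as the same submission 
>rather than creating a second job:
>
>    # Hypothetical client-side sketch of at-most-once submission.
>    import uuid
>
>    def submit_at_most_once(service, job_description, max_retries=3):
>        # The client generates the token, so it survives failures as
>        # long as it is persisted before the first submit attempt.
>        token = str(uuid.uuid4())
>        for _ in range(max_retries):
>            try:
>                # The service treats a repeated token as the same
>                # submission and returns the existing job rather than
>                # starting a new one.
>                return service.submit(job_description,
>                                      submission_token=token)
>            except TimeoutError:
>                continue  # retry with the SAME token; no duplicate job
>        raise RuntimeError("could not confirm submission %s" % token)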
>
>Ian.
>
>
>At 10:28 AM 3/21/2006 -0800, Marvin Theimer wrote:
>
>
>Hi;
>
>
>
>Whereas I agree with you that at-most-once semantics are very desirable, I 
>would like to point out that not all existing job schedulers implement 
>them.  I know that both LSF and CCS (the Microsoft HPC job scheduler) 
>don't.  I've been trying to find out whether PBS and SGE do or don't.
>
>
>
>So, this brings up the following slightly more general question: should 
>the simplest base case be the simplest case that does something useful, or 
>should it be more complicated than that?  I can see good arguments on both 
>sides:
>
>·         Whittling things down to the simplest possible base case 
>maximizes the likelihood that parties can participate.  Every feature 
>added represents one more feature that some existing system may not be 
>able to support or that a new system has to provide even when it's not 
>needed in the context of that system.  Suppose, for example, that PBS and 
>SGE don't provide transactional semantics of the type you described.  Then 
>4 of the 6 most common job scheduling systems would not have this feature 
>and would need to somehow add it to their implementations.  In this 
>particular case it might be too difficult to add in practice, but in 
>general there might be problems.
>
>·         On the other hand, since there are many clients and arguably far 
>fewer server implementations, features that substantially simplify client 
>behavior/programming and that are not too onerous to implement in existing 
>and future systems should be part of the base case.  The problem, of 
>course, is that this is a slippery slope at the end of which lies the 
>number 42 (ignore that last phrase if you're not a fan of The Hitchhiker's 
>Guide to the Galaxy).
>
>
>
>Personally, the slippery slope argument makes me lean towards defining the 
>simplest possible base use case, since otherwise we'll spend a (potentially 
>very) long time arguing about which features are important enough to 
>justify being in the base case.  One possible way forward on this issue is 
>to have people come up with lists of features that they feel belong in the 
>base use case and then we agree to include only those that have a large 
>majority of the community arguing for their inclusion in the base case.
>
>
>
>Unfortunately defining what "large majority" should be is also not easy or 
>obvious.  Indeed, one can argue that we can't even afford to let all votes 
>be equal.  Consider the following hypothetical (and contrived) case: 100 
>members of a particular academic research community show up and vote that 
>the base case must include support for a particular complicated scheduling 
>policy and the less-than-ten suppliers of existing job schedulers with 
>significant numbers of users all vote against it.  Should it be included 
>in the base case?  What happens if the major scheduler vendors/suppliers 
>decide that they can't justify implementing it and therefore can't be GGF 
>spec-compliant and therefore go off and define their own job scheduling 
>standard?  The hidden issue is, of course, whether those voting are 
>representative of the overall HPC user population.  I can't personally 
>answer that question, but it does again lead me to want to minimize the 
>number of times I have to ask that question, i.e. the number of features 
>that I have to consider for inclusion in the base case.
>
>
>
>So this brings me to the question of next steps.  Recall that the approach 
>I'm advocating, and that others have bought into as far as I can tell, is 
>that we define a base case and the mechanisms and approach to how 
>extensions of the base case are done.  I assert that the absolutely most 
>important part of defining how extension should work is ensuring that 
>multiple extensions don't end up producing a hairball that's impossible to 
>understand, implement, or use.  In practice this means coming up with a 
>restricted form of extension, since history is pretty clear on the pitfalls 
>of trying to support arbitrarily general extension schemes.
>
>
>
>This is one of the places where identification of common use cases comes 
>in.  If we define the use cases that we think might actually occur then we 
>can ask whether a given approach to extension has a plausible way of 
>achieving all the identified use cases.  Of course, future desired use 
>cases might not be achievable by the extension schemes we come up with 
>now, but that possibility is inevitable given anything less than a fully 
>general extension scheme.  Indeed, even among the common use cases we 
>identify now, we might discover that there are trade-offs where a simpler 
>(and hence probably more understandable and easier to implement and use) 
>extension scheme can cover 80% of the use cases while a much more 
>complicated scheme is required to cover 100% of the use cases.
>
>
>
>Given all this, here are the concrete next steps I'd like to propose:
>
>·         Everyone who is participating in this design effort should 
>define what they feel should be the HPC base use case.  This represents 
>the simplest use case, and associated features (like transactional submit 
>semantics), that you feel everyone in the HPC grid world must implement.  
>We will take these use case candidates and debate which one to settle on.
>
>·         Everyone should define the set of HPC use cases that they 
>believe might actually occur in practice.  I will refer to these as the 
>common use cases, in contrast to the base use case.  The goal here is not 
>to define the most general HPC use case, but rather the more restricted 
>use cases that might occur in real life.  For example, not all systems 
>will support job migration, so whereas a fully general HPC use case would 
>include the notion of job migration, I argue that one or more common use 
>cases will not include job migration.
>
>Everyone should also prioritize and rank their common use cases so that we 
>can discuss 80/20-style trade-offs concerning which use cases to support 
>with any given approach to extension.  This prioritization should include 
>the notion of how common you think a use case will actually be, and hence 
>how important it will be to actually support that use case.
>
>·         Everyone should start thinking about what kinds of extension 
>approaches they believe we should define, given the base use case and 
>common use cases that they have identified.
>
>
>
>As multiple people have pointed out, an exploration of common HPC use 
>cases has already been done one or several times before, including in the 
>EMS working group.  I'm still catching up on reading GGF documents, so I 
>don't know how much those prior efforts explored the issue from the 
>point-of-view of base case plus extensions.  If these prior explorations 
>did address the topic of base-plus-extensions and you agree with the 
>specifics that were arrived at, then this exercise will be a quick-and-easy 
>one for you: you can simply publish the appropriate links to prior 
>material in an email to this mailing list.  I will personally be sending 
>in my list independent of prior efforts in order to provide a 
>newcomer's perspective on the subject.  It will be interesting to see how 
>much overlap there is.
>
>
>
>One very important point that I'd like to raise is the following: Time is 
>short and "best" is the enemy of "good enough".  Microsoft is planning to 
>provide a Web services-based interoperability interface to its job 
>scheduler sometime in the next year or two.  I know that many of the other 
>job scheduler vendors/suppliers are also interested in having an 
>interoperability story in place sooner rather than later.  To meet this 
>schedule on the Microsoft side will require locking down a first fairly 
>complete draft of whatever design will be shipped by essentially the end 
>of August.  That's so that we can do all the necessary debugging, 
>interoperability testing, security threat modeling, etc. that goes with 
>shipping an actual finished product.  What that means for the HPC profile 
>work is that, come the end of August, Microsoft and possibly other 
>scheduler vendors/suppliers will need to lock down and start coding some 
>version of Web Services-based job scheduling and data transfer 
>protocols.  If there is a fairly well-defined, feasible set of 
>specs/profile coming out of the GGF HPC working group (for recommendation, 
>NOT yet for actual standards approval) that has some reasonable level of 
>consensus by then, then that's what Microsoft will very likely go 
>with.  Otherwise Microsoft will need to defer the idea of shipping 
>anything that might be GGF compliant to version 3 of our product, which 
>will probably ship about 4 years from now.
>
>
>
>The chances of coming up with the "best" HPC profile by the end of August 
>are slim.  The chances of coming up with a fairly simple design that is 
>"good enough" to cover the most important common cases by means of a 
>relatively simple, restricted form of extension seem much better.  Covering 
>a richer set of use cases would need to be deferred to a future version of 
>the profile, much in the manner that BES has been defined to cover an 
>important sub-category of use cases now, with a fuller EMS design being 
>done in parallel as future work.  So I would argue that perhaps the most 
>important thing this design effort and the planned HPC profile working 
>group that will be set up in Tokyo can do is to identify what a "good 
>enough" version 1 HPC profile should be.
>
>
>
>Marvin.
>
>
>
>
>
>----------
>From: Carl Kesselman [mailto:carl at isi.edu]
>Sent: Thursday, March 16, 2006 12:49 AM
>To: Marvin Theimer
>Cc: humphrey at cs.virginia.edu; ogsa-wg at ggf.org
>Subject: Re: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Hi,
>
>In the interest of furthering agreement, I was not arguing that the 
>application had to be restartable. Rather, what has been shown to be 
>important is that the protocol be restartable in the following sense: if 
>you submit a job and the far-end server fails, is the job running or not, 
>and if you resubmit, do you get another job instance? The GT submission 
>protocol and Condor have transactional semantics so that you can have 
>at-most-once submit semantics regardless of client and server failures. The 
>fact that your application may be non-idempotent is exactly why having 
>well-defined semantics in this case is important.
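>
>To illustrate the point (and only as a sketch, not the actual GT or 
>Condor protocol; the names here are made up), the server side of such 
>semantics can be as simple as a durable table keyed by a client-supplied 
>token:
>
>    # Illustrative server-side sketch of at-most-once submission via a
>    # durable deduplication table; all names are hypothetical.
>    class SubmissionEndpoint:
>        def __init__(self, store, scheduler):
>            self.store = store          # durable map: token -> job id
>            self.scheduler = scheduler  # the local job scheduler
>
>        def submit(self, job_description, submission_token):
>            existing = self.store.get(submission_token)
>            if existing is not None:
>                # A retry of an already-accepted submission: return
>                # the original job, never a second instance.
>                return existing
>            job_id = self.scheduler.enqueue(job_description)
>            # Record the token durably before acknowledging; a real
>            # implementation would make enqueue+record one transaction
>            # so the state is well defined even across crashes.
>            self.store.put(submission_token, job_id)
>            return job_id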
>
>So what is the next step?
>
>Carl
>
>Dr. Carl Kesselman                              email:   carl at isi.edu
>USC/Information Sciences Institute        WWW:    http://www.isi.edu/~carl
>4676 Admiralty Way, Suite 1001           Phone:  (310) 448-9338
>Marina del Rey, CA 90292-6695            Fax:      (310) 823-6714
>
>
>
>-----Original Message-----
>From: Marvin Theimer <theimer at microsoft.com>
>To: Carl Kesselman <carl at isi.edu>
>CC: Marvin Theimer <theimer at microsoft.com>; Marty Humphrey 
><humphrey at cs.virginia.edu>; ogsa-wg at ggf.org <ogsa-wg at ggf.org>
>Sent: Wed Mar 15 14:26:36 2006
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>Hi;
>
>
>
>I suspect that we're mostly in agreement on things.  In particular, I think 
>your list of four core aspects is a great starting point for a discussion 
>on the topic.
>
>
>
>I just replied to an earlier email from Ravi with a description of what I'm 
>hoping to get out of examining various HPC use cases:
>
>·        Identification of the simplest base case that everyone will have 
>to implement.
>
>·        Identification of common cases we want to optimize.
>
>·        Identification of how evolution and selective extension will work.
>
>
>
>I totally agree with you that the base use case I described isn't really a 
>"grid" use case.  But it is an HPC use case; in fact, it is arguably the 
>most common use case in current existence. :-)  So I think it's important 
>that we understand how to seamlessly integrate and support that common and 
>very simple use case.
>
>
>
>I also totally agree with you that we can't let a solution to the simplest 
>HPC use case paint us into a corner that prevents supporting the richer 
>use cases that grid computing is all about.  That's why I'd like to spend 
>significant effort exploring and understanding the issues of how to 
>support evolution and selective extension.  In an ideal world a legacy 
>compute cluster job scheduler could have a simple grid "shim" that lets it 
>participate at a basic level, in a natural manner, in a grid environment, 
>while smarter clients and HPC services could interoperate with each other 
>in various selectively richer manners by means of extensions to the basic 
>HPC grid design.
>
>
>
>One place where I disagree with you is your assertion that everything 
>needs to be designed to be restartable.  While that's a good goal to 
>pursue, I'm not convinced that you can achieve it in all cases.  In 
>particular, there are at least two cases that I claim we want to support 
>that aren't restartable:
>
>·        We want to be able to run applications that aren't restartable; 
>for example, because they perform non-idempotent operations on the 
>external physical environment.  If such an application fails during 
>execution then the only one who can figure out what the proper next steps 
>are is the end user.
>
>·        We want to be able to include (often-times legacy) systems that 
>aren't fault tolerant, such as simple small compute clusters where the 
>owners didn't think that fault tolerance was worth paying for.
>
>Of course any acceptable design will have to enable systems that are fault 
>tolerant to export/expose that capability.  To my mind it's more a matter 
>of ensuring that non-fault-tolerant systems aren't excluded from 
>participation in a grid.
>
>
>
>Other things we agree on:
>
>·        We should certainly examine what remote job submission systems 
>do.  We should certainly look at existing systems like Globus, Unicore, 
>and Legion.  In general, we should be looking at everything that has any 
>actual experience that we can learn from and everything that is actually 
>deployed and hence represents a system that we potentially need to 
>interoperate with.  (Whether a final design is actually able to 
>interoperate at any but the most basic level with various exotic existing 
>systems is a separate issue.)
>
>·        We should absolutely focus on codifying what we know how to do 
>and avoid doing research as part of a standards process.  I believe that 
>thinking carefully about how to support evolution and extension is our 
>best hope for allowing people to defer trying to bake their pet research 
>topic into standards, since it provides a story for why today's standards 
>don't preclude tomorrow's improvements.
>
>
>
>So I would propose that next steps are:
>
>·        Continue to explore and classify various HPC use cases of various 
>differing levels of complexity.
>
>·        Describe the requirements and limitations of existing job 
>scheduling and remote job submission systems.
>
>·        Continue identifying and discussing key "features" of use cases 
>and potential design solutions, such as the four that you identified in 
>your last email.
>
>
>
>Marvin.
>
>
>
>________________________________
>
>From: Carl Kesselman [mailto:carl at isi.edu]
>Sent: Tuesday, March 14, 2006 7:50 AM
>To: Marty Humphrey; ogsa-wg at ggf.org
>Cc: Marvin Theimer
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Hi,
>
>
>
>Just to be clear, I'm not trying to suggest that the scope be expanded. I 
>agree that the approach of focusing on a baby step is a good one, and I am 
>in total agreement with many of the assumptions stated in Marvin's list. 
>However, in taking baby steps I think that it is important that we end up 
>walking, and that in defining the use case, one can easily create 
>solutions that will not get you to the next step. This is my point about 
>looking at what we know how to do and have been doing in production 
>settings for many years now. In my mind, one of the "scope grandness" 
>problems has been that there has been far too little focus on codifying 
>what we know how to do, in favor of using a standards process as an excuse 
>to design new things.  So at the risk of sounding partisan, the simplified 
>use case that Marvin is proposing is exactly the use case that GRAM has 
>been supporting for over ten years now (I think the same can be said about 
>UNICORE and Legion).
>
>
>
>So let me try to be constructive.  One of the things that falls out of 
>Marvin's list could be a set of basic concepts/operations that need to be 
>defined.  These include:
>
>1) A way of describing "local" job configuration, i.e. where to find the 
>executable, data files, etc. This should be very conservative with its 
>assumptions on shared file systems and accessibility. In general, what 
>needs to be stated here is which aspects of the underlying resource are 
>exposed to the outward-facing interface.
>
>2) A way of naming a submission point (should probably have a way of 
>modeling queues).
>
>3) A core set of job management operations: submit, status, kill. These 
>need to be defined in such a way as to be tolerant of a variety of failure 
>scenarios, in that the state needs to be well defined in the case of failure.
>
>4) A state model that one can use to describe what is going on with the 
>jobs and a way to access that state.  It can be simple (queued, running, 
>done) but may need to be extensible.  One can view the accounting 
>information as being exposed through this state as well.
>
>
>
>So, one thing to do would be to agree that these are (or are not) the 
>right four things that need to be defined and, if so, start to flesh them 
>out in a way that supports the core use case but doesn't introduce 
>assumptions that would preclude more complex use cases in the future. One 
>possible (purely illustrative) rendering of the four concepts is sketched 
>below.
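>
>Here is such a sketch (Python; every name is made up for illustration, 
>and nothing here is a proposed wire format or spec):
>
>    from dataclasses import dataclass, field
>    from typing import Dict, List
>
>    @dataclass
>    class JobDescription:            # (1) "local" job configuration
>        executable: str
>        arguments: List[str]
>        environment: Dict[str, str] = field(default_factory=dict)
>        working_dir: str = "."       # conservative shared-FS assumptions
>
>    @dataclass
>    class SubmissionPoint:           # (2) a named submission point
>        endpoint: str                # e.g. a URL naming the service
>        queue: str = "default"       # optional queue model
>
>    class JobManager:                # (3) core operations
>        def submit(self, where: SubmissionPoint, job: JobDescription,
>                   token: str) -> str:
>            # token makes the failure state well defined (see above)
>            raise NotImplementedError
>        def status(self, job_id: str) -> str:
>            # (4) simple state model: "queued" | "running" | "done";
>            # accounting information could be exposed here as well
>            raise NotImplementedError
>        def kill(self, job_id: str) -> None:
>            raise NotImplementedError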
>
>
>
>
>
>Carl
>
>
>
>________________________________
>
>From: owner-ogsa-wg at ggf.org 
>[mailto:owner-ogsa-wg at ggf.org] On Behalf Of 
>Marty Humphrey
>Sent: Tuesday, March 14, 2006 6:32 AM
>To: ogsa-wg at ggf.org
>Cc: 'Marvin Theimer'
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Carl,
>
>
>
>Your comments are very important. We would love to have your active 
>participation in this effort. Your experience is, of course, matched by few!
>
>
>
>I re-emphasize that this represents (my words, not anyone else's) "baby 
>steps" that are necessary and important for the Grid community.  In my 
>opinion, the biggest challenge will be to fight the urge to expand the 
>scope beyond a small size. You cannot ignore the possibility that the GGF 
>has NOT made as much progress as it should have to date. Furthermore, one 
>plausible explanation is that the scope is too grand.
>
>
>
>-- Marty
>
>
>
>
>
>________________________________
>
>From: owner-ogsa-wg at ggf.org 
>[mailto:owner-ogsa-wg at ggf.org] On Behalf Of 
>Carl Kesselman
>Sent: Tuesday, March 14, 2006 8:47 AM
>To: Marvin Theimer; Ian Foster; ogsa-wg at ggf.org
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Hi,
>
>
>
>While I have no wish to engage in the "what is a Grid" argument, there are 
>some elements of your base use case that I would be concerned 
>about.  Specifically, the assumption that the submission is into a "local 
>cluster" on which there is an existing account may lead one to a solution 
>that may not generalize to the case of submission across 
>autonomous policy domains.  I would also argue that ignoring issues of 
>fault tolerance from the beginning is problematic.  One must at least 
>design operations that are restartable (for example, at-most-once 
>submission semantics).
>
>
>
>I would finally suggest that while examining existing job scheduling 
>systems is a good thing to do, we should also examine existing remote 
>submission systems (dare I say Grid systems).  The basic HPC use case is 
>one in which there is a significant amount of implementation and usage 
>experience.
>
>
>
>Thanks,
>
>
>Carl
>
>
>
>
>
>________________________________
>
>From: owner-ogsa-wg at ggf.org 
>[mailto:owner-ogsa-wg at ggf.org] On Behalf Of 
>Marvin Theimer
>Sent: Monday, March 13, 2006 2:42 PM
>To: Ian Foster; ogsa-wg at ggf.org
>Cc: Marvin Theimer
>Subject: RE: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Hi;
>
>
>
>Ian, you are correct that I view job submission to a cluster as being one 
>of the simplest, and hence most basic, HPC use cases to start with.  Or, 
>to be slightly more general, I view job submission to a "black box" that 
>can run jobs (be it a cluster, an SMP, an SGI NUMA machine, or 
>what-have-you) as being the simplest and hence most basic HPC use case to 
>start with.  The key distinction for me is that the internals of the "box" 
>are for the most part not visible to the client, at least as far as 
>submitting and running compute jobs is concerned.  There may well be a 
>separate interface for dealing with things like system management, but I 
>want to explicitly separate those things out in order to allow for use of 
>"boxes" that might be managed by proprietary means or by means obeying 
>standards that a particular job submission client is unfamiliar with.
>
>
>
>I think the use case that Ravi Subramaniam posted to this mailing list 
>back on 2/17 is a good one to start a discussion around.  However, I'd like 
>to present it from a different point-of-view than he did.  The manner in 
>which the use case is currently presented emphasizes all the capabilities 
>and services needed to handle the fully general case of submitting a batch 
>job to a computing utility/service.  That's a great way of producing a 
>taxonomy against which any given system or design can be compared to see 
>what it has to offer.  I would argue that the next step is to ask what is 
>the simplest subset that represents a useful system/design, and how one 
>should categorize the various capabilities and services he has identified 
>so as to arrive at meaningful components that can be selectively used to 
>obtain progressively more capable systems.
>
>
>
>Another useful exercise to do is to examine existing job scheduling 
>systems in order to understand what they provide.  Since in the real world 
>we will have to deal with the legacy of existing systems it will be 
>important to understand how they relate to the use cases we explore.  In 
>the same vein, it will be important to take into account and understand 
>other existing infrastructures that people use that are related to HPC use 
>cases.  I'm thinking of things like security infrastructures, directory 
>services, and so forth.  From the point-of-view of managing complexity and 
>reducing total-cost-of-ownership, it will be important to understand the 
>extent to which existing infrastructure and services can be reused rather 
>than reinvented.
>
>
>
>To kick off a discussion around the topic of a minimalist HPC use case, I 
>present a straw man description below, followed by a first attempt at 
>categorizing various areas of extension.  The categorization of 
>extension areas is not meant to be complete or even all that carefully 
>thought-out as far as componentization boundaries are concerned; it is 
>merely meant to be a first contribution to get the discussion going.
>
>
>
>A basic HPC use case: Compute cluster embedded within an organization.
>
>·     This is your basic batch job scheduling scenario.  Only a very basic 
>state transition diagram is visible to the client, with the following 
>states for a job: queued, running, finished.  (A minimal sketch of this 
>client view appears after this list.)  Additional states -- and 
>associated state transition request operations and functionality -- are 
>not supported.  Examples of additional states and associated functionality 
>include suspension of jobs and migration of jobs.
>
>·     Only "standard" resources can be described, for example: number of 
>cpus/nodes needed, memory requirements, disk requirements, etc.  (think 
>resources that are describable by JSDL).
>
>·     Once a job has been submitted it can be cancelled, but its resource 
>requests can't be modified.
>
>·     A distributed file system is accessible from client desktop machines 
>and client file servers, as well as compute nodes of the compute 
>cluster.  This implies that no data staging is required, that programs can 
>be (for the most part) executed from existing file system locations, and 
>that no program "provisioning" is required (since you can execute them 
>from wherever they are already installed).  Thus in this use case all data 
>transfer and program installation operations are the responsibility of the 
>user.
>
>·     Users already have accounts within the existing security 
>infrastructure (e.g. Kerberos).  They would like to use these and not have 
>to create/manage additional authentication/authorization credentials (at 
>least at the level that is visible to them).
>
>·     The job scheduling service resides at a well-known network name and 
>it is aware of the compute cluster and its resources by "private" means 
>(e.g. it runs on the head node of the cluster and employs private means to 
>monitor and control the resources of the cluster).  This implies that 
>there is no need for any sort of directory services for finding the 
>compute cluster or the resources it represents other than basic DNS.
>
>·     Compute cluster system management is opaque to users and is the 
>concern of the compute cluster's owners.  This implies that system 
>management is not part of the compute cluster's public job scheduling 
>interface.  This also implies that there is no need for a logging 
>interface to the service.  I assume that application-level logging can be 
>done by means of libraries that write to client files; i.e. that there is 
>no need for any sort of special system support for logging.
>
>·     A simple polling-based interface is the simplest form of interface 
>to something like a job scheduling service.  However, a simple call-back 
>notification interface is a very useful addition that potentially provides 
>substantial performance benefits, since it can avoid a lot of unnecessary 
>network traffic.  Only job state changes result in notification messages.
>
>·     There are no notions of fault tolerance.  Jobs that fail must be 
>resubmitted by the client.  Neither the cluster head node nor its compute 
>nodes are fault tolerant.  I do expect the client software to return an 
>indication of failure-due-system-fault when appropriate.  (Note that this 
>may also occur when things like network partitions occur.)
>
>·     One does need some notion of how to deal with orphaned resources and 
>jobs.  The notion of job lifetime and post-expiration garbage collection 
>is a natural approach here.
>
>·     The scheduling service provides a fixed set of scheduling policies, 
>with only a few basic choices (or maybe even just one), such as FIFO or 
>round-robin.  There is no notion, in general, of SLAs (which are a form of 
>scheduling policy).
>
>·     Enough information must be returned to the client when a job 
>finishes to enable basic accounting functionality.  This means things like 
>total wall-clock time the job ran and a summary of resources used.  There 
>is not a need for the interface to support any sort of grouping of 
>accounting information.  That is, jobs do not need to be associated with 
>projects, groups, or other accounting entities and the job scheduling 
>service is not responsible for tracking accounting information across such 
>entities.  As long as basic resource utilization information is returnable 
>for each job, accounting can be done externally to the job scheduling 
>service.  I do assume that jobs can be uniquely identified by some means 
>and can be uniquely associated with some principal entity existing in the 
>overall system, such as a user name.
>
>·     Just as there is no notion of requiring the job scheduling service 
>to track any but the most basic job-level accounting information, there is 
>no notion of the service enforcing quotas on jobs.
>
>·     Although it is generally useful to separate the notions of resource 
>reservation from resource usage (e.g. to enable interactive and debugging 
>use of resources), it is not a necessity for the most basic of job 
>scheduling services.
>
>·     There is no notion of tying multiple jobs together, either to 
>support things like dependency graphs or to support things like 
>workflows.  Such capabilities must be implemented by clients of the job 
>scheduling service.
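>
>As promised above, here is a minimal sketch of the client view implied by 
>this base case (Python; the service operations are hypothetical, and only 
>the three states plus the polling and accounting behavior come from the 
>list itself):
>
>    import time
>
>    JOB_STATES = ("queued", "running", "finished")
>
>    def wait_for_job(service, job_id, poll_interval=30.0):
>        """Poll until the job leaves the queued/running states."""
>        while True:
>            state = service.get_state(job_id)
>            if state == "finished":
>                # Enough is returned for basic accounting: wall-clock
>                # time and a summary of resources used.
>                return service.get_usage_summary(job_id)
>            if state not in JOB_STATES:
>                raise RuntimeError("unexpected job state: %r" % state)
>            # A call-back notification interface, where available,
>            # would avoid this polling traffic entirely.
>            time.sleep(poll_interval)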
>
>
>
>Interesting extension areas:
>
>·      Additional scheduling policies
>
>o     Weighted fair-share, ...
>
>o     Multiple queues
>
>o     SLAs
>
>o     ...
>
>·      Extended resource descriptions
>
>o     Additional resource types, such as GPUs
>
>o     Additional types of compute resources, such as desktop computers
>
>o     Condor-style class ads
>
>·      Extended job descriptions (as returned to requesting clients and 
>sys admins)
>
>·      Additional classes of security credentials
>
>·      Reservations separated from execution
>
>o     Enabling interactive and debugging jobs
>
>o     Support for multiple competing schedulers (incl. desktop cycle 
>stealing and market-based approaches to scheduling compute resources)
>
>·      Ability to modify jobs during their existence
>
>·      Fault tolerance
>
>o     Automatic rescheduling of jobs that failed due to system faults
>
>o     Highly available resources:  This is partly a policy statement by a 
>scheduling service about its characteristics and partly the ability to 
>rebind clients to migrated service endpoints
>
>·      Extended state transition diagrams and associated functionalities
>
>o     Job suspension
>
>o     Job migration
>
>o     ...
>
>·      Accounting & quotas
>
>·      Operating on arrays of jobs
>
>·      Meta-schedulers, multiple schedulers, and ecologies and hierarchies 
>of multiple schedulers
>
>o     Meta-schedulers
>
>·      Hierarchical job scheduling with a meta-scheduler as the only entry 
>point; forwarding jobs to the meta-scheduler from other subsidiary schedulers
>
>o     Condor-style matchmaking
>
>·      Directory services
>
>o     Using existing directory services
>
>o     Abstract directory service interface(s)
>
>·      Data transfer topics
>
>o     Application data staging
>
>·      Naming
>
>·      Efficiency
>
>·      Convenience
>
>·      Cleanup
>
>o     Program staging/provisioning
>
>·      Description
>
>·      Installation
>
>·      Cleanup
>
>
>
>
>
>Marvin.
>
>
>
>________________________________
>
>From: Ian Foster [mailto:foster at mcs.anl.gov]
>Sent: Monday, February 20, 2006 9:20 AM
>To: Marvin Theimer; ogsa-wg at ggf.org
>Cc: Marvin Theimer; Savas Parastatidis; Tony Hey; Marty Humphrey; 
>gcf at grids.ucs.indiana.edu
>Subject: Re: [ogsa-wg] Paper proposing "evolutionary vertical design efforts"
>
>
>
>Dear All:
>
>The most important thing to understand at this point (IMHO) is the scope 
>of this "HPC use case," as this will determine just how minimal we can be.
>
>I get the impression that the principal goal may be "job submission to a 
>cluster." Is that correct? How do we start to circumscribe the scope more 
>explicitly?
>
>Ian.
>
>
>
>At 05:45 AM 2/16/2006 -0800, Marvin Theimer wrote:
>
>Enclosed is a paper that advocates an additional set of activities that 
>the authors believe that the OGSA working groups should engage in.
>
>
>
>Broadly speaking, the OGSA and related working groups are already doing a 
>bunch of important things:
>
>·         There is broad exploration of the big picture, including 
>enumeration of use cases, taxonomy of areas, identification of research 
>issues, etc.
>
>·         There is work going on in each of the horizontal areas that have 
>been identified, such as EMS, data services, etc.
>
>·         There is work going on around individual specifications, such as 
>BES, JSDL, etc.
>
>
>
>Given that individual specifications are beginning to come to fruition, 
>the authors believe it is time to also start defining vertical 
>"profiles" that precisely describe how groups of individual specifications 
>should be employed to implement specific use cases in an interoperable 
>manner.  The authors also believe that the process of defining these 
>profiles offers an opportunity to close the design "loop" by relating the 
>various on-going protocol and standards efforts back to the use cases in a 
>very concrete manner.  This provides an end-to-end setting in which to 
>identify holes and issues that might require additional protocols and/or 
>(incremental) changes to existing protocols.  The paper introduces both 
>the general notion of doing focused "vertical design efforts" and then 
>focuses on a specific vertical design effort, namely a minimal HPC design.
>
>
>
>The paper derives a specific HPC design in a "first principles" manner since 
>the authors believe that this increases the chances of identifying 
>issues.  As a consequence, existing specifications and the activities of 
>existing working groups are not mentioned and this paper is not an attempt 
>to actually define a specifications profile.  Also, the absence of 
>references to existing work is not meant to imply that such work is in any 
>way irrelevant or inappropriate.  The paper should be viewed as a first 
>abstract attempt to propose a new kind of activity within OGSA.  The 
>expectation is that future open discussions and publications will explore 
>the concrete details of such a proposal.
>
>
>
>This paper was recently sent to a few key individuals in order to get 
>feedback from them before submitting it to the wider GGF 
>community.  Unfortunately that process took longer than intended and some 
>members of the community may have already seen a copy of the paper without 
>knowing the context within which it was written.  This email should 
>hopefully dispel any misconceptions that may have occurred.
>
>
>
>For those people who will be around for the F2F meetings on Friday, 
>Marvin Theimer will be giving a talk on the contents of this paper at a 
>time and place to be announced.
>
>
>
>Marvin Theimer, Savas Parastatidis, Tony Hey, Marty Humphrey, Geoffrey Fox
>
>
>
>_______________________________________________________________
>Ian Foster                    www.mcs.anl.gov/~foster
>Math & Computer Science Div.  Dept of Computer Science
>Argonne National Laboratory   The University of Chicago
>Argonne, IL 60439, U.S.A.     Chicago, IL 60637, U.S.A.
>Tel: 630 252 4619             Fax: 630 252 1997
>         Globus Alliance, www.globus.org
>

_______________________________________________________________
Ian Foster                    www.mcs.anl.gov/~foster
Math & Computer Science Div.  Dept of Computer Science
Argonne National Laboratory   The University of Chicago
Argonne, IL 60439, U.S.A.     Chicago, IL 60637, U.S.A.
Tel: 630 252 4619             Fax: 630 252 1997
         Globus Alliance, www.globus.org

