[ogsa-wg] Teleconference minutes - 2 November 2005

Sat Dec 10 12:01:15 CST 2005

Donal,

I see from the RSS description on GridForge that RSS will consist of a
Candidate Set Generator, which I assume could be loosely described as a
"matching" service, and an Execution Planning Service, which I assume
could loosely be described as a "scheduling service".  Come to think of
it, I guess the combination of the two could be loosely described as a
"broking service".  A couple of questions come to mind.

First, at this year's London F2F people began talking about the need to
do scheduling over all services, not just execution services. I.e. we
may need to schedule access to data services and network services.  It
would be desirable to have a framework generic enough to handle all
sorts of resources (as long as they in turn provide the necessary
information).  Has anyone considered this w.r.t. RSS?

Second, there are lots of ways to implement scheduling.  Would it be
possible just to specify the interfaces and allow implementations of the
services to use whichever algorithms they want?  E.g. in the data
architecture we say almost nothing about the functioning of a data
federation service, because we can abstract from most of the inner
workings.  Do we clearly understand what the output of the RSS service
should be?

Third, you mention the idea of passing in a ranking function parameter.
This sounds similar to a question I raised recently on this list about
selecting data sources, in which I suggested passing in a policy
argument.  Admittedly I was making that suggestion in the particular
context of passing a policy argument to a reference renewing service,
but the two cases do seem sufficiently similar to merit a comparison.
The replies I received then expressed a strong preference for letting
the selection be done by the client rather than attempting to
parameterise a service.  I think it's worth drilling down on this issue,
and this is what I attempt to do for the rest of this message.

At a sufficiently high level, we are attempting to do something simple.
We have two inputs: a list of activities and a list of resources, and
one output: a mapping of activities to resources.  In the simplest case,
we are selecting one resource (e.g. the replica to read data from or the
protocol/source from which to transfer the data), and this may be so
simple that we treat it differently (e.g. always in client code instead
of a separate service).  More complicated cases might produce a set of
pairs, or a more complicated conditional mapping.
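
To make the shape of the problem concrete, here is a minimal sketch in
Python of the simplest case (pick one resource) and of a naive mapping
built on top of it.  Everything here is an assumption for illustration:
the names select_one and map_activities, the dictionary representation
of a resource, and the latency_ms attribute are hypothetical and do not
come from any RSS draft.  The mapping ignores capacity and contention,
which a real scheduler could not.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical representation: a resource is a dict of named attributes.
Resource = Dict[str, object]


def select_one(resources: List[Resource],
               rank: Callable[[Resource], float]) -> Optional[Resource]:
    """Return the highest-ranked resource, or None if the list is empty."""
    return max(resources, key=rank, default=None)


def map_activities(activities: List[str],
                   resources: List[Resource],
                   rank: Callable[[str, Resource], float]) -> Dict[str, Resource]:
    """Map each activity to its highest-ranked resource (no capacity limits)."""
    mapping: Dict[str, Resource] = {}
    for activity in activities:
        best = select_one(resources, lambda r: rank(activity, r))
        if best is not None:
            mapping[activity] = best
    return mapping


# Simplest case: choose the replica to read from, preferring low latency.
replicas = [
    {"name": "replica-a", "latency_ms": 40},
    {"name": "replica-b", "latency_ms": 12},
]
chosen = select_one(replicas, rank=lambda r: -r["latency_ms"])
print(chosen["name"])
```

The point of the sketch is only that both cases share one signature:
inputs are activities and resources, plus some ranking, and the output
is a mapping.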

In each case we could either (i) move the data to the client and perform
the selection in client code, (ii) pass the algorithm as a parameter to
another service, or (iii) have a separate service that incorporates some
aspects of the algorithm while allowing some parameterisation by policy.
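
As a rough illustration of how options (i) and (iii) differ, the Python
sketch below contrasts selection in client code with a service that
fixes part of the algorithm and accepts a restricted policy.  All names
(client_side_select, SelectionService, the requirements and weights
parameters, the cpus and free_gb attributes) are hypothetical, and the
weighted-sum policy is just one assumed form of parameterisation.  The
built-in filter plays the role of the resource satisfaction check
mentioned in the quoted message below.

```python
from typing import Callable, Dict, List

# Hypothetical representation: a resource is a dict of named attributes.
Resource = Dict[str, object]


def client_side_select(resources: List[Resource],
                       rank: Callable[[Resource], float]) -> List[Resource]:
    """Option (i): the client holds the descriptions and ranks them itself."""
    return sorted(resources, key=rank, reverse=True)


class SelectionService:
    """Option (iii): the service owns the algorithm; callers pass a policy."""

    def __init__(self, resources: List[Resource]) -> None:
        self._resources = resources

    def select(self, requirements: Dict[str, float],
               weights: Dict[str, float]) -> List[Resource]:
        # Built-in step: drop resources that miss any minimum requirement.
        acceptable = [r for r in self._resources
                      if all(r.get(k, 0) >= v for k, v in requirements.items())]

        # Parameterised step: order by a weighted sum of named attributes.
        def score(r: Resource) -> float:
            return sum(w * r.get(k, 0) for k, w in weights.items())

        return sorted(acceptable, key=score, reverse=True)


resources = [
    {"name": "cluster-a", "cpus": 16, "free_gb": 100},
    {"name": "cluster-b", "cpus": 64, "free_gb": 20},
    {"name": "cluster-c", "cpus": 4, "free_gb": 500},
]

by_client = client_side_select(resources, rank=lambda r: r["cpus"])
service = SelectionService(resources)
by_service = service.select(requirements={"cpus": 8}, weights={"free_gb": 1.0})
print([r["name"] for r in by_client])
print([r["name"] for r in by_service])
```

Option (ii) would look like client_side_select moved behind a service
boundary, with the rank callable shipped as a parameter.  The weights
dictionary is deliberately much less expressive than an arbitrary
function, which is exactly the trade-off at issue.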

One factor that influences this choice might be who has access to
the necessary information.  If the client has privileged information
that should not be made public, then the choice might have to be made by
the client; or the framework should encrypt the policy parameter and
provide a trust relationship with the selection service.  Conversely,
the client may want to see only an abstraction of the detailed selection
process.

If we have parameterisation, then the question arises of how expressive
a language we need to specify the parameterisation.  Do we need a
general-purpose programming language (e.g. Python script) or will the
draft WS-Agreement language be sufficient?  

The replies I received suggested both that the parameterisation would
require a general programming language and that the client may have
privileged knowledge that it does not wish to share, and therefore it
would be better to leave decisions to the client.  I'm rather sceptical
of the first point but it would be good to examine real systems to
determine what is really needed.

There are clearly many possible answers in this space - which brings us
back to Ian's list of schedulers.  One question might be whether we can
identify an interface, or a small number of interfaces, that generalises
a significant number of practical use cases.

Best wishes,

Dave.

-----Original Message-----
From: owner-ogsa-wg at ggf.org [mailto:owner-ogsa-wg at ggf.org] On Behalf Of
Donal K. Fellows
Sent: 09 December 2005 09:37
To: 'ogsa-wg'
Cc: ogsa-rss-wg at ggf.org
Subject: Re: [ogsa-wg] Teleconference minutes - 2 November 2005

Steven Newhouse wrote:
> A selector interface is being defined as part of the RSS activities
> (https://forge.gridforum.org/projects/ogsa-rss-wg/) . They may have a 
> comment to make on the nature of the interface they are considering - 
> and how it could/could not encompass these services.

I (as OGSA-RSS co-chair) welcome any input people have over possible
strategies. At the moment, I'm thinking in terms of ordered sets where
the caller supplies a ranking function[*] (or input to such a function).
That gives plenty of flexibility and power. I'd love to have input from
other people as to what sorts of things ought to go in the ranking
function parameter; we could leave it unspecified (easy for the RSS WG!)
but that's not good for working towards an interoperable solution so I'd
like to seed the space with at least a minimal level of things that must
be supported.

Donal Fellows.
[* Experience with Condor indicates that a "general acceptability"
    filter is also useful, but a first hack at that is a straight
    resource satisfaction check which ought to be performed anyway. ]