[SAGA-RG] SAGA Resource Package Comments
Ole Weidner
ole.weidner at rutgers.edu
Wed Jan 16 05:53:48 EST 2013
All,
On Jan 16, 2013, at 09:14 , Andre Merzky <andre at merzky.net> wrote:
> Hi Steve,
>
> sorry, I thought I had answered this -- wishful thinking it seems....
> But I also put it back on the list, hope that's ok with you...
>
> Your comments reflect, to some extent, the reservations Ole has: not
> focused enough, no clean separation of concerns.
Correct. Steve's and my concerns are pretty much the same.
> I understand what
> you mean, but I think we are missing the opportunity to make the
> *usage* of dynamically allocated resources simpler.
Not sure if this can or should really be accomplished in one single SAGA package.
>
> About pilots: sure, if you don't have control over the pilots, a
> resource handle is not very useful (beyond inspection). But the same
> holds for all other resource types, too: if you are using a VM which
> was not created by you, you likely can't tear it down either, and the
> resource handle is near useless. In those cases, the resources look
> like the old job service. The resource API will only be really useful
> for those cases where you have control over the resources...
>
> Either way, thanks for the comments. This will likely mean that we'll
> trim down the resource API to simply perform
>
> - describe resource
> - acquire resource
> - release resource
> - get job/data service for resource
+1
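
For illustration, the four trimmed-down operations could be rendered roughly as below. This is only a sketch under stated assumptions: `Description`, `Manager`, `JobService` and all method names are hypothetical stand-ins, not the saga-python API, and "describe resource" is read here as "look up the description of an acquired resource".

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Description:
    """Hypothetical resource description (not the saga-python class)."""
    kind: str      # e.g. "compute" or "storage"
    size: int = 1  # e.g. number of cores


class JobService:
    """Hypothetical stand-in for a job service bound to one resource."""

    def __init__(self, resource_id: str):
        self.resource_id = resource_id


class Manager:
    """Hypothetical manager exposing only the four operations listed above."""

    def __init__(self) -> None:
        self._resources: Dict[str, Description] = {}
        self._counter = 0

    def acquire(self, description: Description) -> str:
        """Acquire a resource and return its id."""
        self._counter += 1
        rid = f"{description.kind}-{self._counter}"
        self._resources[rid] = description
        return rid

    def describe(self, rid: str) -> Description:
        """Return the description of an acquired resource."""
        return self._resources[rid]

    def release(self, rid: str) -> None:
        """Release a resource; raises KeyError if it is unknown."""
        self._resources.pop(rid)

    def get_job_service(self, rid: str) -> JobService:
        """Hand out a job service for a resource that is still acquired."""
        if rid not in self._resources:
            raise KeyError(f"unknown or released resource: {rid}")
        return JobService(rid)
```

Usage would then be acquire, get the job service, and release, with job submission itself left entirely to the job package.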
>
> Best, Andre.
Cheers
Ole
>
>
>
> On Wed, Jan 16, 2013 at 2:21 AM, Steve Fisher <dr.s.m.fisher at gmail.com> wrote:
>> I am surprised to have received no reaction to this!
>>
>> Steve
>>
>>
>> On 2 January 2013 09:29, Steve Fisher <dr.s.m.fisher at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am just back after Christmas. After all this time my response is rather
>>> short.
>>>
>>> I do not like the resource API as you have defined it. The problems are
>>> caused by trying to extend the job service. The task of allocating resources
>>> is just another service. The resource after allocation may itself be a job
>>> service but it will look just like any other job service and not know
>>> whether it has been dynamically created by a cloud, or is on a virtual
>>> machine or a physical machine.
>>>
>>> I would leave the job stuff alone and think about an API which is
>>> OpenNebula- or OCCI-inspired. I find your proposal unnecessarily complex and
>>> it is mixing concerns.
>>>
>>> I am not sure where pilot jobs come in. For EGI/EGEE a pilot job starts
>>> and the job cannot be manipulated by the user so it requires no API. The
>>> pilot job just looks after the running of a sequence of other jobs. In this
>>> model the pilot job is owned by the experiment and looks after running
>>> other users' jobs. I have no experience of the way that pilot jobs have
>>> evolved. It is possible that if we want to accommodate them in SAGA,
>>> another extension might be needed; however, I would need to understand better
>>> the way in which the pilot job notion has moved on and perhaps diversified.
>>>
>>> Steve
>>>
>>>
>>> On 18 December 2012 15:19, Andre Merzky <andre at merzky.net> wrote:
>>>>
>>>> Sorry, I should have attached both... - here is the package proper...
>>>>
>>>> Andre.
>>>>
>>>>
>>>> On Tue, Dec 18, 2012 at 2:31 PM, Steve Fisher <dr.s.m.fisher at gmail.com>
>>>> wrote:
>>>>> So the discussion just relates to the Python bindings for resources. I
>>>>> thought it was a more general resource package matter as an update to
>>>>> what was presented at OGF 29?
>>>>>
>>>>> Steve
>>>>>
>>>>>
>>>>> On 18 December 2012 13:00, Andre Merzky <andre at merzky.net> wrote:
>>>>>>
>>>>>> Welcome to the future! :-) document is attached.
>>>>>>
>>>>>> Best, Andre.
>>>>>>
>>>>>> PS.: the create() methods need to get moved outside of the classes,
>>>>>> I'll fix that....
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Dec 18, 2012 at 1:54 PM, Steve Fisher
>>>>>> <dr.s.m.fisher at gmail.com>
>>>>>> wrote:
>>>>>>> "tomorrow" has finally arrived. Could you send me a copy of the
>>>>>>> document, please?
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>> On 13 December 2012 01:09, Steve Fisher <dr.s.m.fisher at gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> OK. I will take a look tomorrow
>>>>>>>> Steve
>>>>>>>>
>>>>>>>> On 12 Dec 2012 21:35, "Ole Weidner" <ole.weidner at rutgers.edu>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Steve,
>>>>>>>>>
>>>>>>>>> would you be willing to provide us with some input with regard to
>>>>>>>>> the proposed saga resource package? Unfortunately, Andre and I
>>>>>>>>> currently are in somewhat of a deadlock situation -- see our
>>>>>>>>> discussion below ;-) We haven't been very successful at getting any
>>>>>>>>> external input / opinion so far.
>>>>>>>>>
>>>>>>>>> We are planning to implement a first prototype of the resource
>>>>>>>>> package for saga-python in the next two or three days, so it would
>>>>>>>>> be good to have at least some consensus on what it should look like.
>>>>>>>>>
>>>>>>>>> So if you would like to share your opinion with us, that would be
>>>>>>>>> much appreciated!
>>>>>>>>>
>>>>>>>>> Many thanks!
>>>>>>>>> Ole
>>>>>>>>>
>>>>>>>>> On Dec 8, 2012, at 04:07 , Andre Merzky <andre at merzky.net> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ole,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 13, 2012 at 4:54 PM, Ole Weidner
>>>>>>>>>> <ole.weidner at rutgers.edu>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Andre,
>>>>>>>>>>>
>>>>>>>>>>> here are some quick comments w.r.t. the resource package. It's
>>>>>>>>>>> more for the record - we can talk about this in detail when we
>>>>>>>>>>> discuss the resource package on the 28th (journal club).
>>>>>>>>>>
>>>>>>>>>> Sorry this took so long, I am trying to catch up with things now...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I fundamentally disagree that 'Compute' inherits from
>>>>>>>>>>> 'saga.job.Service' and that 'Storage' inherits from
>>>>>>>>>>> 'saga.filesystem.Directory' because this:
>>>>>>>>>>>
>>>>>>>>>>> 1. breaks SAGA's horizontally independent package model: it
>>>>>>>>>>> would mean that I can't implement a resource-package-only
>>>>>>>>>>> implementation of SAGA. I would have to implement the Job and
>>>>>>>>>>> the Filesystem package as well!
>>>>>>>>>>
>>>>>>>>>> A compute resource is useless if you can't submit jobs, a
>>>>>>>>>> storage resource is useless if you can't store files. That does
>>>>>>>>>> not change if you replace inheritance with get_job_service /
>>>>>>>>>> get_filesystem -- in both cases, you will need to implement the
>>>>>>>>>> job package / file package, too.
>>>>>>>>>>
>>>>>>>>>> While we indeed try to avoid too many cross dependencies between
>>>>>>>>>> functional API packages, we do have them in some places, most
>>>>>>>>>> notably for the namespace derivatives.
>>>>>>>>>>
>>>>>>>>>> FWIW, another reason why compute resource inherits from
>>>>>>>>>> job.service is that we intended to fix some shortcomings of the
>>>>>>>>>> job service, in particular wanted to add the ability to directly
>>>>>>>>>> submit JSDL. Inheritance provides a very simple means to do so.
>>>>>>>>>> I agree though that this should not be the foremost concern for
>>>>>>>>>> API design - but anyway.
>>>>>>>>>>
>>>>>>>>>> Another point though I want to make: I don't like the idea to
>>>>>>>>>> have a job service, which is not stateful, depending on the state
>>>>>>>>>> of a compute resource (and same for filesystem / storage
>>>>>>>>>> resource) -- on API level, there are no means to infer if the job
>>>>>>>>>> service is valid for job submission at any point in time (you
>>>>>>>>>> can't get a resource handle from a job service instance) - so it
>>>>>>>>>> always boils down to trial and error. We so far managed to avoid
>>>>>>>>>> those implicit state dependencies, and I would like to keep it
>>>>>>>>>> this way. [Yes, a decoupling maps better to the Pilot API, but I
>>>>>>>>>> would rather like to fix that in the Pilot API ;-)]
>>>>>>>>>>
>>>>>>>>>> Don't get me wrong: I understand that inheritance is a pretty
>>>>>>>>>> strong coupling, and it does not necessarily reflect how DCIs
>>>>>>>>>> are architected internally -- from the end user perspective
>>>>>>>>>> though, I find this rendering simple, intuitive, and easy to
>>>>>>>>>> use...
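
The two renderings debated in this exchange can be contrasted in a small sketch. All classes here are hypothetical stand-ins (the `JobService` below is not `saga.job.Service`), and the `active` flag is an assumed, simplified model of resource state.

```python
class JobService:
    """Hypothetical stand-in for a job service; only records commands."""

    def __init__(self):
        self.submitted = []

    def submit(self, command):
        self.submitted.append(command)
        return len(self.submitted) - 1  # hypothetical job index


class ComputeInherited(JobService):
    """Rendering A: the compute resource *is* a job service."""

    def __init__(self):
        super().__init__()
        self.active = True  # assumed, simplified resource state

    def submit(self, command):
        # The resource state is visible where the job is submitted.
        if not self.active:
            raise RuntimeError("resource has been released")
        return super().submit(command)


class ComputeComposed:
    """Rendering B: the compute resource *hands out* a job service."""

    def __init__(self):
        self.active = True
        self._service = JobService()

    def get_job_service(self):
        # The returned service holds no reference back to the resource,
        # so it cannot tell whether the resource is still usable.
        return self._service
```

In rendering B the packages stay independent, but a service obtained before the resource was released has no API-level way to notice the release, which is the implicit state dependency described above.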
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 2. it mixes separate concerns: resource management and job submission!
>>>>>>>>>>
>>>>>>>>>> I kind of agree, but think that this is offset by ease of use:
>>>>>>>>>> get a compute resource, submit jobs to it - bang. This is, by
>>>>>>>>>> far, the dominant use case, so I would like to see this rendered
>>>>>>>>>> exceedingly simple.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 3. I don't think that 'Directory' necessarily provides the
>>>>>>>>>>> right abstraction for all 'Storage' types. Certainly for most,
>>>>>>>>>>> but not for all. It's unnecessarily confining.
>>>>>>>>>>
>>>>>>>>>> Yes, that is a limitation -- but unless we have a compelling use
>>>>>>>>>> case for other storage abstractions, and those use cases do not
>>>>>>>>>> imply an overly complicated approach to storage resources, that
>>>>>>>>>> is the best abstraction we have, right? Even if the backend
>>>>>>>>>> storage resources have a limited / constrained namespace (think
>>>>>>>>>> Amazon S3), the filesystem abstraction still holds up nicely
>>>>>>>>>> IMHO. Also, I am not concerned about provisioning of databases
>>>>>>>>>> etc. -- we don't have decent (or any) abstractions for those in
>>>>>>>>>> SAGA, nor do we have use cases that I know of -- so that would
>>>>>>>>>> be out of scope for now.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Furthermore,
>>>>>>>>>>>
>>>>>>>>>>> - class manager -> Manager
>>>>>>>>>>
>>>>>>>>>> fixed, thanks.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - what does manager.describe_resource() do? why can't it be
>>>>>>>>>>> manager.resources[x].get_description()
>>>>>>>>>>
>>>>>>>>>> Hmm, probably right - but while that works nicely in python, you
>>>>>>>>>> would have
>>>>>>>>>>
>>>>>>>>>> manager.get_resource (id).get_description ()
>>>>>>>>>>
>>>>>>>>>> and chaining is something we do not promote in the API so far.
>>>>>>>>>> Thus, I would like to keep the method in the API, but I agree
>>>>>>>>>> that your version is (in Python) the more intuitive one.
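
As a rough sketch of how the three access styles mentioned here could coexist, the hypothetical `Manager` below offers the property-style `resources[x]`, the explicit `get_resource()`, and a `describe_resource()` convenience method that avoids chaining. The class and attribute names are assumptions for illustration, not the saga-python implementation.

```python
class Resource:
    """Hypothetical resource handle carrying its own description."""

    def __init__(self, rid, description):
        self.id = rid
        self._description = description

    def get_description(self):
        return self._description


class Manager:
    """Hypothetical manager offering all three access styles."""

    def __init__(self):
        # Property-style access: manager.resources[rid]
        self.resources = {}

    def get_resource(self, rid):
        # 'Non-property' version of resources[rid].
        return self.resources[rid]

    def describe_resource(self, rid):
        # Convenience method so callers do not need to chain calls.
        return self.get_resource(rid).get_description()
```

All three forms return the same description, so keeping `describe_resource()` costs one delegating line in a Python binding.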
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - speaking of resources[x] - there's no 'non-property' version,
>>>>>>>>>>> i.e., get_resource()
>>>>>>>>>>
>>>>>>>>>> thanks, I'll fix that.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - I would prefer explicit list/get_compute(),
>>>>>>>>>>> list/get_storage(), list/get_network() and so on, so that we
>>>>>>>>>>> don't have to do type checking all over the place.
>>>>>>>>>>
>>>>>>>>>> The list / get calls have a 'type' parameter, so you can filter
>>>>>>>>>> for specific resource types:
>>>>>>>>>>
>>>>>>>>>> compute_resources = manager.list_resources (saga.resource.Compute)
>>>>>>>>>> storage_resources = manager.list_resources (saga.resource.Storage)
>>>>>>>>>>
>>>>>>>>>> The default is 'Any' though, which obviously gives you all types.
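
A minimal sketch of such a type-filtered listing, assuming the 'Any' default can be modelled by a common base class. `Resource`, `Compute`, `Storage` and `Manager` are hypothetical stand-ins here, not the `saga.resource` classes.

```python
class Resource:
    """Hypothetical common base class; plays the role of 'Any'."""

    def __init__(self, rid):
        self.id = rid


class Compute(Resource):
    pass


class Storage(Resource):
    pass


class Manager:
    """Hypothetical manager with a type-filtered list call."""

    def __init__(self, resources):
        self._resources = list(resources)

    def list_resources(self, rtype=Resource):
        # With the default base class, every resource matches.
        return [r.id for r in self._resources if isinstance(r, rtype)]
```

With this shape, explicit calls such as `list_compute()` would be thin wrappers around `list_resources(Compute)`, so the caller never needs to do type checking either way.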
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - why are there two Pool.add() methods? Why do we want to be
>>>>>>>>>>> able to add resources as strings?
>>>>>>>>>>
>>>>>>>>>> Alas, we have that in a few places in the API. There was a very
>>>>>>>>>> long discussion, a long time ago, where people argued that only
>>>>>>>>>> using handles would have too much of a performance impact (you'd
>>>>>>>>>> always need to create handles, which is *at least* one
>>>>>>>>>> round-trip), and that only using IDs would be too unwieldy to
>>>>>>>>>> handle in many cases. While I agree with the first point, I do
>>>>>>>>>> not think that the second one is very valid. That is one item I
>>>>>>>>>> would like to clean up across SAGA in an eventual API revision
>>>>>>>>>> (if that ever happens). So, for now that is in the resource API
>>>>>>>>>> as well, for consistency, but I personally do not care much
>>>>>>>>>> about it. If we limit that, then I would be in favor of the id
>>>>>>>>>> version.
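
The handle-versus-id question can be illustrated with a single `add()` that accepts either form. This is a sketch only: `Pool` and `Resource` are hypothetical stand-ins, and a real binding might expose two overloads instead of one method.

```python
class Resource:
    """Hypothetical resource handle; creating a real one may cost a round-trip."""

    def __init__(self, rid):
        self.id = rid


class Pool:
    """Hypothetical pool that stores resource ids only."""

    def __init__(self):
        self._ids = []

    def add(self, resource):
        # Accept either an id string or a handle; only the id is kept,
        # so callers holding just an id never need to create a handle.
        rid = resource if isinstance(resource, str) else resource.id
        self._ids.append(rid)

    def list(self):
        return list(self._ids)
```

Restricting the API to the id version would simply drop the handle branch in `add()`.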
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers, Andre.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Cheers!
>>>>>>>>>>> Ole
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nothing is really difficult...
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Nothing is really difficult...
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nothing is really difficult...
>>>
>>>
>>
>
>
>
> --
> There are only two hard things in Computer Science: cache invalidation
> and naming things.
>
> -- Phil Karlton