[saga-rg] JobService.submitJob() query
Christopher Smith
csmith at platform.com
Mon Mar 6 17:18:58 CST 2006
Ultimately the host specified in run_job turns into the SAGA_HostList
attribute with only the one name specified.
Yes ... I suppose the RM endpoint is needed somewhere for run_job, but then
it's starting to get more complicated, and thus you have to start
questioning whether it's needed ...... etc, etc
-- Chris
On 6/3/06 15:15, "Andre Merzky" <andre at merzky.net> wrote:
> Hi Chris,
>
> Quoting [Christopher Smith] (Mar 07 2006):
>>
>> run_job doesn't have an RM contact, but is actually specifying the end
>> resource to run on (a FQDN of a compute node). It's there mostly to support
>> things like ssh backends.
>
> For SSH or GRAM, what would be the difference between a
> "resource management contact" and the "end resource"?
> Would there be any difference?
>
> If the "RM contact" is "my_rm://rm.grid.net:123/", what would the
> "end resource" be used for: as a hint to the RM about where to
> run?
>
> Thanks, Andre.
>
>
>>
>> -- Chris
>>
>>
>> On 6/3/06 13:59, "Andre Merzky" <andre at merzky.net> wrote:
>>
>>> Hi Chris, Graeme,
>>>
>>> if we go for 2, we should actually remove the RM contact
>>> from run_job, as that would conflict with the RM contact
>>> given on the job_service creation.
>>>
>>> Cheers, Andre.
>>>
>>>
>>> Quoting [Christopher Smith] (Mar 06 2006):
>>>>
>>>> I'll finally provide some context first ....
>>>>
>>>> As Andre mentioned in a previous email, we had left the choice of resource
>>>> manager up to the implementation of the library, and didn't expose it up
>>>> into the API layer. Feedback indicates that this was a mistake. :-)
>>>>
>>>> I can go for either 1 or 2, with a preference for 2. The reason is that to
>>>> me the JobService represents the service endpoint to a job scheduler or
>>>> resource manager. That said, I'm not religious about it.
>>>>
>>>> -- Chris
>>>>
>>>>
>>>> On 22/2/06 01:27, "Graeme Pound" <G.E.POUND at soton.ac.uk> wrote:
>>>>
>>>>> Andre,
>>>>>
>>>>> I may have confused the issue by mentioning the JobDefinition attribute
>>>>> "SAGA_HostList". This appears to have a valid role mapping to the
>>>>> "CandidateHosts" element of a JSDL document (and also in LSF?). The
>>>>> "SAGA_HostList" attribute should not be used to specify the resource
>>>>> manager.
>>>>>
>>>>> The problem rather is how is the resource manager (in the form of a URL
>>>>> or machine name) specified when the user invokes submitJob(). There are
>>>>> three possible solutions:
>>>>> 1) Add an argument "host" to the submitJob() method (similar to
>>>>> runJob()) to specify the resource manager
>>>>> 2) Specify the resource manager in the JobService constructor
>>>>> 3) Add a new required attribute to the JobDefinition class (e.g.
>>>>> "SAGA_ResourceManager")
>>>>>
>>>>> I do not like solution 3 since job definition is conceptually distinct
>>>>> from the choice of resource manager, for example you may wish to define
>>>>> a single JobDefinition and submit that job to several different resource
>>>>> managers.
>>>>>
>>>>> I do not have a strong preference between solutions 1 and 2. (Although
>>>>> constructors pose a small problem for the Java bindings)
>>>>>
>>>>> Graeme
>>>>>
>>>>>
>>>>>
>>>>> Andre Merzky wrote:
>>>>>> Hi Graeme,
>>>>>>
>>>>>> I am not much of an expert in resource thingies, so what I
>>>>>> say should be taken with a grain of salt. Or rather with a
>>>>>> spoon I guess...
>>>>>>
>>>>>> I assumed that run_job ("myhost", ...) would run the job on
>>>>>> myhost. But what you say (and what the spec implies I
>>>>>> think, after reading again) is that myhost is specifying the
>>>>>> resource manager contact, not the target host. Right?
>>>>>>
>>>>>> Well, then we have the ability to specify a resource manager
>>>>>> contact for run_job, and a target resource on submit_job.
>>>>>> But no combination seems possible.
>>>>>>
>>>>>> Frankly, I think that does not make sense. For one, the
>>>>>> semantics of run_job and submit_job should not be that
>>>>>> different (run_job was supposed to be a shortcut for
>>>>>> submit_job). Secondly, there are clearly use cases for both
>>>>>> versions (resource manager contact and target resource).
>>>>>>
>>>>>> One solution would be to specify the resource manager
>>>>>> contact in the constructor of the job_service, as that is
>>>>>> supposed to represent the resource and job manager anyway I
>>>>>> guess. The host in run_job would then specify the target
>>>>>> resource ("" for 'any resource' I guess).
>>>>>>
>>>>>> But again, I am not on firm ground in resource management,
>>>>>> and would rather not mess with this without feedback from someone
>>>>>> more knowledgeable. What is your opinion?
>>>>>>
>>>>>> Thanks, Andre.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting [Graeme Pound] (Feb 21 2006):
>>>>>>> Andre,
>>>>>>>
>>>>>>> I do not think that the attribute 'SAGA_HostList' of
>>>>>>> JobDefinition is appropriate either. The description of
>>>>>>> 'SAGA_HostList' in the strawman API reads:
>>>>>>>
>>>>>>> SAGA_HostList
>>>>>>> - A list of host names, or host group names, which can be
>>>>>>> considered by the resource manager as candidate hosts for
>>>>>>> the job. Whether or not the job actually ends up running
>>>>>>> on one of the hosts in the list, is solely at the
>>>>>>> discretion of the resource manager. Vector of strings.
>>>>>>> (JSDL, LSF)
>>>>>>>
>>>>>>> This attribute should be used to pass information to the
>>>>>>> resource manager, NOT to specify the resource manager. For
>>>>>>> example it maps to the "CandidateHosts" element of a JSDL
>>>>>>> document.
>>>>>>>
>>>>>>> The first argument of runJob() is the "host name or IP
>>>>>>> address of the endpoint which will accept and run the
>>>>>>> job". This argument is not defined for submitJob(), nor is
>>>>>>> it defined as an attribute of JobDefinition.
>>>>>>>
>>>>>>> Graeme
>>>>>>
>>>>>>>
>>>>>>> Andre Merzky wrote:
>>>>>>>> Oops, you are right! My mistake - mixed it up with
>>>>>>>> ExecutionHosts. Well, then I really was off target:
>>>>>>>> SAGA_HostList is then indeed what you should use to specify
>>>>>>>> the target resource.
>>>>>>>>
>>>>>>>> The run_job would, in its simplest implementation, create a
>>>>>>>> job description with SAGA_HostList set to the specified
>>>>>>>> endpoint, and do a submit_job on that description.
>>>>>>>>
>>>>>>>> Sorry for creating confusion...
>>>>>>>>
>>>>>>>> Andre.
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting [Graeme Pound] (Feb 20 2006):
>>>>>>>>> Date: Mon, 20 Feb 2006 16:24:32 +0000
>>>>>>>>> From: Graeme Pound <G.E.POUND at soton.ac.uk>
>>>>>>>>> To: Andre Merzky <andre at merzky.net>
>>>>>>>>> CC: SAGA RG <saga-rg at ggf.org>
>>>>>>>>> Subject: Re: [saga-rg] JobService.submitJob() query
>>>>>>>>>
>>>>>>>>> Andre,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 'SAGA_HostList' is on the job, and read only - it gives
>>>>>>>>>> information on where the job _is_ running, not where it
>>>>>>>>>> _should_ run.
>>>>>>>>> I do not follow you here. 'SAGA_HostList' is an attribute of
>>>>>>>>> JobDefinition (not JobInfo), therefore it should not be read only.
>>>>>>>>>
>>>>>>>>> Graeme
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Andre Merzky wrote:
>>>>>>>>>> Uhm, I think you got us there. I don't see any way to
>>>>>>>>>> specify the resource either, so it's left completely up to
>>>>>>>>>> the backend to schedule the job. I am not sure if that was
>>>>>>>>>> intended.
>>>>>>>>>>
>>>>>>>>>> Chris, are we missing something? Did we intend to leave
>>>>>>>>>> resource specification out? Can't really be, as we have it
>>>>>>>>>> in run_job as Graeme points out...
>>>>>>>>>>
>>>>>>>>>> 'SAGA_HostList' is on the job, and read only - it gives
>>>>>>>>>> information on where the job _is_ running, not where it
>>>>>>>>>> _should_ run.
>>>>>>>>>>
>>>>>>>>>> Andre.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Quoting [Graeme Pound] (Feb 20 2006):
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Can anybody clear up this issue for me? I may be missing
>>>>>>>>>>> something in the spec.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Graeme
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -3.31 How is the resource manager endpoint for
>>>>>>>>>>> JobService.submitJob() specified?
>>>>>>>>>>>
>>>>>>>>>>> The endpoint is specified by an argument of JobService.runJob(),
>>>>>>>>>>> but is not _obvious_ for submitJob(). Is the endpoint a property
>>>>>>>>>>> of the instance of JobService (with runJob() a 'static' method of
>>>>>>>>>>> the class), or is the endpoint specified as an attribute of
>>>>>>>>>>> JobDefinition?
>>>>>>>>>>>
>>>>>>>>>>> I assume that the contents of the JobDefinition attribute
>>>>>>>>>>> 'SAGA_HostList' are intended to be passed to the resource manager
>>>>>>>>>>> (rather than to specify the resource manager itself); for example,
>>>>>>>>>>> mapping to the 'CandidateHosts' element of a JSDL document. Is an
>>>>>>>>>>> additional attribute within JobDefinition required to specify the
>>>>>>>>>>> endpoint?
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>
>
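
For illustration only, option 2 from the thread (resource manager contact
given when the job service is created, with run_job as a shortcut that sets
SAGA_HostList to the single named host and then calls submit_job) might be
sketched as follows. This is not the strawman API or any real binding: the
class layout, the "executable" field, the "submitted" list, the integer job
handle, and the host name node1.example.org are all assumptions made up for
this sketch. Only the names JobService, JobDefinition, run_job, submit_job
and SAGA_HostList, and the example contact string, come from the discussion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class JobDefinition:
    # Hypothetical stand-in: only what this sketch needs is modelled.
    executable: str
    attributes: Dict[str, List[str]] = field(default_factory=dict)


class JobService:
    def __init__(self, rm_contact: str) -> None:
        # Option 2: the resource manager contact is fixed at construction.
        self.rm_contact = rm_contact
        # Hypothetical record of what would be sent to the resource manager.
        self.submitted: List[Tuple[str, JobDefinition]] = []

    def submit_job(self, jd: JobDefinition) -> int:
        # A real backend would contact self.rm_contact here; this sketch
        # only records the request and returns its index as a job handle.
        self.submitted.append((self.rm_contact, jd))
        return len(self.submitted) - 1

    def run_job(self, host: str, executable: str) -> int:
        # Shortcut for submit_job: the host becomes a one-element
        # SAGA_HostList; an empty host means "any resource", so no
        # candidate hosts are set.
        attrs = {"SAGA_HostList": [host]} if host else {}
        return self.submit_job(JobDefinition(executable, attrs))


js = JobService("my_rm://rm.grid.net:123/")
handle = js.run_job("node1.example.org", "/bin/date")
```

With this split, the same JobDefinition can be submitted to several
JobService instances, each bound to a different resource manager, which is
the use case Graeme raises against option 3.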