[DRMAA-WG] Work distribution

Mariusz Mamoński mamonski at man.poznan.pl
Wed Feb 17 05:52:32 CST 2010


Hi,

2010/2/17 Peter Tröger <peter at troeger.eu>:
> Hi,
>
>> Some thoughts:
>> - most of the batch systems were designed to work with network shared
>> file system. However one can imagine a situation when the home file
>
>> - i would prefer to keep this interface as simple as possible and thus
>> handle the only case that can not be handled without interaction with
>> DRMS: staging file from submission host to execution host, as the
>> execution host is usually not known before a job starts.
>
> This brings the new file staging approach closer to what we had in DRMAAv1. The unknown execution host is a very valid argument, since we implicitly assume that staging happens before job start. Your research also shows that LSF can only copy from / to the submission host, which definitely kills the idea of free server transfers. I can live with that.
>
>> - in order to keep the interface really simple i would assume that
>> file names (not necessarily the absolute paths) are the same both on
>> execution host and submission host. (again if not, the user can
>> easily do some workaround by copying/moving the file on submission host).
>
> Interesting. So the idea is to stage only whole directories? What if I only want to move the STDIN file, and nothing else?
>
sorry, i should have been clearer on this. I meant that we should support
only the minimum functionality:

* staging single files (LSF and Torque, for example, do not support staging
directories *by default*), so no directories and no wildcard
expressions would be supported (they could be supported in the sense of
the MAY [RFC 2119] keyword ;-).
* the file name must be the same both on execution and submission host
(however absolute paths may differ, e.g. if relative paths are given)

With this basic tool, the user can implement the following if needed
(let me remind you that, in my opinion, a non-shared file system is
quite a rare case):
- wildcards on stage-in (by evaluating the wildcard before submission
and giving an explicit list of files)
- directories on stage-in (by listing the directory on the submission host
and giving an explicit list of files)
- different file names on submission/execution host (e.g. we want the
file named "foo" to arrive on the execution host as "bar":
move/copy the file prior to submission)
- wildcards/directories on stage-out (one possible solution is to zip
all result files in the job's script)
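
A minimal sketch of the stage-in workarounds listed above, assuming the
submission script is written in Python. The helper names
expand_stage_in and stage_under_name are hypothetical and not part of
any DRMAA binding; the resulting list is only meant to illustrate what
could be assigned to the proposed stageInFiles attribute.

```python
import glob
import os
import shutil


def expand_stage_in(patterns):
    """Turn wildcards and directories into an explicit list of files.

    Everything is evaluated on the submission host, before submission,
    so the DRMS only ever sees single files.
    """
    files = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                # Directory: list every file below it.
                for root, dirs, names in os.walk(path):
                    dirs.sort()  # deterministic traversal order
                    files.extend(os.path.join(root, n) for n in sorted(names))
            else:
                files.append(path)
    return files


def stage_under_name(src, new_name, staging_dir):
    """Copy src to staging_dir/new_name prior to submission.

    Covers the "foo should arrive as bar" case: the copy already has
    the file name wanted on the execution host.
    """
    dst = os.path.join(staging_dir, new_name)
    shutil.copy2(src, dst)
    return dst
```

For example, expand_stage_in(["input/*.dat", "config"]) would give the
explicit file list for a wildcard plus a directory. The stage-out side
(zipping the results) stays inside the job's script and is not shown.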

>> So my proposition of the DRMAA staging interface looks as follows:
>> split "fileTransfers" attribute into two attributes (also of the
>> OrderedStringList type):
>> - stageInFiles
>> - stageOutFiles
>> which are simple lists of files to be staged-in/staged-out (no URLs,
>> only paths). The paths can be relative (to current working directory
>> on submission host, and job working directory on execution host).
>
> I like that, despite the fact that I would like to see single file / wildcard support as discussed last time in the phone call.
>
see above.
> Let's find some agreement in today's phone call.
>
> Best,
> Peter.
>
>


Cheers,
-- 
Mariusz

