[DRMAA-WG] some questions on "DRMAA JOB TEMPLATE ATTRIBUTES"
Peter Troeger
peter.troeger at hpi.uni-potsdam.de
Tue Jan 2 13:50:15 CST 2007
Hello,
happy new year!
> The source code piece looked like this,
> .......
> drmaa_set_attribute (jt, DRMAA_JOB_NAME,
> "test_submit", error, DRMAA_ERROR_STRING_BUFFER);
> drmaa_set_attribute (jt, DRMAA_WD,
> "/home/group1/test", error,
> DRMAA_ERROR_STRING_BUFFER);
> drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
> "test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);
> ...........
>
> it did not work, and sge6 gave error information,
> "12/26/2006 19:23:04|qmaster|einstein|W|job 43.1
> failed on host compute2 general searching requested
> shell because: 12/26/2006 21:25:38 [501:10278]:
> execvp(test_submit.sh, "test_submit.sh") failed: No
> such file or directory"
>
> but after I changed the value of
> "DRMAA_REMOTE_COMMAND"
> to absolutely full path ---
> drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
> "/home/group1/test/test_submit.sh", error,
> DRMAA_ERROR_STRING_BUFFER);
> it worked well.
The DRMAA spec is a little unclear here, but SGE conforms to
traditional Unix behaviour. The working directory is the directory
"where the job is executed", which primarily determines where
relative input and output file names are resolved. It is not used to
locate the command itself. The exec call in Unix searches only the
PATH directories for a command given without a slash, and PATH may
not include the current directory ("."). Therefore, exec cannot
locate test_submit.sh on your execution host without the full path.
It is the same reason why you must type "./test_submit.sh" instead of
"test_submit.sh" in your shell, even if you are in the right
directory. Leaving "." out of PATH is a security measure -
ask your local administrator ;-) ...
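To make the lookup rule concrete, here is a minimal sketch in C. It relies on the POSIX rule for execvp(): a name without any '/' is searched for in PATH, and a name containing '/' is used directly as a path. The function name uses_path_search is hypothetical and is not part of DRMAA or SGE.

```c
#include <string.h>

/* Hypothetical helper, not part of DRMAA or SGE.
 * Returns 1 if execvp() would look the command up in the PATH
 * directories, 0 if it would use the name directly as a path.
 * POSIX rule: PATH is searched only when the name contains no '/'. */
int uses_path_search(const char *command)
{
    return strchr(command, '/') == NULL;
}
```

With this rule, "test_submit.sh" goes through the PATH search and fails if its directory is not listed there, while "./test_submit.sh" and "/home/group1/test/test_submit.sh" bypass the search.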
The funny thing is that the same program works with Condor, even if
"." is not in the PATH. Condor seems to search the working directory
automatically, maybe for compatibility between Windows and Unix
submission files. DRMAA cannot really do anything about that, since
the library implementation cannot influence how the execution host
starts the job.
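Since the library cannot change this, one portable option is for the submitting application to build the absolute command path itself. The following is only a sketch under that assumption; build_remote_command is a hypothetical helper, not a DRMAA function, and it assumes the working directory is given as an absolute path.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of DRMAA.
 * Joins an absolute working directory and a script name into out.
 * Returns 0 on success, -1 if working_dir is not absolute or the
 * result does not fit into out_size bytes. */
int build_remote_command(char *out, size_t out_size,
                         const char *working_dir, const char *script)
{
    size_t dir_len = strlen(working_dir);
    const char *sep;
    int written;

    if (dir_len == 0 || working_dir[0] != '/')
        return -1;                      /* require an absolute directory */

    /* Avoid a doubled slash when the directory already ends in '/'. */
    sep = (working_dir[dir_len - 1] == '/') ? "" : "/";

    written = snprintf(out, out_size, "%s%s%s", working_dir, sep, script);
    if (written < 0 || (size_t)written >= out_size)
        return -1;                      /* encoding error or truncation */
    return 0;
}
```

The resulting string would then be passed as the value of DRMAA_REMOTE_COMMAND in the drmaa_set_attribute call quoted above, so the same directory can be used for both DRMAA_WD and the command.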
> Another question,
> While I write a SGE script file for parallel job,
> the script file looks like this,
> ........
> /usr/local/mpich-ifort/bin/mpirun -np $NSLOTS
> /home/group1/test/test_arg 4
> SGE6 will give $NSLOTS a suitable value while it
> submits the job. If I want to do this only through
> DRMAA API, how to implement it.
This relates to DRM monitoring issues, and is not covered in DRMAA
1.0 so far. We know that users want more placeholders in job
templates, so there is a discussion wiki page about possible new
parameters:
http://www.drmaa.org/wiki/index.php/DrmaaJobTemplatePlaceholders
Please feel free to add the job template parameter you need there.
It would be great if you also described the SGE-specific
implementation of your suggestion, so that we can check whether
other DRM systems are able to handle it as well.
Thanks,
Peter.