[DRMAA-WG] some questions on "DRMAA JOB TEMPLATE ATTRIBUTES"

Peter Troeger peter.troeger at hpi.uni-potsdam.de
Tue Jan 2 13:50:15 CST 2007


Hello,

Happy new year!

> The source code piece looked like this,
>         .......
>     drmaa_set_attribute (jt, DRMAA_JOB_NAME,
> "test_submit", error, DRMAA_ERROR_STRING_BUFFER);
>     drmaa_set_attribute (jt, DRMAA_WD,
> "/home/group1/test", error,
> DRMAA_ERROR_STRING_BUFFER);
>     drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
> "test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);
>     ...........
>
> it did not work, and sge6 gave error information,
> "12/26/2006 19:23:04|qmaster|einstein|W|job 43.1
> failed on host compute2 general searching requested
> shell because: 12/26/2006 21:25:38 [501:10278]:
> execvp(test_submit.sh, "test_submit.sh") failed: No
> such file or directory"
>
> but after I changed the value of
> "DRMAA_REMOTE_COMMAND"
>  to absolutely full path ---
>    drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
> "/home/group1/test/test_submit.sh", error,
> DRMAA_ERROR_STRING_BUFFER);
>     it worked well.

The DRMAA spec is a little unclear here, but SGE conforms to
traditional Unix thinking. The working directory is the directory
"where the job is executed", which is primarily an indication of
the location of input and output files. The exec call in Unix
searches the PATH directories, which may not include the current
directory ("."). Therefore, exec cannot locate the binary on your
execution host without the full path. This is the same reason why
you must type "./test_submit.sh" instead of "test_submit.sh" in your
shell, even if you are in the right directory. This is for security -
ask your local administrator ;-) ...
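
If you want to see the lookup rule outside of SGE, here is a minimal
sketch in plain POSIX C. The function name try_execvp is made up for
this mail and has nothing to do with DRMAA or with SGE internals. It
calls execvp() in a child process and reports the errno back to the
parent through a close-on-exec pipe.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if execvp(file) succeeded in a child
 * process, otherwise the errno reported by execvp (or pipe/fcntl/fork). */
int try_execvp(const char *file)
{
    int fds[2];
    int err = 0;
    pid_t pid;
    ssize_t n;

    if (pipe(fds) != 0)
        return errno;

    /* Close-on-exec: a successful exec closes the write end, so the
     * parent reads EOF instead of an error code. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) != 0) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    pid = fork();
    if (pid < 0) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    if (pid == 0) {
        /* Child: a name without '/' is searched in PATH only;
         * a name containing '/' is used as a path directly. */
        char *argv[] = { (char *)file, NULL };
        close(fds[0]);
        execvp(file, argv);
        err = errno;                      /* only reached on failure */
        if (write(fds[1], &err, sizeof err) != (ssize_t)sizeof err)
            _exit(126);
        _exit(127);
    }

    /* Parent: EOF means the exec worked, otherwise we get the errno. */
    close(fds[1]);
    n = read(fds[0], &err, sizeof err);
    if (n != (ssize_t)sizeof err)
        err = 0;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return err;
}
```

Called from the directory that holds the script, with "." not in the
PATH, try_execvp("test_submit.sh") should return ENOENT, while
try_execvp("./test_submit.sh") should return 0 as long as the script
is executable.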

The funny thing is that the same program works with Condor, even if
"." is not in the PATH. Condor seems to search the working directory
automatically, maybe for compatibility between Windows and Unix
submission files. DRMAA cannot really do anything about that, since
the library implementation cannot influence the mechanisms on the
execution host.
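
On the submission side, a portable workaround is to build the absolute
path yourself before setting the attribute. The following is only a
sketch: resolve_remote_command is a hypothetical helper, not a DRMAA
function, and it assumes the working directory is an absolute path
that is also valid on the execution host.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of DRMAA: writes an absolute command
 * path into out. An already absolute cmd is copied unchanged; anything
 * else is prefixed with the working directory wd.
 * Returns 0 on success, -1 on bad arguments or if out is too small. */
int resolve_remote_command(const char *wd, const char *cmd,
                           char *out, size_t outsz)
{
    int n;
    size_t wdlen;
    const char *sep;

    if (wd == NULL || cmd == NULL || out == NULL || outsz == 0)
        return -1;

    if (cmd[0] == '/') {
        /* Already absolute: keep it as it is. */
        n = snprintf(out, outsz, "%s", cmd);
    } else {
        /* Assumption: wd must itself be absolute. */
        if (wd[0] != '/')
            return -1;
        wdlen = strlen(wd);
        /* Avoid a doubled slash if wd already ends with one. */
        sep = (wd[wdlen - 1] == '/') ? "" : "/";
        n = snprintf(out, outsz, "%s%s%s", wd, sep, cmd);
    }

    /* snprintf reports the untruncated length, so detect truncation. */
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With the values from your mail, wd "/home/group1/test" and cmd
"test_submit.sh" give "/home/group1/test/test_submit.sh", which you
can then pass as the value of DRMAA_REMOTE_COMMAND in your
drmaa_set_attribute() call.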

>    Another question,
>    While I write a SGE script file for parallel job,
> the script file looks like this,
>         ........
>    /usr/local/mpich-ifort/bin/mpirun -np $NSLOTS
> /home/group1/test/test_arg 4
>    SGE6 will give $NSLOTS a suitable value while it
> submits the job. If I want to do this only through
> DRMAA API, how to implement it.

This relates to DRM monitoring issues, and is not covered in DRMAA  
1.0 so far. We know that users want more placeholders in job  
templates, so there is a discussion wiki page about possible new  
parameters:

http://www.drmaa.org/wiki/index.php/DrmaaJobTemplatePlaceholders

Please feel free to add the job template parameter you need there.
It would be great if you also added the SGE-specific implementation of
your suggestion, so that we can check whether other DRM systems are
also able to handle it.

Thanks,
Peter.


