[drmaa-wg] DRMAA TEST SUITE

Ruben Santiago Montero rubensm at dacya.ucm.es
Tue Mar 21 04:55:00 CST 2006


Hi,
>
> Sorry, no. Since the job never ran, there is also no termination -
> normal or abnormal. The ST_INPUT_FILE_FAILURE test case checks for a
> non-zero result from drmaa_wifaborted(), which indicates that the job
> was 'ended' before it entered the running state. I would expect that
> drmaa_wifexited() returns a zero value for "exited" in this case, since
> no further exit information can be available.
>

Sorry, I do not agree. In the DRMS context, the job life cycle comprises every 
stage from the moment the job enters the DRM system. In this sense, every 
submitted job has a termination, whether or not it actually ran. For example, 
if you submit a job (qsub) and then kill it (qdel), the job obviously 
terminated abnormally (it was killed), even though it never entered the 
running state.

Whether a job terminated normally is unrelated to whether the DRM can provide 
further information. In the previous example (a job that has been killed), 
the DRMS may or may not have more information, but in either case it is clear 
that the job terminated abnormally.

The drmaa_wifexited description should concentrate on a single aspect, since 
there is no obvious (or general) relation between job termination and getting 
further information from the DRM.

> ( Note: The testsuite assumes here that unusable input files are
> detected by the DRM before the job starts. This  seems to be realistic,
> since file staging operations are usually not part of the job execution.)
>

I do not think so. Job preparation stages are usually part of the job 
execution, for example:

	PBS: performs an rcp or scp of the input/output files. The existence of these 
files is not checked at submission (i.e. the job is queued). In the situation 
of the ST_*_FILE_FAILURE tests, the jobs go through the following states:
	Q->R->E.
That is, even if the input file does not exist, the job goes through the 
running state.

	SGE: also does not check input or output file existence or permissions. In 
fact, in the situation of the ST_*_FILE_FAILURE tests, the jobs go through the 
following states:
	qw->r->Eqw.
That is, even if the input file does not exist, the job goes through the 
running state. In this case you can use the qalter command to redirect the 
output paths.

	Globus: includes stage-in/stage-out steps in the GRAM protocol; file 
existence is not checked at submission either.

Therefore I suggest removing ST_INPUT_FILE_FAILURE, ST_OUTPUT_FILE_FAILURE 
and ST_ERROR_FILE_FAILURE from the official test suite. In the DRMs above, at 
least, you can submit a job with /etc/passwd as its output file or with an 
unusable input file: the job is queued, runs and fails.
 
This assumption should be agreed upon explicitly, as it is not the default 
behavior of DRMs.

> The DRMAA job status monitoring functions are a little bit confusing,
> sometimes even for the group members ;-) ... The best thing is to look
> at the code example from the C binding, which should explain most of the
> intended use cases for DRMAA functions.

Sure. The problem is that the code is not clear either. From the DRMAA 1.0 C 
binding example:
...
drmaa_wifexited(&exited, stat, NULL, 0);

if (exited) {
	drmaa_wexitstatus(&exit_status, stat, NULL, 0);
	printf("job \"%s\" finished regularly with exit status %d\n",
	       all_jobids[pos], exit_status);
} else {
	drmaa_wifsignaled(&signaled, stat, NULL, 0);
	if (signaled) {
...

From this code it seems that a signaled job should yield a zero "exited" 
value from drmaa_wifexited (i.e. it did not terminate normally), as opposed 
to your comments in the previous mails and the code in the DRMAA test suite.

> Best regards,
> Peter.

Best Regards,
Ruben
-- 
+-----------------------------------------------------------+
 Dr. Ruben Santiago Montero
 Assistant Professor
 Dpto. Arquitectura de Computadores y Automatica
 Facultad de Informatica
 Universidad Complutense      phone  : +34 91 394 75 38
 28040 Madrid                 fax    : +34 91 394 75 27
 Spain                        email  : rubensm at dacya.ucm.es
 http://asds.dacya.ucm.es/
+-----------------------------------------------------------+

GridWay, The Way to Grid! http://www.gridway.org




