[DRMAA-WG] About obtaining the machines names in a parallel job

Yves Caniou yves.caniou at ens-lyon.fr
Thu Mar 18 04:11:25 CDT 2010


Dear All,

Discussions yesterday were great.
I understand why you don't want to add a means of getting the hostnames file 
for an MPI code, since that should be handled transparently through the 
configName (if I remember the name correctly).

But I thought of a different use case: a code is simply launched on all 
machines. This code is socket-based, so it needs to know the names of the 
other machines to be able to run correctly.
Of course, this could be worked around with an external machine where a 
daemon runs and where the running codes can register -- I am thinking of 
something like a running omniNames, for example. Another solution is to wrap 
the application in an MPI code just to, maybe, obtain that information.

But don't you think the cost of these workarounds is very high (if they are 
possible at all: many sites have a policy of not letting user code run on the 
front-end node, and each machine only knows that it is itself taking part in 
the parallel run) compared to at least offering the possibility of copying 
the file containing the hostnames to all reserved nodes?
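
To make the use case concrete, here is a minimal sketch of what each process 
could do if such a file were available on its node. Everything in it is an 
assumption for illustration: the file format (one hostname per line, 
optionally followed by extra fields such as a slot count, with "#" comments) 
and the helper names parse_hostnames, peers_of and load_peers are 
hypothetical and not part of DRMAA.

```python
import socket
from pathlib import Path


def parse_hostnames(text):
    """Return the unique hostnames in the text, in order of first appearance.

    Assumed format: one host per line, first field is the hostname,
    anything after '#' is a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank lines
        name = line.split()[0]  # ignore extra fields such as slot counts
        if name not in hosts:
            hosts.append(name)
    return hosts


def peers_of(hosts, me):
    """Return every host except the local one."""
    return [h for h in hosts if h != me]


def load_peers(path, me=None):
    """Read the hostnames file at `path` and return the other machines."""
    me = me or socket.gethostname()
    return peers_of(parse_hostnames(Path(path).read_text()), me)
```

A socket-based code would then call load_peers() with the path of the copied 
file and open its connections to the returned names. Note that the comparison 
with the local name is a plain string match, so it assumes the file and 
socket.gethostname() use the same form of the name (short or fully 
qualified).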

Good luck with the discussions today!
Cheers.

.Yves.

-- 
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
  * in Information Technology Center, The University of Tokyo,
    2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
    tel: +81-3-5841-0540
  * in National Institute of Informatics
    2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
    tel: +81-3-4212-2412 
http://graal.ens-lyon.fr/~ycaniou/

