[drmaa-wg] Re: Patch to python DRMAA wrapper for thread usage

Andreas Haas Andreas.Haas at Sun.COM
Mon Apr 10 10:41:23 CDT 2006


Hi Chuck,

all I understand is that the GIL is Python's monitor-like lock for
multi-threading. So I fear you're asking the wrong person ;-)

Still, it might be of interest to you that Grid Engine jobs get rescheduled
automatically if they return exit status 99 (see 'FORBID_RESCHEDULE' in
sge_conf(5)). Note, however, that using exit 99 makes your solution Grid
Engine dependent.
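
As a rough sketch of the exit-99 idea: the job decides at the end of its run
whether it wants to be rescheduled and picks its exit status accordingly. The
names SGE_RESCHEDULE_EXIT and job_exit_status() are made up for illustration;
only the value 99 comes from the Grid Engine behaviour described above.

```c
#include <stdlib.h>

/* Exit status that Grid Engine treats as "reschedule this job", unless
   FORBID_RESCHEDULE is set (see sge_conf(5)). The macro name is hypothetical. */
#define SGE_RESCHEDULE_EXIT 99

/* Hypothetical helper: map "do I want to run again?" to an exit status. */
static int job_exit_status(int wants_restart)
{
    return wants_restart ? SGE_RESCHEDULE_EXIT : EXIT_SUCCESS;
}
```

A job's main() would then end with something like
`return job_exit_status(1);` to ask for a restart, with the Grid Engine
dependency noted above.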

Regards,
Andreas


On Mon, 10 Apr 2006, Chuck Fox wrote:

> Hey Andreas & Enrico,
>    I am doing a project at school where I have a bunch of jobs that will run
> asynchronously, and then need to get restarted immediately
> upon exiting.  The way I found to do this was to enclose each job in a
> python thread and then just wait on it to return.  Another method
> that might be a little more scalable would be to store the job IDs in a list
> and then do non-blocking polls over the list (or use the JOB_IDS_SESSION_ANY
> string in a non-blocking way, haven't tried that yet).  I'm a big fan of
> threads in general for self-contained items that don't require lots of
> interaction & locking with the rest of the program.  Of course the GIL makes
> things a little more complicated but in Python I keep threads as
> low-intensity as possible so they are still useful for getting stuff to run
> asynchronously.
>   Thanks for the heads up on the other thread safety issues in the wrapper.
> I had read about the underlying library and saw that it
> was threadsafe, but I definitely don't know enough about the wrapper and its
> thread issues.  I'll let you & Enrico know more in case I run
> across anything else.  I'm going to be using the interface pretty heavily so
> hopefully I'll find any outstanding issues.
>        Thanks for your help
>            -- Chuck
>
> On 4/10/06, Andreas Haas <Andreas.Haas at sun.com> wrote:
> >
> > Dear Chuck,
> >
> > well, all I did was upload the wrapper, which stems from Enrico Sirola.
> >
> > As far as the Grid Engine DRMAA library is concerned, I can confirm the
> > patch will work, since the lib itself is MT-safe. For the same reason
> > I would say the patch could be applied to drmaa_synchronize() and any
> > other DRMAA library call.
> >
> > Yet, I would assume further modifications to the cDRMAA_wrap.c module
> > are needed. E.g. SWIG_Python_ConvertPtr() makes use of
> > a static variable
> >
> >   static PyObject *SWIG_this = 0;
> >
> > accessed through
> >
> >     if (!SWIG_this)
> >       SWIG_this = PyString_FromString("this");
> >
> > that is a race condition that might crash the library.
> >
> > To overcome this you could use a pthread_once() wrapper that does
> > nothing but
> >
> >     if (!SWIG_this)
> >       SWIG_this = PyString_FromString("this");
> >
> > If the wrapper is always called via pthread_once() before 'SWIG_this'
> > is accessed, it would fix that problem.
> >
> > Unfortunately I'm not familiar with the deep mysteries of SWIG-wrapping
> > libraries, so I can't say whether that is actually sufficient.
> >
> > Best regards,
> > Andreas
> >
> > On Sat, 8 Apr 2006, Chuck Fox wrote:
> >
> > > Hi Andreas,
> > >   My name is Chuck Fox and I am doing some work with Python & SGE and
> > > I've been using your DRMAA wrapper (thanks for writing it!).  I am in a
> > > situation where I run different jobs in different threads in a Python
> > > program and I often have threads that are doing blocking waits on jobs
> > > that I submitted.  I noticed that when I ran a blocking wait, all the
> > > other threads in my program would block until the submitted job would
> > > come to an end... not what we want with multiple threads.
> > >   I'm not the world's greatest expert on the Python C API but I know a
> > > little bit about the GIL and I tried wrapping your call to drmaa_wait in
> > > the cDRMAA_wrap.c file.
> > >   I just added the macros:  Py_BEGIN_ALLOW_THREADS and
> > > Py_END_ALLOW_THREADS around the wrapper call to drmaa_wait and this
> > > solved my problem.  If there are other blocking calls you can think of
> > > with DRMAA (like synchronize) you may want to use this technique on them
> > > too, I haven't tested them out yet.
> > >   I guess the biggest problem is that I modified the wrapper output of
> > > SWIG which I know is not considered the best way to do things.  If you
> > > know how to get the GIL release macros into the .i file itself that might
> > > be the best method to use in the next release.
> > >
> > > I have my simple patch below:
> > >
> > > 1753,1754c1753,1758
> > > <     result = (int)drmaa_wait((char const *)arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8);
> > > <
> > > ---
> > > >     /** Update: By Chuck: In case of a blocking wait a single wait can hang all of python due to the GIL, we need
> > > >         to use Python's macros to allow other threads to run while this blocking wait occurs **/
> > > >     Py_BEGIN_ALLOW_THREADS
> > > >       result = (int)drmaa_wait((char const *)arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8);
> > > >     Py_END_ALLOW_THREADS
> > > >
> > > >
> > >
> >
>
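
The pthread_once() suggestion quoted above can be sketched roughly as follows.
This is only an illustration under stated assumptions: a plain C string stands
in for the PyObject that PyString_FromString("this") would create, and the
names swig_this_once, init_swig_this(), get_swig_this(), worker() and
run_workers() are made up here and are not part of the SWIG-generated wrapper.
Whether this alone is sufficient for the real wrapper is left open, as in the
original mail.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in for SWIG's lazily created "this" object. In the real wrapper
   this would be a PyObject * built by PyString_FromString("this"). */
static const char *swig_this = NULL;

/* Counts how often the initializer actually ran (for demonstration only). */
static int init_calls = 0;

/* One-time control block; all names here are hypothetical. */
static pthread_once_t swig_this_once = PTHREAD_ONCE_INIT;

/* The "wrapper that does nothing but" the lazy initialization. */
static void init_swig_this(void)
{
    if (!swig_this)
        swig_this = "this";
    init_calls++;
}

/* Every access goes through pthread_once(), so the initializer runs
   exactly once even if several threads arrive at the same time. */
static const char *get_swig_this(void)
{
    pthread_once(&swig_this_once, init_swig_this);
    return swig_this;
}

static void *worker(void *arg)
{
    (void)arg;
    return (void *)get_swig_this();
}

/* Start up to 16 threads that all touch swig_this, wait for them, and
   report how many times the initializer ran. */
static int run_workers(int n)
{
    pthread_t tids[16];
    int started = 0;

    if (n > 16)
        n = 16;
    for (int i = 0; i < n; i++)
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return init_calls;
}
```

Calling run_workers(8) starts eight threads that all go through
get_swig_this(); the initializer should still have run only once afterwards.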




