[SAGA-RG] Python bindings: Buffer class issue

Sat Nov 14 05:04:42 CST 2009

Quoting [Mathijs den Burger] (Nov 12 2009):
> > > >>
> > > >> Given that memory management is automatic in Python, the notion
> > > >> of application-managed and implementation-managed Buffer disappears.
> > > >
> > > > From what I learned during the discussion in Banff, this is not
> > > > really true: one *can* allocate an array in user space and pass it
> > > > to an API by-reference, which actually makes it an application
> > > > managed memory segment.  The point in python seems to be that nobody
> > > > is doing that...
> > > 
> > > Well, in Python there is *only* by-reference parameter passing,
> > > references to objects that is. Version 2.6 introduced an io module
> > > that allows to do what you describe. One problem with this is that our
> > > JySAGA bindings can't support this new feature as Jython just reached
> > > version 2.5.1 and it looks like there is quite a long way to go to
> > > 2.6.
> > 
> > That is an implementation problem, and should not influence the
> > python bindings, right? ;-)
> 
> Well, defining bindings that break all current implementations and their
> usage won't work either. The C++ wrapper now also requires Python >=
> 2.2. Would all current users be willing/able to upgrade to >= 2.6?

*sigh* Tough one.  Obviously, I'd love to define the bindings more
from an (assumed) user perspective, but sure, if there is no
possible implementation, that does not make much sense...

> The bindings will have to define which Python version is required. It is
> not only a matter of 2.x or 3.x; the 2.x versions also contain
> increasingly more relevant functionality.

That may make sense, but of course it is confusing to have too many
levels here.  I guess that 2.x versus 3.x is a natural watershed,
whereas 2.x versus 2.y may look somewhat arbitrary to the user.

> We can either opt for something low (e.g. >= 2.2) to increase
> acceptance, or something high (e.g. >= 2.6 or >= 3.0) if these contain
> features that are essential for the bindings. A third option is to
> specify optional additional functionality for an implementation that's
> only targeted at newer versions of Python, but that will probably
> generate a lot of confusion.
> 
> I'd say we stick to >= 2.2; widely used, and supported by all current
> implementations.

Agreed, but I would also suggest having an updated version for 3.x, as
it seems to bring fundamental changes.  But, once more, I am not a
python expert, so my opinion should not be weighted too highly...

> Not really; it memory-maps a file, not an arbitrary string. However, you
> can easily convert a string to a list or array and manipulate that in
> place.
> 
> The real question is: which use cases are we trying to optimize? What
> will SAGA Python apps do with binary data?

The primary use cases asking for optimized I/O on streams and files are
visualization use cases (you should know *those* use cases, right?
;-)  So, an application is repeatedly reading a couple of megabytes
or so from a stream or a file into a memory buffer, at say 2*30fps,
to run some algorithm on the data (say an isosurfacer).

Now, in the simple case your kernel allocates memory for the read
you request, and reads data into that memory.  Later, that data is
copied into your application memory so your app can access the data.

The buffer object allows the application to allocate the memory itself
and to pass the buffer down to the kernel, so that the kernel can read
data directly into the buffer (zero copy: no need to copy it again).
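
For illustration only, here is a minimal sketch of that pattern in
plain Python (>= 2.6, using the io module mentioned earlier in this
thread).  It is not SAGA API: read_frames is a hypothetical helper,
and io.BytesIO merely stands in for a real stream or file.  Whether a
copy is really avoided all the way down depends on the underlying
stream implementation; the sketch only shows the application-managed
buffer being reused for every read.

```python
import io


def read_frames(stream, frame_size):
    """Yield successive chunks read into one application-owned buffer.

    Hypothetical helper, not part of any SAGA binding.
    """
    buf = bytearray(frame_size)   # allocated once, by the application
    view = memoryview(buf)        # slicing a memoryview does not copy
    while True:
        n = stream.readinto(buf)  # the stream fills buf in place
        if not n:                 # 0 (or None) means no more data
            break
        # The yielded view aliases buf: it is only valid until the
        # next read overwrites the buffer.
        yield view[:n]


# Stand-in for a file or network stream.
stream = io.BytesIO(b"abcdefghij")

# bytes(chunk) copies here only so the frames can be kept for display;
# a real consumer would process each chunk before the next read.
frames = [bytes(chunk) for chunk in read_frames(stream, 4)]
```

With a 'str'-only interface, each read would instead return a freshly
allocated string, so the per-frame allocation and copy could not be
avoided by the application.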

I have no idea how prevalent that use case and similar ones are for
the python community we are targeting, so it's very hard to argue for
or against that optimization.  Zero copy is in general useful in
other use cases, too, but again, it is hard to guesstimate the gain
without any specific use case to discuss.

> > > >> In the VU Python bindings the buffer class is still present, while, as
> > > >> previously said, in the C++ Python bindings it was removed recently. I
> > > >> do not see any issues with the removal of the Buffer class in the
> > > >> Python bindings. However, I'm not sure whether I am forgetting some
> > > >> corner cases (e.g. async) that would require a dedicated Buffer class.
> > > >> When removing the Buffer class, the user would simply deal with 'str'
> > > >> type data to pass data back and forth to a SAGA file, stream or rpc.
> > > >
> > > > If the bindings decide to go for strings, then that should pose no
> > > > problem for the async calls, as far as I can tell: semantics of sync
> > > > and async calls is identical (apart from synchronization obviously).
> > > >
> > > >
> > > >> Now, I identified the following crucial questions:
> > > >> 1) Can the Buffer class be safely removed from the Python bindings?
> > > >
> > > > According to the original SAGA use cases: no
> > > > According to current SAGA users: yes
> 
> What were the original use cases that required a Buffer class?

See above.  You may also want to have a look at the SAGA use case
doc and the SAGA req doc 

  http://www.ogf.org/documents/GFD.70.pdf
  http://www.ogf.org/documents/GFD.71.pdf

Cheers, Andre.

-- 
Nothing is ever easy.