[SAGA-RG] Python bindings: Buffer class issue

Manuel Franceschini livewire at koltern.com
Mon Nov 9 13:27:06 CST 2009


Hi all,

Quick summary from GFD.90: the SAGA I/O Buffer encapsulates a sequence
of bytes to be used for I/O operations, e.g. read()/write() on files
and streams, and call() on rpc instances. The recent removal of the
buffer class from the Python bindings of the C++ SAGA implementation
led us to revisit this issue. The GFD is C/C++ oriented, so how the
Buffer should map onto Python is anything but clear.

Given that memory management is automatic in Python, the distinction
between application-managed and implementation-managed Buffers disappears.
There is no need for a Python SAGA user to tell the bindings who
manages the Buffer, since it is managed by the underlying Python VM.

Another more critical issue is the data type used to hold binary data
in Python. In Python 2.x the immutable 'str' type is used whereas
Python 3.x has a newly introduced immutable 'bytes' type. Let's forget
about 3.x for a moment, since 2.x will be around for at least a couple
more years. In order to manipulate large binary datasets, the mmap
class [0] could be used: it exposes a file or an anonymous memory
region as a mutable, string-like object, so the contents of an
immutable 'str' copied into it can be changed in place. In other
words, it provides the ability to efficiently modify binary data.
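
To make the mmap idea concrete, here is a minimal sketch, assuming an
anonymous mapping (no backing file) is acceptable. The helper name
patch_in_place is made up for illustration and is not part of any SAGA
binding. It uses b"" literals so that it runs on 2.6+ as well as 3.x.

```python
import mmap

def patch_in_place(data, offset, replacement):
    """Copy an immutable byte string into an anonymous mmap,
    overwrite a slice in place, and return the patched bytes.

    Hypothetical helper for illustration only; data must be non-empty.
    """
    buf = mmap.mmap(-1, len(data))  # -1 requests an anonymous mapping
    try:
        buf.write(data)
        # Slice assignment mutates the mapping; the replacement must
        # have exactly the same length as the slice it overwrites.
        buf[offset:offset + len(replacement)] = replacement
        buf.seek(0)
        return buf.read(len(data))
    finally:
        buf.close()

original = b"hello binary world"
patched = patch_in_place(original, 0, b"HELLO")
```

The original byte string is left untouched; only the mapped copy is
modified.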

In the VU Python bindings the buffer class is still present, while, as
previously said, in the C++ Python bindings it was removed recently. I
do not see any issues with the removal of the Buffer class in the
Python bindings. However, I'm not sure whether I am forgetting some
corner cases (e.g. async) that would require a dedicated Buffer class.
When removing the Buffer class, the user would simply deal with 'str'
type data to pass data back and forth to a SAGA file, stream or rpc.
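
As a rough sketch of what Buffer-free usage could look like: the
SketchFile class below is a hypothetical stand-in backed by an
in-memory io.BytesIO, NOT the actual SAGA file API. It only illustrates
that plain byte strings go in and come out, with the interpreter
managing their memory.

```python
import io

class SketchFile(object):
    """Hypothetical stand-in for a SAGA file object (assumption, not
    the real bindings); backed by an in-memory byte stream."""

    def __init__(self):
        self._backing = io.BytesIO()

    def write(self, data):
        # Takes a plain immutable byte string, no Buffer wrapper.
        # Returns the number of bytes written.
        return self._backing.write(data)

    def seek(self, offset):
        self._backing.seek(offset)

    def read(self, size):
        # Returns a new immutable byte string of at most size bytes.
        return self._backing.read(size)

f = SketchFile()
written = f.write(b"\x00\x01payload")
f.seek(0)
chunk = f.read(4)
```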

Now, I identified the following crucial questions:
1) Can the Buffer class be safely removed from the Python bindings?
2) Is handling of large binary datasets a primary concern? If so, how
should they be handled?
3) Is compatibility with Python 3.x a concern right now? In other
words, should the eventual migration to 3.x be taken into account?

Cheers,
/Manuel

[0] http://docs.python.org/library/mmap.html

