pyshapelib for Python 3.x

Wed Oct 26 15:40:57 CEST 2011

Hi Bram,

On 18.10.2011, Bram de Greve wrote:
> On 18 October 2011 13:36, Bernhard Reiter <bernhard at intevation.de> wrote:
> > > I'm wondering about versioning though.
> > > Would this become pyshapelib 1.1 or 1.0.1.
> >
> > I'd say 1.1 or 2 because it is a major step forward.
>
> Then I'll go for 1.1, as this isn't a major API upgrade.

2.0 may be better, but that depends on the unicode/bytes default. See below.

> There is one little thingy though: Unicode.
[...]
> return types for strings, according to return_unicode flag:
> 2.x:
> False (default) -> str
> True -> unicode
> 3.x:
> False -> bytes
> True (default) -> str
>
> What do you think?

We should approach this question from the point of view of an application that 
uses pyshapelib and that's being ported to Python 3. Porting to Python 3 
already introduces some uncertainties because of incompatible changes and the 
real or imagined danger that the libraries the application uses have 
potentially more bugs when used with Python 3 because they've not been 
adapted correctly yet. pyshapelib should increase those difficulties as 
little as possible.

AFAIK the recommended way to port code to Python 3 is to port to Python 2.7
first, then if necessary change the code so that no Python 3 warnings are
printed (python -3) and then use the 2to3 tool to convert the source to
Python 3 code. 2to3 can also be used to maintain a Python 2.7 and a Python 3.x
version at the same time, by maintaining the 2.7 version manually and
automatically deriving the Python 3 version from that.

For that process to work for code that uses pyshapelib, it must be possible to 
have one codebase that works correctly before and after the 2to3 
transformation.

For the specific case of the default value for the unicode transformation this
means either making the default the same for Python 2.x and 3.x or
recommending that users set the default explicitly, one way or the other.
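
To illustrate the second option, here is a minimal sketch of what setting the
flag explicitly could look like from the application's side. read_string is a
hypothetical stand-in, not pyshapelib's actual API; only the return_unicode
name is taken from the proposal quoted above, and latin-1 is just an assumed
encoding for the example.

```python
# Hypothetical stand-in for a pyshapelib call that returns a string field.
# Only the return_unicode flag name comes from the proposal quoted above;
# the function name and the encoding parameter are assumptions.

text_type = type(u"")   # unicode on Python 2.x, str on Python 3.x


def read_string(raw, return_unicode=True, encoding="latin-1"):
    """Return raw (a byte string) decoded to text, or unchanged."""
    if return_unicode:
        return raw.decode(encoding)
    return raw


# Application code that passes the flag explicitly gets the same kind of
# object on 2.x and 3.x, whatever the library default happens to be.
name = read_string(b"Osnabr\xfcck", return_unicode=True)
raw_name = read_string(b"Osnabr\xfcck", return_unicode=False)
```

With the flag spelled out, name is text and raw_name is a byte string on both
Python versions, so the code does not depend on which default the library
picks.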

Using unicode by default for both Python 2.x and 3.x is the best solution, I 
think. Using unicode for text is the best way to handle text in Python 2.x,
too, so libraries should encourage that. Also it leads to the following 
recommendation for users trying to port their code to Python 3:

 1. upgrade to the new pyshapelib.

    Users that currently rely on the bytes default will have to adapt their
    code at this point by changing it to work with unicode. This is
    best for the eventual upgrade to Python 3 anyway.

 2. upgrade to Python 3.

    This will not require any pyshapelib related changes.
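
For step 1, one way to adapt code that so far relied on byte strings might be
to convert to unicode at the boundary and work with text everywhere else. A
rough sketch follows; to_text and city_label are made-up helper names and
latin-1 is an assumed encoding, none of this is provided by pyshapelib.

```python
def to_text(value, encoding="latin-1"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):      # bytes is an alias for str on 2.x
        return value.decode(encoding)
    return value


def city_label(name):
    """Build a label from a field value, whether it is bytes or text."""
    return u"City: " + to_text(name)
```

Once every caller receives unicode from the library, a helper like to_text
becomes a no-op and can be dropped.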

If the default is changed for Python 2.x, though, pyshapelib's API changes in 
an incompatible way, so using 1.0.1 as the version is definitely not a good 
idea. I'm not sure it warrants going to 2.0, but that depends on how much 
compatibility you want to guarantee for an upgrade from x.y to x.y+1. The 
conservative approach would probably be using 2.0 because some users will 
have to adapt their code.

  Bernhard

-- 
Bernhard Herzog  |  ++49-541-335 08 30  |  http://www.intevation.de/
Intevation GmbH, Neuer Graben 17, 49074 Osnabrück | AG Osnabrück, HR B 18998
Geschäftsführer: Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner