WIP-pyshapelib-Unicode issues

Bram de Greve bram.degreve at gmail.com
Wed Jan 9 20:33:30 CET 2008


Bernhard Reiter wrote:
> On Wednesday 09 January 2008 18:52, Bram de Greve wrote:
>   
>>>> This should be
>>>> configurable by the user, but I don't really know where to start.  Can
>>>> anyone who's familiar with the Thuban UI give a head start?
>>>>    
>>>>         
>>> First we have to decide where to save this property.
>>>  
>>>       
>> Is there any "config" file for Thuban?  If so, I would save it there.
>>     
>
> This would be only suitable if it is a global option,
> but if I add a few shapefiles, they could come from different sources
> and have different encoding in the dbf files.
> Thus this looks like at least a property per layer.
>   
TERMINOLOGY ISSUE.  I've been using the term "code page" for whatever the
DBF file claims its code page is: either an LDID constant or the
content of the .cpg file.  That does not quite match the real
meaning of code page, though.  For example, many of the LDID values refer
to the same code page: LDID_DUTCH_OEM and LDID_FRENCH_OEM both use the
"code page" cp437.

OK, so the import dialog should have something like a checkable option
to override the encoding.  dbflib.open() and dbflib.DBFFile would need
yet another optional argument to override the code page.  *gasp*
Actually, we don't need that.  We can ask dbflib to return encoded
strings (the default) and decode them ourselves, either via the
file-specific codec or via the overriding codec.  And we would save the
override in the session.
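
To make the idea concrete, here is a minimal sketch of decoding on our
side.  It assumes dbflib hands us the raw field value as a byte string;
decode_field and its parameter names are hypothetical, not part of
dbflib or Thuban.

```python
def decode_field(raw, file_codec, override_codec=None):
    """Decode a raw DBF field value (bytes) into a unicode string.

    file_codec is the Python codec derived from the file's LDID or .cpg
    entry; override_codec, if given, is the user's choice from the
    session and takes precedence.  Both names are hypothetical.
    """
    codec = override_codec or file_codec
    return raw.decode(codec)


raw = b"caf\x82"  # byte 0x82 is e-acute in cp437
as_declared = decode_field(raw, "cp437")
as_overridden = decode_field(raw, "cp437", override_codec="cp1252")
```

The same bytes come out differently depending on the codec, which is
exactly why the override has to be remembered per layer.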

How do we name the encodings?  If the setting is DBF specific, we can
name them by their code page names (whatever follows the LDID_ and CPG_
prefixes of the constants).  If it needs to be more general, we'll have
to resort to Python codec names.  Keep in mind that many of the code
pages map to the same Python codec!
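
A small sketch of that many-to-one relation, assuming we keep a table
from DBF-specific names to Python codec names.  The table below is
illustrative only: the first two entries follow the cp437 example
above, the third is made up, and none of this is the actual dbflib
mapping.

```python
import codecs
from collections import defaultdict

# Hypothetical table: DBF-specific code page name -> Python codec name.
CODE_PAGE_TO_CODEC = {
    "DUTCH_OEM": "cp437",
    "FRENCH_OEM": "cp437",
    "EXAMPLE_ANSI": "cp1252",  # made-up entry, for illustration only
}


def names_by_codec(table):
    """Group code page names by the canonical Python codec they map to."""
    groups = defaultdict(list)
    for name, codec in sorted(table.items()):
        # codecs.lookup() normalizes aliases such as "IBM437" to "cp437".
        groups[codecs.lookup(codec).name].append(name)
    return dict(groups)


grouped = names_by_codec(CODE_PAGE_TO_CODEC)
```

If we stored only the Python codec name, the distinction between
DUTCH_OEM and FRENCH_OEM would be lost; storing the DBF name keeps it.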

> Thuban has a ~/.thuban/thubanstart.py file and the session files usually 
> contain all the information necessary to reload a set of layers.
>
>   
>> More important is the encoding to be used when _creating_ new dbf files!
>>     
>
> Yes, this would be the export possibilities from the table view dialog.
>   
OK, so selecting the code page when creating the dbf should be a
non-optional selection box.  However, I believe it would be useful to
store the user's preference in thubanstart.py, so that they don't have
to select it over and over again if they use the same code page most of
the time.

> If it is not in the shapefile and cannot be reliably detected, the user
> must change it, and it needs to be saved in the session file so it comes up
> again as the user wanted it to be. I think adding a display and button to the 
> table view is a good approach.
>
>   
You convinced me an overridden code page should be stored in the session
file.  And I won't say no to adding a display and a button either =)

Bram



