typo: first sentence: change -&gt; chance =)<br><br>Update on Linux filenames: it seems to work after all, I was merely testing old code ;)&nbsp; Py_FileSystemDefaultEncoding (which resides in bltinmodule.c) is initially set to NULL on Linux, but Py_InitializeEx in 

pythonrun.c reinitializes it to nl_langinfo(CODESET).<br><br>So that still leaves the issue of wide character support on Windows (NT and later), but that&#39;s a matter that must first be resolved in the shapelib library itself.<br>
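<br>The Linux behaviour described above can be checked from the Python side. This is only a rough sketch and it assumes a current Python 3 interpreter rather than the 2.x sources discussed here; filesystem_encoding_report is a made-up helper name, not something from Python or shapelib.<br>

```python
import locale
import sys


def filesystem_encoding_report():
    """Collect the encodings the interpreter resolved (hypothetical helper)."""
    report = {"filesystem": sys.getfilesystemencoding()}
    # nl_langinfo and CODESET only exist on POSIX builds, so probe for them.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        try:
            # Adopt the user's locale before asking for its codeset.
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            pass  # invalid locale settings: keep whatever is active
        report["codeset"] = locale.nl_langinfo(locale.CODESET)
    return report


if __name__ == "__main__":
    print(filesystem_encoding_report())
```

<br>On a typical Linux box both entries should name the same codec, which would match what Py_InitializeEx is described as doing.<br>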

<br>Bram<br><br><div><span class="gmail_quote">On 3/15/07, <b class="gmail_sendername">Bram de Greve</b> &lt;<a href="mailto:bram.degreve@gmail.com">bram.degreve@gmail.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hi there, For a moment there I thought I&#39;ve seen my change to support unicode for the filenames.&nbsp; But it was only for a moment =) I&#39;ve looked in Python&#39;s source code how they handled things for their own file object, and I&#39;ve mimicked it as far as I could.

<br>Key aspect seems to be to parse a string argument using &quot;et&quot; instead of &quot;s&quot; and to use Py_FileSystemDefaultEncoding as encoding.<br>Except that it doesn&#39;t work ...<br><br>First of all, FileSystemDefaultEncoding is only defined for windows (mbcs) and apple (utf-8), 

and not for Linux (NULL, meaning default encoding, meaning ascii).&nbsp; So Linux still gets plagued by the same error Didrik had before. And yet, Python&#39;s file() seems to be able to cope with unicode filenames on Linux.

Secondly, on Windows mbcs is used, which is a lossy encoding (not all unicode can be represented using mbcs). This is necessary because the original shapelib library only uses the narrow (char*) API, and on Windows that means mbcs encoding.
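<br>To make the "lossy" point above concrete, here is a small sketch. The mbcs codec only exists on Windows, so cp1252 is used as a stand-in for a narrow code page; that substitution, the function name and the sample filenames are all assumptions for illustration.<br>

```python
def narrow_round_trip(name, codec="cp1252"):
    """Encode a filename to a narrow code page and decode it again.

    Characters the code page cannot represent are replaced by '?',
    which is roughly what makes a narrow (char*) API lossy.
    """
    narrow = name.encode(codec, errors="replace")
    return narrow.decode(codec)


if __name__ == "__main__":
    for name in ("roads.shp", "caf\u00e9_\u65e5\u672c.shp"):
        result = narrow_round_trip(name)
        print(repr(name), "survives" if result == name else "is mangled")
```

<br>Only names that stay inside the code page come back unchanged, which is why a wide character API would be needed for full coverage.<br>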

<br>To get full unicode support, the wide character API must be used instead (_wfopen), but shapelib simply doesn&#39;t support that.<br>(Python&#39;s file() does precisely that on Windows: when given a unicode filename, it tries to use the wide character API.)

<br><br>Then there&#39;s also the issue of the encoding of the field names and the string values.&nbsp; The easiest solution would be to standardize everything<br>on UTF-8, but I believe we could do better.&nbsp; It should be possible to specify the encoding when opening or creating a DBFFile, defaulting

<br>to perhaps something specified by the locale.<br><br>There&#39;s also the issue of backwards compatibility.&nbsp; Getting strings into the DBFFile isn&#39;t a problem since we can check whether the<br>caller passes a unicode or a classic string, but getting them out is.&nbsp; Should we always return unicode strings and risk some

<br>incompatibilities with calling code, or should we try to diversify (perhaps based on the encoding used: <br>ascii encoding could return classic strings, or maybe based on another flag ...)?<br><br>Bram<br clear="all">
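<br>The encoding and backwards compatibility ideas above could be sketched roughly as follows. FieldCodec, its parameters and the return_unicode flag are purely hypothetical and not part of shapelib or its Python bindings; the sketch is written for Python 3, where bytes plays the role of the classic string.<br>

```python
import locale


class FieldCodec:
    """Hypothetical helper for DBF string values (illustration only)."""

    def __init__(self, encoding=None, return_unicode=True):
        # Fall back to the locale's preferred encoding when none is given.
        self.encoding = encoding or locale.getpreferredencoding(False)
        self.return_unicode = return_unicode

    def to_storage(self, value):
        """Accept either kind of string; only unicode needs encoding."""
        if isinstance(value, bytes):
            return value
        return value.encode(self.encoding)

    def from_storage(self, raw):
        """Return unicode, or the raw bytes if the caller opted out."""
        if self.return_unicode:
            return raw.decode(self.encoding)
        return raw


if __name__ == "__main__":
    codec = FieldCodec("utf-8")
    stored = codec.to_storage("Z\u00fcrich")
    print(stored, codec.from_storage(stored))
```

<br>Whether the flag should default to unicode or be derived from the chosen encoding is exactly the open question; the sketch just shows that both behaviours can sit behind one switch.<br>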

<br>

-- <br>hi, i&#39;m a signature viruz, plz set me as your signature and help me spread :)

</blockquote></div><br><br clear="all"><br>-- <br>hi, i&#39;m a signature viruz, plz set me as your signature and help me spread :)