bh: thuban/Thuban/Model xmlwriter.py, 1.2, 1.3 xmlreader.py, 1.1, 1.2 load.py, 1.55, 1.56

Fri Jul 1 22:49:06 CEST 2005

Author: bh

Update of /thubanrepository/thuban/Thuban/Model
In directory doto:/tmp/cvs-serv11857/Thuban/Model

Modified Files:
	xmlwriter.py xmlreader.py load.py 
Log Message:
First step towards Unicode.  With this we're roughly at step 1 of
string_representation.txt.

* Doc/technotes/string_representation.txt: New.  Document how
strings are represented in Thuban and how to get to a Unicode
Thuban.

* Thuban/__init__.py (set_internal_encoding)
(unicode_from_internal, internal_from_unicode): New.  The first few
functions for the internal string representation.

* Thuban/UI/about.py (unicodeToLocale): Removed.  Use
internal_from_unicode instead.

* Thuban/UI/__init__.py (install_wx_translation): Determine the
encoding to use for the internal string representation.  Also,
change the translation function to return strings in internal
representation even on Unicode builds of wxPython.

* Thuban/Model/load.py (SessionLoader.check_attrs): Decode
filenames too.
(SessionLoader.start_clrange): Use check_attrs to decode and check
the attributes.

* Thuban/Model/xmlreader.py (XMLReader.encode): Use
internal_from_unicode to convert unicode strings.

* Thuban/Model/xmlwriter.py (XMLWriter.encode): Use
unicode_from_internal to convert strings that are not already
unicode objects.

* test/runtests.py (main): New command line option,
internal-encoding, to specify the internal string encoding to use
in the tests.

* test/support.py (initthuban): Set the internal encoding to
latin-1.

* test/test_load.py (TestSingleLayer.test, TestClassification.test)
(TestLabelLayer.test): Use the internal string representation when
dealing with non-ASCII characters.

* test/test_load_1_0.py (TestSingleLayer.test)
(TestClassification.test, TestLabelLayer.test): Use the internal
string representation when dealing with non-ASCII characters.

* test/test_load_0_9.py (TestSingleLayer.test)
(TestClassification.test): Use the internal string representation
when dealing with non-ASCII characters.

* test/test_load_0_8.py (TestUnicodeStrings.test): Use the
internal string representation when dealing with non-ASCII
characters.

* test/test_save.py (XMLWriterTest.testEncode)
(SaveSessionTest.testClassifiedLayer): Use the internal string
representation when dealing with non-ASCII characters where
applicable.
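
The three helpers named in the Thuban/__init__.py entry are not part of
the diff below, so the following is only a rough sketch of what such
functions could look like. It is an assumption, not the committed code:
it is written for current Python, where `str` stands in for the unicode
type and `bytes` for the internal byte strings, and the module-level
variable `_internal_encoding` is a hypothetical name.

```python
# Hypothetical sketch; the real functions live in Thuban/__init__.py and
# may differ in naming, error handling and defaults.
_internal_encoding = None  # assumed module-level state


def set_internal_encoding(encoding):
    """Set the encoding used for internal (byte) strings."""
    global _internal_encoding
    _internal_encoding = encoding


def unicode_from_internal(s):
    """Decode an internal byte string into a text (unicode) string."""
    return s.decode(_internal_encoding)


def internal_from_unicode(u):
    """Encode a text (unicode) string into the internal representation."""
    return u.encode(_internal_encoding)


# latin-1 is the encoding the log message says test/support.py sets.
set_internal_encoding("latin-1")
internal = internal_from_unicode("Stra\u00dfe")
round_trip = unicode_from_internal(internal)
```

Keeping the encoding in one place means callers such as the XML reader
and writer no longer need to hard-code latin-1 themselves.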


Index: xmlwriter.py
===================================================================
RCS file: /thubanrepository/thuban/Thuban/Model/xmlwriter.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3

--- xmlwriter.py	30 Oct 2003 09:21:54 -0000	1.2
+++ xmlwriter.py	1 Jul 2005 20:49:04 -0000	1.3
@@ -1,4 +1,4 @@
-# Copyright (c) 2003 by Intevation GmbH
+# Copyright (c) 2003, 2005 by Intevation GmbH
 # Authors:
 # Jonathan Coles <jonathan at intevation.de>
 #
@@ -14,6 +14,8 @@
 import os
 from types import UnicodeType
 
+from Thuban import unicode_from_internal
+
 #
 # one level of indention
 #
@@ -119,12 +121,12 @@
             self.file.write(' %s="%s"' % (self.encode(name), 
                                           self.encode(value)))
 
-    def encode(self, str):
-        """Return an XML-escaped and UTF-8 encoded copy of the string str."""
-
-        esc = escape(str)
+    def encode(self, s):
+        """Return an XML-escaped and UTF-8 encoded copy of the string s.
 
-        if isinstance(esc, UnicodeType):
-            return esc.encode("utf8")
-        else:
-            return unicode(escape(str),'latin1').encode("utf8")
+        The parameter must be a string in Thuban's internal string
+        representation or a unicode object.
+        """
+        if not isinstance(s, unicode):
+            s = unicode_from_internal(s)
+        return escape(s).encode("utf8")
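
To illustrate the new behaviour of XMLWriter.encode and its counterpart
XMLReader.encode, here is a small standalone sketch. It makes several
assumptions: it targets current Python (`bytes` plays the role of the
internal string, `str` the role of unicode), the internal encoding is
fixed to latin-1 via the hypothetical constant `INTERNAL_ENCODING`, and
`escape` is taken to be `xml.sax.saxutils.escape`. The function names
`encode_for_xml` and `decode_from_xml` are made up for this example.

```python
from xml.sax.saxutils import escape

INTERNAL_ENCODING = "latin-1"  # assumption; Thuban sets this at startup


def encode_for_xml(s):
    """Return an XML-escaped, UTF-8 encoded copy of s.

    s may be an internal byte string or a text (unicode) string.
    """
    if not isinstance(s, str):
        # Internal representation: convert to text first.
        s = s.decode(INTERNAL_ENCODING)
    return escape(s).encode("utf8")


def decode_from_xml(u):
    """Return the text string u in the internal representation.

    If u is None, return None.
    """
    if u is None:
        return None
    return u.encode(INTERNAL_ENCODING)


written = encode_for_xml(b"a<b & \xfc")  # internal latin-1 input
read_back = decode_from_xml("\u00fc")    # text as delivered by a parser
```

Converting to text before escaping, as the new revision does, avoids
the old code path that escaped the raw string twice in the non-unicode
branch.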

Index: xmlreader.py
===================================================================
RCS file: /thubanrepository/thuban/Thuban/Model/xmlreader.py,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- xmlreader.py	12 Jun 2003 12:52:19 -0000	1.1
+++ xmlreader.py	1 Jul 2005 20:49:04 -0000	1.2
@@ -1,4 +1,4 @@
-# Copyright (C) 2003 by Intevation GmbH
+# Copyright (C) 2003, 2005 by Intevation GmbH
 # Authors:
 # Jonathan Coles <jonathan at intevation.de>
 #
@@ -15,6 +15,7 @@
 import xml.sax
 import xml.sax.handler
 from xml.sax import make_parser, ErrorHandler, SAXNotRecognizedException
+from Thuban import internal_from_unicode
 
 class XMLReader(xml.sax.handler.ContentHandler):
 
@@ -124,12 +125,10 @@
             getattr(self, method_name[1])(name, qname)
 
     def encode(self, str):
-        """Assume that str is in Unicode and encode it into Latin1.
-        
+        """Return the unicode object str in Thuban's internal representation
+
         If str is None, return None
         """
-
-        if str is not None:
-            return str.encode("latin1")
-        else:
+        if str is None:
             return None
+        return internal_from_unicode(str)

Index: load.py
===================================================================
RCS file: /thubanrepository/thuban/Thuban/Model/load.py,v
retrieving revision 1.55
retrieving revision 1.56
diff -u -d -r1.55 -r1.56
--- load.py	6 May 2005 14:17:03 -0000	1.55
+++ load.py	1 Jul 2005 20:49:04 -0000	1.56
@@ -1,4 +1,4 @@
-# Copyright (C) 2001, 2002, 2003, 2004 by Intevation GmbH
+# Copyright (C) 2001, 2002, 2003, 2004, 2005 by Intevation GmbH
 # Authors:
 # Jan-Oliver Wagner <jan at intevation.de>
 # Bernhard Herzog <bh at intevation.de>
@@ -248,7 +248,7 @@
                                     % (element, d.name))
             elif d.conversion == "filename":
                 value = os.path.abspath(os.path.join(self.GetDirectory(),
-                                                     value))
+                                                     self.encode(value)))
             elif d.conversion == "ascii":
                 value = value.encode("ascii")
             elif d.conversion:
@@ -579,11 +579,15 @@
         del self.cl_group, self.cl_prop
 
     def start_clrange(self, name, qname, attrs):
+        attrs = self.check_attrs(name, attrs,
+                                 [AttrDesc("range", False, None),
+                                  AttrDesc("min", False, None),
+                                  AttrDesc("max", False, None)])
 
-        range = attrs.get((None, 'range'), None)
+        range = attrs['range']
         # for backward compatibility (min/max are not saved)
-        min   = attrs.get((None, 'min'), None)
-        max   = attrs.get((None, 'max'), None)
+        min   = attrs['min']
+        max   = attrs['max']
 
         try:
             if range is not None: