Base class for Shapes?

Thu Nov 25 22:08:44 CET 2004

Jan-Oliver Wagner <jan at intevation.de> writes:

> Hi,
>
> on a train travel from Bonn to Osnabrück
> I worked out an idea about a Base Shape Class.

If it's there to reduce code duplication OK (The bounding box
computation has been copied several times already).  It should not be a
requirement for shape objects to be derived from that class, though.

> The idea is that we might want to have Shapes/Shapestores
> that are kept in memory instead in a file in order
> have a opportunity to create/modify them with arbitrary
> algorithms implemented in Thuban(or in its extensions).

I haven't looked at Ole's code yet, but in the test suite there's
already a simple shape store that has all the shape data in memory
(SimpleShapeStore in test/mockgeo.py).  It's only there for the tests,
but the MemoryTable started out that way too.

> This class "Shape" would also be a first step to redruce
> the code in that extension. Ongoing steps would be to introduce
> a MemoryShapeStore with ability to store it as a Shapefile.

Writing shapes to a shapefile is orthogonal of memory shape stores.  It
should be possible to write a function that takes any kind of shapestore
and writes its contents to a shapefile (or ideally any writable
shapestore, once we have such a thing in Thuban).

> Ah, and: this can even be ported back to 1_0 ;-)

Why should the shape base class be backported?

> --- Thuban/Model/data.py	24 Nov 2003 19:23:08 -0000	1.15
> +++ Thuban/Model/data.py	24 Nov 2004 21:43:02 -0000
> @@ -48,8 +48,114 @@ RAW_SHAPEFILE = "RAW_SHAPEFILE"
>  # Raw data in well-known text format
>  RAW_WKT = "RAW_WKT"
>  
> +class Shape:
>  
> -class ShapefileShape:
> +    """Base class (Interface) for representing a single Shape.
> +    """

Well, what is it?  A base class or an interface?  It shouldn't be both.
The interface should be independent of a particular base class.  As for
interface, most of the larger python projects (twisted and Zope for
instance) seem to be converging on something like PyProtocols.  The
details vary a bit and we don't need all of it for Thuban, we should go
in that direction as well.  At the very least it will document the
interfaces we use.

That sais, it would be OK to have a class that contains the code that's
currently duplicated in the various shape classes that already exist.
AFAICT, in the shape classes that come with Thuban, i.e. ShapefileShape,
PostGISShape and SimpleShape, there are only two things their
implementations have in common:

  shapeid: In all cases it's passed to __init__, stored in self.shapeid,
           and from there it's returned by the ShapeId() method.

  compute_bbox: The method to compute the bounding from the points.
                It's implementation is exactly the same in all three
                cases.

So if we want a base class for common implementation, these would be the
only things that belong there.

Your base class has several methods that modify the shape. So, one
general design issue that we may need to decide is whether shape objects
are mutable.  I would prefer to keep them immutable.  If they were
mutable, what would changing a shape mean?  If it were a shape read from
a shapefile, for instance, would the shapefile automatically be
modified?

> +    def isInBBox(self, bbox):
> +    	"""Test whether the the shape lies completely within the
> +	given bounding box.

It might be easier to promote bounding boxes to "real" objects.  Then
you could do that kind of bounding box arit

> +
> +	bbox -- the bounding box as tuple (minx,miny,max,maxy)
> +
> +	Returns: True if the shape lies within bbox, else False
> +	"""
> +        bb_minX, bb_minY, bb_maxX, bb_maxY = bbox
> +        minX, minY, maxX, maxY = self.BBox()
> +
> +        if minX >= bb_minX and maxX <= bb_maxX and \
> +	   minY >= bb_minY and maxY <= bb_maxY:
> +            return True
> +
> +        return False
> +
> +    def __str__(self):
> +        return "Shape " + repr(self.shapeid) + ": " + repr(self.points)
> +
> +#    def intersectsBBox(self, bbox):
> +#    	"""Test whether the the shape intersects the given bounding box.
> +#	In other words, tests whether at least one point of the shape
> +#	lies within the given bounding box.
> +#
> +#	bbox -- the bounding box as tuple (minx,miny,max,maxy)
> +#
> +#	Returns: True if at least one point of the shape lies within
> +#	         bbox, else False
> +#	"""
> +#        bb_minX, bb_minY, bb_maxX, bb_maxY = bbox
> +#        minX, minY, maxX, maxY = self.BBox()
> +#
> +#	# test each single point whether it is inside the given bbox
> +#        for part in self.Points():
> +#            for x, y in part:
> +#            if (x >= bb_minX and x <= bb_maxX) and
> +#	       (y >= bb_minY and y <= bb_maxY):
> +#                return True

Did you comment that out and forgot to remove it before posting this?
It's quite buggy :)

>      def RawData(self):
> -        """Return the shape id to use with the shapefile"""
> +        """Return the shape id to use with the shapefile.
> +	
> +	XXX: What is the difference to ShapeID()?"""
>          return self.shapeid

RawData() returns the data for the shape in it's "raw" form.  What that
is depends on the shape store.  For shapefiles it's the id, for the
PostGISShape a string with the WKT representation.  So, it's a
coincidence that RawData() is identical to ShapeID().  The raw form is
used by the low-level renderers, for instance, because it can increase
performance substantially.

   Bernhard

-- 
Intevation GmbH                                 http://intevation.de/
Skencil                                           http://skencil.org/
Thuban                                  http://thuban.intevation.org/