[Thuban-devel] Re: Base class for Shapes?

Fri Nov 26 17:55:20 CET 2004

On Thu, Nov 25, 2004 at 10:08:44PM +0100, Bernhard Herzog wrote:
> Jan-Oliver Wagner <jan at intevation.de> writes:
> > on a train travel from Bonn to Osnabrück
> > I worked out an idea about a Base Shape Class.
> 
> If it's there to reduce code duplication OK (The bounding box
> computation has been copied several times already).  It should not be a
> requirement for shape objects to be derived from that class, though.

I had code reduction in mind.

Why should not all Shape objects share the same base class?

> > The idea is that we might want to have Shapes/Shapestores
> > that are kept in memory instead in a file in order
> > have a opportunity to create/modify them with arbitrary
> > algorithms implemented in Thuban(or in its extensions).
> 
> I haven't looked at Ole's code yet, but in the test suite there's
> already a simple shape store that has all the shape data in memory
> (SimpleShapeStore in test/mockgeo.py).  It's only there for the tests,
> but the MemoryTable started out that way too.

Damn, I didn't find this. Will look into that code before doing
anything else on this topic.

> > This class "Shape" would also be a first step to redruce
> > the code in that extension. Ongoing steps would be to introduce
> > a MemoryShapeStore with ability to store it as a Shapefile.
> 
> Writing shapes to a shapefile is orthogonal of memory shape stores.  It
> should be possible to write a function that takes any kind of shapestore
> and writes its contents to a shapefile (or ideally any writable
> shapestore, once we have such a thing in Thuban).

Correct. I should have written with the ability to modify Shapes easily
and create layers without any Shapefile or Database as connection.

> > Ah, and: this can even be ported back to 1_0 ;-)
> 
> Why should the shape base class be backported?

The PIROL extension is based on 1.0 and there is no 1.2 yet.
So I thought this extension could be made more lean with
such classes ported to 1.0.

> > --- Thuban/Model/data.py	24 Nov 2003 19:23:08 -0000	1.15
> > +++ Thuban/Model/data.py	24 Nov 2004 21:43:02 -0000
> > @@ -48,8 +48,114 @@ RAW_SHAPEFILE = "RAW_SHAPEFILE"
> >  # Raw data in well-known text format
> >  RAW_WKT = "RAW_WKT"
> >  
> > +class Shape:
> >  
> > -class ShapefileShape:
> > +    """Base class (Interface) for representing a single Shape.
> > +    """
> 
> Well, what is it?  A base class or an interface?  It shouldn't be both.
> The interface should be independent of a particular base class.  As for
> interface, most of the larger python projects (twisted and Zope for
> instance) seem to be converging on something like PyProtocols.  The
> details vary a bit and we don't need all of it for Thuban, we should go
> in that direction as well.  At the very least it will document the
> interfaces we use.

I was not sure myself. In general, Interfaces would be nice,
but as along as we keep compatibility with  Python 2.2.1 it is
probably not good to introduce them.
So I was more looking at the base class aspect.

> That sais, it would be OK to have a class that contains the code that's
> currently duplicated in the various shape classes that already exist.
> AFAICT, in the shape classes that come with Thuban, i.e. ShapefileShape,
> PostGISShape and SimpleShape, there are only two things their
> implementations have in common:

>   shapeid: In all cases it's passed to __init__, stored in self.shapeid,
>            and from there it's returned by the ShapeId() method.
> 
>   compute_bbox: The method to compute the bounding from the points.
>                 It's implementation is exactly the same in all three
>                 cases.
> 
> So if we want a base class for common implementation, these would be the
> only things that belong there.

OK.

Aren't there any additional hard requirements all Shape object must have?
What about the RawData for example. Shouldn't there by a default method?

> Your base class has several methods that modify the shape. So, one
> general design issue that we may need to decide is whether shape objects
> are mutable.  I would prefer to keep them immutable.  If they were
> mutable, what would changing a shape mean?  If it were a shape read from
> a shapefile, for instance, would the shapefile automatically be
> modified?

Yes I agree. A base class should be immutable.

> > +    def isInBBox(self, bbox):
> > +    	"""Test whether the the shape lies completely within the
> > +	given bounding box.
> 
> It might be easier to promote bounding boxes to "real" objects.  Then
> you could do that kind of bounding box arit

not sure how to do that "arit".

> > +
> > +	bbox -- the bounding box as tuple (minx,miny,max,maxy)
> > +
> > +	Returns: True if the shape lies within bbox, else False
> > +	"""
> > +        bb_minX, bb_minY, bb_maxX, bb_maxY = bbox
> > +        minX, minY, maxX, maxY = self.BBox()
> > +
> > +        if minX >= bb_minX and maxX <= bb_maxX and \
> > +	   minY >= bb_minY and maxY <= bb_maxY:
> > +            return True
> > +
> > +        return False
> > +
> > +    def __str__(self):
> > +        return "Shape " + repr(self.shapeid) + ": " + repr(self.points)
> > +
> > +#    def intersectsBBox(self, bbox):
> > +#    	"""Test whether the the shape intersects the given bounding box.
> > +#	In other words, tests whether at least one point of the shape
> > +#	lies within the given bounding box.
> > +#
> > +#	bbox -- the bounding box as tuple (minx,miny,max,maxy)
> > +#
> > +#	Returns: True if at least one point of the shape lies within
> > +#	         bbox, else False
> > +#	"""
> > +#        bb_minX, bb_minY, bb_maxX, bb_maxY = bbox
> > +#        minX, minY, maxX, maxY = self.BBox()
> > +#
> > +#	# test each single point whether it is inside the given bbox
> > +#        for part in self.Points():
> > +#            for x, y in part:
> > +#            if (x >= bb_minX and x <= bb_maxX) and
> > +#	       (y >= bb_minY and y <= bb_maxY):
> > +#                return True
> 
> Did you comment that out and forgot to remove it before posting this?
> It's quite buggy :)

yes, that is broken totally and should have been removed before
I made the patch.

> 
> >      def RawData(self):
> > -        """Return the shape id to use with the shapefile"""
> > +        """Return the shape id to use with the shapefile.
> > +	
> > +	XXX: What is the difference to ShapeID()?"""
> >          return self.shapeid
> 
> RawData() returns the data for the shape in it's "raw" form.  What that
> is depends on the shape store.  For shapefiles it's the id, for the
> PostGISShape a string with the WKT representation.  So, it's a
> coincidence that RawData() is identical to ShapeID().  The raw form is
> used by the low-level renderers, for instance, because it can increase
> performance substantially.

This explanation could have helped me if in a default Shape.RawData()
method ;-) Currently we do not have a suitable place in the code to
put this text.

-- 
Jan-Oliver Wagner               http://intevation.de/~jan/

Intevation GmbH                      http://intevation.de/
FreeGIS                                http://freegis.org/