[Freegis-list] MySQL 4.1 GIS Functionality

Frank Koormann frank.koormann at intevation.de
Tue Jul 15 11:29:14 CEST 2003


Dear all,

some of you probably have read it, but for all not subscribed to
postgis-users and for the archives I repost here an analysis from
Paul Ramsey and Dave Blasby about the GIS functionality in MySQL 4.1.

The original is archived at
http://postgis.refractions.net/pipermail/postgis-users/2003-June/002651.html

Paul Ramsey <pramsey at refractions.net> [Thu Jun 19 22:42:08 2003]:
| 
| Dave and I have spent a little time over the last few days investigating 
| the spatial capabilities which have been added to the MySQL database as 
| of the 4.1 version. 4.1 is the current "testing" version, and 4.0 is the 
| current "shipping" version.
| 
| 1. Notes from Testing
| ---------------------
| 
| * Spatial data (geometry) only works with the MyISAM table type.  MySQL 
| supports multiple tables types, with different performance 
| characteristics. MyISAM is supposed to have high performance, but no 
| transactional capability. So that means that you cannot have 
| transactionally managed spatial updates.
| 
| * Trying to create create an InnoDB table with a geometry column, will 
| crash the server and muddle the system tables. The systems tables must 
| be MyISAM, since this is an effect of a failed transaction.
| 
| * The Spatial Index is almost certainly either tied to MyISAM or stored 
| in a MyISAM-type structure.  At best this leaves your index vulnerable 
| to ACID transactional issues.
| 
| * We used a 200K segment road database to test data handling.
| 
| * Data load time for the test table was significantly faster in MySQL. 
| To load 200K rows it took 34 seconds on MySQL and 71 seconds on PgSQL. 
|   MyISAM tables are noted for being fast at digesting nontransactional 
| inserts. PgSQL has to manage the inserts within a transactional context.
| 
| * Spatial index creation took about 14 seconds in MySQL and 9 seconds in 
| PostGIS.  This is probably because MySQL uses the quadatic pick-split 
| algorthym and PostGIS uses a linear-time version.
| 
| * The MySQL spatial index is not null-safe. Marking the geometry column 
| as "NOT NULL" is required in order to build a MySQL spatial index.  The 
| PostGIS spatial index is null safe.
| 
| * We did three test queries to compare the retrieval times on the 
| spatial index: first with a very small bounding box (enclosing 6 
| records), then with a medium box (70K records), then with a huge box 
| (all 200K records).
| 
| * To search for a very small area (6 records) in the 200,000 row table 
| took about 20 milliseconds on MySQL and about 2 milliseconds on PostGIS.
| * To search for a medium sized area (70,000 records) in the 200,000 row 
| table took about 1.03 seconds in MySQL and 0.25 seconds in PostGIS.
| * To search for a very large area (all records) in the 200,000 row table 
| took about 2.25 seconds in MySQL and 0.75 seconds in PostGIS.
| 
| * MySQL spatial supports only 2D geometries.
| 
| * MySQL stores the spatial data internally as WKB. Retrieving WKB should 
| be very fast, but we were not able to test that directly at this time.
| 
| * Because the internal representation of MySQL geometries is simple WKB, 
| un-indexed bounding box searches are fairly slow. The database must 
| calculate the bounding box for each feature in order to do the
| comparisons.
| 
| 
| 2. Notes from Examining the Source Code
| ---------------------------------------
| 
| * The MySQL code contains an assumption in the WKB handling that all WBK 
| will be little endian. If you run MySQL spatial on a big endian machine 
| (Sparc, PowerPC, PA-RISC) you will find things go amiss when you try to 
| send WKB to little endian machines (x86), because your big endian 
| machine will be creating invalid data (big endian data incorrectly 
| flagged as little endian). Hopefully MySQL will fix this soon.
| 
| * Spatial types are stored as WKB Strings in the MySQL database.  They 
| seem to have allocated a header on these for the SRID, but access to 
| this is unimplemented.
| 
| * Inside the MySQL database, the geometries are stored in exactly the 
| same type as String. The system does an Internal String -> Geometry 
| conversion (which very easy because the Geometry type is just WKB).
| 
| * This String/Geometry duality causes the structure of the code to 
| fairly complex. Geometry processing is done mostly in the String 
| handling section of the code base, but also appears in the SQL parser 
| and other scattered parts of the system.  We have no idea why they did 
| this, but it sure makes it difficult to follow the logic in the system.
| 
| * We did not look too deeply at the indexing source code, but it in the 
| myisam/ directory, so we assume it is either built inside a myisam 
| structure or is only useable on myisam tables.  Either way, transactions 
|   and ACID guarantees will be lacking.
| 
| * MySQL support for user defined functions has some limitations as well. 
|   It appears that in order to create a user defined function, the 
| programmer must allocate the memory for the results before seeing the 
| return objects. For large and variable sized things like geometries, 
| this could be problematic.  There might be work-arounds for this.
| 
| * MySQL does not offer any real support for object-relational 
| functionality.  PostGIS is built entirely on PostgreSQL?s 
| object-relational structure, making it very isolated from the postgresql 
| development and very easy to maintain.  In fact, the only PgSQL-related 
| maintenance to PostGIS has occurred because of significant improvements 
| to the GIST index.
| 
| 3. General Impressions
| ----------------------
| 
| * 4.1 is still just the "testing" release, but we did find it 
| surprisingly brittle.  Our attempts to create geometries in InnoDB 
| tables failed (fine) but created inconsistent system tables in the 
| process (not fine).  Spatial indexes seemed impossible to create, until 
| we realized the creation command seemed be sensitive to the case of the 
| referenced table name.
| 
| * The MySQL developers made a real effort to follow the OpenGIS 
| specification, sometimes to their own detriment. There are four 
| different MBR comparison functions, with the correct (and confusing) 
| OpenGIS-style function names (MBRContains, MBRTouches, MBRIntersects, 
| MBRWithin). Intersects and Touches are subtly different things.
| 
| * There are no full predicates in this implementation, the predicate 
| functions are all stubs against bounding box operations.
| 
| * There are no operators either (Intersection, Union, etc).
| 
| * Basically, it is a start. With some data loading tools and a little 
| more stability for general purpose stuff, it could be a passable web 
| serving platform.
| 
| 
| -- 
|        __
|       /
|       | Paul Ramsey
|       | Refractions Research
|       | Email: pramsey at refractions.net
|       | Phone: (250) 885-0632
|       \_
| 

-- 
Frank Koormann                             <frank.koormann at intevation.de>
 Professional Service around Free Software       (http://intevation.net/)
 FreeGIS Project                                 (http://freegis.org/)




More information about the Freegis-list mailing list

This site is hosted by Intevation GmbH (Datenschutzerklärung und Impressum | Privacy Policy and Imprint)