This weekend I had a look at SQL Server 2008 to find out about the new spatial functions and types that it includes. It has taken some time for Microsoft to include spatial capabilities, but it is here now, and it looks great.
To test out what speed I could get out of a SQL Server Express setup I wanted to take my existing shapefiles from my current SQL Server 2005 which has been extended with my own spatial types and functions. These are based on KML definitions, which I quickly found out is not supported by SQL Server 2008. I guess I should have known that, and I cannot blame Microsoft for going with OGC standards. I looked around and found the SQL Server 2008 Spatial Tools as well as an extensive set of shapefile from GADM. I tried to upload a couple of the country shapefiles using the uploader tool from Morten Nielsen.
While uploading I kept getting errors about the shapefiles not being properly formatted. As I assumed the shapefiles were correct, I wanted to have a look at the source code for the uploader, but sadly that is not available. So I thought I would start an open source shapefile reader and uploader. Actually I have already released the source code for a shapefile to KML converter, so it is merely a question of adding the ability to convert KML objects to SQL Server 2008 compatible object. The Microsoft.SqlServer.Types namespace contains theSqlGeography class, which is the CLR type that can be uploaded to the database.
I found that one of the main reasons the previously mentioned uploader failed so frequently was that shapefiles seemed to be read the wrong way round. A major restriction in SQL Server 2008 is that shapefiles cannot be larger than a hemisphere. Depending on how you view a shape, you can say that it either contains a certain area, or contains the whole globe except that certain area. As mentioned, the size cannot be larger than a hemisphere, but SQL Server doesn't seem to be able to be able to, by itself, determine which is the smaller area. (There is another restriction, that polygon boundaries cannot cross themselves). As I was able to read from various internet sites (does anyone have a good book on the spatial functions of SQL Server 2008?) the points of a polygon have to be read counter-clockwise, so as to keep the interior of the shape on the left of the border. KML files don't have this restriction. So even though the shapefile to KML converter is able to read shapefiles and create KML objects, these objects cannot be read directly into a SqlGeography object. However I found that reversing the points of a polygon made it possible to create a SqlGeography object that can be inserted into the database.
I have added a new project to the SVN trunk which contains the source code for reading Shapefiles and KML files and creating SqlGeography objects - you can find it at svn://svn.reimers.dk/sqlspatial.
The project uses the GoogleMap assembly to handle KML objects. If you read shapefiles, use the Shapefile class and fetch the included ShapeRecords from the ShapeRecord property. KML files can be read using the Reimers.Map.GoogleOverlayCollection. The records can be read from the collection itself. I have chosen to keep this as a separate project as it requires .NET Framework 3.5 SP1, and I cannot add that as a requirement for the map control at present.
I'm not going to go into too much detail about code, as this is still a work in progress. I had a chance to test it on the entirety of the GADM dataset. Trawling through the 244 shapefiles (>500MB of data) took around an hour on my old laptop, which seemed fair to me. This means I can discontinue the SQL Server 2005 Integration project, as soon as I have created an upgrade path. I don't kid myself about the SQL Server 2005 Integration project, it is fast, but it probably cannot compete with a dedicated spatial engine.