Thursday, July 6, 2017

Making tiled maps, and importing into OpenStreetMap

A user on the OsmAnd group asked about my process for generating my own tiled maps, and for importing parks and preserves.

The topic, I thought, deserved a decent writeup, so I decided to work my reply into a post here.

Workflow for getting tiles into BackCountry Navigator

I've got a reasonably capable Linux system at home (quad-core i7, 32 GB memory, half a terabyte of SSD, 4 terabytes of RAID 0+1), where I host a PostGIS database. By far the biggest part of that database is taken up with the OSM North America export from geofabrik.de, which I initially loaded with osm2pgsql and keep synchronized with Geofabrik's nightly diffs using osmosis. I retain the 'slim' tables because I have scripts that need them.

Starting from Lars Ahlzen's TopOSM, I've set up a Mapnik rendering pipeline that uses OSM, the National Elevation Dataset, the USFWS national wetlands inventory, the National Hydrography Dataset, the National Landcover Dataset, and a bunch of state and local databases to assemble an American-style topo map that shows much of the information that I want to see. I've set up a previewer for that map at

https://kbk.is-a-geek.net/catskills/test4.html

(Feel free to pan and zoom.) At present I'm set up to render the US east of a line from roughly Atlanta to the Straits of Mackinac, and north of Atlanta. (That's because I set it up partly for some correspondents of mine who are interested in maintaining information for approach trails to the Appalachian Trail, and the area I render is roughly the bounding box of that trail.)

The map is augmented with information that I got from a number of state and local GIS departments. For example, this view shows a number of trails in magenta. Those are trails that I got from NY State Department of Environmental Conservation's GIS department. I do NOT import those because

  1. There are license incompatibilities
  2. The data are stale, and were originally digitized from inappropriately small scale maps. In some places they're quite good indeed, but in other places they're way off.

I find them to be a useful indication of "there is a trail somewhere near here", and a "to do" list for trail mapping with GPS.

For the most part, those auxiliary data came in the form of shapefiles, and got imported into more tables in the PostGIS database. The Mapnik source for the map as you see it has dozens and dozens of layers.

Since I've served the map up, I can tell BackCountry Navigator to use it as a web map, much as it would use the US topos from ArcGIS, or OpenStreetMap tiles, or Bing aerial imagery. The URL for the map is

https://kbk.is-a-geek.net/catskills/tiles/{Z}/{X}/{Y}.jpg

(Please don't incorporate it into apps to re-share; I have limited bandwidth and even more limited time to support the thing. Also, please don't try to bulk-download all the tiles! If you need large amounts of map, email me and we'll work something out.)
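
For readers wondering how the {Z}/{X}/{Y} placeholders get filled in, here is a minimal sketch. It assumes the server follows the usual XYZ ("slippy map") tile numbering that OpenStreetMap tile servers use, which the post does not state explicitly; the function names are made up for illustration. It only builds the URL for the single tile containing a point and fetches nothing.

```python
import math

# URL template from the post. Assumption: tiles are numbered in the common
# XYZ ("slippy map") scheme, with y counted from the north edge.
TILE_URL = "https://kbk.is-a-geek.net/catskills/tiles/{Z}/{X}/{Y}.jpg"


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the tile containing a WGS84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(lon_deg, lat_deg, zoom, template=TILE_URL):
    """Fill in the {Z}/{X}/{Y} placeholders for one tile."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return template.format(Z=zoom, X=x, Y=y)
```

An app such as BackCountry Navigator does the equivalent of this internally for every tile in the visible area.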

Since BackCountry Navigator supports downloading the tiles for an area in advance of a trip, I download from my home Wi-Fi before I go, and run happily without cell service in the woods.

Importing parks and preserves

The imports of parks and preserves have been several projects, each with its own workflow.

New York City watershed recreation

I did an import of New York City Watershed Recreation Lands. For that, the city made available PDF maps of each of its facilities (note that these are located outside the city, protecting the watershed lands in the Catskill Mountains that provide New York with its water). It turns out that these PDF files were already georeferenced, and that the names of layers in them were predictable, so I was able to set up a script that downloads them one at a time, scrapes out of the file just the boundary of the facility, and pushes the facility boundary into PostGIS.

The script got rather complicated, because it had to check that it wasn't overwriting data that are already in OSM, repair topology of the polygons, simplify the ways, shrink the polygons back a short distance to avoid collisions, and similar tidying operations. I also developed a mapping between the descriptive attributes in the shapefile and OSM tagging.
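
To give a flavor of the "simplify the ways" step, here is a minimal sketch in Python rather than the Tcl the scripts were actually written in. The post does not say which simplification method was used; Douglas-Peucker is assumed here because it is a common choice. Coordinates are assumed to be in a projected system (for example metres), not raw latitude and longitude.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_distance = 0, -1.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= tolerance:
        # Every interior point is close enough to the chord: drop them all.
        return [first, last]
    # Otherwise keep the farthest point and simplify each half separately.
    left = simplify(points[: worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    return left[:-1] + right
```

This sketch makes no attempt to keep a polygon valid after simplification. Inside PostGIS, ST_SimplifyPreserveTopology and ST_MakeValid are the kind of functions one would reach for when that matters.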

All the scripting was done in Tcl/Tk, for no better reason than that I'm familiar with it through having used it for about 25 years.

I proposed the import on the OSM Wiki and went through the usual storm and fury on the 'imports' mailing list.

The eventual import was done by taking the data, one parcel at a time, and using the JOSM remote control interface to push the polygons into JOSM. I did a final eyeball check for each, and committed them to OSM.
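
JOSM's remote control is a small HTTP server on the local machine, so "pushing" data to it amounts to requesting a URL. The sketch below only builds such URLs. It assumes the default address 127.0.0.1:8111 and uses the `load_and_zoom` and `import` commands; which commands the import scripts actually issued is not stated in the post, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Default address of JOSM's remote control (it must be enabled in the
# JOSM preferences before it will answer).
JOSM_BASE = "http://127.0.0.1:8111"


def load_and_zoom_url(left, bottom, right, top, base=JOSM_BASE):
    """URL asking JOSM to download OSM data for a bounding box and zoom to it."""
    query = urlencode(
        {"left": left, "right": right, "top": top, "bottom": bottom}
    )
    return f"{base}/load_and_zoom?{query}"


def import_url(file_url, base=JOSM_BASE):
    """URL asking JOSM to open an .osm file that is served at file_url."""
    return f"{base}/import?{urlencode({'url': file_url})}"
```

Fetching the first URL loads the existing OSM data around a parcel; fetching the second adds the prepared polygon for the final eyeball check.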

I've since revisited the import once, picking up 10 new purchases, 25 boundary changes, and six modified sets of access restrictions.

New York State Department of Environmental Conservation lands

Emboldened by this experience, I took on reworking the seven-year-old import of the NYS Department of Environmental Conservation Lands shapefile. It was a similar workflow: pour the data into PostGIS, tidy up the geometry and topology, map tags, and so on - but on a much larger scale, and starting from a single shapefile rather than several hundred PDF maps.
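
The "map tags" step boils down to a lookup from shapefile attributes to OSM tags. The sketch below is purely illustrative: the field names (FACILITY, CATEGORY) and the category table are hypothetical, and neither the real shapefile schema nor the tagging actually used in the import is given in this post.

```python
# Illustrative only: the field names and the category table below are
# hypothetical, not the real shapefile schema or the real tag mapping.
OPERATOR = "New York State Department of Environmental Conservation"

CATEGORY_TAGS = {
    "WILDERNESS": {"boundary": "protected_area", "protect_class": "1b"},
    "STATE FOREST": {"boundary": "protected_area", "protect_class": "6"},
}

# Fallback for categories that the table does not know about.
DEFAULT_TAGS = {"boundary": "protected_area"}


def osm_tags(record):
    """Translate one shapefile attribute record (a dict) into OSM tags."""
    category = (record.get("CATEGORY") or "").strip().upper()
    tags = dict(CATEGORY_TAGS.get(category, DEFAULT_TAGS))
    name = (record.get("FACILITY") or "").strip()
    if name:
        tags["name"] = name
    tags["operator"] = OPERATOR
    return tags
```

Keeping the mapping in a table rather than in code makes it easy to review on the imports list and to adjust when the source data change.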

Once again, since I had automatable data, I did this one as a formal (re)import proposal.

This was again a parcel-by-parcel effort in JOSM, but this time, there was much more manual work, since there were existing versions of the parcels that had to be conflated. It was a pretty hellish job, completed in off-and-on evening work between May and September of 2016. The hardest areas to handle were ones where complicated shorelines formed the boundaries of reserves; Saranac Lakes Wild Forest and Lake George Islands were ones that I recall as being particularly tricky.

Conflation was also tricky if the parcels shared ways with adjacent landuse or landcover polygons. In the worst cases, I simply left the original polygons in place, but removed the tagging identifying the land as state forest, and then overlaid with the protected area.

Once again, now that I keep after it every year or so, the modifications are more straightforward. I reimported a couple of months ago and managed to do it in a couple of evenings.

New York State Parks

I then moved on to New York's State Parks. Note that the Adirondack Park and the Catskill Park are parks owned by the state, but they are not State Parks; instead they are entities unto themselves, enshrined in the state constitution.

Each of the state parks has a georeferenced PDF trail map available from New York State Office of Parks, Recreation and Historic Preservation. Unlike the New York City PDFs, there were no vector layers for me to scrape. Moreover, the license status of the state park maps is unclear, and I live in the one Federal Circuit where government entities can claim copyright to data such as these. Instead, I treated the PDFs as a 'to do' list of parks that needed to be mapped.

For each of these, I did the following:

  1. Converted the PDF to a GeoTIFF for efficiency, and loaded the GeoTIFF into Quantum GIS (QGIS). (QGIS can read GeoPDF, but becomes unusably slow when it does.)

  2. As a separate layer, loaded up a shapefile of tax parcels owned by New York State. This shapefile has license terms compatible with ODbL - the public has the right to use the data for any lawful purpose.

  3. Selected all the tax parcels that were coterminous with the park. This could be as few as one or as many as several hundred.

  4. Conflated the parcels and repaired the topology. (This was a fair amount of manual patchwork.)

  5. Exported the tidied parcel from QGIS as a shapefile.

  6. Opened the shapefile in JOSM and downloaded the OSM data.

  7. Added tagging. For this, I wound up developing a couple of JOSM presets for 'New York State Park' and 'New York State Historic Site', and did a bunch of copy-and-paste of things like park names, web sites, and telephone numbers from parks.ny.gov.

  8. Conflated with what was already in OSM. A lot of state parks were already there, with somewhat whimsical boundaries. If the boundaries were from TIGER, I had no qualms about overwriting them.

    Please pick up after your TIGER

    If the boundaries were actually provided by a local mapper, I tried to get in touch with the mapper in question and find out how they were obtained. The mappers were very cooperative, indeed, and got back to me promptly. In virtually all cases, they had traced approximate boundaries from Bing and were happy to have the ones from the tax rolls.

    Again, there were adjacent-parcel issues, and again, I sometimes resorted to overlaying the protected area and leaving existing landcover polygons (and adjacent landuse polygons) alone.
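
Step 1 of the list above can be sketched with GDAL's `gdal_translate`, which reads georeferenced PDFs. This is an assumption: the post does not say which tool did the conversion. The 300 dpi default and the DEFLATE compression are arbitrary illustrative choices, and the helper names are made up.

```python
import subprocess
from pathlib import Path


def geotiff_command(pdf_path, dpi=300):
    """Build a gdal_translate command that rasterizes a GeoPDF to GeoTIFF."""
    pdf = Path(pdf_path)
    tif = pdf.with_suffix(".tif")  # write the output next to the input
    return [
        "gdal_translate",
        "-of", "GTiff",                     # output format
        "--config", "GDAL_PDF_DPI", str(dpi),  # rasterization resolution
        "-co", "COMPRESS=DEFLATE",          # lossless compression
        str(pdf),
        str(tif),
    ]


def convert(pdf_path, dpi=300):
    """Run the conversion; requires GDAL built with PDF support on PATH."""
    subprocess.run(geotiff_command(pdf_path, dpi), check=True)
```

The resulting .tif can then be added to QGIS as an ordinary raster layer underneath the tax parcels.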

I didn't call this one an 'import'. I was comfortable with not doing so. There was far too much manual work involved for it to fall under the definition of 'automated edits.' Everything that went in had been touched with eyeball and mouse. Nobody complained. It is more blessed to beg forgiveness than to ask permission.

Other land areas

I used the same technique, with different source datasets, to fill in a number of county and municipal parks, and some private preserves. This is a work in progress; there's always more to be done. The most recent ones that I brought in were just this past weekend (2017-07-03), with a few more parcels belonging to the nonprofit Mohawk Hudson Land Conservancy.

(I still need to get out to these and GPS the trails!)

That's also how I sorted out the unholy mess of overlapping polygons for West Point, five state parks (Bear Mountain, Harriman, Sterling Forest, Schunnemunk, Storm King), the Federal corridor for the Appalachian Trail, a private, open-to-the-public preserve (Black Rock Forest), the villages of Harriman, Woodbury, Fort Montgomery and Stony Point, the Woodbury golf course, and the Hudson River riverbank. What a tangle that was!

TL;DR

The one-line summary: "It's never easy, is it?"
