Saturday, January 21, 2012

Mapping New York - GPS and quality control

Last time, we loaded up a photograph of an old map of the Mohawk River State Park (the former Schenectady Museum Nature Preserve) and traced its borders to put them in our map. It turns out that I have multiple sources of data regarding this pretty parcel of undeveloped public land - including my own boots on the ground. In this installment, I'm going to have a look at superimposing them and comparing them.

The first additional data source is a set of GPS tracks that Andy Arthur obtained under Freedom of Information laws. These show the trail routings in New York state park lands, much as the DEC files do for the state forests. (I may at some point discuss the data quality of those files; they're available in several redundant formats, and the data do not match.)

The second additional data source is that I tried walking the trails about a month ago with a GPS receiver. Unfortunately, it was the GPS on my Android phone, which suffered an operating system crash partway through the walk. But I was able to recover my tracks for at least one of the trails, the blue trail (marked as 'purp' in the DEC files) that runs from the entrance by the Lock 7 parking up to the east power line near River Road.

Mohawk River State Park trails
To compare what I found, I added vector layers with the KML file of the park (orange) and with the GPX file from my phone (blue). Looking at them superimposed in QGIS, I note the following.

  1. In the southern part of the preserve, the old trail map is largely schematic. I know from experience that the white trail follows the ridge top on the east bluffs and always has. The orange tracks match my memory, even if the phone had stopped recording.

  2. Similarly, the orange GPS tracks and my tracks (I *know* that I was following the blue blazes) largely align. on the blue trail, placing it a bit to the south of where the old map puts it.

  3. It also appears that whoever laid out the orange tracks missed a double blaze at one point and just bashed north to the power line. The blue blazes weren't hard to follow, and I have confidence in my track at that point.

  4. The orange tracks look a good bit better aligned with the map up at the north end near Whitmyer Drive. (There were better defined lanes through the old tree nursery, so this doesn't really surprise me.)

Conclusion: I have a lot of confidence in the park outline that I traced and in the alignment of the blue trail in my GPS tracks. I need more ground data for the rest of the trails before I'm willing to make my map available as something nearly definitive.

Before the snow gets too deep, I hope!


Tuesday, January 17, 2012

Mapping New York - georeferencing an old map

Last time, I mentioned that the Mohawk River State Park (former Schenectady Museum Nature Preserve) is missing from my map - because it appears neither in OpenStreetMap nor in the New York State inventory of public lands. How can I get it included?

Fortunately, I found online that someone had snapped a photograph of an old copy of the map that had once been posted at the Whitmyer Road parking lot, and posted it to a mountain bike forum. The picture looks to be output from a CAD system, or perhaps an early GIS. In any case, it appears to have been done to scale, and have just enough points that we can reference it to the map. Let's give it a try.

Pull up ‘Plugins->Georeferencer->Georeferencer’ from the QGIS menu bar.
Georeferencing a map

Select the ‘Open Raster’ button in the toolbar, and load the ‘niskabike.jpg file that we downloaded from the mountain bike forum. Zoom in a couple of levels; you can use the ‘hand’ tool to pan around, just as you can with the map.

Now we have to identify corresponding points in the two maps. Click the ‘Add Point’ button in the toolbar, and click a point for which we can identify a corresponence. (I used the intersection of Whitmyer Drive and River Road as the first point.)
Adding a control point
A dialog appears asking what corresponding point in the map to use. Click ‘From map canvas’ and select the point on the main map. The co-ordinates will be filled into the dialog box. Click ‘OK’

Continue this process for identifiable points that cover the map over a wide area. I used the turn of Whitmyer Road, the right angle in the upstream breakwater above the lock, the junction of the lock's retaining wall with the dam, the turn of Lock 7 Road near the parking area, and the junction of River Road and Rosendale Road.

Once you have a data table of corresponding points, you can let the georeferencer do its job by selecting ‘Start georeferencing’ in the toolbar.
Transform settings for georeferencer
You will need to fill out the dialog that appears, at least giving it a TIFF file for the output, and a projection to use (which should usually be the projection of the main project). The rest of the choices should be filled out as shown.

Clicking ‘OK’, we get the old map loaded up as a layer in the QGIS project. Lowering this layer in the stack (I put it below the contour lines), we can see that it aligns nearly perfectly. The retaining wall of the lock, the roads, and the bike path all superimpose nicely. The stream appears to have meandered a little, but it's pretty soggy around there, and I'll say that it's good enough.
Georeferenced map

Now we can start transcribing data by tracing over the old map. Create a new shapefile layer, choosing a Polygon layer, and adding a few attributes.
Create a new layer
In the new layer, turn on edit mode (‘Toggle Editing’ in the toolbar). Create polygons that outline the land parcels of interest. Save the newly-created shapefile, and turn editing mode back off.

Lower the layer below the relief shading, style the land use appropriately, remove the temporary layer with the old map, and we're done.

(I could have transcribed the walking tracks, and so on, but I have other data sources for them. So read on to the next installment to see where that material is coming from.)


Mapping New York - point data

We're just about done with the basemap. While we'll eventually augment it - we'll come back later to adding things like hiking tracks in the state forest lands - I'll skip them for now because they're irrelevant to my map of the Niskayuna nature preserves.

About all that remains is to add data about points of interest (and other single-point features).
All of the data that I'm adding at this step is already in the database from the ‘new_york.osm’ file. It's all in the 'point' table.

The OSM point data is, you should forgive the expression, all over the map in terms of what it describes. In order to get it to layer correctly with other things, I've broken it out into multiple layers in the QGIS project:

  • Highway points - those for which

    upper(geometrytype("way")) IN ('POINT','MULTIPOINT')
    AND "highway" IS NOT NULL
    AND "highway" NOT IN ('bus_stop', 'crossing', 'stop_sign', 'street_lamp,
    'traffic_signals', 'traffic_signals;motorway_junction',
    'unclassified', 'yield')

    These are things like roundabouts, turning circles, and freeway interchanges. They layer right next to the roads, and have a simpleminded style sheet. (The only thing of interest is that its symbols are sized in map units rather than millimetres - I represent a traffic circle by a circle of diameter 15 m at map scale.) Eventually, I want to do better labeling for numbered interchanges, but that can wait for a version of QGIS that has highway shields.

  • A mixed bag of features that includes schools, houses of worship, hospitals, public toilets, miscellaneous buildings, helipads, gates, and boat launches. (I expect this list to expand over time, depending on what is relevant to a given map. There are dozens of things that might be included.)

    upper(geometrytype("way")) IN ('POINT','MULTIPOINT')
    AND place IS NULL
    AND highway IS NULL
    AND (amenity IS NULL OR amenity NOT IN ('fire_hydrant'))

    This layer goes above the building footprints.

  • Place names (and names of dams, which are shown the same way). This layer is topmost in the stack. Its style displays nothing. It merely exists as a place for the labeling engine to add the names of objects.

    "place" IN ('city', 'hamlet', 'island', 'islet', 'locality', 'suburb',
    'town', 'unincorporated_area', 'village')
    OR "waterway" IN ('dam')

With all of this in place, the map appears quite informative - except that it's missing one of the preserves that I want to show! (The Mohawk River State Park, formerly the Schenectady Museum Nature Preserve.) In the next installment, I'll try to get that on the map.

In the meantime, here are the style sheets for the layers I discussed in this lesson:


Monday, January 16, 2012

Mapping New York - shading the relief

Last time, we added polygons to our base map that describe land use and show building footprints. Now let's make the map prettier by giving it shaded relief.

Recall that when we were drawing the contour lines, that I mentioned that we should keep the Digital Elevation Model (DEM) files around. Now we'll be returning to them, and getting shaded relief for them.

To do that, we create the shading layers for individual quadrangles. (While this process is running, you probably want to uncheck 'Render' in the status bar; updating the screen with the results is expensive.)

Select ‘Raster->Analysis->DEM (Terrain Models)’ from the ‘Raster’ menu.
Generating shaded relief
(1) In the ‘Input File,’ browse to a DEM file for a quadrangle that you saved from the earlier step. In the ‘Output File,’ enter (or browse to) the path of a new TIFF file that will hold the relief shading.

(2) Select ‘Hillshade’ under ‘Mode.’ You can leave all the other options at their default settings.

(3) Select ‘Load into canvas when finished,’ and select ‘OK’

Repeat the process for any other quadrangles that you need to process, then select ‘Close’ to dismiss the dialog.

Now drag the new layers in the left-hand column of the screen to a position below the 'subregions' layer but above the 'land use' and 'public lands' layers.

Right-click each of the new layers, and pull ‘Properties’ from the context menu.
Setting transparency
Go to the ‘Transparency’ tab, and change ‘Global Transparency’ to 50%.

Turn 'Render' back on in the status bar, and admire your shaded-relief map.
Shaded relief

Next installment: Adding some point features.


Mapping New York - more polygons

Last time, we got the map to a pretty decent topographic road map. (In a future installment, I may discuss curating the OpenStreetMap data; it has recurrent errors that come from the Census Bureau TIGER files. But that's a side issue.) The next thing that I want to get into place is some polygon data: "who owns this land? What's it used for? Does the public have right of access? Where are the landmark buildings?

The relevant part of the first set of questions can mostly be answered by a database of publicly-owned lands. This database is getting into territory where the public databases aren't quite up to snuff. Within the Adirondack and Catskill Blue Line, the data are readily available from NYSGIS. The usual drill of using ‘ogr2ogr’ loads them into PostGIS:

ogr2ogr -f PostgreSQL -overwrite -t_srs EPSG:32618 \
"PG:dbname=gis" DEC_Lands.shp \
-nln nys_dec_lands -nlt MULTIPOLYGON -lco PRECISION=NO

Once again, the ‘-t_srs’ option is there to reproject the data into the projection that I intend to use for the finished map, and the ‘-lco PRECISION=NO’ works around a bug that causes a failure in inserting some of the numeric data.

Outside the Blue Line, the data come from a different place: the New York State Office of Cybersecurity. (I'd be intrigued to know why they became the custodian of the data.) In any case, they have a collection of files available on the NYSGIS web site.

Rather than putting each of these files into a separate table in the database, and hence needing a separate layer to show them, I decided to integrate them into a single table, and add a column to the data representing which data set a given row came from. For this, I decided to resort to scripting. Pulling out my handy-dandy Tcl interpreter, I ran the following loadall.tcl script:

set firsttime true
# Find all the shapefiles in the working directory
foreach file [glob *.shp] {

# Extract the base name of each file, and create an 'ogr2ogr' command to load it
set base [file rootname [file tail $file]]
set cmd [list ogr2ogr -f "PostgreSQL"]
if {$firsttime} {
set firsttime false
lappend cmd -overwrite
} else {
lappend cmd -append
# Add a 'data_source' column to identify which file we loaded
lappend cmd -sql "SELECT *, '$base' AS data_source FROM $base" \
-t_srs EPSG:32618 \
-skipfailures \
"PG:dbname=gis" \
$file \
-nln nys_public_land_boundaries \
# Report on the console which file we're processing, and load it
puts $cmd
exec {*}$cmd >@stdout 2>@stderr

With these boundaries, what I mostly care about is “recreational” (go ahead and access) versus “nonrecreational” (permission needed, or special land use such as prisons and schools). (And I also want to treat the ‘AdirondackCatskill’ file specially, because that's the Blue Line, rather than reflecting public ownership.

Both of these layers need some styling. Rather than walk through that whole process, I have QML files attached at the end of the post.

Dealing with OpenStreetMap polygon data is rather more complicated, because it's got so many different things in the same file. I therefore made several different layers, with SQL queries to extract specific features.

(1) The first layer, I just left with the name, new_york_osm_polygon. This layer really represents "here are polygons that I don't know what to do with, yet." It is stacked behind everything else, and I usually leave it unchecked unless I'm actively working on styling for polygons. Its query looks like:

admin_level IS NULL
AND ("boundary" IS NULL OR "boundary" NOT IN ('national_park'))
AND ("waterway" IS NULL OR "waterway" IN ('boatyard','dam','dock','rapids','waterfall'))
AND ("natural" IS NULL OR "natural" NOT IN ('bay', 'marsh', 'pond', 'swamp', 'water','waterway','wetland'))

This query excludes:

  • Anything with an ‘admin_level’ attribute: these are administrative regions (states, counties, cities, towns, etc.)

  • National park boundaries. I have these more accurately in the NYS Public Lands file.

  • All waterways, other than man-made assets on the water.

  • All wetlands, I've taken care of those already.

I style this letter either in bright yellow or icky purple, just to call attention to the unclassified features.

(2) The next layer is the layer where I describe land use. Its SQL query looks like:

"landuse" IS NOT NULL
OR ("leisure" IS NOT NULL AND "leisure" NOT IN ('ice_rink','pitch', 'track', 'tennis_court'))
OR "aeroway" IS NOT NULL
OR (amenity IN ('school', 'college', 'university', 'hospital') AND building IS NULL)

which translates to:

  • Land use polygons

  • Polygons marked 'leisure', except for a handful that usually appear inside parks and want to be rendered at a higher level.

  • Polygons marked 'aeroway'.

  • Polygons marked 'school', 'college', 'university', or 'hospital', except for buildings (these allow highlighting of campuses).

This set may need to be considered a work in progress; I expect these rules will need to be tweaked depending on the theme of the map.

For this layer, I created a fairly complex style with rules that fill the polygons in different colors according to land use. The QML is attached.

(3) Next up are the 'public lands' and 'DEC lands' layers, which I already discussed.

(4) Next, I have a few more types of region from OpenStreetMap. I call the layer OSM Subregion, and its query looks like:

(amenity IN ('parking')
OR leisure IN ('Dog Run', 'pitch' ,'tennis court','track'))
OR aeroway IN ('apron')
AND building IS NULL

so that it includes parking lots, dog runs, playing fields, tennis courts, racetracks, and airport aprons. What these types of object have in common is that they are usually layered atop another object (a shopping mall, industrial facility, park, airport, etc.), and so they look better rendered in an upper layer. Once again, I've attached a QML file to style them. This layer goes above the 'public lands' layers, but below the contour lines.

(5) Finally, there are polygons for a few man-made features:

building IS NOT NULL
OR leisure IN ('pool','swimmin_pool','swimming_pool','wading_pool')

that is to say, buildings and swimming pools. This layer goes very high in the stack - above the roads. It provides footprints for these structures. Once again, I've attached the QML that styles it.

Wow, that's a lot of layers from one data set. But with them all in place, our map now has quite a lot of detail.
Added polygon data to the basemap
Next time, we'll make the map prettier, by adding shaded relief.



Saturday, January 14, 2012

Back to the street map

Last time we left off with a map that was getting pretty good at showing physical features. It displays waterways (rivers, streams, lakes), topography (with contour lines) and makes a first attempt at land cover (at least showing wetlands). Now, it's time to move back to the man-made features.

The first thing that nearly everyone wants to see on a map are the linear features: streets, roads, railroads, canals, tracks and so on. We've already loaded a wealth of information about these from Open Street Map into the PostGIS database. But we still need to show the stuff with appropriate styling. This styling winds up getting done with progressively more complex sets of rules, distinguishing one type of line from another: people like to see, for instance, a specific style of line for “hard surfaced secondary road” or “abandoned railroad.” Coming up with these rules is rather tedious, and I won't make you read through an entire posting of what the rules are. Rather, I'll just refer to QML files that contain the rules and link to them at the bottom of the posting.

The first thing we're going to have to do with the lines in the Open Street Map data is to get rid of the ones we don't want. Administrative boundaries, we'll skip for now. (I'm still working on getting those styled well.) We've already mostly done the waterways: we want to do some additional work to put in dams, locks, rapids and so on, but we want to exclude the lakes and streams. And 'routes' - such as numbered highways - duplicate the roads. We'll handle them separately.

So we need to select ‘new_york_osm_line’ (or whatever name we used to import the line graphics), and pull ‘Query’ from the 'Layer' menu. We enter the SQL query:

upper(geometrytype("way")) IN ('LINESTRING','MULTILINESTRING')
AND ("waterway" IS NULL
OR "waterway" IN ('breakwater', 'dam', 'dock', 'flume', 'lock_gate',
'rapids', 'waterfall', 'weir'))
AND ("boundary" IS NULL)
AND ("route" IS NULL)

(Read the query as: “We want only line graphics, that aren't waterways, boundaries, or routes, except that breakwaters, dams, ..., weirs are acceptable waterways.) Now we have a manageable set of lines to work with. After considerable fiddling, we come out with a set of rule-based styles to render them:
Styling OSM Roads
We also want to show highway numbers. While showing them on shields is promised functionality for QGIS, it isn't in the development main line yet. So for right now, we just use ordinary labels. We create a new layer against the same OpenStreenMap line layer, changing the query this time to:

"highway" IS NOT NULL AND "ref" IS NOT NULL

(which selects numbered highways). We render nothing for this layer: the easiest way to do this is to select “Simple Line” as the rendering style, and then select “No Pen” as the pen style. We use the “Labeling” dialog (from the “Layer” menu) to put in the labels, giving them a red foreground and a white buffer. When QGIS gets highway shields, we'll be able to change this one place and show proper shields instead.
Added OSM roads (and other line features)

Not too bad! The QML for the street style sheet is attached at the end of the post.

Next time, we'll start putting in some land use data, and add some building footprints.



Tuesday, January 10, 2012

Making maps in New York - marshlands

Last time, I discussed adding surface water features to the map under construction. This time, I want to continue on the hydrography theme and add the marsh lands.

It turns that not only the marsh lands themselves, but even the process of identifying the marsh lands, leads the unwary into a swamp. The issue is that the determination of what is and is not a wetland is extremely contentious. In New York State, several agencies are involved. Among them are the NYS Department of Environmental Conservation, the US Fish and Wildlife Service, the US Army Corps of Engineers, the Environmental Protection Administration, and even the New York City Department of Environmental Protection (the protected watershed that supplies New York with its drinking water extends for well over a hundred miles away from the city itself). All of these agencies have different ideas of what is or is not a wetland, and they are tasked with the enforcement of different wetlands-protection laws, which they do with a heavy hand. In fact, just the other day a case where the owner of a suburban lot was financially ruined by an EPA wetlands determination was argued in the Supreme Court.

The end result is that there are multiple sources, often conflicting, of data on wetlands. That said, all I really care about is how likely my boots are to stay dry on a walk, not what agencies have to bless building permits or inspect properties for illegal alterations to drainage.

If I were to use only a single source, it would be the Fish and Wildlife Service National Wetlands Inventory. It lists all types of wetland (salt marsh, emergent and wooded freshwater marshes, mangrove forest [not in New York!], submerged marsh, and so on), and it's "one-stop shopping." But it's by no means complete. In places where I have "boots on the ground" knowledge, there are some awfully soggy spots that aren't in the inventory.

A more comprehensive data set for New York, which identifies more wetlands but misses some that the USFWS covers, is available from the NYS Department of Environmental Conservation. In most of the state, it comes in data sets that are one to the county. For Essex and Hamilton counties in the north of the state, the wetlands inventory is maintained by the Adirondack Park Agency. In these two counties, the data are in two sets of maps - in quarters of USGS 7.5 minute quadrangles, so be prepared for a fair number of downloads. One covers the Oswegatchie-Black, Upper Hudson and Saint Regis watersheds, and the other covers the remainder of the Adirondack Park.

The Fish and Wildlife Service data had an issue when I tried to reproject it: it's on the Albers equal-area conic projection, referenced to NAD83, for the continental US. And that's a projection that doesn't have an EPSG number. (It's also not all that well-suited to small-scale maps, although it would make a nice wall map of the Lower 48.) So I had to add it. In QGIS, this is done in the 'Settings->Custom CRS' dialog. The parameters that need to be entered are:

+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs

Custom coordinate reference system

Then 'ogr2ogr' can load the polygons and lines into PostGIS. We need a lot of options, because (a) we want to reproject the data onto a more sensible projection (I intend the map to use NAD83, UTM Zone 18N), (c) we have to override the layer type to MULTIPOLYGON (ogr2ogr incorrectly guesses POLYGON), and (d) we have to give the PRECISION=NO option to layer creation to work around a bug.

ogr2ogr -f "PostgreSQL" -overwrite -skipfailures -t_srs EPSG:32618 -progress \
"PG:dbname=gis" NY_shapefile_wetlands/CONUS_wet_poly.shp \
-nln fws_nys_wetlands -nlt MULTIPOLYGON -lco PRECISION=NO

Go get lunch, this one will take a long time.

The NYS DEC files are equally problematic. They're in the ARC/GIS E00 export format, which was never all that widely supported. Quantum GIS and 'ogr2ogr' both purport to be able to import them, but generate imports with corrupted polygons that will not display. I was finally able to work around this problem with another open source product called AVCE00 - which includes a program called 'avcimport' that will convert E00 files into the ARC/GIS binary coverage files, which work correctly. (Whew!)

So I did a little bit of shell scripting to run all the AVCE00 imports:

for f in `find . -name \*.e0? -print` ; do \
~/avce00-2.0.0/avcimport $f `basename $f .e00`-imp ; \

And then for the counties of interest, I ran them into PostGIS. I should probably have scripted this part, too. You have to do arc and polygon layers separately:

ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 "PG:dbname=gis" 001fwa-imp/ \
-nln nys_wetlands_arc arc
ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 "PG:dbname=gis" 001fwa-imp/ \
-nln nys_wetlands_pal pal
ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 -append "PG:dbname=gis" 057fwa-imp/ \
-nln nys_wetlands_arc arc
ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 -append "PG:dbname=gis" 057fwa-imp/ \
-nln nys_wetlands_pal pal

With this in place, displaying polygons for the wetlands was easy. But now I wanted them to be styled prettily. I went and downloaded the National Park Service map symbols in SVG, and used Inkscape to extract the 'marsh' symbol. (SVG file attached at the end.) I changed the fill type on both the wetlands layers to 'SVG fill' and provided the file name as a custom fill pattern.
Custom SVG fill for marshlands

(Or you can use my QML file so you don't have to do this yourself. It's attached at the bottom of the post.)

Styled wetlands

So now we have marshlands appearing in the map, with the standard symbol. Looks nice. Next time - on to roads, railroads, trails, runways, and other linear features.

marsh.svg - Fill pattern for marshlands
NYS-wetlands.qml - QML to define the wetlands style


Monday, January 9, 2012

Making maps in New York - surface water

We left off last time with a map that has the street grid (well, sort of, it still needs to be styled!) and the contour lines (so that if I'm looking at a street or trail, I can tell how steep it is).

The next key piece of information that I need for a walking map is whether I'll get my feet wet: where is the water on the surface? So, let's look at some data sources for hydrography.

The hydrologic features are, we already saw, in the digital raster quadrangles, but (I repeat yet again) we don't want to use them, because they don't scale, and because they're cut off at the margins. So let's get vector data instead.

Fortunately, for New York, surface water features are "one-stop shopping." New York State's Office of Cybersecurity has two shapefiles, one with line features and one with area features, that cover the entire state and are suitable for rendering at 1:24000.

After downloading and unpacking them, I put them in the PostGIS database and spatially indexed them:
$ /usr/lib/postgresql/9.1/bin/shp2pgsql -I -c AreaHydrography.shp nys_area_hydrography \
| psql -d gis > /dev/null
$ /usr/lib/postgresql/9.1/bin/shp2pgsql -I -c LinearHydrography.shp nys_linear_hydrography \
| psql -d gis > /dev/null

I wound up loading two layers for the linear features. One of them had the query condition:
name IS NOT NULL AND name != 'Stream'

while the other had:
name IS NULL OR name = 'Stream'

The reason for this was so that I could assign labels only to those streams not named 'Stream': that name appears on more than half the streams in the database! [Post hoc: I had to do the same thing for area features named 'Lake' and 'Pond', for the same reason.]

I also did some styled lines for canals, intermittent streams, and so on. I'll put the QML files at the end of this post.
Added hydrography to basemap

So now there is open water in my map. I still want to do some more hydrography, because my feet can get just as wet in mud!


Styles for linear waterways: NYS-linear-waterways.qml
Styles for area waterways: NYS-areal-waterways.qml


Saturday, January 7, 2012

Making maps in New York - hydrography

We left off last time with a map that has the street grid (well, sort of, it still needs to be styled!) and the contour lines (so that if I'm looking at a street or trail, I can tell how steep it is).

The next key piece of information that I need for a walking map is whether I'll get my feet wet: where is the water on the surface? So, let's look at some data sources for hydrography.

The hydrologic features are, we already saw, in the digital raster quadrangles, but (I repeat yet again) we don't want to use them, because they don't scale, and because they're cut off at the margins. So let's get vector data instead.

Fortunately, for New York, surface water features are "one-stop shopping." New York State's Office of Cybersecurity has two shapefiles, one with line features and one with area features, that cover the entire state and are suitable for rendering at 1:24000.

After downloading and unpacking them, I put them in the PostGIS database and spatially indexed them:

$ /usr/lib/postgresql/9.1/bin/shp2pgsql -I -c AreaHydrography.shp nys_area_hydrography \
| psql -d gis > /dev/null
$ /usr/lib/postgresql/9.1/bin/shp2pgsql -I -c LinearHydrography.shp nys_linear_hydrography \
| psql -d gis > /dev/null

I wound up loading two layers for the linear features. One of them had the query condition:

name IS NOT NULL AND name != 'Stream'

while the other had:

name IS NULL OR name = 'Stream'

The reason for this was so that I could assign labels only to those streams not named 'Stream': that name appears on more than half the streams in the database!

I also did some styled lines for canals, intermittent streams, and so on. I'll put the QML files at the end of this post.

So now there is open water in my map. I still want to do some more hydrography, because my feet can get just as wet in mud! Swamps are important to indicate, too! Since the area that I'm immediately interested in mapping has no salt water, and since it's outside the Adirondack Park, the New York State Department of Environmental Conservation's list of protected freshwater wetlands should give me quite a reasonable starting point.

Those files are available by county. Downloading my county and several of its neighbours, I tried a quick import into QGIS and into PostGIS. For the E00 file format that was used, both imports failed silently, leaving a set of malformed polygons for the swamp areas. Fortunately, I was able to find a workaround. I used the program 'avcimport', part of the AVCE00 package, to convert the .E00 files into ARC/GIS binary coverages, and then used 'ogr2ogr' to import them:

$ for f in `find . -name \*.e0? -print` ; do ~/avce00-2.0.0/avcimport $f `basename $f .e00`-imp ; done
$ ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 "PG:dbname=gis" 001fwa-imp/ -nln nys_wetlands_arc arc
$ ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 "PG:dbname=gis" 001fwa-imp/ -nln nys_wetlands_pal pal
$ ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 -append "PG:dbname=gis" 057fwa-imp/ -nln nys_wetlands_arc arc
$ ogr2ogr -f "PostgreSQL" -s_srs EPSG:26918 -append "PG:dbname=gis" 057fwa-imp/ -nln nys_wetlands_pal pal
... continue for several more counties ...

Yet another task that I should have scripted!

Anyway, adding the resulting 'nys_wetlands_pal' and 'nys_wetlands_arc' in QGIS and doing a bit more styling gives quite an attractive result. I extracted an SVG of a swamp fill pattern and used that as the fill pattern for the polygons.

So now there is a lot of hydrography on the basemap. Next step is to start adding some cultural and administrative layers.


Making maps in New York - topography

When we left off, we had a map project that had OpenStreetMap data in a PostGIS database, If the symbols were properly styled, having this data set alone could produce rather a pretty map. But it's far from ideal as a walking map - for one thing, I want to know when I'm walking whether I should expect to be going up and down steep hills. Of course, the customary way for a map to present this information about topography is to mark up the map with a series of contour lines - lines joining points at the same elevation.

We saw before that we could get contours from the 'topo' layer of the NYS DOT quadrangles, but we also saw that doing so had problems, because (1) the quads appear to be trimmed off at the margins, and (2) the resulting images scale poorly because they are raster images. So let's attempt a different data source.

The US Geological Survey publishes a Digital Elevation Model (DEM) of the United States, meshing it into a fine grid (the point spacing is 1/3 of a second of arc) with the elevation reported at each point. The CUGIR server makes the data available by quadrangle. So we begin by downloading the model for the quadrangles of interest, and unpacking the resulting files. (They wind up placing a ZIP within a ZIP, so have to be unpacked twice.) The result is that we have a bunch of files with the extension '.dem' that have the required data.

If the Python-GDAL tools are installed, then QGIS can take these data and produce contour lines in vector format. These lines will scale gracefully with changes in map scale. The way to get them is to open up a fresh QGIS project (or it's all right temporarily to add layers to an existing one), and load in a DEM file as a raster layer. When it's loaded, right-click on it in the 'Layers' display and choose 'Set Project CRS from Layer' (if you're working in an existing project, it will have to be set back afterward.) This will change the co-ordinate reference of the project to EPSG:26718 (NAD27 Universal Transverse Mercator Zone 18N), which we need because the DEM files use the old 1927 North American Datum coordinate system.

Once you've loaded the layer and adjusted the coordinate frame, choose 'Zoom to Layer' from the view menu. You'll see a gray rectangle that shows the part of the ground that the model covers.
You can color this area according to elevation, if you like, by right-clicking on the layer, choosing 'Properties' from the menu, selecting the 'Style' tab and changing the 'Color Map' to
Selecting pseudocolor

Now you'll see the same area in false color: reds are high elevations, and blues are low.
Pseudocolor elevations

Since I'm an American, I tend to prefer maps with contours in feet rather than the more sensible metres. The DEM files have metric elevation, so to produce foot contours, we need to rescale them. Choose 'Raster Calculator' from the 'Raster' menu, and carry out the steps in the picture:
Changing elevations to feet

  • Enter the name of a GeoTIFF file to be created that will contain the elevation data in feet

  • Click 'Current Layer Extent' to make the new file have the same dimensions as the old

  • Set the output format to 'GeoTIFF' and select 'Add result to project' so that you can preview the conversion

  • Use the left-hand list to choose the layer to convert, and complete the expression (layer / 12.0 / 0.0254), which converts feet to metres

  • Click 'OK'

  • You'll get another gray rectangle that you can convert to pseudocolor to determine that the metres-to-feet conversion was successful.

    Now you're ready to produce the actual contour lines. Carry out this task by pulling 'Raster->Extract->Contour' from the menu bar. You'll get a dialog box:
    Making files with the contour lines

  • Select the feet-based raster that you just created as the input file

  • Choose the location of a new 'shp' file (together with the other files that go with it) as the output file. The file must not exist, but the containing directory must exist.

  • Enter an appropriate contour interval. For the rolling hills of upstate New York, at scales 1:24000 and finer, 20 feet is pretty good.

  • Check 'Load into canvas when finished' to be able to review the work

  • Click 'OK.' Don't close the window yet.

  • You should now see contour lines overlaid on top of the color image - uncheck both image layers in the list to make them more visible. If they look good, then we will also want index contours - highlighting of every fifth contour line to make them easier to read. Go back to the contour dialog, change both '20's to '100's, and click 'OK' again. You'll get a second set of contours overlaying the first.

    These steps have to be done for every DEM quadrangle that you download. Given the title of this blog, and my penchant for scripting languages, I really ought to script this workflow. But I haven't done it yet (partly because I'm not positive that I have it completely right).

    Once you have all the contours, it's advantageous to put them in the database. (If you have a large number of contours that don't appear in a particular view, QGIS can use database queries to filter them, and render much faster.) For this, we go back to the command line, and do:

    For the first file:

    $ /usr/lib/postgresql/9.1/bin/shp2pgsql -c r46elu_100ft.shp kbk_100ft_contours \
    | psql -d gis > /dev/null

    (The -c asks to create a database table afresh, and the remaining arguments are the first .shp file and the name of the table.)

    For each subsequent file except the last:

    $ /usr/lib/postgresql/9.1/bin/shp2pgsql -a r46elu_100ft.shp kbk_100ft_contours \
    | psql -d gis > /dev/null

    (The -a asks to append to an existing table in the database.)

    And for the last file:

    $ /usr/lib/postgresql/9.1/bin/shp2pgsql -a -I r46elu_100ft.shp kbk_100ft_contours \
    | psql -d gis > /dev/null

    (The -I option asks to create a spatial index over the layer after loading it.)

    These steps must be carried out for both 20- and 100-foot contours. Once again, I ought to script it. But I haven't done that yet.

    So now, start with a clean slate again in QGIS, and do 'Layer->Add PostGIS layer', selecting the two layers we just built. Remember to use EPSG:26718 as the layer coordinate reference system! These contour lines will be the ones we'll have on the finished map, so we might as well style them prettily. Bring up the 'Properties' on the 20 foot contours, change the color to brown and the line width to 0.18 mm (1/2 point). Then bring up the 'Properties' on the 100 foot contours, and change the color to brown and the line width to 0.35 mm (1 point). Finally, select the 100 foot contour layer, and pull 'Layer->Labeling'. Settings that I find attractive are:

    • Check 'Label this Layer' and select the 'elev' field

    • Choose the font 'Liberation Sans' 8-point italic, in the same shade of brown as the contours.

    • Use a 1-mm buffer

    • In the 'Advanced' tab: Placement parallel, on the line, at a distance of zero, oriented to the line. Medium priority. Suppress labeling for features smaller than 10 mm. Features don't act as obstacles.

    • In the 'Engine Settings' dialog, 'Popmusic Tabu' seems to give a trifle better placement for the contour labels than the default 'Chain'.

    So now we have a serviceable set of contour lines.
    Contour lines

    Don't throw out the original raster images, we may be reusing them to do hill shading later.
    In the meantime, we'll move onward to hydrography in the next installment.


    Making maps in New York - the street map

    To put together any sort of map that tells a story, you need two kinds of mapping data. The first is a basemap: a general picture of the land, and the second is the thematic data: the stuff you add to the basemap to tell your story. I'm not quite ready to tell the story of Niskayuna's nature preserves yet, but at least I know where they are, and I've found sources for basemap data. Let's have a look at some workflows for putting the basemap together.

    Andy Arthur, in his tutorial on QGIS, gives one approach, which starts from the 7.5-minute quadrangle maps published by the New York State Department of Transportation. While these maps are aligned to the same grid as the better-known USGS topographic maps, they are not the same maps; they are somewhat more up to date than the USGS series, and unlike the scanned USGS maps, they are available in four individual layers, one per ink color. They're available from the New York State GIS Clearinghouse - one of the two major GIS repositories for New York. (The other is the Cornell University Geospatial Information Repository [CUGIR].)

    The maps in question consist of one sheet for each of the nearly one thousand quadrangles that make up New York. Each sheet has four separate GeoTIFF files, one per ink color. On the traditional New York State topos that I used when I worked for a county government in the 1970s, the printing layers were:

    • Plan (black) - Overprinted features in black. These are a mix of everything: cultural features (roads and buildings), political features (boundaries and jurisdiction names), place names, and even some hydrology (linear features like streams, and patterned features like marshes).

    • Hydrology (blue) - Polygons giving surface water cover - rivers and lakes.

    • Topography (brown) - Contour lines giving elevations.

    • Built-up areas (yellow on NYS maps, pink on USGS ones) - A shaded overprint marking the area where habitation is so dense that individual buildings are not shown in the plan.

    Andy Arthur's tutorial on QGIS gives quite a comprehensive description for dealing with these layers in QGIS, so I won't repeat it here.

    Combining these four layers for the Niskayuna quadrangle gives a reasonably attractive map:
    Screenshot of the New York State basemap
    It's certainly a technique that I'll keep in my back pocket. But it's not the way that I've decided to go, for several reasons:

    • The maps are raster graphics, which means that they won't scale gracefully. They will pick up pixel artifacts when printed at any other scale than 1:24000, 508 pixels to the inch. (Why 508? Not sure!)

    • The maps are old data: the one that I show in the screenshot was produced in 1980. A lot has changed in 32 years.

    • The maps as scanned are missing stuff along the margins of the region. If I take two quads and lay them together, they don't quite join. It appears that the scans have been cropped a bit too aggressively.

    Gaps between quadrangles

    The USGS digital raster graphics do not have the problem of failing to join up, but they are, for the most part, even older. (Also, they have the problem that they are in color and can't be separated readily.) The newer USTOPO maps are not really intended for GIS use (and don't play all that well with GIS software): it's better to go back to the GIS sources that make them up, which are for the most part publicly available.

    The first thing that maps usually need to show is context: "where is this map relative to the street where I live?" is the first question to answer. So most maps that I produce will need a road map. Most road maps in the US eventually draw on the TIGER maps maintained by the Census Bureau. One public source that draws heavily on TIGER, but then crowdsources additional data, is the OpenStreetMap project. It's become quite good; if all you need is a street map, it's probably serviceable as a basemap. In any case, it should serve well as a starting point for an overlay of cultural features.

    The data in GIS form for just the state of New York are available from They are encoded in an OpenStreetMap-specific format, but QGIS (at least with plugins) can deal with it.

    But there's an issue: The OSM data set is big. The New York data alone are about three gigabytes after decompression. QGIS becomes unacceptably slow and unstable when trying to work with data of this size. Here, PostGIS (which I mentioned in the last post) comes to the rescue. It allows the map data to be saved in a PostgreSQL database, and allows QGIS to load only the data it needs to render a particular map.

    So, after installing all the tools that PostGIS needs, we follow the instructions to create a spatially enabled database. Next, we use the osm2pgsql utility to load up the OpenStreetMap data into the database from the command line:

    osm2pgsql -S /usr/share/osm2pgsql/ -E 32618 -d gis -k -s -p new_york_osm new_york.osm

    Go to lunch. it takes a while.

    OK, once it's done, "Add PostGIS Layer" will allow selection from the database and adding the layer to QGIS. It brings up a dialog that looks as shown below, and I select all the layers that I just put in using 'osm2pgsql', and add them.
    Screenshot: Adding PostGIS layers

    After it chugs for a little while, it shows me the graphics.
    Right after loading OSM data into QGIS

    Yes, they look like hell. We'll get around to that. The point is that the streets and boundaries are there. But right now, we need more data. The map I want is more than a road map.
    The next post will discuss topography.


    Making maps from open-access data in New York - Software

    Once in a while, I have the urge to draw a map of something. For instance, I wanted in my last post to include a map of the trails we took, to show where some of the interesting bits were. Because I do the odd map from time to time, I've been looking into the availability of software and data for doing them. It used to be that map-making was either done laboriously by hand on a light table, or else involved shelling out massive sums of money for software, for data and for Geographic Information Systems (GIS) consultants. But the world has improved; here I'm making notes about what I'm finding out, partly as an aid to myself because I don't do it often enough to remember where all the pieces are.

    I'm a well-known cheapskate: I hate paying a lot for software and data that I'm going to use only intermittently. But I also do respect copyright. In particular, I'm surely not going to scan in a NY/NJ Trail Conference map: producing their maps is a lot of work, and selling them is one of the few sources of revenue that the Conference has to maintain the trails (short of actual donations).

    Moreover, I'm working on a few things for which good maps simply don't exist. One example is that my community has several rather nice nature preserves. One in particular, the former Schenectady Museum Nature Preserve, transferred to New York State a few years back. It's now an unnamed and unsigned parcel of parkland, that the state really doesn't quite know what to do with. But the footpaths in it still exist and are regularly used - and a few people are even maintaining them.

    But the available maps of the preserve don't quite agree with one another, and they vary in readability and completeness. The three that I've found on line are here, here and here. I don't think any of them is entirely satisfactory as it is. To begin with, the only one that actually shows the land boundary and the blaze colors fails to show any context and lacks contour lines. Surely I can do better.

    So, let's see what's out there. I got inspired by stumbling upon a post by Andy Arthur introducing Quantum GIS for making trail maps. His end result was reasonably attractive, and the program looked useful, so I decided to give it a shot.

    Like many things in the open-source world, the program's distribution is a bit chaotic. There are a lot of pieces with "some assembly required," at least under Debian and related systems such as Ubuntu. Material that I know was needed included QGIS itself, GDAL and QGIS's GDAL Python tools, PostGIS (we'll come to why I wanted that in a later post), which in turn requires the Postgres 'contrib' repository and a package called 'hstore', and Postgres itself, and a number of other minor bits. Many of these are components that are run-time loaded into QGIS, and need compatible versions of client libraries (libtiff 3 versus libtiff 4, PostgreSQL 8.4 vs 9.1, and so on). I managed to get something patched together just before giving up and trying to build the whole shebang from source. We'll see whether the next run of the update manager brings everything crashing down around me.

    Once everything installs, the program looks pretty good. The UI is professional, and the functionality all seems to be there. Now for getting some maps! Fortunately, in New York, data availability is quite good. In the next post, I start trolling for data.