CityLab staff lesson
Wednesday, Sept. 26, 2018
Written by David H. Montgomery based on a presentation by Alexandra Kanik.
Not all datasets need to be mapped, but some do! This mapping class is perfect for beginners looking to learn the basics of visualizing geographic data. We'll go over how to find publicly available data, convert addresses to map points, join datasets and use the open source mapping software QGIS.
Is there a geographical story?
Would a table or chart tell the story just as well?
Am I just mapping where people live?
Analyze geographic data
Create static maps for display
Generate files for interactive map (CARTO or leaflet.js)
Alexandra Kanik tries to keep this mapping resources doc up to date. It could use some reworking and some additional resources, though, so feel free to contribute!
Census shapefiles
DATA.GOV
National Historical Geographic Information System
State government websites (such as the Minnesota Geospatial Commons)
College and university research centers (Penn State has this one for Pennsylvania)
IRE Census data
National States Geographic Information Council
Census places - Census places are defined as incorporated places, usually cities, towns, villages, or boroughs. You can read up more on that here.
Your local government should have shapefiles and other data that relate to your local region. Give them a call or visit their website. Here's the GIS office for the City of Pittsburgh.
Peter Aldhous's Refine geocoder - this is nice because it uses geocoders that are, in most circumstances, legal to use. It also geocodes your data using two geocoders so you can compare the results. And it's free! Rate limits may apply.
Census geocoder - Free but I've found it to be a little less accurate. Rate limit: 1,000 addresses at a time.
Here's a more complete list of geocoding tools and APIs
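The Census geocoder's limit of 1,000 addresses at a time means a longer address list has to be split into batches first. Here's a minimal sketch of that splitting step in Python, assuming the addresses are already in a list. The function name `chunk_addresses` and the placeholder addresses are made up for illustration, and the actual upload to a geocoder is not shown.

```python
def chunk_addresses(addresses, batch_size=1000):
    """Split a list of addresses into batches no larger than batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [addresses[i:i + batch_size] for i in range(0, len(addresses), batch_size)]


# Hypothetical input: 2,500 placeholder addresses.
addresses = [f"{n} Example St, Pittsburgh, PA" for n in range(1, 2501)]
batches = chunk_addresses(addresses)

for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} addresses")
```

Each batch can then be geocoded separately and the results stitched back together.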
CSV
id, city, state, population
001, Pittsburgh, PA, 310885
002, Columbus, OH, 822092
TSV
id	city	state	population
001	Pittsburgh	PA	310885
002	Columbus	OH	822092
PSV
id|city|state|population
001|Pittsburgh|PA|310885
002|Columbus|OH|822092
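The three formats hold the same table and differ only in the character that separates the columns. Here's a rough sketch in Python, using the standard `csv` module, that reads all three samples above. The helper name `read_delimited` is made up for illustration.

```python
import csv
import io

CSV_TEXT = "id, city, state, population\n001, Pittsburgh, PA, 310885\n002, Columbus, OH, 822092\n"
TSV_TEXT = "id\tcity\tstate\tpopulation\n001\tPittsburgh\tPA\t310885\n002\tColumbus\tOH\t822092\n"
PSV_TEXT = "id|city|state|population\n001|Pittsburgh|PA|310885\n002|Columbus|OH|822092\n"


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    # skipinitialspace drops the space that follows each comma in the CSV sample.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter, skipinitialspace=True)
    return list(reader)


csv_rows = read_delimited(CSV_TEXT, ",")
tsv_rows = read_delimited(TSV_TEXT, "\t")
psv_rows = read_delimited(PSV_TEXT, "|")

print(csv_rows[0])
print(csv_rows == tsv_rows == psv_rows)
```

Note that every value comes back as text, so an id like 001 keeps its leading zeros. That matters later if you join a spreadsheet to a shapefile on an id column.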
When you save a QGIS map file, it will save as a .qgs. This file is a configuration file. It basically just references the files that you load into QGIS. IT DOES NOT HOLD ANY OF YOUR DATA. This means if you move the shapefiles you load into your map after you save it, your map will break.
So establish a working directory for each map you work on. Put your shapefiles and your map file in there and don't move them around.
QGIS isn't perfect, and neither are you. Sometimes you come up against errors or behavior that you can't explain. Then your only recourse is to shut 'er down and reopen the program.
And sometimes, that doesn't work either. So you need to scrap your map file.
Now, if all of your work depended on that one .qgs file, that would suck hardcore. But since your shapefiles and data are separate files, you're totally fine.
So the earth is round, right? Turns out it's not the easiest thing to make something round and three-dimensional look flat. But to hell if we don't try.
That's basically what projections are, flat representations of our round world. Lots of people have taken a stab at making the best projection, but not all projections are created equal.
Depending on the scope and span of your data, you may want to stick to a more local, granular projection. But if you're trying to show the whole world things are going to get a little wonky at some point.
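To put a number on "wonky," here's a minimal sketch that assumes the spherical Mercator projection (the one most web maps use) as the example. It isn't part of the QGIS workflow in this lesson; it only shows how much that one projection stretches distances as you move away from the equator. The latitudes are approximate and the function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical (web) Mercator


def mercator_xy(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_stretch(lat_deg):
    """How many times longer a short distance appears on the map at this latitude."""
    return 1 / math.cos(math.radians(lat_deg))


for name, lat in [("Equator", 0.0), ("Washington, D.C.", 38.9), ("Reykjavik", 64.1)]:
    print(f"{name}: stretched {mercator_stretch(lat):.2f}x")
```

The stretch grows quickly toward the poles, which is why a projection that looks fine for one city can badly distort a map of the whole world.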
* I've collapsed this section because while it's very important to understand projections, they are also amazingly confusing and might serve only to confuse the novice map maker. Just know to return to this topic once you feel more comfortable with QGIS and displaying and analyzing geographic data in general.
In this demo, we're going to be creating a map looking at questions of transportation and equity in Washington, D.C.
Start by opening up QGIS and clicking Project --> New.
This is a blank map.
A working directory is a place on your computer where you house all of your map shapefiles. Once a shapefile is added to the working directory, it should not be moved around because that can break the file path and therefore your map.
In the example above, map-lesson is the working directory.
This is what you'll see if you move layers and map files around
We're going to be focusing on VECTOR LAYERS and DELIMITED TEXT LAYERS. These are the layers you will use most often.
Washington_DC_Boundary --> Washington_DC_Boundary.shp
Click Simple fill for more options!
Potomac_poly --> Potomac_poly.shp
Style it to look a little more like a river.
dc_poverty --> dc_poverty.shp
The dc_poverty layer we added is different from Washington_DC_Boundary and Potomac_poly: it actually contains data. In this case, it has Census estimates on poverty for every tract in Washington, D.C.
Select the dc_poverty layer and click the icon in the toolbar to Open Attribute Table. (You can also do this by right-clicking/control-clicking the layer name on the left.) You should see this:
We want to make this data show up on the map. To do this, we need to double-click the layer, like we did when we styled it before.
But this time, click the dropdown menu at the top, where it says "Single symbol." Here we've got a number of options. Three are important:
Click "Graduated." There's a dropdown menu labeled "Column" where we can select the column of data we want to determine the appearance. If you click it, you should have two options: population and poverty. Click poverty. Then, lower down, click "Classify."
After clicking "OK", you should see something like this:
You can customize this further on the properties page. For example, you can increase the number of classes you want to divide the data into. You can also choose the way QGIS divides up the data.
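Two common ways of dividing data into classes are equal intervals (each class spans the same range of values) and quantiles (each class holds roughly the same number of features). Here's a rough sketch of both in Python, assuming made-up poverty counts for ten tracts. It illustrates the idea and is not QGIS's own implementation; the function names are hypothetical.

```python
def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes classes that each span the same value range."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    return [low + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper bounds of n_classes classes that each hold about the same number of values."""
    ordered = sorted(values)
    breaks = []
    for i in range(1, n_classes + 1):
        # Index of the last value that belongs in class i.
        last = round(i * len(ordered) / n_classes) - 1
        breaks.append(ordered[last])
    return breaks


# Hypothetical poverty counts for ten tracts, with one large outlier.
poverty_counts = [40, 55, 60, 75, 90, 120, 150, 210, 300, 1000]

print(equal_interval_breaks(poverty_counts, 5))
print(quantile_breaks(poverty_counts, 5))
```

With the outlier in this sample, equal intervals put eight of the ten tracts in the lowest class, while quantiles put two tracts in every class. The same data can make for two very different-looking maps.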
There's something not right about our map. The poverty column we selected is just the Census's estimate of the number of people below the poverty line in each tract. But what we really want is the percentage of people in poverty, so we're not just mapping population. This is actually really easy to do.
Go back to the styling window and find the "Column" dropdown. Click it, and take note that our two columns are poverty and population. Click away from the dropdown, then click directly on the text where it says poverty. Now just type freely in the window so it says poverty/population! Click "Classify", and then "OK".
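The reason for dividing is easiest to see with numbers. Here's a minimal sketch in Python, assuming three made-up tracts (the names and figures are hypothetical, not taken from the dc_poverty layer): the tract with the most people in poverty is not the tract with the highest poverty rate.

```python
# Hypothetical tracts: (name, people below the poverty line, total population)
tracts = [
    ("Tract A", 600, 6000),
    ("Tract B", 450, 1500),
    ("Tract C", 200, 4000),
]


def poverty_rate(poverty, population):
    """Share of the population in poverty, or None if the tract has no residents."""
    if population == 0:
        return None  # avoid dividing by zero for unpopulated tracts
    return poverty / population


rates = {name: poverty_rate(poverty, population) for name, poverty, population in tracts}

most_people = max(tracts, key=lambda t: t[1])[0]
highest_rate = max(rates, key=lambda name: rates[name])

print("Most people in poverty:", most_people)
print("Highest poverty rate:", highest_rate)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

Mapping the raw count would highlight Tract A simply because more people live there. Mapping the rate highlights Tract B.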
Add one last layer: Bicycle_Lanes --> Bicycle_Lanes.shp. This is a map of every bike lane in D.C. Experiment with styling it.
Looking at this over the map of DC poverty, a pattern becomes pretty obvious — there are way fewer bike lanes in the poorer parts of D.C.
What if we wanted to quantify that? QGIS can do more than just edit the appearance of maps. It also has really robust analytical capabilities. We'll experiment with just one of those.
Click the Vector menu, then choose Analysis Tools --> Sum Line Lengths.
Here we can choose a Line layer and a Polygon layer. What we're going to do here is calculate the length of a line layer inside each element of a polygon layer. Or, in other words, we're going to find out how many miles of bike lanes are in each Census tract!
Make sure that you choose Bicycle_Lanes under "Lines" and dc_poverty under "Polygons". Then click "Run in Background", and "Close".
A new layer has been added to our map, called Line length. Look at its Attribute Table. You can see it's the same as the dc_poverty attribute table, but with two extra columns. COUNT is the number of bike lanes in each tract. LENGTH is the total length of bike lanes. So we can see that "Census Tract 1" has a bike lane length of 422.58842. You might ask: 422.58842 what? Good question! QGIS can work with a variety of units, but the most common are degrees of latitude and longitude, and meters. Thankfully, this is in meters.
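If you'd rather talk about miles of bike lanes, the meter values need converting. Here's a minimal sketch in Python using the 422.58842 figure from the attribute table. The function and variable names are made up for illustration.

```python
METERS_PER_MILE = 1609.344  # international mile, exact by definition


def meters_to_miles(meters):
    """Convert a length in meters to miles."""
    return meters / METERS_PER_MILE


tract_1_length_m = 422.58842  # LENGTH value for "Census Tract 1" in the lesson
tract_1_length_mi = meters_to_miles(tract_1_length_m)

print(f"{tract_1_length_m:.1f} meters is about {tract_1_length_mi:.2f} miles")
```

So Census Tract 1 has roughly a quarter mile of bike lanes.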
Experiment with styling this new layer!
Not all data is best visualized on a map. In this case, our comparison of bike lane length and poverty rate seems tailor-made for a graph. How can we do that? Right-click/control-click the Line length layer and choose Export --> Save Features As. Here we could save this as a Shapefile, which is the default option under "Format". Instead, let's export this as a spreadsheet. Choose Comma Separated Value [CSV]. Below it, click the ... button next to "File Name" and save our new CSV in our working directory. Click OK!
Now we've saved that as a spreadsheet in our working directory for future analysis or graphing. (Note that QGIS also added this spreadsheet as a layer. QGIS can import spreadsheets! This is really helpful if you have a spreadsheet with geographic data that you want to join with a shapefile, but that's a lesson for another day.)
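As one example of that future analysis, here's a rough sketch in Python that reads such a CSV and measures how poverty rate and bike lane length move together. The column names poverty, population, COUNT and LENGTH come from the attribute table described earlier. The rows themselves are invented for illustration, and in practice you would open your exported file instead of the inline text.

```python
import csv
import io
import math

# Hypothetical stand-in for the exported CSV; the real file has one row per tract.
EXPORTED_CSV = """poverty,population,COUNT,LENGTH
120,4000,6,2400.0
300,3000,3,900.0
900,3000,1,150.0
50,2500,8,3100.0
"""


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rates, lengths = [], []
for row in csv.DictReader(io.StringIO(EXPORTED_CSV)):
    population = float(row["population"])
    if population == 0:
        continue  # skip tracts with no residents
    rates.append(float(row["poverty"]) / population)
    lengths.append(float(row["LENGTH"]))

r = pearson(rates, lengths)
print(f"Correlation between poverty rate and bike lane length: {r:.2f}")
```

A value near -1 would mean that tracts with higher poverty rates tend to have fewer meters of bike lane. A value near 0 would mean there's no clear linear relationship.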
So you've got a nice map here. What can you do with it?
There are ways to get from QGIS to an interactive map, but we're going to focus on something quicker: how to export a static image of your map.
There are two ways: one easy, and one more complex but powerful:
Go to qgis.org to download the mapping program yourself.
This is only the very beginning of what's possible with QGIS. There are innumerable tutorials online, and I can also provide help if you want to explore yourself!