Now that we’ve covered vectors and rasters in R, it’s time to discuss how we can make some nice plots and maps of these data. The distributed network of R packages has continued to grow over the years and now offers many easy options for mapping. In addition, base R graphics and those provided by the powerful ggplot2 package can also be used to plot spatial data. We’ll discuss how we can leverage generic plotting functions in R and more specific packages to help us explore spatial data in more detail.
By the end of this lesson, you should be able to answer these questions:
Let’s first make sure we have an R Markdown file setup and the correct packages loaded.
1) In the project you created for this workshop, open a fresh R Markdown file from the File menu. Name it “Basic mapping” and save it in your project.
2) Remove all the template content below the YAML header. Add some descriptive text using Markdown below the YAML header that briefly describes the lesson plan (e.g., “This lessons covers base graphics, ggplot, and other R packages for mapping spatial data.”)
3) Below this text, add a code chunk and write some script to load the following packages: ggplot2
, maps
, sf
, ggmap
, ggrepel
, plotly
, mapview
, micromap
, tmap
. It should look like this inside your code chunk:
library(maps)
library(sf)
library(raster)
library(tmap)
library(micromap)
library(ggrepel)
library(ggmap)
4) When you’re done, compile the document by pressing the knit button at the top. Did it work?
Mapping spatial data falls under the broad category of data vizualization. The same motivation for creating a simple scatterplot can apply to creating a map. The overall goal is to develop insight into your data by visualizing patterns that cannot be seen in tabular format. Of course the added complication with spatial data is the inclusion of a location. How you handle this spatial component is up to you and depends on whether location is relevant for the plot or map.
Spatial data by definition always include a reference location for each observation. More often than not, additional variables not related to space may be collected for each observation. For example, multiple measurements of water quality taken at different locations in a lake can be indexed by latitude, longitude, and whatever water quality data were taken at a sample site. The only piece of information that makes the water quality data spatial is the location. Whatever your data look like, you have to choose if space is an important variable to consider given your question. For mapping spatial data, consider the following:
The answer to these questions can help you decide what type of visualization is important for the data. On the other hand, mapping the data can also give you answers to these questions. We’ll explore different mapping approaches in this lesson that will help us address these questions.
Let’s start by taking the simplest type of spatial data and creating the simplest type of map. We’ll use the same dataset when we first started the workshop.
cities <- c('Ashland', 'Corvallis', 'Bend', 'Portland', 'Newport')
longitude <- c(-122.699, -123.275, -121.313, -122.670, -124.054)
latitude <- c(42.189, 44.57, 44.061, 45.523, 44.652)
population <- c(20062, 50297, 61362, 537557, 9603)
dat <- data.frame(cities, longitude, latitude, population)
plot(latitude ~ longitude, data = dat)
Neat, we’ve created a map in cartesian space. As we’ve learned, we need to convert our data into a spatial object with the correct coordinate system. This will let us correctly view the georeferenced data. Let’s convert our data frame to an sf
object.
dat_sf <- st_as_sf(dat, coords = c("longitude", "latitude"))
st_crs(dat_sf) <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
head(dat_sf)
## Simple feature collection with 5 features and 2 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -124.054 ymin: 42.189 xmax: -121.313 ymax: 45.523
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## cities population geometry
## 1 Ashland 20062 POINT (-122.699 42.189)
## 2 Corvallis 50297 POINT (-123.275 44.57)
## 3 Bend 61362 POINT (-121.313 44.061)
## 4 Portland 537557 POINT (-122.67 45.523)
## 5 Newport 9603 POINT (-124.054 44.652)
Now we can just plot the locations of our sf
object by telling R to only plot the geometry attribute.
plot(st_geometry(dat_sf))
Although it’s a bland plot, the points are now georeferenced. We can use some base maps from the maps
package to add some context for the locations (note the add = T
argument).
map('county', region = 'oregon')
plot(st_geometry(dat_sf), add = T)
There really isn’t anything too interesting about the locations of these cities, so maybe we want to overlay an additional attribute from our dataset. Remember that we have the population of each city as well. We can change the point types and scale the size accordingly.
map('county', region = 'oregon')
plot(st_geometry(dat_sf), add = T, pch = 16, cex = sqrt(dat_sf$population * 0.0002))
This is kind of tedious. We can alternatively make the same plot using ggplot2 to map some of the data attributes to the plot aesthetics. We’ll take advantage of the geom_sf
function to plot the sf
object.
First, we have to have all of our map objects in the correct format for sf
. Our cities dataset is already an sf
object, so all we have to do is convert our base state map to an sf
object.
state <- st_as_sf(map('county', region = 'oregon', plot = F, fill = T))
Before we proceed, we need to make sure we have the develoment version of ggplot2. This version includes the geom_sf
function, which is not included with the stable release on CRAN. Run the following code if you haven’t done this already. The devtools package is needed to install development packages from GitHub.
install.packages("devtools")
library(devtools)
install_github("ggplot2")
library(ggplot2)
Now we make the plot with the geom_sf
function from ggplot2 for both sf
objects.
ggplot() +
geom_sf(data = state) +
geom_sf(data = dat_sf)
We can also map population to size and colour.
ggplot(dat_sf) +
geom_sf(data = state) +
geom_sf(aes(size = population, colour = population))
There are a few things we might want to modify with this plot, so let’s assign it to a variable in our workspace that we can easily modify (using the handy last_plot()
function from ggplot2).
p <- last_plot()
Now let’s make the size gradients more noticeable, remove the weird size legend, and change the color ramp.
p <- p +
scale_size(range = c(5, 15), guide = F) +
scale_colour_distiller(type = 'div', palette = 9)
p
We can also add labels using geom_text
. Let’s plot the city names over the points.
p + geom_text(aes(x = longitude, y = latitude, label = cities))
If we want to prevent the labels from overlapping the points, we can use the geom_text_repel
function from the ggrepel package. You’ll have to play around with the values for point.padding
to get the distances you like. The segment.alpha
value will also remove the lines joining the points to the labels.
p + geom_text_repel(aes(x = longitude, y = latitude, label = cities), point.padding = grid::unit(0.5, 'lines'), segment.alpha = 0)
As an alternative basemap, we can leverage the ggmap
package to download a static image of our plot region. This often works well for plotting points where the mapping unit of interest is relatively simple (i.e., not a line or polygon). It’s also useful for arbitrary areas of interest that aren’t easily captured by geopolitical boundaries.
To use ggmap
, you have to first download the image with the get_map
function based on the extent of the plotting area. This requires pulling the extent from our existing data object that we want to plot. Then, we can choose from different basemaps, such as Google basemaps, terrain maps, and satellite images. For this example we’ll download a satellite image of our area. One thing to keep in mind is the zoom option. This is used to specify the level of detail in your downloaded image and can take quite some to download depending on the area and level of specificity. We’ll follow these steps to download the basemap:
dat_sf
object using st_bbox
and unname
to remove the name attributes (ggmap will fuss otherwise)get_map
function in the ggmap package with the extent, pick the right zoom
level and specify maptype = "satellite"
ggmap
function# get the extent
dat_ext <- unname(st_bbox(dat_sf))
# get the base map using the extent
bsmap <- get_map(location = dat_ext, maptype = 'satellite', zoom = 7)
# plot the basemape
ggmap(bsmap)
Now that we’ve got the base map in order, we can simply add the locations of our cities with the geom_sf
function from the ggplot2
package as before. The ggmap
function returns a ggplot2
object so we can just add geoms and other ggplot options as we normally would. We’ll have to tell ggplot to ignore the aesthetics from ggmap for the plot to work correctly (inherit.aes = F
).
ggmap(bsmap) +
geom_sf(data = dat_sf, inherit.aes = F, color = 'red')
We can also map the same aesthetics in our dataset as before to plot size and colour for city population and create some nice labels for the city name.
ggmap(bsmap) +
geom_sf(data = dat_sf, inherit.aes = F, aes(size = population, colour= population)) +
geom_text_repel(data = dat_sf, aes(x = longitude, y = latitude, label = cities), point.padding = grid::unit(0.5, 'lines'), segment.alpha = 0, colour = 'white')
Changing some of the plot options for the aesthetics can also be done as before.
ggmap(bsmap) +
geom_sf(data = dat_sf, inherit.aes = F, aes(size = population, colour= population)) +
geom_text_repel(data = dat_sf, aes(x = longitude, y = latitude, label = cities), point.padding = grid::unit(0.5, 'lines'), segment.alpha = 0, colour = 'white') +
scale_size(range = c(5, 15), guide = F) +
scale_colour_distiller(type = 'div', palette = 9)
This might seem incredibly tedious but most of the coding has to do with formatting ggplot2. The spatial components are relatively simple and should come naturally if you are already familiar with using R.
In this exercise we’ll recreate the steps we learned in this module but on a different dataset. The meuse
dataset comes with the sp
package and describes the locations and topsoil heavy metal concentrations collected in a flood plain of the river Meuse, Netherlands. The data are in raw format so you’ll have to import and convert it to an sf
object with the correct reference system. Once we have the object created, you can map it with geom_sf
and link any of the other dataset variables to additional plot aesthetics. We’ll also want to pick an appropriate basemap for the region using ggmap
. This is a long exercise so feel free to use the code in the cheat box below.
Create a new code chunk in your R markdown file and load the sp
, sf
, maps
, and gglot2
libraries.
Import the meuse dataset from the sp
package. Just load the sp
library as you normally would and run data(meuse)
.
What is the structure of the data (Hint: class(meuse)
or str(meuse)
)? Which columns contain the sample coordinates?
Convert the meuse data frame to an sf
object using the st_as_sf
function. The help file tells you what required arguments are needed for converting a data frame with st_as_sf
(run ?st_as_sf
to open). Specifically, you’ll need to specify the coords
and crs
argument (hint: coords = c("x, y"), crs = "+init=epsg:28992")
. Don’t worry about which coordinate system to use, this info in the help file for meuse (?meuse
). The complete code should look something like this: dat <- st_as_sf(meuse, coords = c('x', 'y'), crs = "+init=epsg:28992")
.
The native coordinate system for meuse is projected. We’ll want to convert the sf
object to a geographic coordinate system as a requirement for ggmap
. You can use the st_transform
and the argument crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
. Don’t forget to assign the transformed object to a new variable in your environment (e.g., newdat <- st_transform(...)
).
Now get an appropriate basemap for the region from ggmap
. Remember, you have to first get the extent from the sf
object (hint: unname(st_bbox(dat))
where dat
is your transformed meuse data). Then download the map using the extent with get_map
using maptype = "satelitte"
and zoom = 13
. Assign the basemap to an object.
Plot the basemap with ggmap
and then add the meuse data with geom_sf
. Don’t forget to use the argument inherit.aes = F
in geom_sf
. It should look something like this: ggmap(bsmap) + geom_sf(data = dat, inherit.aes = F)
if your base map is called bsmap
and your projected meuse data is dat
.
When you’re satisified, try mapping some of the attributes in the meuse dataset to some of the ggplot aesthetics, e.g., geom_sf(data = dat, inherit.aes = F, aes(colour = cadmium))