Mapping Telemetry Observations

Mapping in R

There are a number of workflows within R for making maps based on geospatial data. Here, we focus almost exclusively on mapping within the ggplot2 package ecosystem. But, it is certainly worth the effort to learn about the other options including tmap, mapview, and leaflet.

Choosing a Projection

It’s important that your spatial coordinates be projected. This is an important step when presenting your data as a map because the choice of projection can lead to unintentional perception bias and cause confusion. Just as importantly, geographic data that is represented in latitude and longitude coordinates cannot be used for animal movement modeling. Ideally, you will already have a good understanding of your study area and will have determined your target CRS. In this situation, you can use the sf::st_transform() function to transform your coordinates into this new projection.

If you are unsure of what CRS might be a good choice for your data, the crsuggest package can provide some insight. We’ll use this package to find a sensible projection for our Alaska bearded seal data. To help narrow the options, we can specify that we want to keep the corresponding geographic coordinate system as WGS 84 by passing the 4326 EPSG code to the gcs argument. We can also specify that we want our units to be in meters.

library(sf)
library(crsuggest)

top_crs <- suggest_crs(akbs_locs,
            gcs = 4326,
            units = "m")

top_crs %>% 
  dplyr::select(crs_code, crs_name, crs_gcs, crs_units)

# A tibble: 10 × 4
   crs_code crs_name                              crs_gcs crs_units
   <chr>    <chr>                                   <dbl> <chr>    
 1 6124     WGS 84 / EPSG Arctic zone 4-12           4326 m        
 2 32602    WGS 84 / UTM zone 2N                     4326 m        
 3 32601    WGS 84 / UTM zone 1N                     4326 m        
 4 32603    WGS 84 / UTM zone 3N                     4326 m        
 5 32604    WGS 84 / UTM zone 4N                     4326 m        
 6 32605    WGS 84 / UTM zone 5N                     4326 m        
 7 32606    WGS 84 / UTM zone 6N                     4326 m        
 8 32607    WGS 84 / UTM zone 7N                     4326 m        
 9 5926     WGS 84 / EPSG Arctic Regional zone B1    4326 m        
10 5931     WGS 84 / EPSG Arctic Regional zone C1    4326 m

The first CRS suggested is the WGS 84 / EPSG Arctic zone 4-12. So, let’s go ahead and transform our data and see what this projection looks like on a map with our data. As mentioned above, we’ll use the sf::st_transform() function to do the transformation. To give ourselves some context, we’ll load simple world basemap data from the rnaturalearth package.

library(ggplot2)
library(rnaturalearth)
library(rnaturalearthdata)

world <- ne_countries(scale = "medium",returnclass = "sf") %>% 
  sf::st_make_valid()
crs_code <- as.integer(top_crs$crs_code[1])
 
filt_locs <- akbs_locs %>% sf::st_transform(crs_code)
world <- world %>% sf::st_transform(crs_code) 

map <- ggplot() + geom_sf(data = world) +
  geom_sf(data = filt_locs, size = 0.5, shape = 19) + 
  coord_sf(
    xlim = sf::st_bbox(filt_locs)[c(1,3)],
    ylim = sf::st_bbox(filt_locs)[c(2,4)]) +
  theme_minimal()

map

Basic map of locations from bio-loggers deployed on six bearded seals in northwest Alaska

This looks pretty good and, now, we can proceed with making a more complete set of maps for our observations. The focus is still more about understanding the nature and distribution of our data than making perfect maps. But, it is reasonable to expect these products might be useful within future publications or presentations.

Map Observations with `ggplot2`

There are three methods for visualizing our observation data:

points on a map, colored by deploy_id
lines on a map, colored by deploy_id
grid of point density

Points on a Map

For this, we can start with the figure we made above. But, we’ll want to be more deliberate about grouping, color choices, and labels. For color choices, we’re going to choose palettes from the colorspace package. Also, we’ll take the opportunity to reduce the length of our deploy_id values.

library(colorspace)
library(stringr)

filt_locs <- filt_locs %>% 
  dplyr::mutate(deploy_id = stringr::str_sub(deploy_id, 1,11))

map <- ggplot() + geom_sf(data = world) +
  geom_sf(data = filt_locs, aes(color = deploy_id),
          size = 0.35, shape = 19) + 
  coord_sf(
    xlim = sf::st_bbox(filt_locs)[c(1,3)],
    ylim = sf::st_bbox(filt_locs)[c(2,4)]) +
  scale_color_discrete_qualitative(
    palette = "Dark 3",
    name = "deployment") +
  labs(title = "Observed Locations of 6 Bearded Seals in NW Alaska",
        subtitle = paste0("some data were censored based on a",
                          "speed, distance, and angle filter")) +
  theme_minimal()

map

Map of locations from bio-loggers deployed on six bearded seals in northwest Alaska

This is pretty good, but it would be nice if we could interact with the data and zoom in on particular regions. The mapview package provides an easy means for creating interactive maps of your spatial data. The one caveat to mapview is that it is based on the Web Mercator projection (a la Google Maps, etc) and there are some extra steps required if your tracks cross the dateline at 180 (which ours do … yay! fun!)

What we need to do is transform our data back to geographic and, then, convert the coordinates from -180:180 to 0:360. We will use a custom written function to handle this for us. The st_to_360() function is not needed for data sets that do not cross the 180 meridian.

st_to_360 <- function(g) {
  coords <- (sf::st_geometry(g) + c(360,90)) %% c(360) - c(0,90)
  g <- sf::st_set_geometry(g,coords) %>% sf::st_set_crs(4326)
  return(g)
}

We will use the ESRI Ocean Basemap for our background layer and the same qualitative color palette we used above. The mapview package is based on the web mapping library, leaflet. Because of that, these interactive maps are only available when rendering to HTML or other HTML capable viewers.

library(mapview)

sf::st_transform(filt_locs,4326) %>% st_to_360() %>% 
mapview::mapview(map.types = "Esri.OceanBasemap",
                 zcol = "deploy_id",
                 layer.name = "Deployment ID",
                 col.regions = qualitative_hcl(palette = "Dark 3", n =6),
                 cex = 1.5, lwd = 0)

Interactive map of locations from bio-loggers deployed on six bearded seals in northwest Alaska

Lines on a Map

In actuality, the point observations from our deployments represent an underlying movement path for each animal. And, later, we’ll work to create a movement model that estimates this path. But, for now, we can approximate this by simply connecting the points into an ordered path. The sf package can represent a range of spatial data types, including LINESTRING. Since we have multiple lines within our data set, we’ll create a MULTILINESTRING. Creating a spatial line from a set of spatial points isn’t, initially, intuitive. We need to ensure key attributes are retained during the conversion. First, we need to identify our ‘grouping’ variable. In this case, as with our points, the deploy_id identifies our groups. Within each group, our points need to be properly ordered. The date column provides this order for us. Lastly, when we combine the individual points into a line, all of the point level attributes (e.g. date, location type, error) will be lost. If there are key attributes that should be retained, then those need to either be included as grouping variables or stored in a table that can be, later, joined with the lines data. We are going to keep things simple and not worry about line attributes beyond deploy_id.

filt_lines <- filt_locs %>% 
  dplyr::group_by(deploy_id) %>% 
  dplyr::arrange(date) %>% 
  dplyr::summarise(do_union = FALSE) %>% 
  sf::st_cast("MULTILINESTRING")

We can use much of the same code as before and just replace filt_locs with filt_lines.

library(colorspace)
library(stringr)

map <- ggplot() + geom_sf(data = world) +
  geom_sf(data = filt_lines, aes(color = deploy_id),
          lwd = 0.5) + 
  coord_sf(
    xlim = sf::st_bbox(filt_lines)[c(1,3)],
    ylim = sf::st_bbox(filt_lines)[c(2,4)]) +
  scale_color_discrete_qualitative(
    palette = "Dark 3",
    name = "deployment") +
  labs(title = "Connected Movement Paths of 6 Bearded Seals in NW Alaska",
        subtitle = paste0("some data were censored based on a",
                          "speed, distance, and angle filter")) +
  theme_minimal()

map

Map of connected movement paths from bio-loggers deployed on six bearded seals in northwest Alaska

And, just as before, we can create an interactive map with mapview, this time, showing our connected movement paths. Note that, before, we relied on the col.regions parameter to specify colors for our points. For lines, we specify the color palette with color. Also, note, the lwd was increased to 1.5 and the cex specification was removed.

library(mapview)

sf::st_transform(filt_lines,4326) %>% st_to_360() %>% 
mapview::mapview(map.types = "Esri.OceanBasemap",
                 zcol = "deploy_id",
                 layer.name = "Deployment ID",
                 color = qualitative_hcl(palette = "Dark 3", n =6),
                 lwd = 1.5)

Interactive map of connected movement paths from bio-loggers deployed on six bearded seals in northwest Alaska

Point Density Map

Over-plotting occurs when multiple points are plotted on top of one another because of their close proximity. This can make it difficult fully understand the distribution of points. To address this, we can create a uniformspatial grid and summarize the number of points within each grid cell. This density grid should provide a more representative visualization of our data.

The first step in this process is to create a spatial grid tesselation. We need to decide on a cell resolution and the shape of our grid cells. There are no specific rules regarding grid cell resolution. Although, the memory and computational time required to process grid data increases quickly as changes in cell dimensions lead to non-linear increases in the number of cells. A spatial grid can be any regular tesselation of the study area. We have three possible regular tesselations: equilateral triangles, squares, and regular hexagons. Of those, squares are the most common but hexagons have some key benefits.

Why Hexagons

The following explanation comes from Matt Strimas-Mackey at https://strimas.com/post/hexagonal-grids/

Regular hexagons are the closest shape to a circle that can be used for the regular tessellation of a plane and they have additional symmetries compared to squares. These properties give rise to the following benefits.

Reduced edge effects: a hexagonal grid gives the lowest perimeter to area ratio of any regular tessellation of the plane. In practice, this means that edge effects are minimized when working with hexagonal grids. This is essentially the same reason beehives are built from hexagonal honeycomb: it is the arrangement that minimizes the amount of material used to create a lattice of cells with a given volume.
All neighbours are identical: square grids have two classes of neighbours, those in the cardinal directions that share an edge and those in diagonal directions that share a vertex. In contrast, a hexagonal grid cell has six identical neighbouring cells, each sharing one of the six equal length sides. Furthermore, the distance between centroids is the same for all neighbours.
Better fit to curved surfaces: when dealing with large areas, where the curvature of the earth becomes important, hexagons are better able to fit this curvature than squares. This is why soccer balls are constructed of hexagonal panels.

Creating our grid within R is pretty straightforward using the sf::st_make_grid() function. We’ll set our cellsize to 50,000 meters (25 km) and, then, create an index for each of our cells. The sf::st_join() function will intersect the hexagonal grid with our locations so we can count the number of locations within each cell. One thing we need to decide is whether to aggregated the counts across all observations or keep the grouped approach and determine separate counts for each deploy_id. Since we are still interested in understanding the distribution and nature of our data as best as possible, it seems sensible to keep the count separated by deploy_id. However, with more than 6 deployments, this could become untenable to plot and fully explore.

hexgrid <- sf::st_make_grid(st_bbox(filt_locs) %>% st_as_sfc(), 
                            cellsize = 50*1000,
                            what = "polygons", square = FALSE) 
hexgrid <- sf::st_sf(index = 1:length(lengths(hexgrid)), hexgrid)

hexbin <- sf::st_join(filt_locs, hexgrid, join = st_intersects)

locs_count <- hexgrid %>%
  dplyr::left_join(
    dplyr::count(hexbin, index, deploy_id) %>%
      tibble::as_tibble() %>%
      dplyr::select(deploy_id, index, ct=n)
  ) %>% tidyr::drop_na()

library(scico)
map <- ggplot() + geom_sf(data = world) +
  geom_sf(data = locs_count, 
          size = 0.125,
          aes(fill = ct)) +
  coord_sf(
    xlim = sf::st_bbox(filt_lines)[c(1,3)],
    ylim = sf::st_bbox(filt_lines)[c(2,4)]) +
  scale_fill_scico(palette = 'imola',
                    trans = "log10",
    name = "number of locations",
    guide = guide_colorbar(title.position = "bottom", barwidth = 15, 
                           barheight = 0.5, title.hjust = 0.5)) + 
  facet_wrap(~ deploy_id) +
  theme(legend.position = "bottom") +
  ggtitle("Spatial distribution of bearded seal bio-logger locations") +
  labs(caption = bquote("One hexagonal cell represents 50"~km^2))
  
map

Maps showing the spatial distribution of location observations from bio-loggers deployed on six bearded seals in northwest Alaska