Deployment Summary Tables
Understanding Our Source Data
Now that we’ve set up a data structure for efficient access, imported the source data into R, and converted that data into a spatial data set, it’s time to explore and see what we have to work with. This is an important step because it lets you recognize problems or inconsistencies in the data that need further investigation.
Summary tables are a good way of splitting large data into components of interest and learning how our data might be distributed across those components. One example might be a simple calculation of the number of location observations within each month across the individual animals. This might identify anomalies such as locations in months prior to deployment or missing data when it was expected.
We’ll first want to group our location records by deployment and month.

library(sf)
library(lubridate)
library(dplyr)

# Count location records per deployment and calendar month
dat <- akbs_locs %>%
  sf::st_drop_geometry() %>%
  mutate(month = lubridate::month(date)) %>%
  group_by(deploy_id, month) %>%
  count(name = "num_locs")

To create a sensible table, let’s focus on a single deploy_id, EB2009_3000_06A1346.

library(knitr)

# Show the monthly counts for a single deployment
dat %>%
  dplyr::filter(deploy_id == "EB2009_3000_06A1346") %>%
  dplyr::arrange(month) %>%
  knitr::kable()

deploy_id | month | num_locs |
---|---|---|
EB2009_3000_06A1346 | 1 | 307 |
EB2009_3000_06A1346 | 2 | 446 |
EB2009_3000_06A1346 | 3 | 567 |
EB2009_3000_06A1346 | 4 | 153 |
EB2009_3000_06A1346 | 6 | 183 |
EB2009_3000_06A1346 | 7 | 778 |
EB2009_3000_06A1346 | 8 | 1007 |
EB2009_3000_06A1346 | 9 | 764 |
EB2009_3000_06A1346 | 10 | 748 |
EB2009_3000_06A1346 | 11 | 753 |
EB2009_3000_06A1346 | 12 | 476 |
One oddity that immediately stands out is the lack of locations in May. This is expected, however: these deployments started in June, and this one stopped transmitting in April of the following year, which matches expectations for battery life.
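A gap like the one in May is easy to overlook because months with no locations simply have no row in the table. As a minimal sketch, the example below uses `tidyr::complete()` to add a zero-count row for every missing calendar month. The `one_deploy`, `all_months`, and `empty_months` names are illustrative and not part of the original workflow; the counts are copied from the table above.

```r
library(dplyr)
library(tidyr)

# Monthly counts for one deployment, copied from the table above
one_deploy <- tibble(
  deploy_id = "EB2009_3000_06A1346",
  month     = c(1:4, 6:12),
  num_locs  = c(307L, 446L, 567L, 153L, 183L, 778L,
                1007L, 764L, 748L, 753L, 476L)
)

# Add a row for every calendar month; months without data get a zero count
all_months <- one_deploy %>%
  complete(deploy_id, month = 1:12, fill = list(num_locs = 0L))

# Months with no locations at all
empty_months <- all_months %>%
  filter(num_locs == 0L) %>%
  pull(month)

empty_months
```

If the same idea is applied to the full `dat` object, an `ungroup()` before `complete()` is probably needed, since `dat` is still grouped by `deploy_id` and `month` and `complete()` works within groups.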
If we look at another deployment, EB2009_3001_06A1332, we can see that this deployment ended in March.

dat %>%
  dplyr::filter(deploy_id == "EB2009_3001_06A1332") %>%
  dplyr::arrange(month) %>%
  knitr::kable()

deploy_id | month | num_locs |
---|---|---|
EB2009_3001_06A1332 | 1 | 476 |
EB2009_3001_06A1332 | 2 | 458 |
EB2009_3001_06A1332 | 3 | 63 |
EB2009_3001_06A1332 | 6 | 119 |
EB2009_3001_06A1332 | 7 | 735 |
EB2009_3001_06A1332 | 8 | 641 |
EB2009_3001_06A1332 | 9 | 792 |
EB2009_3001_06A1332 | 10 | 789 |
EB2009_3001_06A1332 | 11 | 581 |
EB2009_3001_06A1332 | 12 | 677 |
Lastly, deployment EB2011_3001_10A0552 appears to have stopped transmitting in January of the following year, which is earlier than either of the deployments above (by two to three months, going by the last month with locations).

dat %>%
  dplyr::filter(deploy_id == "EB2011_3001_10A0552") %>%
  dplyr::arrange(month) %>%
  knitr::kable()

deploy_id | month | num_locs |
---|---|---|
EB2011_3001_10A0552 | 1 | 675 |
EB2011_3001_10A0552 | 6 | 482 |
EB2011_3001_10A0552 | 7 | 993 |
EB2011_3001_10A0552 | 8 | 1062 |
EB2011_3001_10A0552 | 9 | 1134 |
EB2011_3001_10A0552 | 10 | 904 |
EB2011_3001_10A0552 | 11 | 896 |
EB2011_3001_10A0552 | 12 | 1019 |
This is not so unusual, but such an anomaly is worth investigating further to ensure there were no issues with the data processing or other important deployment metadata.
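Because the tables are sorted by calendar month, January through April appear before June even though they fall at the end of each deployment. The sketch below assumes every deployment starts in June, as described above, and reorders months relative to June to report the last month with locations for each of the three deployments. The `months_seen` and `last_month` objects and the `months_since_june` column are illustrative names; the month lists are copied from the three tables.

```r
library(dplyr)

# Months with at least one location, copied from the three tables above
months_seen <- bind_rows(
  tibble(deploy_id = "EB2009_3000_06A1346", month = c(1:4, 6:12)),
  tibble(deploy_id = "EB2009_3001_06A1332", month = c(1:3, 6:12)),
  tibble(deploy_id = "EB2011_3001_10A0552", month = c(1, 6:12))
)

last_month <- months_seen %>%
  # Assumption: deployments start in June, so June = 0, July = 1, ..., May = 11
  mutate(months_since_june = (month - 6) %% 12) %>%
  group_by(deploy_id) %>%
  # Keep the latest month in deployment order for each deployment
  slice_max(months_since_june, n = 1) %>%
  ungroup() %>%
  transmute(deploy_id, last_month = month.name[month], months_since_june)

last_month
```

With the real data, working from the full `date` column rather than the month alone would avoid the June assumption and would also separate deployments from different years.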