The reduction in drilling since the latter half of 2014 due to the collapse in oil prices is clearly visible in the animation through both the histogram and the map itself. There is also a strong seasonal spike in drilling every winter. The seasonal spikes cause some challenges in comparing the true changes in the amount of drilling for different parts of the province as time goes on. I think an interesting next step will be to plot a heat map style plot of the year over year change in drilling and animate it as well. This is also just a short animation that maps only a tiny subset of Alberta’s total drilling history. Another interesting next step will be to make this code work more efficiently and to run it on a more powerful machine that can plot the entire drilling history of the province from 1883 to present.

The data and R code necessary to create this animation are detailed below for those who are interested.

Creating The Animation

We’re going to plot wells that have been drilled in Alberta as an animated map. The data comes from the Alberta Energy Regulator (AER) and requires the following supporting data files to re-create the full human friendly dataset:

The above files should be stored in a data/ folder within the same directory as this project.

The well list can then be reconstructed:

wells <- aertidywells::create_well_tibble()

The AER does not provide any precise location information for wells within the freely available well list so it is necessary to find and alternative means of placing them spatially. Luckily all wells in Alberta are issued a Unique Well Identifer (UWI) which consists of three main components:

XX/YY-YY-YYY-YYWY/ZZ

The X components identify in sequential order the wells drilled in a given surface location using the Dominion Land Survey (DLS) coordinate system, the Y components identify the location of the bottom of the well in DLS coordinates, and the Z components identify the the drilling event for the well indicated by the previous two components.

The Y components are the interesting and useful parts for this project. The breakdown of the DLS location is LSD-SEC-TWP-RNGWMER, defined as follows:

It is possible to use the Y components of the UWI to approximate the precide location of each well and convert these locations to latitude and longitude for plotting. A means of converting the DLS locations to an easier coordinate systems is required. Unfortunately the process of converting the grid based DLS coordinates to Latitude and Longitude can only be approximate because the Legal Subdivisions of the DLS location we are given typically measure 400m by 400m.

There are numerous tools to convert DLS locations to other coordinate systems, but most of them are proprietary systems that only allow one query at a time or charge money for bulk conversions. This is largely due to the hisotrically closed and proprietary licensing of the the Alberta Township System (ATS), Alberta’s precise version of the DLS system. Luckily the ATS coordinate files are now freely available from the Geospatial data vendor AltaLIS. The ATS V4.1 Coordinate files should be obtained from AltaLIS and extracted to /data/AltaLIS.

ats_col_widths <- readr::fwf_widths(c(1, 2,3,2,2,11,12,4,2,2,1,1,1), c("MERIDIAN", "RANGE", "TOWNSHIP", "SECTION", "QUARTER SECTION", "LATITUDE", "LONGITUDE", "YEAR COMPUTED", "MONTH COMPUTED", "DAY COMPUTED", "STATION CODE", "STATUS CODE", "HORIZONTAL CLASSIFICATION"))

ats_col_types <- list(readr::col_character(), # MERIDIAN
                      readr::col_character(), # RANGE
                      readr::col_character(), # TOWNSHIP
                      readr::col_character(), # SECTION
                      readr::col_character(), # QUARTER SECTION
                      readr::col_double(), # LATITUDE
                      readr::col_double(), # LONGITUDE
                      readr::col_integer(), # YEAR COMPUTED
                      readr::col_integer(), # MONTH COMPUTED
                      readr::col_integer(), # DAY COMPUTED
                      readr::col_character(), # STATION CODE
                      readr::col_character(), # STATUS CODE
                      readr::col_character()  # HORIZONTAL CLASSIFICATION
)

ats_location_data <- read_fwf("data/AltaLIS/ATS_V4_1.SEQ", ats_col_widths, col_types = ats_col_types)

The above data provided by AltaLIS is not accurate to the level of a single LSD. The finest resolution directly provided are the corners of the quarter sections. The ATS_Coordinate_File_Information.doc describes the precise positioning of these quarter sections. For our purposes we don’t need to be so accurate. If we plot our wells to the level of a section the wells will be placed accurately within 1.6km. That doesn’t seem particularly good, but the inaccuracies should be unnoticable at a provincial map scale.

A single latitude and Longitude can be obtained for each section by grouping the data by each component of the DLS location up to the section and finding the mean latitude and longitude of all the rows given for each section.

ats_location_data <- ats_location_data %>% group_by(MERIDIAN, RANGE, TOWNSHIP, SECTION) %>% 
  summarise(mean_latitude = mean(LATITUDE), mean_longitude = mean(LONGITUDE))

# The AltaLIS longitudes are all positive values which puts them in Siberia, Mongolia, and China. Multiplying by -1 fixes this.
ats_location_data <- ats_location_data %>% mutate(mean_longitude = -1 * mean_longitude)

The DLS location components of the UWI should also be extracted out into their own columns:

wells <- wells %>% dplyr::mutate(`SECTION` = stringr::str_sub(`UWI-DISPLAY-FORMAT`, 7, 8))
wells <- wells %>% dplyr::mutate(`TOWNSHIP` = stringr::str_sub(`UWI-DISPLAY-FORMAT`, 10, 12))
wells <- wells %>% dplyr::mutate(`RANGE` = stringr::str_sub(`UWI-DISPLAY-FORMAT`, 14, 15))
wells <- wells %>% dplyr::mutate(`MERIDIAN` = stringr::str_sub(`UWI-DISPLAY-FORMAT`, 17, 17))

Now the mean latitude and longitude can now be joined to the wells data set:

wells <- wells %>% left_join(ats_location_data %>% select(MERIDIAN, RANGE, TOWNSHIP, SECTION, mean_latitude, mean_longitude))
## Joining, by = c("SECTION", "TOWNSHIP", "RANGE", "MERIDIAN")

The base map is sourced from Google maps using ggmap:

ab <- get_map("Alberta", zoom = 5, maptype = "terrain")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=Alberta&zoom=5&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Alberta&sensor=false

And the wells should show up nicely on top of this base map:

map_boundaries <- attr(ab, "bb")

# The upper and lower boundaries are used to place the date label in the map
label_lon <- (map_boundaries$ur.lon + map_boundaries$ll.lon) / 2
label_lat <- map_boundaries$ur.lat - (map_boundaries$ur.lat - map_boundaries$ll.lat) / 20

date_label_data <- wells %>%
  summarise(
    lon = label_lon,
    lat = label_lat,
    label = "May 20, 1990"
  )

# Plotting a small subset of the wells since plotting all 592,966 of them will take a long time.
wells_May_20_1990 <- wells %>% filter(`FINAL-DRILL-DATE` == ymd("1990-05-20"))

# By creating the base map separate from the points plotted, we will be able to reuse it later in the animation loop.
well_map <- ggmap(ab) +
  theme(legend.position="none") + 
  ggtitle("Wells Drilled in Alberta Over Time") + 
  labs(
    x = "Longitude",
    y = "Latitude"
  )

well_points <- geom_point(data = wells_May_20_1990, aes(x = mean_longitude, y = mean_latitude, color = "red"))
date_label <- geom_text(aes(label = label), data = date_label_data, vjust = "top", hjust = "left")

well_map + well_points + date_label

It would also be nice to see where the drilling on this day falls in a historical context:

date_rect <- data.frame(xmin = ymd("1990-05-20"), xmax = ymd("1990-05-20"), ymin=-Inf, ymax=Inf)
date_range_geom <- geom_rect(data = date_rect, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax), 
                             color = "dodgerblue1", fill = "dodgerblue1", alpha = 0.4, inherit.aes = FALSE)

well_chart <- ggplot(data = wells) + 
  geom_histogram(aes(x = `FINAL-DRILL-DATE`), na.rm = TRUE, binwidth = 1) + 
  theme(legend.position="none") + 
  ggtitle(" ") + # The blank title ensures the top of the chart is flush with the map. 
  labs(
    x = "Date Drilled",
    y = "Count of Wells Drilled"
  )

well_chart + date_range_geom

And these two plots can be combined into one frame of the eventual animation:

# This ratio of the plots was discovered to balance their heights and widths nicely through experimentation.
layout_dims <- rbind(c(1,1,1,2,2),
                     c(1,1,1,2,2))
grid.arrange(well_map + well_points, well_chart + date_range_geom, layout_matrix = layout_dims)

Running one frame per day will result in a huge number of frames and a very long animation even at a high frame rate ike 60fps:

# days / frames per second / 60 seconds per minute
frames <- as.numeric(difftime(max(wells$`FINAL-DRILL-DATE`, na.rm = TRUE), min(wells$`FINAL-DRILL-DATE`, na.rm = TRUE), units = "days"))
frames / 60 / 60
## [1] 13.62722

An animated gif lasting more than 13 minutes is a bit excessive. Even as a video that is a very long time to spend watching dots dance around on a map. For this reason it’s best to start with a smaller subset of the data to test that a shorter animation will turn out as intended.

wells_short <- wells %>% filter(lubridate::year(`FINAL-DRILL-DATE`) >= 2010)

A simple loop can create the image sequence for the animation. Note that this function uses a system call to ffmpeg to create an mp4 video file of the animation. FFMPEG will need to be installed on your computer to properly render the animation. The precise call to ffmpeg may also need to be adjusted depending on your operating system. The system call below is verified to work on Ubuntu 16.04.

animate_well_points <- function(well_data, duration, frames_per_second = 12, fade_duration = 0.5) {
  
  frame_count <- duration * frames_per_second
  fade_frame_count <- fade_duration * frames_per_second

  # Period is the duration in days that each frame of the animation will span
  period <- (max(well_data$`FINAL-DRILL-DATE`, na.rm = T) - min(well_data$`FINAL-DRILL-DATE`, na.rm = T)) / frame_count
  start_date <- min(well_data$`FINAL-DRILL-DATE`, na.rm = T)

  # Frame is the first frame of the animaion in which each well should be shown
  well_data <- well_data %>% mutate(FRAME = floor((`FINAL-DRILL-DATE` - start_date) / as.double(period, units = 'days')))
  
  histogram_bins <- as.numeric(seq(from = start_date, to = start_date + (frame_count + 1) * period, by = period))
  
  # The chart can be created outside the loop and reused inside
  well_chart <- ggplot(data = well_data) + 
    geom_histogram(aes(x = `FINAL-DRILL-DATE`), na.rm = TRUE, breaks = histogram_bins) + 
    theme(legend.position="none") + 
    ggtitle(" ") + # The blank title ensures the top of the chart is flush with the map. 
    labs(
      x = "Date Drilled",
      y = "Count of Wells Drilled"
    )
  
  # Padding the file names with zeros ensures the frames are assembled into the gif in the right order.
  filename_padding = floor(log10(frame_count)) + 1
  
  for (frame in 0:frame_count) {
    frame_start_date = start_date + frame * period
    frame_end_date = start_date + (frame + 1) * period
    
    date_label_data <- well_data %>%
      summarise(
        lon = label_lon,
        lat = label_lat,
        label = str_c(month(frame_start_date, label = TRUE), " ", year(frame_start_date)) 
      )
    
    frame_wells <- well_data %>% filter((FRAME >= frame - fade_frame_count) & (FRAME <= frame)) %>% 
      mutate(ALPHA = 1 - (frame - FRAME)/fade_frame_count)
    
    well_points <- geom_point(data = frame_wells, aes(x = mean_longitude, y = mean_latitude, color = "red"), 
                              size = 0.25, alpha = frame_wells$ALPHA)
    date_label <- geom_text(aes(label = label), data = date_label_data, vjust = "top", hjust = "left")
    
    date_rect <- data.frame(xmin = frame_start_date, xmax = frame_end_date, ymin=-Inf, ymax=Inf)
    date_range_geom <- geom_rect(data = date_rect, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax), 
                                 fill = "dodgerblue1", alpha = 0.4, inherit.aes = FALSE)
    
    frame_grob <- arrangeGrob(well_map + well_points + date_label, well_chart + date_range_geom, layout_matrix = layout_dims)
    
    ggsave(file = str_c("image_sequence/well_points", str_pad(frame, width = filename_padding, side ="left", pad ="0"),".png"), 
           plot = frame_grob)
  }
  
  # This call to ffmpeg may need to be adjusted depending on your operating system.
  system(str_c("ffmpeg -y -framerate ", frames_per_second, " -i 'image_sequence/well_points%02d.png' -s:v 1280x720 -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p wells_animation.mp4"))
}

The function can then be called on the wells_short data set and an animation of all drilling between 2010 and the present is created:

animation_dur <- 7.4
fps <- 12
fade_dur <- 1/4

animate_well_points(wells_short, duration = animation_dur, frames_per_second = fps, fade_duration = fade_dur)