R in Space

inSileco

2026-03-25

Welcome

Objectives

What To Expect

What This Workshop Is

  • a practical introduction to spatial workflows in R
  • a workshop about using R operationally as a GIS
  • focused on common manipulations you will reuse in real projects

What This Workshop Is Not

  • not a full course in geodesy, cartography, or remote sensing
  • not a survey of every spatial package or method
  • not a theory-first geocomputation course

Resources

Workshop Objective


By the end of the workshop, participants should be able to use R as a GIS, work operationally with common spatial data manipulations, and have the right tools in hand to keep learning.

Our Focus

  • import spatial data
  • inspect geometry and CRS
  • transform and subset data
  • measure distance, area, and length
  • combine layers through joins and overlays
  • extract raster values to vector features
  • build summaries, analyses, and maps

Workshop Structure

Materials

  • short concept-driven slides
  • live examples in R
  • reusable scripts and resources for later review

Exercises

  • targeted hands-on exercises
  • designed specifically for workshop participants
  • based on the data shared ahead of the event

The workshop alternates between concise instructional material and applied exercises, so participants can move directly from concepts to practice.

R as a GIS

R to build reproducible pipelines

Spatial Data

Object type Description sf stars terra
Vector Where are the features?
Point icon Points Discrete locations with coordinates and attributes.
Line icon Lines Linear features such as roads, rivers, or tracks.
Polygon icon Polygons Areas such as lakes, study zones, or administrative boundaries.
Raster What is the value across space?
Raster grid icon Rasters Gridded surfaces with values stored in cells.

Main Spatial R Packages

New ecosystem

sf 🌐

  • simple features standard
  • vector data
  • tidy friendly

stars 🌐

  • spatiotemporal arrays
  • raster and datacubes
  • strong companion to sf

terra 🌐

  • vector and raster support
  • strong raster tooling
  • full GIS-style workflow

Old ecosystem

sp 🌐

  • vector data

raster 🌐

  • rasters

Accompanying Packages

Data manipulation and measurements

tidyverse 🌐

  • tabular workflows around spatial objects
  • filtering, joining, grouping, summarising

tidyterra 🌐

  • tidy verbs and plotting helpers for terra
  • a useful extension, not a replacement for terra

units 🌐

  • explicit lengths, areas, and distances in sf
  • contrasts terra, returns numeric measurements

Visualization

ggplot2 🌐

  • communication-quality static graphics
  • flexible layered maps

mapview 🌐

  • fast interactive exploration
  • quick visual checks of spatial layers

tmap 🌐

  • thematic mapping workflows
  • static and interactive map support

These are examples of particularly relevant accompanying packages. The spatial ecosystem in R is broad and active, and is not limited to the packages shown here.

A Note on IDEs

What are IDEs? Integrated Development Environments

RStudio 🌐

  • long-standing default environment for R
  • integrated console, editor, plots, files, and packages
  • a strong starting point for most R users

VS Code 🌐

  • general-purpose editor with R support through extensions
  • suited for mixed workflows across R, Python, Quarto, Git
  • a good option for multi-language projects

Positron 🌐

  • newer data-science IDE from Posit, VS Code-like
  • combines scripting, notebooks, and IDE features
  • worth watching as the ecosystem evolves

The goal of this workshop is not to teach a specific IDE. Use the environment that lets you work comfortably with scripts, objects, plots, and spatial outputs.

A Note on Pipes

What are pipes?

Pipes pass the output of one step into the next, making multi-step workflows easier to read and write. We mention them explicitly because they are used throughout the workshop.


magrittr pipes: %>%

CO2 %>%
  dplyr::filter(Type == "Quebec")

native pipes: |>

CO2 |>
  dplyr::filter(Type == "Quebec")

A Note on Code

Choose Your Tool(s)

sf_object <- sf::function(...)
stars_object <- stars::function(...)
terra_object <- terra::function(...)


Navigating through the code in the presentation

The slides often show the same operation twice: once with sf/stars, and once with terra.

You do not need to follow both versions at once. Use the tabs to stay in the ecosystem you chose for the workshop.

A Note on Exercises

Choose Your Route

Do both levels!

More advanced participants can complete both workflows, as there will be advanced workflows available with both points and lines.

Vector Data in R

sf & terra

Read Vector Data

aoi <- st_read("data/polygons.gpkg", layer = "study_area")
zones <- st_read("data/polygons.gpkg", layer = "zones")
aoi
zones
Simple feature collection with 1 feature and 1 field
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -73 ymin: 46.55 xmax: -72.6 ymax: 47
Geodetic CRS:  WGS 84
        name                           geom
1 study_area POLYGON ((-73 46.55, -72.6 ...
Simple feature collection with 4 features and 2 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -73 ymin: 46.55 xmax: -72.6 ymax: 47
Geodetic CRS:  WGS 84
  zone_id   zone_type                           geom
1      NW      forest POLYGON ((-73 46.775, -72.8...
2      NE       urban POLYGON ((-72.8 46.775, -72...
3      SW     wetland POLYGON ((-73 46.55, -72.8 ...
4      SE agriculture POLYGON ((-72.8 46.55, -72....

aoi <- vect("data/polygons.gpkg", layer = "study_area")
zones <- vect("data/polygons.gpkg", layer = "zones")
aoi
zones
        name
1 study_area
  zone_id   zone_type
1      NW      forest
2      NE       urban
3      SW     wetland
4      SE agriculture

Create Points From A Table

stations <- read.csv("data/points.csv") |>
  st_as_sf(coords = c("lon", "lat"), crs = 4326, remove = FALSE)
Simple feature collection with 6 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -72.92 ymin: 46.7 xmax: -72.72 ymax: 46.94
Geodetic CRS:  WGS 84
  obs_id track_id seq           timestamp    lon   lat    category value
1  OBS01        A   1 2026-03-01 08:00:00 -72.92 46.78       urban    14
2  OBS02        A   2 2026-03-01 09:00:00 -72.86 46.83       urban    16
3  OBS03        A   3 2026-03-01 10:00:00 -72.79 46.88      forest    21
4  OBS04        A   4 2026-03-01 11:00:00 -72.72 46.94      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 -72.88 46.70 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 -72.83 46.76 agriculture    13
              geometry
1 POINT (-72.92 46.78)
2 POINT (-72.86 46.83)
3 POINT (-72.79 46.88)
4 POINT (-72.72 46.94)
5  POINT (-72.88 46.7)
6 POINT (-72.83 46.76)


stations <- read.csv("data/points.csv") |>
  vect(geom = c("lon", "lat"), crs = "EPSG:4326")
  obs_id track_id seq           timestamp    category value
1  OBS01        A   1 2026-03-01 08:00:00       urban    14
2  OBS02        A   2 2026-03-01 09:00:00       urban    16
3  OBS03        A   3 2026-03-01 10:00:00      forest    21
4  OBS04        A   4 2026-03-01 11:00:00      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 agriculture    13

Inspect Vector Data

st_geometry_type(zones)
st_crs(zones)$input
st_bbox(zones)
[1] POLYGON POLYGON POLYGON POLYGON
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
[1] "WGS 84"
  xmin   ymin   xmax   ymax 
-73.00  46.55 -72.60  47.00 
geomtype(terra_zones)
crs(terra_zones, proj = TRUE)
ext(terra_zones)
[1] "polygons"
[1] "+proj=longlat +datum=WGS84 +no_defs"
SpatExtent : -73, -72.6, 46.55, 47 (xmin, xmax, ymin, ymax)


Spatial projections

CRS information tells you how coordinates are referenced on the Earth. When you need to inspect or identify a projection, epsg.io is a useful lookup tool.

Transform CRS

zones_qc <- st_transform(zones, 32198)
points_qc <- st_transform(points_sf, 32198)
st_crs(points_qc)$input
st_bbox(points_qc)
[1] "EPSG:32198"
     xmin      ymin      xmax      ymax 
-340347.4  299996.5 -318731.5  336605.7 

zones_qc <- project(terra_zones, "EPSG:32198")
points_qc <- project(terra_points, "EPSG:32198")
crs(points_qc, proj = TRUE)
ext(points_qc)
[1] "+proj=lcc +lat_0=44 +lon_0=-68.5 +lat_1=60 +lat_2=46 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"
SpatExtent : -340347.355525386, -318731.482880504, 299996.469057655, 336605.697992394 (xmin, xmax, ymin, ymax)

Measurements

Use a suitable projected CRS when you need interpretable distances, lengths, or areas.

Export Vector Outputs

st_write(points_sf_qc, "outputs/points_projected.gpkg")
        object                 file exists features
1 points_sf_qc fileea346bc9483.gpkg   TRUE       12
writeVector(terra_points_qc, "outputs/points_projected.gpkg")
           object                 file exists features
1 terra_points_qc fileea340f7022c.gpkg   TRUE       12

Exercise 1

Vector Foundations

Import, inspect, transform, export


Bonus: use R as a converter

Build Lines From Ordered Points

tracks <- points_sf |>
  dplyr::arrange(track_id, timestamp) |>
  dplyr::group_by(track_id) |>
  dplyr::summarise(do_union = FALSE) |>
  st_cast("LINESTRING")
Simple feature collection with 3 features and 1 field
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: -72.95 ymin: 46.6 xmax: -72.69 ymax: 46.94
Geodetic CRS:  WGS 84
# A tibble: 3 × 2
  track_id                                                 geometry
  <chr>                                            <LINESTRING [°]>
1 A        (-72.92 46.78, -72.86 46.83, -72.79 46.88, -72.72 46.94)
2 B         (-72.88 46.7, -72.83 46.76, -72.76 46.82, -72.69 46.87)
3 C         (-72.95 46.6, -72.89 46.66, -72.82 46.73, -72.75 46.79)

tracks_tbl <- points_tbl |>
  dplyr::arrange(track_id, timestamp) |>
  dplyr::summarise(
    wkt = paste0(
      "LINESTRING (",
      paste(paste(lon, lat), collapse = ", "),
      ")"
    ),
    .by = track_id
  ) |>
  terra::vect(geom = "wkt", crs = "EPSG:4326")
  track_id
1        A
2        B
3        C

Quick Static Mapping

plot(st_geometry(aoi), col = "#f7fbfc", border = "#355c67")
plot(st_geometry(zones), add = TRUE, border = "#264653")
plot(st_geometry(stations), pch = 16, col = "#7f1d1d", add = TRUE)

plot(aoi, col = "#f7fbfc", border = "#355c67")
plot(zones, add = TRUE, border = "#264653")
plot(stations, pch = 16, col = "#7f1d1d", add = TRUE)

Quick Interactive Mapping

mapview(zones, zcol = "zone_id", layer.name = "Zones") +
  mapview(stations, zcol = "category", layer.name = "Observations")
mapview(zones, zcol = "zone_id", layer.name = "Zones") +
  mapview(stations, zcol = "category", layer.name = "Observations")

Export Interactive Maps

m <- mapview(zones, zcol = "zone_id", layer.name = "Zones") +
  mapview(stations, zcol = "category", layer.name = "Observations")

mapview::mapshot(m, file = "outputs/interactive_map.html")
          object                 file format
1 mapview sf map fileea35da38cec.html   html
m <- mapview(zones, zcol = "zone_id", layer.name = "Zones") +
  mapview(stations, zcol = "category", layer.name = "Observations")

mapview::mapshot(m, file = "outputs/interactive_map.html")
             object                 file format
1 mapview terra map fileea34d3ebcbe.html   html


Share your maps!

Interactive mapview outputs can be saved as standalone HTML files and shared like any other web document.

Exercise 2

Vector Foundations on Your Data

Import, inspect, transform, export, visualize

Subset To A Study Area

focus_area <- zones |>
  dplyr::filter(zone_id %in% c("NE", "SE")) |>
  dplyr::summarise()

points_focus <- st_filter(points_sf, focus_area)
zones_focus <- st_crop(zones, st_bbox(focus_area))
Simple feature collection with 5 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -72.79 ymin: 46.79 xmax: -72.69 ymax: 46.94
Geodetic CRS:  WGS 84
  obs_id track_id seq           timestamp    lon   lat category value
1  OBS03        A   3 2026-03-01 10:00:00 -72.79 46.88   forest    21
2  OBS04        A   4 2026-03-01 11:00:00 -72.72 46.94   forest    24
3  OBS07        B   3 2026-03-02 10:00:00 -72.76 46.82  wetland    17
4  OBS08        B   4 2026-03-02 11:00:00 -72.69 46.87  wetland    18
5  OBS12        C   4 2026-03-03 11:00:00 -72.75 46.79    urban    19
              geometry
1 POINT (-72.79 46.88)
2 POINT (-72.72 46.94)
3 POINT (-72.76 46.82)
4 POINT (-72.69 46.87)
5 POINT (-72.75 46.79)

focus_area <- terra_zones[terra_zones$zone_id %in% c("NE", "SE")]
points_focus <- crop(terra_points, focus_area)
zones_focus <- crop(terra_zones, ext(focus_area))
  obs_id track_id seq           timestamp category value
1  OBS03        A   3 2026-03-01 10:00:00   forest    21
2  OBS04        A   4 2026-03-01 11:00:00   forest    24
3  OBS07        B   3 2026-03-02 10:00:00  wetland    17
4  OBS08        B   4 2026-03-02 11:00:00  wetland    18
5  OBS12        C   4 2026-03-03 11:00:00    urban    19

Measure Vector Features

zone_area <- units::set_units(st_area(zones_qc), "km^2")
track_length <- units::set_units(st_length(tracks_sf_qc), "km")
boundary_distance <- units::set_units(st_distance(points_sf_qc, st_boundary(aoi_qc)), "km")
zone_area
track_length
head(boundary_distance)
  zone_id area_km2
1      NW    380.0
2      NE    380.0
3      SW    381.9
4      SE    381.9
  track_id length_km
1        A      23.4
2        B      23.9
3        C      26.1
  obs_id boundary_km
1  OBS01         6.1
2  OBS02        10.7
3  OBS03        13.3
4  OBS04         6.7
5  OBS05         9.2
6  OBS06        13.0
zone_area <- expanse(terra_zones_qc, unit = "km")
track_length <- perim(terra_tracks_qc) / 1000
boundary_distance <- distance(terra_points_qc, as.lines(terra_aoi_qc))[, 1] / 1000
zone_area
track_length
head(boundary_distance)
  zone_id area_km2
1      NW    381.3
2      NE    381.3
3      SW    382.9
4      SE    382.9
  track_id length_km
1        A      23.4
2        B      23.9
3        C      26.1
  obs_id boundary_km
1  OBS01         6.1
2  OBS02        10.7
3  OBS03        13.3
4  OBS04         6.7
5  OBS05         9.2
6  OBS06        13.0


Note

sf often returns unit-aware measurements. terra often returns numeric values whose meaning depends on the CRS and function used.

Join Attributes To Features

points_joined <- st_join(points_sf, zones[, c("zone_id", "zone_type")])
  obs_id track_id seq           timestamp    lon   lat    category value
1  OBS01        A   1 2026-03-01 08:00:00 -72.92 46.78       urban    14
2  OBS02        A   2 2026-03-01 09:00:00 -72.86 46.83       urban    16
3  OBS03        A   3 2026-03-01 10:00:00 -72.79 46.88      forest    21
4  OBS04        A   4 2026-03-01 11:00:00 -72.72 46.94      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 -72.88 46.70 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 -72.83 46.76 agriculture    13
  zone_id zone_type
1      NW    forest
2      NW    forest
3      NE     urban
4      NE     urban
5      SW   wetland
6      SW   wetland

point_values <- extract(terra_zones, terra_points)
points_joined <- cbind(terra_points, point_values[, c("zone_id", "zone_type")])
  obs_id track_id seq           timestamp    category value zone_id zone_type
1  OBS01        A   1 2026-03-01 08:00:00       urban    14      NW    forest
2  OBS02        A   2 2026-03-01 09:00:00       urban    16      NW    forest
3  OBS03        A   3 2026-03-01 10:00:00      forest    21      NE     urban
4  OBS04        A   4 2026-03-01 11:00:00      forest    24      NE     urban
5  OBS05        B   1 2026-03-02 08:00:00 agriculture    11      SW   wetland
6  OBS06        B   2 2026-03-02 09:00:00 agriculture    13      SW   wetland

Intersect Geometries

track_segments <- st_intersection(tracks_sf_qc, zones_qc)
# A tibble: 6 × 3
  track_id zone_id zone_type
  <chr>    <chr>   <chr>    
1 A        NW      forest   
2 B        NW      forest   
3 A        NE      urban    
4 B        NE      urban    
5 C        NE      urban    
6 B        SW      wetland  

track_segments <- intersect(terra_tracks_qc, terra_zones_qc)
  track_id zone_id   zone_type
1        A      NE       urban
2        A      NW      forest
3        B      SW     wetland
4        B      NE       urban
5        B      NW      forest
6        C      SE agriculture

Note

Use a join when you want to attach attributes. Use an intersection when you want overlap to create new geometries.

Buffers

point_buffers <- st_buffer(points_sf_qc, dist = 10000)
st_geometry_type(point_buffers)
[1] POLYGON
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
  obs_id track_id seq           timestamp    lon   lat    category value
1  OBS01        A   1 2026-03-01 08:00:00 -72.92 46.78       urban    14
2  OBS02        A   2 2026-03-01 09:00:00 -72.86 46.83       urban    16
3  OBS03        A   3 2026-03-01 10:00:00 -72.79 46.88      forest    21
4  OBS04        A   4 2026-03-01 11:00:00 -72.72 46.94      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 -72.88 46.70 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 -72.83 46.76 agriculture    13

point_buffers <- buffer(terra_points_qc, width = 10000)
geomtype(point_buffers)
[1] "polygons"
  obs_id track_id seq           timestamp    category value
1  OBS01        A   1 2026-03-01 08:00:00       urban    14
2  OBS02        A   2 2026-03-01 09:00:00       urban    16
3  OBS03        A   3 2026-03-01 10:00:00      forest    21
4  OBS04        A   4 2026-03-01 11:00:00      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 agriculture    13

Note

Buffers create new geometries at a fixed distance around features, so they are usually done in a projected CRS with interpretable distance units.

Exercise 3

Vector Operations

Subset, measure, join, intersect, buffer

Raster Data in R

stars & terra

Read Raster Data

surface <- read_stars("data/surface.tif")
surface
stars object with 2 dimensions and 1 attribute
attribute(s):
             Min. 1st Qu. Median     Mean 3rd Qu.   Max.
surface.tif   9.6 40.8325   48.5 50.00007  59.805 100.32
dimension(s):
  from to offset    delta refsys point x/y
x    1 40    -73     0.01 WGS 84 FALSE [x]
y    1 40     47 -0.01125 WGS 84 FALSE [y]

surface <- rast("data/surface.tif")
surface
class       : SpatRaster 
size        : 40, 40, 1  (nrow, ncol, nlyr)
resolution  : 0.01, 0.01125  (x, y)
extent      : -73, -72.6, 46.55, 47  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
source      : surface.tif 
name        : surface_value 
min value   :          9.60 
max value   :        100.32 

Inspect Raster Data

st_dimensions(surface)
st_bbox(surface)
st_crs(surface)$input
  from to offset    delta refsys point x/y
x    1 40    -73     0.01 WGS 84 FALSE [x]
y    1 40     47 -0.01125 WGS 84 FALSE [y]
  xmin   ymin   xmax   ymax 
-73.00  46.55 -72.60  47.00 
[1] "WGS 84"
dim(surface)
res(surface)
ext(surface)
crs(surface, proj = TRUE)
[1] 40 40  1
[1] 0.01000 0.01125
SpatExtent : -73, -72.6, 46.55, 47 (xmin, xmax, ymin, ymax)
[1] "+proj=longlat +datum=WGS84 +no_defs"

Export Raster Outputs

write_stars(surface_mask, "outputs/surface_focus.tif")
              object                file exists
1 surface_stars_mask fileea349450257.tif   TRUE
writeRaster(surface_mask, "outputs/surface_focus.tif", overwrite = TRUE)
              object                file exists
1 surface_terra_mask fileea3408c5de1.tif   TRUE

Quick Static Raster Plotting

plot(surface, col = viridis::viridis(100))

plot(surface)

Quick Interactive Raster Mapping

mapview(surface) + mapview(aoi)
mapview(surface) + mapview(aoi)

Exercise 4

Raster Foundations

Import, inspect, transform, export, visualize

Crop, Mask, Project

crop_zone <- zones |> dplyr::filter(zone_id == "NE")
buffer_one <- st_transform(point_buffers[1, ], st_crs(surface))
surface_qc <- st_warp(surface, crs = st_crs(aoi_qc))
surface_crop <- st_crop(surface, st_bbox(crop_zone))
surface_mask <- surface[buffer_one]

crop_zone <- zones[zones$zone_id == "NE"]
buffer_one <- project(point_buffers[1], crs(surface))
surface_qc <- project(surface, "EPSG:32198")
surface_crop <- crop(surface, crop_zone)
surface_mask <- mask(surface, buffer_one)

Raster manipulations

Here the three operations are shown independently: project changes CRS, crop uses one zone as the target extent, and mask keeps cells only inside a single buffer geometry.

Resampling

template <- st_as_stars(st_bbox(surface), dx = 0.05, dy = 0.05)
st_crs(template) <- st_crs(surface)
surface_near <- st_warp(surface, dest = template, method = "near", use_gdal = TRUE)
surface_bilinear <- st_warp(surface, dest = template, method = "bilinear", use_gdal = TRUE)

template <- rast(ext(surface), resolution = 0.05, crs = crs(surface))
surface_near <- resample(surface, template, method = "near")
surface_bilinear <- resample(surface, template, method = "bilinear")

Resampling changes the cell grid

Near is common for categorical rasters, while bilinear is common for continuous surfaces.

Multiple Raster Layers

surface2 <- surface * 1.1
names(surface2) <- "surface2"
surface_stack <- c(surface, surface2)
surface_mean <- st_apply(surface_stack, MARGIN = c("x", "y"), FUN = mean)
[1] "surface.tif" "surface_alt"
[1] 2

surface2 <- surface * 1.1
names(surface2) <- "surface2"
surface_stack <- c(surface, surface2)
surface_mean <- app(surface_stack, mean)
[1] "surface_value" "surface_alt"  
[1] 2

Multi-layer rasters

Can represent bands, variables, or time steps. Cell-wise functions combine information across layers.

Exercise 5

Raster Operations

Project, crop, mask, resample, layer operations

Integrated Workflows

Working with Vectors & Rasters

Extract Raster Values To Points

points_surface <- st_extract(surface, points_joined)
names(points_surface)[1] <- "surface_value"
  obs_id track_id seq           timestamp    lon   lat    category value
1  OBS01        A   1 2026-03-01 08:00:00 -72.92 46.78       urban    14
2  OBS02        A   2 2026-03-01 09:00:00 -72.86 46.83       urban    16
3  OBS03        A   3 2026-03-01 10:00:00 -72.79 46.88      forest    21
4  OBS04        A   4 2026-03-01 11:00:00 -72.72 46.94      forest    24
5  OBS05        B   1 2026-03-02 08:00:00 -72.88 46.70 agriculture    11
6  OBS06        B   2 2026-03-02 09:00:00 -72.83 46.76 agriculture    13
  zone_id zone_type surface_value
1      NW    forest         49.71
2      NW    forest         48.70
3      NE     urban         47.72
4      NE     urban         61.39
5      SW   wetland         51.67
6      SW   wetland         45.88

point_vals <- extract(surface, points_joined)
points_surface <- cbind(points_joined, point_vals[, "surface_value", drop = FALSE])
  obs_id zone_id zone_type surface_value
1  OBS01      NW    forest         48.69
2  OBS02      NW    forest         49.60
3  OBS03      NE     urban         47.72
4  OBS04      NE     urban         59.98
5  OBS05      SW   wetland         51.67
6  OBS06      SW   wetland         47.16

Extract Raster Values To Lines

tracks_surface <- st_extract(surface, tracks, FUN = function(x) mean(x, na.rm = TRUE))
# A tibble: 3 × 2
  track_id surface_value
  <chr>            <dbl>
1 A                 50.0
2 B                 45.0
3 C                 44.2

track_vals <- extract(surface, tracks, fun = mean, na.rm = TRUE)
tracks_surface <- cbind(tracks, track_vals[, "surface_value", drop = FALSE])
  track_id surface_value
1        A      49.98657
2        B      45.14939
3        C      43.87132

Extract Raster Values To Polygons

zones_surface <- st_extract(surface, zones, FUN = function(x) mean(x, na.rm = TRUE))
  zone_id   zone_type surface_value
1      NW      forest      55.93852
2      NE       urban      60.33677
3      SW     wetland      39.66305
4      SE agriculture      44.06193

zone_vals <- extract(surface, zones, fun = mean, na.rm = TRUE)
zones_surface <- cbind(zones, zone_vals[, "surface_value", drop = FALSE])
  zone_id   zone_type surface_value
1      NW      forest      55.93852
2      NE       urban      60.33677
3      SW     wetland      39.66305
4      SE agriculture      44.06193

Summarise Extracted Values

point_zone_summary <- points_surface |>
  st_drop_geometry() |>
  group_by(zone_id, zone_type) |>
  summarise(
    point_n = n(),
    mean_surface = mean(surface_value, na.rm = TRUE),
    mean_value = mean(value, na.rm = TRUE)
  )
# A tibble: 3 × 5
  zone_id zone_type point_n mean_surface mean_value
  <chr>   <chr>       <int>        <dbl>      <dbl>
1 NE      urban           5         48.4       19.8
2 NW      forest          2         49.2       15  
3 SW      wetland         5         44.7       11.6
point_zone_summary <- as.data.frame(points_surface) |>
  group_by(zone_id, zone_type) |>
  summarise(
    point_n = n(),
    mean_surface = mean(surface_value, na.rm = TRUE),
    mean_value = mean(value, na.rm = TRUE)
  )
# A tibble: 3 × 5
  zone_id zone_type point_n mean_surface mean_value
  <chr>   <chr>       <int>        <dbl>      <dbl>
1 NE      urban           5         47.6       19.8
2 NW      forest          2         49.1       15  
3 SW      wetland         5         44.5       11.6

Build Analysis Tables

analysis_table <- point_zone_summary |>
  left_join(zone_area, by = "zone_id") |>
  select(zone_id, zone_type, point_n, mean_surface, mean_value, area_km2)
# A tibble: 3 × 6
  zone_id zone_type point_n mean_surface mean_value area_km2
  <chr>   <chr>       <int>        <dbl>      <dbl>    <dbl>
1 NE      urban           5         48.4       19.8     380 
2 NW      forest          2         49.2       15       380 
3 SW      wetland         5         44.7       11.6     382.
analysis_table <- point_zone_summary |>
  left_join(zone_area, by = "zone_id") |>
  select(zone_id, zone_type, point_n, mean_surface, mean_value, area_km2)
# A tibble: 3 × 6
  zone_id zone_type point_n mean_surface mean_value area_km2
  <chr>   <chr>       <int>        <dbl>      <dbl>    <dbl>
1 NE      urban           5         47.6       19.8     381.
2 NW      forest          2         49.1       15       381.
3 SW      wetland         5         44.5       11.6     383.

Exercise 6

Integrated Workflows

Extract, summarise, build analysis tables

Spatial Analytics

Spatial Analytics

  • spatial analytics is a very broad field
  • this workshop does not aim to cover that full breadth
  • our emphasis is on getting spatial data ready for analysis
  • we use simple GLMs and KDEs as illustrative examples of what can follow

The point here is not the analyses themselves. The point is the workflow that gets you from spatial data to analysis-ready inputs.

Prepare Data For Analysis

analysis_data <- points_surface |>
  st_drop_geometry() |>
  select(obs_id, track_id, zone_id, zone_type, category, value, surface_value) |>
  filter(!is.na(zone_id), !is.na(surface_value))
  obs_id track_id zone_id zone_type    category value surface_value
1  OBS01        A      NW    forest       urban    14         49.71
2  OBS02        A      NW    forest       urban    16         48.70
3  OBS03        A      NE     urban      forest    21         47.72
4  OBS04        A      NE     urban      forest    24         61.39
5  OBS05        B      SW   wetland agriculture    11         51.67
6  OBS06        B      SW   wetland agriculture    13         45.88
analysis_data <- as.data.frame(points_surface) |>
  select(obs_id, track_id, zone_id, zone_type, category, value, surface_value) |>
  filter(!is.na(zone_id), !is.na(surface_value))
  obs_id track_id zone_id zone_type    category value surface_value
1  OBS01        A      NW    forest       urban    14         48.69
2  OBS02        A      NW    forest       urban    16         49.60
3  OBS03        A      NE     urban      forest    21         47.72
4  OBS04        A      NE     urban      forest    24         59.98
5  OBS05        B      SW   wetland agriculture    11         51.67
6  OBS06        B      SW   wetland agriculture    13         47.16

A Simple GLM

glm_fit <- glm(
  value ~ surface_value + zone_type,
  family = poisson(),
  data = analysis_data
)
summary(glm_fit)
              term Estimate Std..Error z.value Pr...z..
1      (Intercept)    2.178      0.536   4.064    0.000
2    surface_value    0.011      0.010   1.052    0.293
3   zone_typeurban    0.283      0.208   1.356    0.175
4 zone_typewetland   -0.211      0.229  -0.921    0.357
glm_fit <- glm(
  value ~ surface_value + zone_type,
  family = poisson(),
  data = analysis_data
)
summary(glm_fit)
              term Estimate Std..Error z.value Pr...z..
1      (Intercept)    2.114      0.550   3.841    0.000
2    surface_value    0.012      0.011   1.144    0.253
3   zone_typeurban    0.292      0.209   1.399    0.162
4 zone_typewetland   -0.205      0.229  -0.897    0.370

Bonus: Advanced GLM

pseudo_points <- st_sample(aoi_qc, size = nrow(points_qc), exact = TRUE) |>
  st_as_sf() |>
  st_join(zones_qc[, c("zone_id", "zone_type")])

pseudo_points$surface_value <- st_extract(surface_qc, pseudo_points)[[1]]
pseudo_points$presence <- 0
points_qc$presence <- 1

pa_data <- bind_rows(points_qc, pseudo_points) |>
  st_drop_geometry() |>
  filter(!is.na(zone_id), !is.na(surface_value))

glm_pa <- glm(presence ~ surface_value + zone_type, family = binomial(), data = pa_data)
              term Estimate Std..Error z.value Pr...z..
1      (Intercept)  -16.407   3143.229  -0.005    0.996
2    surface_value   -0.039      0.036  -1.098    0.272
3  zone_typeforest   19.471   3143.229   0.006    0.995
4   zone_typeurban   18.584   3143.229   0.006    0.995
5 zone_typewetland   18.917   3143.229   0.006    0.995

pseudo_points <- spatSample(aoi_qc, size = nrow(points_qc), method = "random", as.points = TRUE)
pseudo_zone <- extract(zones_qc, pseudo_points)
pseudo_surface <- extract(surface_qc, pseudo_points)
pseudo_points <- cbind(pseudo_points, pseudo_zone[, c("zone_id", "zone_type")], pseudo_surface[, "surface_value", drop = FALSE])
pseudo_points$presence <- 0
points_qc$presence <- 1

pa_data <- bind_rows(as.data.frame(points_qc), as.data.frame(pseudo_points)) |>
  filter(!is.na(zone_id), !is.na(surface_value))

glm_pa <- glm(presence ~ surface_value + zone_type, family = binomial(), data = pa_data)
              term Estimate Std..Error z.value Pr...z..
1      (Intercept)  -13.913   2782.290  -0.005    0.996
2    surface_value   -0.061      0.062  -0.981    0.326
3  zone_typeforest   16.642   2782.287   0.006    0.995
4   zone_typeurban   17.569   2782.287   0.006    0.995
5 zone_typewetland   17.009   2782.287   0.006    0.995

Exercise 7

Spatial Analytics

Prepare data, fit simple GLMs


Bonus: generate pseudo-absences for your GLM with points data

Kernel Density Estimation (KDE)

Use as an exploratory tools

  • useful for point patterns
  • produces a smooth intensity surface
  • helps visualize concentration or hotspots
  • often used for exploration rather than formal inference

Kernel Density Estimation (KDE)

pts_xy <- st_coordinates(points_qc)
bbox <- st_bbox(aoi_qc)
points_kde <- MASS::kde2d(
  pts_xy[, 1], pts_xy[, 2],
  n = 80,
  h = c(10000, 10000),
  lims = c(bbox["xmin"], bbox["xmax"], bbox["ymin"], bbox["ymax"])
)
$x_head
[1] -344515.3 -344091.9 -343668.5 -343245.1 -342821.7 -342398.3

$y_head
[1] 292854.5 293508.8 294163.2 294817.5 295471.8 296126.1

$z_dim
[1] 80 80
pts_xy <- crds(points_qc)
bbox <- ext(aoi_qc)
points_kde <- MASS::kde2d(
  pts_xy[, 1], pts_xy[, 2],
  n = 80,
  h = c(10000, 10000),
  lims = c(bbox$xmin, bbox$xmax, bbox$ymin, bbox$ymax)
)
$x_head
[1] -344515.3 -344091.9 -343668.5 -343245.1 -342821.7 -342398.3

$y_head
[1] 292854.5 293508.8 294163.2 294817.5 295471.8 296126.1

$z_dim
[1] 80 80

Kernel Density Estimation (KDE)

points_kde <- st_as_stars(
  list(kde = points_kde$z),
  dimensions = st_dimensions(x = points_kde$x, y = points_kde$y)
) |>
  st_set_crs(32198)

points_kde <- points_kde[aoi_qc]
points_kde_plot <- points_kde
points_kde_plot[[1]] <- points_kde_plot[[1]] / max(points_kde_plot[[1]], na.rm = TRUE)

points_kde <- rast(
  t(points_kde$z)[nrow(t(points_kde$z)):1, ],
  extent = ext(min(points_kde$x), max(points_kde$x), min(points_kde$y), max(points_kde$y)),
  crs = "EPSG:32198"
)

points_kde <- mask(points_kde, aoi_qc)
points_kde_plot <- points_kde / global(points_kde, "max", na.rm = TRUE)[1, 1]

Exercise 8

Spatial Analytics

Fit KDEs and explore surfaces

Advanced Mapping

Advanced Mapping

Quick exploration

  • plot()
  • mapview()
  • fast QA/QC and data inspection

Communication

  • ggplot2
  • tmap
  • more deliberate legends, palettes, labels, and layout

Use exploratory maps to inspect data quickly, then move to more deliberate mapping tools when the goal is communication or sharing.

Advanced Mapping with ggplot2

ggplot2 works especially well for polished static maps, particularly with sf and stars objects.

gg <- ggplot() +
  geom_stars(
    data = points_kde_sf_plot
  ) +
  scale_fill_viridis_c(name = "KDE") +
  geom_sf(
    data = zones_qc, 
    fill = NA, 
    color = "#355c67"
  ) +
  geom_sf(
    data = aoi_qc, 
    fill = NA, 
    color = "#17313b"
  ) +
  geom_sf(
    data = points_sf_qc, 
    aes(color = category), 
    size = 1.7
   ) +
  coord_sf(expand = FALSE) +
  theme_minimal()
gg

Advanced Mapping with tmap

tmap is built for thematic mapping and works well when you want a more map-focused grammar than ggplot2.

tm <- tm_shape(points_kde_sf_plot) +
  tm_raster(col.scale = tm_scale_continuous(
    values = "viridis"
  )) +
  tm_shape(zones_qc) +
  tm_borders(col = "#355c67") +
  tm_shape(points_sf_qc) +
  tm_symbols(
    fill = "category", 
    size = 0.5
  ) +
  tm_shape(aoi_qc) +
  tm_borders(col = "#17313b") +
  tm_layout(
    frame = FALSE, 
    legend.outside = TRUE
  )
tm

Advanced Mapping with tmap

One advantage of tmap is that the same map logic can often be used for static output or for an interactive shared map.

tm_html <- tmap_leaflet(tm)
tm_html

Export And Share Maps

ggsave(
  filename = "outputs/advanced_map.png",
  plot = gg,
  width = 8,
  height = 6,
  dpi = 300
)

ggsave(
  filename = "outputs/advanced_map.pdf",
  plot = gg,
  width = 8,
  height = 6
)
                    output             file                   tool
1      presentation figure advanced_map.png ggsave(..., dpi = 300)
2 publication-ready figure advanced_map.pdf            ggsave(...)
tmap_save(
  tm = tm,
  filename = "outputs/advanced_map.png",
  width = 8,
  height = 6,
  dpi = 300
)

tmap_mode("view")
tmap_save(
  tm = tm,
  filename = "outputs/advanced_map.html"
)
tmap_mode("plot")
                  output              file                               tool
1    static thematic map  advanced_map.png                     tmap_save(...)
2 interactive shared map advanced_map.html tmap_mode("view") + tmap_save(...)

Exercise 9

Advanced Mapping

Build, export, and share maps

Takeaways

  • vector and raster are different data models
  • sf, stars, and terra cover most common spatial workflows in R
  • CRS decisions affect measurements and overlays
  • joins, intersections, crops, masks, and extraction are core operations
  • spatial analysis in R is often a bridge between GIS tasks and standard statistical workflows
  • treat R as both an analysis environment and a GIS

Using recent technos for spatial data

Shiny App Assistant

  • https://gallery.shinyapps.io/assistant

  • prompts:

    • Create an app that allows users to upload GeoJSON files and visualize them using Leaflet.
    • Modify the app to compute the total area when polygons are uploaded and display it in the card header.
    • Modify the app to also compute the total length for line geometries and display it in the card header.

Closing

  • start with clear spatial objects and clear CRS
  • inspect and map early, then analyse
  • move from spatial objects to clean analysis tables
  • keep building reusable workflows as your analyses grow