library(dynamicSDM)
In this tutorial, we will extract spatio-temporally buffered explanatory variables for each occurrence and pseudo-absence record. The dynamicSDM functions that extract these variables require Google Earth Engine and Google Drive to be initialised. Fill in the code below with your Google account email, then run it to check that rgee and googledrive have been correctly installed and authorised.
library(rgee)
rgee::ee_check()
library(googledrive)
googledrive::drive_user()
# Set your user email here
#user.email<-"your_google_email_here"
Note: You will need an internet connection for this tutorial. Variable extraction may take some time, depending on the strength of your connection. If you try out these functions and are keen to move on to the next tutorial, don’t worry: you can read the extracted data into your R environment from the dynamicSDM package.
We will be extracting data for three dynamic explanatory variables. Let’s first create new folders within the project directory to export extracted variable data to.
project_directory <- file.path(file.path(tempdir(), "dynamicSDM_vignette"))
dir.create(project_directory)
#> Warning in dir.create(project_directory):
#> 'C:\Users\eerdo\AppData\Local\Temp\RtmpCulgCq\dynamicSDM_vignette' already
#> exists
variablenames<-c("eight_sum_prec","year_sum_prec","grass_crop_percentage")
extraction_directories <- file.path(file.path(project_directory,"extraction"))
dir.create(extraction_directories)
extraction_directory_1 <- file.path(file.path(project_directory,variablenames[1]))
dir.create(extraction_directory_1)
extraction_directory_2 <- file.path(file.path(project_directory,variablenames[2]))
dir.create(extraction_directory_2)
extraction_directory_3 <- file.path(file.path(project_directory,variablenames[3]))
dir.create(extraction_directory_3)
Now, the filtered occurrence and pseudo-absence record data frame generated in the first tutorial can be imported or read into your R environment from the dynamicSDM package.
# sample_filt_data<-read.csv(paste0(project_directory,"/filtered_quelea_occ.csv"))
data(sample_filt_data)
extract_dynamic_coords() extracts processed remote sensing data using the Google Earth Engine cloud servers. Several arguments to this function specify the explanatory variable, including:
• datasetname: the dataset’s Google Earth Engine catalogue name.
• bandname: the band of interest within the dataset.
• temporal.res: the temporal resolution (i.e. the number of days to calculate the variable over).
• temporal.direction: the temporal direction (days either prior or post each record’s date).
• spatial.res.metres: the spatial resolution (the resolution in metres to extract data at).
• GEE.math.fun: the mathematical function to calculate across the period (e.g. mean, sum or standard deviation).
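To make the roles of temporal.res, temporal.direction and GEE.math.fun concrete, here is a minimal base R sketch that applies the same idea to a local, made-up daily rainfall series. The helper window_stat() and the data are hypothetical and are not part of dynamicSDM; whether the record's own date falls inside the window is an assumption of this sketch (here it is excluded), not a statement about how the package behaves.

```r
# Hypothetical daily rainfall series (mm); the values are made up
daily <- data.frame(
  date = seq(as.Date("2020-01-01"), as.Date("2020-03-31"), by = "day")
)
daily$precip <- rep(c(0, 2, 5), length.out = nrow(daily))

# Hypothetical helper: apply math.fun to the temporal.res days before
# ("prior") or after ("post") a record date.
# Assumption: the record date itself is excluded from the window.
window_stat <- function(series, record_date, temporal.res,
                        temporal.direction = c("prior", "post"),
                        math.fun = sum) {
  temporal.direction <- match.arg(temporal.direction)
  record_date <- as.Date(record_date)
  if (temporal.direction == "prior") {
    start <- record_date - temporal.res
    end <- record_date - 1
  } else {
    start <- record_date + 1
    end <- record_date + temporal.res
  }
  in_window <- series$date >= start & series$date <= end
  math.fun(series$precip[in_window])
}

# Sum of rainfall over the 56 days (8 weeks) before 1 March 2020
eight_week_sum <- window_stat(daily, "2020-03-01", temporal.res = 56,
                              temporal.direction = "prior")
```

Swapping math.fun for mean or sd gives the other summaries mentioned above. extract_dynamic_coords() performs the equivalent calculation remotely on Google Earth Engine rather than on a local data frame.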
The distribution of our case study species, the red-billed quelea, is driven by precipitation levels. Run the code below to extract the sum of precipitation across the 8-week and 52-week periods prior to each occurrence record from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) dataset on Google Earth Engine.
For the 8-week precipitation extraction, we will use the split method to save extracted data. Notice how each record’s data are extracted and exported individually. If you specify resume = T, progress can be resumed after a lost internet connection.
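The idea behind split saving and resuming can be sketched in base R as follows. This is an illustration of the general pattern only, not dynamicSDM's internal implementation: the functions extract_one() and extract_split(), the file naming and the data are all hypothetical.

```r
# Hypothetical records and output folder (names are illustrative only)
records <- data.frame(id = 1:5, x = c(10, 20, 30, 40, 50))
out_dir <- file.path(tempdir(), "split_sketch")
dir.create(out_dir, showWarnings = FALSE)

# Stand-in for a slow remote extraction of a single record
extract_one <- function(record) {
  data.frame(id = record$id, value = record$x * 2)
}

# Write one file per record; with resume = TRUE, records whose file
# already exists are skipped, so an interrupted run can pick up again
extract_split <- function(records, out_dir, resume = TRUE) {
  for (i in seq_len(nrow(records))) {
    out_file <- file.path(out_dir, paste0("record_", records$id[i], ".csv"))
    if (resume && file.exists(out_file)) next
    write.csv(extract_one(records[i, ]), out_file, row.names = FALSE)
  }
  invisible(list.files(out_dir, pattern = "\\.csv$", full.names = TRUE))
}

files <- extract_split(records, out_dir)

# The per-record files can later be read back and bound into one data frame
combined <- do.call(rbind, lapply(files, read.csv))
```

Because every record is written as soon as it is extracted, a dropped connection costs at most the record in progress, at the price of many small files.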
# 8-week total precipitation
extract_dynamic_coords(occ.data=sample_filt_data,
                       datasetname = "UCSB-CHG/CHIRPS/DAILY",
                       bandname="precipitation",
                       spatial.res.metres = 5566 ,
                       GEE.math.fun = "sum",
                       temporal.direction = "prior",
                       temporal.res = 56,
                       save.method = "split",
                       varname = variablenames[1],
                       save.directory = extraction_directory_1)
For the 52-week precipitation extraction, we will use the combined method to save extracted data. Here, all data are extracted and then exported as a single data frame. This approach writes fewer files but may be more vulnerable to an internet outage, because all progress is lost and cannot be resumed.
# 52-week total precipitation
extract_dynamic_coords(occ.data=sample_filt_data,
                       datasetname = "UCSB-CHG/CHIRPS/DAILY",
                       bandname = "precipitation",
                       spatial.res.metres = 5566 ,
                       GEE.math.fun = "sum",
                       temporal.direction = "prior",
                       temporal.res = 364,
                       save.method = "combined",
                       varname = variablenames[2],
                       save.directory = extraction_directory_2)
extract_buffered_coords() extracts explanatory variable data across a spatial buffer from occurrence record co-ordinates. These variables can be categorical or continuous, but if a temporal buffer is also used, only continuous data will work. This function utilises a “moving window matrix” that specifies the neighbourhood of cells (spatial buffer area) surrounding each occurrence record’s cell that will also be included in the calculation.
get_moving_window() generates the optimal “moving window matrix” size based upon a given spatial radius and the resolution of the remote-sensing data.
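As a rough illustration of what a moving window calculation does, the base R sketch below counts cells of selected categories in the 3 x 3 neighbourhood of a focal cell on a small, made-up land cover grid. The grid values, the helper focal_sum() and its handling of grid edges (the window is simply cropped) are assumptions for illustration and do not describe how extract_buffered_coords() is implemented.

```r
# Hypothetical 5 x 5 land cover grid; the category codes are made up
land_cover <- matrix(c(6, 7, 1, 1, 1,
                       6, 6, 1, 2, 2,
                       1, 7, 7, 2, 2,
                       1, 1, 1, 2, 2,
                       1, 1, 1, 1, 6),
                     nrow = 5, byrow = TRUE)

# 3 x 3 window of ones: the focal cell plus its eight neighbours
window <- matrix(1, nrow = 3, ncol = 3)

# Binary mask: 1 where the cell belongs to a category of interest
target <- (land_cover %in% c(6, 7)) * 1
dim(target) <- dim(land_cover)

# Hypothetical helper: weighted sum of the cells under the window
# centred on (row, col). Assumption: the window is cropped at grid edges.
focal_sum <- function(grid, window, row, col) {
  half_r <- (nrow(window) - 1) / 2
  half_c <- (ncol(window) - 1) / 2
  rows <- (row - half_r):(row + half_r)
  cols <- (col - half_c):(col + half_c)
  keep_r <- rows >= 1 & rows <= nrow(grid)
  keep_c <- cols >= 1 & cols <= ncol(grid)
  sum(grid[rows[keep_r], cols[keep_c], drop = FALSE] *
        window[keep_r, keep_c, drop = FALSE])
}

# Number of category 6 or 7 cells around the cell in row 2, column 2
centre_count <- focal_sum(target, window, row = 2, col = 2)
```

A larger radius or a finer resolution would call for a larger window matrix, which is the sizing decision that get_moving_window() makes for you.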
The distribution of red-billed quelea is driven by the availability of wild grass and cereal crop seed. The code below extracts the total number of grassland or cereal cropland cells across a spatial buffer from the MODIS Annual Land Cover Type dataset in the Google Earth Engine catalogue.
First, however, we must generate the optimal moving window matrix for this calculation, given that quelea travel up to 10 km to access resources and that the data will be at 0.05 degree resolution (500 m aggregated by a factor of 12).
matrix <- get_moving_window(radial.distance = 10000,
                            spatial.res.degrees = 0.05,
                            spatial.ext = c(-35, -6, 10, 40))
matrix
#>      [,1] [,2] [,3]
#> [1,]    1    1    1
#> [2,]    1    1    1
#> [3,]    1    1    1
# Total grassland and cereal cropland cells in surrounding area
extract_buffered_coords(occ.data=sample_filt_data,
                        datasetname = "MODIS/006/MCD12Q1",
                        bandname="LC_Type5",
                        spatial.res.metres = 500,
                        GEE.math.fun = "sum",
                        moving.window.matrix=matrix,
                        user.email= user.email,
                        save.method="split",
                        temporal.level="year",
                        categories=c(6,7),
                        agg.factor = 12,
                        varname = variablenames[3],
                        save.directory=extraction_directory_3)
Data for each explanatory variable have been saved across multiple directories and files. extract_coords_combine() combines the extracted explanatory variable data into a single data frame.
complete.dataset <- extract_coords_combine(varnames = variablenames,
                                           local.directory = c(extraction_directory_1,
                                                               extraction_directory_2,
                                                               extraction_directory_3))
At the end of this vignette, we now have a complete data frame of filtered species occurrence and pseudo-absence records with associated extracted dynamic variables. Let’s save this to our project directory for use in the next tutorial!
# Set NA values as zero 
complete.dataset[is.na(complete.dataset$grass_crop_percentage),"grass_crop_percentage"]<-0
write.csv(complete.dataset, file = paste0(project_directory, "/extracted_quelea_occ.csv"))