Basic ENMTML Run

Here we present a simple model, built with the simplest parameters and decisions.
The main goal of this run is to introduce the main arguments and clarify some details required to run the function.
We will use the example files shipped with the package, so that everyone can easily reproduce these steps!

Creating the folder structure from the example files

The ENMTML example data set consists of five bioclimatic variables for the current period and five different virtual species.

To simulate the files and folders needed for a standard ENMTML run, we will create folders containing the example data.

For this we will use the raster package:

# Install raster only if it is not already available
if (!requireNamespace("raster", quietly = TRUE)) install.packages("raster")
require(raster)
#> Loading required package: raster
#> Loading required package: sp
require(ENMTML)
#> Loading required package: ENMTML
#> Please note that rgdal will be retired during October 2023,
#> plan transition to sf/stars/terra functions using GDAL and PROJ
#> at your earliest convenience.
#> See https://r-spatial.org/r/2023/05/15/evolution4.html and https://github.com/r-spatial/evolution
#> rgdal: version: 1.6-7, (SVN revision 1203)
#> Geospatial Data Abstraction Library extensions to R successfully loaded
#> Loaded GDAL runtime: GDAL 3.6.2, released 2023/01/02
#> Path to GDAL shared files: C:/Users/santi/AppData/Local/R/win-library/4.3/rgdal/gdal
#>  GDAL does not use iconv for recoding strings.
#> GDAL binary built with GEOS: TRUE 
#> Loaded PROJ runtime: Rel. 9.2.0, March 1st, 2023, [PJ_VERSION: 920]
#> Path to PROJ shared files: C:/Users/santi/AppData/Local/R/win-library/4.3/rgdal/proj
#> PROJ CDN enabled: FALSE
#> Linking to sp version:2.1-1
#> To mute warnings of possible GDAL/OSR exportToProj4() degradation,
#> use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
#> Registered S3 methods overwritten by 'adehabitatMA':
#>   method                       from
#>   print.SpatialPixelsDataFrame sp  
#>   print.SpatialPixels          sp
#> rgeos version: 0.6-4, (SVN revision 699)
#>  GEOS runtime version: 3.11.2-CAPI-1.17.2 
#>  Please note that rgeos will be retired during October 2023,
#> plan transition to sf or terra functions using GEOS at your earliest convenience.
#> See https://r-spatial.org/r/2023/05/15/evolution4.html for details.
#>  GEOS using OverlayNG
#>  Linking to sp version: 2.1-1 
#>  Polygon checking: TRUE

# First we will create a folder (ENMTML_example) within the working directory
getwd() # Working directory of the R session
#> [1] "C:/Users/santi/Documents/GitHub/ENMTML_RStudio/vignettes"
d_ex <- file.path(getwd(), 'ENMTML_example')
d_ex
#> [1] "C:/Users/santi/Documents/GitHub/ENMTML_RStudio/vignettes/ENMTML_example"
dir.create(d_ex)

# We will now save ENMTML data set within the ENMTML_example folder

# Virtual species occurrences
data("occ")

# Note the format of the .txt file (tab-separated)!
# We only need three columns:
  # one containing the species name
  # one containing the longitude
  # one containing the latitude

knitr::kable(occ[c(1:4,43:46,90:94,127:130),])
     species          x           y
1    Sp_14    -55.37526  -31.708447
2    Sp_14    -54.20860  -25.041806
3    Sp_14    -58.29191  -29.125124
4    Sp_14    -57.29192  -27.291798
43   Sp_17    -73.12519  -15.041847
44   Sp_17    -70.87520  -15.458512
45   Sp_17    -69.12520  -17.708503
46   Sp_17    -70.45853  -15.541844
90   Sp_18    -61.37523  -26.625134
91   Sp_18    -61.29190  -26.958465
92   Sp_18    -61.37523  -23.125148
93   Sp_18    -61.04190  -24.125144
94   Sp_18    -62.29190  -26.458468
127  Sp_34    -67.29188   -5.041887
128  Sp_34    -71.20853   -7.458544
129  Sp_34    -68.20854   -7.208545
130  Sp_34    -68.95854   -4.291889
d_occ <- file.path(d_ex, 'occ.txt')
utils::write.table(occ, d_occ, sep = '\t', row.names = FALSE)

# Five bioclimatic variables for current conditions
data("env")
d_env <- file.path(d_ex, 'current_env_var')
dir.create(d_env)
raster::writeRaster(env, file.path(d_env, names(env)), bylayer=TRUE, 
                    overwrite=TRUE, format='GTiff')

# shell.exec(d_ex) # open the directory and folders created
rm(list = c('env','occ'))

# We now have the minimum data required to create models with the ENMTML package:
# a directory with environmental rasters and a .txt file with occurrences.

The following objects contain the path to the folder with the predictors and the path to the occurrence file (.txt):

  • d_occ : File path to the species occurrence file (.txt; tab-separated)
  • d_env : Directory path with the current environmental conditions (here as GeoTIFF files)
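
Before fitting anything, it can help to confirm that both inputs look as expected. The sketch below is one possible check in base R; `check_enmtml_inputs()` is a hypothetical helper written for this tutorial, not an ENMTML function, and the `.tif` pattern is an assumption that matches the GeoTIFF files written above.

```r
# Hypothetical helper (not part of ENMTML): basic checks on the two inputs.
check_enmtml_inputs <- function(occ_path, env_dir,
                                cols = c("species", "x", "y")) {
  stopifnot(file.exists(occ_path), dir.exists(env_dir))

  # Read the tab-separated occurrence file, including its header row
  occ <- utils::read.table(occ_path, header = TRUE, sep = "\t",
                           stringsAsFactors = FALSE)

  # The three required columns must all be present
  missing_cols <- setdiff(cols, names(occ))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }

  # Count raster files; the .tif extension is an assumption for this example
  rasters <- list.files(env_dir, pattern = "\\.tif$", ignore.case = TRUE)

  list(n_records = nrow(occ),
       n_species = length(unique(occ[[cols[1]]])),
       n_rasters = length(rasters))
}

# Usage with the objects created above:
# check_enmtml_inputs(d_occ, d_env)
```

The returned list gives the number of occurrence records, species and raster files, which is usually enough to spot a wrong separator or an empty predictor folder.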

Model design

We will now fit models with the following specifications:

  • Five different algorithms: Bioclim, Generalized Linear Model, Random Forests, Support Vector Machine and Maxent
  • Random pseudo-absence allocation
  • No accessible area restriction
  • Random partition into training and test datasets (k-fold with 2 folds)
  • No Ensemble
  • No Bias correction
  • No Collinearity control in the predictors
  • Binary maps will be produced using the threshold that maximizes the sum of sensitivity and specificity (MAX_TSS)
require(ENMTML)

ENMTML(
 pred_dir = d_env,
 proj_dir = NULL,
 result_dir = file.path(d_ex,"Result"),
 occ_file = d_occ,
 sp = 'species',
 x = 'x',
 y = 'y',
 min_occ = 10,
 thin_occ = NULL,
 eval_occ = NULL,
 colin_var = NULL,
 imp_var = FALSE,
 sp_accessible_area = NULL,
 pseudoabs_method = c(method = 'RND'),
 pres_abs_ratio = 1,
 part = c(method = 'KFOLD', folds = '2'),
 save_part = FALSE,
 save_final = TRUE,
 algorithm = c('BIO','GLM', 'RDF', 'SVM', 'MXD'),
 thr = c(type='MAX_TSS'),
 msdm = NULL,
 ensemble = NULL,
 extrapolation = FALSE,
 cores = 1
)
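
To make the `thr = c(type='MAX_TSS')` choice more concrete, the sketch below shows the general idea of a threshold that maximizes sensitivity plus specificity. It is an illustration in base R under simple assumptions (every observed suitability value is a candidate threshold; cells at or above the threshold count as predicted presences); `max_tss_threshold()` and the toy suitability values are hypothetical and do not reproduce ENMTML's internal code.

```r
# Hypothetical helper: find the threshold that maximizes sensitivity + specificity.
max_tss_threshold <- function(pres_suit, abs_suit) {
  # Candidate thresholds: every observed suitability value
  cand <- sort(unique(c(pres_suit, abs_suit)))

  # Sensitivity: share of presences predicted as present (>= threshold)
  sens <- vapply(cand, function(t) mean(pres_suit >= t), numeric(1))
  # Specificity: share of (pseudo-)absences predicted as absent (< threshold)
  spec <- vapply(cand, function(t) mean(abs_suit < t), numeric(1))

  tss <- sens + spec - 1
  best <- which.max(tss)  # the first maximum is kept when several thresholds tie
  c(threshold = cand[best], TSS = tss[best])
}

# Toy suitability values at presence and pseudo-absence points
pres_suit <- c(0.9, 0.8, 0.7, 0.4)
abs_suit  <- c(0.6, 0.3, 0.2, 0.1)
max_tss_threshold(pres_suit, abs_suit)
```

Cells with suitability at or above the returned threshold would be mapped as presences in the binary map, and all other cells as absences.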

Results

The ENMTML function will create a folder named Result within the ENMTML_example folder, containing the Algorithms sub-folder and several .txt files:

  • Evaluation_Table.txt : Results of the model evaluation, with several metrics
  • InfoModeling.txt : Information about the chosen modeling parameters
  • Number_Unique_Occurrences.txt : Number of unique occurrences for each species
  • Occurrences_Cleaned.txt : Dataset produced after selecting a single occurrence per grid cell (unique occurrences)
  • Thresholds_Algorithm.txt : Information about the thresholds used to create the presence-absence maps for each algorithm (presence-absence maps are created from the threshold of the complete models)
  • Moran & Mess : Information about autocorrelation and environmental similarity between the datasets used to fit and evaluate the model
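
These tables can be loaded back into R for inspection. The sketch below assumes the result files are tab-separated with a header row, which is an assumption rather than something stated here; `read_result_table()` is a hypothetical helper, not an ENMTML function.

```r
# Hypothetical helper: read one ENMTML result table.
# Assumes a tab-separated file with a header row; adjust if your file differs.
read_result_table <- function(result_dir, file_name = "Evaluation_Table.txt") {
  path <- file.path(result_dir, file_name)
  if (!file.exists(path)) stop("File not found: ", path)
  utils::read.delim(path, header = TRUE, stringsAsFactors = FALSE)
}

# Usage with this tutorial's objects (run after ENMTML() has finished):
# eval_tab <- read_result_table(file.path(d_ex, "Result"))
# str(eval_tab)  # inspect the available columns before relying on any name
```

Checking the structure with `str()` first avoids depending on column names that may differ between package versions.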

What’s next?

Next, we will look at several methodological refinements to the standard model and how they are incorporated into our modelling routine.
We will also cover projections to other periods and extents, include other algorithms, and produce an ensemble model!

We hope you understood the logic behind ENMTML and were able to produce your first models!

See you next time!

Feel free to contact us by e-mail (you can find André's and Santiago's e-mail addresses at the end of the GitHub page).