Data for: Underprediction of extirpation and colonisation following climate and land-use change using species distribution models

Alistair G. Auffret, Hedvig Nenzén and Ester Polaina, 2024. Diversity and Distributions.

The dataset contains files for evaluating how climate and land-use change affect the accuracy of predictions from species distribution models (SDMs) of 84 Swedish plant species in four provinces between the periods 1910-1945 and 1990-2020. In this file, we briefly describe the components of the data set.

ANALYSED DATA

SpeciesData.csv
Each row is a species. Columns describe the prediction accuracy of the species distribution models (SDMs) in terms of the proportion of each species' observed colonisations, persistences and extirpations that were correctly predicted. The file also includes each species' associations with habitat and climate.

Species = Species name
TSS_cross = True Skill Statistic of SDMs, based on traditional 70:30 cross-validation method
TSS_ind = True Skill Statistic of SDMs, based on independent validation using modern observations
AUC_ind = AUC value of SDMs, based on independent validation using modern observations
pred_col = Predicted number of colonisations to a grid square
pred_per = Predicted number of persistences in a grid square
pred_ext = Predicted number of extirpations from a grid square
TSS_ind.lwr = True Skill Statistic of SDMs, based on independent validation using modern observations - lower presence threshold
pred_col.lwr = Predicted number of colonisations to a grid square - lower presence threshold
pred_per.lwr = Predicted number of persistences in a grid square - lower presence threshold
pred_ext.lwr = Predicted number of extirpations from a grid square - lower presence threshold
TSS_ind.upr = True Skill Statistic of SDMs, based on independent validation using modern observations - upper presence threshold
pred_col.upr = Predicted number of colonisations to a grid square - upper presence threshold
pred_per.upr = Predicted number of persistences in a grid square - upper presence threshold
pred_ext.upr = Predicted number of extirpations from a grid square - upper presence threshold
n.obs.col = Observed number of colonisations to a grid square
n.obs.per = Observed number of persistences in a grid square
n.obs.ext = Observed number of extirpations from a grid square
SCI_TempMean = Species mean temperature index, extracted from Auffret & Thomas (2019; https://doi.org/10.1111/gcb.14765)
SCI_TempRange = Species temperature range index
grass.spec = Level of grassland specialisation (0-10), calculated from Tyler et al. (2021; https://doi.org/10.1016/j.ecolind.2020.106923)
forest.spec = Level of forest specialisation (0-10), calculated from Tyler et al. (2021)
Schoeners.D = Schoener's D value of niche overlap between the historical and modern distributions, based on climate, land use and soil conditions
Schoeners.D.p = P-value of Schoener's D, with values of <0.05 indicating significant niche overlap over time

GridSquareData.csv
Each row is a 5 km grid cell of the Swedish national grid RT90. Columns describe the prediction accuracy of the species distribution models (SDMs) in terms of the proportion of the total number of observed colonisations, persistences and extirpations that were correctly predicted. The file also includes variables relating to the land-use change and climate change that occurred in each grid cell, as well as control variables relating to microclimatic heterogeneity, latitude and spatial autocorrelation.

EK_square = The name of the 5 x 5 km grid square, according to the Swedish RT90 grid system
ext = The number of observed extirpations from that grid square
frac_ext = The fraction of observed extirpations that were correctly predicted by the SDMs
frac_ext_lwr = The fraction of observed extirpations that were correctly predicted by the SDMs - lower presence threshold
frac_ext_upr = The fraction of observed extirpations that were correctly predicted by the SDMs - upper presence threshold
col = The number of observed colonisations to that grid square
frac_col = The fraction of observed colonisations that were correctly predicted by the SDMs
frac_col_lwr = The fraction of observed colonisations that were correctly predicted by the SDMs - lower presence threshold
frac_col_upr = The fraction of observed colonisations that were correctly predicted by the SDMs - upper presence threshold
per = The number of observed persistences in that grid square
frac_per = The fraction of observed persistences that were correctly predicted by the SDMs
frac_per_lwr = The fraction of observed persistences that were correctly predicted by the SDMs - lower presence threshold
frac_per_upr = The fraction of observed persistences that were correctly predicted by the SDMs - upper presence threshold
temp_change = The change in mean annual temperature that occurred between 1961-1970 and 2001-2010
micro_sd = The standard deviation of microclimatic (50 m resolution) mean annual temperature within the grid square, calculated from Meineri & Hylander (2017; https://doi.org/10.1111/ecog.02494)
grass_aba = The fraction of the grid square covered by forest in 2018 that was grassland in the 1940s-1960s, calculated from Auffret et al. (2018; https://doi.org/10.1111/2041-210X.12788) and https://www.naturvardsverket.se/verktyg-och-tjanster/kartor-och-karttjanster/nationella-marktackedata/ladda-ner-nationella-marktackedata/.
pcnm1 & pcnm2 = The first two eigenvectors of a principal coordinates of neighbour matrices (PCNM) analysis of the centroid coordinates of the grid squares, calculated using the vegan package in R
lat = Latitude of the grid square centroid, according to the RT90 system
MESS = MESS (multivariate environmental similarity surface) values, indicating novel environments between the historical and modern periods, based on the mean of maximum monthly temperatures and precipitation, and the cover of arable, open and forested land (the variables included in the SDMs that changed over time). Negative values indicate novel environments, i.e. SDM predictions involve extrapolation.
MESS.n.var = The number of variables responsible for negative MESS values
t.max.hist = The mean of monthly maximum temperatures during 1961-1970
t.max.mod = The mean of monthly maximum temperatures during 2001-2010

R CODE

AnalysisCode.R
Contains the code used to create the paper's figures and supplementary tables from the above two data tables.

SUPPLEMENTARY INFORMATION

We also include Supplementary Information.pdf, which is an appendix to the published paper.
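
As a minimal sketch of how the count columns in SpeciesData.csv relate to the proportions described for that file, the Python snippet below computes, for each species, the fraction of observed colonisations, persistences and extirpations that were predicted. It assumes that each pred_* column counts the correctly predicted events among the corresponding n.obs.* observed events, which this file does not state outright. The two rows are invented placeholders, not values from the dataset, and the function name fraction_predicted is hypothetical. The analysis used for the paper is in AnalysisCode.R.

```python
import csv
import io

# Invented placeholder rows using the column names documented above;
# these are NOT values from SpeciesData.csv.
SAMPLE = """Species,pred_col,pred_per,pred_ext,n.obs.col,n.obs.per,n.obs.ext
Species A,10,40,5,50,50,20
Species B,0,30,0,0,60,10
"""


def fraction_predicted(rows):
    """Return {species: {event: fraction or None}}.

    Assumes pred_<event> is the number of observed events of that type
    that the SDM predicted correctly (an assumption, not confirmed here).
    """
    out = {}
    for row in rows:
        fractions = {}
        for event in ("col", "per", "ext"):
            observed = float(row[f"n.obs.{event}"])
            predicted = float(row[f"pred_{event}"])
            # The fraction is undefined when no events of this type were observed.
            fractions[event] = predicted / observed if observed > 0 else None
        out[row["Species"]] = fractions
    return out


# To read the real file instead, pass
# csv.DictReader(open("SpeciesData.csv", newline="")).
result = fraction_predicted(csv.DictReader(io.StringIO(SAMPLE)))
for species, fractions in result.items():
    print(species, fractions)
```

Species with no observed events of a given type return None rather than zero, so that they can be excluded from any summary of prediction accuracy instead of being counted as failures.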