8 Functional traits and phylogenetic analyses

Besides the number and abundance of taxa, a very important aspect of ecological communities includes the ecological differences between the organisms composing them. This variation can be expressed in terms of functional traits, which are measurable features of organisms that influence how they respond to and affect the environment and how they interact with other organisms. If we have two communities with five taxa each, but the taxa present in the first one have very different traits while those present in the second are very similar, we have higher functional diversity in the first community. In addition to traits, we can use phylogenetic information to assess how much phylogenetically related the present taxa are.


load("community.rda")

traits <- read.table("vas.plant.traits.txt")
# seed weight mg, clonal spread classes, leaf size classes. NB! Many NA values!
head(traits)
##   seed.weight..g. clonal leaf
## 1            6.40      1   NA
## 2            2.36      1    5
## 3            4.00      2    4
## 4            3.16      2   NA
## 5              NA      1   NA
## 6            2.56      3    4

8.1 Weighted mean

The average trait value of a sample, weighted according to species’ abundances. It shows which trait value is the most common. If our traits have a very skewed distribution (which is generally the case with seed mass), it is recommendable to work with the logarithm, to avoid species with very high values being extremely influential (the mean is sensitive to outliers):

hist(traits$seed.weight..g.) # very skewed

hist(log(traits$seed.weight..g.)) # looks better

traits$seed.weight..g.  <- log(traits$seed.weight..g.)

mean.seed <- numeric() # empty numeric object
for (i in 1:nrow(vas.plants)) {
  mean.seed[i] <- weighted.mean(traits$seed.weight..g., 
                                w = vas.plants[i, ], 
                                na.rm = T)
}

mean.seed #In the log-scale
##  [1]  0.7587248981 -1.0523588258  0.0755782788 -0.3534966954 -0.4341183939
##  [6] -1.8067751943  1.3512422403  0.2903592473 -2.3447292993  0.2643987001
## [11]  0.4283597436 -0.0004662142  0.6720201664 -0.0751237178  0.7660636021
## [16]  0.2171485442  0.7208543721  0.3357505635 -0.1310540885  0.0600293211
## [21] -0.7430225683  0.0250576883 -0.5571641625 -0.9151129339 -0.0155489843
## [26] -0.0655949418  0.0077888053 -0.5051300122 -0.1111478318  0.6928334449
exp(mean.seed) #In the original scale
##  [1] 2.13555144 0.34911328 1.07850765 0.70222831 0.64783555 0.16418274
##  [7] 3.86222036 1.33690768 0.09587315 1.30264746 1.53473809 0.99953389
## [13] 1.95818920 0.92762872 2.15128127 1.24252866 2.05618921 1.39899002
## [19] 0.87717033 1.06186768 0.47567398 1.02537427 0.57283122 0.40047140
## [25] 0.98457128 0.93651013 1.00781922 0.60342712 0.89480646 1.99937263

calculate mean clonal spread and leaf size for sites.

Answer
mean.clonal <- numeric() # empty numeric object
for (i in 1:nrow(vas.plants)) {
  mean.clonal[i] <- weighted.mean(traits[, 2], vas.plants[i, ], na.rm = T)
}
mean.clonal
##  [1] 1.742857 2.076923 1.581395 2.163636 2.705882 2.000000 1.812500 1.136364
##  [9] 2.800000 1.679245 1.571429 1.685714 1.800000 1.761905 2.000000 1.722222
## [17] 1.685714 1.695652 2.200000 1.543860 2.764706 1.693878 2.250000 2.500000
## [25] 1.833333 1.900000 2.250000 2.473684 2.333333 1.866667

mean.leaf <- numeric()
for (i in 1:nrow(vas.plants)) {
  mean.leaf[i] <- weighted.mean(traits[, 3], vas.plants[i, ], na.rm = T)
}
mean.leaf
##  [1] 3.517241 2.923077 3.303030 3.545455 3.080000 3.523810 3.722222 3.705882
##  [9] 3.000000 3.333333 3.250000 3.333333 3.300000 3.108108 3.368421 3.562500
## [17] 3.470588 3.375000 3.000000 3.794118 3.000000 3.577778 3.000000 3.000000
## [25] 3.382353 3.222222 3.250000 3.166667 3.125000 3.400000

8.2 mean trait values for dark diversity

We can test the hypothesis that traits are linked to processes which limit species presences in communities. We can do this by comparing the mean trait values for observed and dark diversity using a paired t.test:

## Using script for dark diversity from Topic 4 
library(DarkDiv)
dark <- DarkDiv(vas.plants)$Dark > 0.7 # Defining dark
dark[is.na(dark)] <- 0 # Replacing NA with zeros


mean.seed.dark <- numeric()
for (i in 1:nrow(dark)) {
  mean.seed.dark[i] <-  weighted.mean(traits$seed.weight..g., dark[i, ], na.rm = T)
}

plot(mean.seed, mean.seed.dark, asp = T)
abline(0, 1)



# Hypothesis: large seeds establish more easily in community (more resources), 
# so that we expect to find in average larger seed in observed community compared 
# to dark diversity

t.test(mean.seed - mean.seed.dark, alternative = "greater") # one sided question!
## 
##  One Sample t-test
## 
## data:  mean.seed - mean.seed.dark
## t = 0.24958, df = 28, p-value = 0.4024
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
##  -0.3919486        Inf
## sample estimates:
##  mean of x 
## 0.06739051

# t-test assumption: normal distribution:
shapiro.test(mean.seed - mean.seed.dark) #OK
## 
##  Shapiro-Wilk normality test
## 
## data:  mean.seed - mean.seed.dark
## W = 0.94414, p-value = 0.1287

## Non-parametric alternative is wilcox.test()

8.3 Functional space of taxa

Besides mean values, we can also look at how large is the functional space covered by a taxa within a site. Size of the functional space describes functional diversity. NB! Related to species richness!


traits.scale <- scale(traits)  # scaling traits 

row.names(traits.scale) <- names(vas.plants) # putting names of taxa

traits.scale <- traits.scale[complete.cases(traits), ] # omitting taxa with NA
vas.plants.traits <- vas.plants[, complete.cases(traits)] # same with community data 


plot(traits.scale[, 1], traits.scale[, 3], xlab = "Seed size", 
     ylab = "Leaf size")
i <- 1 # id for sample
site1 <- data.frame(x = traits.scale[vas.plants.traits[i, ] > 0, 1], 
                    y = traits.scale[vas.plants.traits[i, ] > 0, 3])
rect(min(site1$x),
     min(site1$y),
     max(site1$x),
     max(site1$y),
     col = rgb(0.5, 0, 0, 0.5))

Show the range rectangle for sites 2 and 3. Which one has the largest functional diversity?

Answer

i=2 # id for sample
site1=data.frame(x=traits.scale[vas.plants.traits[i,]>0,1],
                 y=traits.scale[vas.plants.traits[i,]>0,3])
rect(min(site1$x),min(site1$y),max(site1$x),max(site1$y),col=rgb(0,0.5,0,0.5)) # changed rgb parameters (red, green, blue, transparency)
## Error in rect(min(site1$x), min(site1$y), max(site1$x), max(site1$y), : plot.new has not been called yet

# smaller

i=3 # id for sample
site1=data.frame(x=traits.scale[vas.plants.traits[i,]>0,1],
                 y=traits.scale[vas.plants.traits[i,]>0,3])
rect(min(site1$x),min(site1$y),max(site1$x),max(site1$y),col=rgb(0,0,0.5,0.5))
## Error in rect(min(site1$x), min(site1$y), max(site1$x), max(site1$y), : plot.new has not been called yet

# larger

We can find seed size range for all sites using the apply function.


seed.range <- apply(vas.plants.traits, 1, 
                    function(x){max(traits.scale[x > 0, 1]) - 
                      min(traits.scale[x > 0, 1])})
seed.range
##   X001Vapramae       X002Illi   X003Vitipalu    X004Konguta    X005Vehendi 
##      2.2006137      3.2824281      3.2916288      2.1730898      2.7548619 
##    X006.Erumae   X007Porgumae     X008Aardla   X009Kurepalu  X010Sarakuste 
##      2.7181131      1.6690650      1.5723804      2.7025766      2.0717218 
##      X011Kannu      X012Vonnu     X013Rookse    X014Kammeri     X015Kambja 
##      1.7524058      2.0457201      1.7172918      4.0587498      1.6690650 
##      X016Reola     X017Ignase    X018Unikula X019Haavametsa X020Kruusahaua 
##      2.0457201      1.6846015      1.6846015      0.5798515      1.5666966 
##      X021Aravu  X026Pedajamae       X027Miti    X028Puhaste  X031Pimmelaan 
##      0.3284283      4.9031903      0.5798515      4.0587498      4.3871781 
##     X032Logina  X033Voorepalu X036Taevaskoja    X044Savimae X073Pausakunnu 
##      1.6846015      1.6846015      1.2355453      1.6846015      1.7172918

Calculate leaf size range for all sites. Find if these two ranges (seed weight and leaf size) are related to species richness.

Answer

leaf.range=apply(vas.plants.traits,1,function(x) max(traits.scale[x>0,2])-min(traits.scale[x>0,2]))

shapiro.test(rowSums(vas.plants>0)) # Ok
## 
##  Shapiro-Wilk normality test
## 
## data:  rowSums(vas.plants > 0)
## W = 0.97798, p-value = 0.7696
shapiro.test(seed.range) # Not OK
## 
##  Shapiro-Wilk normality test
## 
## data:  seed.range
## W = 0.90508, p-value = 0.0112

cor.test(rowSums(vas.plants>0),seed.range, method="spearman")
## Warning in cor.test.default(rowSums(vas.plants > 0), seed.range, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  rowSums(vas.plants > 0) and seed.range
## S = 3051.5, p-value = 0.08356
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##      rho 
## 0.321145

shapiro.test(leaf.range) # not ok
## 
##  Shapiro-Wilk normality test
## 
## data:  leaf.range
## W = 0.27344, p-value = 4.243e-11
cor.test(rowSums(vas.plants>0),leaf.range,method="spearman")
## Warning in cor.test.default(rowSums(vas.plants > 0), leaf.range, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  rowSums(vas.plants > 0) and leaf.range
## S = 2766.1, p-value = 0.03584
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.3846284

8.4 Functional diversity based on mean pairwise distance

Now we will calculate the average functional distance between all taxa within a sample. This is a functional diversity measure which is NOT necessarily related to species richness (which is a good thing in general, because if functional diversity is trivially related to species richness, it does not convey much new information). We use the function melodic from this paper:

de Bello, F., Carmona, C.P., Lepš, J., Szava-Kovats, R. & Pärtel, M. 2016. Functional diversity through the mean trait dissimilarity: resolving shortcomings with existing paradigms and algorithms. Oecologia 180: 933-940. https://link.springer.com/article/10.1007%2Fs00442-016-3546-0

Supp. material 3 includes the function melodic: https://static-content.springer.com/esm/art%3A10.1007%2Fs00442-016-3546-0/MediaObjects/442_2016_3546_MOESM3_ESM.r

Just read the function:

##MEan DIssimilarity Components##
#samp:  community matrix; sites in lines, species in columns
#dis:   dissimilarity matrix
#type:  "both" for results with abundances weighted and non-weighted
#       "abundance" for results with abundances weighted
#       "presence" for results with abundances non-weighted
  
melodic <- function(samp,dis,type="both"){
    samp <- as.matrix(samp)
  dis <- as.matrix(dis)
  if(is.null(colnames(samp)) | is.null(colnames(dis)) ){
    stop("Both samp and dis must have colnames.\n")
  }
  N<-dim(samp)[1]
    melodic<-list()
    if (type=="both"){
        melodic$abundance<-list()
        melodic$abundance$mpd<-melodic$abundance$rao<-melodic$abundance$simpson<-numeric(N)
        melodic$presence<-list()
        melodic$presence$mpd<-melodic$presence$rao<-melodic$presence$simpson<-numeric(N)
    }
    if (type=="abundance"){ 
        melodic$abundance<-list()
        melodic$abundance$mpd<-melodic$abundance$rao<-melodic$abundance$simpson<-numeric(N)
    }
    if (type=="presence"){ 
        melodic$presence<-list()
        melodic$presence$mpd<-melodic$presence$rao<-melodic$presence$simpson<-numeric(N)
    }
    for (i in 1:N){
        sppInSample<-names(samp[i,samp[i, ]>0])
        melodic$richness[i]<-rowSums(samp>0)[i]
        if (length(sppInSample)>1){
            sample.dis<-dis[sppInSample,sppInSample]
            abund.w<-numeric(length(sppInSample))
            if (type=="both" | type=="abundance"){
                abund.w <- samp[i , sppInSample] / sum(samp[i , sppInSample])
                sample.weights <- outer(abund.w , abund.w)
                melodic$abundance$mpd[i] <- weighted.mean(sample.dis[lower.tri(sample.dis)] , sample.weights[lower.tri(sample.weights)])
                melodic$abundance$rao[i] <- sum(sample.weights * sample.dis)
                melodic$abundance$simpson[i] <- sum(2*sample.weights[lower.tri(sample.weights)])
            }   
            if (type=="both" | type=="presence"){
                abund.nw <- rep(1 , length(sppInSample)) / length(sppInSample)
                sample.weights <- outer(abund.nw , abund.nw)
                melodic$presence$mpd[i] <- weighted.mean(sample.dis[lower.tri(sample.dis)] , sample.weights[lower.tri(sample.weights)])
                melodic$presence$rao[i] <- sum(sample.weights * sample.dis)
                melodic$presence$simpson[i] <- sum(2*sample.weights[lower.tri(sample.weights)])
            }   
        }   else {
            if (type=="both" | type=="abundance"){
                melodic$abundance$mpd[i] <- NA
                melodic$abundance$rao[i] <- melodic$abundance$simpson[i] <-0
            }
            if (type=="both" | type=="presence"){
                melodic$presence$mpd[i] <- NA
                melodic$presence$rao[i] <- melodic$presence$simpson[i] <-0
            }
        }
    }   
    out<-melodic
    return(out) 
}

############

8.5 Using the melodic function


# We need community data and trait dissimilarity between taxa

trait.dist <- dist(traits.scale) # distance matrix

## Let's visualise as a dendrogram

plot(as.dendrogram(hclust(trait.dist, "ward.D2")))


## Now the function using species abundances and Mean Pairwise Distance

mpd <- melodic(vas.plants.traits,trait.dist)$presence$mpd
mpd
##  [1] 2.2047313 2.3314668 2.5644992 1.9284481 2.3290041 2.7209281 1.9670262
##  [8] 2.5384878 1.9596808 2.0268937 1.7803221 2.0382592 1.8762759 2.0167115
## [15] 1.8866939 1.9862419 1.9897153 1.9635251 2.3840487 2.1270476 0.2189522
## [22] 2.3627634 1.5927772 2.3374001 2.0137481 1.8481704 1.8662330 1.7984857
## [29] 1.5039213 2.0659289

Test whether mpd correlates with species richness. Make a graph.

Answer
shapiro.test(rowSums(vas.plants>0)) #OK
## 
##  Shapiro-Wilk normality test
## 
## data:  rowSums(vas.plants > 0)
## W = 0.97798, p-value = 0.7696
shapiro.test(mpd) # Not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  mpd
## W = 0.79616, p-value = 5.585e-05
cor.test(rowSums(vas.plants>0),mpd,method="spearman") # 
## Warning in cor.test.default(rowSums(vas.plants > 0), mpd, method = "spearman"):
## Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  rowSums(vas.plants > 0) and mpd
## S = 4278.5, p-value = 0.8004
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## 0.04817142
plot(rowSums(vas.plants>0),mpd)

8.6 Phylogenetic tree

Similarly to functional distances, we can also use phylogenetic distances between taxa. These distances basically represent the latest time when two taxa had a common ancestor (i.e. when they diverged). We will use the packages ape and phytools.


library(ape) 

phy <- read.tree("phy.tree.newick.txt")

plot(phy, cex = 0.5)  # not easy to read


# A circular plot

library(phytools)
plotTree(phy, type = "fan", part = 0.95, fsize = 0.5)
h <- round(max(nodeHeights(phy)) / 10) * 10
axis(1, pos = -0.05 * h, at = seq(0, h, by = h / 2), lwd = 2, cex.axis = 0.7)


# NB! Axis with millions of years (up to 400 millions)

8.7 Phylogenetic diversity

We can estimate the phylogenetic distance between all pairs of species and use again the melodic function to calculate phylogenetic diversity.

# The first thing to check is whether the names of species in the tree and the abundance matrix match:
phy$tip.label
##  [1] "Lycopodium_annotinum"      "Equisetum_sylvaticum"     
##  [3] "Equisetum_pratense"        "Pteridium_aquilinum"      
##  [5] "Gymnocarpium_dryopteris"   "Phegopteris_connectilis"  
##  [7] "Athyrium_filix-femina"     "Dryopteris_filix-mas"     
##  [9] "Dryopteris_carthusiana"    "Dryopteris_expansa"       
## [11] "Paris_quadrifolia"         "Maianthemum_bifolium"     
## [13] "Polygonatum_odoratum"      "Convallaria_majalis"      
## [15] "Goodyera_repens"           "Luzula_pilosa"            
## [17] "Carex_canescens"           "Carex_globularis"         
## [19] "Carex_digitata"            "Melica_nutans"            
## [21] "Brachypodium_pinnatum"     "Festuca_ovina"            
## [23] "Deschampsia_flexuosa"      "Deschampsia_cespitosa"    
## [25] "Calamagrostis_arundinacea" "Milium_effusum"           
## [27] "Poa_nemoralis"             "Molinia_caerulea"         
## [29] "Asarum_europaeum"          "Anemone_nemorosa"         
## [31] "Hepatica_nobilis"          "Ranunculus_cassubicus"    
## [33] "Ranunculus_auricomus"      "Geranium_sylvaticum"      
## [35] "Circaea_alpina"            "Epilobium_angustifolium"  
## [37] "Oxalis_acetosella"         "Viola_riviniana"          
## [39] "Viola_canina"              "Viola_mirabilis"          
## [41] "Viola_epipsila"            "Urtica_dioica"            
## [43] "Rubus_saxatilis"           "Fragaria_vesca"           
## [45] "Potentilla_erecta"         "Lathyrus_vernus"          
## [47] "Moehringia_trinervia"      "Stellaria_nemorum"        
## [49] "Stellaria_holostea"        "Impatiens_parviflora"     
## [51] "Impatiens_noli-tangere"    "Lysimachia_europaea"      
## [53] "Lysimachia_vulgaris"       "Chimaphila_umbellata"     
## [55] "Orthilia_secunda"          "Pyrola_minor"             
## [57] "Calluna_vulgaris"          "Vaccinium_vitis-idaea"    
## [59] "Vaccinium_myrtillus"       "Anthriscus_sylvestris"    
## [61] "Aegopodium_podagraria"     "Angelica_sylvestris"      
## [63] "Campanula_persicifolia"    "Scorzonera_humilis"       
## [65] "Hieracium_lachenalii"      "Lactuca_muralis"          
## [67] "Crepis_paludosa"           "Solidago_virgaurea"       
## [69] "Pulmonaria_obscura"        "Galium_album"             
## [71] "Galium_palustre"           "Veronica_officinalis"     
## [73] "Veronica_chamaedrys"       "Melampyrum_pratense"      
## [75] "Melampyrum_sylvaticum"     "Melampyrum_nemorosum"     
## [77] "Galeopsis_speciosa"        "Lamium_galeobdolon"
colnames(vas.plants)
##  [1] "ACTAspic"   "AEGOpoda"   "ANEMnemo"   "ASAReuro"   "ATHYfil.fe"
##  [6] "BRACpinn"   "CALAarun"   "CALLvulg"   "CAREdigi"   "CIRCalpi"  
## [11] "CONVmaja"   "DESCflex"   "DRYOcart"   "DRYOexpa"   "EPILangu"  
## [16] "FESTovin"   "FRAGvesc"   "GALElute"   "GERAsylv"   "GYMNdryo"  
## [21] "HEPAnobi"   "IMPAnoli"   "LATHvern"   "LUZUpilo"   "LYCOanno"  
## [26] "MAIAbifo"   "MELAprat"   "MELInuta"   "MILIeffu"   "MOLIcaer"  
## [31] "MYCEmura"   "ORTHsecu"   "OXALacet"   "PARIquad"   "PTERaqui"  
## [36] "RANUcass"   "RUBUsaxa"   "STELholo"   "STELnemo"   "TRIEeuro"  
## [41] "VACCmyrt"   "VACCvit.id" "VEROcham"   "VIOLmira"   "ANGEsylv"  
## [46] "ANTHsylv"   "CAMPpers"   "CAREcane"   "CAREglob"   "CHIMumbe"  
## [51] "CREPpalu"   "DESCcaes"   "DRYOfili"   "EQUIprat"   "EQUIsylv"  
## [56] "GALEOPsp"   "GALIalbu."  "GALIpalu"   "GOODrepe"   "HIERvulg"  
## [61] "IMPAparv"   "LYSIvulg"   "MELAnemo"   "MELAsylv"   "MOERtrin"  
## [66] "PHEGconn"   "POAnemo"    "POLYodor"   "POTEerec"   "PULMobsc"  
## [71] "PYROmino"   "RANUauri."  "SCORhumi"   "SOLIvirg"   "URTIdioi"  
## [76] "VEROoffi"   "VIOLcani"   "VIOLepip"   "VIOLrivi."
# They dont. The first problem is that the names in the abundance matrix are
# shortened, showing the first four characters of the genus in CAPITAL letters and
# the first four characters of the epiteton in small letters. 
# Lets transform the names from the phylogeny:
new.names <- character()
for(i in 1:length(phy$tip.label)){
  separateNames <- strsplit(phy$tip.label[i], split  = "_")[[1]]
  separateNames[1] <- toupper(separateNames[1])
  new.names[i] <- paste0(substr(separateNames[1], 1, 4), 
                         substr(separateNames[2], 1, 4))
}
# We can now replace names in the phylogeny:
phy$tip.label <- new.names

# The next issue is with the column names. Some species have a dot (".") which we
# dont want to keep:
colnames2 <- gsub(pattern = "\\.", replacement = "", x = colnames(vas.plants))
#There is a species with an extra character (9 instead of 8).  Lets trim them all:
colnames2 <- substr(colnames2, 1, 8)
# Now lets substitute colnames:
colnames(vas.plants) <- colnames2

#Not all species in the phylogeny are in the community matrix, and vice versa.
# Lets keep only common species:
vas.plants2 <- vas.plants[, which(colnames2 %in% phy$tip.label)]
phy2 <- drop.tip(phy, tip = setdiff(phy$tip.label, colnames2))

#Finally, lets reorder columns so that the species match the names order in the phylogeny:
vas.plants2 <- vas.plants2[, phy2$tip.label]

# Distance within tree
phy.dis <- cophenetic(phy2) # distance between tree tips!

phy.mpd <- melodic(vas.plants2, phy.dis)$presence$mpd

phy.mpd
##  [1] 387.20367 271.45177 294.67478 461.78523 296.81682 557.92382 365.86018
##  [8] 516.32730  70.20136 352.97152 462.00656 392.54300 444.89736 376.17377
## [15] 408.73964 434.16348 419.69530 422.80277 314.98768 420.57579 580.14899
## [22] 345.41456 278.04621 420.47415 404.45586 306.59898 295.91345 436.52068
## [29] 502.32523 505.32190

NB! You can use cophenetic distance for any (dis)similarity tree (community composition, trait similarity of species, etc.).

Test whether phylogenetic mpd is related to species richness. How is it related to functional mpd?

Answer
shapiro.test(phy.mpd) # OK
## 
##  Shapiro-Wilk normality test
## 
## data:  phy.mpd
## W = 0.94619, p-value = 0.1336
cor.test(rowSums(vas.plants2>0),phy.mpd)
## 
##  Pearson's product-moment correlation
## 
## data:  rowSums(vas.plants2 > 0) and phy.mpd
## t = 1.1955, df = 28, p-value = 0.2419
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1519618  0.5379331
## sample estimates:
##       cor 
## 0.2203722
cor.test(phy.mpd,mpd,method="spearman") # mpd was not normally distributed
## 
##  Spearman's rank correlation rho
## 
## data:  phy.mpd and mpd
## S = 5190, p-value = 0.413
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.1546162

8.8 Community assembly tests with traits

With traits and phylogeny we can make community assembly tests. These tests check if functional trait values or phylogenetic history are randomly assembled in a community, or if taxa are more similar (or dissimilar) than expected by chance. If there are more similar species, then often it has been explained by habitat filtering (similar taxa fits to similar habitats). On the other hand, when co-existing species are more dissimilar than expected, it has been explained by competition – similar species compete most strongly and might exclude each other.

mpd <- melodic(vas.plants.traits > 0, trait.dist, "abundance")$abundance$mpd

mpd.rand <- replicate(100, 
                      melodic(apply(vas.plants.traits > 0, 2, sample), 
                              trait.dist, "abundance")$abundance$mpd)

## Function replicate makes the same thing n times (here 100)
## Functions apply and sample randomize all taxa across all samples

mean.rand <- apply(mpd.rand, 1, mean, na.rm = T) # mean random functional diversity

sd.rand <- apply(mpd.rand, 1, sd, na.rm = T) # standard deviation

ses <- (mpd - mean.rand) / sd.rand # SES = standardized effect size, calculated from 
# observed and randomized values, is usually normally distributed, values >1.96 and < -1.96 are significant.

abs(ses) > 1.96
##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE  TRUE FALSE

# We can visualize this by making histograms of random functional diversities and showing the observed value as a line.

i = 1 #sample ID
hist(mpd.rand[i, ], xlim = range(mpd.rand[i, ], mpd[i]), main = "",
  xlab = "Random functional diversity")
abline(v = mpd[i], col = "red", lwd = 3)
mtext("Obs.", 3, at = mpd[i], col = "red") # mtext writes text outside the figure box

# P-value can be calculated as proportion of random observations more extreme than the empirical value

p <- sum(mpd[i] < mpd.rand[i,]) / length(mpd.rand[i,])
p # one side
## [1] 0.38
1 - p # another side
## [1] 0.62
title(main = paste("P =", round(min(p, 1 - p), 3)))

Make the histogram for site no 3.

Answer
i=3 #sample ID
hist(mpd.rand[i,],xlim=range(mpd.rand[i,],mpd[i]),main="",xlab="Random functional diversity")
abline(v=mpd[i],col="red",lwd=3)
mtext("Obs.",3,at=mpd[i],col="red") # mtext writes text outside the figure box
p=sum(mpd[i]<mpd.rand[i,])/length(mpd.rand[i,])
p # one side
## [1] 0.08
1-p # another side
## [1] 0.92
title(main=paste("P =",round(min(p,1-p),3)))

8.9 Richness variance over all samples

Another community assembly test does not require traits or phylogeny. It just compares the variance in richness values with a null model where species are distributed randomly across samples (by keeping empirical species frequencies).

If observed variation is smaller than random expectation, then competition might limit the number of taxa. If the observed variation is larger, we likely have different species pools for different sites, some are larger than others.


var.obs <- var(rowSums(vas.plants > 0))
var.obs
## [1] 33.62759


random.table <- apply(vas.plants > 0, 2, sample)

var(rowSums(random.table))
## [1] 7.351724

var.rand <- replicate(100, var(rowSums(apply(vas.plants > 0, 2, sample))))

hist(var.rand, xlim = range(var.rand, var.obs), main = "", col = "blue")

abline(v = var.obs, col = "red", lwd = 3)
mtext("Obs.", 3, at = var.obs, col = "red")

p <-  sum(var.obs < var.rand) / length(var.rand)
p # one side
## [1] 0
1 - p # another side
## [1] 1
title(main = paste("P =", round(min(p, 1 - p), 3)))