This vignette explains how to extract, plot, and statistically test for differences among BD (birth-death) or FBD (fossilized birth-death) parameters (e.g., net diversification, relative extinction (turnover), and relative fossilization) across the tree when using the Skyline BD or Skyline FBD tree models produced by MrBayes or BEAST2 (SA or BDSKY packages) since EvoPhylo v. 0.2.0.
FBD Parameters Statistics and Plots
Below we demonstrate how to extract evolutionary rate summary statistics from each node from a Bayesian clock (time-calibrated) summary tree produced by MrBayes, store them in a data frame, and produce summary tables and plots.
Load the EvoPhylo package
library(EvoPhylo)
1. Import combined log file from all runs.
This is produced by using combine_log(). Alternatively, users can also use LogCombiner from the BEAST2 software package. The first argument passed to combine_log() should be a path to the folder containing the log files to be imported and combined.
## Import all log (.p) files from all runs and combine them, with burn-in = 25% 
## and downsampling to 1000 samples in each log file
posterior3p <- combine_log("LogFiles3p", burnin = 0.25, downsample = 1000)
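As a rough illustration of what burn-in removal and downsampling amount to, the base R sketch below combines simulated logs from several runs. It is not the code used by combine_log(): the helper combine_runs_sketch() and the simulated data are hypothetical, and the sketch assumes that downsampling means keeping evenly spaced samples from each run.

```r
## Hypothetical helper: a simplified stand-in for a log combiner.
## It is NOT the implementation of combine_log().
combine_runs_sketch <- function(runs, burnin = 0.25, downsample = 1000) {
  trimmed <- lapply(runs, function(run) {
    ## Drop the first `burnin` proportion of samples from this run
    n_burn <- floor(nrow(run) * burnin)
    kept <- run[seq(n_burn + 1, nrow(run)), , drop = FALSE]
    ## Keep at most `downsample` evenly spaced samples (assumed behaviour)
    if (nrow(kept) > downsample) {
      idx <- unique(round(seq(1, nrow(kept), length.out = downsample)))
      kept <- kept[idx, , drop = FALSE]
    }
    kept
  })
  ## Stack the trimmed runs into a single data frame
  do.call(rbind, trimmed)
}

## Simulated logs from 4 runs with 4000 samples each
set.seed(1)
runs <- lapply(1:4, function(i) {
  data.frame(Gen = seq(0, by = 1000, length.out = 4000),
             LnL = rnorm(4000, mean = -1450, sd = 5))
})
combined <- combine_runs_sketch(runs, burnin = 0.25, downsample = 1000)
dim(combined)
```

With four runs and 1000 retained samples per run, the combined data frame in this sketch has 4000 rows.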
Below, we use the posterior dataset posterior3p that accompanies EvoPhylo.
The posterior data must first be transformed from wide to long to be used with the functions described below; FBD_reshape() accomplishes this.
## Reshape imported combined log file from wide to long with FBD_reshape
posterior3p_long <- FBD_reshape(posterior3p, variables = NULL, log.type = "MrBayes")
## Show first 5 lines of the reshaped log file
head(posterior3p_long, 5)
##       Gen       LnL      LnPr  TH.all.   TL.all. prop_ancfossil.all.   sigma.1.
## 1 8750000 -1449.425 -143.1907 5.271118 11.969460                   0 0.07660715
## 2 8761000 -1458.367 -174.9627 4.775064 11.070210                   0 0.05850396
## 3 8771000 -1449.445 -163.9216 5.927716 12.628460                   0 0.05182430
## 4 8782000 -1453.218 -153.2030 4.451376  9.931809                   0 0.14644520
## 5 8792000 -1461.906 -132.1172 5.095504 11.083810                   0 0.14143120
##     sigma.2.  sigma.3.      m.1.     m.2.     m.3. tk02var.1. tk02var.2.
## 1 1.33351500 0.8523453 0.3695799 1.544579 1.362332  0.3197728  0.3848931
## 2 0.06463618 0.1380557 0.5083868 1.495777 1.108471  0.2710006  0.3609312
## 3 0.67980130 0.7776142 0.4275609 1.569911 1.144364  0.2853423  0.1831945
## 4 0.65005980 0.2999867 0.6445027 1.329942 1.148377  0.4670378  0.3483061
## 5 0.52745340 1.3928490 0.4993570 1.368074 1.445410  0.2115789  0.2723863
##   tk02var.3. clockrate.all. Time_bin net_speciation relative_extinction
## 1  0.2075079     0.01192715        1     0.04987983           0.6785586
## 2  0.3622265     0.01086355        1     0.04675159           0.9174022
## 3  0.6146289     0.01349259        1     0.01064803           0.9677827
## 4  0.4949015     0.01016002        1     0.07373453           0.8976315
## 5  0.3463915     0.01160514        1     0.04990040           0.7887825
##   relative_fossilization
## 1            0.055629950
## 2            0.006523517
## 3            0.010535440
## 4            0.001264865
## 5            0.036796000
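To make the wide-to-long transformation concrete, here is a minimal base R sketch on simulated data. The column names (net_speciation_1, net_speciation_2, and so on) are assumed for illustration and may not match the names in an actual log file; FBD_reshape() handles the real naming for each log type.

```r
## Simulated wide log: one column per parameter and time bin
## (column names are assumptions made for this illustration)
set.seed(1)
wide <- data.frame(
  Gen = 1:5,
  net_speciation_1 = runif(5), net_speciation_2 = runif(5),
  relative_extinction_1 = runif(5), relative_extinction_2 = runif(5)
)

## Stack the per-bin columns so that each row is one sample in one time bin
long <- reshape(
  wide,
  direction = "long",
  varying = list(c("net_speciation_1", "net_speciation_2"),
                 c("relative_extinction_1", "relative_extinction_2")),
  v.names = c("net_speciation", "relative_extinction"),
  timevar = "Time_bin",
  times = 1:2,
  idvar = "Gen"
)
rownames(long) <- NULL
long
```

Each posterior sample now appears once per time bin, with Time_bin identifying the bin, which is the layout seen in the output above.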
 
2. Summarize FBD parameters by time bin
Summary statistics for each FBD parameter by time bin can be computed with FBD_summary():
## Summarize parameters by time bin and analysis
t3.1 <- FBD_summary(posterior3p_long)
t3.1
FBD parameters by time bin
| parameter | Time_bin | n | mean | sd | min | Q1 | median | Q3 | max | 
| net_speciation | 1 | 10000 | 0.04 | 0.02 | 0.00 | 0.03 | 0.04 | 0.06 | 0.17 | 
| net_speciation | 2 | 10000 | 0.03 | 0.02 | 0.00 | 0.02 | 0.03 | 0.04 | 0.12 | 
| net_speciation | 3 | 10000 | 0.02 | 0.01 | 0.00 | 0.01 | 0.02 | 0.03 | 0.12 | 
| net_speciation | 4 | 10000 | 0.05 | 0.02 | 0.00 | 0.03 | 0.05 | 0.06 | 0.12 | 
| relative_extinction | 1 | 10000 | 0.79 | 0.15 | 0.08 | 0.71 | 0.82 | 0.90 | 1.00 | 
| relative_extinction | 2 | 10000 | 0.93 | 0.05 | 0.55 | 0.90 | 0.93 | 0.96 | 1.00 | 
| relative_extinction | 3 | 10000 | 0.95 | 0.05 | 0.18 | 0.93 | 0.96 | 0.98 | 1.00 | 
| relative_extinction | 4 | 10000 | 0.03 | 0.10 | 0.00 | 0.00 | 0.00 | 0.01 | 0.97 | 
| relative_fossilization | 1 | 10000 | 0.04 | 0.05 | 0.00 | 0.01 | 0.02 | 0.05 | 0.72 | 
| relative_fossilization | 2 | 10000 | 0.07 | 0.04 | 0.00 | 0.04 | 0.06 | 0.09 | 0.36 | 
| relative_fossilization | 3 | 10000 | 0.01 | 0.02 | 0.00 | 0.00 | 0.01 | 0.02 | 0.54 | 
| relative_fossilization | 4 | 10000 | 0.04 | 0.11 | 0.00 | 0.00 | 0.00 | 0.02 | 0.99 | 
## Export the table
write.csv(t3.1, file = "FBD_summary.csv")
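For readers who want to see how such a table can be assembled by hand, the sketch below computes the same kinds of statistics per time bin in base R. It uses simulated values and a hypothetical helper, summarize_bin(); it is an illustration of the idea, not the code behind FBD_summary().

```r
## Simulated long-format data for a single parameter
set.seed(1)
long <- data.frame(
  Time_bin = rep(1:4, each = 100),
  net_speciation = rexp(400, rate = 25)
)

## Hypothetical helper: summary statistics for one time bin
summarize_bin <- function(x) {
  c(n = length(x), mean = mean(x), sd = sd(x), min = min(x),
    Q1 = unname(quantile(x, 0.25)), median = median(x),
    Q3 = unname(quantile(x, 0.75)), max = max(x))
}

## Apply the helper to each time bin and bind the results row-wise
stats_by_bin <- do.call(
  rbind,
  lapply(split(long$net_speciation, long$Time_bin), summarize_bin)
)
summary_sketch <- data.frame(parameter = "net_speciation",
                             Time_bin = as.integer(rownames(stats_by_bin)),
                             round(stats_by_bin, 2),
                             row.names = NULL)
summary_sketch
```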
 
3. Plot the distribution of each FBD parameter
The distribution of any (or all) of the FBD parameters can be plotted by time bin with FBD_dens_plot(), which offers several plot types:
## Plot distribution of the desired FBD parameter by time bin with 
## kernel density plot
FBD_dens_plot(posterior3p_long, parameter = "net_speciation",
              type = "density", stack = FALSE)

## Plot distribution of the desired FBD parameter by time bin with 
## stacked kernel density plot
FBD_dens_plot(posterior3p_long, parameter = "net_speciation",
              type = "density", stack = TRUE)

## Plot distribution of the desired FBD parameter by time bin with 
## a violin plot
FBD_dens_plot(posterior3p_long, parameter = "net_speciation",
              type = "violin", stack = FALSE, color = "red")

## Plot distribution of all FBD parameters by time bin with violin plots
p1 <- FBD_dens_plot(posterior3p_long, parameter = "net_speciation",
                    type = "violin", stack = FALSE, color = "red")
p2 <- FBD_dens_plot(posterior3p_long, parameter = "relative_extinction",
                    type = "violin", stack = FALSE, color = "cyan3")
p3 <- FBD_dens_plot(posterior3p_long, parameter = "relative_fossilization",
                    type = "violin", stack = FALSE, color = "green3")
library(patchwork)
p1 + p2 + p3 + plot_layout(nrow = 1)

## Save your plot to your working directory as a PDF
ggplot2::ggsave("Plot_regs.pdf", width = 12, height = 4)
 
4. Test for assumptions
In this step, users can test for normality and homoscedasticity in the distribution of each FBD parameter under consideration. The results indicate whether the parametric or the nonparametric comparisons in step 5 are appropriate.
##### Tests for normality and homoscedasticity for each FBD parameter for all time bins
t3.2 <- FBD_tests1(posterior3p_long)
### Export the output table for all tests
write.csv(t3.2, file = "FBD_Tests1_Assum.csv")
The results of the Shapiro-Wilk normality test for each parameter can be output as separate tables or as a single combined table.
# Output as separate tables 
t3.2$shapiro
Shapiro-Wilk normality test
|  | parameter | statistic | p-value | 
| Time bin 1 | net_speciation | 0.9917 | 0 | 
| Time bin 2 | net_speciation | 0.9385 | 0 | 
| Time bin 3 | net_speciation | 0.9227 | 0 | 
| Time bin 4 | net_speciation | 0.9898 | 0 | 
| Overall | net_speciation | 0.9568 | 0 | 
| Residuals | net_speciation | 0.9874 | 0 | 

|  | parameter | statistic | p-value | 
| Time bin 1 | relative_extinction | 0.8927 | 0 | 
| Time bin 2 | relative_extinction | 0.9247 | 0 | 
| Time bin 3 | relative_extinction | 0.8044 | 0 | 
| Time bin 4 | relative_extinction | 0.3775 | 0 | 
| Overall | relative_extinction | 0.7036 | 0 | 
| Residuals | relative_extinction | 0.8238 | 0 | 

|  | parameter | statistic | p-value | 
| Time bin 1 | relative_fossilization | 0.5764 | 0 | 
| Time bin 2 | relative_fossilization | 0.8853 | 0 | 
| Time bin 3 | relative_fossilization | 0.6210 | 0 | 
| Time bin 4 | relative_fossilization | 0.4637 | 0 | 
| Overall | relative_fossilization | 0.5473 | 0 | 
| Residuals | relative_fossilization | 0.5531 | 0 | 
# OR as single merged table
t3.2$shapiro$net_speciation$bin <- row.names(t3.2$shapiro$net_speciation)  
t3.2$shapiro$relative_extinction$bin <- row.names(t3.2$shapiro$relative_extinction)  
t3.2$shapiro$relative_fossilization$bin <- row.names(t3.2$shapiro$relative_fossilization)  
k1all <- rbind(t3.2$shapiro$net_speciation,
               t3.2$shapiro$relative_extinction,
               t3.2$shapiro$relative_fossilization,
               make.row.names = FALSE)
Shapiro-Wilk normality test
| parameter | statistic | p-value | bin | 
| net_speciation | 0.9917 | 0 | Time bin 1 | 
| net_speciation | 0.9385 | 0 | Time bin 2 | 
| net_speciation | 0.9227 | 0 | Time bin 3 | 
| net_speciation | 0.9898 | 0 | Time bin 4 | 
| net_speciation | 0.9568 | 0 | Overall | 
| net_speciation | 0.9874 | 0 | Residuals | 
| relative_extinction | 0.8927 | 0 | Time bin 1 | 
| relative_extinction | 0.9247 | 0 | Time bin 2 | 
| relative_extinction | 0.8044 | 0 | Time bin 3 | 
| relative_extinction | 0.3775 | 0 | Time bin 4 | 
| relative_extinction | 0.7036 | 0 | Overall | 
| relative_extinction | 0.8238 | 0 | Residuals | 
| relative_fossilization | 0.5764 | 0 | Time bin 1 | 
| relative_fossilization | 0.8853 | 0 | Time bin 2 | 
| relative_fossilization | 0.6210 | 0 | Time bin 3 | 
| relative_fossilization | 0.4637 | 0 | Time bin 4 | 
| relative_fossilization | 0.5473 | 0 | Overall | 
| relative_fossilization | 0.5531 | 0 | Residuals | 
## Bartlett's test for homogeneity of variance 
t3.2$bartlett
Bartlett’s test
| parameter | statistic | p-value | 
| net_speciation | 3815.464 | 0 | 
| relative_extinction | 18159.213 | 0 | 
| relative_fossilization | 25654.975 | 0 | 
## Fligner-Killeen test for homogeneity of variance 
t3.2$fligner
Fligner-Killeen test
| parameter | statistic | p-value | 
| net_speciation | 3748.140 | 0 | 
| relative_extinction | 12599.843 | 0 | 
| relative_fossilization | 4808.545 | 0 | 
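The three assumption tests reported above are all available in base R's stats package. The sketch below runs them on simulated, deliberately skewed data with unequal variances; it only illustrates the tests themselves and is not the code used by FBD_tests1(). Note that shapiro.test() accepts between 3 and 5000 observations, so the simulated bins are kept small.

```r
## Simulated long-format data: skewed values with different spread per bin
set.seed(1)
long <- data.frame(
  Time_bin = factor(rep(1:3, each = 500)),
  value = c(rexp(500, rate = 20), rexp(500, rate = 30), rexp(500, rate = 60))
)

## Shapiro-Wilk normality test within each time bin
shapiro_by_bin <- sapply(split(long$value, long$Time_bin),
                         function(x) shapiro.test(x)$p.value)

## Homogeneity of variance across time bins
bartlett_p <- bartlett.test(value ~ Time_bin, data = long)$p.value
fligner_p  <- fligner.test(value ~ Time_bin, data = long)$p.value

shapiro_by_bin
c(bartlett = bartlett_p, fligner = fligner_p)
```

Small p-values from the Shapiro-Wilk test indicate departures from normality, and small p-values from the Bartlett or Fligner-Killeen tests indicate unequal variances among time bins.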
Deviations from normality can be displayed graphically using FBD_normality_plot():
## Visualize deviations from normality and similarity of variances
FBD_normality_plot(posterior3p_long)

## Save your plot to your working directory as a PDF
ggplot2::ggsave("Plot_normTests.pdf", width = 8, height = 6)
 
5. Test for significant FBD shifts between time bins
Shifts in FBD parameters between time bins can be tested with FBD_tests2(), which runs both parametric (Student’s t-test) and nonparametric (Mann-Whitney test) pairwise comparisons. Both sets of results are returned; users should choose which one to report based on the results of the assumption tests in step 4 (above).
##### Test for significant differences between each time bin for each FBD parameter
t3.3 <- FBD_tests2(posterior3p_long)
### Export the output table for all tests
write.csv(t3.3, file = "FBD_Tests2_Sign.csv")
## Pairwise t-tests
# Output as separate tables 
t3.3$t_tests
Pairwise t-tests
| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| net_speciation | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 3 | 4 | 10000 | 10000 | 0 | 0 | 

| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| relative_extinction | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 3 | 4 | 10000 | 10000 | 0 | 0 | 

| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| relative_fossilization | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 3 | 4 | 10000 | 10000 | 0 | 0 | 
# OR as single merged table
k3.3a <- rbind(t3.3$t_tests$net_speciation,
               t3.3$t_tests$relative_extinction,
               t3.3$t_tests$relative_fossilization,
               make.row.names = FALSE)
Pairwise t-tests
| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| net_speciation | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 3 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 3 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 3 | 4 | 10000 | 10000 | 0 | 0 | 
## Mann-Whitney tests (use if the tests in step 4 indicate that assumptions are violated)
# Output as separate tables 
t3.3$mwu_tests
Mann-Whitney tests
| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| net_speciation | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 3 | 4 | 10000 | 10000 | 0 | 0 | 

| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| relative_extinction | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 3 | 4 | 10000 | 10000 | 0 | 0 | 

| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| relative_fossilization | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 3 | 4 | 10000 | 10000 | 0 | 0 | 
# OR as single merged table
k3.3b <- rbind(t3.3$mwu_tests$net_speciation,
               t3.3$mwu_tests$relative_extinction,
               t3.3$mwu_tests$relative_fossilization,
               make.row.names = FALSE)
Mann-Whitney tests
| parameter | Time_bin1 | Time_bin2 | n1 | n2 | p-value | p-value adj | 
| net_speciation | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| net_speciation | 3 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_extinction | 3 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 2 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 1 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 3 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 2 | 4 | 10000 | 10000 | 0 | 0 | 
| relative_fossilization | 3 | 4 | 10000 | 10000 | 0 | 0 |
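
The pairwise comparisons above can be approximated with base R for readers who want to inspect the underlying tests. The sketch below uses simulated data and assumes a Bonferroni correction for the adjusted p-values; the correction actually applied by FBD_tests2() may differ, so treat this as an illustration rather than a reproduction of its output.

```r
## Simulated long-format data with a different mean in each time bin
set.seed(1)
long <- data.frame(
  Time_bin = factor(rep(1:3, each = 200)),
  value = c(rnorm(200, 0.04, 0.01),
            rnorm(200, 0.03, 0.01),
            rnorm(200, 0.02, 0.01))
)

## Parametric: pairwise t-tests with adjusted p-values
## (Bonferroni is an assumption made for this sketch)
t_res <- pairwise.t.test(long$value, long$Time_bin,
                         p.adjust.method = "bonferroni", pool.sd = FALSE)

## Nonparametric: pairwise Mann-Whitney (Wilcoxon rank-sum) tests
mwu_res <- pairwise.wilcox.test(long$value, long$Time_bin,
                                p.adjust.method = "bonferroni")

## Lower-triangular matrices of adjusted p-values, one cell per pair of bins
t_res$p.value
mwu_res$p.value
```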