bd.sim
now allows for simulation up to a number of
species as well as the original implementation simulating up to a
certain time. To choose which realization of a certain number of species
we choose, we use the method proposed by Stadler 2011 (see
bd.sim
documentation for full reference), whereby we
simulate to a much larger number of species, then go back to check which
periods had the desired number, and choose one through weighted sampling
using the amount of time spent in each as the weights. Finally, we
sample a uniform distribution to decide the length of the final period
between when the species number was achieve and the end of the
simulation.Added functions to simulate trait evolution and trait-dependent birth death models, in particular State Speciation and Extinction (SSE) models BiSSE, MuSSE, and QuaSSE. See documentation for full reference.
bd.sim.traits
simulates species diversification
following a MuSSE model, where traits evolve from an Mk model and change
speciation and/or extinction in a discrete fashion. Allows for the
simulation of multiple traits, though currently the rates can only
depend on one of them (each rate can depend on a different trait,
though). Can simulate HiSSE as well, by setting the nHidden
parameter, representing the number of hidden states, to something higher
than 1. Can set separate number of states (observed or hidden), initial
trait values, and transition matrices for each trait. See
?bd.sim.traits
for details. In the future, other
trait-dependent diversification models will be implemented, including
relaxing the assumption of discrete traits.sample.clade.traits
simulates state-dependent fossil
sampling, following a similar algorithm as bd.sim.traits
.
It allows for a hidden state model as well, but note that trait
information (usually coming from bd.sim.traits
) needs to
include all states as observed for that (see examples in
?sample.clade.traits
and vignettes for details).draw.sim
can now color longevity segments and fossil
occurrences based on the trait values of a given trait for each species
through time. Customization options include which colors to use for each
state, whether to plot fossils as true occurrence times or ranges
(already present in 1.0, but now with a dedicated argument for such,
fossilsFormat
), and where to place trait value legend. See
?draw.sim
, ?bd.sim.traits
,
?sample.clade.traits
, and the overview
vignette for examples.make.phylo
now allows for an optional
fossils
input representing a fossil record, and will then
add these fossils to the tree as sampled ancestors. These SAs are added
as 0-length branches if the saFormat
input is set to
"branch"
, or as degree-2 nodes if it is set to
"node"
. The returnTrueExt
input optionally
drops the tip representing the true extinction time of a species, and
make the last sampled fossil of that species the fossil tip, if set to
FALSE
. Note that this last functionality required a limited
version of the drop.tip
function of the APE
package to be copied. See ?make.phylo
for reference and
credits.overview
bin.occurrences
allows for post-hoc binning of fossil
occurrences, so one can take a fossil record including only true times,
i.e. an output of sample.clade(..., returnTrue = TRUE)
, and
bin it to produce the uncertainty in fossil ages ubiquitous in the true
fossil record.sample.clade.R
: added a small bit on help page to
explain the complication of using adFun
with extant
species.make.phylo.R
: corrected bug in node labels.This is the first release of paleobuddy
, an R package
dedicated to flexible simulations of diversification, fossil records,
and phylogenetic trees. Below we list current features, and above
sections will be filled as new features and fixes are implemented.
rexp.var
generalizes rexp
to take any
function of time as a rate. Also allows for a shape
parameter, in which case it similarly generalizes
rweibull
.bd.sim
simulates species diversification with high
flexibility in allowed speciation and extinction rate scenarios.
Produces a sim
object.sample.clade
simulates fossil sampling. Similar
flexibility to bd.sim
, though even more so in the case of
age-dependent rates.make.phylo
creates a bifurcating phylogenetic tree as
aphylo
object (see APE) from a
sim
object. Can take fossils to be added as length
0
branches.draw.sim
plots a sim
by drawing species
durations and relationships, and optionally adding fossils as time
points or ranges.find.lineages
creates subsets of a sim
defined as the clades descended from one or more species present in the
simulation. Needed to e.g. generate phylogenetic trees from simulations
with more than one starting species.make.rate
creates a purely time-dependent (or constant)
rate based on optional inputs. Used internally to allow for users to
define rate scenarios easily in bd.sim
.phylo.to.sim
creates a sim
object from a
phylo
object, provided the user makes choices to solve
ambiguities on bifurcating phylogenetic trees.var.rate.div
calculates expected diversity for a given
diversification rate and set of times. Useful for testing
bd.sim
and planning rate scenarios.sim
a class returned by bd.sim
and used as
an input for many functions in the package. Formally, it is a named list
of vectors recording speciation time, extinction time, status (extant or
extinct), and parent information for each lineage in the simulation. It
contains the following methods: ** print
gives some quick
details about number of extant and total species, and the first few
members of each vector. ** head
and tail
return the sim
object containing only a given number of
species from the beginning and end of its vectors, respectively. **
summary
gives quantitative details, e.g. quantiles of
durations and speciation waiting times. ** plot
plots
lineages through time (LTT) plots for births, deaths, and diversity. **
sim.counts
counts numbers of births, deaths, and diversity
for some given time t
. ** is.sim
checks the
object is a valid sim
object. Used internally for error
checking.temp
temperature data during the Cenozoic. Modified
from RPANDA.co2
CO2 data during the Jurassic. Modified from RPANDA.overview
gives a reasonably in depth look at the main
features of the package, including examples of workflows using most
available rate scenarios, and examples of applications.The question of how to structure time came up a lot during
development of the package. Most of the literature in macroevolution and
paleontology considers absolute geological time, i.e. t = 0
at the present and t = 5
five million years ago. It becomes
challenging, however, to visualize and program complex rates going
backwards. As such, the code is structured such that all functions are
considered to go from 0
to the maximum simulation time
tMax
—i.e. the inverse of absolute geological time. There is
only one exception to this rule, the adFun
parameter
describing age-dependent distribution of fossil occurrences in
sample.clade
. In any case, all returned objects in the
package are set to follow absolute geological time, so as to conform to
the literature.
The integrate
function occasionally fails when given
functions that vary suddenly and rashly, usually happening in the case
of environmentally-dependent rates. I have tracked this error down to
numerical problems in integrate
, and testing seems to
indicate the error does not prevent integrate
from finding
the correct result. As such this is not currently something I intend to
fix, though if issues are found that indicate this could be a
paleobuddy
problem, not an integrate
problem,
that could change.
paleobuddy
is the first package to implement
time-dependent parameters for Weibull-distributed waiting times. Since
the authors currently are not aware of an analytical solution to
important quantities in the BD process in this case, it is challenging
to test exactly. Simulation tests indicate pretty strongly that the
algorithm works, however, with one exception—in cases where shape is
time-dependent and varies dramatically, especially when close to
0
, rexp.var
seems to have a hard time finding
the correct waiting time distribution. When maintained within levels
generally accepted as sensible throughout the literature—around 0.8 to
3, say—, and even a reasonable amount outside of that range, tests
indicate the algorithm functions as it should. A testthat
routine will be implemented in the future to formalize these claims, and
this issue is one I plan to work on soon, especially if users report it
as more prevalent than I thought.