The simulateWithPriors function pulls parameters from prior distributions and conducts a single simulation of continuous trait evolution (using the doSimulation function), returning useful summary statistics for ABC. parallelSimulateWithPriors is a wrapper function for simulateWithPriors that allows for multithreading and checkpointing. This family of functions is mostly used as internal components, generating simulations within ABC analyses using the doRun functions. See Note below.

simulateWithPriors(
  phy = NULL,
  intrinsicFn,
  extrinsicFn,
  startingPriorsFns,
  startingPriorsValues,
  intrinsicPriorsFns,
  intrinsicPriorsValues,
  extrinsicPriorsFns,
  extrinsicPriorsValues,
  generation.time = 1000,
  TreeYears = max(branching.times(phy)) * 1e+06,
  timeStep = NULL,
  giveUpAttempts = 10,
  verbose = FALSE,
  checks = TRUE,
  taxonDF = NULL,
  freevector = NULL
)

parallelSimulateWithPriors(
  nrepSim,
  multicore,
  coreLimit,
  phy,
  intrinsicFn,
  extrinsicFn,
  startingPriorsFns,
  startingPriorsValues,
  intrinsicPriorsFns,
  intrinsicPriorsValues,
  extrinsicPriorsFns,
  extrinsicPriorsValues,
  generation.time = 1000,
  TreeYears = max(branching.times(phy)) * 1e+06,
  timeStep = NULL,
  checkpointFile = NULL,
  checkpointFreq = 24,
  verbose = TRUE,
  checkTimeStep = TRUE,
  verboseNested = FALSE,
  freevector = NULL,
  taxonDF = NULL,
  giveUpAttempts = 10
)

Arguments

phy

A phylogenetic tree, in package ape's phylo format.

intrinsicFn

Name of (previously-defined) function that governs how traits evolve within a lineage, regardless of the trait values of other taxa.

extrinsicFn

Name of (previously-defined) function that governs how traits evolve within a lineage, based on their own ('internal') trait vlaue and the trait values of other taxa.

startingPriorsFns

Vector containing names of prior distributions to use for root states: can be one of "fixed", "uniform", "normal", "lognormal", "gamma", "exponential".

startingPriorsValues

A list of the same length as the number of prior distributions specified in startingPriorsFns (for starting values, this should be one prior function specified for each trait - thus one for most univariate trait analyses), with each element of the list a vector the same length as the appropriate number of parameters for that prior distribution (1 for "fixed", 2 for "uniform", 2 for "normal", 2 for "lognormal", 2 for "gamma", 1 for "exponential").

intrinsicPriorsFns

Vector containing names of prior distributions to use for intrinsic function parameters: can be one of "fixed", "uniform", "normal", "lognormal", "gamma", "exponential".

intrinsicPriorsValues

A list of the same length as the number of prior distributions specified in intrinsicPriorsFns (one prior function specified for each parameter in the intrinsic model), with each element of the list a vector the same length# as the appropriate number of parameters for that prior distribution (1 for "fixed", 2 for "uniform", 2 for "normal", 2 for "lognormal", 2 for "gamma", 1 for "exponential").

extrinsicPriorsFns

Vector containing names of prior distributions to use for extrinsic function parameters: can be one of "fixed", "uniform", "normal", "lognormal", "gamma", "exponential".

extrinsicPriorsValues

A list of the same length as the number of prior distributions specified in extrinsicPriorsFns (one prior function specified for each parameter in the extrinsic model), with each element of the list a vector the same length as the appropriate number of parameters for that prior distribution (1 for "fixed", 2 for "uniform", 2 for "normal", 2 for "lognormal", 2 for "gamma", 1 for "exponential").

generation.time

The number of years per generation. This sets the coarseness of the simulation; if it's set to 1000, for example, the population's trait values change every 1000 calendar years. Note that this is in calendar years (see description for argument TreeYears), and not in millions of years (as is typical for dated trees in macroevolutionary studies). Thus, if a branch is 1 million-year time-unit long, and a user applies the default generation.time = 1000, then 1000 evolutionary changes will be simulated along that branch. See documentation for doSimulation for further details.

TreeYears

The amount of calendar time from the root to the furthest tip. Most trees in macroevolutionary studies are dated with branch lengths in units of millions of years, and thus the default for this argument is max(branching.times(phy)) * 1e6. If your tree has the most recent tip at time zero (i.e., the modern day), this would be the same as the root age of the tree. If your branch lengths are not in millions of years, you should alter this argument. Otherwise, leave this argument alone. See documentation for doSimulation for further details.

timeStep

This value corresponds to the length of intervals between discrete evolutionary events ('generations') simulated along branches, relative to a rescaled tree where the root to furthest tip distance is 1. For example, timeStep = 0.01 of would mean 100 (i.e., 1 / 0.01) evolutionary changes would be expected to occur from the root to the furthest tip. (Note that the real number simulated will be much less, because simulations start over at each branching node.) Ideally, timeStep (or its effective value, via other arguments) should be as short as is computationally possible. Typically NULL by default and determined internally as follows: timeStep = generation.time / TreeYears. Can be provided a value as an alternative to using arguments generation.time and TreeYears, which would then be overridden. See documentation for doSimulation for further details.

giveUpAttempts

Value for when to stop the analysis if NA values are present.

verbose

If TRUE, gives messages about how the simulation is progessing via message.

checks

If TRUE, checks inputs for consistency. This activity is skipped (checks = FALSE) when run in parallel by parallelSimulateWithPriors, and instead is only checked once. This argument also controls whether simulateWithPriors assigns freevector as an attribute to the output produced.

taxonDF

A data.frame containing data on nodes (both tips and internal nodes) output by various internal functions. Can be supplied as input to spead up repeated calculations, but by default is NULL, which instead forces a calculation from input phy.

freevector

A logical vector (with length equal to the number of parameters), indicating free (TRUE) and fixed (FALSE) parameters.

nrepSim

Number of replicated simulations to run.

multicore

Whether to use multicore, default is FALSE. If TRUE, one of two suggested packages must be installed, either doMC (for UNIX systems) or doParallel (for Windows), which are used to activate multithreading. If neither package is installed, this function will fail if multicore = TRUE.

coreLimit

Maximum number of cores to be used.

checkpointFile

Optional file name for checkpointing simulations

checkpointFreq

Saving frequency for checkpointing

checkTimeStep

If TRUE, warnings will be issued if TimeStep is too short.

verboseNested

Should looped runs of simulateWithPriors be verbose?

Value

Function simulateWithPriors returns a vector of trueFreeValues, the true generating parameters used in the simulation (a set of values as long as the number of freely varying parameters), concatenated with a set of summary statistics for the simulation.

Function parallelSimulateWithPriors returns a matrix of such vectors bound together, with each row representing a different simulation.

By default, both functions also assign a logical vector named freevector, indicating the total number of parameters and which parameters are freely-varying (have TRUE values), as an attribute of the output.

Note

The simulateWithPriors functions are effectively the engine that powers the doRun functions, while the doSimulation function is the pistons within the simulateWithPriors engine. In general, most users will just drive the car - they will just use doRun, but some users may want to use simulateWithPriors or doSimulation to do various simulations.

Author

Brian O'Meara and Barb Banbury

Examples

# \donttest{ set.seed(1) tree <- rcoal(20) # get realistic edge lengths tree$edge.length <- tree$edge.length*20 # example simulation # NOTE: the example analyses involve too few simulations, # as well as overly coarse time-units... # ...all for the sake of examples that reasonably test the functions simData <- simulateWithPriors( phy = tree, intrinsicFn = brownianIntrinsic, extrinsicFn = nullExtrinsic, startingPriorsFns = "normal", startingPriorsValues = list( c(mean(simCharExample[, 1]), sd(simCharExample[, 1]))), intrinsicPriorsFns = c("exponential"), intrinsicPriorsValues = list(10), extrinsicPriorsFns = c("fixed"), extrinsicPriorsValues = list(0), generation.time = 100000, freevector = NULL, giveUpAttempts = 10, verbose = TRUE)
#> Warning: You have only 22 timeSteps on the shortest branch in this dataset #> but should probably have a lot more if you expect change on this branch. #> Please consider decreasing timeStep to no more than 0.0016
simData
#> #> 9.547836e+00 4.150295e-02 -3.026773e+01 3.418391e-01 6.453546e+01 #> #> -3.007117e+01 2.896568e-01 9.956452e-01 6.614234e+01 -3.025080e+01 #> #> 3.796814e-01 8.707084e-01 6.650160e+01 -3.026773e+01 3.418391e-01 #> #> 3.495804e-09 6.653546e+01 -5.479906e+01 1.135981e+02 7.823403e+00 #> #> 1.195810e+01 2.669213e+00 1.478069e+01 9.932491e+00 1.044875e+01 #> #> 6.941505e+00 9.646293e+00 -1.111716e+00 1.072039e+00 -1.189547e+00 #> #> 2.626172e+00 -1.905434e-01 3.010480e-01 5.862058e-01 -9.090808e-01 #> #> -2.857882e-01 -5.936686e-01 6.647034e-01 -3.803557e-01 7.586470e-01 #> 21 #> 2.214307e-02 1.729511e-01 3.274339e-01 -1.839619e-01 7.169150e+00 #> 22 23 24 25 26 #> 7.921066e+00 9.175249e+00 1.008808e+01 1.014071e+01 4.844487e+00 #> 27 28 29 30 31 #> 1.101510e+01 1.110375e+01 1.133103e+01 1.105411e+01 3.257514e+00 #> 32 33 34 35 36 #> 4.304719e+00 1.121909e+01 1.134805e+01 3.211705e+00 2.967029e+00 #> 37 38 39 #> 4.518816e+00 1.167323e+01 2.956022e+00 3.676780e+00 5.100799e+00 #> #> 6.668273e+00 8.045717e+00 8.152504e+00 2.706270e+00 9.786392e+00 #> #> 9.971090e+00 1.020349e+01 1.016135e+01 2.291315e+00 3.393225e+00 #> #> 1.039898e+01 1.066429e+01 2.748921e+00 2.572227e+00 4.124383e+00 #> #> 1.130102e+01 2.732176e+00 1.066152e+01 1.074133e+01 1.168223e+01 #> #> 1.213044e+01 1.212892e+01 6.982704e+00 1.224380e+01 1.223641e+01 #> #> 1.245856e+01 1.194688e+01 4.223714e+00 5.216214e+00 1.203919e+01 #> #> 1.203180e+01 3.674488e+00 3.361830e+00 4.913249e+00 1.204545e+01 #> #> 3.179869e+00 #> attr(,"freevector") #> starting_1 intrinsic_1 extrinsic_1 #> TRUE TRUE FALSE
simDataParallel <- parallelSimulateWithPriors( nrepSim = 2, multicore = FALSE, coreLimit = 1, phy = tree, intrinsicFn = brownianIntrinsic, extrinsicFn = nullExtrinsic, startingPriorsFns = "normal", startingPriorsValues = list( c(mean(simCharExample[, 1]), sd(simCharExample[, 1]))), intrinsicPriorsFns = c("exponential"), intrinsicPriorsValues = list(10), extrinsicPriorsFns = c("fixed"), extrinsicPriorsValues = list(0), generation.time = 100000, checkpointFile = NULL, checkpointFreq = 24, verbose = TRUE, freevector = NULL, taxonDF = NULL)
#> You have only 22 timeSteps on the shortest branch in this dataset #> but should probably have a lot more if you expect change on this branch. #> Please consider decreasing timeStep to no more than 0.0016
#> Using 1 core(s) for simulations #>
#> Doing simulations:
simDataParallel
#> #> result.1 9.760018 0.1797100 -62.91207 8.944325 129.8241 -62.72453 8.140874 #> result.2 9.776337 0.1051511 -50.13737 2.493156 104.2747 -49.92761 1.981982 #> #> result.1 0.9980298 131.4491 -62.42822 15.478625 0.4733353 130.8564 -62.91208 #> result.2 0.9930656 105.8552 -49.06543 1.041436 3.0000000 104.1309 -49.08950 #> #> result.1 8.944334 8.679890e-08 131.8242 -90.37916 184.7583 13.153407 41.13777 #> result.2 3.266063 5.293501e-02 104.1790 -65.31075 134.6215 5.806912 15.95728 #> #> result.1 -22.723057 518.70321 21.948674 29.757689 66.42861 17.294459 #> result.2 -5.891123 42.28749 6.666774 -3.573864 -12.43237 -0.939118 #> #> result.1 -0.5215153 -6.132746 13.6332359 11.380489 9.553875 0.5278427 #> result.2 -6.6938329 -11.011112 0.1808191 -2.571592 -0.376079 4.8719921 #> #> result.1 -7.462962 -1.730839 -0.5490294 -0.8407188 -2.528203 -2.985570 #> result.2 2.220494 -1.163257 -3.7126726 -2.4657865 -2.244137 -0.398084 #> 21 #> result.1 0.3445505 1.6355256 1.247292 0.5431617 -1.5219928 10.671592 #> result.2 0.6449047 0.6889875 -1.152041 -1.3327654 -0.2805405 7.570582 #> 22 23 24 25 26 27 28 #> result.1 17.867251 22.642802 26.772601 28.24163 -11.5749 26.394983 24.810413 #> result.2 6.223887 5.551458 3.777487 4.67578 11.7341 -1.740221 -2.021125 #> 29 30 31 32 33 34 35 #> result.1 29.256352 29.751801 -20.36775 17.327205 32.089605 31.060038 -21.04068 #> result.2 7.528365 -2.509962 13.58509 5.281563 6.848467 -2.320778 15.27782 #> 36 37 38 39 #> result.1 -21.53375 17.753283 33.142705 -20.86863 -7.192592 3.441013 9.819107 #> result.2 12.18929 6.548439 -2.036146 12.40358 -1.860991 -1.392588 -1.218939 #> #> result.1 16.325504 18.0715372 -22.51232 20.109913 19.016627 23.488786 25.185115 #> result.2 -1.738161 -0.6936222 5.95958 -5.058486 -5.080012 4.483322 -4.920989 #> #> result.1 -25.31006 12.66472 27.894605 27.562491 -23.40791 -23.55324 15.735680 #> result.2 10.97575 2.81996 4.633675 -4.167342 14.02801 11.12307 5.483225 #> #> result.1 31.238741 -22.01365 28.53578 32.29349 35.46650 37.219697 38.41173 #> result.2 -3.041363 11.79905 17.00215 13.84036 12.32186 9.293135 10.04518 #> #> result.1 -0.6374868 32.680053 30.604200 35.02392 34.3184871 -15.42544 21.989686 #> result.2 17.5086139 1.578044 1.037761 10.57341 -0.0989346 16.19443 7.743166 #> #> result.1 36.284605 34.5575849 -18.67344 -19.51426 19.770886 35.04667 -19.72361 #> result.2 9.063259 -0.4742136 16.52762 13.25550 7.613652 -1.03093 13.00810 #> attr(,"freevector") #> starting_1 intrinsic_1 extrinsic_1 #> TRUE TRUE FALSE
# }