Function returnPLSModel fits a PLS regression (using plsr) separately to each freely varying parameter of a model, rather than fitting a single multivariate PLS regression. A secondary step then limits the number of components to those needed to explain some minimum cumulative percentage of variance (see argument variance.cutoff). For ABC, this seems to give much better results, because no single parameter dominates the combined variance.
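The component-limiting step can be illustrated with a minimal sketch for a single free parameter. It is not the package's own code: it assumes the cutoff is applied to the cumulative percentage of variance in the summary statistics reported by pls::explvar, and the objects summaryValues, freeValue, and nCompFinal are hypothetical stand-ins built from random numbers.

```r
library(pls)

set.seed(1)
nSim <- 30

# Hypothetical stand-ins for simulation output (not real TreEvo results):
# five summary statistics and one free parameter that depends on the first.
summaryValues <- matrix(rnorm(nSim * 5), nrow = nSim)
freeValue <- 2 * summaryValues[, 1] + rnorm(nSim, sd = 0.1)

# First pass: fit with cross-validation and the default number of components.
fit <- plsr(freeValue ~ summaryValues, scale = TRUE,
            validation = "CV", segments = 10)

# Cumulative percentage of variance explained by successive components
# (assumption: the cutoff is judged against this quantity).
cumVar <- cumsum(explvar(fit))

# Keep the smallest number of components that reaches the cutoff.
variance.cutoff <- 95
nCompFinal <- which(cumVar >= variance.cutoff)[1]

# Second pass: refit using only the retained components.
fitFinal <- plsr(freeValue ~ summaryValues, ncomp = nCompFinal,
                 scale = TRUE, validation = "none")
```

Repeating this for each column of a matrix of free parameters would give one model per parameter, which is the per-parameter approach described above.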

returnPLSModel(
  trueFreeValuesMatrix,
  summaryValuesMatrix,
  validation = "CV",
  scale = TRUE,
  variance.cutoff = 95,
  verbose = TRUE,
  segments = min(10, nrow(summaryValuesMatrix) - 1),
  ...
)

PLSTransform(summaryValuesMatrix, pls.model)

Arguments

trueFreeValuesMatrix

Matrix of true free values from simulations.

summaryValuesMatrix

Matrix of summary statistics from simulations.

validation

Character argument controlling what validation procedure is used by plsr. Default is "CV" for cross-validation.

scale

This argument is passed to plsr. It may be a numeric vector or a logical. If a numeric vector, the input is scaled by dividing each variable by the corresponding element of scale. If scale = TRUE, the input is scaled by dividing each variable by its sample standard deviation. If cross-validation is selected (the default for returnPLSModel), scaling by the standard deviation is done for every segment.

variance.cutoff

Minimum threshold percentage of variance explained for the number of components included in the final PLS model fit. This value is a percentage and must be between 0 and 100. Default is 95 percent.

verbose

If TRUE, warning messages are issued when the inputs or settings look questionable.

segments

Number of segments of data used for cross-validation by pls::cvsegments. The number of segments cannot exceed the number of simulations. The default number of segments in normal use of plsr is 10, which causes problems when a trial analysis uses fewer than 10 simulations. Instead, returnPLSModel passes an alternative value: either 10, or one less than the number of rows in summaryValuesMatrix, whichever is smaller. The default is thus min(10, nrow(summaryValuesMatrix) - 1), and it can be changed by the user. A hard minimum of 3 is required.

...

Additional arguments, passed to plsr.

pls.model

Output from returnPLSModel.
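As an illustration of the segments default described above, the following sketch computes the value for a small trial run and passes it to pls::cvsegments. The variable names are hypothetical, and reading the hard minimum of 3 as a check on the number of segments is an assumption rather than something taken from the package source.

```r
library(pls)

# Hypothetical trial analysis with only six simulations.
nSimulations <- 6

# Default from the usage above: 10, or one less than the number of rows.
nSegments <- min(10, nSimulations - 1)

# Assumed reading of the hard minimum: refuse fewer than 3 segments.
if (nSegments < 3) {
  stop("at least 3 cross-validation segments are required")
}

# Split the simulation indices into that many cross-validation segments.
set.seed(1)
segs <- cvsegments(N = nSimulations, k = nSegments)

# Number of simulations assigned to each segment.
lengths(segs)
```

With the plsr default of 10 segments, six simulations could not be split this way, which is the problem the adjusted default avoids.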

Value

Function returnPLSModel returns a PLS model, and function PLSTransform returns transformed summary statistics.

Details

Function PLSTransform uses results from a Partial Least Squares (PLS) model fit with returnPLSModel to transform summary values.
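A rough sketch of this kind of transformation, using only the pls package, is to project a matrix of summary statistics onto the components of a fitted model with predict and type = "scores". Whether PLSTransform does exactly this internally is an assumption, and the data objects below are hypothetical random stand-ins.

```r
library(pls)

set.seed(1)
nSim <- 30

# Hypothetical training data: five summary statistics, one free parameter.
summaryValues <- matrix(rnorm(nSim * 5), nrow = nSim)
freeValue <- 2 * summaryValues[, 1] + rnorm(nSim, sd = 0.1)

# A PLS model restricted to three components.
fit <- plsr(freeValue ~ summaryValues, ncomp = 3,
            scale = TRUE, validation = "none")

# Hypothetical new summary statistics, e.g. from further simulations.
newSummaryValues <- matrix(rnorm(4 * 5), nrow = 4)

# Project the new values onto the PLS components: one row per input row,
# one column per retained component.
transformed <- predict(fit, newdata = newSummaryValues, type = "scores")
dim(transformed)
```

The transformed matrix has as many columns as the model has components, so the same model can be used to reduce both simulated and observed summary statistics to a common set of axes.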

See also

Function returnPLSModel effectively wraps function plsr from package pls (see documentation at mvr).

Author

Brian O'Meara and Barb Banbury

Examples

# \donttest{
set.seed(1)
simPhyExample <- rcoal(20)
simPhyExample$edge.length <- simPhyExample$edge.length * 20

# example simulation

nSimulations <- 6

simDataParallel <- parallelSimulateWithPriors(
  nrepSim = nSimulations,
  multicore = FALSE,
  coreLimit = 1,
  phy = simPhyExample,
  intrinsicFn = brownianIntrinsic,
  extrinsicFn = nullExtrinsic,
  startingPriorsFns = "normal",
  startingPriorsValues = list(
    c(mean(simCharExample[, 1]), sd(simCharExample[, 1]))),
  intrinsicPriorsFns = c("exponential"),
  intrinsicPriorsValues = list(10),
  extrinsicPriorsFns = c("fixed"),
  extrinsicPriorsValues = list(0),
  generation.time = 10000,
  checkpointFile = NULL,
  checkpointFreq = 24,
  verbose = FALSE,
  freevector = NULL,
  taxonDF = NULL
)

nParFree <- sum(attr(simDataParallel, "freevector"))

# separate the simulation results:
# 'true' generating parameter values from the summary values
trueFreeValuesMat <- simDataParallel[, 1:nParFree]
summaryValuesMat <- simDataParallel[, -1:-nParFree]

PLSmodel <- returnPLSModel(
  trueFreeValuesMatrix = trueFreeValuesMat,
  summaryValuesMatrix = summaryValuesMat,
  validation = "CV",
  scale = TRUE,
  variance.cutoff = 95,
  segments = nSimulations
)
#> Warning: in practice, doing PLS works best if
#> you do each free parameter separately, so one parameter does not dominate
PLSmodel
#> Partial least squares regression , fitted with the kernel algorithm.
#> Call:
#> plsr(formula = trueFreeValuesMatrix ~ summaryValuesMatrix, ncomp = ncomp.final, scale = scale, validation = "none", segments = segments)
PLSTransform(
  summaryValuesMatrix = summaryValuesMat,
  pls.model = PLSmodel
)
#>             Comp 1      Comp 2      Comp 3     Comp 4
#> result.1 -1.865662  1.26007718  3.46401033  3.7571723
#> result.2  6.853179 -4.71992485  3.47048417 -0.2673290
#> result.3  2.124331 -0.04484062 -9.14074659  1.0786583
#> result.4  2.059156  6.38273297  1.93767032 -1.0738489
#> result.5 -1.856793  0.10975128  0.01306283 -3.0782262
#> result.6 -7.314211 -2.98779596  0.25551894 -0.4164266
# }