R/methodsPLS.R
methodsPLS.Rd
Function `returnPLSModel` fits a PLS regression (using `plsr`) individually to each freely varying parameter of a model, unlike a true multivariate PLS regression. A secondary step then limits the number of components to those needed to explain some minimum cumulative percentage of variance (see argument `variance.cutoff`). For ABC, this appears to give much better results, without one parameter dominating the combined variance.
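The component-limiting step described above can be illustrated with a small base-R sketch. The helper name `pickNComp` and the percentages are hypothetical, chosen only for illustration; this is not the package's internal code.

```r
# Hypothetical helper: smallest number of components whose cumulative
# percentage of variance explained reaches the cutoff.
pickNComp <- function(percentExplained, variance.cutoff = 95) {
  stopifnot(variance.cutoff >= 0, variance.cutoff <= 100)
  cumulative <- cumsum(percentExplained)
  reached <- which(cumulative >= variance.cutoff)
  # Fall back to all components if the cutoff is never reached
  if (length(reached) == 0) length(percentExplained) else reached[1]
}

# Made-up percentages of variance explained by five components
percentExplained <- c(60, 25, 8, 4, 3)

nKeep <- pickNComp(percentExplained, variance.cutoff = 95)
nKeep
```

With these made-up values the cumulative percentages are 60, 85, 93, 97 and 100, so the first four components are retained at a cutoff of 95.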
```r
returnPLSModel(
  trueFreeValuesMatrix,
  summaryValuesMatrix,
  validation = "CV",
  scale = TRUE,
  variance.cutoff = 95,
  verbose = TRUE,
  segments = min(10, nrow(summaryValuesMatrix) - 1),
  ...
)

PLSTransform(summaryValuesMatrix, pls.model)
```
Argument | Description |
---|---|
trueFreeValuesMatrix | Matrix of true free values from simulations. |
summaryValuesMatrix | Matrix of summary statistics from simulations. |
validation | Character argument controlling what validation procedure is used by `plsr`. |
scale | This argument is passed to `plsr`. |
variance.cutoff | Minimum threshold percentage of variance explained for the number of components included in the final PLS model fit. This value is a percentage and must be between 0 and 100. Default is 95 percent. |
verbose | If |
segments | Number of segments of data used for cross-validation by `plsr`. |
... | Additional arguments, passed to `plsr`. |
pls.model | Output from `returnPLSModel`. |
Function `returnPLSModel` returns a PLS model, and function `PLSTransform` returns transformed summary statistics.
Function `PLSTransform` uses results from a Partial Least Squares (PLS) model fit with `returnPLSModel` to transform summary values. Function `returnPLSModel` effectively wraps function `plsr` from package `pls` (see documentation at `mvr`).
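As a rough sketch of the idea behind these two functions, the snippet below fits one `plsr` model per free parameter on toy data, then projects the summary statistics onto each model's components with `predict(..., type = "scores")`. It assumes package `pls` is installed. The toy data and the names `summaryMat`, `paramMat`, `fits` and `scoresList` are hypothetical, and fixing `ncomp = 2` stands in for the variance-based choice of components. Whether `PLSTransform` uses exactly this call internally is an assumption.

```r
library(pls)  # assumed to be installed

set.seed(42)

# Hypothetical toy data: 30 simulations, 5 summary statistics, 2 free parameters
summaryMat <- matrix(rnorm(30 * 5), nrow = 30,
                     dimnames = list(NULL, paste0("stat", 1:5)))
paramMat <- cbind(
  rate = summaryMat[, 1] + rnorm(30, sd = 0.1),
  root = summaryMat[, 2] - summaryMat[, 3] + rnorm(30, sd = 0.1)
)

# Fit one univariate PLS regression per free parameter,
# rather than a single multivariate fit on all columns at once
fits <- lapply(seq_len(ncol(paramMat)), function(j) {
  y <- paramMat[, j]
  plsr(y ~ summaryMat, ncomp = 2, scale = TRUE, validation = "none")
})
names(fits) <- colnames(paramMat)

# Transform the summary statistics into PLS component scores, per model
scoresList <- lapply(fits, function(fit) {
  predict(fit, newdata = summaryMat, type = "scores")
})

dim(scoresList[["rate"]])
```

Each element of `scoresList` has one row per simulation and one column per retained component, which is the same shape as the `PLSTransform` output shown in the package example.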
Brian O'Meara and Barb Banbury
```r
# \donttest{
set.seed(1)
simPhyExample <- rcoal(20)
simPhyExample$edge.length <- simPhyExample$edge.length * 20

# example simulation
nSimulations <- 6

simDataParallel <- parallelSimulateWithPriors(
  nrepSim = nSimulations,
  multicore = FALSE,
  coreLimit = 1,
  phy = simPhyExample,
  intrinsicFn = brownianIntrinsic,
  extrinsicFn = nullExtrinsic,
  startingPriorsFns = "normal",
  startingPriorsValues = list(
    c(mean(simCharExample[, 1]), sd(simCharExample[, 1]))),
  intrinsicPriorsFns = c("exponential"),
  intrinsicPriorsValues = list(10),
  extrinsicPriorsFns = c("fixed"),
  extrinsicPriorsValues = list(0),
  generation.time = 10000,
  checkpointFile = NULL,
  checkpointFreq = 24,
  verbose = FALSE,
  freevector = NULL,
  taxonDF = NULL
)

nParFree <- sum(attr(simDataParallel, "freevector"))

# separate the simulation results:
# 'true' generating parameter values from the summary values
trueFreeValuesMat <- simDataParallel[, 1:nParFree]
summaryValuesMat <- simDataParallel[, -1:-nParFree]

PLSmodel <- returnPLSModel(
  trueFreeValuesMatrix = trueFreeValuesMat,
  summaryValuesMatrix = summaryValuesMat,
  validation = "CV",
  scale = TRUE,
  variance.cutoff = 95,
  segments = nSimulations
)
#> Warning: in practice, doing PLS works best if
#> you do each free parameter separately, so one parameter does not dominate

PLSmodel
#> Partial least squares regression , fitted with the kernel algorithm.
#> Call:
#> plsr(formula = trueFreeValuesMatrix ~ summaryValuesMatrix, ncomp = ncomp.final, scale = scale, validation = "none", segments = segments)

PLSTransform(
  summaryValuesMatrix = summaryValuesMat,
  pls.model = PLSmodel
)
#>             Comp 1      Comp 2      Comp 3     Comp 4
#> result.1 -1.865662  1.26007718  3.46401033  3.7571723
#> result.2  6.853179 -4.71992485  3.47048417 -0.2673290
#> result.3  2.124331 -0.04484062 -9.14074659  1.0786583
#> result.4  2.059156  6.38273297  1.93767032 -1.0738489
#> result.5 -1.856793  0.10975128  0.01306283 -3.0782262
#> result.6 -7.314211 -2.98779596  0.25551894 -0.4164266
# }
```