% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/manova.update.r
\name{manova.update}
\alias{manova.update}
\title{MANOVA update for lm.rrpp model fits}
\usage{
manova.update(
  fit,
  error = NULL,
  tol = 1e-07,
  PC.no = NULL,
  print.progress = TRUE,
  verbose = NULL
)
}
\arguments{
\item{fit}{Linear model fit from \code{\link{lm.rrpp}}}

\item{error}{An optional character string to define R matrices 
used to calculate invR.H.
(Currently only Residuals can be used and this argument defaults 
to NULL.  Future versions
will update this argument.)}

\item{tol}{A tolerance value for culling data dimensions to 
prevent spurious results.  The distribution
of eigenvalues for the data will be examined and if the decay 
becomes less than the tolerance,
the data will be truncated to principal components ahead of this 
point.  This will possibly prevent spurious results
calculated from eigenvalues near 0.  If tol = 0, all possible 
PC axes are used, which is likely
not a problem if observations outnumber variables.  If tol = 0 
and the number of variables exceeds the number of observations,
the value of tol will be made slightly positive to prevent problems 
with eigen-analysis.}

\item{PC.no}{A value that, if not NULL, can override the tolerance 
argument, and forces a desired number of 
data PCs to use for analysis.  If a value larger than the possible 
number of PCs is chosen, the full compliment of PCs
(the full data space) will be used.  If a number larger than tol 
would permit is chosen, the minimum number of PCs between the tol
argument and PC.no argument is returned.}

\item{print.progress}{A logical value to indicate whether a 
progress bar should be printed to the screen.
This is helpful for long-running analyses.}

\item{verbose}{Either a NULL or logical value for whether to retain
all MANOVA result (if TRUE).  If NULL, the verbose argument used for
the \code{\link{lm.rrpp}} is retained for MANOVA update.  Essentially,
verbose indicates whether to retain all SSCP matrices and all invR.H matrices,
for every model effect, in every RRPP iteration.}
}
\value{
An object of class \code{lm.rrpp} is updated to include 
class \code{manova.lm.rrpp}, and the object,
$MANOVA, which includes
\item{SSCP}{Terms and Model SSCP matrices.}
\item{invR.H}{The inverse of the residuals SSCP times the H SSCP.}
\item{eigs}{The eigenvalues of invR.H.}
\item{e.rank}{Rank of the error (residuals) covariance matrix.  
Currently NULL only.}
\item{PCA}{Principal component analysis of data, using either tol 
or PC.no.}
\item{manova.pc.dims}{Resulting number of PC vectors in the 
analysis.}
\item{e.rank}{Rank of the residual (error) covariance matrix, 
irrespective of the number of dimensions 
 used for analysis.}
}
\description{
Function updates a lm.rrpp fit to add $MANOVA, which like 
$ANOVA, provides statistics
or matrices typically associated with multivariate analysis 
of variance (MANOVA).

MANOVA statistics or sums of squares and cross-products (SSCP) matrices
are calculated over the random permutations of a \code{\link{lm.rrpp}} 
fit.  SSCP matrices are 
computed, as are the inverse of R times H (invR.H), where R is a SSCP 
for the residuals or random effects and H is
the difference between SSCP matrices of full and reduced models 
(see below).   From invR.H,
MANOVA statistics are calculated, including Roy's maximum root 
(eigenvalue), Pillai trace, Hotelling-Lawley trace,
and Wilks lambda (via \code{\link{summary.manova.lm.rrpp}}).

The manova.update to add $MANOVA to \code{\link{lm.rrpp}} fits 
requires more computation time than the $ANOVA
statistics that are computed automatically in \code{\link{lm.rrpp}}.  
Generally, the same inferential conclusions will
be found with either approach, when observations outnumber response 
variables.  For high-dimensional data (more variables
than observations) data are projected into a Euclidean space of 
appropriate dimensions (rank of residual covariance matrix).  
One can vary the tolerance for eigenvalue decay or specify the number 
of PCs, if a smaller set of PCs than the maximum is desired.  
This is advised if there is strong correlation among variables 
(the data space could be simplified to fewer dimensions), as spurious
results are possible.  Because distributions of MANOVA stats 
can be generated from the random permutations,
there is no need to approximate F-values, like with parametric MANOVA. 
By restricting analysis to the real, positive eigenvalues calculated,
all statistics can be calculated (but Wilks lambda, as a product but 
not a trace, might be less reliable as variable number approaches
the number of observations).

 \subsection{ANOVA vs. MANOVA}{ 
 Two SSCP matrices are calculated for each linear model effect, for 
 every random permutation: R (Residuals or Random effects) and
 H, the difference between SSCPs for "full" and "reduced" models. 
 (Full models contain and reduced models lack
 the effect tested; SSCPs are hypothesized to be the same under 
 a null hypothesis, if there is no effect.  The 
 difference, H, would have a trace of 0 if the null hypothesis 
 were true.)  In RRPP, ANOVA and MANOVA correspond to
 two different ways to calculate statistics from R and H matrices.
 
 ANOVA statistics are those that find the trace of R and H SSCP 
 matrices before calculating subsequent statistics,
 including sums of squares (SS), mean squares (MS), and F-values.  
 These statistics can be calculated with univariate data
 and provide univariate-like statistics for multivariate data.  
 These statistics are dispersion measures only (covariances
 among variables do not contribute) and are the same as "distance-based" 
 stats proposed by Goodall (1991) and Anderson (2001).
 MANOVA stats require multivariate data and are implicitly 
 affected by variable covariances.  For MANOVA, the inverse of R times
 H (invR.H) is first calculated for each effect, then eigen-analysis 
 is performed on these matrix products.  Multivariate
 statistics are calculated from the positive, real eigenvalues.  
 In general, inferential
 conclusions will be similar with either approach, but effect 
 sizes might differ.
 
 Two important differences between manova.update and 
 \code{\link{summary.manova}} (for \code{\link{lm}} objects) 
 are that manova.update
 does not attempt to normalize residual SSCP matrices (unneeded 
 for non-parametric statistical solutions) and (2) uses a generalized
 inverse of the residual SSCP, if needed, when the number of 
 variables could render eigen-analysis problematic.  This approach 
 is consistent
 with covariance regularization methods that attempt to make covariance 
 matrices positive-definite for calculating model likelihoods or
 multivariate statistics.  If the number of observations far exceeds 
 the number of response variables, observed statistics from 
 manova.update and 
 \code{\link{summary.manova}} will be quite similar.  If the 
 number of response variables approaches or exceeds the number 
 of observations, manova.update
 statistics will be much more reliable.
 
 ANOVA tables are generated by \code{\link{anova.lm.rrpp}} on 
 lm.rrpp fits and MANOVA tables are generated
 by \code{\link{summary.manova.lm.rrpp}}, after running 
 manova.update on lm.rrpp fits.
 
 Currently, mixed model effects are only possible with 
 $ANOVA statistics, not $MANOVA.
 
 More detail is found in the vignette, ANOVA versus MANOVA.  
}
}
\examples{
   
# Body Shape Analysis (Multivariate) ----------------

data(Pupfish)

# Although not recommended as a practice, this example will use only
# three principal components of body shape for demonstration.  
# A larger number of random permutations should also be used.

Pupfish$shape <- prcomp(Pupfish$coords)$x[, 1:3]

Pupfish$logSize <- log(Pupfish$CS) 

fit <- lm.rrpp(shape ~ logSize + Sex, SS.type = "I", 
data = Pupfish, print.progress = FALSE, iter = 499) 
summary(fit, formula = FALSE)
anova(fit) # ANOVA table

# MANOVA

fit.m <- manova.update(fit, print.progress = FALSE, tol = 0.001)
summary(fit.m, test = "Roy")
summary(fit.m, test = "Pillai")

fit.m$MANOVA$eigs$obs # observed eigenvalues
fit.m$MANOVA$SSCP$obs # observed SSCP
fit.m$MANOVA$invR.H$obs # observed invR.H 

# Distributions of test statistics

summ.roy <- summary(fit.m, test = "Roy")
dens <- apply(summ.roy$rand.stats, 1, density)
par(mfcol = c(1, length(dens)))
for(i in 1:length(dens)) {
     plot(dens[[i]], xlab = "Roy max root", ylab = "Density",
     type = "l", main = names(dens)[[i]])
     abline(v = summ.roy$rand.stats[1, i], col = "red")
}
par(mfcol = c(1,1))

}
\references{
Goodall, C.R. 1991. Procrustes methods in the 
statistical analysis of shape. Journal of the Royal Statistical 
Society B 53:285-339.

Anderson MJ. 2001. A new method for non-parametric 
multivariate analysis of variance.
   Austral Ecology 26: 32-46.
}
\author{
Michael Collyer
}
\keyword{analysis}
