% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/nauf-pmmeans.R
\name{nauf-pmmeans}
\alias{nauf-pmmeans}
\alias{nauf_pmmeans}
\alias{nauf_ref.grid}
\title{Predicted marginal means for \code{nauf} models.}
\usage{
nauf_ref.grid(mod, KR = FALSE, ...)

nauf_pmmeans(object, specs, pairwise = FALSE, subset = NULL,
  na_as_level = NULL, ...)
}
\arguments{
\item{mod}{A regression model fit with \code{nauf} contrasts.}

\item{KR}{Only applies when \code{mod} is a \code{\linkS4class{nauf.lmerMod}}
fit with \code{REML = TRUE}. If \code{KR = TRUE}, then the Kenward-Roger
approximation is used to calculate degrees of freedom.  If
\code{KR = FALSE} (the default), then the Satterthwaite approximation is
used.  When \code{mod} is a
\code{\linkS4class{nauf.lmerMod}} fit with \code{REML = FALSE}, then the
Satterthwaite approximation is always used.  The Kenward-Roger method
is implemented with \code{\link[pbkrtest]{Lb_ddf}} and the Satterthwaite
method is implemented with \code{\link[lmerTest]{calcSatterth}}.}

\item{...}{Additional arguments are ignored with a warning.}

\item{object}{A \code{nauf.ref.grid} object created with
\code{nauf_ref.grid}.}

\item{specs}{The fixed effects for which the full interaction term should be
considered in the calculation of predicted marginal means. The preferred
method is to specify the variables as a character vector.  However, they
can also be specified on the right hand side of a formula, optionally with
the keyword \code{pairwise} on the left hand side to indicate that pairwise
comparisons should be performed.}

\item{pairwise}{A logical (default \code{FALSE}) indicating whether pairwise
comparisons of the predicted marginal means should be performed.  If
\code{specs} is a formula, then the \code{pairwise} argument is ignored and
the left hand side of the formula is used to determine whether pairwise
comparisons should be made.}

\item{subset}{A list indicating which subsets of the reference grid should be
considered in the calculation of the predicted marginal means. See
'Details'.}

\item{na_as_level}{A character vector of unordered factors in \code{specs}
that have \code{NA} values that should be considered as levels. The default
\code{NULL} indicates that \code{NA} should not be considered as a level
for any unordered factors in \code{specs}. See 'Details'.}
}
\value{
\code{nauf_ref.grid} returns a \code{nauf.ref.grid} object, which is just a
list with one element \code{ref.grid} of class
\code{\link[lsmeans]{ref.grid-class}}.  This reference grid should not be
used directly with \code{\link[lsmeans]{lsmeans}}, but rather only with
\code{nauf_pmmeans}.

\code{nauf_pmmeans} returns a \code{nauf.pmm} object, which is a list
inheriting from \code{lsm.list} with an additional attribute
\code{nauf.specs} containing information about the variables and subsets from
the call to the function.  The \code{nauf.pmm} list contains an element
\code{pmmeans}, and, if pairwise comparisons were made, a second element
\code{contrasts}, both of which are \code{\link[lsmeans]{lsmobj-class}}
objects.  The \code{nauf.pmm} object has \code{summary} and \code{print}
methods which print information from the \code{nauf.specs} attribute, and
then call the \code{\link[lsmeans]{summary.ref.grid}} methods (to which
arguments \code{infer}, \code{type}, \code{adjust}, etc. can be passed).
}
\description{
Create a reference grid for a \code{nauf} model with \code{nauf_ref.grid},
and use the resulting \code{nauf.ref.grid} as the \code{object} argument to
\code{nauf_pmmeans} to obtain predicted marginal means and pairwise
comparisons, optionally conditioning these predictions on certain subsets of
the data via the \code{subset} argument.
}
\details{
A reference grid is a data frame which contains all possible combinations
of the levels of the factors in a regression model, holding all covariates
at their mean values.  There are many options for
\code{\link[lsmeans]{ref.grid}} which are not currently supported for
\code{nauf} models.  The main functionality which is not currently supported
is that the reference grid cannot be created specifying certain levels for
variables (i.e. the \code{at} argument; this is handled through the
\code{subset} argument to \code{nauf_pmmeans}).  A direct call to
\code{\link[lsmeans]{ref.grid}} will result in warnings (or possibly errors),
and inference made with the resulting object will be misleading and/or
incorrect.  Only \code{nauf_ref.grid} should be used. The \code{nauf.ref.grid}
returned by \code{nauf_ref.grid} can then be used as the \code{object}
argument to \code{nauf_pmmeans} to obtain predicted marginal means and
pairwise comparisons with p-values that adjust for familywise error rate.
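
As a rough illustration of what "all combinations of the factors, with
covariates held at their means" looks like, the following base R sketch builds
such a data frame by hand.  The data frame \code{d} and its columns are
hypothetical, and this is not how \code{nauf_ref.grid} constructs its grid
internally; it only shows the shape of the result.

\preformatted{
# hypothetical data: two factors and one covariate
d <- data.frame(
  f1 = factor(c("A", "A", "B", "B")),
  f2 = factor(c("C", "D", "C", "D")),
  x = c(1, 2, 3, 6))

# every combination of the factor levels
rg_sketch <- expand.grid(f1 = levels(d$f1), f2 = levels(d$f2))

# the covariate is held at its mean in every row
rg_sketch$x <- mean(d$x)

rg_sketch
}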

The \code{specs} and \code{pairwise} arguments to \code{nauf_pmmeans}
indicate what variables marginal means should be calculated for and whether
pairwise comparisons of these means should be made.  If \code{specs} is a
character vector, then \code{pairwise} is used; if \code{specs} is a formula,
then the full interaction of the terms on the right hand side of the formula
is considered, and the left hand side is used to indicate pairwise
comparisons.  For example (where \code{rg} is a \code{nauf.ref.grid}):

\preformatted{
# all of these calculate pmm's for each combination of the factors f1 and f2
# but not pairwise comparisons
nauf_pmmeans(rg, c("f1", "f2"))
nauf_pmmeans(rg, ~ f1 + f2)
nauf_pmmeans(rg, ~ f1 * f2)
nauf_pmmeans(rg, ~ f1:f2)

# all of these calculate the same pmm's, and additionally pairwise comparisons
nauf_pmmeans(rg, c("f1", "f2"), pairwise = TRUE)
nauf_pmmeans(rg, pairwise ~ f1 + f2)
nauf_pmmeans(rg, pairwise ~ f1 * f2)
nauf_pmmeans(rg, pairwise ~ f1:f2)
}
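
The equivalence of these calls can be pictured with a small base R sketch that
reduces either form of \code{specs} to a set of variable names and a pairwise
flag.  The helper \code{parse_specs} is hypothetical and written only for this
illustration; it is not a function from the package and may differ from what
\code{nauf_pmmeans} does internally.

\preformatted{
# hypothetical helper: reduce specs to variable names plus a pairwise flag
parse_specs <- function(specs, pairwise = FALSE) {
  if (inherits(specs, "formula")) {
    # a two-sided formula has length 3; its left hand side is element 2
    pairwise <- length(specs) == 3 &&
      identical(all.vars(specs[[2]]), "pairwise")
    # the right hand side is always the last element
    vars <- all.vars(specs[[length(specs)]])
  } else {
    vars <- specs
  }
  list(vars = vars, pairwise = pairwise)
}

parse_specs(~ f1 * f2)
parse_specs(pairwise ~ f1:f2)
parse_specs(c("f1", "f2"), pairwise = TRUE)
}

Under this sketch, \code{+}, \code{*}, and \code{:} on the right hand side all
yield the same variable names, which mirrors the statement that the full
interaction is always considered.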

If \code{specs} indicates a single covariate, the effect of an increase of
\code{1} in the covariate is computed.  If \code{specs} indicates multiple
covariates, the effect of a simultaneous increase of \code{1} in all of the
covariates is computed.  If \code{specs} indicates a combination of factors
and covariate(s), the effect of an increase of \code{1} for the
covariates is calculated for each level of the full interaction of the
factors.

The reference grid returned by \code{nauf_ref.grid} contains combinations of
factors which are not actually possible in the data set.  For example,
if factor \code{f1} has levels \code{A} and \code{B}, and factor \code{f2}
is \code{NA} when \code{f1 = A}, and takes values \code{C} and \code{D}
when \code{f1 = B}, the reference grid will still contain the combinations
\code{f1 = A, f2 = C}; \code{f1 = A, f2 = D}; and \code{f1 = B, f2 = NA},
even though these combinations are not possible.  This is because it is
impossible to know without the user's knowledge which combinations
make sense.  In many cases, this is inconsequential for the computation
of predicted marginal means, since the coding of unordered factors in
\code{nauf} regressions will average over the effects.  In cases where
these rows in the reference grid will cause invalid estimates and pairwise
comparisons, the \code{subset} argument can be used in the call to
\code{nauf_pmmeans} to ensure only the correct subsets are considered.
The default for the \code{subset} argument is \code{NULL}, indicating that
the entire reference grid should be considered.  If not \code{NULL}, then
\code{subset} must be a list which defines the valid subsets as lists of
named character vectors, where the name of the character vector is an
unordered factor in the model, and the vector itself contains the levels
which define the subset (including \code{NA} in the case of factors which
have \code{NA} values; when \code{NA} is specified as a level, there should
be no quotes around it).  Any row in the reference grid which matches the
definition of at least one of the groups defined in \code{subset} is kept,
and all others are dropped.  So, continuing with the \code{f1} and \code{f2}
example, if \code{f2 = NA} corresponds to \code{f2 = D} in meaning, and is
coded as \code{NA} because all \code{f1 = A} observations are by necessity
\code{f2 = D}, then to analyze the effect of \code{f1}, we want to compare
the groups \code{f1 = A, f2 = NA} and \code{f1 = B, f2 = D}, which we could
do with the following call:

\preformatted{
nauf_pmmeans(rg, "f1", subset = list(
  list(f1 = "A", f2 = NA), list(f1 = "B", f2 = "D")))
}

This would produce an estimate for \code{f1 = A} and \code{f1 = B}, but
conditioning on the subset where \code{f1} is truly contrastive based on
\code{f2}.  If, on the other hand, \code{f2 = NA} does not correspond in
interpretation to either \code{f2 = C} or \code{f2 = D}, but rather indicates
that \code{f2} is simply not meaningful when \code{f1 = A}, we would want to
average over the effect of \code{f2} within \code{f1 = B}, and compare this
result to \code{f1 = A, f2 = NA}, which we could do with the following call:

\preformatted{
nauf_pmmeans(rg, "f1", subset = list(
  list(f1 = "A", f2 = NA), list(f1 = "B", f2 = c("C", "D"))))
}

In this case, the second sub-list in the \code{subset} list indicates that if
\code{f1 = B} and either \code{f2 = C} or \code{f2 = D}, then it belongs to
the second subset.  In this case, the \code{subset} argument is actually not
necessary, since for \code{f1 = A}, we want to \emph{not consider} the effect
of \code{f2}, and for \code{f1 = B}, we want to \emph{average over all possible
levels} of \code{f2}, and these are actually the same thing computationally
for unordered factors in \code{nauf} models.  That is, we would get the same
result with:

\preformatted{
nauf_pmmeans(rg, "f1")
}
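
The rule that a reference grid row is kept when it matches at least one of the
groups in \code{subset} can be sketched in base R as follows.  The toy grid and
the helpers \code{in_group} and \code{keep_rows} are hypothetical and exist
only for this illustration; the actual filtering inside \code{nauf_pmmeans}
may be implemented differently.

\preformatted{
# toy grid with the levels from the running example, including NA for f2
grid <- expand.grid(f1 = c("A", "B"), f2 = c("C", "D", NA),
  stringsAsFactors = FALSE)

# TRUE for rows that satisfy every named condition in a single group
in_group <- function(grid, group) {
  keep <- rep(TRUE, nrow(grid))
  for (v in names(group)) {
    # is.element() is based on match(), which treats NA as matching NA
    keep <- keep & is.element(grid[[v]], group[[v]])
  }
  keep
}

# TRUE for rows that match at least one of the groups
keep_rows <- function(grid, subset) {
  Reduce(`|`, lapply(subset, function(g) in_group(grid, g)))
}

subs <- list(list(f1 = "A", f2 = NA), list(f1 = "B", f2 = "D"))
grid[keep_rows(grid, subs), ]
}

With these groups, only the rows \code{f1 = A, f2 = NA} and
\code{f1 = B, f2 = D} remain in the toy grid.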

Generally speaking, if all of the factors in \code{specs} do \emph{not}
contain \code{NA} values, then the \code{subset} argument is unnecessary.
If any of the factors in \code{specs} \emph{do} contain \code{NA} values,
then you will almost always want to use the \code{subset} argument.  Now
suppose that we are interested in \code{f2}.  Because \code{f2} is only
contrastive when \code{f1 = B}, we probably want to call:

\preformatted{
# note that because there are not multiple subsets being specified, you
# don't have to specify subset = list(list(f1 = "B")); nauf_pmmeans will
# assume list(f1 = "B") means list(list(f1 = "B"))
nauf_pmmeans(rg, "f2", subset = list(f1 = "B"))
}

This call will produce two estimates, one for \code{f2 = C} and one for
\code{f2 = D}, conditioning on \code{f1 = B}.  There will be no estimate for
\code{f2 = NA} because, by default, no estimates are produced for
combinations of factors where one factor is \code{NA}.  If we wanted to
compare the three possible groups (i.e. \code{f1 = A, f2 = NA};
\code{f1 = B, f2 = C}; and \code{f1 = B, f2 = D}), then we could additionally
use the \code{na_as_level} argument and change our \code{subset}:

\preformatted{
nauf_pmmeans(rg, "f2", subset = list(
  list(f1 = "A", f2 = NA), list(f1 = "B", f2 = c("C", "D"))),
  na_as_level = "f2")

# this gives the same estimates, but the output will also show the
# corresponding level of f1, which is more transparent
nauf_pmmeans(rg, c("f1", "f2"), subset = list(
  list(f1 = "A", f2 = NA), list(f1 = "B", f2 = c("C", "D"))),
  na_as_level = "f2")
}

The easiest way to use the \code{subset} argument is to create a list that
defines valid subsets for different regression terms of interest outside of
\code{nauf_pmmeans}, and then use the relevant element of the list in the
\code{nauf_pmmeans} call.  For example:

\preformatted{
pmmsubs <- list()
pmmsubs$f1 <- list(list(f1 = "A", f2 = NA), list(f1 = "B", f2 = c("C", "D")))
pmmsubs$f2 <- list(f1 = "B")

nauf_pmmeans(rg, "f1", subset = pmmsubs$f1)

nauf_pmmeans(rg, "f2", subset = pmmsubs$f2)

nauf_pmmeans(rg, c("f1", "f2"), subset = pmmsubs$f1, na_as_level = "f2")
}

This way you can just define the different subsets of the data once and not
have to think about it at every \code{nauf_pmmeans} call.
}
\seealso{
\code{\link{nauf_contrasts}}, \code{\link{nauf_glm}}, and
  \code{\link{nauf_glmer}}.
}

