% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fullmatch.R
\name{fullmatch}
\alias{fullmatch}
\alias{full}
\title{Optimal full matching}
\usage{
fullmatch(
  x,
  min.controls = 0,
  max.controls = Inf,
  omit.fraction = NULL,
  mean.controls = NULL,
  tol = 0.001,
  data = NULL,
  ...
)

full(
  x,
  min.controls = 0,
  max.controls = Inf,
  omit.fraction = NULL,
  mean.controls = NULL,
  tol = 0.001,
  data = NULL,
  ...
)
}
\arguments{
\item{x}{Any valid input to \code{match_on}. \code{fullmatch} will use
\code{x} and any optional arguments to generate a distance before performing
the matching.

If \code{x} is a numeric vector, a vector \code{z} indicating grouping must
also be passed. Both vectors must be named.

Alternatively, a precomputed distance may be supplied: either a matrix of
non-negative discrepancies, each indicating the permissibility and
desirability of matching the unit corresponding to its row (a 'treatment') to
the unit corresponding to its column (a 'control'); or, better, a distance
specification as produced by \code{\link{match_on}}.}

\item{min.controls}{The minimum ratio of controls to treatments that is to
be permitted within a matched set: should be non-negative and finite.  If
\code{min.controls} is not a whole number, the reciprocal of a whole number,
or zero, then it is rounded \emph{down} to the nearest whole number or
reciprocal of a whole number.

When matching within subclasses (such as those created by
\code{\link{exactMatch}}), \code{min.controls} may be a named numeric vector
separately specifying the minimum permissible ratio of controls to treatments
for each subclass.  The names of this vector should include the names of all
subproblems of the distance \code{x}.}

\item{max.controls}{The maximum ratio of controls to treatments that is
to be permitted within a matched set: should be positive and numeric.
If \code{max.controls} is not a whole number, the reciprocal of a
whole number, or \code{Inf}, then it is rounded \emph{up} to the
nearest whole number or reciprocal of a whole number.

When matching within subclasses (such as those created by
\code{\link{exactMatch}}), \code{max.controls} may be a named numeric vector
separately specifying the maximum permissible ratio of controls to treatments
in each subclass.}

\item{omit.fraction}{Optionally, specify what fraction of controls or treated
subjects are to be rejected.  If \code{omit.fraction} is a positive fraction
less than one, then \code{fullmatch} leaves up to that fraction of the control
reservoir unmatched.  If \code{omit.fraction} is a negative number greater
than -1, then \code{fullmatch} leaves up to |\code{omit.fraction}| of the
treated group unmatched.  Positive values are only accepted if
\code{max.controls} >= 1; negative values, only if \code{min.controls} <= 1.
If neither \code{omit.fraction} nor \code{mean.controls} is specified, then
only those treated and control subjects without permissible matches among the
control and treated subjects, respectively, are omitted.

When matching within subclasses (such as those created by
\code{\link{exactMatch}}), \code{omit.fraction} specifies the fraction of
controls to be rejected in each subproblem, a parameter that can be made to
differ by subclass by setting \code{omit.fraction} equal to a named numeric
vector of fractions.

At most one of \code{mean.controls} and \code{omit.fraction} can be non-\code{NULL}.}

\item{mean.controls}{Optionally, specify the average number of controls per
treatment to be matched. Must be no less than \code{min.controls} and no
greater than either \code{max.controls} or the ratio of the total number of
controls to the total number of treated units. Some controls will likely not be
matched to ensure meeting this value. If neither \code{omit.fraction} nor
\code{mean.controls} is specified, then only those treated and control
subjects without permissible matches among the control and treated subjects,
respectively, are omitted.

When matching within subclasses (such as those created by
\code{\link{exactMatch}}), \code{mean.controls} specifies the average number of
controls per treatment per subproblem, a parameter that can be made to
differ by subclass by setting \code{mean.controls} equal to a named numeric
vector.

At most one of \code{mean.controls} and \code{omit.fraction} can be non-\code{NULL}.}

\item{tol}{Because of internal rounding, \code{fullmatch} may
solve a slightly different matching problem than the one
specified, in which the match generated by
\code{fullmatch} may not coincide with an optimal solution of
the specified problem.  \code{tol} times the number of subjects
to be matched specifies the extent to
which \code{fullmatch}'s output is permitted to differ from an
optimal solution to the original problem, as measured by the
sum of discrepancies for all treatments and controls placed
into the same matched sets.}

\item{data}{Optional \code{data.frame} or \code{vector} to use to get order
of the final matching factor. If a \code{data.frame}, the \code{rownames}
are used. If a vector, the \code{names} are tried first; otherwise the contents
are taken to be a character vector of names. Useful to pass if you want to
combine a match (using, e.g., \code{cbind}) with the data that were used to
generate it (for example, in a propensity score matching).}

\item{...}{Additional arguments, passed to \code{match_on} (e.g. \code{within}) 
or to specific methods.}
}
\value{
An \code{\link{optmatch}} object (\code{factor}) indicating matched groups.
}
\description{
Given two groups, such as a treatment and a control group, and a method of
creating a treatment-by-control discrepancy matrix indicating desirability and
permissibility of potential matches (or, optionally, an already-created
discrepancy matrix of that kind), create optimal full matches of members of the groups.
Optionally, incorporate restrictions on matched sets' ratios of treatment to
control units.
}
\details{
If passing an already created discrepancy matrix, finite entries indicate
permissible matches, with smaller discrepancies indicating more desirable
matches.  The matrix must have row and column names.

If it is desirable to create the discrepancy matrix beforehand (for example,
if planning on running several different matching schemes), consider using
\code{\link{match_on}} to generate the distances. This generic function has
several useful methods for handling propensity score models, computing
Mahalanobis distances (and other arbitrary distances), and using user-supplied
functions. These distances can also be combined with those generated by
\code{\link{exactMatch}} and \code{\link{caliper}} to create very nuanced
matching specifications.

The value of \code{tol} can have a substantial effect on computation time;
with smaller values, computation takes longer.  Not every tolerance can be
met, and how small a tolerance is too small varies with the machine and with
the details of the problem.  If \code{fullmatch} can't guarantee that the
tolerance is as small as the given value of argument \code{tol}, then
matching proceeds but a warning is issued.

By default, \code{fullmatch} will attempt, if the given constraints are
infeasible, to find a feasible problem using the same constraints.  This
will almost surely involve using a more restrictive \code{omit.fraction} or
\code{mean.controls}. (This will never automatically omit treatment units.)
Note that this does not guarantee that the returned match has the least
possible number of omitted subjects; it only gives a match that is feasible
within the given constraints. It may often be possible to loosen the
\code{omit.fraction} or \code{mean.controls} constraint and still find a
feasible match. The auto recovery is controlled by
\code{options("fullmatch_try_recovery")}.

In full matching problems permitting many-one matches (\code{min.controls}
less than 1), the number of controls contributing to matches can exceed
what was requested by setting a value of \code{mean.controls} or
\code{omit.fraction}.  I.e., in this setting \code{mean.controls} sets
the minimum ratio of number of controls to number of treatments placed
into matched sets.

If the program detects (what it thinks is) a large problem,
a warning is issued. Unless you have an older computer, there's a good
chance that you can handle larger problems (at the cost of increased
computation time). To check the large problem threshold, use
\code{\link{getMaxProblemSize}}; to re-set it, use
\code{\link{setMaxProblemSize}}.
}
\examples{
data(nuclearplants)
### Full matching on a Mahalanobis distance.
( fm1 <- fullmatch(pr ~ t1 + t2, data = nuclearplants) )
summary(fm1)

### Full matching with restrictions.
( fm2 <- fullmatch(pr ~ t1 + t2, min.controls = .5, max.controls = 4, data = nuclearplants) )
summary(fm2)

### Full matching to half of available controls.
( fm3 <- fullmatch(pr ~ t1 + t2, omit.fraction = .5, data = nuclearplants) )
summary(fm3)
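
### Full matching with an average of two controls per treated unit (a sketch).
### With 10 treated units and 22 controls, mean.controls can be at most 2.2;
### the value 2 is an arbitrary illustrative choice.
( fm_mc <- fullmatch(pr ~ t1 + t2, mean.controls = 2, data = nuclearplants) )
summary(fm_mc)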

### Full matching attempts recovery when the initial restrictions are infeasible.
### Limiting max.controls = 1 allows use of only 10 of 22 controls.
( fm4 <- fullmatch(pr ~ t1 + t2, max.controls = 1, data=nuclearplants) )
summary(fm4)
### To retrieve the restrictions actually used in the match
optmatch_restrictions(fm4)
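
### Auto recovery can be switched off through the option named in Details.
### A sketch: with recovery disabled, the same max.controls = 1 problem is
### expected to be reported as infeasible rather than relaxed. The call is
### wrapped in try() so that the example continues either way.
old_opts <- options(fullmatch_try_recovery = FALSE)
fm4_strict <- try(fullmatch(pr ~ t1 + t2, max.controls = 1, data = nuclearplants))
options(old_opts)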

### Full matching within a propensity score caliper.
ppty <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
### Note that units without counterparts within the caliper are automatically dropped.
### For more complicated models, create a distance matrix and pass it to fullmatch.
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants) + caliper(match_on(ppty), width = 1)
( fm5 <- fullmatch(mhd, data = nuclearplants) )
summary(fm5)

### Propensity balance assessment. Requires RItools package.
if (require(RItools)) summary(fm5,ppty)

### The order of the names in the match factor is the same
### as the nuclearplants data.frame since we used the data argument
### when calling fullmatch. The order would be unspecified otherwise.
cbind(nuclearplants, matches = fm5)
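
### Full matching on a user-supplied discrepancy matrix (a sketch; the absolute
### difference in t1 is an arbitrary illustrative distance). Rows are treated
### units, columns are controls, and both dimensions must be named.
is_trt <- nuclearplants$pr == 1
dmat <- abs(outer(nuclearplants$t1[is_trt], nuclearplants$t1[!is_trt], "-"))
dimnames(dmat) <- list(rownames(nuclearplants)[is_trt],
                       rownames(nuclearplants)[!is_trt])
( fm_dmat <- fullmatch(dmat, data = nuclearplants) )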

### Match in subgroups only. There are a few ways to specify this.
m1 <- fullmatch(pr ~ t1 + t2, data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m2 <- fullmatch(pr ~ t1 + t2 + strata(pt), data=nuclearplants)
### Matching on propensity scores within matching in subgroups only:
m3 <- fullmatch(glm(pr ~ t1 + t2, data=nuclearplants, family=binomial),
                data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m4 <- fullmatch(glm(pr ~ t1 + t2 + pt, data=nuclearplants,
                    family=binomial),
                data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m5 <- fullmatch(glm(pr ~ t1 + t2 + strata(pt), data=nuclearplants,
                    family=binomial), data=nuclearplants)
# Including `strata(foo)` inside a glm uses `foo` in the model as
# well, so here m4 and m5 are equivalent. m3 differs in that it does
# not include `pt` in the glm.
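
### Inspect the large-problem threshold described in Details before
### deciding whether to change it with setMaxProblemSize().
getMaxProblemSize()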
}
\references{
Hansen, B.B. and Klopfer, S.O. (2006), \sQuote{Optimal full matching and related designs via network flows},
 \emph{Journal of Computational and Graphical Statistics}, \bold{15}, 609--627.

 Hansen, B.B. (2004), \sQuote{Full Matching in an Observational Study
 of Coaching for the SAT}, \emph{Journal of the American
 Statistical Association}, \bold{99}, 609--618.

 Rosenbaum, P. (1991), \sQuote{A Characterization of Optimal Designs for Observational
 Studies}, \emph{Journal of the Royal Statistical Society, Series B},
 \bold{53}, 597--610.
}
\keyword{nonparametric}
\keyword{optimize}
