% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/SumStat.R
\name{SumStat}
\alias{SumStat}
\title{Calculate summary statistics for propensity score weighting}
\usage{
SumStat(
  ps.formula,
  ps.estimate = NULL,
  trtgrp = NULL,
  Z = NULL,
  covM = NULL,
  zname = NULL,
  xname = NULL,
  data = NULL,
  weight = "ATO",
  delta = 0
)
}
\arguments{
\item{ps.formula}{an object of class \code{\link{formula}} (or one that can be coerced to that class): a symbolic description of the propensity score model to be fitted. Additional details of model specification are given under ‘Details’. This argument is optional if \code{ps.estimate} is not \code{NULL}.}

\item{ps.estimate}{an optional matrix or data frame containing estimated (generalized) propensity scores for each observation. Typically, this is an N by J matrix, where N is the number of observations and J is the total number of treatment levels. Preferably, the column names of this matrix should match the names of the treatment levels. If column names are missing or there is a mismatch, the column names are assigned according to the alphabetical order of the treatment levels. A vector of propensity score estimates is also allowed in \code{ps.estimate}, in which case a binary treatment is implied and the input is regarded as the propensity to receive the last category of treatment in alphabetical order, unless otherwise specified by \code{trtgrp}.}

\item{trtgrp}{an optional character defining the "treated" population for estimating the average treatment effect among the treated (ATT). Only necessary if \code{weight = "ATT"}. This option can also be used to specify the treatment (in a two-treatment setting) when a vector argument is supplied for \code{ps.estimate}. The default is the last group in alphabetical order.}

\item{Z}{an optional vector specifying the values of treatment, only necessary when the covariate matrix \code{covM} is provided instead of \code{data}.}

\item{covM}{an optional covariate matrix or data frame including covariates, their interactions and higher-order terms. When the covariate matrix \code{covM} is provided, the balance statistics are generated according to each column of this matrix.}

\item{zname}{an optional character specifying the name of the treatment variable in \code{data}.}

\item{xname}{an optional vector of characters including the names of covariates in \code{data}.}

\item{data}{an optional data frame containing the variables in the propensity score model. If not found in data, the variables are taken from \code{environment(formula)}.}

\item{weight}{a character or vector of characters including the types of weights to be used. \code{"ATE"} specifies the inverse probability weights for estimating the average treatment effect among the combined population. \code{"ATT"} specifies the weights for estimating the average treatment effect among the treated. \code{"ATO"} specifies the (generalized) overlap weights for estimating the average treatment effect among the overlap population, or population at clinical equipoise. Default is \code{"ATO"}.}

\item{delta}{trimming threshold for estimated (generalized) propensity scores. Should be no larger than 1 / number of treatment groups. Default is 0, corresponding to no trimming.}
}
\value{
\code{SumStat} returns a \code{SumStat} object, a list containing the following values:
treatment group, propensity scores, propensity score weights, effective sample sizes,
and balance statistics. A summary of the \code{SumStat} object can be obtained with \code{\link{summary.SumStat}}.

\describe{
\item{\code{trtgrp}}{a character indicating the treatment group.}

\item{\code{propensity}}{a data frame of estimated propensity scores.}

\item{\code{ps.weights}}{a data frame of propensity score weights.}

\item{\code{ess}}{a table of effective sample sizes. This serves as a conservative measure to
characterize the variance inflation or precision loss due to weighting; see Li and Li (2019).}

\item{\code{unweighted.sumstat}}{a list of tables including covariate means and variances
by treatment group and standardized mean differences.}

\item{\code{ATE.sumstat}}{if \code{"ATE"} is included in \code{weight}, a list of summary statistics using inverse probability weighting.}

\item{\code{ATT.sumstat}}{if \code{"ATT"} is included in \code{weight}, a list of summary statistics using the ATT weights.}

\item{\code{ATO.sumstat}}{if \code{"ATO"} is included in \code{weight}, a list of summary statistics using the overlap weights.}

\item{\code{trim}}{if \code{delta > 0}, a table summarizing the number of observations before and after trimming.}

}
}
\description{
\code{SumStat} calculates the summary statistics used to generate distributional plots of the estimated
propensity scores and balance diagnostics after propensity score weighting.
}
\details{
A typical form for \code{ps.formula} is \code{treatment ~ terms}, where \code{treatment} is the treatment
variable (identical to the variable name used to specify \code{zname}) and \code{terms} is a series of terms
that specify a linear predictor for \code{treatment}. When \code{ps.estimate} is \code{NULL}, \code{ps.formula}
specifies the logistic or multinomial logistic model used to estimate the propensity scores.

When comparing two treatments, \code{ps.estimate} can be either a vector or a two-column matrix of estimated
propensity scores. If a vector is supplied, it is taken to contain the propensity scores for receiving the treatment, and
the treatment group corresponds to the last group in alphabetical order, unless otherwise specified by \code{trtgrp}.
When comparing multiple (J>=3) treatments, \code{ps.estimate} needs to be specified as an N by J matrix,
where N indicates the number of observations, and J indicates the total number of treatments.
This matrix specifies the estimated generalized propensity scores to receive each of the J treatments.
In general, \code{ps.estimate} should have column names that indicate the levels of the treatment variable,
which should match the levels given in \code{Z}.
If column names are empty or there is a mismatch, the column names will be created following
the alphabetical order of the treatment levels. The rightmost column of \code{ps.estimate} is then assumed
to be the treatment group when estimating ATT. \code{trtgrp} can also be used to specify the treatment
group for estimating ATT.

The argument \code{zname} is required when \code{ps.estimate} is not \code{NULL}.

To generate balance statistics, one can directly specify \code{Z} and \code{covM} to indicate the treatment levels and
covariate matrix. Alternatively, one can supply \code{data}, \code{zname}, and \code{xname} to indicate the
same information. When both are specified, the function will prioritize inputs from \code{Z} and \code{covM}.

The current version of \code{PSweight} allows for three types of propensity score weights, used to estimate ATE, ATT and
ATO. These weights are members of a larger class of balancing weights defined in Li, Morgan, and Zaslavsky (2018).
When there is a practical violation of the positivity assumption, \code{delta} defines the symmetric
propensity score trimming rule following Crump et al. (2009). With multiple treatments, \code{delta} defines the
multinomial trimming rule introduced in Yoshida et al. (2019). The overlap weights can also be considered as
a data-driven continuous trimming strategy that does not require specifying trimming rules; see Li, Thomas and Li (2019).
Additional details on balancing weights and generalized overlap weights for multiple treatment groups are provided in
Li and Li (2019).
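
As a sketch of the notation used in Li and Li (2019), which this page does not itself define, let \eqn{e_j(x)}{e_j(x)}
denote the generalized propensity score for treatment j among J treatments. The balancing weight for a unit with
covariates x in treatment group j can then be written as
\deqn{w_j(x) \propto \frac{h(x)}{e_j(x)},}{w_j(x) proportional to h(x) / e_j(x),}
where the tilting function \eqn{h(x)}{h(x)} determines the target population: \eqn{h(x) = 1}{h(x) = 1} for ATE,
\eqn{h(x) = e_t(x)}{h(x) = e_t(x)} for ATT with treated group t, and
\eqn{h(x) = \left(\sum_{k=1}^{J} 1/e_k(x)\right)^{-1}}{h(x) = 1 / sum_k (1 / e_k(x))} for ATO.
Under the trimming rules cited above, units whose smallest estimated (generalized) propensity score falls below
\code{delta} are excluded before the weights are computed.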
}
\examples{

data("psdata")
# the propensity model
ps.formula <- trt ~ cov1 + cov2 + cov3 + cov4 + cov5 + cov6

# using SumStat to estimate propensity scores
msstat <- SumStat(ps.formula, trtgrp="2", data=psdata,
   weight=c("ATE","ATO","ATT"))
summary(msstat)
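
# A hedged sketch of propensity score trimming. The threshold delta=0.05 is
# an illustrative assumption, not a recommendation; delta should be no larger
# than 1 / (number of treatment groups).
msstat.trim <- SumStat(ps.formula, trtgrp="2", data=psdata,
   weight=c("ATE","ATO","ATT"), delta=0.05)
# table of the number of observations before and after trimming
msstat.trim$trim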

# importing user-supplied propensity scores "e.h"
fit <- nnet::multinom(formula=ps.formula, data=psdata, maxit=500, trace=FALSE)
e.h <- fit$fitted.values
varname <- c("cov1","cov2","cov3","cov4","cov5","cov6")
msstat0 <- SumStat(zname="trt", xname=varname, data=psdata, ps.estimate=e.h,
   trtgrp="2", weight=c("ATE","ATT","ATO"))
summary(msstat0)
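
# The components of the returned list (see the Value section) can also be
# inspected directly; a brief sketch using the msstat0 object created above
msstat0$trtgrp            # treatment group
msstat0$ess               # effective sample sizes
head(msstat0$propensity)  # estimated propensity scores
head(msstat0$ps.weights)  # propensity score weights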

}
\references{
Crump, R. K., Hotz, V. J., Imbens, G. W., Mitnik, O. A. (2009).
Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187-199.

Li, F., Morgan, K. L., Zaslavsky, A. M. (2018).
Balancing covariates via propensity score weighting.
Journal of the American Statistical Association, 113(521), 390-400.

Li, F., Thomas, L. E., Li, F. (2019).
Addressing extreme propensity scores via the overlap weights. American Journal of Epidemiology, 188(1), 250-257.

Yoshida, K., Solomon, D. H., Haneuse, S., Kim, S. C., Patorno, E., Tedeschi, S. K., Lyu, H.,
Franklin, J. M., Stürmer, T., Hernández-Díaz, S., Glynn, R. J. (2019).
Multinomial extension of propensity score trimming methods: A simulation study.
American Journal of Epidemiology, 188(3), 609-616.

Li, F., Li, F. (2019). Propensity score weighting for causal inference with multiple treatments.
The Annals of Applied Statistics, 13(4), 2389-2415.
}
