% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/HierPoolPrev.R
\name{HierPoolPrev}
\alias{HierPoolPrev}
\title{Estimation of prevalence based on presence/absence tests on pooled samples in
a hierarchical sampling frame}
\usage{
HierPoolPrev(
  data,
  result,
  poolSize,
  hierarchy,
  ...,
  prior.alpha = 0.5,
  prior.beta = 0.5,
  prior.absent = 0,
  hyper.prior.sd = 2,
  level = 0.95,
  verbose = FALSE,
  cores = NULL,
  iter = 2000,
  warmup = iter/2,
  chains = 4,
  control = list(adapt_delta = 0.9)
)
}
\arguments{
\item{data}{A \code{data.frame} with one row for each pooled sample and
columns for the size of the pool (i.e. the number of specimens / isolates /
insects pooled to make that particular pool) and the result of the test of the
pool. It may also contain additional columns with additional information
(e.g. location where the pool was taken), which can optionally be used for
splitting the data into smaller groups and calculating prevalence by group
(e.g. calculating prevalence for each location)}

\item{result}{The name of the column with the result of each test on each pooled
sample. The result must be stored with 1 indicating a positive test result
and 0 indicating a negative test result.}

\item{poolSize}{The name of the column with the number of
specimens/isolates/insects in each pool}

\item{hierarchy}{The name(s) of the column(s) indicating the group membership. In a
nested sampling design with multiple levels of grouping the lower-level
groups must have names/numbers that differentiate them from all other
groups at the same level. E.g. if sampling was performed at 200 sites
across 10 villages (20 sites per village), then there should be 200 unique
names for the sites. If, for instance, the sites are instead numbered 1 to
20 within each village, the village identifier (e.g. A, B, C...) should be
combined with the site number to create unique identifiers for each site
(e.g. A-1, A-2... for sites in village A and B-1, B-2... for the sites in
village B etc.)}

\item{...}{Optional name(s) of columns with variables to stratify the data by.
If omitted the complete dataset is used to estimate a single prevalence.
If included prevalence is estimated separately for each group defined by
these columns}

\item{prior.alpha, prior.beta, prior.absent}{The prior on the prevalence in
each group takes the form of a beta distribution (with parameters alpha and
beta). The default is \code{prior.alpha = prior.beta = 1/2}. Another popular
uninformative choice is \code{prior.alpha = prior.beta = 1}, i.e. a uniform
prior. \code{prior.absent} is included for consistency with \code{PoolPrev},
but is currently ignored}

\item{hyper.prior.sd}{Scale for the half-Cauchy hyper-prior for the standard
deviations of random/group effect terms. Defaults to 2, which is weakly
informative since it implies that 50\% of random/group effect terms will be
within an order of magnitude of each other, and 90\% of random/group effect
terms will be within four orders of magnitude of each other. Decrease if you
think group differences are smaller than this, and increase if you think group
differences may often reasonably be larger than this}

\item{level}{The level to be used for the credible intervals. Defaults to
0.95 (i.e. 95\% intervals)}

\item{verbose}{Logical indicating whether to print progress to screen.
Defaults to \code{FALSE} (no printing to screen)}

\item{cores}{The number of CPU cores to be used. By default one core is used}

\item{iter, warmup, chains}{MCMC options passed on to the sampling
routine. See \link[rstan]{stan} for details.}

\item{control}{A named list of parameters to control the sampler's behaviour.
Uses the default values defined in \link[rstan]{stan}, except for
\code{adapt_delta}, which is set to the more conservative value of 0.9. See
\link[rstan]{stan} for details.}
}
\value{
A \code{data.frame} with columns:
  \itemize{\item{\code{PrevBayes} -- the (Bayesian) posterior expectation}
           \item{\code{CrILow} and \code{CrIHigh} -- lower and upper bounds
                 for credible intervals}
           \item{\code{NumberOfPools} -- number of pools}
           \item{\code{NumberPositive} -- the number of positive pools} }
  If grouping variables are provided in \code{...} there will be an additional
  column for each grouping variable. When there are no grouping variables
  (supplied in \code{...}) then the output has only one row with the
  prevalence estimates for the whole dataset. When grouping variables are
  supplied, then there is a separate row for each group.
}
\description{
Estimation of prevalence based on presence/absence tests on pooled samples in
a hierarchical sampling frame
}
\examples{
# Calculate prevalence for a synthetic dataset consisting of pools (sizes 1, 5,
# or 10) taken from 4 different regions and 3 different years. Within each
# region specimens are collected at 4 different villages, and within each
# village specimens are collected at 8 different sites.

\donttest{
  #Prevalence for each combination of region and year:
  #ignoring hierarchical sampling frame within each region
  PoolPrev(SimpleExampleData, Result, NumInPool, Region, Year)
  #accounting for the hierarchical sampling frame within each region
  HierPoolPrev(SimpleExampleData, Result, NumInPool, c("Village","Site"), Region, Year)
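  #a sketch of the same hierarchical call with a uniform prior on prevalence
  #(prior.alpha = prior.beta = 1) and a more conservative sampler setting;
  #the values below are illustrative choices, not recommendations
  HierPoolPrev(SimpleExampleData, Result, NumInPool, c("Village","Site"), Region, Year,
               prior.alpha = 1, prior.beta = 1,
               control = list(adapt_delta = 0.95))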
}
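
# A minimal sketch (toy data, not part of the package) of building unique
# site identifiers when sites are numbered 1, 2, ... within each village.
# The data frame 'toy' and its column names are hypothetical.
toy <- data.frame(Village = rep(c("A", "B"), each = 3),
                  SiteNumber = rep(1:3, times = 2))
# Combine the village identifier with the within-village site number
toy$Site <- paste(toy$Village, toy$SiteNumber, sep = "-")
toy$Site
# Each site now has its own identifier, so c("Village", "Site") could be
# supplied as the hierarchy argument for data structured like this.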


}
\seealso{
\code{\link{PoolPrev}},
     \code{\link{getPrevalence}}
}
