% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/bruvo.r
\name{bruvo.dist}
\alias{bruvo.dist}
\title{Bruvo's distance for microsatellites}
\usage{
bruvo.dist(pop, replen = 1, add = TRUE, loss = TRUE)
}
\arguments{
\item{pop}{a \code{\link{genind}} object}

\item{replen}{a \code{vector} of \code{integers} indicating the length of the
  nucleotide repeats for each microsatellite locus. E.g. a locus with a (CAT)
  repeat would have a \code{replen} value of 3.}

\item{add}{if \code{TRUE}, genotypes with zero values will be treated under
  the genome addition model presented in Bruvo et al. 2004. See the
  \strong{Note} section for options.}

\item{loss}{if \code{TRUE}, genotypes with zero values will be treated under
  the genome loss model presented in Bruvo et al. 2004. See the
  \strong{Note} section for options.}
}
\value{
an object of class \code{\link{dist}}
}
\description{
Calculate the average Bruvo's distance over all loci in a population.
}
\details{
Ploidy level does not affect how Bruvo's distance is calculated.
  However, because the distance compares all alleles at a locus, the two
  genotypes being compared need to have the same ploidy level at that locus.
  Unfortunately for polyploids, it is often difficult to fully separate
  distinct alleles at each locus, so genotypes can appear to have a lower
  ploidy level than the organism.

  To handle these situations, Bruvo has suggested three methods for
  accommodating differences in ploidy level: \itemize{ \item
  \strong{Infinite Model} - The simplest approach is to count each
  missing allele as infinitely large, so that the distance between it and
  any other allele is 1. Although this is computationally simple, it will
  tend to \strong{inflate distances between individuals}. \item
  \strong{Genome Addition Model} - If it is suspected that the organism has
  gone through a recent genome expansion, \strong{the missing alleles will be
  replaced with all possible combinations of the observed alleles in the
  shorter genotype}. For example, if there is a genotype of [69, 70, 0, 0]
  where 0 is a missing allele, the possible combinations are: [69, 70, 69,
  69], [69, 70, 69, 70], and [69, 70, 70, 70]. The resulting distances are
  then averaged over the number of comparisons. \item \strong{Genome Loss
  Model} - This is similar to the genome addition model, except that it
  assumes that there was a recent genome reduction event and uses \strong{the
  observed values in the full genotype to fill the missing values in the
  short genotype}. As with the Genome Addition Model, the resulting distances
  are averaged over the number of comparisons. \item \strong{Combination
  Model} - Combine and average the genome addition and loss models. }

  As mentioned above, the infinite model is biased, but it is not nearly as
  computationally intensive as either of the other models. The reason is
  that both the addition and loss models require replacement of
  alleles and recalculation of Bruvo's distance. The number of replacements
  required is equal to the multiset coefficient: \eqn{\left({n \choose
  k}\right) = {(n+k-1) \choose k}}{choose(n+k-1, k)} where \emph{n} is the
  number of potential replacements and \emph{k} is the number of alleles to
  be replaced. So, for the example given above, the genome addition model
  would require \eqn{\left({2 \choose 2}\right) = 3}{choose(2+2-1, 2) == 3}
  calculations of Bruvo's distance, whereas the genome loss model would
  require \eqn{\left({4 \choose 2}\right) = 10}{choose(4+2-1, 2) == 10}
  calculations.
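
  The three genome addition replacements in the example above can be
  enumerated with a short base R sketch. This is only an illustration of the
  counting argument, not the code used by \code{bruvo.dist}; the names
  \code{observed}, \code{idx}, and \code{fills} are hypothetical.
  \preformatted{
observed <- c(69, 70)  # alleles present in the short genotype
k <- 2                 # number of missing alleles to fill
n <- length(observed)  # number of potential replacements

# Each k-subset of 1..(n+k-1) maps to one multiset of size k drawn from 1..n
idx <- combn(n + k - 1, k, function(i) i - seq_len(k) + 1)
fills <- matrix(observed[idx], ncol = k, byrow = TRUE)

cbind(69, 70, fills)                 # one filled genotype per row
nrow(fills) == choose(n + k - 1, k)  # compare with the multiset coefficient
choose(4 + 2 - 1, 2)                 # count for the genome loss example
}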

  To reduce the number of calculations and the assumptions required, Bruvo's
  distance is calculated using the largest observed ploidy in each pairwise
  comparison. This means that when comparing [69,70,71,0] and [59,60,0,0],
  both genotypes will be treated as triploids.
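
  As a small illustration of this rule (a sketch only, not the package's
  internal code; \code{g1} and \code{g2} are hypothetical genotype vectors
  with 0 marking a missing allele), the ploidy used for a pair can be found
  by counting the observed alleles in each genotype:
  \preformatted{
g1 <- c(69, 70, 71, 0)
g2 <- c(59, 60, 0, 0)
max(sum(g1 > 0), sum(g2 > 0))  # 3, so the pair is compared as triploids
}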
}
\note{
\subsection{Model Choice}{ The \code{add} and \code{loss} arguments
  modify the model choice accordingly: \itemize{ \item \strong{Infinite
  Model:}  \code{add = FALSE, loss = FALSE} \item \strong{Genome Addition
  Model:}  \code{add = TRUE, loss = FALSE} \item \strong{Genome Loss Model:}
  \code{add = FALSE, loss = TRUE} \item \strong{Combination Model}
  \emph{(DEFAULT):}  \code{add = TRUE, loss = TRUE} } Details of each model
  choice are described in the \strong{Details} section, above. Additionally,
  genotypes containing all missing values at a locus will return a value of
  \code{NA} and not contribute to the average across loci. }
  \subsection{Repeat Lengths}{ If the user does not provide a vector of
  appropriate length for \code{replen}, it will be estimated by taking the
  minimum difference among represented alleles at each locus. IT IS NOT
  RECOMMENDED TO RELY ON THIS ESTIMATION. }
}
\examples{
# Please note that this example assumes that all loci in the nancycats
# dataset are dinucleotide repeats. This is most likely not an accurate
# representation of the data.

# Load the nancycats dataset and construct the repeat vector.
data(nancycats)
ssr <- rep(2, 9)

# Analyze the 1st population in nancycats

bruvo.dist(popsub(nancycats, 1), replen = ssr)
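
# The add and loss arguments select the model used for missing alleles (see
# the Note section). nancycats is diploid, so these calls are only meant to
# illustrate the argument combinations; the results may not differ here.
bruvo.dist(popsub(nancycats, 1), replen = ssr, add = FALSE, loss = FALSE) # infinite
bruvo.dist(popsub(nancycats, 1), replen = ssr, add = TRUE, loss = FALSE)  # addition
bruvo.dist(popsub(nancycats, 1), replen = ssr, add = FALSE, loss = TRUE)  # loss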

# View each population as a heatmap.
\dontrun{
sapply(nancycats$pop.names, function(x)
heatmap(as.matrix(bruvo.dist(popsub(nancycats, x), replen = ssr)), symm=TRUE))
}
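
# A rough sketch of the "minimum difference among represented alleles" idea
# described in the Note section. estimate_replen() is a hypothetical helper
# written for illustration only; it is not part of the package and may not
# match the internal estimation exactly.
estimate_replen <- function(alleles) min(diff(sort(unique(alleles))))
estimate_replen(c(135, 137, 141, 143))  # neighbouring alleles were observed
estimate_replen(c(135, 141, 147))       # sparse sampling gives a larger estimate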
}
\author{
Zhian N. Kamvar
}
\references{
Ruzica Bruvo, Nicolaas K. Michiels, Thomas G. D'Souza, and
  Hinrich Schulenburg. A simple method for the calculation of microsatellite
  genotype distances irrespective of ploidy level. Molecular Ecology,
  13(7):2101-2106, 2004.
}
\seealso{
\code{\link{bruvo.boot}}, \code{\link{bruvo.msn}}
}

