% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dist_categorical.R
\name{dist_categorical}
\alias{dist_categorical}
\title{Compute pairwise distances for categorical data}
\usage{
dist_categorical(x, method = "matching_coefficient")
}
\arguments{
\item{x}{A data frame or matrix containing only categorical variables (factor or character).}

\item{method}{Currently only \code{"matching_coefficient"} is supported.}
}
\value{
A symmetric numeric matrix of pairwise distances. Each distance lies in the
  range [0, 1], where 0 indicates complete agreement and 1 indicates
  complete disagreement. \code{NA} is returned for pairs with no valid
  comparisons (no attribute is non-missing in both observations).
}
\description{
Internal helper function that computes distances between observations based on
the matching coefficient, which measures the proportion of matching attributes
between two categorical vectors. This approach is particularly useful for
categorical variables with more than two levels (multiclass variables).
}
\details{
The distance between two observations \eqn{i} and \eqn{j} is defined as:
\deqn{d(i, j) = 1 - \frac{\alpha}{p'}}
where \eqn{\alpha} is the number of matching attributes (agreements) and \eqn{p'}
is the number of non-missing comparisons between the two observations.
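
As a worked illustration (the values are chosen here for exposition and are not
package output): for the observations \code{("red", "circle")} and
\code{("blue", "circle")}, one of the two attributes matches, so
\eqn{\alpha = 1}, \eqn{p' = 2} and
\deqn{d(i, j) = 1 - \frac{1}{2} = 0.5}
If the second attribute were missing in either observation, only the first
attribute would be compared, giving \eqn{\alpha = 0}, \eqn{p' = 1} and
\eqn{d(i, j) = 1}.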


\itemize{
 \item Only categorical columns (factor or character) are supported; numeric columns
  must be converted (e.g., with \code{as.factor()}) before calling this function.
 \item Missing values (\code{NA}) are ignored pairwise. If no attribute is
  non-missing in both observations of a pair, the distance is returned as \code{NA}.
 \item This distance is equivalent to the normalized Hamming distance when
  applied to binary variables.
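 \item The resulting distance (one minus the matching coefficient) satisfies the
  metric properties when no values are missing, and can be used as a building
  block for mixed-type distances (e.g., combined with quantitative distances
  via Gower's similarity). With pairwise deletion of missing values, the
  triangle inequality is not guaranteed.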
}
}
\examples{
# Small categorical dataset
df <- data.frame(
  A = factor(c("red", "blue", "red")),
  B = factor(c("circle", "circle", "square"))
)
# Compute matching coefficient
dbrobust::dist_categorical(df)
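
# Hand check of the formula for rows 1 and 2. This is a base-R sketch that
# does not call the package; r1, r2, ok and d12 are illustrative names.
# A differs (red vs blue) and B matches (circle), so alpha = 1 and p' = 2.
r1 <- c(A = "red", B = "circle")
r2 <- c(A = "blue", B = "circle")
ok <- !is.na(r1) & !is.na(r2)               # attributes observed in both rows
d12 <- 1 - sum(r1[ok] == r2[ok]) / sum(ok)  # 1 - alpha / p'
d12  # 0.5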

}
\keyword{internal}
