% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/hybridRepairFilter.R
\name{hybridRepairFilter}
\alias{hybridRepairFilter}
\alias{hybridRepairFilter.default}
\alias{hybridRepairFilter.formula}
\title{Hybrid Repair-Remove Filter}
\usage{
\method{hybridRepairFilter}{formula}(formula, data, ...)

\method{hybridRepairFilter}{default}(x, consensus = FALSE,
  noiseAction = "remove", classColumn = ncol(x), ...)
}
\arguments{
\item{formula}{A formula describing the classification variable and the attributes to be used.}

\item{data, x}{Data frame containing the training dataset to be processed.}

\item{...}{Optional parameters to be passed to other methods.}

\item{consensus}{Logical. If set to \code{TRUE}, a consensus voting scheme is applied to identify noisy instances.
Otherwise (default), a majority voting scheme is used.}

\item{noiseAction}{Character string, one of \code{"remove"}, \code{"repair"} or \code{"hybrid"}. It determines
how the filter handles the instances identified as noise (see Details).}

\item{classColumn}{Positive integer indicating the column which contains the (factor of) classes.
By default, the last column is considered.}
}
\value{
An object of class \code{filter}, which is a list with seven components:
\itemize{
   \item \code{cleanData} is a data frame containing the filtered dataset.
   \item \code{remIdx} is a vector of integers indicating the indexes for
   removed instances (i.e. their row number with respect to the original data frame).
   \item \code{repIdx} is a vector of integers indicating the indexes for
   repaired/relabelled instances (i.e. their row number with respect to the original data frame).
   \item \code{repLab} is a factor containing the new labels for repaired instances.
   \item \code{parameters} is a list containing the argument values.
   \item \code{call} contains the original call to the filter.
   \item \code{extraInf} is a character string with additional
   information not covered by the previous items.
}
}
\description{
Ensemble-based filter for removing or repairing label noise from a dataset as a
preprocessing step of classification. For more information, see 'Details' and
'References' sections.
}
\details{
As presented in (Miranda et al., 2009), \code{hybridRepairFilter} trains an ensemble of four
classifiers on the dataset: SVM, Neural Network, CART and KNN (combining k=1,3,5). Based on their predictions
and on a majority or consensus voting scheme, a subset of instances is labeled as noise.
If \code{noiseAction} equals "remove", these instances are removed. If it equals "repair", their class
is changed to the one most voted by the ensemble. If it equals "hybrid", the vote of KNN decides
whether each noisy instance is removed or repaired.

The whole procedure is repeated as long as the accuracy (measured on the original dataset) of the
ensemble trained on the processed dataset keeps increasing.
}
\examples{
# The following example is not run in order to save time
\dontrun{
data(iris)
out <- hybridRepairFilter(iris, noiseAction = "hybrid")
summary(out, explicit = TRUE)
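
# A further sketch: the formula interface and the components listed in 'Value'.
# Assumption: extra arguments such as noiseAction and consensus are forwarded
# to the default method through '...'; 'out2' is just an illustrative name.
out2 <- hybridRepairFilter(Species ~ ., data = iris,
                           noiseAction = "repair", consensus = TRUE)
out2$repIdx           # row numbers (in iris) of the relabelled instances
out2$repLab           # new labels assigned to those instances
out2$remIdx           # row numbers (in iris) of the removed instances
nrow(out2$cleanData)  # number of rows in the filtered dataset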
}
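
# Illustrative sketch (base R only, runs instantly) of the majority and
# consensus voting schemes described in 'Details'. The predictions below are
# made up by hand; this is NOT the internal code of hybridRepairFilter.
labels <- factor(c("a", "a", "b", "b"))
preds <- cbind(SVM  = c("a", "b", "b", "a"),
               NN   = c("a", "b", "b", "a"),
               CART = c("a", "b", "a", "a"),
               KNN  = c("a", "b", "b", "b"))
# Number of classifiers whose prediction disagrees with the given label
wrongVotes <- rowSums(preds != as.character(labels))
# Assumption: 'majority' means strictly more than half of the classifiers;
# the tie handling of the actual filter may differ.
majorityNoise <- wrongVotes > ncol(preds) / 2
# Consensus: every classifier must disagree with the given label
consensusNoise <- wrongVotes == ncol(preds)
which(majorityNoise)
which(consensusNoise)
# Candidate label for a "repair" action: the most voted class per instance
mostVoted <- apply(preds, 1, function(v) names(which.max(table(v))))
mostVoted[majorityNoise]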
}
\references{
Miranda A. L., Garcia L. P. F., Carvalho A. C., Lorena A. C. (2009): Use of
classification algorithms in noise detection and elimination. In \emph{Hybrid Artificial
Intelligence Systems} (pp. 417-424). Springer Berlin Heidelberg.
}

