% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/DiffGenes.R
\name{DiffGenes}
\alias{DiffGenes}
\title{Differential Gene Expression Analysis}
\usage{
DiffGenes(ExpressionSet, nrep, method = "foldchange", lib.size = NULL,
  p.adjust.method = NULL, comparison = NULL, alpha = NULL,
  filter.method = NULL, n = NULL, stage.names = NULL)
}
\arguments{
\item{ExpressionSet}{a standard PhyloExpressionSet or DivergenceExpressionSet object.}

\item{nrep}{either a numeric value specifying the constant number of replicates per stage or a numeric vector specifying the variable number of replicates for each stage position.}

\item{method}{method to detect differentially expressed genes.}

\item{lib.size}{the library sizes used to equalize the libraries by quantile-to-quantile normalization (see \code{\link[edgeR]{equalizeLibSizes}}).}

\item{p.adjust.method}{p value correction method.}

\item{comparison}{a character string specifying whether genes having fold-change or p-values
 below, above, or below AND above (both) the \code{alpha} value should be excluded from the dataset.
 In case \code{comparison = "both"} is chosen, the \code{alpha} argument must be a vector of length two defining the lower \code{alpha} value at the first position and the upper \code{alpha} value
at the second position.}

\item{alpha}{a numeric value specifying the cut-off value above which genes fulfilling the corresponding fold-change, log-fold-change, or p-value should be retained and returned by \code{DiffGenes}.}

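\item{filter.method}{a method specifying how \code{alpha} values are applied across multiple stages. Options are \code{"const"}, \code{"min-set"}, and \code{"n-set"}.}

\item{n}{a numeric value for \code{filter.method = "n-set"} specifying the number of stages in which a gene must pass the \code{alpha} value.}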

\item{stage.names}{a character vector specifying the new names of collapsed stages.}
}
\description{
Detect differentially expressed genes (DEGs) in a standard \code{ExpressionSet} object.
}
\details{
All methods for the detection of differentially expressed genes assume that your input
dataset has been normalized before passing it to \emph{DiffGenes}. For RNA-Seq data
\emph{DiffGenes} assumes that the libraries have been normalized to have the same size, i.e.,
to have the same expected column sum under the null hypothesis. If this isn't the case
please run \code{\link[edgeR]{equalizeLibSizes}} before calling \emph{DiffGenes}. 

Available methods for the detection of differentially expressed genes:

\itemize{
\item \code{method = "foldchange"}: ratio of replicate geometric means between developmental stages.
 Here, the \emph{DiffGenes} function assumes that absolute expression levels are stored in your input \code{ExpressionSet}.
\item \code{method = "log-foldchange"}: difference of replicate arithmetic means between developmental stages. Here, the \emph{DiffGenes} function assumes that \emph{log2}
transformed expression levels are stored in your input \code{ExpressionSet}.
\item \code{method = "t.test"}: Welch t-test between replicate expression levels of two samples.
\item \code{method = "wilcox.test"}: Wilcoxon Rank Sum Test between replicate expression levels of two samples.
\item \code{method = "doubletail"}: Computes two-sided p-values by doubling the smaller tail probability (see \code{\link[edgeR]{exactTestDoubleTail}} for details).
\item \code{method = "smallp"}: Performs the method of small probabilities as proposed by Robinson and Smyth (2008) (see \code{\link[edgeR]{exactTestBySmallP}} for details).
\item \code{method = "deviance"}: Uses the deviance goodness of fit statistics to define the rejection region, and is therefore equivalent to a conditional likelihood ratio test (see \code{\link[edgeR]{exactTestByDeviance}} for details).
}


Exclude non differentially expressed genes from the result dataset:

When specifying the \code{alpha} argument, you also need to specify the \code{filter.method} to decide how non differentially expressed genes should be classified in multiple sample comparisons and which genes should be retained in the final dataset returned by \code{DiffGenes}. In other words, all genes < \code{alpha} based on the following \code{filter.method} are removed from the result dataset.

The following extraction criteria are implemented in this function:

\itemize{
\item \code{const}: all genes that have at least one sample comparison that undercuts or exceeds the \code{alpha} cut-off value will be excluded from the \code{ExpressionSet}. Hence, for a 7-stage \code{ExpressionSet}, genes passing the \code{alpha} threshold in 6 stages will be retained in the \code{ExpressionSet}.
\item \code{min-set}: genes passing the \code{alpha} value in \code{ceiling(n/2)} stages will be retained in the \code{ExpressionSet}, where \emph{n} is the number of stages in the \code{ExpressionSet}.
\item \code{n-set}: genes passing the \code{alpha} value in \code{n} stages will be retained in the \code{ExpressionSet}. Here, the argument \code{n} needs to be specified.
}
}
\note{
In case input \code{ExpressionSet} objects store 0 values, internally all expression levels are 
shifted by \code{+1} to allow sufficient fold-change and p-value computations. Additionally, a warning
is printed to the console in case expression levels have been automatically shifted.
}
\examples{

data(PhyloExpressionSetExample)

# Detection of DEGs using the fold-change measure
DEGs <- DiffGenes(ExpressionSet = PhyloExpressionSetExample[ ,1:8],
                  nrep          = 2,
                  comparison    = "below",
                  method        = "foldchange",
                  stage.names   = c("S1","S2","S3"))


head(DEGs)


# Detection of DEGs using the log-fold-change measure
# when choosing method = "log-foldchange" it is assumed that
# your input expression matrix stores log2 expression levels 
log.DEGs <- DiffGenes(ExpressionSet = tf(PhyloExpressionSetExample[1:5,1:8],log2),
                      nrep          = 2,
                      comparison    = "below",
                      method        = "log-foldchange",
                      stage.names   = c("S1","S2","S3"))


head(log.DEGs)
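
# A minimal base-R sketch (an illustration only, not the internal code
# of the package) of the two fold-change measures described in the
# Details section. The replicate values and the helper geom.mean()
# are hypothetical and exist only for this illustration.
s1 <- c(8, 2)    # hypothetical replicate expression levels of stage S1
s2 <- c(4, 1)    # hypothetical replicate expression levels of stage S2

# geometric mean of strictly positive values
geom.mean <- function(x) exp(mean(log(x)))

# foldchange: ratio of replicate geometric means (absolute levels)
fc <- geom.mean(s1) / geom.mean(s2)

# log-foldchange: difference of replicate arithmetic means (log2 levels)
log.fc <- mean(log2(s1)) - mean(log2(s2))

# on the log2 scale both measures agree
all.equal(log2(fc), log.fc)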


# Exclude non differentially expressed genes from the dataset:

## first have a look at the range of fold-change values of all genes 
apply(DEGs[ , 3:8],2,range)

# now run a Welch t-test and exclude genes whose p-values
# exceed the alpha = 0.05 threshold; hence, retain genes
# having p-values <= 0.05 in at least one sample comparison
DEGs.alpha <- DiffGenes(ExpressionSet = PhyloExpressionSetExample[1:250 ,1:8],
                        nrep          = 2,
                        method        = "t.test",
                        alpha         = 0.05,
                        comparison    = "above",
                        filter.method = "n-set",
                        n             = 1,
                        stage.names   = c("S1","S2","S3"))

# now again have a look at the range of the
# p-values of the retained genes
apply(DEGs.alpha[ , 3:5],2,range)

# now check whether each gene has at least one stage with a p-value <= 0.05
head(DEGs.alpha)
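
# A minimal base-R sketch (an illustration only, not the internal code
# of the package) of how the three filter.method options described in
# the Details section differ. Assumptions made here: the p-value matrix
# is made up (rows are hypothetical genes, columns are sample
# comparisons), a gene passes a comparison when its p-value is
# <= alpha, and passing in n stages is read as at least n.
pvals <- matrix(c(0.01, 0.20, 0.03,
                  0.04, 0.02, 0.01,
                  0.30, 0.04, 0.90,
                  0.50, 0.60, 0.70),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("g1", "g2", "g3", "g4"),
                                c("c1", "c2", "c3")))
alpha.toy <- 0.05

# number of comparisons in which each gene passes the threshold
n.pass <- rowSums(pvals <= alpha.toy)

# const: retain genes passing in every comparison
keep.const <- rownames(pvals)[n.pass == ncol(pvals)]

# min-set: retain genes passing in at least ceiling(n/2) comparisons
keep.minset <- rownames(pvals)[n.pass >= ceiling(ncol(pvals) / 2)]

# n-set with n = 1: retain genes passing in at least one comparison
keep.nset <- rownames(pvals)[n.pass >= 1]

keep.const
keep.minset
keep.nset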

}
\seealso{
\code{\link{Expressed}}
}
\author{
Hajk-Georg Drost
}
