Differential expression
Usage
diffExp(object, ...)
# S4 method for SingleCellExperiment
diffExp(
  object,
  numerator,
  denominator,
  caller = c("edgeR", "DESeq2"),
  minCells = 2L,
  minCellsPerGene = 1L,
  minCountsPerCell = 1L
)
Arguments
- object
Object.
- numerator
character. Cells to use in the numerator of the contrast (e.g. treatment).
- denominator
character. Cells to use in the denominator of the contrast (e.g. control).
- caller
character(1). Package to use for differential expression calling. Defaults to "edgeR" (faster for large datasets) but "DESeq2" is also supported.
- minCells
integer(1). Minimum number of cells required to perform the differential expression analysis.
- minCellsPerGene
integer(1). The minimum number of cells where a gene is expressed, to pass low expression filtering.
- minCountsPerCell
integer(1). Minimum number of counts per cell for a gene to pass low expression filtering. The number of cells is defined by minCellsPerGene.
- ...
Additional arguments.
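The interaction between minCountsPerCell and minCellsPerGene can be sketched in base R. This is a minimal illustration on a hypothetical toy matrix, assuming a gene passes when at least minCellsPerGene cells each contain at least minCountsPerCell counts; it is not the package's internal code.

```r
## Toy counts matrix (hypothetical data): genes in rows, cells in columns.
counts <- matrix(
    data = c(
        0L, 0L, 0L, 0L,
        0L, 1L, 0L, 3L,
        2L, 5L, 1L, 4L
    ),
    nrow = 3L,
    byrow = TRUE,
    dimnames = list(
        c("gene1", "gene2", "gene3"),
        c("cell1", "cell2", "cell3", "cell4")
    )
)
minCountsPerCell <- 1L
minCellsPerGene <- 1L
## Count the cells in which each gene reaches the per-cell threshold.
cellsPerGene <- rowSums(counts >= minCountsPerCell)
## Assumed rule: keep genes detected in enough cells.
keep <- cellsPerGene >= minCellsPerGene
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)
```

With the default thresholds of 1L, only genes with zero counts in every selected cell (gene1 here) are dropped under this assumed rule.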
Value
Varies depending on the caller argument:
- caller = "edgeR": DEGLRT.
- caller = "DESeq2": Unshrunken DESeqResults. Apply DESeq2::lfcShrink() if shrunken results are desired.
Details
Perform pairwise differential expression across groups of cells. Currently supports edgeR and DESeq2 as DE callers.
DESeq2
We're providing preliminary support for DESeq2 as the differential expression caller. It is currently considerably slower for large datasets than edgeR.
We aim to follow the conventions used in DESeq2 for contrasts, which define the name of the factor in the design formula, the numerator level, and the denominator level for the fold change calculations. See DESeq2::results() for details.
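The contrast convention can be sketched in base R as follows. DESeq2::results() accepts a contrast given as a character vector of length three: factor name, numerator level, denominator level. The factor name "group", the level names, and the cell identifiers below are assumptions made for illustration, not necessarily what diffExp() uses internally.

```r
## Hypothetical cell identifiers for each side of the contrast.
numerator <- c("cell3", "cell4")
denominator <- c("cell1", "cell2")
## Grouping factor over the selected cells. The denominator is placed first
## so that it acts as the reference level.
group <- factor(
    x = c(
        rep("denominator", times = length(denominator)),
        rep("numerator", times = length(numerator))
    ),
    levels = c("denominator", "numerator")
)
names(group) <- c(denominator, numerator)
## Contrast in the order: factor name, numerator level, denominator level.
contrast <- c("group", "numerator", "denominator")
## With a fitted DESeqDataSet `dds` designed on ~group (not created here),
## the contrast would be passed as:
## > DESeq2::results(dds, contrast = contrast)
table(group)
```

Under this convention, a positive log2 fold change indicates higher expression in the numerator cells relative to the denominator cells.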
Van den Berge, Perraudeau, and others have shown that the LRT may perform better for null hypothesis testing, so we use the LRT. To use the Wald test instead, it is recommended to set useT = TRUE (not currently in use).
For UMI data, for which the expected counts may be very low, the likelihood ratio test implemented in nbinomLRT() should be used.
Note that DESeq2 supports weights() values automatically, if slotted using zinbwave (which is no longer recommended for droplet scRNA-seq).
edgeR
The LRT has been shown to perform better for null hypothesis testing with droplet scRNA-seq data. Here we are using edgeR::glmLRT() internally.
edgeR is currently significantly faster than DESeq2 for large datasets.
Seurat conventions
Note that Seurat currently uses the convention cells.1 for the numerator and cells.2 for the denominator. See Seurat::FindMarkers() for details.
Zero count inflation
We no longer recommend software that attempts to mitigate zero count inflation (e.g. zinbwave, zingeR) for UMI droplet-based single-cell RNA-seq data. Simply model the counts directly.
Examples
data(SingleCellExperiment_Seurat, package = "AcidTest")
object <- SingleCellExperiment_Seurat
## Compare expression in cluster 2 relative to 1.
clusters <- clusters(object)
numerator <- names(clusters)[clusters == "2"]
summary(numerator)
#>    Length     Class      Mode
#>        19 character character
denominator <- names(clusters)[clusters == "1"]
summary(denominator)
#>    Length     Class      Mode
#>        25 character character
## edgeR ====
## > x <- diffExp(
## >     object = object,
## >     numerator = numerator,
## >     denominator = denominator,
## >     caller = "edgeR"
## > )
## > class(x)
## > summary(x)
## DESeq2 ====
## This will warn about weights with the minimal example.
## > x <- diffExp(
## >     object = object,
## >     numerator = numerator,
## >     denominator = denominator,
## >     caller = "DESeq2"
## > )
## > class(x)
## > summary(x)