Differential expression

Usage

diffExp(object, ...)

# S4 method for SingleCellExperiment
diffExp(
  object,
  numerator,
  denominator,
  caller = c("edgeR", "DESeq2"),
  minCells = 2L,
  minCellsPerGene = 1L,
  minCountsPerCell = 1L
)

Arguments

object: Object. The method documented here is defined for SingleCellExperiment.
numerator: character. Cells to use in the numerator of the contrast (e.g. treatment).
denominator: character. Cells to use in the denominator of the contrast (e.g. control).
caller: character(1). Package to use for differential expression calling. Defaults to "edgeR" (faster for large datasets) but "DESeq2" is also supported.
minCells: integer(1). Minimum number of cells required to perform the differential expression analysis.
minCellsPerGene: integer(1). The minimum number of cells where a gene is expressed, to pass low expression filtering.
minCountsPerCell: integer(1). Minimum number of counts per cell for a gene to pass low expression filtering. The number of cells is defined by minCellsPerGene.
...: Additional arguments.
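
One reading of the two low-expression thresholds is that a gene is kept when at least minCellsPerGene cells each contain at least minCountsPerCell counts for it. The base R sketch below illustrates that reading on a toy matrix. The helper name filterLowExpression and the toy data are hypothetical, and the package's internal filtering may differ.

```r
## Hypothetical helper (not part of the package): illustrates the filter
## described above on a plain genes x cells count matrix.
filterLowExpression <- function(counts,
                                minCellsPerGene = 1L,
                                minCountsPerCell = 1L) {
    stopifnot(is.matrix(counts), is.numeric(counts))
    ## Number of cells in which each gene reaches the per-cell threshold.
    nCells <- rowSums(counts >= minCountsPerCell)
    counts[nCells >= minCellsPerGene, , drop = FALSE]
}

## Toy data: 4 genes x 5 cells.
counts <- matrix(
    data = c(
        0L, 0L, 0L, 0L, 0L,
        1L, 0L, 0L, 0L, 0L,
        3L, 2L, 0L, 5L, 0L,
        0L, 1L, 1L, 0L, 0L
    ),
    nrow = 4L,
    byrow = TRUE,
    dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5))
)

## Require expression in at least 2 cells.
filtered <- filterLowExpression(
    counts = counts,
    minCellsPerGene = 2L,
    minCountsPerCell = 1L
)
rownames(filtered)
```

With the defaults (both thresholds set to 1), only genes with zero counts in every cell are removed.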

Value

Varies depending on the caller argument:

caller = "edgeR": DGELRT.
caller = "DESeq2": Unshrunken DESeqResults.

Apply DESeq2::lfcShrink() if shrunken results are desired.

Details

Perform pairwise differential expression across groups of cells. Currently supports edgeR and DESeq2 as DE callers.

Note

Updated 2023-08-17.

DESeq2

We're providing preliminary support for DESeq2 as the differential expression caller. It is currently considerably slower for large datasets than edgeR.

We follow the DESeq2 conventions for contrasts: the name of the factor in the design formula, then the numerator level, then the denominator level used for the fold change calculation. See DESeq2::results() for details.
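
As a rough illustration of this convention, the base R sketch below encodes two vectors of cell identifiers as a two-level factor and builds the matching DESeq2-style contrast vector. The factor name "group", the level names, and the toy cell identifiers are assumptions made for the example, not names confirmed by the package.

```r
## Toy cell identifiers; in practice these come from the object's column names.
numerator <- c("cell1", "cell2", "cell3")
denominator <- c("cell4", "cell5")

## Assumption: a single two-level factor named "group" encodes the contrast.
## The denominator is the first (reference) level.
cells <- c(numerator, denominator)
group <- factor(
    x = ifelse(cells %in% numerator, "numerator", "denominator"),
    levels = c("denominator", "numerator")
)
names(group) <- cells

## DESeq2-style contrast: factor name, numerator level, denominator level.
contrast <- c("group", "numerator", "denominator")

## With the denominator as reference, the second design column represents
## numerator relative to denominator.
design <- stats::model.matrix(~group)
colnames(design)
```

Putting the denominator first in the factor levels means a positive log fold change indicates higher expression in the numerator cells.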

Van den Berge, Perraudeau, and others have shown that the LRT may perform better for null hypothesis testing, so we use the LRT. To use the Wald test instead, setting useT = TRUE is recommended (not currently in use).

For UMI data, for which the expected counts may be very low, the likelihood ratio test implemented in nbinomLRT() should be used.
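
A minimal sketch of running the DESeq2 likelihood ratio test is shown below, using simulated data from DESeq2::makeExampleDESeqDataSet(). The specific settings (sfType = "poscounts", minmu = 1e-6, minReplicatesForReplace = Inf) are taken from DESeq2's own guidance for low-count data and are an assumption here. This page does not state which settings diffExp() passes internally.

```r
library(DESeq2)

set.seed(1)
## Simulated data for illustration only; the "condition" factor has
## levels "A" and "B", and the design is ~ condition.
dds <- makeExampleDESeqDataSet(n = 200, m = 12)

## Likelihood ratio test against an intercept-only reduced model.
dds <- DESeq(
    object = dds,
    test = "LRT",
    reduced = ~ 1,
    sfType = "poscounts",
    minmu = 1e-6,
    minReplicatesForReplace = Inf
)

## Contrast follows the convention: factor name, numerator, denominator.
res <- results(dds, contrast = c("condition", "B", "A"))
class(res)
```

The P values in res come from the LRT, while the reported log2 fold changes correspond to the requested contrast.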

Note that DESeq2 supports weights() values automatically, if slotted using zinbwave (which is no longer recommended for droplet scRNA-seq).

edgeR

The LRT has been shown to perform better for null hypothesis testing with droplet scRNA-seq data. Here we are using edgeR::glmLRT() internally.

edgeR is currently significantly faster than DESeq2 for large datasets.
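
For orientation, a typical edgeR workflow ending in edgeR::glmLRT() looks roughly like the sketch below. It uses simulated counts and a standard sequence of edgeR calls. The preprocessing steps that diffExp() actually runs before glmLRT() are not documented on this page, so treat the normalization and dispersion steps as assumptions.

```r
library(edgeR)

set.seed(1)
## Simulated counts: 100 genes x 10 cells, 5 cells per group.
counts <- matrix(
    data = rnbinom(n = 1000L, mu = 5, size = 2),
    nrow = 100L,
    dimnames = list(
        paste0("gene", seq_len(100L)),
        paste0("cell", seq_len(10L))
    )
)
group <- factor(
    x = rep(c("denominator", "numerator"), each = 5L),
    levels = c("denominator", "numerator")
)
design <- model.matrix(~group)

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
y <- estimateDisp(y, design = design)
fit <- glmFit(y, design = design)

## Test the second coefficient: numerator relative to denominator.
lrt <- glmLRT(fit, coef = 2L)
class(lrt)
topTags(lrt, n = 5L)
```

The returned object has class DGELRT, and topTags() extracts the top-ranked genes as a table.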

Seurat conventions

Note that Seurat currently uses the convention cells.1 for the numerator and cells.2 for the denominator. See Seurat::FindMarkers() for details.

Zero count inflation

We are no longer recommending the use of software that attempts to mitigate zero count inflation (e.g. zinbwave, zingeR) for UMI droplet-based single cell RNA-seq data. Simply model the counts directly.

Examples

data(SingleCellExperiment_Seurat, package = "AcidTest")
object <- SingleCellExperiment_Seurat

## Compare expression in cluster 2 relative to 1.
clusters <- clusters(object)
numerator <- names(clusters)[clusters == "2"]
summary(numerator)
#>    Length     Class      Mode 
#>        19 character character 
denominator <- names(clusters)[clusters == "1"]
summary(denominator)
#>    Length     Class      Mode 
#>        25 character character 

## edgeR ====
## > x <- diffExp(
## >     object = object,
## >     numerator = numerator,
## >     denominator = denominator,
## >     caller = "edgeR"
## > )
## > class(x)
## > summary(x)

## DESeq2 ====
## This will warn about weights with the minimal example.
## > x <- diffExp(
## >     object = object,
## >     numerator = numerator,
## >     denominator = denominator,
## >     caller = "DESeq2"
## > )
## > class(x)
## > summary(x)