bcbioSingleCell
is an S4 class that extends SingleCellExperiment
, and is
designed to store a bcbio single-cell RNA-seq analysis. This class contains
read counts saved as a sparse matrix (sparseMatrix
), sample metadata, and
cell quality control metrics.
Usage
bcbioSingleCell(
uploadDir,
sampleMetadataFile = NULL,
organism = NULL,
ensemblRelease = NULL,
genomeBuild = NULL,
gffFile = NULL,
transgeneNames = NULL,
interestingGroups = "sampleName"
)
Arguments
- uploadDir
character(1)
. Final upload directory path.- sampleMetadataFile
character(1)
. Sample metadata file path. CSV or TSV is preferred, but Excel worksheets are also supported. Check the documentation for conventions and required columns.- organism
character(1)
. Full Latin organism name (e.g."Homo sapiens"
).- ensemblRelease
integer(1)
. Ensembl release version (e.g.100
). We recommend setting this value if possible, for improved reproducibility. When left unset, the latest release available via AnnotationHub/ensembldb is used. Note that the latest version available can vary, depending on the versions of AnnotationHub and ensembldb in use.- genomeBuild
character(1)
. Ensembl genome build assembly name (e.g."GRCh38"
). If setNULL
, defaults to the most recent build available. Note: don't pass in UCSC build IDs (e.g."hg38"
).- gffFile
character(1)
. GFF/GTF (General Feature Format) file. Generally, we recommend using GTF (GFFv2) instead of GFFv3.- transgeneNames
character
. Vector indicating which assay rows denote transgenes (e.g. EGFP, TDTOMATO).- interestingGroups
character
. Groups of interest to use for visualization. Corresponds to factors describing the columns of the object.
Remote data
When working in RStudio, we recommend connecting to the bcbio-nextgen run directory as a remote connection over sshfs.
Examples
uploadDir <- system.file("extdata/indrops", package = "bcbioSingleCell")
x <- bcbioSingleCell(uploadDir)
#>
#> ────────────────────────────────────────────────────────────────────────────────
#> bcbioSingleCell
#> ────────────────────────────────────────────────────────────────────────────────
#> → Importing bcbio-nextgen single-cell RNA-seq run
#> projectDir: 2018-01-01_bcbio
#> ℹ 1 sample detected:
#> • multiplexed_AAAAAAAA
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/project-summary.yaml using yaml::`read_yaml()`.
#> ! Data versions are missing.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/programs.txt using base::`read.table()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/bcbio-nextgen.log using base::`readLines()`.
#> ! bcbio-nextgen.log file is empty.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/bcbio-nextgen-commands.log using base::`readLines()`.
#> ℹ 1000 reads per cellular barcode cutoff detected.
#> Counts will imported as genes.
#> UMI type: harvard-indrop-v3
#> ── Sample metadata
#> ── Counts
#> → Importing counts.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames using base::`readLines()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames using base::`readLines()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx using Matrix::`readMM()`.
#> → Importing sidecar /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames using base::`readLines()`.
#> → Importing sidecar /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames using base::`readLines()`.
#> → Imported multiplexed-AAAAAAAA.
#> ── Feature metadata
#> bcbio GTF file:
#> /n/app/bcbio/dev/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf
#> ! bcbio GTF file is not accessible.
#> ! Slotting empty ranges into `rowRanges()`.
#> ── Column data
#> ! `sampleMetadataFile` is recommended for multiplexed samples (e.g. "harvard-indrop-v3").
#> ── Metadata
#> → Importing unfiltered cellular barcode distributions.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA-barcodes.tsv using base::`read.table()`.
#> → Calculating 100 sample metrics.
#> ! Calculating metrics without biotype information.
#> `rowData()` required to calculate: "nCoding", "nMito", "mitoRatio".
#> ✔ bcbio single-cell RNA-seq run imported successfully.
print(x)
#> bcbioSingleCell 0.7.1
#> uploadDir: /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops
#> dates(2): 2018-01-01 2023-12-04
#> level: genes
#> interestingGroups: sampleName
#> filtered: FALSE
#> class: SingleCellExperiment
#> dim: 50 100
#> metadata(27): allSamples bcbioCommandsLog ... wd yaml
#> assays(1): counts
#> rownames(50): ENSG00000071082 ENSG00000100316 ... ENSG00000269028
#> ENSG00000282105
#> rowData names(0):
#> colnames(100): AAACACTA_CTTCGATT AAACTACA_CCACATTA ...
#> TGGGAATT_ATATAGGA TGTTATCA_ACGCAGAG
#> colData names(9): log10FeaturesPerCount mitoRatio ... sampleId
#> sampleName
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
x <- bcbioSingleCell(
uploadDir = uploadDir,
sampleMetadataFile = file.path(uploadDir, "metadata.csv")
)
#>
#> ────────────────────────────────────────────────────────────────────────────────
#> bcbioSingleCell
#> ────────────────────────────────────────────────────────────────────────────────
#> → Importing bcbio-nextgen single-cell RNA-seq run
#> projectDir: 2018-01-01_bcbio
#> ℹ 1 sample detected:
#> • multiplexed_AAAAAAAA
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/project-summary.yaml using yaml::`read_yaml()`.
#> ! Data versions are missing.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/programs.txt using base::`read.table()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/bcbio-nextgen.log using base::`readLines()`.
#> ! bcbio-nextgen.log file is empty.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/2018-01-01_bcbio/bcbio-nextgen-commands.log using base::`readLines()`.
#> ℹ 1000 reads per cellular barcode cutoff detected.
#> Counts will imported as genes.
#> UMI type: harvard-indrop-v3
#> ── Sample metadata
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/metadata.csv using base::`read.table()`.
#> ℹ Multiplexed samples detected.
#> ── Counts
#> → Importing counts.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames using base::`readLines()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames using base::`readLines()`.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx using Matrix::`readMM()`.
#> → Importing sidecar /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.rownames using base::`readLines()`.
#> → Importing sidecar /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA.mtx.colnames using base::`readLines()`.
#> → Imported multiplexed-AAAAAAAA.
#> ── Feature metadata
#> bcbio GTF file:
#> /n/app/bcbio/dev/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf
#> ! bcbio GTF file is not accessible.
#> ! Slotting empty ranges into `rowRanges()`.
#> ── Column data
#> ── Metadata
#> → Importing unfiltered cellular barcode distributions.
#> → Importing /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/multiplexed-AAAAAAAA/multiplexed-AAAAAAAA-barcodes.tsv using base::`read.table()`.
#> → Calculating 100 sample metrics.
#> ! Calculating metrics without biotype information.
#> `rowData()` required to calculate: "nCoding", "nMito", "mitoRatio".
#> ✔ bcbio single-cell RNA-seq run imported successfully.
print(x)
#> bcbioSingleCell 0.7.1
#> uploadDir: /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops
#> dates(2): 2018-01-01 2023-12-04
#> level: genes
#> sampleMetadataFile: /private/var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T/RtmpVNjWW5/temp_libpathd51756da0957/bcbioSingleCell/extdata/indrops/metadata.csv
#> interestingGroups: sampleName
#> filtered: FALSE
#> class: SingleCellExperiment
#> dim: 50 100
#> metadata(27): allSamples bcbioCommandsLog ... wd yaml
#> assays(1): counts
#> rownames(50): ENSG00000071082 ENSG00000100316 ... ENSG00000269028
#> ENSG00000282105
#> rowData names(0):
#> colnames(100): AAACACTA_CTTCGATT AAACTACA_CCACATTA ...
#> TGGGAATT_ATATAGGA TGTTATCA_ACGCAGAG
#> colData names(15): aggregate description ... sampleName sequence
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):