detectOrganism and added support for chicken genome.
prepareSummarizedExperiment to make sample loading with
loadSingleCell in the bcbio packages less confusing.
*GTF alias functions to simply wrap the
*GFF functions with S4 methods support.
camel syntax for both lax and strict modes. Added
gsub in internal functions.
saveData. Now skips existing files when
overwrite = FALSE.
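The skip-on-existing behavior can be illustrated with a minimal base R sketch. The helper name saveIfMissing and its arguments are hypothetical and only approximate the idea; this is not the package's saveData code.

```r
# Hypothetical helper (an assumption for illustration, not the saveData source).
# Writes `object` to `file` unless the file already exists and overwrite is FALSE.
saveIfMissing <- function(object, file, overwrite = FALSE) {
    if (file.exists(file) && !isTRUE(overwrite)) {
        message("Skipping existing file: ", file)
        return(invisible(FALSE))
    }
    saveRDS(object, file = file)
    invisible(TRUE)
}

path <- tempfile(fileext = ".rds")
first <- saveIfMissing(1:3, path)   # file does not exist yet, so it is written
second <- saveIfMissing(4:6, path)  # file exists and overwrite = FALSE, so it is skipped
```

Returning a logical invisibly lets a caller check whether a write happened without cluttering the console.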
readDataVersions, which shouldn’t have the column types defined, using
col_types = "ccT".
loadDataAsName. Now, rather than using a named character vector for the
mappings argument, the user can simply pass the key-value pairs in as dots. For example,
newName1 = "oldName1", newName2 = "oldName2". The legacy
mappings method will still work, as long as the dots argument has a length of 1.
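A rough sketch of how key-value dots and a legacy named vector could be reconciled in base R. The function collectMappings is hypothetical and is only meant to show the dispatch on the length of the dots; the real loadDataAsName may differ.

```r
# Hypothetical helper (an assumption for illustration, not the package code).
# Accepts either key-value pairs as dots or a single named character vector.
collectMappings <- function(...) {
    dots <- list(...)
    # Legacy call: the dots have length 1 and hold a named character vector.
    if (length(dots) == 1L && !is.null(names(dots[[1L]]))) {
        return(dots[[1L]])
    }
    # New call: each dot is one newName = "oldName" pair.
    unlist(dots)
}

new <- collectMappings(newName1 = "oldName1", newName2 = "oldName2")
legacy <- collectMappings(
    mappings = c(newName1 = "oldName1", newName2 = "oldName2")
)
```

Both calls yield the same named character vector, so downstream code only needs to handle one shape.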
rowData to be left unset in
prepareSummarizedExperiment. This is useful for setting up objects that don’t contain gene annotations.
readSampleMetadata. This feature wasn’t fully baked and doesn’t offer enough functionality to the user.
toStringUnique code, which is still in use in the wormbase package.
detectOrganism. Now allowing
NULL return for unsupported organism, with a warning.
loadDataAsName now default to
replace = TRUE. If an object with the same name exists in the destination environment, then a warning is generated.
collapseToString only attempts to dynamically return the original object class on objects that aren’t class
data.frame. I updated this code to behave more nicely with grouped tibbles (
grouped_df), which are a virtual class of
data.frame and therefore can’t be coerced using
NULL for integers and numerics.
prepareSummarizedExperiment, added support for dropping
NULL objects in assays list. This is useful for handling output from bcbioRNASeq when
transformLimit is reached. In this case, the
vst matrices aren’t generated and set
NULL in the assays list. Using
Filter(Negate(is.null), assays), we can drop these
NULL objects and prevent a downstream dimension mismatch in the
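The Filter(Negate(is.null), assays) idiom mentioned above can be tried on a toy list. The matrix contents and slot names below are made up for illustration.

```r
# Toy assays list: the vst slot is NULL, as if the transformation was skipped.
assays <- list(
    counts = matrix(1:4, nrow = 2),
    vst = NULL
)

# Keep only the elements that are not NULL; names are preserved.
kept <- Filter(Negate(is.null), assays)
```

After filtering, only the counts matrix remains, so every element left in the list has real dimensions to check.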
readSampleMetadataFile. This now checks for a sequence column containing ACGT nucleotides. When those are detected, the
revcomp column is generated. Otherwise this step is skipped. This is useful for handling multiplexed sample metadata from 10X Genomics Cell Ranger single-cell RNA-seq samples.
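A minimal base R sketch of the check described above: detect whether a sequence column holds only ACGT characters and, if so, add a reverse complement column. The helper names isNucleotide and revcomp and the example metadata are assumptions, not the package's internals.

```r
# Hypothetical helpers (assumptions for illustration, not the package code).

# TRUE when every value consists solely of A, C, G, and T.
isNucleotide <- function(x) {
    all(grepl("^[ACGT]+$", x))
}

# Reverse complement: complement each base, then reverse the string.
revcomp <- function(x) {
    vapply(
        strsplit(chartr("ACGT", "TGCA", x), split = ""),
        function(chars) paste(rev(chars), collapse = ""),
        character(1L)
    )
}

# Made-up sample metadata with an index barcode sequence per sample.
meta <- data.frame(
    description = c("sample1", "sample2"),
    sequence = c("AACCGT", "TTGACA"),
    stringsAsFactors = FALSE
)

# Only generate the revcomp column when nucleotide sequences are detected.
if ("sequence" %in% colnames(meta) && isNucleotide(meta[["sequence"]])) {
    meta[["revcomp"]] <- revcomp(meta[["sequence"]])
}
```

When the sequence column is absent or holds non-nucleotide text, the condition is FALSE and the metadata is left unchanged.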
annotable function to include nested Entrez identifiers in the
entrez column. This is useful for downstream functional analysis.
midnightTheme ggplot2 theme. Originally this was defined as
darkTheme in the bcbioSingleCell package, but can be useful for other plots and has been moved here for general bioinformatics usage. The theme now uses
ggplot2::theme_minimal as the base, with some color tweaks, namely dark gray axes without white axis lines.
sanitizeAnnotable utility functions that will be used in the bcbio R packages.
microplate code from the wormbase package here, since it’s of general interest.
dgCMatrix method support in
aggregateFeatures functions. Both of these functions now use a consistent
groupings parameter, which uses a named factor to define the mappings of either samples (columns) for
aggregateReplicates or genes/transcripts (rows) for
makeNames sanitization functions. Now they will work on
names(x) for vectors by default.
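To make the named-factor groupings idea concrete, here is a small base R sketch that sums replicate columns of a dense matrix with rowsum. It is an assumption about the general approach, not the package's aggregateReplicates method, and the sample and gene names are invented.

```r
# Toy count matrix: 2 genes by 4 samples (two replicates per sample).
counts <- matrix(
    1:8,
    nrow = 2,
    dimnames = list(
        c("gene1", "gene2"),
        c("s1_rep1", "s1_rep2", "s2_rep1", "s2_rep2")
    )
)

# Named factor: names are the original columns, values are the target groups.
groupings <- factor(c(
    s1_rep1 = "s1", s1_rep2 = "s1",
    s2_rep1 = "s2", s2_rep2 = "s2"
))

# rowsum() aggregates rows, so transpose, sum by group, and transpose back.
aggregated <- t(rowsum(t(counts), group = groupings[colnames(counts)]))
```

Aggregating rows (genes or transcripts) works the same way without the transposes, using a factor named by the row names.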
detectOrganism to match against “H. sapiens”, etc.
NA values from LibreOffice and Microsoft Excel output in
readFileByExtension. This function now sets
readSampleMetadataFile. We were detecting the presence of
index column but should instead check against