Detects the organism from an Ensembl identifier, genome build, or alternate organism name.
Details
Only the first match is returned. The function steps through the input with a while loop so that unrecognized entries, such as transgenes or spike-ins, can be skipped. For speed, it fails after a maximum of 50 unknowns.
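The skip-and-stop behavior described above can be illustrated with a minimal sketch. This is not the package's actual source: the function name detectOrganismSketch, the maxUnknown argument, and the patterns lookup table are hypothetical, and the regular expressions are assumptions that cover only human and mouse.

```r
## Hypothetical sketch of the while loop approach; not the package source.
detectOrganismSketch <- function(x, maxUnknown = 50L) {
    ## Illustrative lookup table (assumption): two organisms only.
    patterns <- c(
        "Homo sapiens" = "^(ENSG[0-9]{11}|GRCh[0-9]+|hg[0-9]+)$",
        "Mus musculus" = "^(ENSMUSG[0-9]{11}|GRCm[0-9]+|mm[0-9]+)$"
    )
    i <- 1L
    unknown <- 0L
    while (i <= length(x)) {
        ## Test the current element against every organism pattern.
        hits <- vapply(
            X = patterns,
            FUN = function(pattern) grepl(pattern, x[[i]]),
            FUN.VALUE = logical(1L)
        )
        if (any(hits)) {
            ## Return the first match only and stop scanning.
            return(names(patterns)[which(hits)[[1L]]])
        }
        ## Unrecognized entry (e.g. transgene or spike-in): skip it.
        unknown <- unknown + 1L
        if (unknown >= maxUnknown) {
            break
        }
        i <- i + 1L
    }
    stop("Failed to detect organism.")
}

## Unrecognized entries are skipped until an identifier matches.
detectOrganismSketch(c("EGFP", "TDTOMATO", "ENSG00000000003"))
```

Capping the number of unknowns keeps the scan cheap on long vectors that contain no recognizable identifiers, at the cost of missing a match that appears only after the cap.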
Note
The BiocGenerics::organism() character method conflicts with the annotate package, which gets loaded into the namespace when DESeq2 is attached. Instead, we're exporting the character method here as a separate function named detectOrganism.
Updated 2023-09-14.
Supported organisms
Caenorhabditis elegans (roundworm)
Danio rerio (zebrafish)
Drosophila melanogaster (fruit fly)
Gallus gallus (chicken)
Homo sapiens (human)
Mus musculus (mouse)
Ovis aries (sheep)
Rattus norvegicus (rat)
Saccharomyces cerevisiae (yeast)
Examples
## Match by gene identifier.
detectOrganism("ENSG00000000003")
#> [1] "Homo sapiens"
## Match by genome build.
detectOrganism("GRCh38") # Ensembl
#> [1] "Homo sapiens"
detectOrganism("hg38") # UCSC
#> [1] "Homo sapiens"
## Match by alternate organism name.
detectOrganism("H. sapiens")
#> [1] "Homo sapiens"
detectOrganism("hsapiens")
#> [1] "Homo sapiens"
## The function will skip transgenes/spike-ins until we find a match.
detectOrganism(c("EGFP", "TDTOMATO", "ENSG00000000003"))
#> [1] "Homo sapiens"
## But it only returns the first match, if there are multiple genomes.
detectOrganism(c("ENSG00000000003", "ENSMUSG00000000001"))
#> [1] "Homo sapiens"