
Map gene names to Ensembl

Usage

mapGeneNamesToEnsembl(genes, organism, genomeBuild = NULL, release = NULL)

Arguments

genes

character. Gene names (e.g. "TUT4").

organism

character(1). Full Latin organism name (e.g. "Homo sapiens").

genomeBuild

character(1). Ensembl genome build assembly name (e.g. "GRCh38"). If NULL, defaults to the most recent build available. Note: do not pass UCSC build IDs (e.g. "hg38").

release

integer(1). Ensembl release version (e.g. 100). We recommend setting this value if possible, for improved reproducibility. When left unset, the latest release available via AnnotationHub/ensembldb is used. Note that the latest version available can vary, depending on the versions of AnnotationHub and ensembldb in use.
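
The argument notes above can be checked before calling the function. The following is a minimal base R sketch, not part of the package: `looksLikeUcscBuild` is a hypothetical helper, and its regular expression is only a heuristic assumption about UCSC-style IDs (lowercase prefix followed by digits, such as "hg38" or "mm10").

```r
## Hypothetical helper (not part of the package): heuristic check that flags
## UCSC-style build IDs, which should not be passed to `genomeBuild`.
looksLikeUcscBuild <- function(genomeBuild) {
    stopifnot(is.character(genomeBuild), length(genomeBuild) == 1L)
    ## Assumed pattern: lowercase prefix, optional camel-case part, digits.
    grepl(pattern = "^[a-z]{2,3}[A-Z]?[a-z]{0,3}[0-9]+$", x = genomeBuild)
}

looksLikeUcscBuild("hg38")   # TRUE: UCSC-style ID
looksLikeUcscBuild("GRCh38") # FALSE: Ensembl assembly name

## `release` is documented as integer(1), so use the "L" suffix.
is.integer(109L) # TRUE
is.integer(109)  # FALSE: a bare number is a double in R
```

The heuristic only catches the common UCSC naming shape. It does not confirm that a given Ensembl build name actually exists.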

Details

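Internally, gene names are first matched using mapGenesToNCBI, which enables gene synonym matching. Official gene names and their synonyms therefore map to the same Ensembl gene identifier.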
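
To illustrate the idea of synonym matching, here is a simplified base R sketch. It is not the package's implementation: `mapWithSynonyms` and both lookup vectors are hypothetical, the toy data are taken from the Homo sapiens example on this page, and the intermediate NCBI step is collapsed into a direct synonym-to-symbol lookup.

```r
## Toy data, taken from the Homo sapiens example on this page.
synonymToSymbol <- c("ZCCHC11" = "TUT4", "TENT3A" = "TUT4")
symbolToEnsembl <- c("TUT4" = "ENSG00000134744")

## Hypothetical helper illustrating the two-step idea; not the package's code.
mapWithSynonyms <- function(genes) {
    stopifnot(is.character(genes))
    isSynonym <- genes %in% names(synonymToSymbol)
    ## Step 1: resolve synonyms to official gene symbols.
    symbols <- genes
    symbols[isSynonym] <- synonymToSymbol[genes[isSynonym]]
    ## Step 2: map official symbols to Ensembl gene identifiers.
    ## Names that match nothing are returned as NA.
    unname(symbolToEnsembl[symbols])
}

mapWithSynonyms(c("TUT4", "ZCCHC11", "TENT3A"))
```

In this sketch all three input names resolve to the same identifier, which mirrors the output shown in the Examples section.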

Note

Updated 2023-04-14.

Examples

## Homo sapiens.
x <- mapGeneNamesToEnsembl(
    genes = c("TUT4", "ZCCHC11", "TENT3A"),
    organism = "Homo sapiens",
    genomeBuild = "GRCh38",
    release = 109L
)
#> → Importing HGNC complete set.
#> → Importing /Users/mike/.cache/R/AcidGenomes/BiocFileCache/886a796ea9b4_hgnc_complete_set.txt using base::`readLines()`.
#> → Importing text connection with base::`read.table()`.
print(x)
#> [1] "ENSG00000134744" "ENSG00000134744" "ENSG00000134744"

## Mus musculus.
x <- mapGeneNamesToEnsembl(
    genes = c("Nfe2l2", "Nrf2"),
    organism = "Mus musculus",
    genomeBuild = "GRCm39",
    release = 109L
)
#> → Importing /Users/mike/.cache/R/AcidGenomes/BiocFileCache/8e9949e5d369_Mus_musculus.GRCm39.109.entrez.tsv.gz using base::`read.table()`.
#> → Downloading Mus musculus gene info from NCBI at <https://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz>.
#> → Importing /Users/mike/.cache/R/AcidGenomes/BiocFileCache/8e9976c0388a_Mus_musculus.gene_info.gz using base::`read.table()`.
print(x)
#> [1] "ENSMUSG00000015839" "ENSMUSG00000015839"