Make a GeneToSymbol object
Usage
makeGeneToSymbolFromEnsembl(
    organism,
    genomeBuild = NULL,
    release = NULL,
    ignoreVersion = FALSE,
    format = c("makeUnique", "1:1", "unmodified")
)
makeGeneToSymbolFromEnsDb(
    object,
    ignoreVersion = FALSE,
    format = c("makeUnique", "1:1", "unmodified")
)
makeGeneToSymbolFromGff(
    file,
    ignoreVersion = FALSE,
    format = c("makeUnique", "1:1", "unmodified")
)
Arguments
- organism
character(1). Full Latin organism name (e.g. "Homo sapiens").
- genomeBuild
character(1). Ensembl genome build assembly name (e.g. "GRCh38"). If set to NULL, defaults to the most recent build available. Note: don't pass in UCSC build IDs (e.g. "hg38").
- release
integer(1). Ensembl release version (e.g. 100). We recommend setting this value if possible, for improved reproducibility. When left unset, the latest release available via AnnotationHub/ensembldb is used. Note that the latest version available can vary, depending on the versions of AnnotationHub and ensembldb in use.
- ignoreVersion
logical(1). Ignore identifier (e.g. transcript, gene) versions. When applicable, the identifiers containing version numbers will be stored in txIdVersion and geneIdVersion, and the variants without versions will be stored in txId, txIdNoVersion, geneId, and geneIdNoVersion.
- format
character(1). Formatting method to apply:
  - "makeUnique": Recommended. Apply make.unique to the geneName column. Gene names are made unique, while the identifiers remain unmodified. NA gene names will be renamed to "unannotated".
  - "1:1": For gene names that map to multiple gene identifiers, select only the first annotated gene identifier. Incomplete elements with NA gene name will be removed with an internal complete.cases call.
  - "unmodified": Return geneId and geneName columns unmodified, in long format. Incomplete elements with NA gene name will be removed with an internal complete.cases call.
- object
EnsDb object, or the name of an installed EnsDb annotation package (e.g. "EnsDb.Hsapiens.v75").
- file
character(1). Path to a GFF file. The example below passes a remote URL to a gzipped Ensembl GTF file.
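The three format options and ignoreVersion are easier to compare on a small table. The following is a minimal sketch in base R only, not the package's internal implementation: the data frame, its identifiers, and the variable names (df, mu, um, oo, noVersion) are made up for illustration, and the version-stripping regular expression is an assumption about what "ignoring versions" amounts to.

```r
## Toy input: identifiers and gene names are invented for illustration.
df <- data.frame(
    geneId = c("ENSG01.1", "ENSG02.3", "ENSG03.2", "ENSG04.1"),
    geneName = c("TP53", "TP53", NA, "BRCA1"),
    stringsAsFactors = FALSE
)

## "makeUnique": rename NA gene names, then disambiguate duplicates.
## All rows and identifiers are kept.
mu <- df
mu$geneName[is.na(mu$geneName)] <- "unannotated"
mu$geneName <- make.unique(mu$geneName)

## "unmodified": only drop rows with an NA gene name (long format).
um <- df[complete.cases(df), ]

## "1:1": drop incomplete rows, then keep the first identifier
## seen for each gene name.
oo <- df[complete.cases(df), ]
oo <- oo[!duplicated(oo$geneName), ]

## ignoreVersion = TRUE, approximated here by stripping a trailing
## ".<digits>" suffix from each identifier (assumed behavior).
noVersion <- sub("\\.[0-9]+$", "", df$geneId)

print(mu)
print(um)
print(oo)
print(noVersion)
```

In this sketch, "makeUnique" keeps all four rows and renames the second "TP53" to "TP53.1", "unmodified" keeps three rows with the duplicated name intact, and "1:1" keeps one row per gene name.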
Functions
- makeGeneToSymbolFromEnsembl(): Make a GeneToSymbol object from Ensembl using an AnnotationHub lookup.
- makeGeneToSymbolFromEnsDb(): Make a GeneToSymbol object from an EnsDb object or annotation package.
- makeGeneToSymbolFromGff(): Make a GeneToSymbol object from a GFF file.
Examples
## makeGeneToSymbolFromEnsembl ====
x <- makeGeneToSymbolFromEnsembl(
    organism = "Homo sapiens",
    ignoreVersion = FALSE
)
#> → Making <GRanges> from Ensembl.
#> → Getting <EnsDb> from AnnotationHub 3.14.0 (2024-10-28).
#> ℹ "AH119325": Ensembl 113 EnsDb for Homo sapiens.
#> → Making <GRanges> from <EnsDb>.
#> Organism: Homo sapiens
#> Genome build: GRCh38
#> Release: 113
#> Level: genes
#> → Defining names by `geneId` column in `mcols()`.
#> ℹ 4590 non-unique gene symbols detected: "5S_rRNA", "5_8S_rRNA", "7SK", "A2M", "A2MP1", "A4GALT", "AAAS", "AACSP1", "AADACL2", "AADACL2-AS1"....
print(x)
#> GeneToSymbol with 87726 rows and 2 columns
#> geneId geneName
#> <character> <character>
#> ENSG00000000003.16 ENSG00000000003.16 TSPAN6
#> ENSG00000000005.6 ENSG00000000005.6 TNMD
#> ENSG00000000419.14 ENSG00000000419.14 DPM1
#> ENSG00000000457.14 ENSG00000000457.14 SCYL3
#> ENSG00000000460.17 ENSG00000000460.17 FIRRM
#> ... ... ...
#> LRG_995.1 LRG_995.1 FUBP1.1
#> LRG_996.1 LRG_996.1 ERBB3.1
#> LRG_997.1 LRG_997.1 ROS1.1
#> LRG_998.1 LRG_998.1 CCND3.1
#> LRG_999.1 LRG_999.1 CIC.1
## makeGeneToSymbolFromEnsDb ====
## > if (goalie::isInstalled("EnsDb.Hsapiens.v75")) {
## > x <- makeGeneToSymbolFromEnsDb("EnsDb.Hsapiens.v75")
## > print(x)
## > }
## makeGeneToSymbolFromGff ====
## > file <- AcidBase::pasteUrl(
## > "ftp.ensembl.org",
## > "pub",
## > "release-102",
## > "gtf",
## > "homo_sapiens",
## > "Homo_sapiens.GRCh38.102.gtf.gz",
## > protocol = "ftp"
## > )
## > x <- makeGeneToSymbolFromGff(
## > file = file,
## > ignoreVersion = FALSE
## > )
## > print(x)