Download UCSC reference genome

Usage

downloadUcscGenome(
  organism,
  genomeBuild = NULL,
  outputDir = getwd(),
  cache = FALSE
)

Arguments

organism

character(1). Full Latin organism name (e.g. "Homo sapiens").

genomeBuild

character(1). UCSC genome build assembly name (e.g. "hg38"). If NULL, defaults to the most recent build available.

outputDir

character(1). Output directory path.

cache

logical(1). Cache URLs locally, using BiocFileCache internally.

Value

Invisible list.

Note

Updated 2023-11-22.

Genome

  • <GENOME_BUILD>.chrom.sizes: Two-column tab-separated text file containing assembly sequence names and sizes.

  • <GENOME_BUILD>.chromAlias.txt: Sequence name alias file, with one line per sequence name. The first column is the sequence name, followed by tab-separated alias names.
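
As a minimal sketch of working with the two-column chrom.sizes layout described above, the file can be read with base R. The temporary file stands in for a downloaded "<GENOME_BUILD>.chrom.sizes" file, and the column names "seqname" and "size" are assumptions chosen for illustration; the file itself has no header row.

```r
## Illustrative stand-in for a downloaded chrom.sizes file, written to a
## temporary path so the sketch runs without any download.
file <- tempfile(fileext = ".chrom.sizes")
writeLines(
    text = c(
        "chr1\t248956422",
        "chr2\t242193529",
        "chrM\t16569"
    ),
    con = file
)
## Two tab-separated columns, no header row. Column names are assumed.
chromSizes <- read.delim(
    file = file,
    header = FALSE,
    col.names = c("seqname", "size"),
    colClasses = c("character", "numeric")
)
## Named vector of sequence lengths, keyed by sequence name.
seqLengths <- setNames(
    object = chromSizes[["size"]],
    nm = chromSizes[["seqname"]]
)
print(seqLengths)
```

A named vector of this shape is convenient for lookups such as `seqLengths[["chrM"]]`.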

Transcriptome

  • mrna.fa.gz: mRNA from GenBank for the requested organism (human mRNA for human builds). This sequence data is updated regularly via automatic GenBank updates.

  • refMrna.fa.gz: RefSeq mRNA from the same species as the genome. This sequence data is updated regularly via automatic GenBank updates.
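
The transcriptome files are gzip-compressed FASTA, which base R can read through a gzfile connection. The sketch below is a rough illustration only: it writes a tiny made-up FASTA file to a temporary path in place of a real "mrna.fa.gz", and the record identifiers and sequences are invented. For full-size files, a dedicated FASTA reader would likely be a better fit than readLines.

```r
## Write a tiny, made-up gzip-compressed FASTA file to a temporary path.
file <- tempfile(fileext = ".fa.gz")
con <- gzfile(file, open = "wt")
writeLines(
    text = c(
        ">XX000001 1",
        "ACGTACGT",
        "ACGT",
        ">XX000002 1",
        "GGCCTTAA"
    ),
    con = con
)
close(con)
## Read the compressed file back as plain text lines.
con <- gzfile(file, open = "rt")
lines <- readLines(con)
close(con)
## Header lines start with ">". Keep the first whitespace-delimited token
## as the sequence identifier.
isHeader <- startsWith(lines, ">")
ids <- sub(pattern = "^>(\\S+).*$", replacement = "\\1", x = lines[isHeader])
## Assign each sequence line to its record, then sum lengths per record.
recordIndex <- cumsum(isHeader)[!isHeader]
seqWidths <- tapply(
    X = nchar(lines[!isHeader]),
    INDEX = recordIndex,
    FUN = sum
)
names(seqWidths) <- ids
print(seqWidths)
```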

Gene annotations

This directory contains GTF files for the main gene transcript sets where available. They are sourced from the following gene model tables: knownGene (GENCODE) and ncbiRefSeq (NCBI RefSeq).
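
GTF is a nine-column, tab-separated format, so a downloaded file can be inspected with base R before handing it to a dedicated parser. The sketch below is illustrative only: the records are made up, the temporary file stands in for a real GTF file, and the column names follow the usual GTF field order rather than anything defined by this function.

```r
## Write two made-up GTF records to a temporary file.
file <- tempfile(fileext = ".gtf")
writeLines(
    text = c(
        'chr1\tknownGene\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
        'chr1\tknownGene\texon\t300\t450\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'
    ),
    con = file
)
## Nine tab-separated columns, no header row. Quoting is disabled because
## the attribute column contains double quotes.
gtf <- read.delim(
    file = file,
    header = FALSE,
    comment.char = "#",
    quote = "",
    col.names = c(
        "seqname", "source", "feature", "start", "end",
        "score", "strand", "frame", "attribute"
    ),
    colClasses = c(
        "character", "character", "character", "integer", "integer",
        "character", "character", "character", "character"
    )
)
## Extract the gene identifier from the attribute column.
gtf[["geneId"]] <- sub(
    pattern = '.*gene_id "([^"]+)".*',
    replacement = "\\1",
    x = gtf[["attribute"]]
)
## Exon widths use inclusive, 1-based coordinates.
gtf[["width"]] <- gtf[["end"]] - gtf[["start"]] + 1L
print(gtf[, c("seqname", "feature", "start", "end", "geneId", "width")])
```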

Examples

## This example is bandwidth intensive.
## > downloadUcscGenome(organism = "Homo sapiens")