Coercion methods

as.DataFrame(x, ...)

as.data.frame(x, row.names = NULL, optional = FALSE, ...)

as.data.table(x, keep.rownames = FALSE, ...)

# S3 method for DataFrame
as_tibble(
  x,
  ...,
  rownames = pkgconfig::get_config("tibble::rownames", "rowname")
)

# S3 method for IRanges
as_tibble(
  x,
  ...,
  rownames = pkgconfig::get_config("tibble::rownames", "rowname")
)

# S3 method for GenomicRanges
as_tibble(
  x,
  ...,
  rownames = pkgconfig::get_config("tibble::rownames", "rowname")
)

# S3 method for DataFrame
as.data.table(x, keep.rownames = TRUE, ...)

# S3 method for IRanges
as.data.table(x, keep.rownames = TRUE, ...)

# S3 method for GenomicRanges
as.data.table(x, keep.rownames = TRUE, ...)

# S4 method for SimpleList
as.DataFrame(x, row.names = NULL)

# S4 method for list
as.DataFrame(x, row.names = NULL)

# S4 method for IRanges
as.data.frame(x, row.names = NULL, optional = FALSE, ...)

# S4 method for Matrix
as.data.frame(x, row.names = NULL, optional = FALSE, ...)

Arguments

x

any R object.

...

Additional arguments.

rownames

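character(1) or NULL. Name of the column in which to store row names. Defaults to "rowname"; NULL drops row names, following the tibble convention.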

keep.rownames

If TRUE, adds the input object's names as a separate column named "rn". keep.rownames = "id" names the column "id" instead. The data.table default is FALSE; the methods defined here default to TRUE.

row.names

NULL or character.

optional

logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. Note that all of R's base package as.data.frame() methods use optional only for column names treatment, basically with the meaning of data.frame(*, check.names = !optional). See also the make.names argument of the matrix method.
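
As a minimal sketch of what optional controls, the base R list method can be called with non-syntactic names. The input list and its names are made up for illustration; only base R is used.

```r
## A list whose element names are not syntactic R names (illustrative data).
x <- list("sample 01" = 1:2, "sample-02" = 3:4)

## optional = FALSE (the default): names are passed through make.names().
strict <- as.data.frame(x, optional = FALSE)

## optional = TRUE: column names are kept as-is.
loose <- as.data.frame(x, optional = TRUE)

print(colnames(strict))
print(colnames(loose))
```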

Value

Modified object, of desired conversion class.

Details

These conversion methods are primarily intended to interconvert between popular tabular formats in R, including data.frame, data.table, tbl_df, and the Bioconductor DataFrame classes.

Note

Updated 2020-02-05.

DataFrame (Bioconductor) coercion

Don't define an as() coercion method for list here. Because data.frame inherits from list, such a method causes issues with data.frame to DataFrame coercion. Use as.DataFrame() instead to coerce a list to DataFrame.

Wrapping the columns in I() works when passing to DataFrame(). See also as_tibble() for easy list to data frame coercion.
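
The inheritance relationship behind this caveat can be checked with base R and the methods package alone. This is only a sketch of why a list method would also apply to data.frame input; it does not reproduce the package's own method definitions.

```r
library(methods)

## Illustrative data frame.
df <- data.frame(a = 1:2, b = c("x", "y"))

## A data.frame is a list under the hood...
print(is.list(df))

## ...and the S4 system also treats data.frame as extending list, so an S4
## method registered for list input is a candidate for data.frame input too.
print(is(df, "list"))
print(extends("data.frame", "list"))
```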

data.frame coercion

From IRanges

Default coercion of IPosRanges (i.e. IRanges) to data.frame currently strips metadata in mcols(). However, GenomicRanges preserves this information, so we're adding a tweaked coercion method here to improve consistency.

Relevant methods:

getMethod(
    f = "as.data.frame",
    signature = "GenomicRanges",
    where = asNamespace("GenomicRanges")
)
## IRanges inherits from `IPosRanges`.
getMethod(
    f = "as.data.frame",
    signature = "IPosRanges",
    where = asNamespace("IRanges")
)

See also:

  • https://github.com/Bioconductor/IRanges/issues/8

data.table coercion

The data.table package drops row names by default (keep.rownames = FALSE), which is a poor default for bioinformatics. Our methods keep them by default instead. As declared in the usage above, the default is keep.rownames = TRUE, which stores row names in a column named "rn"; pass keep.rownames = "rowname" to match the column name used by our as_tibble() methods.
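
The behavior of keep.rownames can be sketched with a plain data.frame and the data.table package alone. The example data below is made up, and the methods defined in this package are not involved.

```r
library(data.table)

## Illustrative data frame with meaningful row names.
df <- data.frame(
    sample01 = c(1L, 5L),
    sample02 = c(2L, 6L),
    row.names = c("gene01", "gene02")
)

## data.table default (keep.rownames = FALSE): row names are dropped.
dropped <- as.data.table(df)

## keep.rownames = TRUE: row names are stored in a column named "rn".
kept <- as.data.table(df, keep.rownames = TRUE)

## A string names the row names column explicitly.
named <- as.data.table(df, keep.rownames = "rowname")

print(colnames(dropped))
print(colnames(kept))
print(colnames(named))
```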

S3 methods: as.data.table()

The package extends as.data.table() method support for these S4 classes:

  • DataFrame.

  • GenomicRanges.

  • IRanges.

S4 methods: as()

Since data.table is a class that extends data.frame, we need to define an S4 coercion method that allows us to use as() to coerce an object to a data.table.

tibble (tbl_df) coercion

The tibble package drops row names by default (rownames = NULL), which is a poor default for bioinformatics. Our methods set rownames = "rowname" by default instead, so that row names are kept as a column.

Note that we're matching the as_tibble() convention here, using rowname as the column for row names. We also use similar internal assert checks, allowing atomic and/or list columns only.
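
The effect of the rownames argument can be sketched with a plain data.frame and the tibble package alone. The example data is made up, and the methods defined in this package are not involved.

```r
library(tibble)

## Illustrative data frame with meaningful row names.
df <- data.frame(
    sample01 = c(1L, 5L),
    sample02 = c(2L, 6L),
    row.names = c("gene01", "gene02")
)

## rownames = NULL: row names are dropped.
dropped <- as_tibble(df, rownames = NULL)

## rownames = "rowname": row names are moved into a "rowname" column.
kept <- as_tibble(df, rownames = "rowname")

print(colnames(dropped))
print(colnames(kept))
```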

S3 methods: as_tibble()

The package extends as_tibble() method support for these S4 classes:

  • DataFrame.

  • GenomicRanges.

  • IRanges.

S4 methods: as()

Since tbl_df is a virtual class that extends tbl and data.frame, we need to define an S4 coercion method that allows us to use as() to coerce an object to a tibble.
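
The general pattern for making as() work with an S3 class that extends data.frame can be sketched as follows. The class names toy_table and ToyRanges are hypothetical and exist only for this illustration; this is not the package's actual method definition.

```r
library(methods)

## Hypothetical S3 table class that extends data.frame, registered so that
## the S4 system can see it.
setOldClass(c("toy_table", "data.frame"))

## Hypothetical S4 class to coerce from.
setClass("ToyRanges", slots = c(start = "integer", width = "integer"))

## Define the coercion so that as(x, "toy_table") works.
setAs(
    from = "ToyRanges",
    to = "toy_table",
    def = function(from) {
        df <- data.frame(start = from@start, width = from@width)
        class(df) <- c("toy_table", "data.frame")
        df
    }
)

x <- new("ToyRanges", start = c(1L, 10L), width = c(5L, 5L))
y <- as(x, "toy_table")
print(class(y))
```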

Examples

data(
    DFrame,
    GRanges,
    IRanges,
    data.table,
    sparseMatrix,
    tbl_df,
    package = "AcidTest"
)

## `DataFrame` to `data.table` ====
x <- as(DFrame, "data.table")
x <- as.data.table(DFrame)
print(x)
#>        rn sample01 sample02 sample03 sample04
#> 1: gene01        1        2        3        4
#> 2: gene02        5        6        7        8
#> 3: gene03        9       10       11       12
#> 4: gene04       13       14       15       16

## `DataFrame` to `tbl_df` ====
x <- as(DFrame, "tbl_df")
x <- as_tibble(DFrame)
print(x)
#> # A tibble: 4 x 5
#>   rowname sample01 sample02 sample03 sample04
#>   <chr>      <int>    <int>    <int>    <int>
#> 1 gene01         1        2        3        4
#> 2 gene02         5        6        7        8
#> 3 gene03         9       10       11       12
#> 4 gene04        13       14       15       16

## `GenomicRanges` to `data.table` ====
x <- as(GRanges, "data.table")
x <- as.data.table(GRanges)
print(x)
#>                 rn seqnames     start       end  width strand          geneId
#> 1: ENSG00000000003        X 100627109 100639991  12883      - ENSG00000000003
#> 2: ENSG00000000005        X 100584802 100599885  15084      + ENSG00000000005
#> 3: ENSG00000000419       20  50934867  50958555  23689      - ENSG00000000419
#> 4: ENSG00000000457        1 169849631 169894267  44637      - ENSG00000000457
#> 5: ENSG00000000460        1 169662007 169854080 192074      + ENSG00000000460
#>    geneName
#> 1:   TSPAN6
#> 2:     TNMD
#> 3:     DPM1
#> 4:    SCYL3
#> 5: C1orf112

## `GenomicRanges` to `tbl_df` ====
x <- as(GRanges, "tbl_df")
x <- as_tibble(GRanges)
print(x)
#> # A tibble: 5 x 8
#>   rowname        seqnames  start    end  width strand geneId        geneName
#>   <chr>          <fct>     <int>  <int>  <int> <fct>  <chr>         <chr>
#> 1 ENSG000000000… X        1.01e8 1.01e8  12883 -      ENSG00000000… TSPAN6
#> 2 ENSG000000000… X        1.01e8 1.01e8  15084 +      ENSG00000000… TNMD
#> 3 ENSG000000004… 20       5.09e7 5.10e7  23689 -      ENSG00000000… DPM1
#> 4 ENSG000000004… 1        1.70e8 1.70e8  44637 -      ENSG00000000… SCYL3
#> 5 ENSG000000004… 1        1.70e8 1.70e8 192074 +      ENSG00000000… C1orf112

## `IRanges` to `data.table` ====
x <- as(IRanges, "data.table")
x <- as.data.table(IRanges)
print(x)
#>    rn start end width score
#> 1:  a     1   5     5     1
#> 2:  b    10  14     5     2
#> 3:  c    20  24     5     3

## `IRanges` to `tbl_df` ====
x <- as(IRanges, "tbl_df")
x <- as_tibble(IRanges)
print(x)
#> # A tibble: 3 x 5
#>   rowname start   end width score
#>   <chr>   <int> <int> <int> <int>
#> 1 a           1     5     5     1
#> 2 b          10    14     5     2
#> 3 c          20    24     5     3

## `Matrix` to `DataFrame` ====
from <- sparseMatrix
to <- as(from, "DataFrame")
to
#> DataFrame with 8 rows and 10 columns
#>         sample01  sample02  sample03  sample04  sample05  sample06  sample07
#>        <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
#> gene01         0         7         0         0         0         0         0
#> gene02         0         0         0         0         0         0         0
#> gene03         0         0         0         0         0         0         0
#> gene04         0         0         0         0         0        21         0
#> gene05         0         0         0         0         0         0        28
#> gene06         0         0         0         0         0         0         0
#> gene07         0         0         0         0         0         0         0
#> gene08         0         0         0         0         0         0         0
#>         sample08  sample09  sample10
#>        <numeric> <numeric> <numeric>
#> gene01         0         0         0
#> gene02         0         0         0
#> gene03         0        14         0
#> gene04         0         0         0
#> gene05         0         0         0
#> gene06        35         0         0
#> gene07         0        42         0
#> gene08         0         0        49

## `Matrix` to `data.frame` ====
x <- as(sparseMatrix, "data.frame")
head(x)
#>        sample01 sample02 sample03 sample04 sample05 sample06 sample07 sample08
#> gene01        0        7        0        0        0        0        0        0
#> gene02        0        0        0        0        0        0        0        0
#> gene03        0        0        0        0        0        0        0        0
#> gene04        0        0        0        0        0       21        0        0
#> gene05        0        0        0        0        0        0       28        0
#> gene06        0        0        0        0        0        0        0       35
#>        sample09 sample10
#> gene01        0        0
#> gene02        0        0
#> gene03       14        0
#> gene04        0        0
#> gene05        0        0
#> gene06        0        0

## `data.table` to `DataFrame` ====
from <- data.table
to <- as(from, "DataFrame")
head(to)
#> DataFrame with 4 rows and 4 columns
#>         sample01  sample02  sample03  sample04
#>        <integer> <integer> <integer> <integer>
#> gene01         1         2         3         4
#> gene02         5         6         7         8
#> gene03         9        10        11        12
#> gene04        13        14        15        16

## `list` to `DataFrame` ====
## Use `as.DataFrame()` instead of `as()` for `list` class.
from <- list(
    a = list(c(1, 2), c(3, 4)),
    b = list(NULL, NULL)
)
to <- as.DataFrame(from)
to
#> DataFrame with 2 rows and 2 columns
#>        a      b
#>   <list> <list>
#> 1    1,2
#> 2    3,4

## `tbl_df` to `DataFrame` ====
from <- tbl_df
to <- as(from, "DataFrame")
head(to)
#> DataFrame with 4 rows and 4 columns
#>         sample01  sample02  sample03  sample04
#>        <integer> <integer> <integer> <integer>
#> gene01         1         2         3         4
#> gene02         5         6         7         8
#> gene03         9        10        11        12
#> gene04        13        14        15        16