Drop any columns defined in denylist, and ensure character columns containing any duplicate values are automatically coerced to factor.
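The behavior described above can be approximated in base R. This is only a minimal sketch on a plain data.frame, not the package's implementation: the helper name `sketchSanitize` is hypothetical, and the default denylist of `"interestingGroups"` is an assumption inferred from the example output below, where that column is dropped.

```r
## Hypothetical helper for illustration; not part of the package.
sketchSanitize <- function(df, denylist = "interestingGroups") {
    ## Drop denylisted columns (the denylist contents are an assumption).
    df <- df[, setdiff(colnames(df), denylist), drop = FALSE]
    ## Coerce character columns that contain duplicate values to factor.
    for (col in colnames(df)) {
        x <- df[[col]]
        if (is.character(x) && anyDuplicated(x) > 0L) {
            df[[col]] <- as.factor(x)
        }
    }
    df
}

df <- data.frame(
    condition = rep(c("A", "B"), each = 2L),
    sampleName = paste0("sample0", 1:4),
    interestingGroups = rep(c("A", "B"), each = 2L),
    stringsAsFactors = FALSE
)
out <- sketchSanitize(df)
```

In this sketch, `condition` becomes a factor because its values repeat, while `sampleName` stays character because every value is unique.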
Arguments
- object
  DFrame (recommended) or data.frame (legacy). Note that legacy data.frame support will be removed in a future update.
Examples
data(RangedSummarizedExperiment, package = "AcidTest")
rse <- RangedSummarizedExperiment
## SummarizedExperiment ====
from <- sampleData(rse)
print(from)
#> DataFrame with 12 rows and 3 columns
#> condition sampleName interestingGroups
#> <factor> <factor> <factor>
#> sample01 A sample01 A
#> sample02 A sample02 A
#> sample03 A sample03 A
#> sample04 A sample04 A
#> sample05 A sample05 A
#> ... ... ... ...
#> sample08 B sample08 B
#> sample09 B sample09 B
#> sample10 B sample10 B
#> sample11 B sample11 B
#> sample12 B sample12 B
to <- sanitizeSampleData(from)
all(vapply(to, is.factor, logical(1L)))
#> [1] TRUE
print(to)
#> DataFrame with 12 rows and 2 columns
#> condition sampleName
#> <factor> <factor>
#> sample01 A sample01
#> sample02 A sample02
#> sample03 A sample03
#> sample04 A sample04
#> sample05 A sample05
#> ... ... ...
#> sample08 B sample08
#> sample09 B sample09
#> sample10 B sample10
#> sample11 B sample11
#> sample12 B sample12