DsATAC.bam.Rd
Create a DsATAC dataset from multiple input bam files
DsATAC.bam(
sampleAnnot,
bamFiles,
genome,
regionSets = NULL,
sampleIdCol = NULL,
diskDump = FALSE,
keepInsertionInfo = TRUE,
pairedEnd = TRUE
)
sampleAnnot: data.frame specifying the sample annotation table
bamFiles: either a character vector with one entry per row of sampleAnnot, specifying the BAM file path for each sample, or a single character string naming the column in sampleAnnot that contains the file paths
genome: genome assembly (e.g. "hg38")
regionSets: a list of GRanges objects containing the region sets over which count data will be aggregated
sampleIdCol: column name in the sample annotation table containing unique sample identifiers. If NULL (default), the function will look for a column that contains the word "sample"
diskDump: should large data objects (count matrices, fragment data, ...) be disk-backed to save main memory?
keepInsertionInfo: flag indicating whether to maintain the insertion information in the resulting object. Only relevant when type=="insBam".
pairedEnd: is the input data paired-end? Only relevant when type=="insBam".
DsATAC object
if (FALSE) {
# download and unzip the dataset
datasetUrl <- "https://s3.amazonaws.com/muellerf/data/ChrAccR/data/tutorial/tcells.zip"
downFn <- "tcells.zip"
download.file(datasetUrl, downFn)
unzip(downFn, exdir=".")
# prepare the sample annotation table
sampleAnnotFn <- file.path("tcells", "samples.tsv")
bamDir <- file.path("tcells", "bam")
sampleAnnot <- read.table(sampleAnnotFn, sep="\t", header=TRUE, stringsAsFactors=FALSE)
# add a column that ChrAccR can use to find the correct bam file for each sample
sampleAnnot[,"bamFilenameFull"] <- file.path(bamDir, sampleAnnot[,"bamFilename"])
# prepare the dataset
dsa_fromBam <- DsATAC.bam(sampleAnnot, "bamFilenameFull", "hg38", regionSets=NULL, sampleIdCol="sampleId")
}
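The example above passes bamFiles as a column name. As described for the bamFiles argument, a character vector of paths (one per row of the annotation table) is also accepted. The following is a minimal sketch of that variant, not a definitive recipe: the sample IDs and file names are hypothetical placeholders, the two stopifnot() checks are an optional precaution added for illustration, and the call itself is guarded so that it only runs if ChrAccR is installed and the BAM files actually exist.

```r
# Hypothetical annotation table; column names mirror the example above,
# but the sample IDs and file names are placeholders, not the real dataset.
sampleAnnot <- data.frame(
  sampleId = c("s1", "s2"),
  bamFilename = c("s1.bam", "s2.bam"),
  stringsAsFactors = FALSE
)
bamDir <- file.path("tcells", "bam")

# Build one path per row of sampleAnnot, in the same row order
bamFiles <- file.path(bamDir, sampleAnnot[, "bamFilename"])

# Optional sanity checks (an assumption of this sketch, not part of the API):
# the vector must match the table row-for-row, and sample IDs must be unique
stopifnot(length(bamFiles) == nrow(sampleAnnot))
stopifnot(anyDuplicated(sampleAnnot[, "sampleId"]) == 0)

# Only attempt the import when ChrAccR is available and all files are present
if (requireNamespace("ChrAccR", quietly = TRUE) && all(file.exists(bamFiles))) {
  dsa_fromVec <- ChrAccR::DsATAC.bam(
    sampleAnnot, bamFiles, "hg38",
    regionSets = NULL, sampleIdCol = "sampleId"
  )
}
```

Passing the vector directly avoids adding a helper column to the annotation table. The column-name form keeps the paths stored alongside the sample metadata.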