Create a DsATAC dataset from multiple input BAM files

DsATAC.bam(
  sampleAnnot,
  bamFiles,
  genome,
  regionSets = NULL,
  sampleIdCol = NULL,
  diskDump = FALSE,
  keepInsertionInfo = TRUE,
  pairedEnd = TRUE
)

Arguments

sampleAnnot

data.frame specifying the sample annotation table

bamFiles

either a character vector with one BAM file path per row of sampleAnnot, or a single character string naming the column in sampleAnnot that contains the file paths
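
The two accepted forms of bamFiles can be illustrated with plain base R. This is a minimal sketch on a made-up annotation table: the sample IDs, file names and the column name bamFilenameFull are assumptions for illustration, and the code only shows how the two forms relate to each other, not how DsATAC.bam resolves them internally.

```r
# Hypothetical annotation table; IDs and file names are illustrative only
sampleAnnot <- data.frame(
  sampleId = c("s1", "s2"),
  bamFilenameFull = file.path("tcells", "bam", c("s1.bam", "s2.bam")),
  stringsAsFactors = FALSE
)

# Form 1: a character vector with one path per row of sampleAnnot
bamVec <- sampleAnnot[, "bamFilenameFull"]
stopifnot(length(bamVec) == nrow(sampleAnnot))

# Form 2: a single string naming the column that holds the paths
bamCol <- "bamFilenameFull"
stopifnot(bamCol %in% colnames(sampleAnnot))

# Both forms describe the same set of files
sameFiles <- identical(bamVec, sampleAnnot[, bamCol])
```

Passing the column name keeps the paths and the annotation in one table, which avoids the vector and the rows getting out of sync.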

genome

genome assembly, e.g. "hg38"

regionSets

a list of GRanges objects which contain region sets over which count data will be aggregated
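
A region set list could be assembled as follows. This is a sketch that assumes the Bioconductor package GenomicRanges is installed; the coordinates are invented, and naming the list elements is an assumption made here for readability rather than a documented requirement.

```r
library(GenomicRanges)  # Bioconductor; attaches IRanges as a dependency

# Made-up coordinates, for illustration only
promoterRegions <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(1000, 5000, 2000), width = 500)
)
enhancerRegions <- GRanges(
  seqnames = c("chr1", "chr3"),
  ranges = IRanges(start = c(20000, 7000), width = 1000)
)

# A list of GRanges objects, one element per region set
# (element names are an assumption, chosen here to label each set)
regionSets <- list(promoters = promoterRegions, enhancers = enhancerRegions)
```

A list like this would be passed as the regionSets argument in place of the default NULL.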

sampleIdCol

column name in the sample annotation table containing unique sample identifiers. If NULL (default), the function will look for a column that contains the word "sample"
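
When sampleIdCol is left as NULL, more than one column may match the word "sample". The base R sketch below shows one way to inspect an annotation table beforehand. The matching rule used here (case-insensitive substring match on column names) is an assumption for illustration and may differ from what the package does; the table itself is hypothetical.

```r
# Hypothetical annotation table with two columns whose names contain "sample"
sampleAnnot <- data.frame(
  sampleId = c("s1", "s2", "s3"),
  sampleGroup = c("ctrl", "ctrl", "stim"),
  bamFilename = c("s1.bam", "s2.bam", "s3.bam"),
  stringsAsFactors = FALSE
)

# Assumed rule: column names containing "sample", ignoring case
candidates <- grep("sample", colnames(sampleAnnot), ignore.case = TRUE, value = TRUE)

# Identifiers must be unique, so keep only candidates without duplicated values
isUnique <- vapply(
  candidates,
  function(cn) anyDuplicated(sampleAnnot[[cn]]) == 0,
  logical(1)
)
idCols <- candidates[isUnique]
```

If several columns match, setting sampleIdCol explicitly (as in the example at the end of this page) removes the ambiguity.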

diskDump

flag indicating whether large data objects (count matrices, fragment data, ...) should be disk-backed to save main memory

keepInsertionInfo

flag indicating whether to maintain the insertion information in the resulting object. Only relevant when type=="insBam".

pairedEnd

flag indicating whether the input data is paired-end. Only relevant when type=="insBam".

Value

DsATAC object

Author

Fabian Mueller

Examples

if (FALSE) {
# download and unzip the dataset
datasetUrl <- "https://s3.amazonaws.com/muellerf/data/ChrAccR/data/tutorial/tcells.zip"
downFn <- "tcells.zip"
download.file(datasetUrl, downFn)
unzip(downFn, exdir=".")
# prepare the sample annotation table
sampleAnnotFn <- file.path("tcells", "samples.tsv")
bamDir <- file.path("tcells", "bam")
sampleAnnot <- read.table(sampleAnnotFn, sep="\t", header=TRUE, stringsAsFactors=FALSE)
# add a column that ChrAccR can use to find the correct bam file for each sample
sampleAnnot[,"bamFilenameFull"] <- file.path(bamDir, sampleAnnot[,"bamFilename"])
# prepare the dataset
dsa_fromBam <- DsATAC.bam(sampleAnnot, "bamFilenameFull", "hg38", regionSets=NULL, sampleIdCol="sampleId")
}
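
Before building the dataset, it can help to confirm that every path in the annotation table points to an existing file. The following base R sketch is not part of the package: it creates an empty placeholder file in a temporary directory (not a real BAM file) and uses hypothetical sample names so that it runs without the tutorial data.

```r
# Temporary directory standing in for the "bam" folder of the tutorial data
bamDir <- file.path(tempdir(), "bam")
dir.create(bamDir, showWarnings = FALSE)
unlink(file.path(bamDir, "s2.bam"))       # make sure s2.bam is absent
file.create(file.path(bamDir, "s1.bam"))  # empty placeholder, not a real BAM file

# Hypothetical annotation table following the column layout used above
sampleAnnot <- data.frame(
  sampleId = c("s1", "s2"),
  bamFilename = c("s1.bam", "s2.bam"),
  stringsAsFactors = FALSE
)
sampleAnnot[, "bamFilenameFull"] <- file.path(bamDir, sampleAnnot[, "bamFilename"])

# Flag samples whose BAM file cannot be found
found <- file.exists(sampleAnnot[, "bamFilenameFull"])
missingIds <- sampleAnnot[!found, "sampleId"]
if (length(missingIds) > 0) {
  message("No BAM file found for: ", paste(missingIds, collapse = ", "))
}
```

With the tutorial data, the same check can be run on the real sampleAnnot table right before the call to DsATAC.bam.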