DsATAC.bam • ChrAccR

Create a DsATAC dataset from multiple input bam files

DsATAC.bam(
  sampleAnnot,
  bamFiles,
  genome,
  regionSets = NULL,
  sampleIdCol = NULL,
  diskDump = FALSE,
  keepInsertionInfo = TRUE,
  pairedEnd = TRUE
)

Arguments

sampleAnnot: data.frame specifying the sample annotation table
bamFiles: either a character vector with one element per row of sampleAnnot, giving the file path of the bam file for each sample, or a single character string naming the column in sampleAnnot that contains the file paths
genome: genome assembly (e.g. "hg38")
regionSets: a list of GRanges objects which contain region sets over which count data will be aggregated
sampleIdCol: column name in the sample annotation table containing unique sample identifiers. If NULL (default), the function will look for a column that contains the word "sample"
diskDump: flag indicating whether large data objects (count matrices, fragment data, ...) should be disk-backed to save main memory
keepInsertionInfo: flag indicating whether to maintain the insertion information in the resulting object. Only relevant when type=="insBam".
pairedEnd: is the input data paired-end? Only relevant when type=="insBam".
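
The two accepted forms of bamFiles can be sketched as follows. This is a minimal illustration, not taken from the package: the sample ids and file names are hypothetical placeholders, and the DsATAC.bam calls are wrapped in if (FALSE) because they need ChrAccR and real bam files on disk.

```r
# Hypothetical sample annotation table; ids and file names are placeholders
sampleAnnot <- data.frame(
  sampleId    = c("s1", "s2"),
  bamFilename = c("s1.bam", "s2.bam"),
  stringsAsFactors = FALSE
)
bamDir <- file.path("tcells", "bam")

# Form 1: a character vector with one path per row of sampleAnnot
bamPaths <- file.path(bamDir, sampleAnnot[, "bamFilename"])
stopifnot(length(bamPaths) == nrow(sampleAnnot))

# Form 2: store the paths in a column and pass the column name instead
sampleAnnot[, "bamFilenameFull"] <- bamPaths

if (FALSE) {
# not run: requires ChrAccR and the bam files themselves
dsa1 <- DsATAC.bam(sampleAnnot, bamPaths, "hg38", sampleIdCol = "sampleId")
dsa2 <- DsATAC.bam(sampleAnnot, "bamFilenameFull", "hg38", sampleIdCol = "sampleId")
}
```

Keeping the paths in the annotation table (form 2) ties each file to its sample row, so the two stay aligned if the table is later reordered or subset.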

Value

DsATAC object

Author

Fabian Mueller

Examples

if (FALSE) {
# download and unzip the dataset
datasetUrl <- "https://s3.amazonaws.com/muellerf/data/ChrAccR/data/tutorial/tcells.zip"
downFn <- "tcells.zip"
download.file(datasetUrl, downFn)
unzip(downFn, exdir=".")
# prepare the sample annotation table
sampleAnnotFn <- file.path("tcells", "samples.tsv")
bamDir <- file.path("tcells", "bam")
sampleAnnot <- read.table(sampleAnnotFn, sep="\t", header=TRUE, stringsAsFactors=FALSE)
# add a column that ChrAccR can use to find the correct bam file for each sample
sampleAnnot[,"bamFilenameFull"] <- file.path(bamDir, sampleAnnot[,"bamFilename"])
# prepare the dataset
dsa_fromBam <- DsATAC.bam(
  sampleAnnot,
  "bamFilenameFull",
  "hg38",
  regionSets=NULL,
  sampleIdCol="sampleId"
)
}
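
The example above passes regionSets=NULL. A sketch of supplying custom region sets instead might look like the following. The coordinates and the list name exampleRegions are made up for illustration, and it is an assumption here that the list should be named so that each region set can be identified in the resulting object. The GRanges construction is guarded so the snippet still runs when GenomicRanges is not installed.

```r
# Build a small, made-up region set (coordinates are placeholders)
regionSets <- NULL
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  gr <- GenomicRanges::GRanges(
    seqnames = c("chr1", "chr2"),
    ranges   = IRanges::IRanges(start = c(10001, 20001), end = c(10500, 20500))
  )
  # a list of GRanges objects, as the regionSets argument expects;
  # the name "exampleRegions" is hypothetical
  regionSets <- list(exampleRegions = gr)
}

if (FALSE) {
# not run: requires ChrAccR, the tutorial bam files and the sampleAnnot
# table prepared in the example above
dsa_regions <- DsATAC.bam(
  sampleAnnot, "bamFilenameFull", "hg38",
  regionSets = regionSets, sampleIdCol = "sampleId"
)
}
```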