This function merges cells within each designated cell group to generate pseudo-bulk replicates and then merges these replicates into a single insertion coverage file.

addGroupCoverages(
  ArchRProj = NULL,
  groupBy = "Clusters",
  useLabels = TRUE,
  minCells = 40,
  maxCells = 500,
  maxFragments = 25 * 10^6,
  minReplicates = 2,
  maxReplicates = 5,
  sampleRatio = 0.8,
  kmerLength = 6,
  threads = getArchRThreads(),
  returnGroups = FALSE,
  parallelParam = NULL,
  force = FALSE,
  verbose = TRUE,
  logFile = createLogFile("addGroupCoverages")
)
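
The following is a minimal sketch of how the function might be called, not a definitive workflow. It assumes that ArchR is installed and that `proj` is an existing ArchRProject whose cellColData contains the column named by `groupBy`. The wrapper `previewThenAddCoverages()` is a hypothetical helper introduced only for illustration and is not part of ArchR.

```r
# Hypothetical helper (not part of ArchR): first inspect the cell groupings,
# then create the pseudo-bulk coverages.
# Assumption: `proj` is an ArchRProject whose cellColData has a `groupBy` column.
previewThenAddCoverages <- function(proj, groupBy = "Clusters") {
  # With returnGroups = TRUE the groupings are returned and no coverages are created.
  groups <- ArchR::addGroupCoverages(
    ArchRProj = proj,
    groupBy = groupBy,
    returnGroups = TRUE
  )

  # With the default returnGroups = FALSE the coverages are created.
  # force = TRUE overwrites any existing coverage / pseudo-bulk replicate information.
  proj <- ArchR::addGroupCoverages(
    ArchRProj = proj,
    groupBy = groupBy,
    force = TRUE
  )

  # Return both results so the caller can compare the groupings with the project.
  list(groups = groups, project = proj)
}
```

Calling the function with `returnGroups = TRUE` first can be a cheap way to check how cells would be grouped before committing to the full coverage generation.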

Arguments

ArchRProj

An ArchRProject object.

groupBy

The name of the column in cellColData to use for grouping multiple cells together prior to generation of the insertion coverage file.

useLabels

A boolean value indicating whether to use sample labels to create sample-aware subgroupings during pseudo-bulk replicate generation.

minCells

The minimum number of cells required in a given cell group to permit insertion coverage file generation.

maxCells

The maximum number of cells to use during insertion coverage file generation.

maxFragments

The maximum number of fragments per cell group to use in insertion coverage file generation. This prevents the generation of excessively large files that would increase memory requirements.

minReplicates

The minimum number of pseudo-bulk replicates to be generated.

maxReplicates

The maximum number of pseudo-bulk replicates to be generated.

sampleRatio

The fraction of the total cells that can be sampled to generate any given pseudo-bulk replicate.

kmerLength

The length of the k-mer used for estimating Tn5 bias.

threads

The number of threads to be used for parallel computing.

returnGroups

A boolean value that indicates whether to return sample-guided cell groupings without creating coverages. This is used mainly by addReproduciblePeakSet() when peaks are called from a TileMatrix (peakMethod = "Tiles") rather than with MACS2.

parallelParam

A list of parameters to be passed for BiocParallel/batchtools parallel computing.

force

A boolean value that indicates whether or not to overwrite the relevant data in the ArchRProject object if insertion coverage / pseudo-bulk replicate information already exists.

verbose

A boolean value that determines whether standard output includes verbose sections.

logFile

The path to a file to be used for logging ArchR output.
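
To make the interaction of minCells, maxCells and sampleRatio more concrete, the base R sketch below works through the arithmetic for a few group sizes. It is a simplified reading of the argument descriptions above and not ArchR's internal implementation: it assumes that maxCells acts as a per-replicate cap and it ignores sample labels, fragment limits and replicate counts. The function `replicateCellCap()` is a hypothetical helper introduced for illustration.

```r
# Simplified illustration of the documented parameters; NOT ArchR's internal code.
# `replicateCellCap` is a hypothetical helper.
replicateCellCap <- function(nCells, minCells = 40, maxCells = 500, sampleRatio = 0.8) {
  # A group needs at least minCells cells to permit coverage generation.
  eligible <- nCells >= minCells

  # At most a fraction sampleRatio of the group's cells can go into one replicate,
  # and (by assumption here) never more than maxCells.
  cap <- pmin(maxCells, floor(sampleRatio * nCells))

  data.frame(
    nCells = nCells,
    eligible = eligible,
    cellsPerReplicate = ifelse(eligible, cap, NA)
  )
}

# Three example groups: one too small, one limited by sampleRatio, one limited by maxCells.
caps <- replicateCellCap(c(30, 100, 1000))
print(caps)
```

Under these assumptions and the default settings, a group of 100 cells is limited by sampleRatio, while a group of 1000 cells is limited by maxCells.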