This function identifies, where possible, the features that define each provided cell grouping.

getMarkerFeatures(
  ArchRProj = NULL,
  groupBy = "Clusters",
  useGroups = NULL,
  bgdGroups = NULL,
  useMatrix = "GeneScoreMatrix",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  normBy = NULL,
  testMethod = "wilcoxon",
  maxCells = 500,
  scaleTo = 10^4,
  threads = getArchRThreads(),
  k = 100,
  bufferRatio = 0.8,
  binarize = FALSE,
  useSeqnames = NULL,
  verbose = TRUE,
  logFile = createLogFile("getMarkerFeatures")
)

Arguments

ArchRProj

An ArchRProject object.

groupBy

The name of the column in cellColData to use for grouping cells together for marker feature identification.

useGroups

A character vector that is used to select a subset of groups by name from the designated groupBy column in cellColData. This limits the groups used to perform marker feature identification.

bgdGroups

A character vector that is used to select a subset of groups by name from the designated groupBy column in cellColData to be used for background calculations in marker feature identification.

useMatrix

The name of the matrix to be used for performing differential analyses. Options include "GeneScoreMatrix", "PeakMatrix", etc.

bias

A character vector indicating the potential bias variables (e.g. c("TSSEnrichment", "log10(nFrags)")) to account for when selecting a matched null group for marker feature identification. These should be column names from cellColData.

normBy

The name of a numeric column in cellColData (e.g. "ReadsInTSS") by which values should be normalized across cells prior to performing marker feature identification.

testMethod

The name of the pairwise test method to use in comparing cell groupings to the null cell grouping during marker feature identification. Valid options include "wilcoxon", "ttest", and "binomial".
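
To make the difference between the rank-based and mean-based options concrete, the following is a minimal base R sketch, not ArchR's internal implementation. It compares one hypothetical feature between a group of cells and its null group using `stats::wilcox.test()` and `stats::t.test()`. The values and the variable names `foreground` and `background` are invented for illustration.

```r
# Toy values for a single feature (hypothetical numbers, chosen without ties).
foreground <- 11:20  # cells in the group being tested
background <- 1:10   # cells in the matched null group

# Rank-based comparison, analogous in spirit to testMethod = "wilcoxon".
wilcoxResult <- wilcox.test(foreground, background)

# Mean-based comparison, analogous in spirit to testMethod = "ttest".
tResult <- t.test(foreground, background)

print(wilcoxResult$p.value)
print(tResult$p.value)
```

Because every foreground value exceeds every background value, both tests report a small p-value here. On real data the two can disagree, since the rank-based test is less sensitive to outliers.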

maxCells

The maximum number of cells to consider from any single group of cells when performing marker feature identification.
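
The effect of such a cap can be sketched in base R. This example assumes simple random subsampling within each group, which is an assumption made for illustration rather than a statement about how ArchR selects cells. The group labels and cell names are hypothetical.

```r
set.seed(1)

# Hypothetical grouping: 1200 cells in "C1" and 300 cells in "C2".
groups <- c(rep("C1", 1200), rep("C2", 300))
cellNames <- paste0("cell_", seq_along(groups))
maxCells <- 500

# Cap each group at maxCells; groups already below the cap are kept whole.
cellsByGroup <- split(cellNames, groups)
sampled <- lapply(cellsByGroup, function(cells) {
  if (length(cells) > maxCells) sample(cells, maxCells) else cells
})

print(lengths(sampled))
```

Only the larger group is reduced. The smaller group keeps all of its cells.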

scaleTo

Each column in the matrix designated by useMatrix will be normalized to a column sum designated by scaleTo.
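
As an illustration of this kind of column normalization, the base R sketch below rescales a small hypothetical feature-by-cell matrix so that every column sums to scaleTo. It shows the arithmetic only and is not taken from ArchR's source.

```r
# Hypothetical feature-by-cell matrix (rows are features, columns are cells).
counts <- matrix(
  c(1, 3, 0, 6,
    2, 2, 4, 2),
  nrow = 4,
  dimnames = list(paste0("feature", 1:4), c("cellA", "cellB"))
)
scaleTo <- 10^4

# Divide each column by its sum, then multiply by scaleTo.
normalized <- t(t(counts) / colSums(counts)) * scaleTo

print(colSums(normalized))
```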

threads

The number of threads to be used for parallel computing.

k

The number of nearby cells to use for selecting a bias-matched background while accounting for bgdGroups proportions.
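
One way to picture "nearby cells" is as nearest neighbors in the space of the bias variables. The sketch below is a hedged base R illustration under that assumption: it standardizes two hypothetical bias covariates and finds the k closest background cells to a single query cell by Euclidean distance. The data, the column name `log10nFrags`, and the small value of k are invented for the example, and ArchR's actual matching procedure may differ.

```r
set.seed(42)

# Hypothetical bias covariates for 200 candidate background cells.
background <- data.frame(
  TSSEnrichment = runif(200, 4, 20),
  log10nFrags   = runif(200, 3, 5)
)

# One cell from the group being tested, to be matched against the background.
query <- c(TSSEnrichment = 10, log10nFrags = 4)
k <- 5

# Standardize so both covariates contribute on a comparable scale.
centers <- colMeans(background)
scales <- apply(background, 2, sd)
bgScaled <- scale(background, center = centers, scale = scales)
queryScaled <- (query - centers) / scales

# Euclidean distance from the query to every background cell.
distances <- sqrt(rowSums(sweep(bgScaled, 2, queryScaled)^2))

# Indices of the k nearest background cells.
nearest <- order(distances)[seq_len(k)]
print(background[nearest, ])
```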

bufferRatio

When generating optimal bias-matched background groups of cells to determine significance, it can be difficult to find enough well-matched cells to create a background group with the same number of cells as the group being tested. The bufferRatio indicates the fraction of the total cells that must be obtained when creating the bias-matched group. For example, to create a bias-matched background for a group of 100 cells with bufferRatio set to 0.8, the background group will be composed of the 80 best-matched cells. This option provides flexibility in generating bias-matched background groups given the added constraint of maintaining the group proportions from bgdGroups.
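
The arithmetic in the example above can be written out as a short base R sketch. The use of `floor()` for rounding and the random `matchDistance` scores are assumptions made for illustration, not a description of ArchR's internals.

```r
nGroupCells <- 100   # size of the group being tested
bufferRatio <- 0.8

# Number of background cells that must be obtained (rounding is an assumption).
nRequired <- floor(bufferRatio * nGroupCells)

# Hypothetical match quality for candidate background cells (lower is better).
set.seed(7)
matchDistance <- runif(nGroupCells)

# Keep only the best-matched candidates.
bestMatched <- order(matchDistance)[seq_len(nRequired)]

print(nRequired)
print(length(bestMatched))
```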

binarize

A boolean value indicating whether to binarize the matrix prior to differential testing. This is useful when useMatrix is a matrix based on insertion counts.
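
Binarization in this sense means replacing every non-zero entry with 1. A minimal base R sketch on a hypothetical count matrix:

```r
# Hypothetical insertion counts (rows are features, columns are cells).
counts <- matrix(c(0, 2, 5, 0, 1, 0), nrow = 3)

# Any non-zero count becomes 1; zeros stay 0.
binarized <- (counts > 0) * 1

print(binarized)
```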

useSeqnames

A character vector that indicates which seqnames should be used in marker feature identification. Features from seqnames that are not listed will be ignored. In the context of a Sparse.Assays.Matrix, such as a matrix containing chromVAR deviations, the seqnames do not correspond to chromosomes. Instead, they correspond to the sub-portions of the matrix, for example raw deviations ("deviations") or deviation z-scores ("z") for a chromVAR deviations matrix.
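
The restriction to listed seqnames can be pictured as a simple filter on a feature table. The base R sketch below uses a hypothetical `features` data frame that mimics a chromVAR deviations matrix with "deviations" and "z" sub-portions. The table layout and motif names are assumptions for illustration only.

```r
# Hypothetical feature table: each motif appears once per sub-portion.
features <- data.frame(
  seqnames = c("deviations", "deviations", "z", "z"),
  name = c("MotifA", "MotifB", "MotifA", "MotifB"),
  stringsAsFactors = FALSE
)
useSeqnames <- "z"

# Keep only features whose seqnames are listed; all others are ignored.
kept <- features[features$seqnames %in% useSeqnames, , drop = FALSE]

print(kept)
```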

verbose

A boolean value that determines whether standard output is printed.

logFile

The path to a file to be used for logging ArchR output.