HDMA

Data download and descriptions

This file provides a more detailed description of the data and analysis products deposited on Zenodo at https://zenodo.org/communities/hdma, and demonstrates how to programmatically search for and download specific files.

Contents:

Downloading data from Zenodo

Any record on Zenodo can be downloaded by navigating to the record URL provided in Table S14 and downloading files from the web interface.

Programmatic download

For programmatic access, we provide a helper script that retrieves links to all files in a set of Zenodo records, using the latest version of each record. This is useful since many of our data types are split across several depositions, and one may want to download several files or types of files across depositions. We also provide examples of how to download specific files across many depositions.

(Another option is the zenodo_get Python package, which can download all data in a record, or fetch direct links to all files within a record, from the command line. Important note: for zenodo_get, the input must be one of the versioned DOIs for the record, not the concept record that resolves to the latest version; the latter is what is provided in Table S14.)

Helper script

We will use the following helper script, get_urls.py, to get links to all files in a set of Zenodo records, using the latest version of each record.

# Input: space-separated Zenodo record IDs, from STDIN or passed as an argument
# Output: urls_record.txt with one line per record, corresponding to the record URL for the latest version
#         urls_file.txt with one line per file, corresponding to the direct download URL, for all files in all the input records
# Usage: $ python get_urls.py 15048277

import requests
import sys

def get_latest_zenodo_info(record_id):
	"""
	Fetches the DOI and record URL for a Zenodo record ID, resolving to the latest version.

	Args:
		record_id (str): The Zenodo record ID.

	Returns:
		tuple: A tuple containing the DOI and record URL, or (None, None) on error.
	"""
	api_url = f"https://zenodo.org/api/records/{record_id}"

	try:
		response = requests.get(api_url)
		response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
		data = response.json()

		# Get the API URL of the latest version (which should be the same record)
		latest_html_url = data["links"]["self"]  # directly get the current record

		# Extract the DOI
		latest_doi = data["doi"]

		return latest_doi, latest_html_url

	except requests.exceptions.RequestException as e:
		return None, None
	except (KeyError, ValueError) as e:
		return None, None

def get_file_urls(record_id):
	"""
	Fetches file URLs for a given Zenodo record ID.

	Args:
		record_id (str): The Zenodo record ID.

	Returns:
		list: A list of file URLs, or an empty list on error.
	"""
	try:
		api_url = f"https://zenodo.org/api/records/{record_id}"
		response = requests.get(api_url)
		response.raise_for_status()
		data = response.json()
		file_urls = []
		for file_data in data['files']:
			fname = file_data['key']
			link = f"https://zenodo.org/record/{record_id}/files/{fname}"
			file_urls.append(link)
		return file_urls

	except requests.exceptions.RequestException as e:
		return []
	except (KeyError, ValueError) as e:
		return []

record_urls = []
file_urls_all = []
errors = []

# Determine source of record IDs (stdin or command line arguments)
record_ids = []
if not sys.stdin.isatty():
	record_ids = sys.stdin.read().strip().split()
else:
	record_ids = sys.argv[1:]

for record_id in record_ids:
	if not record_id:
		continue

	print(record_id)
	latest_doi, latest_html_url = get_latest_zenodo_info(record_id)
	if latest_doi and latest_html_url:
		record_urls.append(latest_html_url)

		latest_record_id = latest_html_url.split("/")[-1]  # Extract record ID from URL
		file_urls = get_file_urls(latest_record_id)
		file_urls_all.extend(file_urls)
	else:
		errors.append(f"Error processing record ID {record_id}")

if errors:
	for error in errors:
		print(error)
else:
	with open("urls_record.txt", "w") as record_file:
		for url in record_urls:
			record_file.write(url + "\n")

	with open("urls_file.txt", "w") as file_file:
		for url in file_urls_all:
			file_file.write(url + "\n")

	print("Record URLs written to urls_record.txt")
	print("File URLs written to urls_file.txt")


Example: get trained ChromBPNet models for Brain cell types

# list the data types on Zenodo from Table S14
cut -f 1 table_s14.tsv | sort | uniq
# ArchR projects
# Bigwigs
# BPCells objects
# Cell metadata, caCREs, ChromBPNet training regions, instances, motif lexicon
# ChromBPNet counts mean contribution scores (h5)
# ChromBPNet models
# ChromBPNet models, and shared bias model
# data_type
# Fragments + count matrices
# Per-cluster TF-MoDISco motifs (h5)
# Seurat objects

# record IDs are in column 4 of Table S14, so we pipe those
# into our helper to get the URLs for the files from records containing ChromBPNet models
grep models table_s14.tsv | cut -f 4 | tr '\n' ' ' | python get_urls.py
# Record URLs written to urls_record.txt
# File URLs written to urls_file.txt

# get the URLs for all Brain cell types
grep Brain urls_file.txt > urls_brain.txt

# download the Brain models
wget -i urls_brain.txt
for i in *.gz; do tar -xvf "$i"; done

Example: get the Seurat object for brain tissue

grep Seurat table_s14.tsv | cut -f 4 | tr '\n' ' ' | python get_urls.py
# Record URLs written to urls_record.txt
# File URLs written to urls_file.txt

# get the URL for the Brain object
grep Brain urls_file.txt > urls_brain.txt

# download the Brain Seurat object
wget -i urls_brain.txt

Example: get all fragment files

# get the records containing Fragments + Count Matrices
grep Fragments table_s14.tsv | cut -f 4 | tr '\n' ' ' | python get_urls.py
# Record URLs written to urls_record.txt
# File URLs written to urls_file.txt

# get the URLs for all fragments and fragment index files
grep fragments urls_file.txt > urls_fragment.txt

# download the fragments files & index files
wget -i urls_fragment.txt

Example: get bigwigs for all immune cell types

# get the records containing Bigwigs
grep Bigwig table_s14.tsv | cut -f 4 | tr '\n' ' ' | python get_urls.py

# preview the columns in Table S2
head -n2 table_s2.tsv
# Cluster	organ	organ_code	cluster_id	compartment	L1_annot	L2_annot	L3_annot	dend_order	ncell	median_numi	median_ngene	median_nfrags	median_tss	median_frip	note	organ_color	compartment_color	Cluster_ChromBPNet
# LI_13	Liver	LI	13	epi	LI_13_cholangiocyte	Liver cholangiocyte	cholangiocyte	1	1004	1986	1462.5	7509.5	11.826	0.516103027	PKHD1 ANXA4 cholangiocyte (secretes bile)	#3b46a3	#11A579	Liver_c13

# get all cell types where compartment == "imm", extract the Cluster_ChromBPNet column value,
# and get the URLs matching those IDs
awk -F'\t' '$5 == "imm" { print $19 }' table_s2.tsv | grep -f - urls_file.txt > urls_imm.txt

# download the bigwigs for immune cell types
wget -i urls_imm.txt

Fragment files and count matrices

One ATAC fragment file and tabix index per sample, named <sample>.fragments.tsv.gz and <sample>.fragments.tsv.gz.tbi. Genomic coordinates in fragment files are 0-based.
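
As a quick sketch of how to peek at a fragment file from Python (this assumes the standard 5-column fragment layout of chromosome, start, end, cell barcode, and count; the column names below are our own labels, and the sample name is just an example):

# preview the first few fragments of one sample (example sample name)
import pandas as pd

frags = pd.read_csv(
    "T273_b16_Adr_PCW18.fragments.tsv.gz",
    sep="\t",
    comment="#",        # skip any comment/header lines
    header=None,
    names=["chrom", "start", "end", "barcode", "count"],  # assumed 5-column layout
    nrows=5,
)
print(frags)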

Three RNA count matrix files per sample in Matrix Market Exchange format:

Seurat objects

One Seurat (v4) object per organ, serialized as an RDS file named <organ>_RNA_obj_clustered_final.rds. Objects contain the RNA data for all cells passing QC; ATAC data is not present in these objects.

The cell metadata (<object>@meta.data) contains sample information and cell annotations:

Each object contains 3 assays:

e.g. in R:

# load object for Adrenal
library(Seurat)
library(dplyr)  # for the %>% pipe
adrenal <- readRDS("Adrenal_RNA_obj_clustered_final.rds")
adrenal@meta.data %>% tibble::glimpse()
# Rows: 2,883
# Columns: 25
# $ orig.ident          <chr> "CL131", "CL131", "CL131", "CL131", "CL131", "CL131", "CL131", "CL131"…
# $ nCount_RNA          <dbl> 10283, 22085, 17195, 12484, 10271, 10032, 15655, 11016, 18970, 4805, 4…
# $ nFeature_RNA        <int> 3453, 5526, 4864, 4014, 3876, 3645, 5096, 4483, 5039, 2340, 2365, 3250…
# $ percent.mt          <dbl> 0.009724788, 0.009055920, 0.000000000, 0.008010253, 0.009736150, 0.000…
# $ passfilter          <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
# $ Sample              <chr> "T273_b16_Adr_PCW18", "T273_b16_Adr_PCW18", "T273_b16_Adr_PCW18", "T27…
# $ sex                 <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", …
# $ PCW                 <chr> "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18"…
# $ PCD                 <chr> "PCD126", "PCD126", "PCD126", "PCD126", "PCD126", "PCD126", "PCD126", …
# $ age                 <chr> "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18", "PCW18"…
# $ S.Score             <dbl> -0.042831648, -0.091017252, -0.075550268, -0.143961927, -0.008328376, …
# $ G2M.Score           <dbl> -0.053519769, -0.408871745, -0.230472517, -0.005143041, -0.092735455, …
# $ Phase               <chr> "G1", "G1", "G1", "G1", "G1", "S", "G1", "S", "G1", "S", "G1", "G1", "…
# $ nCount_SCT          <dbl> 8086, 7344, 7914, 8205, 8123, 8090, 8251, 8203, 7772, 6688, 6548, 8001…
# $ nFeature_SCT        <int> 3422, 3027, 3561, 3788, 3833, 3614, 4197, 4409, 3366, 2327, 2365, 3228…
# $ SCT_snn_res.0.3     <fct> 2, 2, 2, 2, 2, 2, 4, 5, 2, 2, 5, 2, 2, 5, 2, 6, 2, 2, 2, 2, 2, 6, 2, 2…
# $ seurat_clusters     <fct> 2, 2, 2, 2, 2, 2, 4, 5, 2, 2, 5, 2, 2, 5, 2, 6, 2, 2, 2, 2, 2, 6, 2, 2…
# $ Clusters            <fct> 2, 2, 2, 2, 2, 2, 4, 5, 2, 2, 5, 2, 2, 5, 2, 6, 2, 2, 2, 2, 2, 6, 2, 2…
# $ estConp             <dbl> 0.30891153, 0.13286947, 0.17151322, 0.24562908, 0.51484749, 0.33882811…
# $ nCount_decontX      <dbl> 6026, 6775, 7050, 6733, 3929, 5714, 2715, 3481, 7371, 4119, 5477, 6580…
# $ nFeature_decontX    <int> 3159, 2977, 3473, 3554, 2418, 3257, 1841, 2427, 3329, 2003, 2299, 3093…
# $ decontX_snn_res.0.3 <fct> 2, 2, 2, 2, 2, 2, 4, 5, 2, 2, 5, 2, 2, 5, 2, 6, 2, 2, 2, 2, 2, 6, 2, 2…
# $ annotv1             <chr> "AG_epi", "AG_epi", "AG_epi", "AG_epi", "AG_epi", "AG_epi", "AG_epi", …
# $ keepcell            <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
# $ annotv2             <chr> "adrenal cortex", "adrenal cortex", "adrenal cortex", "adrenal cortex"…

ArchR projects

One zipped folder per organ containing the ArchR project (Save-ArchR-Project.rds) and all associated files, named <organ>_ATAC_obj_clustered_peaks_final_decontx.zip. See the ArchR documentation for instructions on loading projects. ArchR projects contain the ATAC data for all cells passing QC; RNA data is also appended to these projects.

The cell metadata (getCellColData(<project>)) contains sample information and cell annotations:

Each project contains 5 matrices:

e.g. in R:

# load project for Adrenal
library(ArchR)
adrenal <- loadArchRProject("ATAC_obj_clustered_peaks_final_decontx/")
adrenal
# 
#            ___      .______        ______  __    __  .______      
#           /   \     |   _  \      /      ||  |  |  | |   _  \     
#          /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
#         /  /_\  \   |      /     |  |     |   __   | |      /     
#        /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
#       /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|
#     
# class: ArchRProject 
# outputDirectory: /path/to/Adrenal/atac_preprocess_output/ATAC_obj_clustered_peaks_final_decontx 
# samples(4): T318_b16_Adr_PCW21 T40_b16_Adr_PCW17 T273_b16_Adr_PCW18 T390_b16_Adr_PCW13
# sampleColData names(1): ArrowFiles
# cellColData names(26): Sample TSSEnrichment ... FRIP ATAC_Clusters_Peak
# numberOfCells(1): 2883
# medianTSS(1): 9.872
# medianFrags(1): 5437

getAvailableMatrices(adrenal)
# [1] "GeneExpressionMatrix" "GeneScoreMatrix"      "MotifMatrix"          "PeakMatrix"           "TileMatrix"

getCellColData(adrenal) %>% tibble::glimpse()
# Formal class 'DFrame' [package "S4Vectors"] with 6 slots
#   ..@ rownames       : chr [1:2883] "T318_b16_Adr_PCW21#CL131_E01+H09+A09" "T318_b16_Adr_PCW21#CL131_E03+G11+A10" "T318_b16_Adr_PCW21#CL131_E05+C09+A09" "T318_b16_Adr_PCW21#CL131_E07+B08+A08" ...
#   ..@ nrows          : int 2883
#   ..@ listData       :List of 26
#   .. ..$ Sample              : chr [1:2883] "T318_b16_Adr_PCW21" "T318_b16_Adr_PCW21" "T318_b16_Adr_PCW21" "T318_b16_Adr_PCW21" ...
#   .. ..$ TSSEnrichment       : num [1:2883] 10.75 10.35 9.33 9.3 9.38 ...
#   .. ..$ ReadsInTSS          : num [1:2883] 31418 27337 20101 19281 20050 ...
#   .. ..$ ReadsInPromoter     : num [1:2883] 32813 29854 22357 21415 21334 ...
#   .. ..$ ReadsInBlacklist    : num [1:2883] 1012 871 764 539 537 ...
#   .. ..$ PromoterRatio       : num [1:2883] 0.168 0.191 0.161 0.176 0.178 ...
#   .. ..$ PassQC              : num [1:2883] 1 1 1 1 1 1 1 1 1 1 ...
#   .. ..$ NucleosomeRatio     : num [1:2883] 2.15 1.71 1.82 1.82 1.81 ...
#   .. ..$ nMultiFrags         : num [1:2883] 43118 30620 27556 24036 23236 ...
#   .. ..$ nMonoFrags          : num [1:2883] 31014 28823 24574 21584 21330 ...
#   .. ..$ nFrags              : num [1:2883] 97567 78222 69279 60787 59842 ...
#   .. ..$ nDiFrags            : num [1:2883] 23435 18779 17149 15167 15276 ...
#   .. ..$ BlacklistRatio      : num [1:2883] 0.00519 0.00557 0.00551 0.00443 0.00449 ...
#   .. ..$ age                 : chr [1:2883] "PCW21" "PCW21" "PCW21" "PCW21" ...
#   .. ..$ RNA_Clusters        : chr [1:2883] "c0" "c5" "c0" "c3" ...
#   .. ..$ Gex_nUMI            : num [1:2883] 6234 2195 6290 6186 6023 ...
#   .. ..$ Gex_nGenes          : num [1:2883] 2707 1533 2861 2640 2806 ...
#   .. ..$ Gex_MitoRatio       : num [1:2883] 0.00353 0.00273 0.00366 0.00388 0.00349 ...
#   .. ..$ Gex_RiboRatio       : num [1:2883] 0.00417 0.00273 0.00302 0.00453 0.00432 ...
#   .. ..$ ATAC_Clusters       : chr [1:2883] "C4" "C5" "C4" "C4" ...
#   .. ..$ RNA_NamedCluster_L1 : chr [1:2883] "AG_epi" "AG_end" "AG_epi" "AG_epi" ...
#   .. ..$ RNA_NamedCluster_L1b: chr [1:2883] "AG_epi1" "AG_end" "AG_epi1" "AG_epi4" ...
#   .. ..$ RNA_NamedCluster_L2 : chr [1:2883] "adrenal cortex" "endothelial" "adrenal cortex" "adrenal cortex" ...
#   .. ..$ ReadsInPeaks        : num [1:2883] 54467 44226 37898 35782 36616 ...
#   .. ..$ FRIP                : num [1:2883] 0.28 0.283 0.274 0.295 0.306 ...
#   .. ..$ ATAC_Clusters_Peak  : chr [1:2883] "C2" "C3" "C2" "C2" ...
#   ..@ elementType    : chr "ANY"
#   ..@ elementMetadata: NULL
#   ..@ metadata       : list()

BPCells object

A BPCells object containing fragment files, count matrices, cell metadata, ABC loops per cluster, and motif instances per cluster is provided at DOI:

The BPCells object was produced by the script code/05-misc/01-bp_cells_create_obj.R.

The BPCells object was primarily used for efficient plotting and visualization of genomic loci. See our notebook with examples for plotting using the BPCells object at code/05-misc/02-bp_cells_plotting_examples.Rmd (html).

NOTE: the BPCells package uses on-disk storage, so the fragment files and count matrices in the RDS object only point to the paths where the files are stored; it is important to update those paths when the folders are moved or when this object is downloaded locally. See the BPCells paper (Parks & Greenleaf, biorxiv, 2025) for more details on the compression and storage in BPCells objects.

Trained ChromBPNet models

One tar archive per cluster, using the Cluster_ChromBPNet cluster ID from Table S2, named <cluster>.gz. Only models that passed QC and were used for downstream analysis are provided; thus, models are available for 189 cell types.

The bias model was trained on Heart_c0 fold_0 and was used for training all ChromBPNet models.

The tar archive for each cluster can be extracted with tar -xvf and contains 15 models (5 folds × 3 model types) saved as H5 weight files named <cluster>__<fold>__<model>.h5:

In the model schematic below, from the ChromBPNet paper (Pampari, Scherbina et al., biorxiv, 2024), bias_model_scaled is the “Frozen bias model”, chrombpnet_nobias is the “TF model”, and chrombpnet is the full model with both submodels together.

We show how to load and use trained ChromBPNet models in the tutorial at code/05-misc/04-ChromBPNet_use_cases.ipynb (html).
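
The tutorial above is the authoritative reference; as a minimal sketch, assuming TensorFlow/Keras is installed and that the H5 files store full Keras models (the filename below is illustrative, and ChromBPNet's own loading utilities may be preferable):

# load one model H5 with Keras (sketch; compile=False skips the training loss)
from tensorflow.keras.models import load_model

model = load_model("Adrenal_c0__fold_0__chrombpnet_nobias.h5", compile=False)
model.summary()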

ChromBPNet mean contribution scores

One H5 file per cluster, using the Cluster_ChromBPNet cluster ID from Table S2, named <cluster>__average_shaps.counts.h5.
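
Since these are standard HDF5 files, their contents can be inspected generically with h5py, e.g.:

# list every group/dataset (and dataset shapes) in a mean contribution score file
import h5py

with h5py.File("Adrenal_c0__average_shaps.counts.h5", "r") as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))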

Bigwig tracks for observed and predicted accessibility and contrib. scores

One zipped directory per cluster, using the Cluster_ChromBPNet cluster ID from Table S2, named <cluster>__bigwigs.gz. Each directory contains the following genomic tracks:

NOTE: for viewing contribution scores in a genome browser, we recommend loading these bigwigs as dynseq tracks in the WashU or UCSC genome browsers, which render the nucleotide character (A/C/G/T) at each position, with character height proportional to the contribution score at that position. When zoomed out, the tracks appear as regular bigwigs. See the dynseq paper (Nair et al, Nature Genetics, 2022) for more details and documentation.
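
For programmatic access (as opposed to browser viewing), the bigwigs can also be read with a library such as pyBigWig; the filename below is a placeholder, since the tracks are packaged per cluster:

# read signal from one of the per-cluster bigwig tracks (placeholder filename)
import pyBigWig

bw = pyBigWig.open("Adrenal_c0__bigwigs/placeholder_track.bw")
print(bw.chroms())                        # available chromosomes and lengths
print(bw.stats("chr1", 181000, 182000))   # mean signal over a window
print(bw.values("chr1", 181096, 181105))  # per-base values
bw.close()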

Motif lexicon and motifs per cell type

Motifs per cell type

We also provide the 6,362 motifs learned with TF-MoDISco across the individual cell types, in the Zenodo deposition DOI:

Following the documentation from tf-modiscolite, the h5 files have this format:

pos_patterns/
    pattern_0/
        sequence: [...]
        contrib_scores: [...]
        hypothetical_contribs: [...]
        seqlets/
            n_seqlets: [...]
            start: [...]
            end: [...]
            example_idx: [...]
            is_revcomp: [...]
            sequence: [...]
            contrib_scores: [...]
            hypothetical_contribs: [...]
        subpattern_0/
            ...
    pattern_1/
        ...
    ...
neg_patterns/
    pattern_0/
        ...
    pattern_1/
        ...
    ...

In the TF-MoDISco terminology, “seqlets” are short genomic loci which are contiguous regions of high-contribution nucleotides, and they are clustered into “patterns”, or motifs.

The h5 file can be loaded e.g. in python:

import h5py
modisco_obj = h5py.File("Adrenal_c0__counts_modisco_output.h5")
modisco_obj
# <HDF5 file "Adrenal_c0__counts_modisco_output.h5" (mode r)>
modisco_obj['pos_patterns']
# <HDF5 group "/pos_patterns" (39 members)>
modisco_obj['neg_patterns']
# <HDF5 group "/neg_patterns" (3 members)>

The merged_modisco_patterns_map.tsv file can be used to match per-cell-type motifs to their motif cluster in the lexicon:

head merged_modisco_patterns_map.tsv
merged_pattern	component_celltype	pattern_class	pattern	n_seqlets	component_pattern
neg.Average_12__merged_pattern_0	Spleen_c5	neg_patterns	pattern_0	88	Spleen_c5__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Spleen_c0	neg_patterns	pattern_0	461	Spleen_c0__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Skin_c5	neg_patterns	pattern_0	302	Skin_c5__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Heart_c13	neg_patterns	pattern_0	128	Heart_c13__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Skin_c7	neg_patterns	pattern_0	708	Skin_c7__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Heart_c5	neg_patterns	pattern_0	47	Heart_c5__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Heart_c4	neg_patterns	pattern_0	41	Heart_c4__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Skin_c9	neg_patterns	pattern_0	335	Skin_c9__neg_patterns.pattern_0
neg.Average_12__merged_pattern_0	Heart_c2	neg_patterns	pattern_0	93	Heart_c2__neg_patterns.pattern_0
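
For example, to find the lexicon motif that a given per-cell-type pattern was merged into, or to list the cell types contributing to a merged motif, the map can be loaded with pandas:

# look up merged lexicon motifs from per-cell-type TF-MoDISco patterns
import pandas as pd

pattern_map = pd.read_csv("merged_modisco_patterns_map.tsv", sep="\t")

# which lexicon motif does this per-cell-type pattern belong to?
component = "Spleen_c5__neg_patterns.pattern_0"
merged = pattern_map.loc[pattern_map["component_pattern"] == component, "merged_pattern"]
print(merged.iloc[0])  # neg.Average_12__merged_pattern_0

# which cell types contributed patterns to a given merged motif?
members = pattern_map.loc[pattern_map["merged_pattern"] == "neg.Average_12__merged_pattern_0"]
print(members["component_celltype"].tolist())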

Motif lexicon

To generate the motif lexicon (also referred to as the motif compendium in the code base), the 6,362 motifs discovered across the 189 cell types with TF-MoDISco were aggregated and subjected to QC, resulting in a total of 508 motifs.

Table S6 contains a summary table of the motif lexicon, one row per motif, along with its granular and broad annotations. The motifs can be explored interactively, along with summary stats across tissues and best matches to known motifs, here: https://greenleaflab.github.io/HDMA/MOTIFS.html

We provide the following resources for the motif lexicon in the Zenodo deposition DOI:

To match the motifs in the lexicon h5 object to their names, use column R, “merged_pattern”, in Table S6, which corresponds to the keys in the object.
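
A sketch of that lookup in Python, assuming Table S6 has been exported as a TSV and using a placeholder filename for the lexicon H5 object:

# match keys in the lexicon H5 object to motif annotations in Table S6
# (table_s6.tsv and motif_lexicon.h5 are placeholder filenames)
import h5py
import pandas as pd

table_s6 = pd.read_csv("table_s6.tsv", sep="\t").set_index("merged_pattern")

with h5py.File("motif_lexicon.h5", "r") as lexicon:
    for key in list(lexicon.keys())[:5]:
        annot = table_s6.loc[key] if key in table_s6.index else None
        print(key, None if annot is None else annot.to_dict())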

Motif instances

The genomic tracks of annotated predictive motif instances in peaks are provided for each cluster as a zipped file motif_instances.gz in the Zenodo deposition DOI:

For every cluster, there are two files.

A BED file: <cluster>__instances.bed.gz, with the columns chr, start, end, motif_name, hit_score, strand, pattern_class:

$ zcat Adrenal_c0__instances.bed.gz | head
chr1	10175	10180	456|ZEB/SNAI	888.5867	-	neg_patterns
chr1	10566	10575	436|SP/KLF	931.89967	+	pos_patterns
chr1	181096	181105	436|SP/KLF	933.10714	+	pos_patterns
chr1	181151	181161	400|NRF1	966.9898999999999	-	pos_patterns
chr1	181180	181190	400|NRF1	962.5238	-	pos_patterns
chr1	181209	181219	400|NRF1	951.84535	-	pos_patterns
chr1	181238	181248	400|NRF1	949.1391	-	pos_patterns
chr1	181267	181277	400|NRF1	944.7280999999999	-	pos_patterns
chr1	181296	181306	400|NRF1	934.34006	-	pos_patterns
chr1	181325	181335	400|NRF1	930.62365	-	pos_patterns

A TSV file: <cluster>__instances.annotated.tsv.gz, with a more extended set of information (including hit scores and genomic annotations) per motif instance:

$ zcat Adrenal_c0__instances.annotated.tsv.gz | head
seqnames	start	end	width	strand	start_untrimmed	end_untrimmed	motif_name	source	hit_coefficient	hit_correlation	hit_importance	peak_name	peak_id	motif_name_unlabeled	pattern_class	distToGeneStart	nearestGene	peakType	distToTSS	nearestTSS	GC	N	distToPeakSummit
chr1	10175	10180	5	-	10158	10188	456|ZEB/SNAI	2	1.1400998	0.8885867	0.009292185	NA	0	neg_patterns.neg.Average_12__merged_pattern_0	neg_patterns	1690	DDX11L1	Promoter	1690	ENST00000456328.2	0.4	0	71
chr1	10566	10575	9	+	10560	10590	436|SP/KLF	2	9.525877	0.93189967	0.02702999	NA	0	pos_patterns.pos.Average_212__merged_pattern_0	pos_patterns	1297	DDX11L1	Promoter	1297	ENST00000456328.2	0.8889	0	464
chr1	181096	181105	9	+	181090	181120	436|SP/KLF	1	1.8184662	0.93310714	0.0265913	NA	1	pos_patterns.pos.Average_212__merged_pattern_0	pos_patterns	1594	DDX11L17	Promoter	1594	ENST00000624431.2	0.8889	0	393
chr1	181151	181161	10	-	181138	181168	400|NRF1	2	8.63048	0.9669899	0.05363989	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1539	DDX11L17	Promoter	1539	ENST00000624431.2	0.9	0	338
chr1	181180	181190	10	-	181167	181197	400|NRF1	2	4.2169023	0.9625238	0.038315773	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1510	DDX11L17	Promoter	1510	ENST00000624431.2	0.9	0	309
chr1	181209	181219	10	-	181196	181226	400|NRF1	2	2.505722	0.95184535	0.03131914	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1481	DDX11L17	Promoter	1481	ENST00000624431.2	0.9	0	280
chr1	181238	181248	10	-	181225	181255	400|NRF1	2	2.895874	0.9491391	0.036172867	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1452	DDX11L17	Promoter	1452	ENST00000624431.2	0.9	0	251
chr1	181267	181277	10	-	181254	181284	400|NRF1	2	1.5476482	0.9447281	0.026664732	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1423	DDX11L17	Promoter	1423	ENST00000624431.2	0.9	0	222
chr1	181296	181306	10	-	181283	181313	400|NRF1	2	1.0276077	0.93434006	0.023554802	NA	1	pos_patterns.pos.Average_159__merged_pattern_0	pos_patterns	1394	DDX11L17	Promoter	1394	ENST00000624431.2	0.9	0	193
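
Both files can be read directly with pandas: the BED file has no header row, so we supply the column names listed above, while the annotated TSV carries its own header:

# load the motif instance files for one cluster
import pandas as pd

bed_cols = ["chr", "start", "end", "motif_name", "hit_score", "strand", "pattern_class"]
instances_bed = pd.read_csv("Adrenal_c0__instances.bed.gz", sep="\t",
                            header=None, names=bed_cols)

instances_annot = pd.read_csv("Adrenal_c0__instances.annotated.tsv.gz", sep="\t")

print(instances_bed.head())
print(instances_annot.columns.tolist())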

Genomic tracks on the WashU Genome Browser

For easy interactive visualization, all genomic tracks can be loaded on the WashU Genome Browser using this link: https://epigenomegateway.wustl.edu/browser2022/?genome=hg38&hub=https://human-dev-multiome-atlas.s3.amazonaws.com/tracks/HDMA_trackhub.json.

For instructions on loading tracks, see below:

  1. Go to Tracks > Remote tracks:

  2. Click to expand the content:

  3. Use the track table to load cell types or tracks of interest:

For each cluster, the following tracks are available: