A human cell atlas of fetal chromatin accessibility
Silvia Domcke, Andrew J. Hill, Riza M. Daza, Junyue Cao, Diana R. O’Day, Hannah
A. Pliner, Kimberly A. Aldinger, Dmitry Pokholok, Fan Zhang, Jennifer H. Milbank,
Michael A. Zager, Ian A. Glass, Frank J. Steemers, Dan Doherty, Cole Trapnell, Darren
A. Cusanovich, Jay Shendure
Materials/Methods, Supplementary Text, Tables, Figures, and/or References
- Materials and Methods
- Figs. S1 to S19
- References
- Table S1
- Master list of sites. Chromosomal location, width, read number in the object evenly subsampled across cell
types, and indication of whether this site was z-score filtered for downstream analyses,
for each site within a merged set of 1.05 M peaks of accessibility.
- Table S2
- Metadata of high-quality cells. Includes sample metadata, various per-cell QC stats, Louvain cluster ID, and cell
type annotation for each of the 790,957 high-quality cells used in the downstream
analyses.
- Table S3
- Motif enrichments across cell types. For each of the 579 motifs from the JASPAR vertebrate database, the enrichment in
accessible sites in each of the 54 main cell types was determined using a linear regression
model. The fold-change of the mean motif occurrence in sites of a given cell type relative
to the rest of the dataset and the matching Benjamini-Hochberg-adjusted p-values are reported
for each motif-cell type pair.
- Table S4
- Specificity scores for top 10,000 sites per cell type. The top 10,000 specific sites for each of the 54 cell types and their specificity
scores are reported. For downstream analyses, sites beyond two standard deviations
from the mean of the log-scaled counts per site distribution (as indicated in Table
S1) were removed, yet they are included here with their scores for convenience. For
de novo motif discovery, the top 2,000 specific sites per cell type were used, and
for heritability enrichment calculations, the top 10,000 specific sites were used,
in both cases after z-score filtering.
- Table S5
- Cicero co-accessibility scores by cell type. Comma-separated table of Cicero co-accessibility scores greater than 0.1, generated
for each of 1010 cell type/tissue pairs. The first two columns are the coordinates
in hg19 of the two tested sites. Each of the remaining columns holds the co-accessibility
scores for one cell type. NA values indicate either that the pair of sites was not
tested because of insufficient depth or that the co-accessibility value was less than
0.1.
- Table S6
- Cicero gene activity scores by cell type. Comma-separated combined triplet sparse matrix table of Cicero gene activity scores
for each of 1010 cell type/tissue pairs. The first column is gene symbols, the second
column is cell IDs with cell type and tissue, and the third column is the associated
gene activity score. NA values indicate that there was insufficient accessibility
to calculate a gene activity score.
- Table S7
- Motif enrichment across clusters in UMAP of sites. For each of the 579 motifs from the JASPAR vertebrate database, the enrichment in
each of the 15 Louvain clusters in the UMAP of sites was determined using a linear
regression model. Fold-change of the mean motif occurrence in a given cluster relative
to the rest of the dataset and matching Benjamini-Hochberg-adjusted p-values are reported
for each motif-cluster pair.
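
The site filter mentioned under Table S4 (removing sites beyond two standard deviations from the mean of the log-scaled counts per site distribution) can be illustrated with a minimal sketch. This is not the authors' code: the function name `zscore_filter`, the base-10 logarithm, the pseudocount of 1, the use of the population standard deviation, and the toy read counts are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def zscore_filter(read_counts, max_abs_z=2.0, pseudocount=1.0):
    """Return one boolean per site: True if the site is kept.

    Assumptions (not stated in the source): log10 scaling, a pseudocount
    of 1 to avoid log(0), and the population standard deviation.
    """
    logs = [math.log10(c + pseudocount) for c in read_counts]
    mu = mean(logs)
    sd = pstdev(logs)
    if sd == 0:
        # All sites have identical counts, so none is an outlier.
        return [True] * len(logs)
    # Keep sites whose log-scaled count lies within max_abs_z
    # standard deviations of the mean.
    return [abs((x - mu) / sd) <= max_abs_z for x in logs]


# Hypothetical per-site read counts; the ninth site is far below the rest.
counts = [120, 95, 130, 110, 105, 98, 125, 115, 1, 102]
keep = zscore_filter(counts)
filtered_counts = [c for c, k in zip(counts, keep) if k]
```

In this toy input only the site with a single read falls outside the two-standard-deviation window; in practice the flag recorded in Table S1 should be used rather than recomputing the filter.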

