Most perturbations a eukaryotic cell experiences occur at nonreplicative time scales. These perturbations are remarkably varied, range in intensity, and can be completely distinct from previously encountered stimuli. Examples exist throughout the human body, including within the skin, the alimentary tract, the immune system, the respiratory tract, and the reproductive system and in malignancy. Consider the epithelial lining of the digestive and respiratory systems. While both systems are constantly renewing their lining, most functional cells within these tissues persist for days to weeks after replication. During their life span, these cells are exposed to a wide range of nutrients and toxicants that necessitate the modification of gene expression to carry on basic cellular functions across these variable conditions (e.g., appropriately absorb nutrients, regulate ionic homeostasis, maintain a sufficient mucosal barrier, excrete waste products, and secrete immunoglobulins). No better example may exist than malignant tumor cells, which are remarkably adept at acclimating to a broad spectrum of cytotoxic chemotherapies and radiation exposure while evading detection from the myriad tools present within the immune system. These capabilities evoke a critical question: How do individual cells acclimate to fluctuating or completely novel conditions? Likewise, how do collections of cells, such as an organ or a tumor mass, acclimate in aggregate to a heterogeneous, rapidly evolving environment?
One widely explored mechanism to respond to these varied conditions is to have a level of predetermined functionalization: intermixing specialized cells within an organ to carry out specific roles. Beyond establishing precoordinated responses, an intriguing possibility is for cells and cell populations to have an encoded level of phenotypic plasticity to acclimate to changing environmental conditions in real time (1, 2). In the context of multicellular systems, the level of phenotypic plasticity encoded is a product of cellular malleability—the functional responsiveness of cells toward stable end states upon external stimulation—and the level of intercellular heterogeneity—the diversity of stable states that are observed within the same population at a given time. Recent advances in single-cell technologies have highlighted the remarkable levels of diversity within multicellular systems for otherwise seemingly identical cells (3–5). An extraordinary level of diversity has been demonstrated in the lungs (3), breast tissue (4), the gastrointestinal tract (5), and in malignancy (6–9). Furthermore, the cancer state is associated with considerable structural (10, 11), transcriptional (8, 12), epigenetic (9, 13), and mutational heterogeneity (6, 9)—all of which have been demonstrated to be independently linked to chemotherapeutic resistance, metastasis, survival, and resilience in multiple cancer models. Likewise, transcriptional responsiveness is concomitant with cancer cell survival in response to chemotherapy and the functional responsiveness of immune cells to microbes (14). To date, these two facets of phenotypic plasticity have largely been viewed as distinct entities, as no mechanisms have been identified that simultaneously link both diversity and responsiveness. However, at the level of gene transcription, both the malleability and intercellular heterogeneity of gene expression within cell populations could result from the physical organization of chromatin (14, 15).
To test the hypothesis that the physical organization of the genome is a regulator of both transcriptional malleability and intercellular heterogeneity, we used multiscale modeling to describe transcription as a series of chemical reactions occurring in a heterogeneous, crowded environment—a disordered chromatin-packing macromolecular crowding (CPMC) model. Pairing the CPMC model with single-cell RNA sequencing (scRNA-seq), chromatin electron microscopy tomography (ChromEM)—a DNA-specific staining technique for electron microscopy—and live-cell partial wave spectroscopic (PWS) microscopy, we demonstrate that the physical structure of chromatin packing determines both the level of transcriptional malleability and heterogeneity. In particular, the CPMC model predicts that at the supranucleosomal scale [from ~kbp (kilo–base pairs) to several Mbp (million base pairs)], the scaling behavior of chromatin packing, which is the relationship between the genomic length of a chromatin chain and its packing size in three-dimensional space, determines the level of intercellular transcriptional heterogeneity by regulating local variations in chromatin density (14, 16). Furthermore, the scaling of chromatin packing directly influences the level of transcriptional malleability by regulating both gene accessibility and the free energy of transcription reactions (17–19). Last, applying the CPMC model to interrogate the phenotypic plasticity of cancer cells, we show that increased transcriptional malleability has an impact on cancer patient mortality. Analyzing gene expression data from The Cancer Genome Atlas (TCGA) (20), we demonstrate that transcriptional divergence—a direct measure of transcriptional malleability, which is connected with chromatin packing scaling—is inversely related to patient survival in advanced (stage 3 and stage 4) colorectal, breast, and lung cancers. In summary, this work mechanistically links two distinct aspects of phenotypic plasticity, transcriptional malleability and intercellular heterogeneity, with the physical properties representing the structure of disordered chromatin packing. Using the CPMC model, we quantitatively describe the role that physical forces play on gene expression in vitro and describe a potential mechanistic relationship between structural alterations of chromatin observed in cancer and prognosis.
The CPMC model considers transcription in dilute, ex vivo conditions as a series of diffusion-limited chemical reactions that use DNA, transcription factors (TFs), and RNA polymerase II (Pol-II) to produce mRNA (Fig. 1A). The total production of mRNA in these conditions will depend on the concentration of reactants ([C]tot; Fig. 1B), the rate of polymerase elongation (km; Fig. 1C), and the dissociation rates of TFs and polymerase from DNA (KD: Fig. 1D). These molecular factors are well-studied regulators of gene expression in vitro. For example, at the scale of nuclear compartments, the formation and dissipation of topologically associating domains (TADs) can alter local TF concentrations (21). In addition, posttranslational histone modifications alter nucleosomal stability, thereby influencing the rate of polymerase elongation (22). Other posttranslational modifications of RNA polymerase itself independently control polymerase activity (23). Furthermore, gene motifs determine binding affinities of polymerase and TFs, resulting in varied dissociation constants of these molecules from their respective target genes (24).
Compared to ex vivo conditions, the eukaryotic nucleus is a highly crowded, heterogeneous environment (Fig. 1E). To model transcription reactions within such an environment requires consideration of the length scales involved. At the smallest scale (within ~20 nm of a gene, i.e., an “interaction volume”), macromolecular crowding (ϕin) influences transcription by affecting the mobility of transcriptional reactants and the dissociation rate of these molecules from DNA (19, 25, 26). In addition, the accessible surface area of chromatin determines the number of DNA binding sites available to transcriptional reactants. The probability of a gene promoter to be available for transcription depends on its local accessible surface area. At these small length scales, transcription can be modeled as a network of chemical reactions involving TFs, Pol-II, and DNA. TFs bind to their respective DNA binding sites and recruit polymerases to gene promoters, which, in turn, bind DNA. These series of reactions result in intermediary transcription complexes that stochastically transcribe genes into mRNA. Each reaction coefficient depends on local crowding effects, which can be calculated using Brownian dynamics (BD) and Monte Carlo (MC) simulations. Gene expression for particular crowding conditions is calculated by solving the steady-state network of equations that models these transcription reactions (19, 26). This modeling approach predicts a nonmonotonic dependence of transcription on crowding. This nonmonotonic behavior is influenced by the molecular factors previously discussed and is due to the opposing effects of macromolecular crowders on chemical reactions. Initially, transcription rates increase with crowding due to an enhanced binding stability of TFs and Pol-II arising from attractive depletion interactions. At higher crowding conditions, however, the crowding-induced reduction of molecular mobility dominates, lowering transcription rates. Notably, the most prevalent macromolecular crowder in the nucleus is chromatin. Thus, local chromatin density within the interaction volume of a gene should have a profound effect on transcription processes. Recent electron microscopy studies have shown that chromatin packing density is highly heterogeneous across the genome. Some genes have interaction volumes with exceedingly high densities [chromatin volume concentration (CVC) up to >60%], while others may be positioned in regions of the nucleus with CVC as low as ~10 to 20% (27). One approach to study the effect of local crowding on transcription in cells would be to experimentally measure the local density of chromatin near every gene using electron microscopy and pair these measurements with in situ mRNA levels. This, however, is beyond existing technical capabilities, and an alternate approach is needed.
Instead of experimentally mapping gene expression to locus-specific crowding conditions, the CPMC model probabilistically samples the polymeric properties of chromatin to approximate transcriptional output of an ensemble of genes under similar molecular and varying physical conditions (14, 28, 29). A combination of molecular factors influences the relative initial expression levels of these genes (19). In this work, we focus on how physical regulators further modulate transcription reactions to produce a final observed transcription rate. The model considers chromatin to be a disordered heteropolymer that is heterogeneously packed in three-dimensional (3D) space. The 3D packing of the chromatin polymer determines the volume fraction occupied by chromatin, the number of nucleotides acting together as a grouped polymeric entity (Nd), and the space-filling geometry or the scaling behavior of these polymeric entities. Nd can be considered as the number of nucleotides that are contained within a subset of the chromatin polymer that has self-similar, power law scaling properties. The power law scaling behavior exhibited by the chromatin polymer within a self-similar domain of size Nd describes the relationship between the length of a given segment (e.g., the number of nucleotides, N) and the size (R) of the physical space occupied by that segment, N ∝ RD for N ≤ Nd. The scaling factor D is frequently referred to as the fractal dimension of the polymer and is determined by the balance of the free energy of polymer-polymer and polymer-solvent interactions. D of an unconstrained free polymer may range from D = 5/3 for an excluded volume polymer to D = 2 for an ideal chain polymer in theta solvent and to D = 3 for a completely space-filling polymer. A polymer with a uniform chain structure throughout would form a single fractal domain with D determined by the properties of the chain and the solvent. Chromatin, on the other hand, is a heterogeneous polymer with variable post-translational histone modifications and DNA methylation patterns. This leads to differential interactions between the heterogeneous chromatin subunits and results in chromatin compartmentalization, potentially as a result of liquid-liquid phase separation (30). Additional topological constraints induced by chromatin-binding proteins, such as those responsible for the formation of chromatin loops or nuclear lamins, might further influence D within a given chromatin domain or compartment. Electron microscopy and super-resolution imaging studies have demonstrated the existence of spatially segregated supranucleosomal nanoscale packing domains with a variable size distribution in 3D space (27, 31). We have been able to visualize the existence of these packing domains using ChromEM (Fig. 1E) and PWS (Fig. 1F) as small (100 to 200 nm in diameter; genomic size, between 100 and 400 kb), globular regions that exhibit power law scaling behavior and tend to have higher chromatin density and D. The CPMC model considers a gene’s interaction volume to be located within these packing domains. Accordingly, the local environment of a gene’s interaction volume is determined by the encompassing packing domain, each of which may have its own average nuclear crowding density (ϕin,0) (Fig. 1E), chromatin packing scaling D (Fig. 1, F and G), and genomic size (Nd) (Fig. 1H). These local physical conditions are important determinants of gene expression. In addition, gene length (L) partially influences the size of the interaction volume of a given gene, affecting the range of crowding conditions the gene is probabilistically exposed to. The CPMC model uses these measurable physical regulators of chromatin to approximate distributions of chromatin density and accessibility to determine transcription for each gene throughout the entire nucleus, a feat that is currently experimentally infeasible (17).
The expected expression rate of a gene in vitro is the product of the steady-state mRNA expression rate of that gene (ϵ) and the probability of the gene to be on the accessible surface of the chromatin polymer (pg). Steady-state expression rate is a function of molecular features surrounding the gene of interest (
; TF concentration, histone state, enhancer-promoter interactions, etc.) (Fig. 1, B to D) in the context of local physical conditions (Fig. 1, E to H) (14, 18, 19, 25). The probability of gene accessibility contributes to the likelihood of a gene to interact with transcriptional components (TFs and Pol-II) in vitro (32). It is beyond technical capabilities to measure all molecular and physical parameters of the model for specific genes at the single-cell level. Thus, we explore how a given ensemble of genes with similar molecular features
(e.g., grouped by their initial expression or associated gene ontologies) would respond to changes in average measurable physical conditions. Specifically, we study how average nuclear crowding density (ϕin,0), average chromatin packing scaling (D), and genomic size of a packing domain (Nd) change the behavior of global transcription processes. It is critical to stress that the CPMC model does not assume that the chromatin polymer has the same power law scaling behavior or constant density throughout the entire nucleus, but that this is instead an approximation due to existing experimental limitations. The model can further be extended to consider that each packing domain has its own chromatin packing scaling D, as technological capabilities to co-register chromatin packing, molecular, and genomic properties advance. In addition, this model considers nuclear crowding density within each interaction volume, ϕin, is assumed to be constant relative to the time scale of transcription (in minutes), in line with recent imaging studies of chromatin mobility (33).
Given these considerations, in a population of cells, each gene will be exposed to different crowding densities ϕin. Each ϕin will be sampled from the probability distribution function f(ϕin), which is assumed to follow a normal distribution with mean ϕin,0 and variance
, where rmin is the radius of the elementary unit of chromatin (e.g., a base pair) and rin is the radius of the interaction volume (Supplementary Text) (14). Because of the mass fractal nature of chromatin,
for a gene of length L, where
is the radius of the interaction volume for a single base pair and is approximated from previous MC simulations of crowding effects (14, 19). Thus, the expected range of crowding densities each gene is exposed to is dependent on the statistical properties of the packing domain of size Nd where the gene is located, including D and ϕin,0, and is further influenced by length L of the gene. The transcription rate ϵ itself is assumed to depend on molecular features
and on local crowding density ϕin. We calculate all expression rates under the assumption that molecular features
remain constant throughout the population, with physiologically relevant values used in previous MC and BD crowding simulations (table S1) (19). This gives rise to the form of
, the average expression rate for an ensemble of genes that share a given
Likewise, a power law model of chromatin packing scaling allows the CPMC model to calculate the probability of a unit of DNA (e.g., a gene promoter) to be on the accessible surface of chromatin, pg (28, 29)
Last, merging accessibility with steady-state expression rate for a group of genes, the ensemble expression rate is
To quantitatively analyze the effect of D on gene expression, we calculated the sensitivity of gene expression as a function of D as predicted by the CPMC model. Sensitivity (Se) is the measurement of how a dependent variable (i.e., gene expression) will change as a function of a perturbation to an independent variable (i.e., D). Se of expression rate for any group of genes to changes in chromatin packing is defined as
(4)where Ei is the initial average expression rate of the group of genes sharing similar molecular features
and gene length L, and Di is the initial average chromatin packing scaling within a cell population. A positive Se for a given group of genes indicates that an increase in the scaling of chromatin packing (D↑), on average, enhances their collective expression rate. The CPMC model predicts the output of transcription reactions that occur within the nucleus. Assuming that the half-life of mRNA transcripts is dictated by cytoplasmic conditions, structural changes in chromatin that alter chromatin packing scaling D are not considered to alter the degradation rate of mRNA. Thus, sensitivity should be directly related to the number of transcripts produced for any group of genes in the nucleus.
To solve Eq. 4, we used a Taylor series approximation of
is a nonmonotonic function of ϕin due to the competing effects of crowding on depletion interactions and molecular diffusion, and
quantifies gene expression as a function of crowding within a transcriptional interaction volume. Expression rate κ = 22.6 nM/s is derived from a steady-state solution of rate equations that model transcription and whose crowding-dependent rates were determined from BD and MC simulations as described previously (14, 19). Notably, the function
can be simulated by varying any or several of the components of
. Although, in principle, the exact form of
as a function of
may depend on which component of
is being varied, i.e.,
, in practice, κ is only weakly dependent on
. In other words,
, with the average expression rate as the “common dominator” of multiple molecular factors. Thus, predictions of the CPMC model regarding the effects of physical regulators on ensemble gene expression should be robust to changes in molecular factors. Combining Eqs. 1 to 5, Se of expression rate becomes
Physical factors of chromatin structure regulate sensitivity of gene expression to changes in chromatin packing scaling
To first test the CPMC model predictions in vitro, we used live-cell PWS microscopy to measure D (Fig. 2, A and B) (33, 34) and ChromEM to measure ϕin,0 (Fig. 2, C and D) (27) paired with mRNA microarrays, RNA-seq, and scRNA-seq to measure gene expression of cell populations under different conditions. Specifically, average D was calculated by first averaging D values from PWS measurements within each cell nucleus and then averaging these measurements over the entire cell population for each treatment condition. Using ChromEM, average chromatin density was measured within each nucleus with ~3-nm resolution. As ϕin,0 represents the crowding contributions from both chromatin and mobile crowders within the nucleus, we added to our ChromEM measurements an additional 5% contribution from unbound macromolecules (as described in Materials and Methods). In addition, we used publicly available DNA sequencing information to obtain gene length and high-throughput chromatin conformation (Hi-C) capture data to approximate Nd from the size of TADs (35). In relation to previous work on higher-order chromatin organization, Nd could extend from tens of thousands to millions of base pairs. While Nd might not necessarily represent the organization observed in TADs, TAD size was used as an approximate measure of Nd as these domains have been shown to obey self-similar organization (36), as evidenced by power law scaling properties of contact probability within TADs (37). Combining these methods, we then tested the CPMC model’s predictions of Se against in vitro measurements for each identified physical regulator of gene expression.
To test the role of initial Di, we performed an RNA interference knockdown of the chromatin remodeling enzyme Arid1a (A-Kd) in human colon carcinoma HT-29 cells, which resulted in a lower Di compared to wild-type (WT) cells (17). Next, we used PWS microscopy to measure changes in chromatin packing scaling D in serum-starved WT and A-Kd HT-29 cells before and 30 min after stimulation with 10% fetal bovine serum (FBS), 100 nM epidermal growth factor (EGF), and 100 nM phorbol 12-myristate 13-acetate (PMA) (14). In parallel, we measured gene expression for these conditions at 5 hours using mRNA microarrays. Genes were grouped for WT and A-Kd cells separately based on their relative initial expression during serum starvation, and the experimentally measured sensitivity Δ ln E/Δ ln D was calculated for each group of genes. As predicted by the CPMC model, experimental measurements of Se of gene expression show a bidirectional, monotonic responsiveness to D as a function of initial expression in HT-29 cells (ϕin,0 ~ 39%, approximated by dividing chromosome copy number by nuclear volume). In addition, we found that Di predominantly changes the responsiveness of initially underexpressed genes (Fig. 2, E and F). These results indicate that populations of cells with a higher D would have a higher level of transcriptional divergence (the difference between highly and low expressed genes) than lower D cells. Cancer cells across most malignancies, stem cells, and, especially, cancer stem cells are all examples of cell populations that have elevated chromatin packing scaling (2, 38). Functionally, this suggests that D can act as a means to optimize transcriptional responses as is explored in subsequent sections.
Next, we tested the effect of average nuclear crowding density, ϕin,0, on gene expression sensitivity to changes in chromatin packing scaling D. ChromEM was used to measure average chromatin density for both human lung adenocarcinoma A549 cells and differentiated BJ fibroblast cells, which had a mean CVC of 0.35 and 0.30, respectively (Fig. 2, C and D; distribution of CVC values shown in fig. S3). Approximating for an additional contribution to excluded volume interactions from mobile crowders, estimates of ϕin,0 were 40% in A549 and 35% in BJ cells. Each cell line was treated with 100 nM dexamethasone (DXM) to modulate D, which was measured by PWS. Gene expression of both cell lines with and without dexamethasone treatment was measured by RNA-seq. Sensitivity of gene expression was measured as described above for each cell line. The CPMC model predicts that cells with a lower ϕin,0 would have an attenuated bidirectional Se, an effect confirmed experimentally in the lower chromatin density BJ cells (Fig. 2G). In contrast, the higher chromatin density A549 cells (Fig. 2H) retain an asymmetric response to altered chromatin packing scaling. This suggests that cells with smaller nuclear volume, such as immune cells, or cells with increased chromosome copy number, such as malignant cancer cells, would be predisposed to produce a more pronounced bidirectional response in gene expression to stimuli that alter whole nuclear chromatin packing structure compared to cells with lower chromatin density. These results demonstrate that the net effect of increasing D and ϕin,0 is an increased transcriptional divergence between initially over- and underexpressed genes.
In addition, we tested the roles of Nd on Se. From our model, Nd determines the probability of genes being located on an exposed surface (thus allowing transcription reactions to occur), a relationship that depends nonlinearly on D (Eq. 2). Consequently, the CPMC model predicts that (i) genes in larger packing domains (e.g., Nd > 2 Mbp) would be relatively underexpressed in comparison to those within smaller Nd domains (<50 kbp) and (ii) genes within large Nd domains would be more likely to become enhanced as a function of increasing D (+Se). To test these predictions experimentally, we used the arrowhead function in Juicer tools to measure TAD sizes from Hi-C data of untreated and dexamethasone-treated A549 cells (35). As the dissociation and formation of TADs have been previously shown to alter gene expression, for our analysis, we only selected TADs that were unaltered with dexamethasone treatment. The top 20% largest (~2 Mbp) and bottom 30% smallest (~50 kbp) of these conserved TADs were chosen to produce roughly equal-sized groups of genes (~130 genes per group). Using RNA-seq to measure gene expression and PWS microscopy to measure changes in D before and after dexamethasone treatment, we analyzed the differences in sensitivity of expression between genes localized to smaller 50-kbp TADs compared to larger 2-Mbp TADs (Fig. 2I). As predicted from the CPMC model, in vitro results demonstrate that genes within larger 2-Mbp TADs have an overall higher sensitivity to changes in D (Fig. 2I) while simultaneously having lower initial expression compared to those within smaller 50-kbp TADs. Consequently, these findings suggest a regulatory role of spatially confining genes into self-similar structures, such as those found in TADs, in determining the probability of a gene to be exposed to transcriptional reactants. Given recent work indicating substantial variability in TADs from cell to cell, this would suggest yet another mechanism that cells can use to regulate their functional diversity within a population.
Last, we tested the role of gene length on sensitivity of twofold underexpressed (low) and twofold overexpressed (high) genes in the serum-starved WT HT-29 cells described above. Using the built-in Mathematica function, GenomeData, to obtain sequence length of genes, the sensitivity of gene expression to D was then calculated as a function of gene length. The model predicts that shorter genes have a smaller interaction volume, increasing the variance of crowding conditions these genes are exposed to. Consequently, an increase in D should further increase fluctuations in crowding concentrations surrounding these shorter genes, causing initially underexpressed genes to further reduce their expression in proportion to decreasing gene length L. However, genes with an initially higher expression level will be relatively unaffected by changes in gene length due to more optimal molecular characteristics (e.g., high TF and Pol-II concentrations) and initial crowding conditions these genes are exposed to. In line with the CPMC model, our experimental microarray data demonstrate that shorter, initially underexpressed genes become disproportionately underexpressed as a function of increasing D, whereas length minimally influences initially overexpressed genes (Fig. 2J).
The scaling behavior of chromatin packing regulates phenotypic plasticity through transcriptional divergence and malleability
A major implication of the CPMC model is the role physical chromatin structure plays in shaping gene expression. Thus, the model could provide a mechanistic link between two aspects of phenotypic plasticity of a population of cells: transcriptional malleability and intercellular transcriptional heterogeneity. In this case, we can consider transcriptional malleability to be the average change in expression of a gene in response to an external stimulus, while transcriptional heterogeneity can be thought of as the range in expression levels of each gene across a cell population. While there is likely to be increased complexity that results from cell-to-cell variations in average density and D, we here test how heterogeneity and malleability are influenced by the measurable features of disordered chromatin packing within a cell population. An ideal test bed for this mechanistic integration is cancer. Multiple lines of evidence have shown that chromatin structure is nearly universally transformed in malignancy (39–42). Microscale structural alterations in chromatin are currently the gold standard for histopathological diagnosis of dysplasia and malignancy (39). At the nanoscale, an increase in D has been previously reported to occur at predysplastic stages of lung, colon, esophageal, ovarian, liver, prostate, and pancreatic cancers, while the severity of the chromatin transformation has been shown to be an accurate indicator of the tumor aggressiveness (40, 42). Because (i) elevated D is a hallmark of malignancy, (ii) there is an emergent role of intercellular heterogeneity in determining chemotherapeutic responsiveness, and (iii) cancer cells rapidly alter their gene expression to overcome cytotoxic stressors (14, 16, 43), we hypothesized that cancer cells could leverage physical transformation within the nucleus to gain survival advantages. Therefore, we wanted to test whether cells could use the scaling of chromatin packing as a regulator of both transcriptional malleability and heterogeneity to achieve a rapid response to external stressors.
According to the CPMC model, the dependence of transcriptional malleability on chromatin packing scaling results from the observed asymmetric response of up-regulated and down-regulated genes to changes in D (Fig. 2), which we denote as transcriptional divergence. Here, we focus on changes in gene expression due to an external stimulus. A transcriptional response of a cell to a chemotherapeutic stress provides a case in point. Chemotherapeutic induction of apoptosis has been shown to depend on the rate of change in expression of critical genes (e.g., p53) and not their steady-state levels alone (44). Accordingly, mechanisms that increase the rate of up-regulation of these critical genes would facilitate the development of cellular resilience to stressors. Consider two populations of cells that have a baseline difference in their initial D. These two populations are then exposed to the same exogenous stressor, and a series of stress signaling pathways are activated in an attempt to overcome the perturbation. The cells’ survival now depends, in part, on the increased expression of these genes within a critical time frame. The CPMC model predicts that the population of cells with initially higher D will be more likely to up-regulate these critical genes and remain viable (Fig. 3A).
To quantify the effect of initial D on transcriptional responsiveness, let mRNA1,a be the initial expression (the number of mRNA transcripts) for a given gene in cell a with chromatin packing state Da. At time point t = 0, a stimulus produces an increase in the gene’s rate of expression from E1,a to E2,a. Without loss of generality, we first assume that both expression rate E2,a remains stable and that the rate of mRNA degradation, v, remains constant after stimulation. The relative change in expression at time t is (mRNAa(t) − mRNA1,a)/mRNA1,a = (E2,a/E1,a − 1)(1 − e−νt), where mRNA1,a = E1,a/ν is the prestimulation steady-state expression. This relative change in expression increases with the ratio E2,a/E1,a, which is itself a function of both molecular features and the chromatin packing state surrounding the gene. This can be illustrated by comparing the response of an individual gene to an exogenous stressor in two cells a and b. Let the same gene in both cells be associated with similar molecular features
but different chromatin packing states Da and Db, with Db > Da. From Eq. 4,
, it follows that
(7)where Sei(D) is the sensitivity of expression state Ei,a. In this situation, the effect of D on relative changes in transcription in cell b compared to cell a would be defined as
Within the physiological range of transcription, Se is an increasing function of E (Fig. 2), and as E2 > E1 for both cells, δ > 1. Consequently, the same stimulus will result in enhanced up-regulation of the same gene in cell b compared to cell a, driven by the differences in chromatin packing scaling between the two cells. This effect is expected to be particularly pronounced for initially underexpressed genes with Se1 < 0 that undergo a significant amplification (Se2 > 0) upon stimulation. We see that δ is directly related to transcriptional divergence and the shape of the function Se(E) (Fig. 2). A faster rise of Se as a function of E results in a higher δ. For cells a and b with similar D, δ ≈ 1 + (Se2 − Se1)(Db − Da)/Da. This implies that factors that tend to increase transcriptional divergence (e.g., high D, high crowding density, and small Nd) would be expected to result in a higher transcriptional malleability.
The functional significance of the relative transcriptional malleability coefficient δ is twofold. First, for highly amplified genes (E2/E1 ≫ 1), the relative increase in transcription at any given time after the stimulation is proportional to δ
Second, the time τ required to reach a given level of expression is dependent on chromatin packing scaling and inversely proportional to δ, i.e., τb/τa ≈ δ−1. This conclusion is applicable to genes that are both up-regulated and those that are down-regulated in response to a stimulus, an effect that might be especially consequential if decisions regarding cell fates must be made within a limited time period after the introduction of the stressor (44).
To experimentally explore the relationship between D and phenotypic plasticity, we performed concurrent scRNA-seq and live-cell PWS microscopy experiments on A2780 ovarian adenocarcinoma cells in response to treatment conditions that modulate chromatin packing scaling. We first tested whether chemotherapy treatment of cancer cells resulted in a preselection of high D cells. We measured changes in D using live-cell PWS in A2780 ovarian adenocarcinoma cells before and after treatment with a chemotherapeutic agent, 5 nM paclitaxel, for 48 hours. We also monitored cell coverage, which represents the survival of a cell population. Defining high D cells as those that fall within the top 25th percentile of D in the cell population before paclitaxel treatment (D = 2.47), we then measured the percentage of cells with high D at 48 hours after paclitaxel treatment. We observed that the percentage of high D cells increased in paclitaxel-treated cells compared to the control population (Fig. 3B). In combination with coverage measurements, which demonstrated significant cell death after 48 hours of paclitaxel treatment, our results indicate that high D cells have an increased survival rate when exposed to paclitaxel treatment (Fig. 3, B and C).
We then compared the transcriptional malleability of populations of cells with different initial D. As a model system, we relied on chemically induced modulation in D. To reduce D, we treated A2780 cells with 75 μM celecoxib, a nonsteroidal anti-inflammatory agent for 16 hours. Previously, we have found that celecoxib reduces D within 30 min of treatment in A2780 ovarian carcinoma cells by at least 8% compared to untreated cells (14). As a model of high-D cells, we used untreated A2780 cells. Both celecoxib-treated cells (low D) and untreated cells (high D) were then exposed to a chemotherapeutic agent, 5 nM paclitaxel, for 16 or 48 hours. scRNA-seq was conducted using Illumina NextSeq500. Raw reads were aligned, mapped, and used to calculate transcripts per million (TPM) for each condition using bowtie2 (45) and RSEM (46). Thus, as a model system, we measured transcriptional perturbation induced by a cytotoxic chemotherapy stressor in lower D (celecoxib-treated) versus higher D (untreated) A2780 cell populations.
Inputting the experimentally observed difference in D into the CPMC model, we estimated δ > 4 for initially underexpressed genes that become activated (Fig. 3D, blue manifold) and a smaller increase in δ for initially overexpressed genes that are up-regulated in response to stimulation (Fig. 3D, red manifold). We then tested whether these predicted trends are observed experimentally using scRNA-seq. The crucial window for response to chemotherapy is frequently thought to occur within 24 hours (44). Thus, we compared changes in gene expression in A2780 cells with initially higher D to initially lower D after stimulation by paclitaxel treatment for 16 hours. In agreement with the CPMC model, the stimulation of initially underexpressed genes by chemotherapy treatment in initially higher D cells (up-regulation of expression rate from control rate E1,b to 16-hour paclitaxel–treated rate E2,b) was much higher than that in lower D cells (from celecoxib-treated rate E1,a to 16-hour combo rate E2,a), resulting in δ~4 (Fig. 3E). Likewise, a similar but mitigated effect was observed in initially overexpressed genes (Fig. 3E), in strong agreement with the model predictions. Next, we tested whether these trends were independent of cell line and compound. We performed parallel experiments using propranolol as a D-lowering agent in A2780 cells and celecoxib and propranolol to decrease D in more malignant TP53 mutant A2780 (M248) cells. These additional conditions demonstrated a similar effect of D on transcriptional malleability in response to paclitaxel stimulation of high D compared to low D cells (fig. S4). Last, we tested whether the observed effect of chromatin packing scaling influences genes specifically involved in functionally relevant stress response pathways. We first identified differentially expressed genes that, on average, increased their expression at least twofold in A2780 cells treated with paclitaxel for 48 hours compared to control cells. Gene Ontology (GO) analysis of these up-regulated genes showed the activation of multiple stress response pathways after stimulation by paclitaxel treatment, including DNA repair, autophagy, cell cycle arrest, and apoptosis (P < 0.05; Fig. 3F and fig. S5). The effect of D on the activation of these established stress response genes was consistent with that observed in all up-regulated genes, with δ as high as ~4 (Fig. 3G).
The scaling behavior of chromatin packing regulates phenotypic plasticity through intercellular transcriptional heterogeneity
Another key aspect of phenotypic plasticity that can be modulated by the disordered packing of chromatin is transcriptional heterogeneity or the range of expression levels across genes exposed to similar molecular conditions. The CPMC model predicts that transcriptional heterogeneity increases as a function of D due to increased variations in both packing density (
) and gene accessibility (pg). To quantify this effect from the CPMC model, the variance in ϵ across any given cell population, Varϵ, is (14)
Consequently, intercellular transcriptional heterogeneity, i.e., the SD of steady-state expression rate E in Eq. 3 becomes
(11)and the coefficient of variation (the ratio of the SD to the mean expression) is
. Both H and COV increase with D, and COV also decreases as a function of expression.
To investigate the association between D and intercellular transcriptional heterogeneity, we analyzed our scRNA-seq data to quantify the spread in transcriptional states across each treatment condition. Focusing on overall transcriptional differences between cells within the same condition provides better validation to the model than analyzing the spread of all observed genes. Thus, we first used t-distributed stochastic neighbor embedding (t-SNE) combined with principal components analysis (PCA) to reduce the dimensionality of the system on all cells simultaneously (47). The dimensionality reduction mapped each cell onto a 3D projection. Distances between cells in 3D space represented overall differences in transcriptional states, as has been described by van der Maaten and Hinton (47). Intercellular transcriptional heterogeneity for each cell population was quantified by the average radius of the cluster of cells,
, where ri is the position of each cell in the reduced spaced, N is the total number of cells in each treatment group, and
. Intuitively, Rc can be thought of as the radius of relative genomic space. Consistent with predictions of the CPMC model, we found that transcriptional heterogeneity, as measured by the radius of genomic space, increases with D in response to paclitaxel treatment, which preselects for high D cells, as shown above. Notably, after 48 hours of paclitaxel treatment, the population of surviving cells had both higher D and increased transcriptional heterogeneity compared to control cells (Fig. 4, A to C and F). In contrast, celecoxib treatment reduces average D of a cancer cell population. Accordingly, cells treated with celecoxib for 16 hours had a lower transcriptional heterogeneity compared to control cells. In addition, when these celecoxib-primed cells with initially lower D were treated with paclitaxel for 16 hours, they had a decreased transcriptional heterogeneity compared to paclitaxel-treated control cells (Fig. 4, D to F). Although the resulting projection from t-SNE is nonunique, the trends in the radius of genomic space across conditions are robust to randomly selected choice of seed (fig. S6). Additional analyses quantifying the Euclidean distance between expression of DNA repair genes up-regulated in 48-hour paclitaxel treatment and the coefficient of variation of expression between cells in the same treatment condition demonstrate the same effect of chromatin packing scaling on transcriptional heterogeneity as the t-SNE results (fig. S7).
Next, we sought to investigate the effect of chromatin packing scaling on changes in transcriptional heterogeneity in response to stimulation. For higher D compared to lower D populations, the CPMC model predicts an increase in transcriptional malleability concomitant with an increase in gene expression variability in response to stimulation. As a case in point, consider the up-regulation of stress response genes due to a stressor such as chemotherapy. Both transcriptional malleability and heterogeneity may facilitate a response to the stress. An increase in the average expression (malleability) and in the SD of expression levels (heterogeneity) for these genes upon the stimulation would increase the percentage of cells that express these genes above any given level that may facilitate cell survival, regardless of the exact value of this critical level. We used scRNA-seq data on A2780 cells to analyze the distributions of transcriptional responses to paclitaxel treatment, as an example of an exogenous stressor, in cell populations with different initial D. We assessed the ratio of the up-regulated expression rate due to the stressor versus the initial expression rate (relative up-regulation, E2/E1). Focusing on transcriptional responsiveness of genes associated with DNA repair pathways that had been up-regulated in response to 48-hour paclitaxel treatment, we found that the higher D population had both an increase in the average (malleability) and the variance (heterogeneity) of relative up-regulation compared to the lower D population (Fig. 4G). Next, we examined relative expression levels of genes suppressed in the control condition, specifically those that occupied the bottom 10th percentile of gene expression, and observed a similar behavior (Fig. 4H). This D-dependent increase in intercellular transcriptional heterogeneity was itself a function of initial expression level. Specifically, genes that were underexpressed in cells before paclitaxel treatment had a more significant difference in the heterogeneity of their relative up-regulation in the high D compared to the low D populations than those that were already highly expressed before the stimulation (Fig. 4I), also in agreement with the CPMC predictions.
Transcriptional divergence is inversely associated with patient survival
As described above, D determines a cell’s responsiveness to stressors such as chemotherapeutic agents through the effect of chromatin packing scaling on phenotypic plasticity. A logical next step was to establish whether these physical regulators play a role in tumor aggressiveness in vivo. The effects of chromatin packing scaling on phenotypic plasticity may foster the ability of cancer cells to develop resilience and/or resistance to chemotherapy in vivo and may also be involved in other processes fostering increased tumor fitness and aggressiveness. Throughout carcinogenesis, tumors are frequently exposed to a wide range of stressors including attack by a host’s immune system, inadequate oxygen supply from nearby blood vessels, or an acidotic microenvironment. To test whether such a relationship between phenotypic plasticity and tumor fitness exists, we analyzed publicly available RNA-seq data collected by the TCGA Research Network (20) for lung, colorectal, and breast cancers, the three most prevalent malignancies in the United States. As the model predicts cellular responsiveness to external stressors, of which chemotherapy is an example, we focus on patients presenting with stage III and IV tumors at the time of diagnosis, as systemic therapy is the standard of care for these patients. Using the R package TCGAbiolinks (48), we quantified gene expression in units of fragments per kilobase million (FPKM) for each patient. As these data lack initial control measurements of cancer cells before initiation of systemic therapy, transcriptional malleability cannot be measured directly for each patient. In addition, we do not have information related to chromatin packing scaling and other physical regulators of transcription for these patients. However, the essence of the effect of δ is that elevated D amplifies a gene’s transcriptional response to stimuli: Overexpressed genes are enhanced, whereas underexpressed genes are suppressed (Fig. 2). Consequently, as the bidirectional behavior of Se(E) curves indicates, an elevated D widens the distribution of gene expression, resulting in increased transcriptional divergence, which, in turn, is a key determinant of transcriptional malleability (Fig. 5A). Thus, quantifying transcriptional divergence within these patient cohorts should, by proxy, measure transcriptional responsiveness, which we have shown above is linked to D. Borrowing a method from macroeconomics, transcriptional divergence can be quantified by the ratio of expression of the top 50% of genes and the bottom 50% of genes (P50/P50), for ranked expression, Ek, and total number of detected genes, N
As age is also a major predictor of cancer mortality, we restricted our analysis to patients under 75 years at the time of diagnosis. As the CPMC model predicts that higher transcriptional divergence produces more adaptable tumor cells, we would expect that patients with a shorter survival time would have tumor cells with an elevated P50/P50 ratio at the time of diagnosis. To test this hypothesis, we compared the P50/P50 ratio calculated at the time of diagnosis of patients surviving above or below the median survival time for each cancer type (Fig. 5B). We found a statistically significant inverse relationship between P50/P50 ratio and relative patient survival time for lung (Fig. 5B; P < 0.021), breast (Fig. 5B; P < 0.0001), and colon (Fig. 5B; P = 0.018) cancers.
Next, we analyzed the relative contribution of transcriptional divergence to patient survival time compared to effects of other prognostic factors (e.g., demographic factors, tumor molecular subtype, and stage) by performing a multivariate regression on each prognostic factor. We then calculated the relative survival time (RST) for each patient as the observed survival time relative to the expected survival time based on these additional prognostic factors. RST < 1 indicates that a patient’s survival is shorter than expected (e.g., RST = 0.5 indicates that their survival duration is 50% shorter than expected), whereas RST > 1 indicates a longer than expected survival time. Patients were then grouped into a high and a low P50/P50 cohort based on whether they were in the top or bottom half of P50/P50 values, respectively. Notably, high P50/P50 patients had an RST below 0.8 for all malignancies, whereas a low P50/P50 translated into a significantly higher RST > 1 (P < 0.05). Next, we analyzed the relationship between patient survival and P50/P50 directly for all malignancies. As survival depends on a multitude of factors, some of which were not available within the TCGA dataset for all patients (e.g., comorbidities), a fixed moving window average (MWA) was applied to the data (see Materials and Methods for details). We found a continuous inverse trend between P50/P50 and patient survival for all three malignancies (Fig. 5D and fig. S8). Last, Kaplan-Meier survival curves show that patients with high P50/P50 ratios have a median survival of 8 months compared to 28 months for those with a low P50/P50 (Fig. 5E; P = 0.01). In summary, these results support a strong correlation between transcriptional divergence, a facet of phenotypic plasticity that is directly affected by chromatin packing scaling, and patient survival (Fig. 5).
In this work, we combined multiscale modeling with Hi-C, scRNA-seq, chromatin electron tomography, and live-cell PWS microscopy to demonstrate the role of the disordered chromatin polymer on regulating both intercellular transcriptional heterogeneity and transcriptional malleability. On the basis of predictions from the CPMC model, which were verified experimentally, the spatial arrangement of chromatin packing affects gene expression through a number of physical regulators, including ϕin,0, Nd, and D (Figs. 1 and 2). We demonstrate, both computationally and experimentally, that a crucial role of chromatin packing is to determine the level of phenotypic plasticity within a cell population. In particular, the scaling of chromatin packing, D, modulates both the transcriptional malleability through a chromatin-mediated enhancement δ, a “tailwind effect” (Fig. 3), and the level of intercellular transcriptional heterogeneity (Fig. 4). This effect is further regulated by other physical properties of chromatin. A higher average crowding density within the nucleus suppresses the expression of initially underexpressed genes as D increases (Fig. 2, G and H). The modulatory effects of Nd are twofold. Genes localized to domains with a large Nd (Mbp range) are more suppressed than those localized to domains with small Nd (kbp range) owing to a reduced accessibility to TFs and polymerases. However, as D increases, expression of genes associated with large Nd is disproportionately enhanced (Fig. 2I). Overall, higher D, higher crowding density, and lower Nd increase both transcriptional malleability and heterogeneity, with D having a much larger effect compared to the other two chromatin packing properties.
The fact that eukaryotic cells have encoded information into the scaling behavior of chromatin packing may have important medical implications. Elevated D is a hallmark of cancer cells and could represent a mechanism by which malignancy gains nonmutational advantages over neighboring healthy cells. As observed in vitro, treating cells with a chemotherapeutic agent such as paclitaxel selects for cells with higher D (Fig. 3B), which, as demonstrated within this work, is, in part, due to increased phenotypic plasticity compared to cells with lower D (Figs. 3 and 4). This selects for tumor cell populations with a higher transcriptional adaptive potential, which, in turn, may facilitate their survival despite future exposure to new stressors. In support of this potential mechanism, our data show that transcriptional divergence, the cross-sectional measurement of transcriptional malleability, in advanced colorectal, lung, and breast cancers is associated with worse prognosis independent of demographic factors (e.g., age and gender), tumor stage, and molecular transformations (Fig. 5).
At present, experimental validation of the CPMC model relies on the measurement of average chromatin packing scaling D and crowding within the entire nucleus. While currently beyond existing experimental capabilities, subsequent studies directly comparing how local (e.g., intra-packing domain) chromatin structure affects transcriptional processes and output would be of considerable importance. Pairing gene-tracking techniques such as CRISPRainbow with imaging modalities that measure chromatin structure, such as live-cell PWS microscopy and ChromEM, and super-resolution imaging of molecular factors would help elucidate how intranuclear variations in molecular and physical regulators of transcription contribute to transcriptional heterogeneity and malleability (12, 27, 49).
Although not explored in this work, there are several implications of these results on the understanding of multicellular fitness in the context of cell biology. For example, the localization of genes into domains has been demonstrated to be a conserved, albeit heterogeneous, process (50). From the predictions of the model, cells would benefit from localizing genes into large domains that are intended to be suppressed at baseline but require rapid amplification to adapt to changing environmental conditions. Likewise, crowding density could be adjusted by cells either as a preprogrammed response by changing nuclear volume or incidentally from the retention of an extra chromosome during replication. Consequently, as has been hypothesized, this could be a mechanism linking nuclear size and density (e.g., hyperchromasia) with differential gene expression. Nuclear size, hyperchromasia, and abnormal nuclear texture are some of the most ubiquitous histological markers of neoplasia, although their etiology and functional consequences have been poorly understood (51).
In light of the CPMC model conclusions, it should be clear that disordered chromatin packing does not mean that the configurations are random or that observed patterns in gene transcription are the result of configurational noise. While it is beyond the scope of this work, the conformation of a chromatin polymer depends on the balance between chromatin-chromatin and chromatin-nucleoplasm interactions and is further shaped by active chromatin loop formation processes and other constraints such as nuclear lamins (52). The shape of the disordered chromatin polymer will ultimately depend on molecular factors such as histone modifications, transcriptional and replication-induced supercoiling, and DNA motif stiffness, as well as nucleoplasmic factors such as nuclear pH, ionic concentrations, and crowding, which collectively alter chromatin-chromatin and chromatin-nucleoplasm interactions (26, 53–55). Therefore, individual cells could use a combination of chromatin-chromatin and chromatin-nucleoplasm interactions to appropriately organize the genome while also encoding a predetermined level of phenotypic plasticity.
In addition, this work may have implications on the open question in chromatin biology regarding the importance of noncoding DNA. Some roles have since been illuminated, including the production of noncoding RNA and the distribution of transcriptional regulatory motifs such as enhancers and insulators (21, 56). In light of this work, and in relation to previously suggested hypotheses of the role of macromolecular crowding on gene expression, one of the evolutionary functions of noncoding DNA could be derived from its space-filling role. Consequently, noncoding DNA might be a critical component within the genome to determine phenotypic plasticity as it possesses the ability to modulate transcription reactions by influencing the free energy of transcription reactions and the diffusion of reactants.
Last, one could consider how D plays a role in the adaptability of cancer cells throughout carcinogenesis. Carcinogenesis depends on cells overcoming aberrations in metabolism, derangements of the microenvironment, inadequate vascular supply, immune surveillance, and acclimation to distal tissue environments during metastasis. As it could take multiple replicative generations to develop a new useful mutation within a population for each of these processes, cancer cells could additionally leverage the physical properties of chromatin packing to increase their transcriptional plasticity to acclimate to these conditions over a faster time scale. Thus, it may be worth investigating, for example, whether cancer cells with elevated D are better able to survive an immune response and acclimate to distant tissue sites during metastasis. From the therapeutic standpoint, while mutations are difficult to remove from a cell population, this work suggests that limiting cancer cell evolution might be possible pharmacologically by lowering the scaling of disordered chromatin packing.