This manuscript (permalink) was automatically generated from meyer-lab/mechanismEncoder@f851f7f on April 6, 2021.
Fabian Fröhlich (ORCID: 0000-0002-5360-4292 · GitHub: FFroehlich · Twitter: fabfrohlich)
Department of Systems Biology, Harvard Medical School
Sara JC Gosline (ORCID: 0000-0002-6534-4774 · GitHub: sgosline · Twitter: sargoshoe)
Pacific Northwest National Laboratories
Jackson L. Chin (GitHub: JacksonLChin)
Department of Bioengineering, University of California, Los Angeles
Emek Demir (ORCID: 0000-0002-3663-7113 · GitHub: emekdemir)
Department of Molecular and Medical Genetics, Oregon Health & Science University
Aaron S. Meyer (ORCID: 0000-0003-4513-1840 · GitHub: aarmey · Twitter: aarmey)
Department of Bioengineering, University of California, Los Angeles; Department of Bioinformatics, University of California, Los Angeles; Jonsson Comprehensive Cancer Center, University of California, Los Angeles; Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles
Proteomic data provides measurements that are uniquely close to the mechanism of action for many cancer therapies. As such, it can provide an unmatched perspective on the mechanisms of drug action and resistance. At the same time, identifying the source of patient-to-patient differences in proteomic measurements and understanding their relevance for drug sensitivity is extremely challenging. Correlative analyses are the most common approach but are difficult to interpret mechanistically.
Proteomic data provides measurements that are uniquely close to the mechanism of action for many cancer therapies. As such, it can provide an unmatched perspective on the mechanisms of drug action and resistance [1,2]. At the same time, identifying the source of patient-to-patient differences in proteomic measurements and understanding their relevance for drug sensitivity is extremely challenging. Correlative analyses are the most common approach but are difficult to interpret mechanistically.
Mechanistic models are uniquely powerful for identifying the drivers of differences within measurements, integrating our prior knowledge, and interpreting data. However, a key question that limits their use for patient data is how to handle patient-to-patient differences. Constructing a separate model for each patient is infeasible because of the limited data available per patient. Alternatively, universal models that combine patient-invariant and patient-specific parameters to integrate data across multiple individuals have been proposed [3]. Estimating these patient-specific parameters remains challenging, however, because genetic and microenvironmental context influences signaling pathways in complex, non-linear, and often poorly understood ways.
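As an illustrative formalization of such a universal model (the notation here is our assumption and is not taken from the cited work), the dynamics for patient $i$ could be written as:

$$
\frac{dx_i}{dt} = f(x_i, \theta, \phi_i), \qquad x_i(0) = x_0(\theta, \phi_i), \qquad \hat{y}_i(t) = h\big(x_i(t), \theta, \phi_i\big)
$$

Here $x_i$ denotes the state of the pathway model (for example, abundances of molecular species), $\theta$ collects the patient-invariant parameters shared by all individuals, $\phi_i$ collects the patient-specific parameters, and $h$ maps model states to the predicted measurements $\hat{y}_i$. The estimation problem described above then amounts to inferring one $\phi_i$ per patient from limited data while $\theta$ is informed by all patients jointly.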
At its core, the challenge of integrating mechanistic models with patient-derived measurements is an issue of how to account for patient-to-patient variation. Mechanistic dynamical models have been widely applied to data of all types but are typically used where the sources of variation among measurements can be explicitly identified and modeled. By contrast, variation among individuals can arise both through factors that are easily identified, like changes in the abundance of the species being modeled, and through countless other molecular and physiological factors that cannot be usefully enumerated in a mechanistic approach. Still, the structure of mechanistic models provides important constraints on the behavior of molecular pathways, as well as interpretability that is missing from purely data-driven statistical methods.
To address this issue, we propose a model structure that is based on a variational autoencoder. Autoencoders are neural networks that embed data into a low-dimensional latent feature space by feeding the data through encoding and decoding layers [4]. The extracted latent features then provide a reduced representation of patient-to-patient similarity. We integrate mechanistic information by partly replacing the decoder layers in the network with a coarse-grained mechanistic model, where the encoded, latent representation of the data defines the patient-specific parameters of the universal ordinary differential equation (ODE) model. We apply this approach to acute myeloid leukemia (AML) patient samples, where proteomic and phosphoproteomic measurements with high tumor purity can be collected. This model structure enables mechanistic interpretation of these data; more robust latent space representations of patient relationships; and integration of prior knowledge, other data sources such as in vitro experiments or other data types, and clinical measurements. Mechanistic autoencoders, therefore, offer a general solution to building mechanistic models in the presence of unexplained sample variation, such as that found in clinical samples.
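The data flow of such a mechanistic autoencoder can be sketched in a few lines of Python. This is a minimal, hypothetical illustration rather than the implementation used in this project: the two-species cascade, the rate constants, the linear encoder, and all function names are assumptions chosen for brevity. The sketch is also deterministic (not variational), replaces the whole decoder with the ODE model, and omits training.

```python
import numpy as np


def encode(y, W_enc, b_enc):
    """Encoder (assumed single layer): measurements -> latent features."""
    return np.tanh(W_enc @ y + b_enc)


def latent_to_params(z, W_par, log_k_shared):
    """Map latent features to positive, patient-specific rate constants.

    log_k_shared is a patient-invariant baseline; W_par @ z shifts it
    per patient on the log scale, so the rates stay positive.
    """
    return np.exp(log_k_shared + W_par @ z)


def simulate(params, t_end=50.0, dt=0.01):
    """Decoder: forward-Euler integration of a toy two-species cascade.

    dx1/dt = k_in  - d1 * x1
    dx2/dt = k_act * x1 - d2 * x2
    """
    k_in, k_act = params      # patient-specific (from the latent space)
    d1, d2 = 1.0, 0.5         # patient-invariant (assumed values)
    x = np.zeros(2)
    for _ in range(int(t_end / dt)):
        dx = np.array([k_in - d1 * x[0], k_act * x[0] - d2 * x[1]])
        x = x + dt * dx
    return x                  # approximately the steady-state abundances


def mechanistic_autoencoder(y, weights):
    """Full pass: data -> latent -> ODE parameters -> reconstructed data."""
    z = encode(y, weights["W_enc"], weights["b_enc"])
    params = latent_to_params(z, weights["W_par"], weights["log_k_shared"])
    y_hat = simulate(params)
    return z, params, y_hat


rng = np.random.default_rng(0)
n_obs, n_latent, n_params = 2, 1, 2
weights = {
    "W_enc": rng.normal(scale=0.1, size=(n_latent, n_obs)),
    "b_enc": np.zeros(n_latent),
    "W_par": rng.normal(scale=0.1, size=(n_params, n_latent)),
    "log_k_shared": np.zeros(n_params),
}

y = np.array([1.2, 3.0])  # made-up measurements for a single patient
z, params, y_hat = mechanistic_autoencoder(y, weights)
loss = float(np.sum((y - y_hat) ** 2))  # reconstruction error to minimize
```

In a trained model, the encoder weights and the shared parameters would be optimized jointly over all patients to minimize the reconstruction error, which requires gradients through the ODE solution. The sketch only evaluates a single forward pass and its loss.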
Initial plot of proteomic data (clustergram?) - see #23 Data-driven selection of network nodes from OHSU
Training against actual data Description of fit model
Cell line perturbation Description of that data
Model/validation comparison
FROM PROPOSAL
This project has the potential to enable routine use of mechanistic models to analyze clinical proteomics measurements. One can readily envision applying a similar technique across many different cancer types as well as other diseases. It is hard to overstate the potential impact, as this approach can (1) convert these measurements into exacting predictions of which components to target in individual patients and (2) provide a mechanism-grounded view of patient-to-patient variation.
Sara should likely fill this in.
This would be Jackson’s work.
Fabian knows this.
This work was supported by an administrative supplement to NIH U01-CA215709 to A.S.M. The authors declare no competing financial interests.
1. McDermott, J. E. et al. Proteogenomic Characterization of Ovarian HGSC Implicates Mitotic Kinases, Replication Stress in Observed Chromosomal Instability. Cell Reports Medicine 1, 100004 (2020).
2. Clark, D. J. et al. Integrated Proteogenomic Characterization of Clear Cell Renal Cell Carcinoma. Cell 179, 964–983.e31 (2019).
3. Fröhlich, F. et al. Efficient Parameter Estimation Enables the Prediction of Drug Response Using a Mechanistic Pan-Cancer Pathway Model. Cell Systems 7, 567–579.e6 (2018).
4. Hinton, G. E. & Salakhutdinov, R. R. Reducing the Dimensionality of Data with Neural Networks. Science 313, 504–507 (2006).