Developing insight into the pathogenesis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is critical to overcoming the global pandemic of coronavirus disease 2019 (covid-19). In this study, we applied Mendelian randomization (MR) to systematically evaluate the effect of 10 cardiometabolic risk factors and genetic liability to lifetime smoking on 97 circulating host proteins postulated either to interact with SARS-CoV-2 or to contribute to the maladaptive host response to infection.
We applied the inverse variance weighted (IVW) approach and several robust MR methods in a two-sample setting to systematically estimate the genetically predicted effect of each risk factor in turn on levels of each circulating protein. Multivariable MR was conducted to simultaneously evaluate the effects of multiple risk factors on the same protein. We also applied MR using cis-regulatory variants at the genomic location responsible for encoding these proteins to estimate whether their circulating levels may influence severe SARS-CoV-2.
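As an illustration of the IVW approach described above, the following is a minimal sketch of a fixed-effect IVW estimate computed from per-variant summary statistics. It assumes independent variants and first-order weights (the inverse of the squared standard error of the variant-outcome association). The function name `ivw_estimate` and all input values are hypothetical and are not taken from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    Equivalent to a weighted regression of variant-outcome effects on
    variant-exposure effects through the origin, with weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p_value

# Hypothetical summary statistics for three independent variants (not from the study).
bx = [0.10, 0.20, 0.15]    # variant-exposure effects
by = [0.03, 0.07, 0.05]    # variant-outcome effects
se_y = [0.01, 0.02, 0.01]  # standard errors of the variant-outcome effects

beta_ivw, se_ivw, p_ivw = ivw_estimate(bx, by, se_y)
ci_low, ci_high = beta_ivw - 1.96 * se_ivw, beta_ivw + 1.96 * se_ivw
print(f"IVW beta={beta_ivw:.3f}, 95% CI={ci_low:.3f} to {ci_high:.3f}, P={p_ivw:.2e}")
```

In practice, dedicated MR software would typically be used, and the robust methods mentioned above (which relax the assumption that every variant is a valid instrument) would be run alongside IVW.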
In total, we identified evidence supporting 105 effects between risk factors and circulating proteins which were robust to multiple testing corrections and sensitivity analyses. For example, we found evidence that body mass index (BMI) affects 23 circulating proteins with a variety of functions, including the inflammatory markers C-reactive protein (IVW Beta=0.34 per standard deviation change, 95% CI=0.26 to 0.41, P = 2.19 × 10^-16) and interleukin-1 receptor antagonist (IVW Beta=0.23, 95% CI=0.17 to 0.30, P = 9.04 × 10^-12). Further analyses using multivariable MR provided evidence that the effect of BMI on lowering immunoglobulin G, an antibody class involved in protection from infection, is substantially mediated by raised triglyceride levels (IVW Beta=-0.18, 95% CI=-0.25 to -0.12, P = 2.32 × 10^-8, proportion mediated=44.1%). The strongest evidence that any of the circulating proteins highlighted by our initial analysis influences severe SARS-CoV-2 was identified for soluble glycoprotein 130 (odds ratio=1.81, 95% CI=1.25 to 2.62, P = 0.002), a signal transducer for interleukin-6 type cytokines, which are involved in the inflammatory response. However, based on current case samples for severe SARS-CoV-2, we were unable to replicate these findings in independent samples.
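As a rough consistency check on the odds ratio reported above for soluble glycoprotein 130, the sketch below recovers an approximate standard error and two-sided P value from the odds ratio and its 95% confidence interval. It assumes the interval is symmetric on the log-odds scale and that a normal approximation applies. The helper name `p_from_odds_ratio` is hypothetical, and the result is only approximate because the published inputs are rounded.

```python
import math

def p_from_odds_ratio(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z statistic, and two-sided P from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # The CI width on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return log_or, se, z, p_value

# Values as reported in the text for soluble glycoprotein 130.
log_or, se, z, p = p_from_odds_ratio(1.81, 1.25, 2.62)
print(f"log(OR)={log_or:.3f}, SE={se:.3f}, z={z:.2f}, P={p:.4f}")
```

Under these assumptions, the recovered P value rounds to 0.002, in line with the reported figure.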
Our findings highlight several key proteins which are influenced by established exposures for disease. Future research to determine whether these circulating proteins mediate environmental effects on risk of SARS-CoV-2 infection or covid-19 progression is warranted to help elucidate therapeutic strategies for severe covid-19 disease.
The Medical Research Council, the Wellcome Trust, the British Heart Foundation and UK Research and Innovation.
Research in context
Evidence before this study
It remains unclear why certain individuals develop more severe symptoms of coronavirus disease 2019 (covid-19) than others. However, findings from the literature increasingly suggest that established cardiometabolic risk factors, such as body mass index and smoking, influence the risk of severe infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes covid-19.
Added value of this study
Our study provides a systematic evaluation of the genetically predicted effects of 10 cardiometabolic risk factors on each of 97 unique proteins. Altogether, we found 105 effects which were robust to multiple testing corrections and which may be valuable for future covid-19 research. We also evaluated whether any of these proteins influence risk of SARS-CoV-2, with soluble glycoprotein 130 providing the strongest evidence of a genetically predicted effect using MR. This protein is involved in the interleukin 6 receptor pathway, which plays an important role in the body’s immune response. However, further data are required to robustly support this gene’s putative role in risk of severe SARS-CoV-2.
Implications of all the available evidence
Our findings are important for developing insight into the molecular pathways by which modifiable lifestyle factors influence disease risk. Specifically with respect to severe covid-19, we note that the GWAS datasets of SARS-CoV-2 used in this work will capture genetic effects on increased susceptibility to infection as well as on progression to severe symptoms. This is particularly important when considering the implications of therapeutically targeting any of the proteins highlighted by our study.
Furthermore, despite widespread ongoing biomedical research, it remains unclear why some individuals develop severe symptoms of covid-19 once infected with SARS-CoV-2, whereas an estimated 80% of individuals display either asymptomatic or mild infections.
It is becoming increasingly evident from the literature, however, that established cardiometabolic disease risk factors play a role in the severity of symptoms for SARS-CoV-2.
These include inflammatory cytokines and antibodies (such as immunoglobulin G) which are involved in the immune response to infection, proteins involved in fibrinolysis and blood coagulation, and gene products which have been reported to interact with SARS-CoV-2 proteins in human cells.
A complete list of these proteins can be found in Supplementary Table 1.
As such, genetic variants can be leveraged as instrumental variables to investigate causal relationships between conventional exposures (such as cardiometabolic risk factors) and outcomes (such as circulating proteins) (Fig. 1A). Because these inherited genetic variants are fixed at conception, MR is typically robust to the confounding and reverse causation that can bias observational analyses which do not make use of human genetic data.
This was followed by a series of sensitivity analyses and by multivariable MR to evaluate whether exposures independently influence the same circulating protein or act along overlapping causal pathways. We also sought to investigate the potential effects of proteins highlighted by this analysis on risk of severe covid-19 using data from recently conducted genome-wide association studies (GWAS).
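To make the multivariable MR step more concrete, the sketch below fits a two-exposure multivariable IVW model by weighted least squares through the origin and compares the resulting direct effect with the univariable (total) effect. It is an illustrative sketch only: the function names and all input values are hypothetical, and the proportion mediated is computed as (total - direct) / total, which is one common definition; the text does not state which formula the study used.

```python
def ivw_univariable(bx, by, se_y):
    """Univariable IVW estimate (total effect of one exposure)."""
    w = [1.0 / s ** 2 for s in se_y]
    return sum(wi * x * y for wi, x, y in zip(w, bx, by)) / sum(
        wi * x ** 2 for wi, x in zip(w, bx)
    )

def mvmr_two_exposures(bx1, bx2, by, se_y):
    """Direct effects of two exposures from weighted least squares
    through the origin, solved via the 2x2 normal equations."""
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2  # zero if the two exposures' effects are collinear
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2

# Hypothetical variant effects on two exposures (e.g. exposure 1 and a candidate mediator).
bx1 = [0.10, 0.20, 0.05, 0.15]
bx2 = [0.02, 0.01, 0.08, 0.05]
# Outcome effects simulated without noise from known direct effects 0.2 and 0.5.
by = [0.2 * a + 0.5 * b for a, b in zip(bx1, bx2)]
se_y = [0.01, 0.02, 0.01, 0.015]

total1 = ivw_univariable(bx1, by, se_y)
direct1, direct2 = mvmr_two_exposures(bx1, bx2, by, se_y)
proportion_mediated = (total1 - direct1) / total1
print(f"total={total1:.3f}, direct={direct1:.3f}, proportion mediated={proportion_mediated:.1%}")
```

Because the toy outcome effects are generated without noise, the fitted direct effects recover the simulated values. With real data, standard errors for the direct effects and for the proportion mediated would also be needed.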
We have undertaken a comprehensive Mendelian randomization study to systematically evaluate the effect of 11 established risk factors for disease on circulating levels of proteins related to SARS-CoV-2. Our main findings are that among the modifiable risk factors assessed, BMI and triglycerides showed the widest repertoire of causal effects on these circulating proteins (providing evidence of causation for 23 and 27 effects, respectively). Furthermore, of the circulating proteins investigated by our study, the strongest evidence of an effect on developing severe covid-19 was identified for glycoprotein 130, which is involved in the transmission of molecular signals for inflammatory interleukin cytokines.
It is likely that genetic effects will differ between infection and progression; indeed, they could even be in opposite directions. MR estimates derived using these datasets should therefore be interpreted with this in mind when considering the potential implications of therapeutically targeting any proteins highlighted by this (and similar) studies (Fig. 1B).
and acute inflammatory markers such as fibrinogen [Kaptoge S., White I.R., et al. Associations of plasma fibrinogen levels with established cardiovascular disease risk factors, inflammatory markers, and other characteristics: individual participant meta-analysis of 154,211 adults in 31 prospective studies: the Fibrinogen Studies Collaboration].
Other findings fit with the known biology of cardiometabolic risk factors and proteins identified by our analysis, such as the effect of HDL cholesterol on serum amyloid A-1 and A-2 proteins, which have previously been proposed as clinically applicable surrogates of HDL vascular functionality.
Whilst our results are therefore of immediate importance for SARS-CoV-2 research, they may also be valuable for future endeavors interested in the therapeutic potential of these proteins with respect to a wide range of disease outcomes.
Our results indicate that having a high BMI may reduce levels of circulating IgG, suggesting that people with obesity have less of this class of antibody to help protect from infection. That being said, an important consideration when interpreting this finding is that IgG levels were measured in individuals in a healthy state and can therefore only act as a proxy for the IgG response to infection. Additionally, generic IgG levels were measured rather than the specific adaptive immune response to SARS-CoV-2.
Along with evaluating the effect of modifiable risk factors on antibody-mediated immunity to covid-19, it will be critical to develop insight into how these factors influence cell-mediated immunity, given the emerging importance of the adaptive immune response to SARS-CoV-2.
Its activation is dependent upon the binding of cytokines to their receptors, such as interleukin-6 (IL6) to the interleukin-6 receptor (IL6R).
This is noteworthy due to the extensive interest in repurposing IL6R blockers as a potential therapeutic strategy for SARS-CoV-2.
As lowering the levels of circulating IL6R will lead to lower activation of glycoprotein 130, estimates in this study suggest that this might result in reduced risk of severe SARS-CoV-2 symptoms. These findings therefore corroborate results from a recent MR study which used human genetic data to support the efficacy of IL6R inhibition as a potential treatment option for severe SARS-CoV-2 symptoms.
However, the MR studies to date have not been able to reliably separate influences on risk of becoming infected from risk of progressing to severe disease following infection (Fig. 1B). Thus, they do not provide robust evidence as to whether IL6R inhibition would be expected to favourably influence outcome in severe covid-19. Adequately powered randomized controlled trial data are essential for evaluating the clinical value of therapeutic intervention targeting IL6R.
Larger sample sizes of severe SARS-CoV-2 GWAS in the future should improve the statistical power of our approach. Furthermore, although the plasma protein data come from a sample of unprecedented size relative to previous large-scale pQTL analyses (n = 10,708), this remains modest compared with the sample sizes of the GWAS used to derive instruments for the cardiometabolic exposures in this work. This is exacerbated by the fact that protein MR analyses are typically conducted using instruments relating to a single gene, and these genetic variants will therefore explain a lower proportion of variance in the exposure than instruments taken from across the genome.
Therefore, although we have undertaken thorough evaluations to interrogate bi-directional relationships between the exposures and proteins in this study, the discrepancy between the sample sizes makes the direction of effect challenging to orientate (the majority of exposure instruments were derived using sample sizes of n ≈ 440,000). In addition, although data from plasma pQTL studies provide an exceptional opportunity to leverage instruments for MR studies, it should be noted that plasma may not capture signatures confined to disease-relevant or cell-type-relevant tissues. This is particularly important for a disease with a large autoimmune component such as covid-19, and this caveat should be borne in mind when interpreting the results of our study on proteins such as IgG. Finally, studies need to be conducted on data that allow MR to separately investigate modifiable influences on acquiring SARS-CoV-2 and on progressing to severe covid-19 or death (Fig. 1B).
In conclusion, our MR study identified many effects between conventional risk factors and circulating proteins, which provide a platform for prospective endeavors to dissect related disease pathways. Future research into the role in pathogenesis of the proteins highlighted by this study is warranted to discern whether they may hold therapeutic potential for severe covid-19.