Appraising the Causal Role of Risk Factors in Coronary Artery Disease and Stroke: A Systematic Review of Mendelian Randomization Studies

BACKGROUND Mendelian randomization (MR) offers a powerful approach to study potential causal associations between exposures and health outcomes by using genetic variants associated with an exposure as instrumental variables. In this systematic review, we aimed to summarize previous MR studies and to evaluate the evidence for causality for a broad range of exposures in relation to coronary artery disease and stroke. METHODS AND RESULTS MR studies investigating the association of any genetically predicted exposure with coronary artery disease or stroke were identified. Studies were classified into 4 categories built on the significance of the main MR analysis results and its concordance with sensitivity analyses, namely, robust, probable, suggestive, and insufficient. Studies reporting associations that did not perform any sensitivity analysis were classified as nonevaluable. We identified 2725 associations eligible for evaluation, examining 535 distinct exposures. Of them, 141 were classified as robust, 353 as probable, 110 as suggestive, and 926 had insufficient evidence. The most robust associations were observed for anthropometric traits, lipids and lipoproteins, and type 2 diabetes with coronary artery disease; clinical measurements with coronary artery disease and stroke; and thrombotic factors with stroke. CONCLUSIONS Despite the large number of studies that have been conducted, only a limited number of associations were supported by robust evidence. Approximately half of the studies reporting associations presented an MR sensitivity analysis along with the main analysis that further supported the causality of associations. Future research should focus on more thorough assessments of sensitivity MR analyses and further assessments of mediation effects or nonlinearity of associations.

However, beyond these conventional risk factors, an ever-expanding list of exposures and their associations with cardiovascular manifestations is being explored in the medical literature.
Despite the volume of research, the causality of associations between risk factors and cardiovascular outcomes remains inconclusive for the majority of exposures, as observational associations are hindered by confounding and reverse causation and evidence from randomized controlled trials (RCTs) is relatively scarce. 3 The Mendelian randomization (MR) approach can potentially overcome some biases of traditional epidemiological research by using genetic variants robustly associated with the risk factor of interest and assessing whether these variants are associated with the outcome of interest. The MR method can address bias attributed to confounding because genetic variants are randomly allocated when alleles are passed from parents to offspring during meiosis. MR studies can therefore be thought of as randomly assigning participants based on the presence of alleles that influence the risk factors of interest and subsequently investigating whether carriers of genetic variants associated with the risk factor have different disease risks compared with noncarriers. In addition, as genetic variants are fixed at conception and cannot be modified by the presence of disease, MR associations are not influenced by reverse causality. Because of these appealing properties, and as genome-wide association studies (GWAS) provide genetic associations for numerous traits and risk factors, MR is becoming an increasingly popular method to study the potential causal associations between different exposures and cardiovascular outcomes.
In this study, we present the first effort to systematically collect and appraise MR studies investigating any risk factor in relation to CAD and stroke. Our aim was to present the breadth and depth of exposures studied, identify areas of research focus and highlight gaps, and appraise the current evidence supporting their causal role in developing CAD and stroke.

METHODS
The data and materials that support the findings of this study are available in Data S1. Institutional review committee approval and consent by participants were not required because the current study is based exclusively on summary-level data from previously published studies.

Search Strategy
A systematic literature search was conducted independently by 2 researchers (A.N.G. and N.K.) on Medline (via PubMed) from inception to May 2022 for the identification of studies using the MR approach investigating causal risk factors for CAD or stroke. The following algorithm was used: "(Mendelian Randomization OR Mendelian Randomisation or genetic instrument) AND (Cardiovascular OR Stroke OR Coronary Heart OR Coronary Artery OR Myocardial Infarction)." We also screened the references of relevant reviews and the references of the included studies. The screening process is shown in Figure 1. The inclusion and exclusion criteria are described in detail in Data S1.

Data Synthesis and Evaluation of Robustness
Based on the extracted information, we presented the basic characteristics of the identified MR analyses. Main findings were categorized by risk factor and risk factor category. The robustness of the evidence (robust, probable, suggestive, and insufficient evidence) was assessed through a priori defined criteria (Figure S1, Data S3) 6 based on previous recommendations. 7 We grouped MR studies into polygenic (trans) MR studies (studies that use variants from multiple regions of the genome associated with the risk factor of interest) and monogenic (cis) MR studies (studies using biological knowledge and variants from a single-gene region associated with the risk factor of interest). For example, an MR analysis for CRP (C-reactive protein) can be monogenic, and therefore conducted using variants in the CRP gene only, or polygenic, and therefore conducted using all independent genome-wide significant variants associated with CRP. 7
For polygenic MR studies, we based the evaluation on the results of the main MR analysis and the sensitivity analyses (eg, MR-Egger, weighted median, MR-PRESSO). The sensitivity analyses are used to check potential violations of the assumptions of the MR methodology. Evidence for causality was therefore considered stronger when a sensitivity analysis was reported and was supportive of the main analysis findings, as MR investigations that do not perform ≥1 sensitivity method may be viewed as having incomplete evidence. Specifically, the associations were considered as having robust evidence for causality when all methods had concordant direction of effect estimates and both the main analysis and at least 1 sensitivity analysis achieved statistical significance (P<0.05). When studies adjusted their results for multiple testing, we used the P value threshold after the adjustment to define statistical significance; otherwise, we used a nominal significance level (P<0.05). When a P value was not reported for the main MR estimate, we calculated it using the effect size and standard error. When studies also reported analyses excluding genetic variants with evidence of pleiotropy, we considered those as the main analysis because they better satisfy the MR assumptions. The term robust refers to evidence of causality for the studied associations, not the quality of the analysis. An association was supported by probable evidence for causality when at least 1 method (main or sensitivity analysis) achieved statistical significance and the direction of the effect estimate was concordant in all methods. Suggestive evidence for causality was achieved when at least 1 method had a statistically significant P value but the direction of the effect estimates differed between methods. Associations that presented nonsignificant P values for both the main analysis and sensitivity analyses were classified as insufficient evidence for causality. Polygenic MR studies that did not report any sensitivity analyses were nonevaluable based on the aforementioned grading scheme, which focuses on evaluating the robustness for causality of the studied associations.
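The grading rules for polygenic MR associations can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the code used in the review (the analyses were done in R): the function names `wald_p_value` and `grade_polygenic`, the tuple layout, and the use of a two-sided normal approximation to derive a P value from an effect size and standard error are all assumptions made for this example.

```python
import math


def wald_p_value(beta, se):
    """Two-sided P value from an effect size and its standard error.

    Assumes a normal approximation (an assumption of this sketch).
    """
    z = abs(beta / se)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2))
    return math.erfc(z / math.sqrt(2))


def grade_polygenic(main, sensitivity, alpha=0.05):
    """Grade one polygenic MR association.

    main        -- (beta, p) for the main analysis
    sensitivity -- list of (beta, p), one per sensitivity method
    alpha       -- significance threshold; pass the study's own
                   multiple-testing threshold when it reported one
    """
    if not sensitivity:
        return "nonevaluable"  # no sensitivity analysis reported

    methods = [main] + list(sensitivity)
    # Concordant direction: every estimate has the same sign
    # (an estimate of exactly 0 is treated as discordant here).
    concordant = (all(b > 0 for b, _ in methods)
                  or all(b < 0 for b, _ in methods))
    main_sig = main[1] < alpha
    sens_sig = any(p < alpha for _, p in sensitivity)

    if not (main_sig or sens_sig):
        return "insufficient"
    if not concordant:
        return "suggestive"
    if main_sig and sens_sig:
        return "robust"
    return "probable"
```

For instance, `grade_polygenic((0.2, 0.001), [(0.18, 0.03), (0.25, 0.2)])` falls in the robust category because all estimates share a direction and both the main analysis and one sensitivity analysis are significant.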
Monogenic MR studies included MR analyses examining only one single-nucleotide polymorphism (SNP) or single-gene regions to define the risk factor (instrumental variable of interest). Most of these studies could not perform sensitivity analyses as the number of genetic variants was small. We assessed the robustness of these results based on whether the authors also performed colocalization analysis. 8 Colocalization assesses whether the same genetic variant (or variants) influences 2 traits and is useful when MR is based on a single-gene region. 7 Finally, we further assessed the reporting of all MR studies using the Strengthening the Reporting of Mendelian Randomization Studies guidelines. 4,5 All statistical analyses were done with R 4.1.0.
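For a single-variant analysis, the causal estimate is commonly obtained as a ratio of the variant-outcome to the variant-exposure association (the Wald ratio). The sketch below is illustrative only and is not taken from the reviewed studies: the function names are hypothetical, the standard error is a first-order approximation that ignores uncertainty in the variant-exposure estimate, and the outcome association is assumed to be on the log-odds scale.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-variant MR estimate from summary statistics.

    Returns (estimate, standard_error). The standard error is a
    first-order approximation (assumption of this sketch).
    """
    if beta_exposure == 0:
        raise ValueError("variant must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def to_odds_ratio(estimate, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% CI."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))
```

Because a single variant yields only one ratio, no between-variant consistency check is possible, which is why complementary evidence such as colocalization is informative in this setting.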

Eligible Studies
The literature search yielded 3980 articles, of which 586 were evaluated in full text, and of them, 391 publications were deemed eligible (see full list in Tables S1 and S2, respectively). The majority of studies were published from 2018 onward (Figure S2).

Description of Study Characteristics
Of 391 MR publications, 317 studied CAD as the outcome of interest, 175 stroke, and 102 both outcomes. Overall, the 391 publications included 2725 different MR analyses examining 535 unique exposures, 482 in relation to CAD and 268 in relation to stroke, covering a broad range of biomarkers, physical measurements, traits, and diseases (Figure 2, Table S3). Many risk factors have been examined in multiple MR publications (Table S4), and the most commonly studied risk factors were low-density lipoprotein cholesterol for CAD (26 articles) and body mass index for stroke (10 articles). There were 2122 polygenic MR analyses and 596 monogenic MR analyses. The median number of SNPs used to genetically predict the risk factor of interest in polygenic MR studies was 13, ranging from 2 to 3188 SNPs (Table S5). The median sample size for the exposure genetic analysis was 81 807 (with the smallest being 272 for phospholipase A2 and the largest 1 887 658 for COVID-19 severity). GWAS summary statistics for the exposures were derived from European (87.5%) and multiethnic (9.9%) ancestry populations. For the outcomes, 65.9% of the associations were derived from European and 32.5% from multiethnic populations. Most MR analyses used or included CARDIoGRAMplusC4D (Coronary Artery Disease Genomewide Replication and Meta-Analysis Plus the Coronary Artery Disease Genetics consortium) when CAD was the outcome of interest (822 of 1478 MR analyses) and the MEGASTROKE consortium when stroke was the outcome of interest (914 of 1214 MR analyses). Only 65 MR analyses were based on 1-sample MR designs.

Evaluation of the Robustness of Causal Associations
Table S6 lists the main characteristics of each eligible MR analysis and its subsequent grading category based on the robustness of evidence, whereas Table S7 summarizes the grading categories. Of the 2122 polygenic MR associations, 20 analyses were based on 2 genetic variants (in different gene regions) and sensitivity analysis could not be performed. From the remaining 2102 MR analyses, 1479 (70%) presented results on both the main and at least 1 sensitivity analysis and were eligible for evaluation. Inverse variance weighted (IVW) was the main analysis in the majority of associations examined (N=1931, 92%). Of the 1530 associations reporting main and sensitivity analyses, we found 141 robust associations (median N SNPs =110), 353 probable associations (median N SNPs =71), 110 suggestive associations (median N SNPs =73), and 926 associations with insufficient evidence (median N SNPs =21).
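To make the relationship between a main analysis and a sensitivity analysis concrete, the following is a minimal sketch of a fixed-effect IVW estimate and an MR-Egger regression computed from per-SNP summary statistics. It is an illustration under simplifying assumptions (uncorrelated variants, weights based only on the SNP-outcome standard errors, no standard error returned for MR-Egger), written in Python rather than the R software used in the review; the function and variable names are hypothetical.

```python
import math


def ivw(bx, by, se_y):
    """Fixed-effect inverse variance weighted estimate.

    bx, by -- per-SNP associations with exposure and outcome
    se_y   -- per-SNP standard errors of the outcome associations
    Returns (estimate, standard_error).
    """
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_y))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, se_y))
    return num / den, math.sqrt(1.0 / den)


def mr_egger(bx, by, se_y):
    """Weighted regression of by on bx with an intercept.

    Returns (slope, intercept). A nonzero intercept suggests
    directional pleiotropy; the slope is the causal estimate.
    """
    # Orient every SNP so its exposure association is positive.
    rows = [(abs(x), y if x >= 0 else -y, 1.0 / s ** 2)
            for x, y, s in zip(bx, by, se_y)]
    sw = sum(w for _, _, w in rows)
    swx = sum(w * x for x, _, w in rows)
    swy = sum(w * y for _, y, w in rows)
    swxx = sum(w * x * x for x, _, w in rows)
    swxy = sum(w * x * y for x, y, w in rows)
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return slope, intercept
```

When every variant affects the outcome only through the exposure, the two methods agree; when all variants carry an additional direct effect on the outcome, the IVW estimate is biased while the MR-Egger slope is not, which is the kind of discordance the grading scheme is designed to detect.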
Overall, 276 MR analyses reported multivariable MR, which examines multiple risk factors (exposures) simultaneously and estimates the independent causal effect of each of the risk factors. Genetically predicted body mass index, smoking, and lipid levels were common risk factors adjusted in multivariable MR.
There were also 596 monogenic MR associations examining genetic variants within a single-gene region as instrumental variables for the risk factor of interest. Of them, 447 were based on a single genetic variant analysis only (Figure S3). Among the 596 associations, 219 reported statistically significant results in the main analysis. Of them, 24 analyses performed colocalization analyses with the outcome of interest, of which 2 found evidence for colocalization between the risk factor and the outcome (ICA1L [islet cell autoantigen 1-like protein] for stroke and NK3R [neurokinin 3 receptor] for CHD).
A graphical overview of the robustness of the evidence per exposure category and CVD group is presented in Figure 3. The exposure category with the most robust associations was anthropometry (N=28), followed by lipids and lipoproteins (N=24). There were 141 robust polygenic MR associations pertaining to 53 different risk factors, as illustrated in detail in Table S8 and Figure 4 (35 risk factors for CAD and 22 risk factors for stroke). Almost all studies that showed robust evidence of association between an exposure and CAD or stroke had a good reporting score on the Strengthening the Reporting of Mendelian Randomization Studies guidelines, with the exception of 3 studies (Table S9). 60,71,73 Apart from conventional cardiovascular risk factors such as blood pressure traits, cholesterol levels, type 2 diabetes, obesity, and smoking, robust positive associations were observed between genetically predicted calcium (OR IVW, 1.66 [95% CI, 1.12-1.81]), lymphocyte count (OR IVW, 1…
Of the robust associations, 24 reported multivariable MR analyses adjusting for potential mediators. Of these, 2 associations were attenuated to the null: linoleic acid and stroke after adjusting for low-density lipoprotein cholesterol, and forced vital capacity and CHD after adjusting for height. The remaining multivariable analyses attenuated the estimates, suggesting different levels of mediation; however, statistical significance was retained (Table S10).

DISCUSSION
In this systematic review, we summarized the evidence for associations between genetic predisposition to 535 risk factors and CAD or stroke examined in 391 publications covering 2725 MR associations. Using a set of predefined criteria, we found robust evidence for …
This large body of published MR analyses highlighted several reporting limitations also observed in a previous systematic review of MR studies on cancer outcomes. 6 Approximately half of the associations included sensitivity analyses, which are important to assess the assumptions of the method and therefore the robustness of the results. The lack of sensitivity analyses was often because studies were published early, before MR sensitivity methods became available, or because they were monogenic (single-gene) or single-variant MR studies in which sensitivity analyses were not feasible because of the small number of instrumental variables. In the latter case, colocalization can be used to investigate whether the exposure and the outcome share a causal variant in the genetic region, but it was rarely performed in the examined MR studies. Again, this may be attributed to the fact that colocalization was only suggested recently as an additional method to support monogenic MR investigations and a large proportion of these studies were published earlier. Genetic variants typically explain only a small proportion of the variation in the relevant exposure of interest, and as a result low statistical power is common in MR studies. The MR studies examined here rarely reported power estimates or the variance explained by instrumental variables, and therefore it was difficult to conclude whether nonsignificant associations were true null findings. 77 The recent publication of the MR reporting guidelines (the Strengthening the Reporting of Mendelian Randomization Studies statement) should improve the reporting standards of MR studies and further enhance the robustness and interpretability of MR findings. 5 Finally, MR investigations are dependent on the primary GWAS sources, the quality of which was not assessed in this work. However, underlying GWAS quality is unlikely to lead to false positive results.
A considerable proportion of the studies provided supporting evidence for causal associations between the so-called conventional CVD risk factors and CVD events. We identified studies with robust evidence for causal associations between genetically predicted low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, triglycerides, apolipoprotein B, blood pressure, type 2 diabetes, and glycemic traits such as hemoglobin A1c and insulin resistance with CAD and stroke. This supports the extensive evidence from traditional epidemiological studies, experimental studies, and RCTs examining these risk factors. However, MR provided additional valuable information such as examination of comparative effects between correlated risk factors, estimation of nonlinear effects, and interactions with other factors. For example, multivariable MR analyses on several lipids and lipoproteins highlighted the central role of apolipoprotein B compared with other lipids in ischemic stroke. 53 Similarly, the MR paradigm generated evidence supporting an effect of midlife blood pressure on later life CAD risk independent of later life blood pressure. 41
Measures of anthropometry have also been extensively studied in the MR context in relation to CAD and stroke. Beyond body mass index, which showed robust causal associations with CAD and stroke, height was also highlighted as a potentially causal factor for CAD. Genetically predicted height also mediated at least partly the association between lung function measured by forced vital capacity and CHD and stroke. Although several observational studies have reported that shorter height is associated with higher CVD risk, the magnitude of this association has been controversial, 78 and the mechanisms underlying this inverse association are not well understood. 79 One proposed explanation is that shorter individuals have on average smaller vessel diameters, which can lead to increased arterial occlusive events. 80,81 There was also both robust and probable evidence for a protective association between higher birth weight and CAD and stroke, respectively, supporting the fetal developmental origins of CVD. 26,82 Interestingly, further investigation into the fetal or maternal components of instrumental effects on birth weight showed robust evidence for an association between lower birth weight, determined by the maternal rather than the fetal genome, and stroke and its subtypes in later life. 51
Lifestyle is an important area of MR research in CVD as RCTs are often inappropriate or infeasible, and evidence stemming from MR is vital to support causality. Smoking behavior showed robust evidence for a causal association with CVD in agreement with overwhelming evidence from observational epidemiology. 83 … 86-89 Observational epidemiology has often suggested a possible protective effect of moderate alcohol consumption on CVD. Dose-response MR analyses did not support this conclusion, but they found evidence of a dose-response relationship between alcohol and risk of stroke. 90
Educational attainment was reported to have a protective role for both CAD and stroke, exhibiting robust evidence of association in MR studies. 15,28,33 Traditional observational studies and MR mediation analyses have shown that body mass index, systolic blood pressure, and smoking behavior mediate a substantial proportion of the protective effect of education on the risk of CVD outcomes. 91,92
Despite the research interest in diet and CVD, there were few robust or probable associations between nutrients or dietary traits and CVD outcomes. This is partly expected because few SNPs are available to instrument nutrients and other dietary traits and because many dietary traits have small heritable components, both of which lead to underpowered MR studies.
Many MR analyses concentrate on the causal association between biomarker levels and CVD to identify novel treatment targets for the disease. Of them, genetically predicted plasma sIL6R (soluble IL6R), an IL6 (interleukin 6) signaling biomarker, showed robust evidence for an inverse association with CAD 36 and stroke, 60 supporting a key role of inflammation in CVD, which also has supportive RCT evidence. 93
Thrombotic factors are implicated in the coagulation cascade and, along with inflammatory factors, contribute to the suppression of pathogens entering the host, a mechanism termed immunothrombosis. The aberrant activation of immunothrombosis has been associated with increased risk for myocardial infarction, stroke, and venous thromboembolism. 94 This association is supported by MR evidence. A robust positive association was observed between vitamin K and large vessel stroke as well as between 2 enzymes of the coagulation cascade (ie, factor XI and factor VII) and ischemic stroke. 19,22,30 … 99 Circulating calcium levels are thought to increase CAD risk through vascular calcification 100,101 or via the upregulation of the coagulation pathway, which in turn is associated with CVD risk. 102 Finally, iron, ferritin, and transferrin saturation, biomarkers of iron metabolism and intake, 103 showed robust positive causal associations with risk of stroke in MR analyses, and this effect was suggested to be driven by an increased risk of cardioembolic stroke. 21 The latter, along with the absence of an association with CAD, may indicate that the effect of iron on stroke is through thrombus formation rather than atherosclerosis. 104

STRENGTHS AND LIMITATIONS
In the current systematic review, we summarized all previously published MR studies for all genetically determined exposures and their association with CVD risk. A clear categorization scheme and evaluation criteria were applied to further examine the robustness and credibility of the resulting associations. Other efforts to summarize the evidence of MR analyses on CVD risk have been performed in the past. However, they were either limited to specific exposures 105 or used a more narrative approach of presenting and assessing the MR results, 106 and none performed a formal evaluation of the evidence.
Some limitations need to be acknowledged. Some relevant MR studies may have been missed through our search strategy, especially if the MR analysis was not the primary focus but only a supplementary analysis, which seems to be increasingly common in recent GWAS. In the absence of comprehensive MR guidelines, we based our evaluation of the evidence of causality on an adapted set of previously proposed criteria. This approach did not allow us to investigate MR studies presenting only a main analysis without sensitivity analyses. Sensitivity analyses increase the credibility of the findings as they test various MR assumptions. However, many studies did not present them because they were published before such analyses were introduced in the literature or were based on monogenic associations with a small number of SNPs, which did not allow sensitivity analyses. For the latter associations, we based our evaluation on the availability of colocalization analysis, which was again introduced only recently. Therefore, the evaluation criteria for this systematic review were designed mainly for the assessment of the evidence that resulted from the MR analyses and not for the assessment of the quality of the studies. Although many studies included instrumental variables from the largest available GWAS for the exposure traits, the SNPs explained a small percentage of the variance, and therefore some studies were underpowered. Finally, information on the statistical power of the instrument was often not reported, and therefore the grading scheme used could not distinguish between MR analyses with robust evidence of lack of association and MR analyses that did not present an association because of lack of power.

CONCLUSIONS
MR studies have contributed a large body of evidence supporting the causal association between risk factors and CVD. Although many studies concentrated on CVD risk factors known to be causally associated with CVD through RCTs, MR provided further confirmation of previous associations and supported evidence for potentially novel causal risk factors. Despite the plethora of MR investigations in CVD, the number of associations with robust evidence for causality was modest. Those associations concentrated around conventional risk factors for CVD, inflammation and thrombotic factors, and indices of anthropometry; they showed a large overlap between risk factors for CAD and stroke and highlighted the different risk factor profiles between stroke subtypes. As GWAS investigations of exposures become larger, novel exposures are measured in epidemiological settings, and novel MR methodologies are published, the contribution of MR in establishing causal associations and prioritizing RCTs is expected to grow further.

Figure 1. Flowchart of systematic literature search.

Figure 2. Number of Mendelian randomization associations extracted from eligible publications according to different exposure categories for coronary artery disease (CAD) and stroke.

Figure 3. Evidence map for eligible Mendelian randomization studies per exposure category for CAD and stroke. DNA methylation was not included in the diagram because of the limited number of analyses for the specific exposure category. Nonevaluable evidence level includes associations for which a sensitivity analysis was not feasible (eg, single genetic variant analyses). CAD indicates coronary artery disease; IVW, inverse variance weighted; OR, odds ratio; Ref, reference number; SNP, single-nucleotide polymorphism; and WM, weighted median.

Figure 4. Forest plot showing the identified robust associations between exposures and CAD (A) and stroke (B). When >1 study exhibited a robust association with CAD or stroke, the most recent study with the largest sample size (exposure) was selected. The number in square brackets corresponds to the reference that examined the relevant exposure. ApoE indicates apolipoprotein E; CAD, coronary artery disease; CIS, cardioembolic ischemic stroke; FVC, forced vital capacity; IS, ischemic stroke; IVW, inverse variance weighted; LVIS, large vessel ischemic stroke; MI, myocardial infarction; OR, odds ratio; Ref, reference number; RM, repetition maximum; SNP, single-nucleotide polymorphism; and WM, weighted median.