Metabolomic investigation of major depressive disorder identifies a potentially causal association with polyunsaturated fatty acids

BACKGROUND: Metabolic differences have been reported between individuals with and without major depressive disorder (MDD), but their consistency and causal relevance have been unclear. METHODS: We conducted a metabolome-wide association study of MDD with 249 metabolomic measures available in the UK Biobank ( n = 29,757). We then applied two-sample bidirectional Mendelian randomization and colocalization analysis to identify potentially causal relationships between each metabolite and MDD. RESULTS: A total of 191 metabolites tested were signi ﬁ cantly associated with MDD (false discovery rate – corrected p , .05), which decreased to 129 after adjustment for likely confounders. Lower abundance of omega-3 fatty acid measures and a higher omega-6 to omega-3 ratio showed potentially causal effects on liability to MDD. There was no evidence of a causal effect of MDD on metabolite levels. Furthermore, genetic signals associated with docosahexaenoic acid colocalized with loci associated with MDD within the fatty acid desaturase gene cluster. Post hoc Mendelian randomization of gene-transcript abundance within the fatty acid desaturase cluster demonstrated a potentially causal association with MDD. In contrast, colocalization analysis did not suggest a single causal variant for both transcript abundance and MDD liability, but rather the likely existence of two variants in linkage disequilibrium with one another. CONCLUSIONS: Our ﬁ ndings suggest that decreased docosahexaenoic acid and increased omega-6 to omega-3 fatty acids ratio may be causally related to MDD. These ﬁ ndings provide further support for the causal involvement of fatty acids in MDD.

https://doi.org/10.1016/j.biopsych.2023.01.027 Major depressive disorder (MDD) is a debilitating condition estimated to affect 322 million people, leading to a global total of 50 million years lived with disability (1). This is due to both the high prevalence of MDD and its frequent comorbidity with other conditions (2)(3)(4), particularly cardiovascular disease (5). Most antidepressants target monoamine-related pathways (6) but are ineffective in 40% of cases (7). Better understanding of the molecular mechanisms of MDD may aid in the development of more effective treatments.
Twin studies have estimated the heritability of MDD as w37% (8), and investigating this genetic component of MDD may help to identify its pathophysiological basis. Recently, genome-wide association studies (GWASs) have been increasingly successful in identifying genetic variants associated with MDD (9)(10)(11). Despite this success, the high polygenicity of MDD presents significant challenges to researchers who wish to identify actionable mechanisms lying between genetic variants and clinical phenotype (12). One means of translating genetic findings into biomarkers and biological mechanisms is through the incorporation of molecular phenotype data, such as metabolomics. These data may help to clarify the downstream functional impact of identified risk single nucleotide variants (SNVs) on molecular traits. This information can then be utilized to stratify risk, provide therapeutic targets, or enable lifestyle interventions (13).
There have been several recent studies investigating the links between genetic risk for MDD and molecular phenotypes, including DNA methylation (14,15), messenger RNA (mRNA) levels (16,17), and protein markers (18,19). Metabolomics considers the role of circulating metabolites (20), including lipids, amino acids, and other small molecules that have been implicated in a range of disorders and some hypothesis-driven studies of MDD (21,22). The metabolome has not been characterized to the same extent as other molecular phenotypes (23), but the recent availability of reproducible high-throughput metabolomics makes studies feasible for many thousands of individuals (24). Studying the association between metabolite levels and MDD may offer potential biomarkers and interventional targets for MDD and provides a means to investigate MDD's comorbidity with cardiometabolic disorders (2,3). Most previous metabolic studies of MDD have concentrated on small numbers of metabolites (25), often used sample sizes of ,100 (25)(26)(27)(28)(29), and implemented a variety of methodological approaches all leading to somewhat heterogeneous findings (30). However, larger and more standardized studies are beginning to emerge; a recent meta-analysis of 230 metabolite levels in 15,428 individuals (5283 cases and 10,145 controls) found evidence of a distinct metabolomic profile associated with MDD (31).
The UK Biobank (UKB) recently released a large metabolomic dataset (32), in which we conducted a comprehensive and agnostic examination of metabolite associations with MDD. Association analysis between metabolites and MDD is limited because of the susceptibility of these analyses to confounding and reverse causation. We therefore also assessed potentially causal associations between MDD and metabolite levels, using two-sample Mendelian randomization (MR). Finally, we sought to identify potentially causal gene targets using MR and colocalization analysis of expression quantitative trait loci (eQTL) ( Figure S1 in Supplement 1).

Study Population
The UKB is a large prospective study (N = 502,492) of participants recruited from around the United Kingdom between the ages of 40 and 69 (33). Baseline data collection, including genotyping and extensive phenotyping for a range of health outcomes, occurred between 2006 and 2010. UKB data acquisition was approved by the North West National Health Service Research Ethics Committee (11/NW/0382). The analysis and data acquisition for the present study were conducted under application #4844. Written informed consent was obtained from all participants.

Metabolic Biomarkers
A total of 249 metabolite biomarkers were quantified in nonfasting baseline EDTA plasma samples (n = 118,021) (33), collected at baseline assessment between April 2007 and December 2010 (34), and measured between June 2019 and April 2020 (31). Measurement was performed using the highthroughput nuclear magnetic resonance (NMR) spectroscopy profiling platform, developed by Nightingale Health Ltd. Detailed methodology of the Nightingale NMR pipeline has been described elsewhere (35)(36)(37). These biomarkers include detailed cholesterol measures, fatty acid compositions, and low-weight metabolites such as amino acids and ketones (Table S1 in Supplement 2). Most biomarkers were measured in absolute concentration (mmol/L), excluding apolipoproteins A and B (measured in g/L), diameter measures (diameter of very-low-density lipoprotein/low-density lipoprotein/highdensity lipoprotein particles in nm), and the degree of unsaturation measure (the total number of pi bonds and rings within a molecule). Prior to analysis, the levels of each metabolite biomarker were transformed to a standard normal distribution using rank-based inverse normalization. Therefore, all reported beta coefficients from regression analyses represent the standardized change in ranked metabolite levels between MDD cases and controls.

Assessment of MDD Status
The MDD phenotype used in this study was derived from the online Thoughts and Feelings Questionnaire, fully completed by 126,077 participants in July 2017 (38). The questionnaire included the Composite International Diagnostic Interview Short Form, from which MDD lifetime diagnoses were derived (38) (Supplemental Methods and Materials in Supplement 1). This sample (referred to as the Mental Health Questionnaire sample) comprised 37,430 cases and 88,647 controls of whom 29,757 (8840 cases and 20,917 controls) also had metabolite levels derived from their blood samples (Table 1).

Metabolome-wide Association Study
Linear regression models were used to test the association of each metabolite level with the MDD phenotype (n = 29,757), referred to as a metabolome-wide association study (Met-WAS). In the basic MetWAS, only age, sex, assessment center, and NMR spectrometer were included as covariates. MetWAS analysis was also conducted with adjustment for socioeconomic status, smoking status, ethnicity, educational attainment, and body mass index (Supplemental Methods and Materials in Supplement 1) due to the strong and potentially confounding association between these factors with both MDD and metabolite levels (39)(40)(41)(42). Significance was defined by false discovery rate (FDR)-corrected p (p FDR ) values , .05.

Metabolite GWASs
For each metabolite, a GWAS was performed on unrelated individuals of European ancestry (n = 88,329) using PLINK 2.0 (43). Unrelated individuals were identified as previously reported (44). In brief, related individuals were initially excluded based on a shared relatedness of up to the third degree using kinship coefficients (.0.044) calculated using the KING toolset (45). One member of each group of related individuals was then subsequently added back in, by creating a genomic relationship matrix and selecting individuals with a genetic relatedness of ,0.025 with any other participant. Association testing was performed on the imputed genotypes using an additive model adjusted for age, sex, genotyping array, assessment center, spectrometer, and 10 genetic principal components. Variants with minor allele frequency (MAF) , 0.001, minor allele count , 20, and an INFO score , 0.1 were removed. Insertiondeletions were included in the analysis. To account for multiple comparisons, significance was defined as the Metabolomic Investigation of Major Depressive Disorder genome-wide threshold divided by the number of principal components, which account for 99% of the variance within the metabolomic data (5 3 10 28 /42 = 1.19 3 10 29 ), as described previously (46,47). An F statistic for the strength of the association between the sentinel SNVs and metabolite levels was calculated using the method: , in which N = sample size, k = number of SNVs, and r 2 = variance explained in metabolite levels by the genetic instruments. The r 2 statistic was calculated as follows: r 2 = 2 3 MAF 3 (1 2 MAF) 3 beta 2 , in which beta = effect size of the SNV and MAF = the SNV effect allele frequency. SNV heritability for each metabolite and the genetic correlation of the metabolite with MDD was performed using linkage disequilibrium (LD) score regression as implemented in the LDSC package (https://github.com/bulik/ldsc) (48,49). Precomputed LD scores based on the 1000 Genomes Project data for European ancestry (50) were used as the reference panel for regression weights in the analysis (https://data.broadinstitute. org/alkesgroup/LDSCORE).

Metabolite Bidirectional MR Analysis
Bidirectional MR was performed using the TwoSampleMR R package, version 0.5.6 (51,52) to assess causality between the metabolite (exposure) and MDD (outcome) for all metabolites, and vice versa. The MR analysis used SNV effect sizes from the UKB metabolite GWASs conducted in this study and the MDD meta-analysis GWAS, performed by the Psychiatric Genomics Consortium (9), with UKB participants removed using 3 MR methods (inverse variance weighted, MR Egger, and weighted median). Genetic instruments were identified by clumping genome-wide significant SNVs (p , 1.19 3 10 29 and MAF . 0.005) using the clump_data function in the R package TwoSampleMR (r 2 = 0.001, within a 10-Mb window). Metabolites with ,15 genetic instruments after harmonization with MDD summary statistics were omitted from the MR analysis.

Colocalization
Colocalization analysis was performed between MDD and metabolite genetic signals that showed potentially causal associations using MR, with a window of 1 Mb. Lead SNVs were identified using the default clumping method (r 2 = 0.1) on the SNP2GENE tool within FUMA software (53). To avoid running multiple colocalization tests of overlapping regions, lead SNVs were ordered by significance and iteratively filtered so that each lead SNV was .1 Mb away from any other lead SNV. For each of the final lead SNVs, all SNVs within a 1-Mb window from the metabolite GWAS and the MDD GWAS were extracted from the respective summary statistics. Colocalization analysis was then performed using the R coloc package (version 5.1.1) (54). The prior probabilities were set to default values: P1 (SNV is associated with the metabolite) and P2 (SNV is associated with MDD) were both set at 1 3 10 24 , and P12 (SNV is associated with both MDD and the metabolite) was set at 1 3 10 25 . An arbitrary posterior probability .

MR and Colocalization With eQTL Data
To probe possible mechanistic pathways between exposure (metabolite) and outcome (MDD), colocalized MDD and metabolite genetic signals were investigated to identify whether they localized to any causal genes within the colocalized region. For genes lying within the same LD window as colocalized MDD-metabolite genetic signals, we first identified any cis-eQTLs from whole blood gene expression data (n = 948) from the Genotype-Tissue Expression (GTEx) Project (48). Second, MR and complementary colocalization analyses were performed to assess the causality of gene expression levels (mRNA) with MDD. The MR analysis was performed using a Wald ratio test (55), using a single instrumental SNV defined as the most significant eQTL present in the dataset. For this post hoc analysis, significance was defined as p , .05. The instrumental SNVs were also used as lead SNVs in the colocalization analysis. For each lead SNV, MDD GWAS (9) and GTEx (56) summary statistics for variants within 61 Mb were extracted as inputs for the colocalization analysis.

MetWAS MDD Association Analysis
The demographics of the Mental Health Questionnaire sample (metabolomic and MDD data available; n = 29,757) are presented in Table 1. The basic MetWAS, without adjustment for common confounders, found that 191 (w76%) metabolites were significantly associated with MDD (p FDR , .05) ( Figure S2 in Supplement 1; Table S2 in Supplement 2). The most significant positive metabolite-MDD association was for monosaturated fatty acids to total fatty acids (%), (b = 0.177, p FDR = 1.52 3 10 243 ), and the most significant negative association was for polyunsaturated fatty acids to monounsaturated fatty acids ratio (b = 20.176, p FDR = 1.85 3 10 243 ). After adjustment for possible confounders, 133 (w53%) metabolites were significantly associated with MDD, 129 of which (w52%) were shared in the unadjusted model ( Figure 1; Table S3 in Supplement 2). The most significant positive metabolite-MDD association remained monosaturated fatty acids to total fatty acids (b = 0.069, p FDR = 2.

Metabolite GWASs
Manhattan plots for all 249 GWASs can be found in Supplement 3. All metabolites had multiple independent (r 2 = 0.001, within a 10-Mb window) genome-wide significant SNV associations (p , 1.19 3 10 29 ), all with F statistics . 10 (Table S4 in Supplement 2). The number of significant independent SNVs ranged from 4 (phenylalanine and acetoacetate) to 60 (cholesteryl esters in large high-density lipoprotein) (Table S5 in (Table S6 in Supplement 2), with absolute correlations ranging from (2) 0.073 (omega-3 fatty acids to total fatty acids [%]) to (1) 0.180 (phospholipids to total lipids in large high-density lipoprotein [%]). Of the metabolites that showed a significant genetic correlation with MDD, 119 (w83%) also had significant associations with the MDD phenotype in both the unadjusted and adjusted MetWAS analyses.

Mendelian Randomization
In the metabolite (exposure) to MDD (outcome) MR analysis, 21 metabolites were excluded for having too few (,15) genetic instruments ( Figure S4 in Supplement 1). MR analysis on the remaining 228 metabolites (Figure 2; Figure S5 in Supplement 1) found 5 with significant evidence for a causal relationship with MDD ( Table 2; Table S7 in Supplement 2), all relating to longchain polyunsaturated fatty acids (LC-PUFAs). Specifically, omega-3 fatty acid levels (omega-3 fatty acids, omega-3 to total fatty acids [%], and docosahexaenoic acid [DHA]) and the degree of unsaturation had a negative causal effect on MDD, while the omega-6 to omega-3 ratio had a positive causal effect on lifetime MDD ( Figure S6 in Supplement 1). These metabolites also had strong correlations with the other metabolites that significantly associated with MDD in the MetWAS phenotypic analysis, ranging from omega-3 fatty acids (significantly correlated with 79% of associated metabolites) to the degree of unsaturation (significantly correlated with 100% of the other MDD-associated metabolites) ( Figure S7 in Supplement 1). Of the 5 putatively causal metabolites, only the degree of unsaturation showed consistency in previous observational analyses, maintaining a significant association and genetic correlation with MDD ( Figure S8 in Supplement 1). In the MR analysis in which one SNV was left out, 3 of the significant metabolites to MDD MR results-degree of unsaturation, omega-3 fatty acids, and DHA-were dependent on rs174528, a single SNV in the FADS gene cluster on chromosome 11 ( Figure S9 in Supplement 1; Table S8 in Supplement 2).
In the MDD (exposure) to metabolite (outcome) MR analysis, all metabolite measures were included. In this analysis, there was no significant evidence that MDD was significantly causal to a change in any of the metabolite levels tested (b range: 20.28 to 0.26, p FDR range: .91 to .99) (Figure S10 in Supplement 1; Table S9 in Supplement 2).

Colocalization
The number of lead SNVs identified across the genome for the 5 metabolites with MR p FDR , .05 ranged from 20 (omega-3 fatty acids to total fatty acids [%]) to 30 (DHA) (Table S10 in Supplement 2). DHA had strong evidence for colocalization (PP.H4 = 0.89) with MDD at SNV rs2727271 (Table 2), which is an intronic variant for the fatty acid desaturase 2 (FADS2) gene ( Figure 3). The rest of the metabolites showed weak evidence of colocalization with MDD (PP.H4 , 0.2) (Figure S11 in Supplement 1; Table S11 in Supplement 2).

MR and Colocalization With eQTL Data
Downstream eQTL-MDD MR analysis was performed for the genes myelin regulatory factor (MYRF), flap endonuclease 1 (FEN1), transmembrane protein 258 (TMEM258), and FADS1/ 2/3 ( Figure 3D) because these were within the DHA-MDD colocalized region on chromosome 11 and had eQTL data available from GTEx. The analysis found that the mRNA levels of MYRF, TMEM258, FADS1, FADS2, and FADS3 had potentially positive causal effects on lifetime MDD status [odds ratio  Table 3). In contrast, mRNA levels of FEN1 were not significantly causal to MDD status (odds ratio = 0.98, p = .62). Subsequent colocalization analysis for the FADS cluster eQTLs and MDD found minimal evidence of shared causal variant for both gene expression and MDD. Supporting evidence was strongest for the existence of two independent causal variants for each trait at the same locus (PP.H3 . 0.7) (Table S12 in Supplement 2).

DISCUSSION
This study reports a large-scale (n = 29,757) association and causal analysis of blood metabolites with MDD. We found that 191 metabolites were significantly associated with lifetime MDD status, 129 of which withstood further adjustment for common confounders (39). These findings provide greater confidence that the significant metabolite-MDD associations observed in this study are not solely due to the effects of  confounding. Furthermore, we found that 144 metabolites showed a significant genetic correlation with MDD, suggesting a shared genetic architecture between the traits. Bidirectional MR analyses found evidence for potentially causal relationships between 5 metabolite measures relating to LC-PUFAs and MDD. Complementary colocalization analyses between these metabolites and MDD found that an omega-3 fatty acid, DHA colocalized with MDD within the FADS region. Subsequent MR analysis of eQTL data of genes in the FADS cluster found possible evidence of causality between MYRF, FADS1, FADS2, FADS3, and TMEM258 mRNA abundance and MDD. However, complementary colocalization analysis found evidence that this locus contains separate independent causal variants for mRNA abundance and MDD.
The MR analysis provides evidence that omega-3 LC-PUFAs may have a negative causal effect on lifetime MDD, while the ratio of omega-6 to omega-3 LC-PUFAs may have a positive causal effect on lifetime MDD. Specifically, colocalization analysis provided further support for a shared causal variant between MDD and DHA. Research on omega-3 and omega-6 levels and MDD has been lengthy and complex The association of single nucleotide variants in the FADS cluster in the docosahexaenoic acid genome-wide association studies. (C) The genes present in the FADS cluster. Genes colored red are those that were present in the GTEx (Genotype-Tissue Expression) data and consequently tested for a causal relationship with MDD in the post hoc analysis. MDD, major depressive disorder. These results indicate a positive causal effect of FADS1, FADS2, FADS3, MYRF, and TMEM258 on lifetime MDD status.
Metabolomic Investigation of Major Depressive Disorder (25,(57)(58)(59)(60)(61)(62). Although many cross-sectional and prospective studies have observed a reduction in omega-3 fatty acids, and an increase in the omega-6 to omega-3 ratio in those with MDD (30,58,63), it is important to note both the considerable methodological heterogeneity and contradictory evidence (30,61,64). Clinical trial studies testing the efficacy of omega-3 supplements in the prevention and treatment of MDD also show inconsistent findings (65)(66)(67)(68). Notably, our findings contradict results from a two-sample MR study in 2019, which found no evidence of a significant causal effect of omega-3 levels on MDD (69). However, this study leveraged summary statistics from smaller GWASs, and therefore may have been underpowered to detect causal effects. The enzymatic oxygenation of omega-3 and omega-6 fatty acids and their derived eicosanoids have opposing anti-inflammatory and proinflammatory effects, respectively ( Figure S13 in Supplement 1). Importantly, humans do not show endogenous biosynthesis of active omega-3 and omega-6 fatty acids (eicosapentaenoic acid and arachidonic acid, respectively) and the levels of each are reliant on the intake of their precursors, linoleic acid and alpha-linoleic acid, from the diet (70). Over the last century, the ratio of omega-6 to omega-3 has increased dramatically in many Western diets (20-30:1 compared with 1:1) (71), resulting in an imbalance in proinflammatory and proresolving mediators that may contribute to MDD pathology (72). Furthermore, there is a FADS haplotype associated with more efficient LC-PUFA synthesis from the diet (73). While this is evolutionarily advantageous in environments with limited LC-PUFA availability, it may heighten the imbalance between omega-6 and omega-3 levels resulting from modern Western diets (73). Overall, this indicates the possibility that a disproportionate dietary ratio of omega-6 to omega-3 may be contributing to MDD through increased inflammatory processes and that these may be exacerbated by certain genetic haplotypes.
The post hoc MR analysis found evidence of a possible causal association between MYRF, FADS1, FADS2, FADS3, and TMEM258 mRNA levels and MDD. MYRF and TMEM258 each have plausible mechanisms for their association with MDD due to links to abnormal myelination and intestinal stress, respectively (74)(75)(76). The FADS enzymes however are directly linked to the metabolism of both omega-3 and omega-6 LC-PUFAs (77,78) and have been shown to be central regulators of the ratio of their respective anti-inflammatory and proinflammatory lipid mediators (79). Therefore, this suggests that FADS enzymes act as key regulatory molecules in the activation and resolution of inflammatory processes, which may be important in the pathophysiology of MDD. A recent GWAS of bipolar disorder in Japanese populations also found a genetic association with FADS intron variants (80), indicating that this association may not be specific to MDD, but rather shared across different psychiatric disorders. Although this aligns with evidence that MDD and bipolar disorder share genetic architecture (81), whether the disorders share the putative causal effect of FADS found in this study would require additional analysis dissecting the causal relationship of the FADS cluster on bipolar disorder.
Despite these putative causal mechanisms, complementary colocalization analysis between all eQTLs and MDD found evidence that each trait had distinct, rather than shared, causal variants at the locus. This indicates that the instrumental variable in MR exhibits horizontal pleiotropy on the outcome (a direct violation of the exchangeability assumption for MR), thus weakening the evidence for causality (82,83). Therefore, it cannot be determined with high confidence that the transcript abundance of the genes within the FADS cluster has a direct causal relationship with MDD, but rather that it may be commonly seen alongside the presentation of MDD.
This study overcomes various limitations that existed in previous metabolomic analyses of MDD. First, this study analyzed the association and causal relationship between MDD and 249 metabolites in a hypothesis-free manner, and therefore reduces the possibility of confirmation bias, which has been criticized in previous studies with an a priori focus on LC-PUFAs (25). Second, previous studies often had small sample sizes, each with distinct study methodology and often suffering from self-report bias (30). This study utilizes a large sample size from which metabolite levels were biologically measured in the blood using NMR spectroscopy in a standardized, well-documented procedure (35,37). Our findings are also strengthened by the integration of eQTL data from GTEx version 8 (56), which interrogated the role of the gene products from the colocalized FADS region.
This study is limited because it tests the association of MDD with plasma metabolite levels, despite MDD primarily being a brain-based disorder. However, blood samples are minimally invasive to collect, which is essential for enabling large-scale high-powered studies and affords them the most promise for future clinical applications. Another limitation to this analysis is the high LD within the FADS region (73), meaning that a direct violation of the exclusion-restriction assumption in MR may exist, generating false-positive results (84). However, the complementary use of MR and colocalization analysis in this study acts as a sensitivity analysis (82). Another limitation is the focus of this analysis on European ancestry, especially because the frequencies of the FADS haplotypes differ geographically. The divergent FADS haplotypes are more commonly seen in non-European populations (85) and are associated with an increased expression of FADS1 and a more efficient LC-PUFA synthesis (73). Consequently, the mechanistic effects hypothesized in this study may be greater in non-European ancestral groups, which are not currently captured in this analysis. Additionally, although this is a largescale metabolomic analysis, it is not wholly comprehensive for the human metabolome (86). Indeed, the recent human metabolome database (87) detailed 18,557 metabolites that had been quantified and estimated the presence of thousands more. Finally, as this study focused on MDD prevalence, it may be biased by cases of chronic MDD within the cohort (88). Future work should therefore investigate whether omega-3 fatty acid levels and the omega-6 to omega-3 ratio associate with incident MDD.
In summary, this study interrogated the links between the metabolome and MDD and indicated a protective role for DHA against MDD. Furthermore, this study also suggests a role for an increased ratio of omega-6 to omega-3 fatty acids in the pathophysiology of MDD. We thank the participants of the UK Biobank. We also thank the UK Biobank team for collecting and preparing the data for analyses.
Figures S1 and S13 in Supplement 1 were created with licensed BioRender.com.
Data acquisition and analyses were conducted using the UK Biobank Resource under approved project #4844.
DAG is a scientific consultant for Optima Partners. RFH has received consultant fees from Illumina and is a scientific consultant for Optima Partners. RM has received speaker fees from Illumina, is an adviser to the Epigenetic Clock Development Foundation, and a scientific consultant for Optima Partners. All other authors report no biomedical financial interests or potential conflicts of interest.