Investigating the genetic architecture of eye colour in a Canadian cohort

Summary Eye color is highly variable in populations with European ancestry, ranging from low to high quantities of melanin in the iris. Polymorphisms in the HERC2/OCA2 locus have the largest effect on eye color in these populations, although other genomic regions also influence eye color. We performed genome-wide association studies of eye color in a Canadian cohort of European ancestry (N = 5,641) and investigated candidate causal variants. We uncovered several candidate causal signals in the HERC2/OCA2 region, whereas other loci likely harbor a single causal signal. We observed colocalization of eye color signals with the expression or methylation profiles of cultured primary melanocytes. Genetic correlations of eye and hair color suggest high genome-wide pleiotropy, but locus-level differences in the genetic architecture of both traits. Overall, we provide a better picture of the polymorphisms underpinning eye color variation, which may be a consequence of specific molecular processes in the iris melanocytes.


INTRODUCTION
Pigmentation levels in the iris vary among humans, ultimately leading to different eye colors. The melanin pigment in the iris is synthesized in the melanocytes, within organelles named melanosomes (Sturm and Frudakis, 2004). Eye color diversity is a consequence of different amounts of melanin concentrated in the melanocytes of the iris. In addition, the shape and distribution of melanosomes influence eye color variation. The mechanism is different from that of hair and skin pigmentation, in which two types of cells, melanocytes and keratinocytes (i.e. the epidermal melanin unit), play a key role in the production and distribution of melanin to give hair and skin color (Rees, 2003;Lin and Fisher, 2007;Parra, 2007). In addition, out of the two types of melanin synthesized by melanocytes (i.e. eumelanin, a brown/black pigment and pheomelanin, an orange/yellow pigment), different categorical iris colors are a result of variation mainly on eumelanin content, whereas there is little, nonsignificant variation on pheomelanin quantity, based on measurements on cultured uveal melanocytes (Wakamatsu et al., 2007).
At a molecular level, blue irises appear as melanin-free melanocytes, in which molecules in the iris scatter short blue wavelengths to the surface (Sturm and Frudakis, 2004). Green irises have medium levels of eumelanin, whereas high levels of eumelanin result in brown irises. Therefore, broad eye color classifications (i.e. blue, green, hazel, brown) cover a continuum of low to high quantities of eumelanin accumulated in the iris (Sturm and Frudakis, 2004).
Twin studies have shown that eye color is a highly heritable trait (>85%) and that it does not significantly vary throughout an adult's lifespan (Bito et al., 1997;Larsson et al., 2003). Furthermore, association studies have demonstrated that eye color has a polygenic architecture (Sulem et al., 2007;Edwards et al., 2016;Lloyd-Jones et al., 2016;Adhikari et al., 2019). Some of the loci with moderate/large effects associated with eye color variation are at or near the following genes: OCA2, TYR, TYRP1, SLC45A2, SLC24A4, SLC24A5, and IRF4. However, the variant with the largest effect on eye color variation is an intronic SNP (rs12913832) located in an enhancer within the gene HERC2 that regulates the expression of the downstream gene OCA2 (Visser et al., 2012). Functional studies have shown that the A-allele of rs12913832 allows the formation of a chromatin loop with the promoter of OCA2, facilitating the transcription of the gene. In contrast, the G-allele hinders the formation of the chromatin loop, leading to a diminished expression of OCA2 (Visser et al., 2012;Visser et al., 2014).
The SNP rs12913832 is the key regulatory element of OCA2. But it has been hypothesized that additional distal elements within the same region may be involved in the regulation of OCA2, a process that often is tissue specific (Palstra, 2009;Visser et al., 2014). In fact, through conditional analyses of association, genome-wide association studies (GWAS) have highlighted the presence of additional SNPs associated with variation in pigmentary traits (i.e. skin, hair, and eye pigmentation) within the HERC2/OCA2 region (Beleza et al., 2013;Adhikari et al., 2019;Lona-Durazo et al., 2019;Landi et al., 2020).
These studies have identified variants within HERC2 and OCA2 that are in low (r 2 < 0.2) linkage disequilibrium (LD) with rs12913832 (e.g. rs4778249, rs1667392, rs4778219, rs1800407, rs1448484), as well as other distant candidate regulatory variants near or within the APBA2 gene (e.g. rs4424881, rs36194177), which is located $700kb away from OCA2. However, pinpointing additional causal variants within the HERC2/ OCA2 region is challenging due to the complex LD patterns among the genetic variants and the lack of tissue-specific regulatory annotations. For instance, the Gene and Tissue Expression (GTEx) database (Consortium et al., 2017) includes skin tissue, which beyond a very small proportion of melanocytes, encompasses a diverse set of cell types not involved in pigmentation variation.
In order to improve our understanding of the genetic mechanisms behind eye pigmentation and melanocyte biology, in this paper we present the results of a GWAS of eye color conducted in a Canadian cohort from the Canadian Partnership of Tomorrow's Health (CanPath), along with fine-mapping analyses. We combined these results with gene expression and methylation data of cultured melanocytes by conducting colocalization analyses and transcriptome-wide association studies (TWAS). Our main results indicate that there are several candidate signals in the HERC2/OCA2 region associated with eye color, a different pattern from what is observed for hair color in the same sampled population. By integrating expression and methylation data assayed in melanocytes, we gain a better picture about how genetic polymorphisms may modulate eye color variation.

Eye color distribution in the CanPath cohort
A total of 5,732 participants of the Canadian Partnership for Tomorrow's Health (CanPath), who were genotyped using two genome-wide genotyping arrays (See STAR Methods for details), also self-reported their natural eye color using one of six possible answers: blue, gray, green, amber, hazel, or brown. We excluded amber eye color individuals due to the low number of individuals who self-reported this category. After quality control of the genotypes (i.e. exclusion of poor-quality samples and PCA outliers), we kept 5,641 individuals for further analyses. Overall, the distribution of eye color categories was quite similar across all provinces sampled (Figure 1), in which green and hazel were the least frequent categories and blue was the most common one. An important exception is the significantly lower proportion of individuals who self-reported blue eye color in Quebec, compared with other provinces (chi-square test: 104.39, df = 4, p value < 0.01). This pattern may be explained by the high proportion of French ancestry in the Quebec population (See Figure S1) due to the migration and settlement of French people in the province relatively recently (Bherer et al., 2011). This hypothesis is supported by the difference in allele frequencies between Quebec and all other provinces, in which the HERC2 rs12913832 G-allele has a lower frequency than in other provinces (see Table S1). In addition, a higher proportion of females self-reported green and hazel eye colors, relative to their male counterparts (chi-square test for green and hazel combined = 244.49; df = 1; p value < 0.01), which is similar to the observations previously reported in the case of green eye color (Sulem et al., 2007). Compared with several reported eye color frequencies per country, the CanPath eye color distributions differ from those found in other European and Asian countries, with the blue eye color frequency being most similar to that of Germany (39.6%) (Katsara and Nothnagel, 2019).

Genome-wide association studies and meta-analyses
We performed GWAS of eye color on each genotyping array (genotyped and imputed single-nucleotide polymorphisms (SNPs)) using a linear mixed model and an additive genetic model, using GCTA 1.26.0 (Yang et al., 2011(Yang et al., , 2014. We coded eye color categories as follows: 1 = blue or gray, 2 = green, 3 = hazel, and 4 = brown. We included sex, age, and the first ten principal components (PCs) as fixed effects and a genetic relationship matrix (GRM) as random effect to control for subtle population structure. We did not detect residual population substructure, based on Q-Q plots, in which observed p values did not show an early deviation from the expected p values (See Figure S2). We then carried out a meta-analysis using the summary statistics (beta and SE) including the two GWAS on METASOFT v2.0.1 (Han and Eskin, 2011). Q-Q plots of the meta-analyses (See Figure S3) and LD Score regression (intercept = 0.9935) indicated no residual population structure. We identified several known genome-wide significant loci (p value % 5e-08) associated with eye color (Figure 2; see Figure S4), overlapping or near the genes TYRP1 (lead SNP: rs1326779; beta = 0.139; SE = 0.024), IRF4 (lead SNP: rs12203592; beta = À0.164; SE = 0.029), TYR (lead SNP: rs1126809; beta = À0.136; SE = 0.024), SLC24A4 (lead SNP: rs4144266; beta = À0.124; SE = 0.022), and HERC2 (lead SNP: rs1129038; beta = À1.239; SE = 0.024). In addition, we observed a signal on chromosome 6 overlapping the ILRUN gene (lead SNP: rs116072038; beta = À0.438; SE = 0.077), a locus that has not been previously associated with pigmentation. Data S1 summarizes the suggestive and genome-wide associated SNPs. We conducted approximate conditional and joint analyses of association using GCTA-COJO (Yang et al., 2012), and using as LD reference the samples genotyped in this study with the UKBB array, to investigate if the genome-wide significant loci were being driven by one or more independent signals.
On the IRF4, TYRP1, TYR, and SLC24A4 regions, we identified one independent genome-wide significant SNP per locus (see Table S3), corresponding to known causal variants, such as rs12203592 on IRF4 and rs1126809 on TYR. The lead SNP on SLC24A4 is in high LD (r 2 = 0.98) with rs12896399, a SNP previously associated with pigmentation (Sulem et al., 2007;Eriksson et al., 2010). In the case of TYRP1, the selected SNP (rs1326779) is downstream of TYRP1 and has not been previously highlighted in eye color studies. On the HERC2/OCA2 region, we identified six independent SNPs overlapping OCA2 and HERC2 with p values in the conditional analysis exceeding the genome-wide significant threshold (see Table S3). Several of these SNPs have evidence of heterogeneity among the two studies, as indicated by I 2 and Cochran's Q values in the meta-analysis. However, they all have genome-wide significant p values in the random effects (RE2) model too, which takes into account heterogeneity among studies (see Table S3). To validate our results, we carried out the same GCTA-COJO analysis a second time, using samples genotyped with the GSA array as LD reference. We obtained concordant results, with multiple independent SNPs on the HERC2/ OCA2 region and a single SNP highlighted on the other pigmentation associated loci (i.e. IRF4, TYR, SLC24A4, and TYRP1) (see Table S4).
We also carried out a Bayesian fine-mapping analysis, in which all possible combinations of SNPs are iteratively considered without arbitrary selection of conditioned SNPs. We used the program FINEMAP (Benner et al., 2016) to perform fine-mapping analysis and to identify candidate causal SNPs for functional prioritization. In agreement with the GCTA-COJO analysis, by using FINEMAP we identified known pigmentation loci harboring one causal signal within IRF4, TYR, SLC24A4, and TYR (Data S2). On the IRF4 locus, the only candidate causal SNP with considerable evidence of causality (log 10 BF > 2) was the same SNP highlighted by GCTA-COJO (rs12203592; PIP = 0.999). In contrast, the missense SNP rs1126809 on TYR had a low posterior inclusion probability (PIP = 0.209), due to high LD with other nearby SNPs. Other candidate causal SNPs in the 95% credible set of the TYR locus include intergenic variants and one SNP (rs11018578) on the 3 0 UTR region of NOX4. On the SLC24A4 locus, the variants with considerable evidence of causality (i.e. log10BF R 2) include intronic SNPs within SLC24A4 and other variants upstream of the gene, including the SNP rs12896399 (log 10 BF = 2.58) (Data S2).
On the TYRP1 locus, all candidate causal SNPs in the credible set had a low (<0.1) PIP, most likely due to high LD among multiple SNPs in the locus (See Figure S5). The 95% credible set includes rs10809826 and rs1408799, two SNPs that have been previously associated with eye color (Sulem et al., 2008;Zhang et al., 2013;Galvá n-Femenía et al., 2018;Adhikari et al., 2019). Among the SNPs in the same 95% credible set, rs13297008 is located upstream of TYRP1, and it overlaps a DNase Hypersensitive Site identified in foreskin melanocytes (Data S2), indicative of an active transcriptional regulator region. We did not observe any poorly imputed or unimputed variants overlapping regulatory regions found in melanocytes. However, iScience Article we did find that rs13297008, rs2733831, and rs13296454 are in an active transcription start site (TSS) state on foreskin melanocytes only (across the tissues tested with the 15-core chromatin states). These SNPs are suggestive of either association or genome-wide significance, and all three are within the 95% credible set (Data S1).

Multiple HERC2/OCA2 variants associated with eye color variation
By applying a Bayesian fine-mapping approach on the HERC2/OCA2 region, we identified five candidate causal signals (i.e. five 95% credible sets) associated with eye color. Within these signals, three SNPs had a PIP >0.98 (Table 1 and Figure 3A). These results suggest independent causality of various signals in the locus. One of the candidate SNPs within HERC2 is rs12913832, a known enhancer that regulates the expression of OCA2 (Visser et al., 2012). In addition, three other SNPs within HERC2 and one within OCA2 were nominated as candidate causal loci, all of which fall within introns. These results are similar to the conditional analysis with GCTA-COJO, in which the independent SNPs in the locus encompass both OCA2 and HERC2, but the selected SNPs do not fully overlap. Importantly, all Bayesian fine-mapped SNPs in the locus had genome-wide significant p values on each sample and the same direction of effect. We annotated the putative regulatory function of the SNPs in all five credible sets using diverse databases (e.g. ENCODE, Roadmap Epigenomics Project). Aside from the overlap of rs12913832 (HERC2) with an open chromatin region in foreskin melanocytes, only the SNP rs117007668 is located within an open chromatin region in foreskin melanocytes (Data S2).
We then explored the LD patterns among the candidate causal variants (Table 1) in the credible sets using the CanPath genotypes (i.e. the same LD matrix used for fine-mapping) and considering the genotype probabilities, computed with LDStore v2.0 . Among the top candidate causal variants across the five credible sets (Table 1), most correlations are low (r 2 % 0.2), with the exception of rs117007668 and rs117744568, which are in high LD (r 2 = 0.9) ( Figure 3B). We further compared D 0 values among these same SNPs computed on LDLink, using as a proxy the European populations of the 1000 Genomes Project (Machiela and Chanock, 2015). The D 0 patterns, compared with r 2 values, reflect the allele frequency differences among the SNPs (D' > 0.6 in most cases) and suggest that the candidate causal SNPs are not in complete linkage equilibrium (See Figure S6).
We explored with HaploReg (version 4) Kellis, 2012, 2016) if other variants in the same LD block as our credible set SNPs in the OCA2/HERC2 locus harbor a putative regulatory function, to consider genetic variants that may have not been present in our dataset after imputation. By using as input the top SNP on each of the five credible sets (  iScience Article Finally, we explored SNP-SNP interactions across the top GWAS SNPs that passed the genome-wide significant threshold using the program CASSI and included also the loci fine mapped in the OCA2/HERC2 region (SNPs in Figure 3). By applying a Bonferroni-corrected p value threshold = 0.0015 (i.e. 0.05/33 pairwise tests), we did not identify significant interactions. However, there were nominally significant interactions between SLC24A4 (index SNP: rs4144266) and HERC2/OCA2 (SNPs: rs12913832 and rs4778138) (see Table S7).

Associations with eye color in recent studies
By investigating eye color variation in the OCA2 locus, Andersen et al. (2016) identified that two missense SNPs within OCA2 (rs121918166 and rs74653330) have a measurable effect on eye color variation, in which the alternative alleles decrease melanin levels, even in a heterozygote state. These two SNPs were rare in our sample (MAF <1%); therefore, we did not consider them in our GWAS analyses. By looking at the genotype-level data in these two rare SNPs, we identified 108 individuals heterozygous for either rs121918166 or rs74653330. Among these, 16 individuals were homozygous for the nonblue eye color rs12913832-AA genotype, but none self-reported having blue eye color (brown = 11; green = 1; hazel = 4). In addition, among heterozygous individuals for either one of the rare SNPs (i.e. rs121918166 or rs74653330), 12 individuals were also heterozygous for the nonblue eye color rs12913832-AG genotype and self-reported blue eye color. These results indicate that these two rare variants do not account for the incidences of blue eye color with an rs12913832-AA background in the CanPath, but they may influence the eye color phenotype under an rs12913832-AG background. Furthermore, we identified in our sample a subset of individuals (N = 904) harboring the rs12913832:GG genotype, 21 of whom self-reported brown eye color, 631 green eye color, and 252 hazel eye color, suggesting that the rs12913832:GG genotype does not exclusively yield a blue eye color.
We compared the OCA2/HERC2 haplotypes previously described for eye color in European ancestry populations, with the fine-mapped loci reported in the present study. Donnelly et al. (2012) identified three haplotypes in the region, two of which match the fine-mapped SNPs we have identified: rs12913832 (BEH2) and rs4778138 (BEH1). The third haplotype includes rs916977 and rs1667394, but these two SNPs are not highly correlated (i.e., r 2 < 0.6) with any of our fine-mapped loci. The IrisPlex System is widely used to predict blue/brown eye color based on common genetic polymorphisms (Walsh et al., 2012). Currently, it includes two SNPs in the OCA2/HERC2 region (rs12913832 and rs1800407). The rs1800407 SNP is a missense SNP in OCA2, known to be involved in pigmentation variation (Donnelly et al., 2012;Pospiech et al., 2014;Kidd et al., 2020). This SNP did not pass the quality control in our study, but other SNPs in high LD are not within the fine-mapped 95% credible sets. In addition, other common SNPs across other genes are also included in the IrisPlex System, namely SLC24A4 (rs12896399), SLC45A2 (rs16891982), TYR (rs1393350), and IRF4 (rs12203592). Importantly, in our study there is a genome-wide significant association of TYRP1 with eye color, but this locus is not included in the IrisPlex System.  iScience Article identified up to five independent signals in the HERC2/OCA2 locus (indexed by: rs4778219, rs1800407, rs1800404, rs12913832, and rs4778249). Aside from rs12913832, none of their index SNPs are within the candidate causal SNP sets in our sample, even though rs1800404 was genome-wide significant in our meta-analysis (p value = 1.18 3 10 À11 ). In addition, they identified three novel loci associated with eye pigmentation: DSTYK (chromosome 1), WFDC5 (chromosome 20), and MPST (chromosome 22). We followed-up each of the three index SNPs in our meta-analyses (rs3795556, rs17422688 and rs5756492), but failed to replicate the former two SNPs using a Bonferroni correction (p value threshold = 0.005), whereas rs5756492 was not present in our meta-analysis. These differences may be driven by population ancestry differences, given that the CANDELA cohort includes recently admixed individuals from Latin America, although the phenotyping approach may also be driving these differences.
Finally, the largest eye color GWAS to date conducted in populations of mainly European ancestry reported several novel loci, which had not been previously associated with eye color, and a subset of them has not been previously associated with any pigmentation trait (eye, hair, or skin pigmentation) (Simcoe et al., 2021). However, the ILRUN locus identified in the present study was not among their novel signals, nor we were able to replicate it. In addition, they conducted conditional analyses of association to identify secondary signals on the significant loci, in which they identified a total of 115 independent signals, including three signals in chromosome X. Notably, they identified 33 independent signals in the HERC2/ OCA2 region and two signals in the nearby gene GABRB3. We followed-up their signals (Table S1 from Simcoe et al., 2021) in our meta-analyses and identified 50 SNPs nominally significant in our meta-analysis (p value % 0.05), all with a consistent direction of effect between both studies (see Table S5). This set of SNPs includes novel associations with eye color and/or pigmentation traits included in their study (i.e. DTL, MITF, PDCD6/AHRR, ADRB2, GCNT2, and SIK1). After considering a Bonferroni correction (0.05/ 112; p value % 4.46e-04), 16 SNPs remained significant, all of which overlap known pigmentation genes (IRF4, TYRP1, TYR, OCA2, and HERC2).
Lastly, we checked if the independent SNPs associated with eye color identified with GCTA-COJO by Simcoe et al. (2021) overlap with our eye color SNPs highlighted by GCTA-COJO (See Table S3). We identified an overlap of three SNPs: rs12203592 (IRF4), rs1126809 (TYR), and rs1129038 (HERC2). Interestingly, even though Simcoe et al. identified several independent loci in the OCA2/HERC2 region, the known rs12913832 SNP is not among them, likely because another SNP in perfect LD has a lower p value, which is similar to what we observe in our GCTA-COJO analysis (i.e. rs1129038). We also compared the same independent SNPs identified by Simcoe et al. with our candidate causal loci as defined by FINEMAP and found three overlapped SNPs: rs12203592 (IRF4), rs1126809 (TYR), and rs13297008 (TYRP1).

Colocalization with expression and methylation QTLs from cultured melanocytes
We performed colocalization analyses with hyprcoloc  of the eye color meta-analysis with melanocyte gene expression and methylation cis-QTLs (eQTLs, meQTLs, respectively) to explore if there were shared causal signals (see STAR Methods for details). Through the colocalization of GWAS with eQTLs, we identified a region overlapping OCA2 and AC090696.2 (the latter being a transcript that partially overlaps OCA2) (Table 2), in which the candidate marker is rs12913832. We also colocalized meQTLs with GWAS hits (Table 2) overlapping the gene body of HERC2 (tagged by cg25622125, cg27374167, and iScience Article cg05271345) using a posterior probability threshold of 0.8. Notably, we did not find colocalized eQTLs on the TYRP1 locus that passed the probability cutoff.

Transcriptome-wide association studies
We conducted TWAS using a subset of the CanPath cohort as LD reference and the expression weights from cultured melanocytes to predict the gene expression profile with FUSION . Our results highlighted the expression of three genes as significantly associated with eye color: OCA2, SLC24A4, and RIN3 (See Table S6; See Figure S7). The gene RIN3 is located near SLC24A4, and by conducting conditional TWAS we have shown that these two genes are not independent from each other (See Figure S8).

Genetic correlations
We calculated the genetic correlation between eye and hair color using the data from the two CanPath genotyping arrays for which we had full phenotype data (Lona-Durazo et al., 2021). Using a linear scale for both traits and the same covariates as used in the GWAS (See STAR Methods for details), there is a genetic correlation (r g ) of 55% (SE = 0.12; p value = 7.33e-6) and 69% (SE = 0.21; p value = 0.001) on the UK Biobank and GSA arrays, respectively. Similar to the approach used in a previous study (Lin et al., 2016), we then calculated the genetic correlation without controlling for the effect of significant principal components and obtained genetic correlation values of 63% (SE = 0.08; p value = 3.41e-15) and 79% (SE = 0.15; p value = 1.39e-07) on the UK Biobank and GSA arrays, respectively. These results are in line with the genetic correlations previously reported (Lin et al., 2016), in which they found a lower correlation when including principal components as covariates, due to the correlations between ancestry captured by the PCs and eye or hair color. Nevertheless, by not controlling for significant PCs, we may as well be capturing ancestry in the genetic correlation estimation.

DISCUSSION
In this paper, we present the results of our genome-wide association studies of eye color, as measured categorically through self-reports, from 5,641 participants of the Canadian Partnership for Tomorrow's Health (CanPath). We did not identify new loci associated with eye color that were successfully replicated, and we focused on performing downstream analysis to pinpoint candidate causal SNPs, specifically on those loci for which a functional variant has not yet been identified or in which there is evidence of more than one independent signal. We found that fine-mapping provides evidence for multiple independent SNPs within the HERC2/OCA2 region, whereas other loci likely have a single causal signal. Furthermore, we characterized our GWAS signals by using colocalization analyses with expression and methylation QTLs of cultured melanocytes and conducted TWAS, in which we identified the expression of SLC24A4/ RIN3 and OCA2 as significantly associated with eye color. Lastly, we explored the genetic correlations between hair and eye color in the CanPath cohort.
One of the caveats of this study is that we utilized eye color categories self-reported by participants of the CanPath cohorts, as categorical classifications do not capture as well iris color variation as quantitative measures (Liu et al., 2010;Norton et al., 2015;Edwards et al., 2016). In our sample we have identified significant differences in self-reporting of eye color between sexes, and although these may be a reflection of true sex differences, as has been previously reported in other studies reporting self-assessed categorical eye color (Sulem et al., 2007), we cannot discard the possibility of a self-reporting bias. This limitation is counter-balanced by the relatively large sample size, in comparison to the majority of previous studies (Kayser et al., 2008;Candille et al., 2012;Beleza et al., 2013), with the exception of the largest recent GWAS of eye color (Simcoe et al., 2021). Furthermore, significantly associated loci from self-reported eye color are useful in forensics, for predicting eye color categories, which the human eye can easily distinguish. Indeed, we have here identified most signals associated with eye color that are used in the IrisPlex eye color prediction system (Walsh et al., 2014(Walsh et al., , 2017Chaitanya et al., 2018). One of the loci not included currently in the IrisPlex system is TYRP1, a locus that could potentially improve the eye color prediction system. However, it is important to point out that structural features of the iris (i.e. contraction furrows, Wolfflin nodules, heterochromia) also contribute to color perceptions, but we are not able to distinguish them using the current dataset.
Through our GWAS meta-analysis we identified five known loci associated with eye color, encompassing the genes SLC24A4, IRF4, TYRP1, TYR, and HERC2/OCA2, similar to what was identified in a recent large ll OPEN ACCESS GWAS of both categorical and quantitative eye color loci in a Latin American (CANDELA) cohort (Adhikari et al., 2019). The only two known pigmentation loci that our GWAS failed to identify as significantly associated with eye color encompass the genes SLC24A5 on chromosome 15 and SLC45A2 on chromosome 5, in which missense variants (rs1426654 and rs16891982) are known to alter pigmentation traits (Lamason et al., 2005). The missense SNP on SLC24A5 was rare in our sample (MAF <1%) hence excluded, in line with frequencies observed in the 1000 Genomes Project European populations, in which the alternative allele is nearly fixed. In the case of the SLC45A2 locus, the missense SNP did not reach genome-wide significance (p value = 1.71e-6).
Our fine-mapping analyses identified known causal pigmentation loci in the credible sets, such as rs12203592 on IRF4, rs1126809 on TYR, and rs12913832 on HERC2. Contrary to what has been observed for hair pigmentation (Adhikari et al., 2019), we identified here one independent SNP in the TYR locus associated with eye color, even though there is at least another independent missense variant (rs1042602) known to alter melanin synthesis within the same gene, in addition to candidate regulatory variants in the upstream GRM5 gene associated with skin pigmentation (Stokowski et al., 2007;Sulem et al., 2007;Liu et al., 2010Liu et al., , 2015Beleza et al., 2013;Jacobs et al., 2013;Adhikari et al., 2019;Lona-Durazo et al., 2019). This result is also in line with what was reported in the CANDELA study (Adhikari et al., 2019), in which, compared with hair color, they identified a single candidate SNP in the TYR locus associated with eye color. This exemplifies the importance of characterizing the genetic architecture of different pigmentation traits independently and opens up new questions to investigate the different mechanisms involved in melanin synthesis between cutaneous versus iris melanocytes.
We conducted TWAS and colocalization analysis with expression and methylation QTLs to further explore the shared causal signals among these phenotypes. Through colocalization with melanocyte eQTLs, we found colocalization in the OCA2 region, likely regulated by the SNP rs12913832 in the nearby HERC2 gene. Similarly, colocalization with meQTLs highlighted a signal in the HERC2 locus. The shared signals between meQTLs, eQTLS, and eye color GWAS hits in the HERC2 region may suggest that DNA methylation could play a role in the differential expression of OCA2, thus influencing the eye color phenotype, although this cannot be confirmed with the current evidence. Further analyses, such as Mendelian randomization, will be useful to evaluate causal associations among these traits (e.g. Bonilla et al., 2020).
We did not find colocalization of GWAS SNPs with eQTLs on the TYRP1 locus, even though our GWAS and fine-mapping results suggest a regulatory role of the candidate causal variants in this locus due to (1) the location $11kb upstream of the gene and (2) the overlap of a SNP (rs13297008) with open chromatin regions in foreskin melanocytes. In addition, this gene was absent from the TWAS expression weights dataset, suggesting that the gene expression in the current dataset is not sufficiently heritable (i.e. heritability p > 0.01). Therefore, we are not able to provide evidence of the mechanism in which the variants in the locus affect pigmentation variation, nor we are able to nominate a single causal SNP.
The cultured melanocyte expression and methylation QTLs we used for colocalization and TWAS come from newborn foreskin melanocytes . Similarly, the regulatory annotations from the ENCODE and Roadmap Epigenomics projects (Dunham et al., 2012;Roadmap Epigenomics Consortium et al., 2015) also come from melanocytes, keratinocytes, and fibroblasts from foreskin tissue. The melanocytes from skin and iris have several similarities and same embryological origin, but there are also significant differences between them. For instance, the melanosomes within the iris melanocytes are retained in the cytoplasm, and they are not transferred through dendrite-like structures to adjacent keratinocytes, as it is the case in the skin and hair melanocytes (Sturm and Frudakis, 2004). Moreover, the iris melanocytes are not reactive to the alpha melanocyte stimulating hormone (a-MSH) (Li et al., 2006), and instead, alternative signaling cascades trigger and regulate melanogenesis (Zhou et al., 2018). Therefore, future QTL efforts using a more precise tissue type, such as uveal melanocytes, may aid in characterizing the regulatory differences between cutaneous and iris melanocytes.
The HERC2/OCA2 region on chromosome 15 has the strongest effect on eye color variation, such that blue eye color was initially considered a Mendelian trait (Davenport and Davenport, 1907;Sturm and Frudakis, 2004 iScience Article the locus also have an effect on the expression of OCA2 in the iris. For instance, a subset of participants harbored the rs12913832 homozygous genotype associated with blue eye color (i.e. GG), but they self-reported non-blue eye color. In addition, there may be rare genetic variants within the OCA2/HERC2 region (i.e. rs121918166 and rs74653330) accounting for the blue eye color individuals under the heterozygous rs12913832 genotype, but larger sample size studies are needed to statistically test this hypothesis. Therefore, the expression of OCA2 might be induced by other regulatory variants in the locus, counteracting the effect of rs12913832, as has been previously proposed (Andersen et al., 2016). An alternative explanation could be that a subset of participants self-reported their eye color inaccurately, a hypothesis that we are not able to discard.
In addition, it is possible that genetic interactions between IRF4 and OCA2 also play a role. For instance, it is known that individuals may have blue eye color when harboring one or two rs12913832 A-alleles (HERC2), associated with nonblue eye color, along with one or two rs12203592 T-alleles, associated with light eye color (Laino et al., 2018). Similarly, in the CanPath there is a significant increase in the number of nonbrown eye color individuals with the rs129138-AG genotype as the number of rs12203592-T alleles increases (Fisher exact test p value = 2.2e-16) (See Tables S8 and S9 and Figure S9). Finally, it has been recently suggested that SNPs in the genes TYR (rs1126809), TYRP1 (rs35866166, rs62538956), and SLC24A4 (rs1289469) may be responsible for the brown eye color in individuals of European ancestry with an rs12913832 homozygous G-allele background (Meyer et al., 2020). However, our formal interaction analyses using CASSI did not identify significant interactions (after Bonferroni correction) between any pair of markers analyzed, including the polymorphisms rs12913832 (HERC2) and rs12203592 (IRF4).
Genetic correlations among hair and eye color in the CanPath cohort are high, in line with what has been previously reported (Lin et al., 2016) and considering the effect that most genes have in both phenotypes too (e.g. SLC24A4, IRF4, OCA2). However, we have demonstrated that certain genetic differences come to light when investigating candidate causal variants across the genome. Within the CanPath cohort, we observed that red hair color is driven mainly by multiple candidate causal signals in the MC1R locus and that variants within the same gene also have a significant effect upon blonde hair color. In contrast, variants within MC1R and its antagonist, ASIP, are not associated with eye color, which may be explained by the fact that MC1R is not expressed in iris melanocytes (Li et al., 2006). In addition, this may explain why iris melanocytes do not respond to UV radiation as opposed to skin melanocytes. HERC2/OCA2 is the most significant locus in our analysis of blond versus black and brown versus black hair color (although as described earlier, MC1R is the most important locus determining red hair color). HERC2/OCA2 is also the most significant locus for eye color. However, the signal from hair color is primarily driven by rs12913832, whereas there are several independent signals in HERC2/OCA2 associated with eye color. Lastly, even though IRF4, a transcription factor that upregulates tyrosinase, has a large effect on both blonde hair and blue eye color, the direction of effect of the causal SNP rs12203592 is opposite for both traits: the derived T-allele is associated with blue eye color, whereas the same allele is associated with the presence of brown hair color (Praetorius et al., 2013).

Limitations of the study
The main caveat of the study is the self-reported eye color from CanPath participants.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following: