Identifying and testing marker–trait associations for growth and phenology in three pine species: Implications for genomic prediction

Abstract In tree species, genomic prediction offers the potential to forecast mature trait values in early growth stages, if robust marker–trait associations can be identified. Here we apply a novel multispecies approach using genotypes from a new genotyping array, based on 20,795 single nucleotide polymorphisms (SNPs) from three closely related pine species (Pinus sylvestris, Pinus uncinata and Pinus mugo), to test for associations with growth and phenology data from a common garden study. Predictive models constructed using significantly associated SNPs were then tested and applied to an independent multisite field trial of P. sylvestris and the capability to predict trait values was evaluated. One hundred and eighteen SNPs showed significant associations with the traits in the pine species. Common SNPs (MAF > 0.05) associated with bud set were only found in genes putatively involved in growth and development, whereas those associated with growth and budburst were also located in genes putatively involved in response to environment and, to a lesser extent, reproduction. At one of the two independent sites, the model we developed produced highly significant correlations between predicted values and observed height data (YA, height 2020: r = 0.376, p < 0.001). Predicted values estimated with our budburst model were weakly but positively correlated with duration of budburst at one of the sites (GS, 2015: r = 0.204, p = 0.034; 2018: r = 0.205, p = 0.034–0.037) and negatively associated with budburst timing at the other (YA: r = −0.202, p = 0.046). Genomic prediction resulted in the selection of sets of trees whose mean height was taller than the average for each site. Our results provide tentative support for the capability of prediction models to forecast trait values in trees, while highlighting the need for caution in applying them to trees grown in different environments.

In tree breeding, genome-wide single nucleotide polymorphism (SNP) markers can be used to predict breeding values and significantly increase the rate of gain in subsequent generations in a process known as genomic selection or genomic prediction (Meuwissen et al., 2001). The use of association analyses to identify SNPs significantly associated with traits of interest can further reduce and refine the number of SNPs used in predictive models. In this context, genomic prediction aims to increase the efficiency of breeding programmes to improve timber yield and quality and reduce losses due to pests and diseases in commercial forestry. Increasingly, it is also being used to screen natural populations for their adaptive potential to future threats such as climate change and disease (Capblancq et al., 2020;Isabel et al., 2020). To develop predictive models, multiple, ideally independent, trials are necessary to identify, test and validate the SNPs associated with each trait. In trying to apply genomic prediction approaches to populations outside breeding programmes, there are the additional challenges of comparative genetic complexity (Herbert et al., 1999), a lack of pedigree information and an entirely different selection regime.
In the association analysis and prediction model development phase, groups of closely related species that are differently adapted but have similar genetic backgrounds can be useful experimental systems in which to search for parallel signatures of selection at the genomic level (Wachowiak et al., 2015). For such groups, a multispecies genomic approach can improve the power to detect genes involved in adaptation and show whether orthologous loci contribute to adaptive variation in different species (Neale & Ingvarsson, 2008).
Multispecies approaches are reported to improve our understanding of both transcriptomes and genomes (e.g. Ahrazem et al., 2019;Cornell et al., 2007;Leebens-Mack et al., 2019;Pellegrini et al., 1999;Polturak et al., 2018): the use of comparative species analyses provides a wide phenotypic base and shared evolutionary history for the identification of inter-and intra-specific genetic variation (van Kleunen et al., 2014). Making use of this multispecies approach to select SNPs for genomic prediction may also help to locate them in influential loci and potentially improve the generality of models based upon them.
Globally, pines are among the most important commercial tree species (Kanninen, 2010) and are ecosystem-defining in vast areas of forest across the northern hemisphere. Understanding the genetic architecture of key adaptive traits in pines, such as growth, form, disease resistance and phenology is of interest to a wide range of stakeholders, including the forestry industry and conservationists.
Due to their large size and complexity, the assembly of pine genomes is particularly challenging, and has only been satisfactorily achieved for loblolly pine (Pinus taeda; Zimin et al., 2014) and sugar pine (Pinus lambertiana; Stevens et al., 2016), which are among the largest genomes ever sequenced and assembled. However, thousands of polymorphic regions potentially suitable for use in genotyping in pine species have already been discovered using high-throughput sequencing methods such as whole transcriptome studies (Blanca et al., 2012;Chancerel et al., 2011;Durán et al., 2019;Geraldes et al., 2011;Liu et al., 2014;Parchman et al., 2010;Trick et al., 2009;Wachowiak et al., 2015).
Here we study three closely related pine species (P. sylvestris, Pinus mugo and Pinus uncinata) which have contrasting growth habits and are adapted to different environments. The species are members of the same monophyletic group within Pinaceae (Grotkopp et al., 2004), having diverged within the last 5 million years (Wachowiak et al., 2011) and have the same number of chromosomes (2n = 24).
The three species have weak reproductive barriers between them and share many ancestral polymorphisms (Lewandowski et al., 2000;Wachowiak et al., 2013). We used trait data from a common garden glasshouse experiment  along with genotyping data from a large new multispecies Pinus SNP array (Perry et al., 2020) to identify SNPs associated with growth and phenology in the three species and to determine their putative function.
A range of genomic prediction models were developed, and then tested using a subset of the P. sylvestris. Finally, we evaluated the potential of the models for genomic prediction, by testing the best performing set to estimate trait values in an independent multisite P. sylvestris field trial. We discuss the potential and limitations of the models for genomic prediction.

| ME THODS
Experimental design and analyses performed in the study are summarized in Figure 1.

| Plant material and phenotype assessments
Collection of plant material, experimental design and phenotype assessments for the common garden glasshouse trial (referred to hereafter as the association trial) are described by . Briefly, open-pollinated seeds of the three pine species were collected from three to five trees per population from 28 natural populations in Europe covering the geographical range of each species (Figure 2). The collection consisted of 13 populations of P. sylvestris (SY), nine P. mugo (MU) and six P. uncinata (UN). Seeds from each maternal tree were sown on trays of compost in spring 2010.
After germination, a provenance-progeny trial was established in an unheated glasshouse at the UK Centre for Ecology and Hydrology, Edinburgh, UK (latitude 55.861261, longitude −3.207819). Seedlings were grown under natural light with automatic watering applied during the growing season. The trial was divided into 25 randomized blocks with up to five families per population, of which the first 18 blocks were analysed by . A summary of the counts of populations, families and total numbers of individuals are provided in Table S1. Phenology (traits assessed: BS, timing of bud set, BB, timing of budburst) and growth (traits assessed: H, total height; I, annual increment-the increase in height from 1 year to the next) were recorded for every seedling to evaluate within-and between-species variation (species means for trees sampled in this study recorded in Table S2).
Bud set was defined as the time when a visible apical bud with clearly developed scales was formed at the tip of a stem in each seedling and was recorded as the number of days since the date on which the first plant to set a terminal bud in the trial was observed (in the first year of growth: BS2010). Budburst was scored when new needles emerged around the tip of the apical bud in the main stem and was measured as the number of days since the date on which the first plant to burst bud was observed (in the second and third years, BB2011, BB2012). Phenology observations were conducted twice a week. The height of all pines was measured annually from the second to fourth year of the pine growth (H2011, H2012 and H2013).
The annual increment was estimated for growth between 2011-12 (I2012) and 2012-13 (I2013). On the rare occasions that height was lower than in the previous year (due to, for example, human error or the loss of the leader) measurements were adjusted to 'NA'. To assess the proportion of variation that is under genetic control, the narrow sense heritability (h 2 ) and associated standard error for each trait was estimated using the GRM (genetic relationship matrix)based restricted maximum likelihood (GREML) procedure implemented in GCTA (Yang et al., 2011).
An independent multi-site, field-based provenance-progeny trial of P. sylvestris (referred to hereafter as the independent trial) was also phenotyped and genotyped using the same genotyping array and was used to test the predictive power of models developed using plants in the association trial described above. This trial was selected as there was commonality in the traits measured and geographical range of Scots pine populations used with the glasshouse trial, however the number of genotyped samples were considered to be too low to perform association analyses and it was therefore used instead to test the predictive power of the models in two distinct environments. Seeds from eight families from each of 21 native Scottish P. sylvestris populations ( Figure 2)  were initially grown in an unheated glasshouse whereas trees transplanted to GS were started in pots outside. The two transplantation sites also generally experience different climates, with the YA site typically warmer and drier than the GS site (Table S3) Table S1. Budburst and height were assessed annually from 2015. Height was measured in the winter before the growing season began from 2015 to 2020. Height was also measured before the start of the second growing season in March 2008. The annual increment was estimated as the increase in height from 1 year to the next. Each tree was assessed for budburst stage annually from 2015 until 2019 at weekly intervals from early spring until budburst was complete. Seven distinct stages of budburst were defined (Table S4).
The number of days for each tree to reach each stage of budburst, starting from the day the first tree was observed at each stage at each site, was recorded. When trees progressed through budburst stages rapidly, skipping a stage between assessments, a mean value was taken between the two assessment dates. The duration of the core stages of budburst (time taken to progress from stage 4 to stage 6) was also estimated. Although the method used to record budburst was not identical among the association trial and independent trial, the trait as described by  is equivalent to stages 5 and 6 in the independent trial.
To better understand the relationship between different traits and for individual traits across different years, Pearson's correlation coefficient and significance values were estimated for trees in the association trial for each species separately and for trees in the multi-site independent trial for each site separately using a package 'Hmisc' (Harrell Jr, 2020) in R (R Core Team, 2020). Data on individual traits measured over multiple years enabled their consistency among years to be assessed, as inter-annual variation may occur due to seasonal environmental variation, developmental variation and/ or maternal effects (Vivas et al., 2020). Pearson's correlation coefficient and associated significance values between budburst timing and duration among years and stages in the independent trial were also examined.
Nested ANOVA was performed for growth and phenology in the independent trial to assess within-site spatial heterogeneity for each site. Data for all trees in the trial was used (i.e. not just the subset of genotyped trees), with population as a fixed effect, and families nested within population and block as random effects.

| Genotyping array
The design of the array, genotyping and SNP calling are as described by Perry et al. (2020). Briefly, an array comprising 49,829 SNPs was used to genotype 1920 DNA samples (from needles of four pine species: the species included here plus Pinus uliginosa) according to the Affymetrix Axiom Assay protocol on a GeneTitan and following genotyping, genotype calls were performed using Axiom Analysis Suite as recommended by the manufacturer. A subset of trees from the association trial described in the previous section were genotyped F I G U R E 2 Geographical location of sampled pine populations across Europe (map on left: association trial) and Scotland (map on right: independent trial). Pine species: MU, P. mugo; SY, P. sylvestris and UN, P. uncinata. Independent trial map shows genotyped P. sylvestris (SY-G) and other P. sylvestris populations included in the trial but not genotyped (SY). Multi-site field sites in independent trial: GS, Glensaugh and YA, Yair. Association trial map: Europe base map credit: Natural Earth, Esri France. Independent trial map contains OS data © Crown Copyright and database right 2020  Table S1.
The independent trial of P. sylvestris was also partially genotyped at each site: 100 trees from YA (15% of the trees at the site) and  Table S1.

| Population genetic structure, kinship and statistical power
On the basis of the SNP genotyping results in the association trial, population genetic structure was assessed visually by constructing a neighbour joining tree in the R package 'ape' (Paradis & Schliep, 2019) based on a distance matrix generated in TASSEL version 5.2.39 (Bradbury et al., 2007). SNPs with call rate <80% (N = 48) were excluded. Pairwise kinship (centred identity by state) was estimated for each species independently using all polymorphic markers in TASSEL.
The degree of skewness in the distribution within each species' matrix was calculated using the D'Agostino skewness test in the R package 'fBasics' (Wuertz et al., 2020).

| Genetic associations and putative functions
Using results from the association trial, identification of SNPs potentially associated with phenology (traits: budburst and bud set) and growth (traits: height and increment) was conducted for each trait in each year. For all analyses, raw phenotypic data were used. A multi-locus mixed model (MLMM) approach, with 10 steps, was used to identify loci which have large effects (Segura et al., 2012). Highly significant SNPs (based on the estimations of genetic variance, p < 0.001) were included in a forward-backward stepwise approach, one by one, as cofactors in the model. The kinship matrix used in the MLM approach was also included, but PC scores were not used. Rather than using PC scores to estimate population structure the MLMM approach uses a kinship matrix to describe the covariance structure, which is thought to perform better when population structure is complex (Segura et al., 2012). The multiple Bonferroni criterion, defined as the largest model whose cofactors all have a p-factor below a Bonferroni-corrected threshold of 0.05 (Dunn, 1961), was used to indicate the best model.
Single nucleotide polymorphisms were divided into two classes on the basis of their minor allele frequency: MAF >0.05: common; MAF <0.05: rare. As the majority of traits are controlled by many genes of very small effect it is likely to be important to consider every SNP identified. Each SNP found to be significantly associated with a trait (when no MAF filter was applied, in order to compare the putative function of genes containing rare and common variants) was also examined to compare the putative function of the genes on which they are located with the trait in question. To do this, the full unigene sequence in which each SNP is located was BLASTed against the uniprotkb_viridiplantae database, the result with the highest score (minimum e-value 1E-50) for each unigene was retained, and the putative function determined by a literature survey using the search term 'protein name function plant.' Where the protein was uncharacterized, the protein domain and/or family was recorded and the most likely function inferred.
Where putative functions could be determined the genes were grouped according to their role in the following phenotypic responses: 'Response to environment' (including abiotic and biotic stress response), 'Growth and development' (including cell division, differentiation and senescence); 'Reproduction' (including flowering time and seed yield). Although many cellular processes (e.g. metabolism, signalling pathways, DNA binding, transcription and translation) were also identified as putative functions, these were assumed to be underlying control and expression of phenotypic functions and were not assigned a function.

| Prediction models: construction and internal assessment
Phenotypic prediction multiple linear regression models were constructed in R using data generated from the association trial. A number of different models were constructed and compared using different sets of SNPs and different traits to train the model (the different models assessed are listed in Table 3). Predictive models were constructed using SNPs identified as potentially associated with variation in phenology (trait: budburst: BB2011) and growth (traits: height and increment: H2013 and I2013). To assess the relative contribution of SNPs identified using the multispecies compared to a single species approach, predictive models for both growth and phenology were constructed using SNPs identified from either ( were common in the datasets from which the significant associations were originally identified). As recommended when the number of loci is greater than the number of samples, we elected to construct the prediction model based on all polymorphic SNPs using ridge regression with the R package 'rrBLUP' (Endelman, 2011), rather than the multiple linear regression method. For all models, where necessary, family means were used to replace missing genotype data. The predictive models were initially run using an internal training set comprising 60% of SY trees from the association trial, which had been used to identify associated SNPs, and predictive accuracy assessed using the remaining 40% of SY trees (the internal testing set) in the association trial. Models were run using SY trees and not UN or MU as our intention was to carry out subsequent testing of the models in this species alone. We used budburst and growth but not bud set data as subsequent model testing was applied to data from an independent trial for which only these traits were available. Pearson's correlation coefficient and significance for correlations between predicted values generated by the predictive models and observed values for both phenology and growth (both H2013 and I2013 were tested to see which performed best for the growth model) were estimated using the R package 'Hmisc' (Harrell Jr, 2020).
The SNPs used in each prediction model were assessed for their variation among SY populations using the R package 'hierfstat' (Goudet & Jombart, 2020). Basic statistics including overall observed heterozygosity (H O ), mean gene diversities within populations (H S ), inbreeding coefficient (F IS ) and population differentiation (F ST ) were estimated for each set of SNPs described above.

| Prediction models: independent assessment
The SNPs identified as potentially associated with budburst and growth were tested for their predictive power using genotype and phenotype data from an independent trial of P. sylvestris, established at contrasting sites (YA and GS) in 2012. Genotyped trees from YA and GS were assigned predicted values for both phenology and growth using multiple linear regression models constructed using either all available SNPs or only those found to be significantly associated with the trait (final predictive models, chosen based on their performance in the initial internal test). As models tend to work best in material that is closely related to those used in model development (Beaulieu et al., 2011), the model using all available SNPs was also trained using only SY trees from Scotland grown in the association trial (N = 227). Observed values for growth (height and increment) and budburst (timing and duration) at multiple years (2015-2020 for increment and 2015-2019 for budburst) were compared with values generated by the predictive models.
Multiple years were used to ensure that annual variation caused by seasonal differences could also be considered. Height is a cumulative measure, and therefore, only the most recent (2020) and the measurements made after the first year of growth (2008) were compared with the predicted values. Furthermore, the use of height measurements at both young and more mature ages allowed the impact of maternal effects to be examined and tested. In order to identify SNPs which are good predictors of final height, the use of trees whose traits are not confounded by maternal effects is important. To assess the performance of the predictive models, the Pearson's correlation coefficient and significance values between predicted and observed values for phenology and growth were estimated for each site (GS and YA) separately using the R package 'Hmisc' (Harrell Jr, 2020). The use of two sites in independent testing also allowed comparison of the performance of the predictive models in different environments.
The effectiveness of using the predictive model as a genomic selection tool was also tested and compared with other selection methods. For each method, 10 trees were selected from each trial site: for genomic selection, the 10 trees at each site with the highest values generated by the predictive model were chosen; for phenotype selection, the 10 tallest trees at each site prior to the start of the second growing season (measured in March 2008) were chosen.
The average height at 13 years (2020) of the 10 trees selected using each method was compared. The trees selected using each method were also compared to the 10 tallest trees at each site at age 13. lowest for all species combined (mean h 2 = 0.55).

| Intra-and inter-specific trait variation
Phenotypes for the independent trial are provided in Table S7. The relationships between duration and timing of each stage of budburst in the independent trial were examined. Due to missing data, only budburst stages 4-6 were analysed. Timing (time taken to reach each stage) showed a significant negative correlation with duration (time taken to progress from stage 4 to 6) of budburst at each year assessed for stage 4, but the relationship was positively correlated for stage 6 (Table S8). In contrast, the time to reach stage 6 showed a significant positive correlation with the duration of budburst. Time to reach stage 5 was both positively (GS) and negatively (YA) correlated with the duration of budburst. Budburst stages were highly positively correlated with one another in all years at both sites (Table S8). Therefore, stage 6 is used to represent timing of budburst for all further analyses.
The relationships among the traits measured over multiple years (including both correlations of individual traits over multiple years and correlations of different traits with one another) were examined in both the association trial (Table S9) and independent trial (Table   S10)

| Summary of genotyping array
High quality genotypes (call rate >80%) were obtained for over 94% of trees genotyped within the association trial (N = 762: MU, Table S5). There were 9583 high quality (call rate >80%) polymorphic SNPs which were shared among the three species (Table 1), with a further 1352 SNPs which were polymorphic in at least two species and monomorphic in a third. SNP genotypes obtained for trees from YA and GS were all high quality (Table S7).

| Population genetic structure, kinship and statistical power
The neighbour joining tree generated from the distance matrix indicated the majority of the structure is among species and families with weak population structure, as reported in previous studies (Wachowiak et al., 2013;Wachowiak, Zaborowska, et al., 2018 mixed model approaches and the correction for population stratification prior to testing for genetic association. The statistical power to detect true associations between SNPs and adaptive traits was found to be extremely low for both MU and UN even when the polygenic effect was assumed to be 10 x that of the phenotypic variance (Table S12). This is likely to be due to the low sample numbers, a conclusion supported by the result that the

| Identification of loci associated with traits
One hundred and eighteen SNPs were identified as associated with phenology and growth in the three pine species (Table 2; Table S13) and included SNPs which were identified in more than one species' datasets. There was very little overlap of individual SNPs associated among multiple traits or years: four SNPs were associated with more than one trait, of which only one (comp51128_c0_seq1_1529) (Table S13).
A further 44 SNPs were found to be associated with traits when all species were combined within a single analysis, although 11 of these were also identified in SY and 23 were identified in MU-UN.
Applying a multispecies approach led to the identification of 54

| Putative function of genes containing SNPs associated with traits
One hundred and eighteen SNPs associated with phenology and growth in the three pine species were located at 114 gene loci (two unigenes, comp48223_c0_seq1 and comp47733_c0_seq1, contained three SNPs each). One locus was originally identified in Pinus radiata (Doth_comp54682_c0_seq1_159), the remaining were identified following transcriptome sequencing in P. sylvestris and the taxa of the P. mugo complex (Perry et al., 2020: Table S13). The genetic sequences containing loci associated with each trait were found to be highly similar to proteins with a range of putative functions (Tables S15a-c). Of the SNPs identified when no MAF filter was applied, the majority of SNPs associated with bud set (all identified in SY) were found in genes that code for proteins putatively involved in growth and development (61.11%) with a few (exclusively rare) SNPs found in proteins putatively involved in response to environment (22.22%, Figure 3; Table S15a). In contrast, budburst had high numbers of associated SNPs (both rare and common) in genes that code for proteins putatively involved in response to environment and growth and development (mean contribution of putative function groups coded by genes containing SNPs significantly associated with budburst across species' datasets as a percentage of the total number of proteins: 39.01% and 39.09% for growth and development and response to environment respectively. Table S15b). Whereas the majority of SNPs associated with height were found in proteins putatively associated with growth and development (Table S15c) for proteins putatively associated with response to environment as well as growth and development. Of the five SNPs that were only identified as associated with a gene when a MAF filter was applied, one was putatively associated with response to environment, one with growth and development, one with all three functions and two with none of these functions (Table S15a-c).

| Prediction models: construction and internal assessment
There was a large dropout in the number of SNPs which were suitable for inclusion in subsequent predictive models: of the 38 SNPs The inbreeding coefficient (F IS ) was −0.6 to −0.7 for the majority of SNP sets (Table S16) Table 3. Models constructed using random SNPs were not successful in predicting values that were correlated with observed values for each trait, although the mean strength of the correlation was much higher for the random models trained using H2013 than for either I2013 or BB2011. For all models, except those constructed using random SNPs, those without a MAF filter always performed better than the equivalent models constructed using only common SNPs, although there was little difference in performance for models constructed using all polymorphic SNPs ( The effect of the trait used to train the model was also seen in comparisons of the performance of the models constructed using all polymorphic SNPs: for each trait, predicted values were more closely correlated with the observed values in models using budburst than in those using growth traits (H2013 and I2013). Of the traits used to identify associated SNPs and construct the predictive models, the one with the lowest h 2 (I2013) also had the lowest predictive ability in the SY dataset, whereas the trait with the highest h 2 (BB2011) had the highest predictive ability.

| Prediction models: independent assessment
Predicted values were estimated using the final predictive models for budburst and growth as well as models constructed using all available SNPs and compared with values observed in the independent trial. The independent field sites share populations and families but experience contrasting climates, allowing the models to be independently tested on traits measured in different environments.
The predicted values for each trait were not significantly correlated with the observed values when using models constructed with all available SNPs when trained using the full SY dataset and only for increment in 2016 at GS when only SY from Scotland is used to train the model (Table 4). In contrast, a number of significant correlations were observed in the independent trial when using final predictive models for growth and budburst. The predicted values for budburst were found to be significantly positively correlated with the duration of budburst at GS in 2015 and 2018 (Table 4)  The predicted values for growth were found to be significantly associated with observed increment measurements at YA in every year, but not at GS (Table 4). The correlation between predicted values and observed height in 2008 (at age one) was not significant at either YA or GS, despite the strong correlation observed between predicted values and observed height at age 13 at YA (Figure 4) indicating that the cumulative effect of the trees growing in the environment at YA contributed to the strength of the association.
The effectiveness of the final predictive model for growth as a genomic selection tool was tested by comparing different selection methods ( Figure 5) in trees at both independent trial sites (GS and YA). Genomic selection was the most successful method of selecting tall trees growing at YA: trees were on average 6.80% taller (227 mm) than trees selected using the phenotype method and 5.41% (181 mm) taller than the mean height of trees at this site. In contrast, the phenotype method was more successful than the genomic method at GS (5.07% and 7.70% increase in the mean height of trees selecting using the phenotype method compared to the genomic method and site mean respectively), despite the lack of significant correlation between height at 2008 and height at 2020 (Table   S10). The coefficient of variation (CV) for trees chosen using the phenotype selection method was over 60% greater than for those chosen using the genomic selection method at GS (

| DISCUSS ION
This study is among the first to use a high throughput genotyping array to identify SNPs associated with growth and phenology traits in conifers and is unique in applying a multispecies approach.
Association genetics of adaptive traits is of great interest to forestry and is being studied in many species such as P. contorta (Mahony et al., 2020), Populus trichocarpa (Evans et al., 2014), Picea sitchensis (Holliday et al., 2010) and P. taeda (Lu et al., 2017) and the use of multiple species has the potential to improve the generality of models based upon them. Although other multispecies genotyping arrays have been developed (e.g. for Eucalyptus, Silva-Junior et al., 2015), association analyses are conventionally restricted to a single species.
The high-throughput SNP array allowed nearly 50,000 SNPs (of which 20,795 were successfully converted) to be simultaneously genotyped in a large number of trees. To increase the sample size of the datasets and the statistical power of our analyses, data from P. mugo and P. uncinata, which are both part of the P. mugo complex, were combined. The dropout rate for P. mugo was much higher, and the call rate much lower than for P. sylvestris and P. uncinata. It is likely that this is a consequence of the dominance of P. sylvestris in the sample set used to set allele calling thresholds, coupled with the genetic distance between the two species (Perry et al., 2020).
Despite this, nearly a third of SNPs on the array were high quality in all three species and nearly half of all successfully converted SNPs were polymorphic in all three species-twice the number reported TA B L E 4 Pearson's correlation coefficient (r) and associated significance values for comparison of predicted and observed values for each trait Predicted values estimated by final predictive models for growth and budburst constructed using single nucleotide polymorphisms (SNPs) significantly associated with the traits and assessed for their performance in an internal test. Predictive models constructed using all available SNPs (no MAF filter applied, N SNPs = 15,019) trained using the full SY dataset and also trained with only SY trees from Scotland. Duration: time taken for each tree to progress from stage 4 to stage 6. Timing: time taken to reach stage 6 of budburst. Description of each budburst stage is given in Table S4.

F I G U R E 4
Correlations of observed height measured in 2020 at age 13 against predicted values using the final predictive model for growth for trees in an independent trial at Glensaugh (GS, correlation not significant) and Yair (YA) We used the SNP datasets to test for associations with previously published phenotypes for the three pine species , identifying 118 SNPs significantly associated with variation in growth and phenology over multiple years, of which nearly half would not have been identified without a multispecies approach. As shown previously, the between-population variation in both phenology and height was far less in P. mugo and P. uncinata than in P. sylvestris, reflecting the fact that the latter was sampled from across its much broader geographical distribution and across much wider environmental gradients in photoperiod and temperature . Despite the smaller environmental gradient represented by our P. mugo and P. uncinata sampling, the number of SNPs identified as significantly associated with phenology was similar to the number of SNPs identified in P. sylvestris, although the number of SNPs identified as significantly associated with growth traits was much higher in P. sylvestris. The majority of studies on genetic control of adaptive traits in conifers have also identified multiple QTLs or SNPs associated with variation in timing of bud set, budburst and growth Eckert et al., 2009;Holliday et al., 2010;Hurme et al., 2000;Jermstad et al., 2001Jermstad et al., , 2003Plomion et al., 1996;Prunier et al., 2013) as expected of complex traits (Mackay, 2001). For example, Eckert et al. (2015) tested 475 SNPs and found six significant associations with height and budburst in sugar pine (P. lambertiana) and Budde et al. (2014) identified 17 SNPs significantly associated with serotiny in maritime pine (P. pinaster) using an array with 251 SNPs from candidate genes. However, there have also been a limited number of specific genes implicated in the control of adaptive traits in conifers: loci related to budburst/set were identified in Picea abies and P. sylvestris (PaFTL2 (Avia et al., 2014) and PsFTL2 (Gyllenstrand et al., 2007) respectively).
Overall, the majority of SNPs identified in this study were rare. Of those that were common, counts were similar among P. sylvestris and the P. mugo complex for both phenology (13 and 10 for P. sylvestris and the P. mugo complex respectively) and growth (four and six for P. sylvestris and the P. mugo complex respectively).
Although one SNP was found to be associated with both phenology and growth (the former in the P. mugo complex and the latter in P. sylvestris) it was extremely rare in P. sylvestris. Most likely, this is a confounding effect due to a small number of individuals (in this case, two) with a rare allele at the locus, that are at the tail-end of a trait distribution (the two individuals were ranked 366 and 412 out of 413 for increment in 2013). Although these findings seem to support the use of MAF filtering, applying a filter prior to association analyses was found to significantly reduce the number of common SNPs identified, probably as a result of changes to the PC scores and kinship matrix (describing population structure and relatedness) caused by the removal of rare variants. A further benefit of retaining all SNPs at all stages of analyses was to enable the evaluation of the relative contribution of rare and common SNPs to each trait and to assess the predictive power of models constructed using SNPs with and without MAF filtering. There were very few instances of the same SNP being associated among traits, among species or among years, a finding also reported by Westbrook et al. (2013), possibly indicating the involvement of different genes at different stages of development or in response to varying environmental conditions, as well as the very small effect sizes of most SNPs in polygenic traits (Korte & Farlow, 2013). Our earlier comparative genetic studies of a large set of SNPs located in nuclear genes similarly found almost no shared polymorphisms under selection between different taxa of the P. mugo complex (Wachowiak, Zaborowska, et al., 2018).
Phenological variation in Pinus spp. in common garden studies has been shown to be significantly associated with the environment at the site of origin (Howe et al., 2003;Hurme et al., 1997;Repo et al., 2000;Salmela et al., 2011; with trees from northern European populations setting bud F I G U R E 5 Height at 13 years (measured before the growing season started in 2020) of 10 trees at Yair (YA) and Glensaugh (GS) selected using different methods: Genomic: genomic selection to identify the predicted 10 tallest trees using values from the final predictive model for growth (single nucleotide polymorphisms (SNPs) identified in both SY and MU-SY-UN, no minor allele frequency filter applied, N SNPs = 14); Phenotype: phenotype selection where the 10 tallest trees at each site prior to the start of the second growing season. The dotted line represents the mean height of trees at each site Selection method Height at 13 years (2020; mm) and flushing earlier than trees from more southerly populations.
Whereas environmental cues are expected to play an important role in initiating phenological processes (Dougherty et al., 1994) including budburst (Laube et al., 2014), bud set is thought to be endogenous in Pinus spp., with photoperiod and temperature having relatively minor effects (Cooke et al., 2012). In this study, we found a high proportion of common SNPs in genes putatively involved in environmental responses (including response to abiotic and biotic stress and environmental cues) for both budburst and growth, but not for bud set. Common SNPs associated with bud set were exclusively located in genes related to growth and development. At this stage, assigning unigenes in conifers is largely presumptive and relies on similarity to domains or families of proteins with a large and/or speculative range of functions, many of which are, as yet, unexplored or undefined. However, the divergence of assignment among SNPs associated with budburst and bud set, and its concurrence with physiological understanding of these functions, suggests that the assignment is plausible. Furthermore, as it has previously been demonstrated that intragenic linkage disequilibrium (LD) decays rapidly in our species (Wachowiak et al., 2009(Wachowiak et al., , 2013, there is a higher likelihood that SNPs identified are directly involved in variation of phenology and growth. At present, our ability to better characterize the SNPs is limited by the paucity of highly similar, well characterized and published protein and gene sequences for these species. Although predictive models constructed using all available polymorphic SNPs were the most successful at predicting values in the internal validation set they had no predictive ability when tested in an independent set of trees, possibly reflecting the divergent geographical ranges and associated environments of populations used in the trials (although training the models using trees from Scotland to reflect the geographical range of populations in the independent trial resulted in almost no improvement to the models' predictive ability).
In contrast, predictive models constructed using SNPs identified as significantly associated with budburst and growth in the association trial were found to be successful at estimating values in both the internal assessment and the independent assessment, although in the latter the predictive ability of the models varied spatially (among the sites) and temporally (among years). The final predictive models, chosen for their performance in the internal assessment, comprised SNPs from all species' datasets indicating that the multispecies approach to identify SNPs was justified. When testing these models in an independent trial, observed values for height at age 13 and increment, over multiple years, were highly significantly correlated with predicted values generated by the final predictive model for growth, although only at YA. In contrast, the predictive ability of the growth model for trees at GS was poor. Phenotypic variation is a product of both heritable genetic and environmental variation.
Furthermore, variation in phenotypic plasticity may cause families and populations to respond to environmental variation in different ways (Cooper et al., 2019;Gratani, 2014). Consequently, the predictive ability of models will depend on the interplay between the underlying genetic control of the traits, a host of external cues and stresses that directly and indirectly determine trait expression, and differences among the environments of trees used for association analyses and those used for external testing of predictive models.
Trees growing at the YA site are much larger than at GS, indicating that there may be environmental limitations for growth at GS which are not present at YA. The trees grown in the glasshouse which were used to identify SNPs associated with growth are similarly unlikely to have experienced many environmental limitations. The lack of environmental limitations for trees growing in both the glasshouse (association trial) and at the YA site may explain why the predictive model works well in this set of trees, but does not have any predictive ability when tested in trees grown under a more limiting environment at GS. Ideally, therefore, a predictive model should be used in populations from very similar environments as the population used to perform association analyses . For instance, a predictive model for serotiny constructed by Budde et al. (2014) also had variable success when applied to different populations of P. pinaster. It is also possible that optimization of the prediction models using variable selection approaches such as LASSO, would improve results, particularly where genotype × environment (G × E) interactions are likely to impact the association analyses and/or predictions (Crossa et al., 2017).
The age of the trees used to identify SNPs associated with traits should also be considered with respect to maternal effects, which may be more significant at younger ages (Vivas et al., 2020), resulting in phenotypes which are less a product of their genotype (and environment) than in later life stages. Maternal effects may mask or confound attempts to identify SNPs significantly associated with adaptive traits, and in this study many more SNPs were identified as significantly associated with height and increment in 2013 than in 2011 or 2012 and an incremental reduction in the strength of the relationship between growth in the first year and in the two subsequent years was also observed. These findings suggest that maternal effects were present in at least the first year of growth but that the effect was much less by the third year of growth. The lack of predictive ability in the final predictive model for growth in the independent trial at YA for trees in their first year of growth suggests that maternal effects may be significant in these trees, but that the effect has diminished by age 13 when the predictive ability was very good.
The usefulness of predictive models in commercial forestry depends on their ability to predict traits at final harvest. Lee (1999) found that height at 13 years in another commercial conifer species was a good predictor of height at final harvest, indicating that our model, with high predictive ability in trees at age 13, has the potential to be a useful tool for early selection for height at final harvest in Scots pine.
The relationship between bud burst timing and duration was found to vary as budburst progressed: trees which were observed to reach the first few stages of budburst (where scales were open but needles not yet visible) early in the season did not complete the whole budburst process sooner as might be expected. Instead, these trees took longer overall to complete budburst and it is clear that this relationship is not consistent among sites, which emphasizes the need for caution in applying genotype-trait relationships across environments. Similarly, the prediction model for budburst had variable accuracy among the two independent field sites: the predicted values were significantly (albeit only weakly) positively correlated with the duration of budburst for 2 years at GS, but not at YA, while the predicted values for budburst were significantly correlated with timing of budburst but only at YA in 1 year. This was a negative relationship, such that trees that were predicted to complete budburst early in the season actually completed budburst late. Although this initially seems surprising, it does have a plausible biological explanation. The predictive model was constructed using SNPs which were identified as significantly associated with the timing of budburst in a set of trees from a common garden glasshouse experiment, whilst the independent trial data were collected from trees planted outdoors in a field trial. The environmental difference between the glasshouse and the field was clearly substantial, with possibly the most important deviation between the two being that temperatures in the glasshouse did not drop below freezing throughout the winter. The relationship between the chilling requirement (the accumulation of time spent below a certain temperature) and the initiation of budburst is complex: tree species and populations differ in their chilling requirement as well as in their forcing requirement (the accumulation of time spent above a certain temperature) after the chilling requirement is met (Körner, 2006). An increase in chill days (mean temperature <5°C) can significantly advance budburst timing in P. sylvestris (Laube et al., 2014). Heritable genetic variation in the timing of budburst is therefore likely to be strongly influenced by environmental cues including chilling and subsequent forcing. The contrast between the two environments means that trees requiring a greater number of chill days before the initiation of budburst will experience a delay in the glasshouse but burst bud earlier in the field, resulting in a negative relationship among trait values in the two environments. Moreover, variation in the climate ensures that chilling and forcing conditions vary among sites as well as annually.
Although the mean number of annual chill days is higher in GS than YA, GS also has fewer growing degree days which may delay the onset of budburst in some families or populations.
We found, as has been previously reported (Calleja-Rodriguez et al., 2020), that predictive ability in P. sylvestris (estimated as the correlation between the genomic estimated breeding values and phenotypes) was positively associated with narrow sense heritability of the trait. In contrast, the predictive ability of the models in an independent multi-site trial was not correlated with the predictive ability in the association dataset, possibly because of the different environments involved. However, the heritability estimates are extremely high for some traits (particularly bud set in P. sylvestris) which could be due to the distribution of SNP effect sizes (Young et al., 2018) or the average LD between SNPs and causal variants being different than it is among SNPs (Evans et al., 2018). As previously noted, LD decays rapidly in these species and this may indicate that there is a higher rate of LD between SNPs and causal variants than among SNPs. Our finding that phenological traits (budburst and bud set) had higher narrow sense heritability than growth traits (height and annual increment) has also been reported in Quercus robur (Scotti-Saintagne et al., 2004). Similarly, high narrow sense heritability for budburst has been estimated in other conifers (P. abies, h 2 = 0.8: Aitken & Hannerz, 2001) as has moderate narrow sense heritability for height (P. pinaster, h 2 = 0.37: Vazquez-Gonzalez et al., 2021). Variation in narrow sense heritability across years, as was observed in this study, was also reported for Q. robur (h 2 = 0.48-0.80; Bogdan et al., 2004) and P. taeda (h 2 = 0-0.75; Balocchi et al., 1993), so we might expect the accuracy of genomic prediction to vary considerably by species and by trait.
Predictive models potentially provide a tool with which to determine the phenotype of trees at early life stages, saving both time and money. The gain of nearly 7% in height observed using genomic selection as opposed to phenotype selection is slightly lower than the gain predicted for material derived from existing seed orchards (8-12%: Lee, 1999) but without the extensive and expensive trial set up and maintenance. Furthermore, the height at harvest of Scots pine (with the average yield class for this species of 10) could be expected to increase by 1.08-1.24 m when using predictive modelling based on an average harvest height of 20-23 m (McLean, 2019). Predictive models have several potential applications including selecting for key traits in commercial breeding programmes and assessing forests for their response to abiotic and biotic stress. However, our results show the extent to which values generated by predictive models can vary in the strength of their correlation with the observed values depending on the environment in which they are tested. This may limit the deployment of genomic prediction across environments, but also where environment changes over time: something that will be a widespread issue in the near future (Franklin et al., 2016). Using a multispecies approach also highlights the improvements in both numbers of SNPs identified as significantly associated with adaptive traits and the accuracy of prediction models constructed when using SNPs from multiple species' datasets. However, the small-scale comparisons between selection methods demonstrated the potential for predictive growth models to successfully select taller trees at one of our sites. As we had only small sample sizes and a relatively small pool of trees from which to select, the approach will require further testing using a larger set of trees in future trials.

| CON CLUS IONS
Despite its ecological and economic importance, this study is among the first to explore the association between SNPs and key adaptive traits in P. sylvestris, demonstrating the utility of the Pinus spp.
high throughput array (Perry et al., 2020) for identifying genes and SNPs associated with phenology and growth traits. Development of a predictive model and validation in an independent trial furthermore demonstrates the potential of the approach for accelerated tree breeding. However, the study also highlights the limitations imposed by genotype by environment interactions. This may affect the application of predictive models in populations experiencing different environments from those in which the models were trained.
Applying both a conventional single species and a novel multispecies approach to association analyses and predictive modelling exposes the constraints of the former and benefits of the latter. These results offer promise for this approach, highlighting the potential for improvement of economic traits in Scots pine and justifying future genomic studies in this species.

ACK N OWLED G EM ENTS
This work was financially supported by the following grants: GAPII

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are openly available through NERC's Environmental Information Data centre (https://eidc.ac.uk/) at https://doi.org/10.5285/55118 e26-cf5c-41d6-9157-738fc e6bdddf (Table S5: phenotypes and genotypes for three pine species within the association trial) and https://doi. org/10.5285/52248 442-a50f-4fc0-a73e-31c61 cd27df9 (Table S7: phenotypes and genotypes for the multisite independent field trial).