The association between female reproductive performance and leukocyte telomere length in wild Soay sheep

Telomere length (TL), typically measured across a sample of blood cells, has emerged as an exciting potential marker of physiological state and of the costs of investment in growth and reproduction within evolutionary ecology. While there is mounting evidence from studies of wild vertebrates that short TL predicts raised subsequent mortality risk, the relationship between reproductive investment and TL is less clear cut, and previous studies report both negative and positive associations. In this study, we examined the relationship between TL and different aspects of maternal reproductive performance in a free‐living population of Soay sheep. We find evidence for shorter TL in females that bred, and thus paid any costs of gestation, compared to females that did not breed. However, we found no evidence for any association between TL and litter size. Furthermore, females that invested in gestation and lactation actually had longer TL than females who only invested in gestation because their offspring died shortly after birth. We used multivariate models to decompose these associations into among‐ and within‐individual effects, and discovered that within‐individual effects were driving both the negative association between TL and gestation, and the positive association between TL and lactation. This suggests that telomere dynamics may reflect recent physiologically costly investment or variation in physiological condition, depending on the aspect of reproduction being investigated. Our results highlight the physiological complexity of vertebrate reproduction, and the need to better understand how and why different aspects of physiological variation underpinning life histories impact blood cell TL.


| INTRODUC TI ON
Telomeres, repetitive DNA sequences present at eukaryotic chromosomal ends, help in the maintenance of genomic stability by preventing the degradation of chromosomes (Blackburn, 1991; reviewed in Chan & Blackburn, 2004). They are also involved in cellular ageing in that they shorten with each cell division and in response to oxidative stress, with telomeres that reach a critical threshold length triggering a cascade of cellular senescence events (Aubert & Lansdorp, 2008;O'Sullivan & Karlseder, 2010;von Zglinicki, 2002). The enzyme telomerase acts to lengthen telomeres (Greider & Blackburn, 1985); in endothermic vertebrates, telomerase is expressed in germline cells throughout life, but is often repressed in adult somatic cells, particularly in larger bodied species (Gomes et al., 2011;Haussmann et al., 2007). At the whole organism level, telomere length (TL) is typically measured as an average across blood cells taken in a sample.
Recent meta-analyses of nonhuman vertebrate studies have found that energetically demanding biological activities predict TL shortening (Chatelain et al., 2020), and that short TL predicts increased subsequent mortality risk (Wilbourn et al., 2018). TL has also been hypothesized to be a mediator of potential life history trade-offs, with shorter telomeres reflecting past investment in physiologically costly activities such as growth and reproduction (Haussmann & Marchetto, 2010;Monaghan, 2014;Young, 2018). Several longitudinal studies have examined the relationship between variation in reproductive life history traits and TL under natural conditions, and more evidence that investment in reproduction is associated with short TL across different taxa is needed to establish the utility of TL as a biomarker of life history trade-offs (Casagrande & Hau, 2019;Young, 2018).
Recent experimental and observational studies in several vertebrate species have provided support for the hypothesis that costly investment in reproduction should result in a decrease in TL (Sudyka, 2019). Most experimental studies to date have found evidence that reproductive investment is associated with shorter telomeres (Heidinger et al., 2012;Reichert et al., 2014;Sudyka et al., 2014) and some observational studies have confirmed this (Cram et al., 2018;Epel et al., 2004;Ryan et al., 2018). However, many correlational studies have found positive associations between TL and reproductive traits (Angelier et al., 2019;Barha et al., 2016;Criscuolo et al., 2018;Le Vaillant et al., 2015;Parolini et al., 2017;Pauliny et al., 2006;Plot et al., 2012) or detected no significant relationship (Hoelzl et al., 2016;Johnsen et al., 2017). While experimental manipulations of reproductive investment provide vital evidence that TL can reflect important aspects of reproductive life history trade-offs, to understand how selection acts on TL and how the trade-offs manifest and vary under natural conditions, we also need studies in the wild. It is well established that variation in resource availability and acquisition under natural conditions can generate marked individual heterogeneity that is likely to mask underlying trade-offs (Lailvaux & Kasumovic, 2011;van Noordwijk & de Jong, 1986). However, longitudinal individual-based field studies combined with appropriate statistical methods can allow us to disentangle the influence of within-individual trade-offs and among-individual heterogeneity on telomere dynamics and ultimately understand the relative contribution of these processes in mediating the relationship between TL and life history traits Cram et al., 2018;Froy et al., 2019;Lieshout et al., 2019;Nussey et al., 2008).
The relationship between reproduction-related traits and TL at the population level may be driven by two underlying processes that arise as a consequence of among-or within-individual variation.
Specifically, a negative correlation between reproductive investment and TL at the within-individual level would be indicative of a classic trade-off between reproduction and somatic maintenance (Sudyka, 2019;Young, 2018). In this case, individuals investing more in reproductive traits would be expected to show a subsequent within-individual loss of TL, potentially reflecting the increased rates of cell division or oxidative stress associated with that increase in reproductive output. On the other hand, a positive correlation at the among-individual level would imply that some individuals are of higher "quality" than others and express both higher average reproductive performance and longer TL. This could be driven by spatial variation in resource availability, individual differences in the ability to acquire resources (van Noordwijk & de Jong, 1986) or variation in the ability of individuals to physiologically tolerate the costs of reproductive expenditure. Dissecting the role of these processes in the relationship between reproduction and TL is critical to better understand the degree to which TL dynamics may reflect recent life history trade-offs or reflect among-individual heterogeneity in average phenotypic quality or individual condition. Previous longitudinal studies of TL and reproduction (reviewed in Sudyka, 2019) have explored the contribution of within-and among-individual variation to TL-reproduction associations using the within-subject centring method (van de Pol & Wright, 2009;van de Pol & Verhulst, 2006). However, a more powerful approach using multivariate models allows us to account for the covariance between traits of interest at different hierarchical levels, in order to more accurately estimate the contributions of different sources of variation (e.g., withinand among-individual effects, annual variation in environmental conditions) to the association between TL and reproduction at the population level (Cam et al., 2002;Froy et al., 2019;Hamel et al., 2018). Here, we use this approach to test how investment in different aspects of reproductive performance predicts subsequent TL at among-and within-individual levels in a wild mammal population.
When testing for evidence of costs of reproduction, it is important to note that such costs may vary depending on age and individual condition. Age-related variation in reproductive performance is ubiquitous in wild vertebrates, and can arise due to a combination of factors such as previous breeding experience, reproductive effort and selective disappearance (Forslund & Pärt, 1995;Mauck et al., 2004).
The influence of age on apparent costs of reproduction in terms of subsequent survival prospects appears to vary across species: in pinnipeds, for example, the cost does not vary with age, while in ungulates costs of reproduction are higher in young and old individuals than prime age adults (Proaktor et al., 2007). Costs of reproduction can also depend on the type of reproductive investment involved.
For instance, energy allocation theory would predict greater resource costs would be involved in lactation compared to mating and gestation in female mammals, due to high resource requirements of milk production and thermoregulation (Gittleman & Thompson, 1988;Millar, 1977). Several studies to date have identified age-dependent associations between reproductive investment and TL (Bichet et al., 2020;Cerchiara et al., 2017;Graham et al., 2019;Parolini et al., 2017).
A study of common terns found that TL-reproductive trait relationships vary depending on the stage of reproduction, suggesting the physiological signal carried by TL may vary depending on the type of reproductive investment involved, as well as the age at which it was measured (Bauch et al., 2013;Bichet et al., 2020). This highlights the importance of testing for potential variation in reproduction-TL associations across age or stage groups for a full understanding of TL's potential as a biomarker of life history trade-offs.
Here, we investigated the relationship between leukocyte telomere length (LTL), age and reproductive performance among females in a free-living Soay sheep (Ovis aries) population . Age-dependent survival costs of reproduction have previously been reported in this system: females that gave birth to lambs in spring were less likely to survive their subsequent winter than those who did not breed, but this cost was most pronounced in the youngest and eldest ewes (Tavecchia et al., 2005). We also previously found that LTL declines with age in Soay sheep, with a sharp decline over the first year of life followed by a more gradual decline thereafter, and that it is positively associated with longevity Froy et al., 2021). In the present study, we used longitudinal samples and life history data collected over several decades to investigate how LTL varies according to a female's reproductive performance. We tested the predictions that giving birth to a lamb and increased litter size would result in physiological costs leading to a decrease in LTL. Furthermore, we predicted that individuals investing in lactation by successfully rearing lambs to weaning age (~4 months) would experience greater physiological costs compared to individuals whose lambs die before becoming fully weaned (Clutton-Brock et al., 1989;Froy et al., 2016). We also tested the prediction that any negative associations between reproduction and LTL would be most pronounced in the youngest and eldest females. Finally, using bivariate mixed-effects models we also quantified the role of among-and within-individual processes underpinning the variation between reproductive performance and TL (Froy et al., 2019(Froy et al., , 2021.

| Study system and data collection
The Soay sheep on the St Kilda archipelago are descended from northern European domestic sheep and are thought to have arrived on the island of Soay around 3000-4000 years ago. In 1932, around 100 sheep were transported from Soay to the larger island of Hirta and became the founders of the population residing there today.
Since 1985, individual-based monitoring of the sheep resident in the Village Bay area of Hirta (around 30% of the island's population) has been ongoing . Reproduction in Soay sheep females follows a birth-pulse pattern where ewes give birth in March-May although, in very rare cases, lambs can also be born as late as August. Female Soays are reproductively mature within their first year and between 6% and 81% conceive in the first November of their life (depending on the year) and produce their first lamb around their first birth day (as yearlings). Most females produce singleton offspring, but 2%-23% of births (depending on the year) are twins (Clutton-Brock et al., 1991). Twinning is exceptionally rare in yearling breeders (0.6% in our data set). The probability of breeding increases dramatically after the yearling stage and remains high until it begins to decline from around age 7 years, whilst twinning rates increases until around 3-4 years and then remains stable with age (Hayward et al., 2013). Lamb mortality during the neonatal period (defined as the period between tagging of the lamb after birth and October 1 of the year they were born) can vary between 10% and 40% depending on population density the previous winter, with 85% of this mortality occurring within the first month after birth (Clutton-Brock, Grenfell, et al., 2003;Clutton-Brock et al., 1992;Overall et al., 2005). By around mid-June, lambs suckle less often and most are fully weaned by the end of summer . We therefore categorized lambs based on whether they survived until August of their year of birth to determine whether their mother had invested substantially in lactation (surviving lambs) or not (lambs died).
Each year since 1985, around 95% of the lambs born in the study area have been caught, weighed and uniquely tagged for identification within a few days of their birth .
Regular censuses and mortality searching means that we recover carcasses of most animals that die and are able to accurately estimate mortality rates. This means we have extensive and accurate information on the reproductive life histories and survival of individuals dating back to 1985. Each August, a fieldwork team visits Hirta and captures as many of the animals resident to the Village Bay population as possible (usually 50%-60%) We construct temporary traps across the study area and at capture take blood and tissue samples, and phenotypic measurements such as body weight from all individuals . Blood samples are collected in heparin vacutainer tubes and kept in a cool box until processing, within 24 h of sampling. The vacutainer tubes are spun at 1008 g for 10 min, after which the plasma layer is removed and replaced by the same quantity of 0.9% NaCl solution, and spun again at 1008 g for 10 min. The buffy coat layer, comprising mainly white blood cells, is then transferred to a 1.5 ml Eppendorf tube and stored at −20℃.

| Telomere length measurement
We used the same August blood samples, LTL measurement methods and quality control levels as described in Froy et al. (2021).
Overall, there were 3641 RTL measurements available (taken from 1,586 individuals and sampled in August across years 1998-2016; see Froy et al., 2021), and we restricted this to include only females sampled aged 1 year or more. We also excluded 14 measures from six individuals that had undergone pharmacological interventions that could have influenced their reproductive performance or survival affecting their annual vital rates. We included August weight at capture in our models of RTL (see below) but weight measures were unavailable for five samples, and had to be excluded from further analysis. We also excluded a single outlier sample which had RTL >4.
Our final data set included 1982 observations from 630 individuals aged 1-15 years. In total, 77.3% of individuals had been sampled more than once for their RTL measure with 31.7% of individuals sampled more times than the average (here, mean number of samples per individual was 3.15).
For each of these samples, genomic DNA was extracted from the buffy coat layer on 96-well plates using the Macherey-Nagel Nucleospin 96 Blood kit (Cat. no. 740665) and eluted to a final volume of 150 µl in elution buffer BE provided in the kit. Real-time quantitative polymerase chain reaction (qPCR) was used to measure relative LTL (amount of telomeric DNA sequence relative to amount of the reference gene, beta-2-microglobin) based on previously established methods validated in sheep and cattle blood samples Seeker et al., 2016). Both telomere samples and reference gene samples were run in separate wells on the same qPCR plate which also included eight calibrator samples in each plate to account for plate variation as well as two nontemplate controls (NTCs) prepared with nuclease-free water. A standard curve was estimated using serially diluted samples of the calibrator, and triplicate runs of all samples (controls, calibrators and tests) were performed.
The linregpcr software package (version 2016.0, Ruijter et al., 2009) was used to correct amplification curves for baseline fluorescence, and to calculate well-specific reaction efficiencies and cycle quantification (Cq) values. Samples were excluded from further analysis if the coefficient of variation (CV) across triplicate Cq values and triplicate PCR efficiency values for either amplicon was >5% or if at least one of their triplicate reactions had an efficiency that was 5% higher or lower than the mean efficiency across all wells on that plate for the respective amplicon.
RTL (relative telomere length) for each sample was calculated (following Pfaffl, 2001) (Froy et al., 2021). The technical repeatability of RTL was estimated to be 0.866 overall where repeatability was estimated as the variance explained by sample identity divided by the total variance from a linear mixed effect model that included only sample identity as a random intercept term (see Froy et al., 2021 for further details).

| Data analysis
We initially used univariate models to quantify the associations between reproduction-related traits and RTL and also examined the interaction between age and reproductive traits on RTL. We then employed bivariate models to estimate the contribution of withinand among-individual processes underpinning the associations identified in the univariate analyses.

| Univariate analyses
We used linear mixed-effect models (LMMs) to test if RTL (measured in summer of year t) was predicted by current reproductive performance (during spring of year t) while accounting for variation associated with individual, year and qPCR assay. We began by building a "base" model including nonreproductive variables, based on previous work by Froy et al. (2021). This work identified a linear function of age as the best-fitting function describing the relationship between RTL and age for individuals aged ≥1 year. We therefore included age of the individual as a fixed continuous covariate to account for agerelated variation in RTL. Since heterogeneity in individual body condition can underlie some of the variation in RTL, we were interested in estimating the magnitude of the association between RTL and reproductive performance while accounting for body weight, which is associated with overall condition in the Soay sheep . We therefore also included August body weight of the individual as a continuous linear covariate in our models. To account for repeated measurements across individuals and years in RTL, we included individual and year of measurement as random effects. We also accounted for measurement error associated with qPCR assays by including qPCR plate and row within plate as additional random effects (following Froy et al., 2021).
We tested the effects of reproductive performance by adding fixed effects to this base model which grouped individuals based on two aspects of their reproductive performance in the spring prior to RTL measurement: (i) whether an individual had bred or not, which reflects investment in gestation; and (ii) the survival of any lambs born past typical weaning age, which reflects investment in gestation and lactation. We also accounted for twinning in some of our models in order to be able to test for effects of increased litter size.
We employed a model comparison approach to test whether effects of gestation alone, gestation and lactation, and litter size best explained the observed variation in RTL. To do this we compared the fit of models including four different categorizations of a female's reproductive performance prior to TL measurement, illustrated in Figure 1 and described in full in Table 1. Since multicollinearity between explanatory variables could potentially affect our analytical results, we calculated variance inflation factors (VIFs) for all explanatory variables in every model described in Table 1 and found VIF values to be <5, indicating that multicollinearity was not a concern here (Fox & Monette, 1992;James et al., 2013;Menard, 2002).
Sample sizes for the different reproductive performance categories by age are given in Table S1.
Full Bayesian statistical inference was performed using the Hamiltonian Monte Carlo sampling algorithm implemented in the modelling software stan (Carpenter et al., 2017). For each model, we ran four randomly initialized chains with 8,000 iterations per chain.
We used the first half of each chain as a warm-up, which resulted in a posterior sample size of 16,000 for each parameter. To assess mixing, we visually examined traceplots and checked convergence with the potential scale reduction factor, Rhat, which is close to 1 at convergence (Gelman & Rubin, 1992 (Vehtari et al., 2017). We ranked the models based on their out-of-sample predictive accuracy (i.e. elpd; expected log pointwise predictive density) transformed onto the deviance scale (LOOIC; Vehtari et al., 2017).
By doing so, we could compare models based on their LOOIC, values which were on the same scale as other information criteria (e.g. Akaike's information criterion, deviance information criterion), thereby allowing us to apply the rules of traditional model selection procedures (Burnham & Anderson, 2002;Spiegelhalter et al., 2002).   (Lenth, 2020).

F I G U R E 1
Illustration of different reproductive performance scenarios and their relationship to timing of telomere length measurement in this study. The scenarios are as follows: (a) female does not reproduce in a given spring (and is caught and sampled in the following August and has telomere length measured); (b) female gives birth to one or two lambs that die as neonates (female pays cost of gestation but no or little cost of lactation); (c) female gives birth to one or two lambs that survived to weaning age (~4 months; pays both gestation and lactation costs). Line drawing of sheep by Becky Lister-Kaye To test for age-dependent associations between reproductive performance and RTL, we included an interaction between reproductive performance and age group where age group was modelled as a categorical predictor with three levels: 1 = Yearlings, 2 = Adults (2-to 6-year-old ewes), 3 = Geriatrics (≥7-year-old ewes). We used these groupings based on findings from a previous study where the survival costs of reproduction were identified in 1-year-old ewes and ewes ≥7 years of age (Tavecchia et al., 2005). We compared the topranked model relating TL and reproductive performance (Table 1) to a model in which the reproductive variable was allowed to interact with our age-grouping (Table S3). We also calculated the posterior median and 95% HPDIs for the difference in RTL for individuals in different reproductive performance and age group categories by calculating pairwise contrasts (Table S4).

| Bivariate analyses
We also fitted bivariate GLMMs of RTL and different aspects of reproductive performance using a Bayesian framework, implemented in the R package MCMCglmm version 2.29 (Hadfield, 2010). Bivariate models allow estimation of variance for each of the two focal response variables, as well as estimation of the covariance between them at different hierarchical levels (Froy et al., 2019(Froy et al., , 2021 but ignore variation in litter size. These were: (i) fecundity (0 = did not reproduce, 1 = gave birth to one or more lambs; 1982 samples from 630 females); (ii) gestational investment (0 = did not reproduce, 1 = gave birth to one or more lambs and all lambs died; 617 samples from 389 females); (iii) lactational investment (0 = gave birth to one or more lambs but all lambs died; 1 = gave birth to one or more lambs but at least one survived; 1632 samples from 570 females); and (iv) gestational and lactational investment (0 = did not reproduce or gave birth to one or more lambs but all lambs died, 1 = gave birth to one or more lambs but at least one survived; 1982 samples from 630 females). We ran four separate bivariate models including RTL (modelled as a Gaussian distribution) and each of these binary reproductive traits (modelled as threshold distributions) as the response variables. In all four models, the random effects for individual identity and year were fitted for both response variables using unstructured variance-covariance matrices. Random effects for qPCR plate and row were included for RTL only. We obtained a posterior distribution for the variance and covariance between our response traits for each of our random effects. In all models, age and body weight (in August) of the individual were included as fixed covariates for both response variables, ensuring that the estimated covariance between reproductive performance and TL was independent of any potentially confounding effects of these two factors. For the first two models with fecundity and gestational investment as response traits, we also accounted for the nonlinear relationship between age and the response traits by including a quadratic term for age as an additional fixed effect. Similarly, for the lactational investment model and the final model accounting for both gestational and lactational investment, we included age (linear and quadratic terms), sex of the lambs and litter size as fixed effects as these factors are known to be associated with lamb survival to independence (Clutton-Brock et al., 1992).
In all models, TL was standardized such that the mean was 0 and standard deviation was 1, prior to inclusion in the models. For the binary traits of fecundity, gestation and lactation, the residual variance was fixed to 1 (as is recommended for threshold models).
The models were run for 5.3 × 10 5 iterations, with 3 × 10 4 iterations as warm-up and a thinning interval of 500 resulting in a posterior stored sample size of 1000 and autocorrelation <0.1 for all parameters. Parameter expanded priors were used for the variance components and inverse-Wishart priors for the residual variances in all models. Terms were considered statistically significant based on their 95% CIs (credible intervals) not spanning 0.

| Univariate analyses
Our model comparison approach identified the top-ranked model as one which included a fixed effect contrasting females that did not give birth, that gave birth and all lambs died (gestational cost), F I G U R E 2 Relationship of TL with (a) age and (b) body weight of Soay ewes from the best-fitting model (Table 2). Points represent raw data (n = 1,982 observations from 630 individuals) and lines represent the median line (in bold) and 2000 draws from the posterior distribution. Full model estimates in Table 2 F I G U R E 3 (a) Relationship between TL and reproductive performance. Points and boxplot represent raw data (n = 1,982 observations from 630 Soay ewes) while density plot represents mean and 95% CI from posterior distribution (Table 2). (b) Pairwise contrast plot showing the difference in posterior median estimate between the different reproductive performance categories (Table S2) and that gave birth and had surviving lambs (gestation and lactation costs), but did not contrast females with differing litter sizes (model 4, Table 1). Effect sizes and 95% CIs for this model are shown in Table 2. The model provided some support for a decline in TL with age (−0.003; 95% CI: −0.009 to 0.003) and with increased August body weight (−0.004; 95% CI: −0.009 to 0.001) (Figure 2). The individual repeatability of TL estimated in this model was 0.241 (Table 2).
Post hoc comparisons between the different categories of reproductive performance from the best-fitting model (Table 2) revealed that females that gave birth to at least one lamb that subsequently died before becoming fully weaned had shorter telomeres compared to females that had no offspring (−0.052, 95% CI: −0.094 to -0.007; Figure 3; Table 2; Table S2). Additionally, individuals that successfully weaned their offspring had longer telomeres compared to individuals whose lambs died before weaning (0.043, 95% CI: 0.009-0.078; Figure 3; Table 2; Table S2).
Since the expression of reproductive traits is highly agedependent, we further investigated whether these patterns dif- HPDIs) from the posterior distributions for this model to examine the age-dependent relationship between reproduction and RTL and present these in Tables S3 and S4 and Figure S1. We note that, in this model, the reduction in TL between females that gave birth but did not successfully wean lambs vs. nonreproducing females was slightly stronger in yearlings and only in this age class did the 95% CIs associated with the contrast not overlap zero ( Figure S1).
Similarly, the contrast between females that bred and successfully weaned offspring vs. females that bred but did not wean offspring was strongest and did not have 95% CIs overlapping zero only in the prime-aged adult group ( Figure S1).

| Bivariate analyses
We used bivariate mixed-effect models to decompose the population-level correlations between RTL and the reproductive traits into among-individual, among-year and within-individual levels. In the bivariate model with RTL and fecundity as response variables, the 95% CIs spanned 0 at all three hierarchical levels ( Figure 4; Table S5). However, we found a negative correlation between RTL and gestational investment at the residual (within-individual) level (-0.143; 95% CI: -0.307 to −0.003; Figure 4; Table S6). We also observed a positive correlation between RTL and lactational investment at the residual level (0.137; 95% CI: 0.013-0.214; Figure 4; Table S7) and between RTL and gestational and lactational investment at the residual level (0.082; 95% CI: 0.011-0.196; Figure 4; Table S8). The 95% CIs overlapped zero at the among-individual and among-year levels in all of these models (Figure 4; Tables S5-S8).

| DISCUSS ION
We found mixed support for the hypothesis that TL and reproductive performance are negatively associated. Consistent with this prediction, female Soay sheep that gave birth in spring, and thus paid any physiological costs associated with gestation, had shorter telomeres come summer than females that did not give birth. However, we found no effect of litter size on TL and found that females whose lambs survived to weaning, and thus paid any costs of lactation, actually had longer TL in summer than females that gave birth but lost their lambs as neonates. These results suggest that telomeres  (Tables S5-S8) full-term gestation and parturition predicted shortened telomeres around 4 months later. Previous studies in mammals have also found that increased reproductive investment predicted increased telomere attrition (Cram et al., 2018;Kotrschal et al., 2007), although the mechanisms linking reproduction and telomere shortening remain unknown. Our results show that TL shortening may reflect costs of some aspects of reproduction in wild mammals, but also suggest that heterogeneity in individual condition may also drive the relationship between reproduction and TL dynamics.
Contrary to our predictions, we found no evidence that increased litter size influenced TL and found that ewes that have invested in both gestation and lactation have longer telomeres than those that only invested in gestation. Both findings can be interpreted as high-quality individuals paying a lower physiological cost of reproduction in terms of telomere attrition due to heterogeneity in individual physiological condition or state. This is supported by findings in studies of other wild populations where individuals with the highest reproductive success experienced lesser telomere shortening (common terns, Bauch et al., 2013;sand lizards, Olsson et al., 2011). In female Soay sheep, heavier ewes, which have higher overall body condition, are more likely to give birth to twins and are also more likely to have offspring that survive through to weaning Clutton-Brock et al., 1996;Regan et al., 2017). This suggests that these females are higher quality individuals that are likely to be better able to tolerate the associated costs of reproduction and provide high-quality milk and care to their offspring.
Previous studies that have failed to detect predicted negative associations between TL and reproductive investment have hypothesized that individual quality may be mediating the relationship between TL dynamics and fecundity such that high-quality individuals may be able to invest in both reproduction and somatic maintenance (Bauch et al., 2013;Cerchiara et al., 2017;Graham et al., 2019;Parolini et al., 2017). One explanation for the increase in TL associated with lactation in female Soay sheep is that TL is capturing some form of heterogeneity in an individual's physiological condition or state, which is also positively associated with a ewe's ability to provide nutrients and care to its lambs after birth. We note that this association appears to be independent of the ewe's body mass, which was included in our models. Rerunning models without August weight produced qualitatively identical results (not shown), and this suggests that positive RTL-lamb survival associations are associated with aspects of physiological state that are not well captured by variation in summer body mass.
It is perhaps surprising that such positive "quality" or "condition" effects are observable for lactation but not gestation, given lactation is expected to be more physiologically demanding than gestation in female ungulates (Clutton-Brock et al., 1989;Oftedal, 1985). Our results may reflect differences in the ability of low-vs.
high-quality females to tolerate the physiological costs of gestation and lactation. Female Soay sheep have high average annual fecundity during adulthood, with relatively low annual variation in the probability of breeding compared to other temperate wild ungulate species, perhaps as a result of human domestication in their recent evolutionary history (Boyd et al., 1964;. Ewes in poor condition that breed may be unable to meet the demands of rearing their offspring to weaning, with the cost of giving birth manifesting as reduced TL in these females. Ewes in good condition are more likely to breed, and be able to invest sufficiently to successfully wean a lamb. The increased TL of these females may reflect their ability to tolerate any detrimental physiological effects of investment into reproduction, which are over-ridden by effects of individual differences in resource availability and acquisition during the heavily resource-demanding lactation period. Furthermore, investigation of age-specific costs of reproduction on TL revealed no statistical support for the model including an interaction between different age groups and reproductive performance based on our model comparison approach. This suggests that the observed effects of reproduction on TL are not driven by specific age groups but rather that these reproduction-related effects on TL are consistent across ages. However, since more than 50% of the individuals that fail to breed consist of a single age group (i.e., yearlings) and twinning is also rare in this age group (0.6% of yearlings in this data set had twins), we recognize that there may be limited statistical power to test for such age-specific interactions between reproductive performance and TL.
We used bivariate mixed-effects models to determine whether observed associations between TL and reproductive performance were driven by processes operating at the within-or amongindividual level, or both. The results of these models recapitulated key results of our univariate models of LTL-identifying a negative covariance between LTL and gestation, and a positive covariance with lactation-but offered some additional insight into whether these correlations originated at the among-or within-individual levels. We would expect signals of short-term costs of reproduction to be present at the within-individual level, but associations driven by consistent individual heterogeneity in resource availability or condition would be expected to manifest at the among-individual level.
While we only observed significant correlations between LTL and reproductive measures at the within-individual level, we note that our ability to estimate among-individual heterogeneity in reproductive performance is limited in some of these models due to exclusion of some data. In particular, the gestational investment model can only estimate the propensity of individuals to fail to breed vs.
giving birth to offspring that died preweaning. This model excludes more than half the telomere data and the many observations of females that produced lambs that survived to weaning age. The ability of this model to separate among and within sources of reproductive variation should therefore be interpreted cautiously (Table S6).
However, the models including lactational investment and gestational and lactational investment (Tables S7 and S8), which include the vast majority or all of the observations, should offer more robust insights. These models suggest that positive associations between LTL and the propensity to produce lambs that survive to weaning are present at the within-individual rather than the among-individual level. This suggests that it is not heterogeneity between individuals (due to differences in genetics or early-life environment) that is driving the association. Rather, more immediate effects of variation in current physiological condition on TL must underpin the relationship between maternal TL and lactational investment. This is in contrast to our recent study using a similar approach, which found that the positive association between TL and survival in Soay sheep was driven by genetically based among-individual covariation (Froy et al., 2021). Previous longitudinal studies that have examined the effect of reproductive traits on both TL and telomere attrition show mixed results (Bauch et al., 2013;Cerchiara et al., 2017;Heidinger et al., 2012;Reichert et al., 2014;Sudyka et al., 2014), but few have directly decomposed the contributions of among-and withinindividual level processes to observed associations Cram et al., 2018 -Brock et al., 1982;Moen, 1973). This could result in higher overall condition of lactating females. Thus, variation in feeding behaviour and condition during the period between parturition in spring and TL measurement in summer could underlie the observed differences in TL between females that did or did not invest in lactation. Similarly, lamb neonatal mortality could potentially result in physiological stress or behavioural changes in the mother (Nowak et al., 2000), resulting in shortened TL in mothers that invest in gestation alone. Reproduction could also result in the activation of the hypothalamic-pituitary-adrenal (HPA) axis and elevated glucocorticoid levels (Crespi et al., 2013;Reeder & Kramer, 2005) with downstream effects on TL (Haussmann & Marchetto, 2010). There are also endocrine changes during gestation, parturition and lactation where gestating mammals experience high levels of oestrogen and progesterone to maintain pregnancy while prolactin and oxytocin act on mammary glands of lactating mammals to stimulate milk production (Bazer & First, 1983;Norris & Lopez, 2010). These hormonal changes, specifically changes in oestrogen levels, are known to affect telomerase transcription and expression in vitro (Grasselli et al., 2008;Kyo et al., 1999). At a proximate level, increased cell turnover rates or oxidative stress associated with gestation and the physiological changes accompanying pregnancy and parturition (as detailed above) could potentially be driving LTL shortening in mammals (Sudyka, 2019). Ultimately, we need more experimental studies which manipulate different aspects of reproductive investment to understand the causes of associations between reproduction and TL. However, alongside this, correlational studies of TL in wild animals are necessary to understand how the relationships between TL, reproduction and survival relate to fitness and thus whether TL is under natural selection and is a reliable marker of trade-offs between life history traits in natural settings.
Our long-term longitudinal study of a free-living ungulate demonstrates that telomere dynamics are unlikely to be a straightforward biomarker of reproductive costs when applied in an observational context. We find that the strength and direction of the relationship between TL and reproduction varies depending on the type of reproductive output measured. This probably reflects the physiologi- relationships between physiological biomarkers such as TL and life history traits, with important consequences for our understanding of the processes responsible for telomere dynamics in natural systems (Froy et al., 2019(Froy et al., , 2021.

ACK N OWLED G EM ENTS
We thank the many project members and volunteers who have contributed to the long-term study of Soay sheep on St Kilda over many years; the National Trust for Scotland for permission to work on St Kilda and QinetiQ and Kilda Cruises for logistical support in the field. We thank Lorraine Kerr and Eliane Salvo-Chirnside for laboratory support; and Jacob Moorad and Amy Pedersen for constructive feedback on earlier drafts of the manuscript. We also thank two anonymous reviewers for feedback that improved the manuscript.
The long-term field study has been largely funded by the UK Natural

DATA AVA I L A B I L I T Y S TAT E M E N T
The data used in this study are available at https://github.com/sanja narav indra n/TL-Repro ducti on-SoayEwes