Received: October 29, 2018 | Accepted: December 14, 2018 | Published: December 20, 2018
OBM Genetics 2018, Volume 2, Issue 4 doi: 10.21926/obm.genet.1804055
Academic Editors: Stéphane Viville and Marcel Mannens
Special Issue: Epigenetic Mechanisms in Health and Disease
Recommended citation: Lawrence T, Murphy TM. Genetic and Epigenetic Regulation of Telomere Length: Current Findings, Methodological Limitations and Possibilities for Future Studies. OBM Genetics 2018;2(4):055; doi:10.21926/obm.genet.1804055.
© 2018 by the authors. This is an open access article distributed under the conditions of the Creative Commons by Attribution License, which permits unrestricted use, distribution, and reproduction in any medium or format, provided the original work is correctly cited.
Telomeres are ribonucleoprotein complexes that protect the ends of eukaryotic chromosomes and play a critical role in maintaining genomic stability. Telomere length (TL) and sequence vary between species; in humans, telomeres are on average 5-15 kb long and are made up of TTAGGG repeats. The main function of telomeres is ‘capping’ the chromosomes, protecting them from damage. Double-stranded DNA makes up the majority of each telomere, and a single-stranded DNA overhang forms the t-loop (telomeric loop), which is supported by a network of proteins called shelterin. Shelterin plays a role in protection from enzymatic digestion, TL regulation and the control of signalling cascades. These properties enable telomeres to protect chromosome ends from degradation, DNA damage and fusion. DNA replication is proficient at copying coding DNA, but the eukaryotic replication machinery is unable to copy the very ends of telomeres, which have a single-stranded structure, resulting in telomere shortening. This gradual loss of repeats reduces the renewal capacity of cells and ultimately causes the onset of cellular senescence. If cells continue to divide, the telomeres fuse, causing genomic instability that can lead to a number of diseases, including cancer. TL has been widely implicated as a marker of biological age and is influenced by inflammation and cellular stress. Furthermore, shorter telomeres are robustly associated with several human diseases, including depression, cardiovascular disease and cancer. This review examines the evidence for genetic and epigenetic regulation of TL and discusses the role of TL in human disease, before outlining some of the methodological limitations of these studies. Finally, the review defines the ‘epigenetic clock’ and evaluates its relationship with TL.
TL is maintained in some cell types, notably stem and germ cells, by the action of telomerase, a ribonucleoprotein that contains the RNA template TERC and the reverse transcriptase TERT. However, in most somatic tissues telomeres shorten with each cell division, a process believed to be accelerated by oxidative stress and inflammation [13, 14]. In somatic cells, telomeres shorten by 100-200 bp with each cell division because the TL maintenance normally carried out by telomerase is lacking. Telomerase activity in most somatic cells diminishes after birth, resulting in gradual telomere shortening. Since a critical TL is needed to prevent activation of DNA damage pathways, the shortening of telomeres eventually leads to cell cycle arrest, a hallmark of cellular senescence, which is considered to promote ageing.
Inter-individual variation in mean TL has been associated with cancer and several age-associated diseases, and TL has emerged as a promising biomarker for biological age. Early evidence of the role of telomeres in disease emerged from studies of telomeropathies, which develop when telomere erosion occurs prematurely as a consequence of mutations in genes coding for factors involved in telomere maintenance and repair. Human telomeropathies include Hoyeraal-Hreidarsson syndrome (HHS), dyskeratosis congenita (DC) and aplastic anemia. Although these diseases show a wide and complex range of clinical symptoms, all of them are characterized by critically short telomeres. TL is also associated with several cancers, and mutations in telomere-associated genes, such as TERT, OBFC1 and TERC, are associated with an increased risk of multiple cancers. It has even been suggested that short leucocyte TL (LTL) increases the risk of all cancers and of cancer fatalities. Moreover, illnesses such as depression and diseases with an inflammatory etiology have been associated with telomere shortening [20,21,22]. Together with mouse models of telomerase deficiency, these observations provided important biological insight, illustrating that short telomeres are a major mechanistic cause of the disease phenotypes. Moreover, multiple model systems (e.g., ciliates, yeast, plants and mouse) have been instrumental in studying the molecular underpinnings of telomere regulation and its role in disease, and these have been extensively reviewed elsewhere [2,24,25]. Although a causal relationship between telomerase deficiency and human disease has been established, the cause-and-effect relationship between variation in TL and human disease is less well understood.
Deciphering the role of genetic and epigenetic regulation of TL in humans will further facilitate our understanding of telomere biology and help elucidate biological pathways that are influenced by changes in TL and advance our understanding of the biology of ageing and age-associated diseases.
Since the 1990s, it has been established that mean TL shortens with age at a rate of roughly 31 bp per year, but variation is frequently observed between individuals of the same age, implying that factors other than age can also affect TL. Twin and cohort studies have been used to estimate the mean heritability of TL, which was shown to range from 78% to 82% [27,28]. This high heritability has led to several investigations to identify genomic loci that exert an effect on TL.
Genome-wide association studies of telomere length. The first locus to be associated with TL was TERC, one of the components of telomerase. The first genome-wide association study (GWAS) to identify an association between TL and genetic variation at the TERC locus was conducted in 2917 European individuals, with replication in 9492 individuals; a SNP (rs12696304) located 1.5 kb downstream of the TERC locus was significantly associated (P = 3.72 × 10⁻¹⁴) with shorter mean LTL. Another GWAS identified a SNP (rs10936599) within a haplotype block that encompassed the TERC locus and was associated (P = 3.92 × 10⁻⁵) with longer TL, in contrast to the previously identified TERC-associated SNP. This discrepancy may arise because neither SNP is located in the TERC coding region; instead, they may affect TERC expression through one of the other genes at the 3q26 locus, with one SNP preventing its expression and the other increasing its function. Jones et al. also found that the SNP conferred a heightened risk of developing colorectal cancer, providing evidence that TL does indeed play a role in the development of some diseases. A third GWAS, conducted in 3,417 participants with replication through de novo genotyping, confirmed that genetic variation at the TERC locus is associated with mean LTL. The authors also identified a SNP (rs4387287) in the OBFC1 (STN1) gene that is significantly associated (P = 3.9 × 10⁻⁹) with mean LTL. The protein encoded by OBFC1 stimulates DNA polymerase-alpha-primase, the enzyme that initiates DNA replication, and appears to function in a telomere-associated complex with C17ORF68 and TEN1.
TERT, or telomerase reverse transcriptase, codes for the catalytic subunit of telomerase and forms the telomerase enzyme together with TERC (see Figure 1). One study identified two independent SNPs, rs2736108 (P = 5.8 × 10⁻⁷) and rs7705526 (P = 2.3 × 10⁻¹⁴), located in the TERT promoter and TERT intron 2 respectively, that are both associated with longer TL. The SNP located in TERT intron 2 (rs7705526) was also significantly associated (P = 1.3 × 10⁻¹⁵) with a higher risk of low malignant potential ovarian cancer. Multiple studies have attempted to replicate this finding, and whilst some have been successful, Shen et al. found no genome-wide significant associations between TERT and TL. It is possible that this is due to the small sample size of that study (n = 742). Despite the ability of these approaches to identify genetic variation associated with TL, these initial studies were only moderately powered, and more recent genetic studies have employed large-scale meta-analyses to identify robust genetic loci associated with TL.
Meta-analyses investigating genetic loci associated with TL. The first large-scale TL GWAS meta-analysis included 9190 participants from six GWASs (European population), with replication in 2226 participants from four studies. This study confirmed the association between TL and genetic variation at the previously identified OBFC1 and TERC loci (OBFC1, rs9419958, P = 9.1 × 10⁻¹¹; TERC, rs1317082, P = 1.1 × 10⁻⁸). The meta-analysis also identified novel genetic loci associated with TL, namely CTC1 (rs3027234, P = 3.6 × 10⁻⁸) and ZNF676 (rs412658, P = 3.3 × 10⁻⁸). OBFC1 and CTC1 together encode two of the three subunits of the human heterotrimeric complex named CST (for Cdc13-Stn1-Ten1), suggesting that this complex may play an important role in maintaining telomeres. In 2013, the largest GWAS meta-analysis of LTL was performed in a cohort of 37,684 individuals, with replication in 10,739 individuals. The study identified seven robust genetic loci that were significantly associated with mean LTL (P < 5 × 10⁻⁸). The previously identified TERC, TERT and OBFC1 loci and the novel NAF1 and RTEL1 loci have known roles in the maintenance of telomeres (see Figure 1). NAF1 encodes a protein required for H/ACA box snoRNA assembly and is therefore required for the assembly of TERC, which belongs to that RNA family. RTEL1 encodes a DNA helicase with roles in telomere maintenance and DNA repair. In addition, the large-scale meta-analysis identified LTL-associated genetic variants at ZNF208, a gene that codes for a member of the zinc finger protein family and thus plays an important role in DNA-binding-mediated gene regulation (see Figure 1). The final LTL-associated genetic variant (rs11125529) is associated with both the ACYP2 and TSPYL6 genes. ACYP2 has also been reported to play an important role in pyruvate metabolism.
The biological function of TSPYL6 is not known, but TSPYL2, another member of the TSPY-like/SET/nucleosome assembly protein-1 superfamily, plays a role in chromatin remodelling and suppression of tumour growth. Interestingly, the authors show that alleles associated with shorter TL are also associated with an increased risk of coronary heart disease, again supporting the idea that TL plays a role in disease. More recent, larger GWASs of TL have built on the findings of Codd and colleagues and identified two additional robust genetic loci (DCAF4 and PXK) associated with TL. DCAF4 has been shown to interact with DDB1 and CUL4 and has consequently been suggested to play a role in UVR-induced DNA damage and transcription-coupled repair pathways. PXK is a serine/threonine kinase that regulates electrical excitability, and it therefore seems unlikely to be directly involved in the regulation of TL. A comprehensive list of genetic variants associated with LTL is summarised in Table 1 and illustrated in Figure 1.
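As a minimal sketch of how the conventional genome-wide significance threshold (P < 5 × 10⁻⁸) quoted above separates the reported signals, the snippet below tabulates the SNPs and P values cited in this section and splits them by that threshold. The function name `split_by_threshold` and the data layout are illustrative assumptions, and the individual studies may have applied thresholds suited to their own designs (for example, candidate-locus or replication analyses).

```python
# Conventional genome-wide significance threshold quoted in the text.
GENOME_WIDE_P = 5e-8

# (SNP, locus, reported P value) exactly as quoted in this review.
REPORTED = [
    ("rs12696304", "TERC", 3.72e-14),
    ("rs10936599", "TERC", 3.92e-5),
    ("rs4387287", "OBFC1", 3.9e-9),
    ("rs2736108", "TERT", 5.8e-7),
    ("rs7705526", "TERT", 2.3e-14),
    ("rs9419958", "OBFC1", 9.1e-11),
    ("rs1317082", "TERC", 1.1e-8),
    ("rs3027234", "CTC1", 3.6e-8),
    ("rs412658", "ZNF676", 3.3e-8),
]


def split_by_threshold(records, threshold=GENOME_WIDE_P):
    """Return (passing, failing) records relative to a P-value threshold."""
    passing = [r for r in records if r[2] < threshold]
    failing = [r for r in records if r[2] >= threshold]
    return passing, failing


passing, failing = split_by_threshold(REPORTED)
for snp, locus, p in failing:
    print(f"{snp} ({locus}) does not reach P < {GENOME_WIDE_P:g}: P = {p:g}")
```

Under this strict cut-off, rs10936599 and rs2736108 fall short, which is one reason replication in larger meta-analyses matters for these loci.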
Figure 1 Molecular pathways involved in telomere length. Several genes that are known to code for enzymes that play an important role in the regulation of telomere length exhibit genetic (e.g. CTC1, OBFC1, TERT, TERC, ZNF208, ZNF676) and epigenetic (e.g. MAD1) alterations. CST complex: cellular multiprotein complex.
Table 1 Genes associated with telomere length.
Details of each gene's function, locus, associated ethnicity, SNPs and health conditions related to genetic changes in the gene are included. Genes identified in studies with fewer than 500 participants and no replication were excluded. All information in the table is from the Gene Ontology, Genetics Home Reference and NCBI websites.
Interestingly, genetic risk score analysis showed that inheritance of multiple alleles associated with shorter LTL is associated with an increased risk of coronary artery disease, providing preliminary evidence that telomere shortening might play a causal role in this condition.
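A genetic risk score of the kind referred to above is commonly computed as a weighted sum of allele dosages. The following is a minimal sketch of that idea, not the cited authors' implementation; the SNP labels and per-allele weights are hypothetical placeholders rather than published effect sizes.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per SNP).

    dosages: dict mapping SNP label -> number of effect alleles carried.
    weights: dict mapping SNP label -> per-allele effect size (hypothetical here).
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for: {sorted(missing)}")
    return sum(dosages[snp] * weights[snp] for snp in weights)


# Hypothetical weights and genotypes, for illustration only.
example_weights = {"snp_a": 0.5, "snp_b": 0.25}
example_dosages = {"snp_a": 2, "snp_b": 1}
score = genetic_risk_score(example_dosages, example_weights)
print(score)
```

An individual carrying more alleles associated with shorter LTL receives a higher score, which can then be tested for association with disease.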
Genome-wide associations in other ethnic groups. Many studies examining genetic variation associated with TL have been conducted in European/Caucasian cohorts, and more recent studies have attempted to replicate these findings in different populations. The TERC locus has also been shown to be associated with TL in a Chinese Han population, and the TERT locus has been shown to be associated with TL in both Chinese Han and Korean populations. Several novel TL-associated genetic variants have been identified in non-European populations (see Table 1 for more details), highlighting the importance of conducting such studies in different populations to identify novel genetic loci and provide new insights into the mechanisms involved in maintaining telomeres. This is evident in a meta-analysis conducted in a Punjabi Sikh cohort, which found that an intronic variant (rs74019828) in CSNK2A2, a novel gene, was associated with TL (P = 4.5 × 10⁻⁸). To date, few studies have been conducted in African populations, and those that have been conducted are simply comparisons with European studies. To better understand the mechanisms by which TL is controlled, future studies in African populations would be beneficial, particularly as research suggests that African Americans and sub-Saharan Africans have longer LTL than Europeans and the cause of this difference is unknown [47,48].
Parent-to-offspring transmission of telomeres. As discussed above, TL is a heritable trait with two potential sources of heritability: inherited genetic variation (e.g. SNPs affecting telomere maintenance; see Table 1) and variability in the lengths of telomeres in the gametes that produce offspring zygotes. The latter is described as “direct” transmission of telomeres. Evidence supports an effect of parental TL on offspring TL and suggests that sperm TL increases with age in humans, so that the offspring of older fathers inherit longer telomeres. However, telomeres undergo a “reprogramming” event during early embryogenesis, and it remains unclear to what extent reprogramming alters the impact of germ cell TL on offspring TL. Recent observations provide evidence that TL in parental germ cells affects TL in offspring cells and contributes to LTL heritability despite telomere “reprogramming” during embryogenesis. However, larger studies are required to provide a robust estimate of LTL heritability by “direct” transmission.
Despite the success of genetic studies in furthering our understanding of telomere biology, the identified variants account for only a small proportion of the estimated heritability. Over the last decade, epigenetic regulation of mammalian telomeres has become apparent [52,53,54]. Epigenetics can be defined as the mechanisms that initiate and maintain heritable patterns of gene expression without altering the sequence of the genome. There are several layers of epigenetic complexity, including histone modifications, chromatin remodelling, micro-RNAs and DNA methylation, the last of which has been the most thoroughly studied to date. The epigenome is potentially malleable, changing with age and in response to a plethora of environmental and psychosocial factors, and thus provides a mechanism mediating the interaction between genetic susceptibility and environmental risk exposures. Given that environmental factors (such as smoking, stress and obesity) have been shown both to influence a person’s epigenome [58,59,60] and to accelerate the rate of telomere shortening [61,62], it is plausible that together these changes could influence the expression of subtelomeric genes.
The most widely studied epigenetic modification is DNA methylation, the covalent addition of a methyl group to the carbon at position 5 of the cytosine ring by a family of DNA methyltransferase (DNMT) enzymes, resulting in 5-methylcytosine. In mammalian DNA, 5-methylcytosine is found in ∼4% of genomic DNA, primarily at cytosine–guanine dinucleotides. These CpG sites are non-randomly dispersed throughout the genome and are concentrated in hot spots, or CpG-rich regions, known as CpG islands. Approximately half of all human genes are estimated to contain a 5′ or promoter CpG island [65,66]. In contrast, mammalian telomeres consist solely of TTAGGG repeats and lack CpG sites. However, the subtelomeric region (the segment of DNA adjacent to the telomeric repeats) has been shown to be heavily methylated in both mouse and human studies [67,68]. Studies using knockout mouse models found that DNMTs and other epigenetic enzymes (e.g. histone methyltransferases (HMTs)) were essential for TL regulation, highlighting a crucial role for epigenetic mechanisms in TL homeostasis in mammals.
Global and genome-wide DNA methylation studies and their association with telomere length. Wong et al. examined the association between global DNA methylation and TL in a longitudinal study. As a proxy for global DNA methylation, the study examined DNA methylation at LINE-1 and Alu elements using bisulfite pyrosequencing, a method shown to accurately reflect global DNA methylation levels. The study found an association between TL and both LINE-1 (P < 0.01) and Alu (P = 0.02) methylation, after controlling for confounders such as age, smoking and cellular heterogeneity. In addition, the rate of telomeric change was shown to be correlated with the level of LINE-1 methylation. Although the sample size was limited, this exploratory study suggests that global DNA hypomethylation may be related to decreased TL in peripheral blood leukocytes. A more recent, larger study using an ELISA-based method to determine global 5mC content also showed that decreased global DNA methylation was associated with shortened TL in adolescents. Taken together, these studies provide the impetus for future studies, involving larger cohorts, to focus on gene-specific DNA methylation signatures potentially related to TL. Such studies are likely to further our understanding of the role telomeres play in the etiology of chronic diseases.
Buxton et al. performed the first genome-wide DNA methylation association study of TL in a small cohort (n = 24), using the Illumina 450k BeadChip (450k) array. The study identified 65 gene promoters that were enriched for CpG sites at which methylation levels were associated with TL. Interestingly, TL-associated epigenetic changes were enriched in subtelomeric and imprinted genomic regions. Several of these 65 loci have potential roles in human telomere biology, such as the MAD1L1 gene, a potent regulator of TERT (see Figure 1). These results point towards a bi-directional relationship between epigenetics and telomere shortening, whereby the methylation levels of many subtelomeric genes change as telomeres shorten. This potentially results in transcriptional changes to these genes and, in turn, in DNA methylation changes at telomere-associated genes, which may further contribute to dysregulation of TL in humans. The latter is further supported by studies in cancer, which have shown that the epigenetic plasticity of the TERT gene promoter is an important regulator of telomerase activity [72,73]. Moreover, inhibiting the expression of the TERT gene through epigenetic mechanisms can lead to telomeric shortening. Both DNA hypermethylation and histone modifications of the hTERT promoter have been associated with TERT expression and TL shortening. A major limitation of the study by Buxton et al. is its small sample size. In 2016, Marioni and colleagues performed a large-scale meta-analysis of epigenome-wide association studies (EWAS) of TL in old-age cohorts (n = 2194) and did not identify any robust DNA methylation changes associated with TL.
Of note, the cohorts used in the Marioni study were from participants > 70 years of age and it would be interesting to repeat such analysis in younger cohorts, whose epigenome and telomere biology will be less influenced by age-related diseases.
Telomere position effect. Telomere shortening can also alter the expression of nearby genes, a phenomenon known as the telomere position effect (TPE). TPE involves the spreading of telomeric heterochromatin, resulting in transcriptional silencing of nearby genes. To date, most studies of TPE have been limited to model organisms (e.g. yeast), in which telomeres have been shown to loop over longer distances and repress genes up to 20 kb from the telomere end [77,78]. In human studies, chromosome looping has been shown to enable telomeres to access distant genetic loci (up to 10 Mb away) on their respective chromosomes and affect their silencing. Interestingly, the same loci become separated when telomeres are shortened. This process is called telomere position effect over long distances (TPE-OLD), and research suggests that it produces extensive changes in gene regulation before telomere shortening induces DNA damage signals. Further investigation of the epigenetic mechanisms (e.g. chromatin modifications and DNA modifications) involved in TPE-OLD is vital to further our understanding of this phenomenon and its role in diseases of ageing.
Telomere regulation by non-coding RNAs. TERRA (telomere repeat-containing RNA) is a long non-coding RNA (lncRNA) transcribed from the telomeric sequence; it has been hypothesized to regulate telomere replication and is possibly involved in the silencing effects observed in TPE-OLD. The transcription of TERRA molecules is thought to be regulated by DNA methylation of CpG island promoters found on subtelomeres. Once transcribed, TERRA molecules bind to telomerase and regulate its activity. This ‘TERRA-silencing’ is suspected to be a potential mechanism of telomere maintenance in human tumours, through the promotion of Alternative Lengthening of Telomeres (ALT) rather than extension by telomerase. However, TERRA molecules are also implicated in telomerase-negative cancer cells, indicating that they may also act via an alternative mechanism, possibly a DNA-damage-response (DDR) pathway. Given our limited understanding of these mechanisms, further exploration of TERRA as an epigenetic means of controlling TL is needed.
Another hypothesized proxy of biological age is the epigenetic clock, or DNA methylation age (DNAm age). The epigenetic clock measures the cumulative effect of an epigenetic maintenance system by measuring DNA methylation levels at 353 CpG sites and is highly correlated with chronological age. The difference between the predicted DNA methylation age and the chronological age is termed “epigenetic age acceleration”. This acceleration is correlated with several ageing-related phenotypes, including frailty, cancer, cardiovascular disease and Parkinson’s disease [85,86,87]. Interestingly, TL and the epigenetic clock are independently associated with chronological age [88,89]. A recent meta-analysis showed that, among several estimates of epigenetic age acceleration, one measure, extrinsic epigenetic age acceleration (EEAA), was superior in predicting all-cause mortality. EEAA is defined as the weighted average of DNAm age and imputed proportions of naïve CD8+ T cells, memory CD8+ T cells and plasmablasts. Interestingly, LTL and EEAA have been shown to be negatively correlated, and this correlation appears to reflect the ageing of the immune system, namely the age-dependent change in the proportions of naïve CD8+ T cells and memory CD8+ T cells. This study suggests that DNAm age is a proxy for immune ageing and that telomere shortening partially reflects this immune ageing. The authors suggest that, as the immune system is largely developed during early life, studies should focus on LTL and epigenetic age in children to gain further mechanistic insights. In the future, it may be interesting to combine epigenetic age and TL as a biomarker panel for biological ageing and for the prediction of diseases of ageing.
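The definition of epigenetic age acceleration given above (predicted DNAm age minus chronological age) can be sketched in a few lines. The second function shows a residual-based variant, in which DNAm age is regressed on chronological age and the residual is taken as the acceleration; this variant is included here as an assumption about common practice rather than as something specified in this review. The function names and the example ages are hypothetical.

```python
from statistics import mean


def age_acceleration_difference(dnam_age, chron_age):
    """Acceleration as defined in the text: DNAm age minus chronological age."""
    return [d - c for d, c in zip(dnam_age, chron_age)]


def age_acceleration_residual(dnam_age, chron_age):
    """Residuals from a simple linear regression of DNAm age on chronological age."""
    mx, my = mean(chron_age), mean(dnam_age)
    sxx = sum((x - mx) ** 2 for x in chron_age)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chron_age, dnam_age))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(chron_age, dnam_age)]


# Hypothetical ages in years, for illustration only.
chron = [50.0, 65.0, 40.0, 72.0]
dnam = [52.0, 60.0, 43.0, 70.0]
diff = age_acceleration_difference(dnam, chron)
resid = age_acceleration_residual(dnam, chron)
print(diff)
```

Positive values indicate an individual who appears epigenetically older than their chronological age; the residual form is uncorrelated with chronological age by construction.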
Technological caveats. Three common techniques employed in the genetic and epigenetic variation studies described above are real-time quantitative PCR (qPCR), Southern blotting (SB) and single telomere length analysis (STELA). qPCR is the most commonly adopted method owing to its ease of use and high-throughput capabilities. Its main limitation is that it cannot measure TL directly in base pairs; instead, TL is measured as the ratio of telomeric product to single-copy gene product (T/S). However, the changes seen in TL are normally small, meaning that precision in TL measurements may be important for obtaining reproducible results. Hence, SB and STELA are used because of their high precision. Adoption of STELA is limited, as the assays have only been developed for a small number of chromosome ends owing to the current lack of knowledge of subtelomeric regions. A study was consequently conducted to investigate the reproducibility of these three techniques and concluded that the variation in results between techniques was minimal, with qPCR and SB having very similar reproducibility. This indicates that the choice of technique has little effect on results; however, it is also important to consider how the techniques differ in methodology. For example, qPCR requires a smaller amount of DNA than the alternative techniques, allowing more samples to be compared and making it the better choice for large epidemiological studies.
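To make the T/S ratio concrete, the sketch below computes a relative T/S value from qPCR cycle-threshold (Ct) values using the comparative Ct approach. It assumes 100% amplification efficiency for both the telomere and single-copy gene reactions and normalisation to a reference DNA sample; individual studies may instead use standard curves or efficiency corrections. The function and parameter names are hypothetical.

```python
def relative_ts_ratio(ct_telomere, ct_single_copy,
                      ref_ct_telomere, ref_ct_single_copy):
    """Relative T/S ratio of a sample versus a reference DNA.

    Assumes perfect doubling per cycle (efficiency = 100%) in both reactions.
    A lower telomere Ct relative to the single-copy gene means more telomeric
    template, and therefore a higher T/S ratio.
    """
    delta_ct_sample = ct_telomere - ct_single_copy
    delta_ct_reference = ref_ct_telomere - ref_ct_single_copy
    return 2.0 ** -(delta_ct_sample - delta_ct_reference)


# Hypothetical Ct values, for illustration only.
ts = relative_ts_ratio(15.0, 20.0, 16.0, 20.0)
print(ts)
```

Because the result is a unitless ratio, it ranks samples by relative TL but does not give a length in base pairs, which is the limitation noted above.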
Controlling for cellular heterogeneity. Both the epigenome and TL are confounded by cellular heterogeneity, and current studies assessing average TL are often performed in bulk cells and do not account for the heterogeneity of TL among individual cells. A major concern in epigenetic epidemiology and telomere studies is that any apparent disease-associated differences may simply reflect differences in cellular composition. For whole blood, routine cell counts or algorithms that infer cellular composition from epigenomic data can be used to control for this variation statistically in studies examining DNA methylation changes associated with TL. Recently, a simple and robust method, Single Cell Amplification of Telomere Repeats by PCR (SCATR-PCR), has been developed that can compare relative TL in individual cells based on a real-time PCR technique. SCATR-PCR, coupled with advances in single-cell epigenomic technology, will allow future studies to directly examine epigenetic changes and TL at the single-cell level.
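Statistical control for cellular composition, as described above, typically means including measured or inferred cell-type proportions as covariates in the regression of methylation on TL. The following is a minimal sketch using ordinary least squares in NumPy; the function name, the single CpG site and the simulated values are all hypothetical, and real analyses would usually add age, sex and batch covariates.

```python
import numpy as np


def tl_effect_adjusted(methylation, tl, cell_props):
    """OLS coefficient for TL on methylation, adjusting for cell proportions.

    methylation: (n,) methylation values at one CpG site.
    tl: (n,) telomere length measurements (e.g. T/S ratios).
    cell_props: (n, k) cell-type proportions; leave one cell type out so the
        columns are not collinear with the intercept.
    """
    design = np.column_stack([np.ones(len(tl)), tl, cell_props])
    coefs, *_ = np.linalg.lstsq(design, methylation, rcond=None)
    return coefs[1]  # coefficient on TL


# Simulated, noise-free data for illustration only.
tl = np.array([1.0, 1.2, 0.8, 1.5, 0.9])
cd8 = np.array([0.10, 0.15, 0.12, 0.08, 0.20])
meth = 0.5 + 0.1 * tl + 0.3 * cd8
beta_tl = tl_effect_adjusted(meth, tl, cd8.reshape(-1, 1))
print(round(float(beta_tl), 3))
```

If the cell-proportion column were omitted, any correlation between cell composition and TL would be absorbed into the TL coefficient, which is the confounding concern raised in the text.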
Causality. It currently remains unclear whether the epigenetic changes at telomere-associated genes are a cause or a consequence of telomere shortening. None of the studies outlined in this review attempted to examine the causal pathways between epigenetic changes and TL, either statistically or experimentally. Longitudinal studies (measuring both epigenetic variation and TL at multiple time points throughout the life course) and Mendelian randomization (MR) approaches are required to fully understand the temporal sequence of events. Longitudinal studies are expensive, and MR is therefore increasingly proving to be a promising alternative. MR rests on the premise that if a trait is causally related to a phenotype, genetic variant(s) controlling the activity of that trait (e.g. DNA methylation) should also be associated with the outcome (TL). Using genetic variants as a proxy for the exposure overcomes confounding because genetic variants are inherited at random during meiosis and so are unrelated to potential confounders (measured or unmeasured). Therefore, following the principles of MR may represent a valid method for revealing the role of environmentally induced epigenetic changes as modifiers of, or risk factors for, telomere shortening. Future studies examining epigenetic alterations and TL should apply these approaches to investigate the direction of effect. For instance, the two-step epigenetic MR approach, an extension of MR, allows researchers to investigate the causal role of DNA methylation in the association between an environmental exposure and TL.
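The MR logic described above can be illustrated with the simplest single-variant estimator, the Wald ratio: the variant-outcome association divided by the variant-exposure association. This is a minimal sketch under the usual instrumental-variable assumptions, not a method taken from the studies reviewed here; the standard error uses a first-order approximation that ignores uncertainty in the variant-exposure estimate, and all names and numbers are hypothetical.

```python
def wald_ratio(beta_variant_exposure, beta_variant_outcome, se_variant_outcome):
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_variant_exposure: effect of the variant on the exposure
        (e.g. DNA methylation at a CpG site).
    beta_variant_outcome: effect of the same variant on the outcome (e.g. TL).
    se_variant_outcome: standard error of the variant-outcome effect.
    Returns (causal_estimate, approximate_standard_error).
    """
    if beta_variant_exposure == 0:
        raise ValueError("variant must be associated with the exposure")
    estimate = beta_variant_outcome / beta_variant_exposure
    std_error = se_variant_outcome / abs(beta_variant_exposure)
    return estimate, std_error


# Hypothetical summary statistics, for illustration only.
estimate, std_error = wald_ratio(0.5, 0.1, 0.05)
print(estimate, std_error)
```

In a two-step design, the same calculation would be applied twice: once with variants proxying the environmental exposure and methylation as the outcome, and once with variants proxying methylation and TL as the outcome.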
Additional considerations. As mentioned previously, TL decreases at roughly 31 bp per year, and TL is longer in women than in men. It is consequently important that findings from TL studies are adjusted for both age and sex, as is done in most of the studies reported above. Some studies also adjusted for confounding factors specific to their design, including BMI and smoking, further strengthening their findings.
A multi-omics approach is required to disentangle the molecular basis underlying telomere biology and its association with disease. Genome-wide investigations have demonstrated that the genetic predisposition to TL and its association with disease is highly polygenic. Integrating polygenic risk scores for TL with DNA methylation profiling would allow the research community to gain a broader and deeper understanding of the pathways primarily involved in telomere biology and their potential co-ordinated interaction. Future studies should also examine the transcriptional consequences of the observed telomere-associated DNA methylation changes. Systems biology methods, such as weighted correlation network analysis (WGCNA), could be applied to identify telomere-associated gene pathways and networks that could be further investigated as biomarkers or therapeutic targets for diseases of ageing.
Therese M. Murphy would like to acknowledge funding from the Brain and Behaviour Research Foundation through a NARSAD Young Investigator Award.
Therese M. Murphy conceived the topic of the review. Taylor Lawrence prepared the first draft of the review and Therese M. Murphy edited subsequent versions of the draft.
The authors have declared that no competing interests exist.