Free Publication in 2019
1. Division of Immune Diversity, German Cancer Research Center (DKFZ), INF 280, 69120 Heidelberg, Germany
2. Faculty of Biosciences, Uni Heidelberg, 69120 Heidelberg, Germany
3. Department of Medicine, Division of Hematology and Medical Oncology, Weill Cornell Medicine, New York, NY, 10021, USA
4. GRAIL, Inc., 1525 O’Brien Drive, Menlo Park, California, 94025, USA
Academic Editor: Kakoli Das
Special Issue: Genetic Heterogeneity in Cancer
Received: December 3, 2018 | Accepted: March 29, 2019 | Published: April 8, 2019
OBM Genetics 2019, Volume 3, Issue 2 doi:10.21926/obm.genet.1902072
Recommended citation: Tasakis RN, Papavasiliou FN, Shaknovich R. RNA Editors and DNA Mutators: Cancer Heterogeneity Through Sequence Diversification. OBM Genetics 2019;3(2):17; doi:10.21926/obm.genet.1902072.
© 2019 by the authors. This is an open access article distributed under the conditions of the Creative Commons by Attribution License, which permits unrestricted use, distribution, and reproduction in any medium or format, provided the original work is correctly cited.
Cancer is indisputably the most diverse group of diseases and is one of the most challenging for today’s biomedical research. Cancer may arise due to different extrinsic and/or intrinsic causes that drive transformation through heterologous mechanisms often associated with a variety of genomic events, which can then lead to an even more diverse clinical presentation and course [1,2]. Cancer development often follows the same pathogenesis: starting from a single genetic event and leading to acquisition of additional genetic and epigenetic abnormalities, clonal evolution and plasticity (events that confer remarkable biological advantage over normal differentiated cellular elements, together with the ability to evolve, develop drug resistance and escape immune surveillance) [3,4]. One of the first attempts to explain cancer development, with regard to somatic (DNA) mutations, was the two-hit hypothesis proposed by Alfred Knudson in 1971, which posits that cancer development requires at least two mutations: one in a proto-oncogene, which then turns into an oncogene, and one in a tumor-suppressor gene. In other words, to become cancerous, a cell must lose control over cell division but must also escape death. What we have learned in the decades since is that cancer cells incur well beyond two mutations: in fact, the mutational load in most cancers is high and this is generally thought to speed up the evolution of the tumor [6,7].
Tumors originate from cancer stem cells (CSCs) or progenitor cells, and tumor formation follows a clonal evolution model. Mutations arise relatively frequently and are generally thought to be a byproduct of cellular proliferation processes through errors in DNA replication and transcription. They can also arise by the targeted action of enzymes that introduce non-canonical bases into the genome (e.g. deaminases, reviewed below; Figure 1). In both cases, mutations are corrected by DNA repair enzymes, but when they are excessive in number they can also be "fixed" as an error into the genome and be inherited by subsequent cell generations. Mutations that give a growth advantage to a specific cell will be positively selected; the cell will then be called a CSC and the mutations that initiated CSC formation will be termed "driver" mutations. CSCs share, to a great extent, characteristics of “regular” stem cells, such as self-renewal, multipotency and tumor-initiating capacity. The last two aforementioned characteristics, which reflect the cellular plasticity of CSCs, can be enhanced by a set of complementary mutations (often termed "passenger" mutations). Chronologically, passenger mutations occur after driver mutations and, even though they may not have been the events that initiate tumorigenesis, they do provide additional sequence heterogeneity, powering intra-tumor clonal evolution, which follows the principles of Darwinian evolution. These additional mutations will eventually allow the CSC to further adapt to different environments, empowering metastasis (Figure 1) [17,18,19].
Figure 1 RNA editors and DNA mutators (ADARs & AID/APOBECs) mediate intra-tumor heterogeneity. The clonal evolution model of cancer development maintains that a cell receives an oncogenic hit and evolves into a cancer stem cell that is responsible for forming a primary tumor. Aside from the initial hit, additional hits (e.g. through targeted base changes catalyzed by enzymes of the RNA/DNA deaminase family) continually empower phenotypic diversity.
However, intra-tumor mutational heterogeneity is not always high. Furthermore, a high mutation load does not always translate to heterogeneity; several tumor types harbor predominant mutations within a small set of oncogenes or tumor suppressors. And yet, in such cases, poorly or homogeneously mutated tumors often behave as if they are informationally heterogeneous: they are functionally diverse, and can relapse just as efficiently as their heterogeneous counterparts. For example, excluding hypermutated samples, colon and rectum cancers were found to have considerably similar patterns of genomic alteration - even though they are phenotypically and functionally different entities. Glioblastomas are another example of a tumor entity that is poorly mutated but quick to relapse, suggesting highly heterogeneous disease.
Emerging evidence suggests that such mutationally "silent", and yet clonally evolving, tumors are often highly modified at the epitranscriptomic (mRNA) level. Epitranscriptomic modifications and specifically those that are catalyzed by the adenosine or cytidine deaminases (the "RNA editing" family [24,25]), are already known to impart sequence heterogeneity at the level of the transcript and have been presumed to increase the informational diversity of otherwise genetically similar cell types [26,27]. RNA editing is mediated by members of the same family of enzymes (polynucleotide deaminases) that were already characterized as prolific DNA mutators. These enzymes come in two flavors: those that belong to the AID/APOBEC family that induce Cytosine-to-Uracil (C-to-U) nucleotide changes , and those that belong to the ADAR family that induce Adenosine-to-Inosine (A-to-I) nucleotide changes (where inosine (I) is decoded by the cellular expression machinery as a guanosine (G)) . While initially these were thought to be strictly selective with regard to their nucleic acid substrate (e.g. with AID being a DNA mutator  and APOBEC1 or ADAR1 functioning exclusively as RNA editors [31,32]) this view has been revised in light of well-defined contexts where an RNA editor like APOBEC1 or like ADAR1 can function quite capably as a DNA mutator [31,32] and vice versa, as is the case for APOBEC3A .
This review aims to highlight the APOBEC/ADAR DNA/RNA family of deaminases as they contribute to tumor evolution through sequence diversification either by RNA editing or by DNA mutation.
As mentioned, there are two distinct types of RNA editing by deamination: C-to-U editing and A-to-I (where I is recognized as a G) . C-to-U editing is catalyzed by the AID/APOBEC family. APOBEC1, the founding member , was initially identified based on its ability to catalyze editing on apolipoprotein B (ApoB); it has been named after this activity (APOlipoprotein-B-mRNA Editing enzyme Catalytic polypeptide-1). ApoB comes in two protein forms encoded by the same genetic locus . In the liver, ApoB exists in its long form (ApoB-100, 4563 amino acid residues), while in the small intestine it is truncated into its short form (ApoB-48). The protein size difference was traced to a single culprit, the APOBEC1 deaminase, which is primarily expressed in small intestine and deaminates a C in the position 6666 (in the triplet CAA) of the ApoB mRNA resulting in stop codon formation (UAA), and in significant truncation of ApoB-100 . The editing event occurs within the exonic region of ApoB, and its biological impact is striking: whereas ApoB-100, expressed in liver, acts as a ligand to the low-density lipoprotein receptor (LDL-R), ApoB-48 (produced through editing in small intestine), lacks the ligand domain, and is involved in the formation and secretion of chylomicrons . However, RNA editing events in exonic regions are rare. Transcriptome-wide screening studies have demonstrated that C-to-U editing preferentially occurs within AU-rich 3’UTRs , which suggests that editing plays a prominent role in transcript processing.
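The ApoB example can be illustrated with a minimal sketch, assuming a toy RNA string rather than the real ApoB transcript. The function names and the toy sequence are hypothetical and introduced only for illustration; the only facts carried over from the text are that a C in a CAA codon is deaminated to U, producing the stop codon UAA.

```python
from typing import Optional

# The three stop codons of the standard genetic code (RNA alphabet).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def deaminate_c_to_u(rna: str, position: int) -> str:
    """Return a copy of `rna` with the C at 0-based `position` converted to U."""
    if rna[position] != "C":
        raise ValueError(f"expected C at position {position}, found {rna[position]}")
    return rna[:position] + "U" + rna[position + 1:]


def first_stop_codon_index(rna: str) -> Optional[int]:
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical toy transcript (NOT the ApoB sequence): AUG GCU CAA GGA
toy_transcript = "AUGGCUCAAGGA"

# Deaminate the C of the CAA codon (0-based position 6): CAA -> UAA.
edited_transcript = deaminate_c_to_u(toy_transcript, 6)
```

In this toy case the unedited transcript has no in-frame stop codon, whereas the edited one terminates at the third codon, mirroring how a single deamination shortens ApoB-100 to ApoB-48.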
While APOBEC1 is capable of editing RNA on its own when ectopically expressed, it normally functions within an editing complex that contains one of several partner proteins (such as RNA Binding Motif 47 (RBM47), APOBEC1 Complementation Factor (A1CF), and probably others as well) whose function is to guide the editase to specific target transcripts. Consequently, Rbm47-deficient mice fail to edit a number of transcripts (including the ApoB transcript) in vivo, while elimination of A1CF ablates editing of a different set, and double deficiency does not lead to complete loss of editing [41,42]. These RNA binding proteins might have an additional function, which is to selectively place APOBEC1 onto mRNA (vs DNA). Specifically, APOBEC1 without its cofactors is a capable DNA mutator in E. coli (substantially more potent in that regard than AID (see below)), and can also mutate DNA in cancer cells (for example, mutational signatures of APOBEC1 correlate with advanced esophageal adenocarcinoma). Therefore APOBEC1, within the nucleus, can act both on DNA (to induce DNA mutation) and on RNA (to mediate RNA editing) - but the circumstances under which APOBEC1 functions as an RNA editor vs a DNA mutator are not clear, and could include the lack of the RBP co-factor. In that regard, it is worth noting that mutations that disrupt the function of the Rbm47 co-factor are intimately related to cancer progression.
AID (the product of the Aicda gene) was the second member of the AID/APOBEC family to be characterized. AID is a key player in adaptive immunity: it was first identified by virtue of its expression in B cells undergoing Ig class switch recombination of the antibody (Ig) locus . It was subsequently discovered that it is also a key mediator of diversification (through somatic hypermutation) of antibody genes . In addition to Ig genes, AID has other genome-wide targets . AID has demonstrable activity as a single stranded DNA mutator: in vitro, it deaminates ssDNA . At switch loci, ssDNA becomes accessible during transcription-mediated R-loop formation ; at the coding portion of the Ig, which is the recombined variable region, ssDNA is also thought to become accessible during transcription, though evidence for that is not nearly as compelling as for the (stable) R-loops of the switch region [49,50]. After deamination has taken place, the U is either "fixed" (not repaired) within DNA (as an A:T base pair leading to a transition mutation as compared to the original G:C base pair) or further processed into a DNA break by components of the base excision repair (BER) pathway, leading to class switch recombination or chromosomal translocations .
Whereas strong experimental evidence suggests that AID is a DNA mutator, it is still capable of binding RNA [52,53]. Indeed, it is possible that under the right circumstances (perhaps together with an unknown co-factor) it can also act as an RNA modification enzyme - it is worth remembering that APOBEC1, as well as other APOBECs, are quite capable mutators in the absence of their co-factors. Whereas AID is not a catalytic component of an RNA editase in the context of ex vivo stimulated B cells, compelling evidence that it cannot act as an RNA editor in other contexts - perhaps even within GC B cells that undergo SHM - does not currently exist. It is interesting to recall that mutational loads for the recombined V region used to be reported in the low percentage digits when detected through reverse transcription and cDNA amplification (e.g. 2-4% mutation accumulation in the recombined VDJ region of Ramos cells); these are currently reported through genomic DNA amplification with rates that are 10 to 100-times lower (in Ramos cells, rates are given as roughly 2 × 10⁻³ per bp). While this discrepancy might be in part due to different reporting schemes, it is not impossible to imagine that it could also be evidence of RNA modification by AID within the context of GC B cells; this remains to be determined.
The APOBEC3 family was discovered soon after AID was characterized. The human genome encodes 7 APOBEC3s (3A-3D, 3F-3H), which have important roles in anti-retroviral immunity by virtue of their ability to deaminate viral DNA intermediates and hinder retroviral replication [58,59]. Specifically, upon retroviral infection and after the viral ssRNA is reverse transcribed, APOBEC3G (the founding member of the APOBEC3 family in humans) was shown to deaminate cytosines (C) to uracils (U) on the cDNA. This can lead both to viral cDNA degradation through UDG and base excision repair engagement, and to viral cDNA hypermutation upon integration as the host’s repair machinery replaces Us with Ts. Whereas the vast majority of anti-viral work in the APOBEC3 context has been done through overexpression and viral production in artificial settings (such as in HEK293T cells), it is abundantly clear that most proteins of the APOBEC3 family have the ability to deaminate ssDNA and are proper DNA mutators. Consequently, most APOBEC3 family members are kept away from genomic DNA and are sequestered in the cytoplasm. Exceptions to this rule are APOBEC3A and -3B, which can be found in the nucleus and are correlated with a particular type of genomic mutation termed "kataegis" that is prevalent in a number of tumors. It is possible that kataegis mutations are the result of the requirement for these enzymes to access the genome for the purposes of RNA editing (with mutation being the unfortunate side effect). Indeed, recent work has suggested that APOBEC3A and -3B can function as RNA editors in human monocytes and macrophages. Whether they require co-factors to transition from DNA mutation to RNA editing within the nucleus is currently unknown.
All AID/APOBEC family enzymes are evolutionarily related, with AID and APOBEC2 being the oldest: both are present in the jawed vertebrates. AID plays a central role in the mechanism of antibody diversification. In contrast, APOBEC2 appears to lack the ability to deaminate; yet, its deletion leads to specific functional outcomes in muscle, where it is most highly expressed, which suggests it plays a yet to be defined role [62,63]. APOBEC1 appears later on (in lizards), likely as a gene duplication of the AID locus. Finally, the APOBEC3 family arose and expanded in the placental mammals. APOBEC4, the last member of the family identified to date, is primarily expressed in testis, seems enzymatically inactive and has yet to be assigned a biological role. These close evolutionary relationships, together with the ability of many members of this family of enzymes to toggle between RNA editing and DNA mutation, suggest a previously unacknowledged flexibility of function, which is likely dependent on context, such as co-factors or nuclear/cytoplasmic localization. The roles of AID/APOBECs, as well as of ADARs (discussed in section 2.2), are summarized in Table 1.
Table 1 The physiological roles of ADAR and AID/APOBEC family members. ADARs and AID/APOBECs play important roles in a variety of biological processes, in which diversification is required. Although not all of them demonstrate deamination activity, their expression is necessary for healthy individuals. Here we summarize the physiological roles of ADARs and AID/APOBECs. Further details are discussed in the text (sections 2.1 and 2.2).
The second broad class of polynucleotide deaminases catalyzes the most abundant type of editing (which is the conversion of A-to-I): these enzymes are termed Adenosine deaminases acting on RNA (ADARs) . There are three ADARs in mammals: ADAR1 and ADAR2, both of which have catalytic activity, and ADAR3, which is currently considered enzymatically inactive. ADAR1 is ubiquitously expressed and has two isoforms: p110, which resides exclusively in the nucleus, and the interferon-inducible p150, which is mostly cytoplasmic . Both are involved in the host response to dsRNA which can arise either from exogenous insults, such as viral infections, or endogenous injury, like retrotransposon mobilization [68,69].
ADAR2 appears to be functional only in the brain, or at least it is so in healthy individuals: it rose to prominence because of its ability to target the glutamate receptor B (gluR-B) pre-mRNA and within it recode a glutamine (Q) codon to an arginine (R) codon at what is known as the Q/R site of a specific ion channel, decreasing its permeability to Ca2+. This is probably one of the very few examples in mammals where ADARs deaminate coding region mRNA to alter amino acid decoding and bring about a strong biological impact. In contrast, ADAR-mediated coding region editing seems to be routine in cephalopods, which use A-to-I editing to radically diversify their proteomes. But overall, it is clear, at least in mammals, that ADAR-mediated editing is most robust and prevalent in intronic regions of pre-mRNAs and, in particular, in mRNAs that carry Alu repeats.
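The recoding logic of A-to-I editing can be sketched as follows, assuming only that inosine is decoded as guanosine (as stated above) and the standard genetic code, in which CAG encodes glutamine and CGG encodes arginine. The function names and the deliberately partial codon table are hypothetical and serve only as an illustration.

```python
# Deliberately partial codon table (standard genetic code); only the
# codons needed for this illustration are listed.
PARTIAL_CODON_TABLE = {
    "CAG": "Gln",
    "CAA": "Gln",
    "CGG": "Arg",
    "AUG": "Met",
}


def edit_a_to_i(rna: str, position: int) -> str:
    """Return a copy of `rna` with the A at 0-based `position` converted to I."""
    if rna[position] != "A":
        raise ValueError(f"expected A at position {position}, found {rna[position]}")
    return rna[:position] + "I" + rna[position + 1:]


def decode_inosine_as_g(rna: str) -> str:
    """Model the expression machinery, which reads inosine (I) as guanosine (G)."""
    return rna.replace("I", "G")


def translate_codon(codon: str) -> str:
    """Translate one codon, reading any inosine as guanosine first."""
    return PARTIAL_CODON_TABLE[decode_inosine_as_g(codon)]


# Editing the middle A of a CAG codon gives CIG, which is read as CGG.
unedited_codon = "CAG"
edited_codon = edit_a_to_i(unedited_codon, 1)
```

Here `translate_codon(unedited_codon)` yields glutamine and `translate_codon(edited_codon)` yields arginine, which is the kind of single-residue recoding described for the Q/R site.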
ADARs do not require co-factors to edit RNA. A "cofactor" functionality is embedded within each ADAR by virtue of the dsRNA binding subdomain that they all contain. This subdomain is necessary and sufficient to target A-to-I editing within dsRNA, as for example the hairpins generated at intron-exon junctions, or the longer dsRNA regions generated by the folding of inverted Alu repeats. The modular aspect of ADAR structure - effectively, a deaminase domain tethered to a dsRNA binding subdomain - has found use in current synthetic biology approaches that utilize the catalytic subdomain in conjunction with other RNA binding domains, like Cas13, together with guide RNAs to target editing to specific loci at will [73,74], or with DNA binding domains, like Cas9 and guide RNA, to target ADAR to ssDNA through RNA:DNA hybrid formation. Indeed, recent work suggests that ADAR might also have a proper DNA mutator function within RNA:DNA hybrids, for instance, R-loops. All in all, the scientific challenge now is to define the biological context where the dual roles of editors and mutators are understood and biological consequences are delineated.
The ability of AID/APOBECs and ADARs to target both DNA and RNA in their natural setting can also be exploited by tumors both in gain of function and in loss of function contexts. For example, most tumors overexpress specific AID/APOBECs and these have been correlated with specific gene level mutational spectra in many cancers [31,61,75,76]. In particular, APOBEC3s have been correlated with the presence of tight C-to-T clusters in cancer genomes, which have been termed ‘kataegis’ mutations [75,77]. Kataegis mutations were originally described in genetic analyses from 21 breast cancer genomes  and now form a specific signature that can be predictive of disease outcomes . Intriguingly however, loss of APOBEC3 function can also lead to tumorigenesis, through a loss of the cells' ability to deaminate viral genomes or restrict retrotransposon intermediates leading to virally-driven transformation and cancer development .
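To make the idea of "tight C-to-T clusters" concrete, the sketch below groups mutation positions on a single chromosome into runs of closely spaced events. This is only an illustration: the function name is hypothetical, and the thresholds `max_gap` and `min_size` are assumptions chosen for the example, not values taken from the studies cited above.

```python
from typing import List, Tuple


def find_mutation_clusters(
    positions: List[int],
    max_gap: int = 1000,   # assumed maximum distance (bp) between neighbours
    min_size: int = 6,     # assumed minimum number of mutations per cluster
) -> List[Tuple[int, int, int]]:
    """Group mutation positions from one chromosome into tight clusters.

    Consecutive mutations at most `max_gap` bp apart are placed in the same
    run; runs with at least `min_size` mutations are reported as
    (start, end, mutation_count) tuples.
    """
    clusters: List[Tuple[int, int, int]] = []
    run: List[int] = []
    for pos in sorted(positions):
        # A large gap closes the current run before a new one starts.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_size:
                clusters.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    # Report the final run if it is large enough.
    if len(run) >= min_size:
        clusters.append((run[0], run[-1], len(run)))
    return clusters


# Hypothetical C-to-T mutation coordinates on one chromosome.
example_positions = [100, 350, 600, 900, 1200, 1500, 50000, 120000]
example_clusters = find_mutation_clusters(example_positions)
```

With these made-up coordinates, the first six mutations form one cluster and the two isolated mutations are ignored. A real analysis would also need to filter by mutation type and strand, which this sketch leaves out.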
Additionally, AID overexpression has been associated with a variety of lymphomas. As discussed, AID-catalyzed deamination of the Ig locus can lead to class switch recombination (when it is targeted to switch regions), to somatic hypermutation (when it is targeted to the rearranged variable region) and finally also to translocations (often between the Ig loci and oncogenes like myc) [46,51,75]. The presence of such translocations in particular, due to AID overexpression, has been demonstrated to play a prominent role in multiple myeloma and several lymphomas: transgenic mice that overexpress AID developed T-cell lymphomas with AID mutational patterns, affecting the T-cell Receptor (TCR) gene, as well as the c-myc oncogene. AID overexpression was also found to increase the aggressiveness of DLBCL in mouse models and has been correlated with a high mutation load (C to G conversion), epigenetic heterogeneity and aggressiveness of human DLBCLs [82,83,84]. Interestingly, however, loss of AID has also been associated with lymphomagenesis: hyper IgM syndrome patients show evidence of lymphoproliferative disease associated with frank lymphoma formation in some cases [54,85], though the mechanistic basis of this remains unclear.
RNA editing also has a role in cancer pathogenesis although this has been a bit more controversial. Early experiments in mouse models demonstrated an association between ectopic APOBEC1 overexpression (e.g. in liver) with the development of hepatocellular carcinoma . Similarly, early work with human cell lines showed that APOBEC1 could edit coding regions of the NF1 gene and promote tumor formation . More recently, several experiments have revealed a robust association between loss of APOBEC1 in certain tumor models (e.g. ApcMin mice  or models of testicular germ cell tumors ) and reduced tumor burden . Mechanistically, we understand this to be due to the sequence heterogeneity imparted by editing at the level of mRNA, both in coding regions and untranslated regions . Of course, APOBEC1 can also act on DNA in the context of tumorigenesis : on one hand this requires caution in the interpretation of the cancer data (is oncogenesis mediated through RNA editing or DNA mutation?). But on the other hand, this linkage suggests the potential for a precursor-product relationship whereby RNA editing, perhaps driven by inflammatory stimuli to suppress transformation in non-cancerous cells, diversifies instead the RNA leading to tumorigenesis. Noting that RNA editing is co-transcriptional  and that many RNA editing enzymes moonlight as DNA mutators, it is easy to imagine how an RNA editor like APOBEC1 might lose hold of its mRNA substrate in situ and mutate the DNA of the cognate locus, thus "fixing" an editing event into the genome (Figure 2).
Figure 2 Co-transcriptional RNA/DNA editing by APOBEC1. RNA editing enzymes edit RNA co-transcriptionally (here APOBEC1 and its cofactors is shown but we envision ADAR working similarly). At the same time, it is well known that editing enzymes can gain access to and mutate DNA (without a co-factor in the case of APOBEC1 or using an R-loop in the case of ADAR). Here we envision that an RNA editing enzyme loses its grip on RNA and then targets DNA in situ. Such a DNA molecule in the vicinity of the cognate transcript will be by necessity the gene that encodes the transcript. In this fashion, we hypothesize that the RNA and DNA deamination functions of these enzymes can be temporally linked.
There is also an emergent role of ADAR editing in cancer pathogenesis. Several investigators employing in silico approaches have reported that A-to-I editing levels are exceptionally high in several cancer types, with the most prevalent ones to be lung, head and neck, breast and thyroid cancers [92,93]. Indeed, the editing load in such cancers is estimated to be as high as the mutational load and can occur both in the coding regions and in the transcript UTRs. It is important to stress however that in contrast to mutational load, which can be described as the historical record of all mutations within the cell - some of which are driver mutations but most of which would qualify as "passengers" - the editing load represents an active mark that diversifies the cellular transcriptome only in the presence of the editing deaminase. Thus, whereas some editing events have been considered as "driver" events (e.g. AZIN1  and others ) they must be considered in toto in the context of oncogenesis. This is of immediate clinical relevance and several attempts are currently made to generate inhibitors of A-to-I deaminases as small molecules with anti-cancer activity [96,97].
Sequence heterogeneity can arise from a number of different mechanisms. Classically, it can arise simply by errors during DNA replication. It has been shown that in many types of cancer, DNA repair mechanisms are dysfunctional, contributing to cancer development in the early stages [99,100]. A very clear example of this is colorectal cancer (CRC), where mutations in the mismatch repair pathway can lead to the inability of the tumor to correct polymerase slippage errors over tandem repeats, causing what has been termed microsatellite instability or MSI. In CRC, MSI occurs in 15-20% of sporadic colorectal tumors and in more than 95% of CRC patients with Lynch syndrome, highlighting the role that a single type of repair mutation can play in cancer pathogenesis.
Sequence heterogeneity can also arise from the accumulation of point mutations. These are generally aggregated into an index called the Tumor Mutational Burden (TMB), which is simply a measurement of the total number of somatic mutations per tumor sample, often expressed as the number of mutations per megabase of DNA (mut/Mb). A high TMB results in heterogeneity and increased tumor evolution and adaptation, and consequently worse disease prognosis. These mutations were thought to be spontaneous, but there is now clear evidence that instead, most DNA mutations are enzymatically acquired through the action of the AID/APOBEC (and perhaps ADAR) deaminases. For example, there is now concrete evidence that APOBEC-introduced mutational burden enhances intratumor heterogeneity, which increases the risk of relapse, drug resistance and immune escape.
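As a small worked illustration of the mut/Mb unit, the sketch below divides a somatic mutation count by the number of sequenced megabases. The function name and the example numbers are hypothetical; in practice, which mutations are counted and which regions are considered covered varies between assays.

```python
def tumor_mutational_burden(num_somatic_mutations: int, covered_bases: int) -> float:
    """Return TMB in mutations per megabase (mut/Mb).

    `covered_bases` is the number of base pairs that were actually
    interrogated by sequencing (an assumption of this sketch).
    """
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    covered_megabases = covered_bases / 1_000_000
    return num_somatic_mutations / covered_megabases


# Hypothetical sample: 300 somatic mutations over 30 Mb of covered sequence.
example_tmb = tumor_mutational_burden(300, 30_000_000)
```

For this made-up sample the result is 10 mut/Mb. Expressing the burden per megabase lets samples sequenced over regions of different size be compared on the same scale.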
Finally, sequence heterogeneity can also be generated through abundant RNA modification (which, in the case of both ADARs and APOBECs, is robust and reproducible). Here we have discussed the notion that RNA editing (and the associated editing burden) is also important for tumor progression and maintenance. Some editing deaminases (e.g. ADAR1, APOBEC3s) are significantly overexpressed in cancer and target hundreds of transcripts, giving rise to editing signatures  that could be important predictors of disease progression. At the same time, whereas RNA modification would not immediately imply a potential "bystander" effect on DNA, we note the possibility that these enzymes have the ability to target both nucleic acids. We therefore hypothesize that editing based transcriptional heterogeneity is intimately linked with mutational signatures in cancer, a hypothesis that remains to be tested.
The authors would like to thank Riccardo Pecori and Taga Lerner for their fruitful feedback and comments on the suggested model of co-transcriptional RNA/DNA editing by APOBEC1. Moreover, the authors would like to acknowledge Mikaela Behm for her comments on the manuscript.
These authors contributed equally to this work.
The present work is supported by the European Research Council consolidator grant (LS6, ERC-2014-CoG) for the project “RNA editing in health and disease”.
The authors have declared that no competing interests exist.