7 Mutations and cancer

When normal cells in the body (somatic cells) are changed into cancer cells they are said to have been ‘transformed.’ This transformation results in cells that, for example, proliferate inappropriately and in an unregulated fashion. The risk of this occurring is linked to a number of extrinsic (lifestyle choices) or intrinsic (genetic predisposition) risk factors that are associated with an increased risk of mutations.

The discovery of oncogenes and tumour suppressor genes demonstrates that subtle changes to normal cellular genes can turn a gene into an oncogene or damage the function of a tumour suppressor gene, and as such contribute to the development of cancer (carcinogenesis).

What are the underlying mechanisms which induce these changes and how do they change DNA?

7.1 Mutations as the source of genetic variation

Mutations are defined as changes to the genomic nucleic acid sequence of an organism; for mammalian cells this is their DNA, a double-stranded helix arranged into chromosomes within the nucleus. As the molecule which stores all the genetic information, DNA has a special role in the cell. It is essential that it is faithfully replicated so that an accurate copy of this information is passed on to each of the cells resulting from either mitosis (somatic cells divide, producing two diploid daughter cells, in order to proliferate) or meiosis (germ line cells generate four haploid daughter cells for sexual reproduction). The expression daughter cells thus describes all cells resulting from such divisions (not cells of an offspring).

As a result, any information encoded in this DNA is passed on to each of these daughter cells as well as to any future generations of cells descended from them. The DNA information, as well as any changes to it, is therefore heritable, and all cells that share the same genetic information are considered a clone.

Mutations are the source of genetic variation that can be passed on to daughter cells - without mutations, faithful replication would leave the genetic code of cells unchanged. Genetic variation, when it affects germ line cells (and is passed on to future generations), is an essential driver of evolution; the introduction of genetic variation in somatic cells also makes a physiological contribution to the development of e.g. the immune system by contributing to the diversity of lymphocytes (V(D)J recombination). Genetic variation that leads to the transformation of somatic cells kick-starts carcinogenesis and (as discussed below) sustains cancer progression.

7.2 DNA mutations

7.2.1 Underlying mechanisms

Spontaneous mutations. Changes to the DNA sequence occur spontaneously at a baseline level due to a number of potential mechanisms. For example, spontaneous mutations can be induced by some of the physiological chemical reactions occurring in cells; by DNA replication in preparation for cell division, which carries some chance of errors that change the nucleic acid sequence; and by errors introduced through the DNA repair mechanisms which constantly try to rectify DNA damage (see also BRCA1). Furthermore, errors can be introduced during the separation of chromosomes in M-phase.

Induced mutations. In addition, a number of external factors can induce mutations. An increased cancer risk has been linked to exposure to some of these agents (UV, tobacco smoke), as discussed earlier under risk factors. A large number of chemicals have the ability to induce chemical changes to the DNA that directly or indirectly lead to sequence changes/mutations. In fact, some of the chemotherapeutic drugs used to treat cancer, e.g. alkylating agents, can induce mutations and can increase the risk of developing cancer after treatment. Another important source of mutations is radiation, either ionising radiation or high-energy UV light.

7.2.2 Effects on nucleic acid structure

Spontaneous or induced mutations can affect the DNA sequence and structure at different levels. In principle, these changes are random and can lead to changes to cellular proteins (amount, ratio, structure).

Nucleotide level. The smallest changes are point mutations that affect only individual nucleotide residues: one residue can be substituted for another (nucleotide transitions, e.g. A⟷G or C⟷T, or transversions, in which a purine is exchanged for a pyrimidine or vice versa), or a single residue can be inserted or deleted.
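The distinction between the two kinds of substitution can be made concrete with a minimal sketch. It assumes the usual grouping of the DNA bases into purines (A, G) and pyrimidines (C, T): a transition stays within one group, a transversion crosses between them. The function name `classify_substitution` is purely illustrative.

```python
# Illustrative sketch: classify a single-base substitution.
# Assumption: DNA bases only (A, C, G, T); names are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition', 'transversion' or 'none' for a base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single DNA bases A, C, G or T")
    if ref == alt:
        return "none"  # no substitution took place
    # A transition keeps the chemical class (purine or pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, "->", alt, ":", classify_substitution(ref, alt))
```

Insertions and deletions are deliberately left out of this sketch, since they change the length of the sequence rather than swapping one base for another.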

Gene level. Mutations such as insertions, deletions or transpositions can also affect a whole sequence of nucleotides. When these mutated DNA sequences are big enough to include genes they can lead to multiple copies (e.g. amplification of EGFR, HER2) or deletion of genes. Changes to the surrounding DNA can affect gene regulation or join parts of different genes (e.g. a gene fusion leading to a fusion protein such as bcr-abl).

Chromosome level. Finally, mutations can also affect multiple genes and whole chromosomes. One of the first suggestions that cancer is linked to ‘heritable elements’ came from Hansemann and Boveri, who observed irregularities in chromatin loops (i.e., chromosomes). In fact, aneuploidy, the loss, duplication, or breaking of chromosomes, is commonly observed in all forms of cancer.

7.2.3 Effect on protein structure and function

Ultimately, DNA molecules shape the structure and function of cells after transcription into RNA and translation into proteins. The effect of any type of mutation will depend on what region of the DNA is affected, e.g. whether the change involves a coding, regulatory (e.g. promoter), or non-coding part of the DNA and whether it will be transcribed and preserved through RNA splicing. Furthermore, at the nucleotide level the genetic 3-letter code has some redundancy (‘degenerate code’) so that a mutation changing one nucleotide does not necessarily lead to a functional change in the translated protein. Therefore, changes to individual nucleotides in the gene sequence can have varying effects on the amino acid sequence, i.e. silent (no change), nonsense (early stop codon), or missense (changed amino acid).
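The three outcomes named above can be illustrated with a short sketch that compares a reference codon with a mutated codon using the standard genetic code. The compact table construction and the function name `classify_codon_change` are illustrative choices, not part of the original text; the extra label `stop-lost` is added here only so that a mutated stop codon is not mislabelled as missense.

```python
# Illustrative sketch: classify a single-codon change as silent,
# nonsense or missense using the standard genetic code (DNA alphabet).
from itertools import product

BASES = "TCAG"
# Amino acids in the order TTT, TTC, TTA, TTG, TCT, ... GGG ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Compare the amino acids encoded by two codons."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"      # degenerate code: same amino acid (or stop)
    if alt_aa == "*":
        return "nonsense"    # premature stop codon
    if ref_aa == "*":
        return "stop-lost"   # category not discussed in the text
    return "missense"        # a different amino acid is encoded


if __name__ == "__main__":
    print(classify_codon_change("GAA", "GAG"))  # both encode glutamate
    print(classify_codon_change("GAG", "GTG"))  # glutamate to valine
    print(classify_codon_change("TGG", "TGA"))  # tryptophan to stop
```

The sketch only handles in-frame substitutions within one codon; insertions or deletions that shift the reading frame would need the whole downstream sequence to be re-translated.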

Not all mutations which result in changes to the amino acid sequence of a protein will necessarily be accompanied by significant changes to the protein function or to the phenotype (the externally observable characteristics). Changes that do alter function are known as gain-of-function (activating) or loss-of-function (inactivating) changes. When they lead to changes in the function of physiological cellular genes that play a potential role in cancer development and progression, they can transform normal cells into cancerous cells. A gene that has the potential to accelerate cancer, e.g. by driving proliferation, becomes an oncogene when mutated in this way. Similarly, cellular genes that can put the brakes on cancer are known as tumour suppressor genes.

7.3 Epigenetics and epimutations

The link between our heritable traits and genes and their encoding in the cell’s DNA has become fundamental to our view of biology. As discussed above, for cancer the role of genes and their mutation has been clearly identified and universally confirmed. However, in recent years it has become apparent that cells have additional ways in which they can pass information to daughter cells which is maintained after cell division and does not involve a change of the nucleic acid sequence.

Consider the similarity of genes and DNA between humans and their closest relatives, e.g. chimpanzees: each has around 25,000 genes, with a homology of almost 99%.19 Given this difference of only about 1% in genes, one might expect the species to be much more similar than they actually are. This suggests that while human and chimpanzee DNA are largely identical, they may be used in different ways. Observations like this prompt the question of what other factors are at play that influence the link between the genome (all the DNA) and the proteome (all the proteins expressed).

Similarly, considering that the different cell types in the body (e.g. nerve cells vs hair follicle cells vs bone cells) are derived from the same fertilised egg and have identical DNA, it appears likely that additional mechanisms are at play that determine which specific DNA program is active in a given cell.

If one considers the DNA as the blueprint that defines the specific building blocks from which cells are made, there are obviously other factors which determine what parts of the DNA are used, and when they are used. In terms of regulating gene expression, the control of transcription (i.e., promoters, transcription factors, …) and RNA-level factors such as RNA stability were seen as the master switches, ultimately encoded as genetic information. Until recently, our understanding was also that heritable changes can only be passed on in the form of changes to the DNA sequence.

But it is now clear that other processes exist that can lead to heritable changes that are passed on to daughter cells and which do not involve changes to the DNA sequence. These mechanisms/processes are known as ‘epigenetic.’ To date two key mechanisms have been identified, both of which regulate long-term gene expression using defined covalent modifications.

7.3.1 Mechanisms of epigenetics

Epigenetic changes are changes in cellular information other than the DNA sequence that can be passed on to daughter cells or offspring (‘heritable’). They are involved in controlling gene expression, e.g. during embryogenesis and cell differentiation.

There are two fundamental epigenetic mechanisms at play that determine what gene expression program is active - one acts by covalently ‘tagging’ specific DNA sections, the other by modifying how the DNA is stored on the nucleosomes; nucleosomes act like spools onto which the DNA is wound and which, like a string of pearls, come together to form the chromosomes. A common principle of these mechanisms, DNA methylation and histone code modifications, is that they change the accessibility of the DNA to downstream proteins and processes.

DNA Methylation

The modification of DNA chemistry occurs by covalent modification (methylation) of DNA regions, in particular in CpG islands; these are frequently associated with gene promoter regions, and methylation reduces access of transcription factors and thus gene expression. Here DNA is covalently changed by methylation of cytosine residues to 5-methylcytosine by DNA methyltransferase enzymes. Typically around 75% of CpG sites are methylated in mammalian DNA, allowing the body to distinguish it from the DNA of pathogenic organisms (e.g. bacteria).

This is specifically relevant in so-called CpG islands, i.e. regions of DNA with an increased abundance of CpGs (CpG = cytosine-phosphate-guanine, i.e. guanine follows cytosine in the DNA sequence). CpG islands are associated with the promoter regions of around 70% of genes. Methylation of promoters reduces the ability of transcription factors to bind to the promoter regions, effectively leading to gene silencing. So the direct chemical modification of DNA (without changes to the DNA sequence) can effectively be used by the cell to ‘switch off’ particular gene expression programs. Inappropriate methylation of such promoters can lead to reduced (hyper-methylation) or increased (hypo-methylation) gene expression and in cancer can lead to the inactivation of tumour suppressor genes (e.g. MLH1, mismatch repair) or the activation of oncogenes.
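How a region with ‘an increased abundance of CpGs’ might be recognised in a sequence can be sketched as follows. The sketch computes the GC content and the ratio of observed to expected CpG dinucleotides, and applies commonly cited threshold values (length of at least 200 bp, GC content above 50%, observed/expected ratio above 0.6). These thresholds and the function names are assumptions introduced for illustration and do not come from the text above; real CpG island finders scan long sequences with sliding windows and differ in detail.

```python
# Illustrative sketch: simple CpG statistics for a DNA sequence.
# Thresholds are commonly cited values, used here as assumptions.


def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and observed/expected CpG ratio."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # C immediately followed by G
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently.
    expected = c_count * g_count / n if n else 0.0
    ratio = cpg_count / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp_ratio": ratio}


def looks_like_cpg_island(seq: str, min_length: int = 200,
                          min_gc: float = 0.5,
                          min_ratio: float = 0.6) -> bool:
    """Apply the assumed thresholds to a candidate region."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_length
            and stats["gc_fraction"] > min_gc
            and stats["obs_exp_ratio"] > min_ratio)


if __name__ == "__main__":
    cpg_rich = "CG" * 100   # artificial, extremely CpG-rich region
    at_rich = "AT" * 100    # artificial region without any CpG
    print(cpg_stats(cpg_rich), looks_like_cpg_island(cpg_rich))
    print(cpg_stats(at_rich), looks_like_cpg_island(at_rich))
```

Note that the sketch only looks at sequence composition; whether the cytosines in such a region are actually methylated cannot be read from the DNA sequence alone.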

Figure 7.1: DNA is stored in chromosomes by wrapping it onto nucleosomes which consist of histone proteins. A family of enzymes can attach or take away a series of covalent tags from the histones; these tags determine whether this section of DNA is expressed. Similarly, other enzymes can reversibly put a methyl tag onto the DNA itself. When the promoter regions of a gene are tagged in this way the expression of the gene is switched off.

Histone code

This epigenetic mechanism modulates the way DNA is stored and accessed in the nucleus. Given the length of the DNA (the roughly 3 billion base pairs of the human genome, present in two copies in a diploid cell, would stretch to around 2 m) and the typical size of a nucleus (6 µm), it is essential that DNA is packed in a safe and efficient fashion. Within the nucleus the DNA is organised into chromosomes in which the DNA is packed by being tightly wrapped around nucleosomes. The nucleosomes are protein complexes made from eight histone protein subunits. Nucleosomes act like spools, each carrying loops of the ‘thread’ of double-stranded DNA, and are spaced along the DNA at regular intervals similar to the pearls in a necklace. Thus the nucleosomes allow the full length of chromosomal DNA to fit into the cell’s nucleus, which is only a few micrometres in size.
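The scale of this packing problem can be checked with a back-of-the-envelope calculation. The sketch below assumes a rise of about 0.34 nm per base pair (the textbook value for B-form DNA), two genome copies per diploid cell, and the genome size and nucleus diameter quoted above; the constant and function names are illustrative.

```python
# Illustrative sketch: how long is the DNA compared with the nucleus?
# Assumptions: 0.34 nm rise per base pair (B-form DNA), diploid cell.
BASE_PAIRS_PER_GENOME_COPY = 3e9   # from the text
RISE_PER_BASE_PAIR_NM = 0.34       # assumed textbook value
GENOME_COPIES_PER_CELL = 2         # diploid somatic cell
NUCLEUS_DIAMETER_UM = 6.0          # from the text


def dna_length_m(base_pairs: float,
                 rise_nm: float = RISE_PER_BASE_PAIR_NM) -> float:
    """Length in metres of fully stretched double-stranded DNA."""
    return base_pairs * rise_nm * 1e-9


def compaction_ratio(length_m: float, nucleus_diameter_um: float) -> float:
    """How many nucleus diameters the stretched DNA would span."""
    return length_m / (nucleus_diameter_um * 1e-6)


if __name__ == "__main__":
    per_copy = dna_length_m(BASE_PAIRS_PER_GENOME_COPY)
    per_cell = per_copy * GENOME_COPIES_PER_CELL
    print(f"one genome copy: about {per_copy:.2f} m")
    print(f"diploid cell:    about {per_cell:.2f} m")
    ratio = compaction_ratio(per_cell, NUCLEUS_DIAMETER_UM)
    print(f"that is roughly {ratio:,.0f} nucleus diameters")
```

Under these assumptions each genome copy comes to roughly one metre, so the two copies together account for the figure of around 2 m, several hundred thousand times the diameter of the nucleus.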

In addition to allowing DNA to fit into the nucleus, nucleosomes and histones have been found to play an important role in the regulation and coordination of gene expression. In order for transcription of DNA to RNA and then translation into proteins to occur, the relevant parts of the DNA have to be accessible to transcription factors. This accessibility is regulated by covalent modification of specific residues of histones on individual nucleosomes (Figure 7.1), changing the way these residues bind the DNA. These residues can, for example, act like long arms which can hold the DNA tightly. Examples of such covalent modifications include methylation (the same chemical reaction as in DNA methylation, but here applied to the histones), acetylation, and phosphorylation of histone residues. The combination of these modifications at different locations on individual histones acts as a set of ‘tags’ that together provide a code that allows detailed regulation of gene expression. Specific sets of enzymes are involved in recognising, maintaining, erasing or writing these histone tags, for example histone methyltransferases, histone acetyltransferases, and histone deacetylases. These and other histone-modifying enzymes, as well as the DNA methyltransferases, are the target of recent drug development efforts.

7.3.2 Epigenetic Effects

Epigenetic mechanisms are critical to physiological processes that require long-term changes to the cell’s gene expression program, e.g. during embryogenesis epigenetic processes ensure that distinct cell populations develop from the initial omnipotent stem cells that then make up the different germ layers or, later on, the distinct tissue lineages. Within the body such processes are also continuously active during the cell’s terminal differentiation. For example, epigenetic changes ensure that cells are associated with fixed lineages and that differentiation follows specific pathways.

Environmental factors such as exposure to specific chemicals can change the balance of these epigenetic processes. Thus, exposure to environmental factors during early pregnancy could allow the developing embryo to adapt to the environment it will be born into. These epigenetic changes in the embryo may therefore affect the phenotype of the offspring in profound ways. For example, severe war-time famine in the so-called Dutch Hunger Winter at the end of World War II led to adverse metabolic profiles (suboptimal glucose handling, higher body mass index (BMI), elevated total and low-density lipoprotein (LDL) cholesterol) in the offspring of mothers who were pregnant during these events. It is becoming increasingly clear that these effects can also affect the next generation (grandchildren), e.g. affecting their likelihood of developing diabetes in later life (Wei, Schatten, and Sun 2015).

See also20

Mutations and epimutations lead to heritable changes that can lead to cell transformation and can be passed on to daughter cells.