CANCER AND GENETIC ROULETTE
Thus about 0.0003% of the bases comprising the human genome are damaged beyond repair in the course of a normal lifetime. But the really interesting question is: what happens in a cancer cell? It’s only in the last few years that we’ve been able to answer with anything better than a ‘guesstimate’ but now, thanks to the sequencing revolution, we know that in a typical human cancer cell the number of such mutations is ~10,000, although the range for different cancers is from 1,000 to 100,000. Note that these cancer mutations will be superimposed on the ‘normal’ mutational background. Recall that the human genome has three billion (3 × 10⁹) base pairs with about 22,000 genes, the regions encoding proteins taking up <2% of the total DNA. Although the vast majority of mutations occur in the regions of DNA between genes, there are likely to be ~100 mutations that alter amino acids in each cell, i.e. <1% of the ~22,000 genes have acquired a mutation. Non-coding mutations can, of course, alter the behaviour of cells, but for the moment let’s focus on the 100 coding mutations. Of the 100 mutated genes, only a small number actually drive the development of cancer. With some exceptions it takes a long time for a critical set of mutations to accumulate in a single cell, which is why most forms of cancer are diseases of old age. Despite the sequencing revolution, we still don’t know for any cancer precisely what the critical number of mutations is, but estimates range from five to fifteen distinct mutations (Fig. 1), although fewer may be required for some types, in particular leukaemias. Whatever the precise number, they make up a set of ‘driver’ mutations sufficient to override the normal controls of cell proliferation.
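The arithmetic behind these figures can be checked in a few lines. The sketch below is only a back-of-envelope calculation using the round numbers quoted above; the function name and its parameters are illustrative, and it rests on two simplifying assumptions that are not claims from the text: mutations are spread uniformly across the genome, and each coding mutation falls in a different gene.

```python
def mutation_budget(
    genome_bp=3_000_000_000,   # three billion base pairs
    coding_fraction=0.02,      # upper bound: coding regions are <2% of the DNA
    n_genes=22_000,            # approximate number of human genes
    total_mutations=10_000,    # typical cancer cell (range 1,000 to 100,000)
    coding_mutations=100,      # mutations that alter amino acids
    driver_range=(5, 15),      # estimated number of driver mutations
):
    """Back-of-envelope ratios from the round numbers in the text."""
    return {
        # Share of all bases that carry a mutation in a typical cancer cell.
        "mutated_fraction_of_genome": total_mutations / genome_bp,
        # ASSUMPTION: uniform spread; gives an upper bound on coding hits.
        "coding_mutations_if_uniform": total_mutations * coding_fraction,
        # ASSUMPTION: each coding mutation lands in a distinct gene.
        "fraction_of_genes_hit": coding_mutations / n_genes,
        # Drivers as a share of the coding mutations (low, high).
        "driver_share_of_coding": tuple(d / coding_mutations for d in driver_range),
    }


budget = mutation_budget()
for name, value in budget.items():
    print(f"{name}: {value}")
```

Under the uniform assumption the 2% upper bound allows for up to about 200 coding mutations, the same order of magnitude as the ~100 quoted, and 100 hits among ~22,000 genes is indeed well under 1%.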
Fig. 1. A mutational steeplechase leads to cancer.
Because the driver mutations accumulate in a single cell, the precursor cells are monoclonal (derived from one single cell) when the effects of those mutations emerge at the earliest stage of tumour development.
The five to fifteen distinct ‘driver’ mutations are thus a sub-group of the random mutations that become ‘fixed’ (i.e. remain) in the genome. The term ‘driver’ is used to distinguish mutated genes that specifically contribute to cancer development from the ‘passengers’ that don’t do much. A further distinction gives rise to the concept of ‘restricted cancer genes’ that are in effect ‘drivers’ for specific types of cancer, for example the translocations that are critical for some leukaemias (e.g. BCR-ABL1, see below) or the EWS-ETS fusion in Ewing’s sarcoma.
Whole genome sequencing has already facilitated comparison of DNA sequences from hundreds of tumours with those of normal tissue from the same individuals. This permits the sequence comparison of essentially all genes to identify mutations that characterise specific tumours or sub-sets thereof.
Showing genes with mutations as a ‘landscape’ vividly illustrates their distribution across the genome. Imagine a separate dot for every gene (~20,000) on the map in Fig. 2.
Fig. 2. Two-dimensional map of genes mutated in a group of colon cancers.
If there are 100 coding mutations, there should be 100 dots on the map. In Fig. 2, infrequent mutations (occurring in only one or two tumours) have been omitted. The rest fall into two groups: about 50 genes are mutated in about 5% of these tumours and these show up as ‘hills’ – i.e. they’re fairly common. A small number (five) have arisen in almost all of these tumours – these are ‘mountains’ – that is, cancer drivers. Four of them – APC, KRAS2, TP53 (aka P53) and PIK3CA (encoding the catalytic alpha subunit of phosphoinositide-3-kinase) – were anticipated as they were known to be frequently mutated in colon cancer. The fifth, FBXW7, encodes a protein that causes degradation of an important regulator of cell proliferation, cyclin E: its inactivation leads to genetic instability.
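The sorting of genes into ‘mountains’ and ‘hills’ can be sketched as a simple frequency count across tumours. This is a minimal illustration, not the method used to draw Fig. 2: the function name, the cut-offs (`min_tumours`, `mountain_fraction`) and the toy data with placeholder gene names are all assumptions made up for the example.

```python
from collections import Counter


def landscape(tumour_mutations, min_tumours=3, mountain_fraction=0.5):
    """Group genes by how often they are mutated across a set of tumours.

    tumour_mutations: one set of mutated gene names per tumour.
    min_tumours: genes mutated in fewer tumours than this are omitted
        (the text drops those seen in only one or two tumours).
    mountain_fraction: HYPOTHETICAL cut-off separating 'mountains'
        from 'hills'; the source gives no numerical threshold.
    """
    n_tumours = len(tumour_mutations)
    # Count each gene at most once per tumour.
    counts = Counter(gene for genes in tumour_mutations for gene in set(genes))
    groups = {"mountains": [], "hills": []}
    for gene, count in sorted(counts.items()):
        if count < min_tumours:
            continue  # infrequent mutation: left off the map
        if count / n_tumours >= mountain_fraction:
            groups["mountains"].append(gene)
        else:
            groups["hills"].append(gene)
    return groups


# Toy data only: 20 imaginary tumours and placeholder gene names.
tumours = [set() for _ in range(20)]
for i in range(18):
    tumours[i].add("GENE_M")   # mutated in most tumours
for i in range(3):
    tumours[i].add("GENE_H")   # mutated in a few tumours
tumours[0].add("GENE_R")       # mutated in a single tumour

print(landscape(tumours))
```

With real data the thresholds would need to be chosen with care, since the boundary between a ‘hill’ and a ‘mountain’ is a matter of judgement rather than a fixed rule.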
Having introduced the expression ‘cancer genes’ we should note that it is really jargon: strictly there are no such things but it’s a useful term if you define it to mean genes that, as a result of some change, have become abnormal in terms of the activity of the protein (or the RNA) they encode.