GENOMIC PARTITIONING AND CANCER
It is self-evident that a complete understanding of the molecular signature of cancer requires the complete genomic sequence. Desirable though that may be, the identification of over 2,000 chromosomal rearrangements in 24 breast cancer samples underlines the difficulty of delineating driver mutations from the large background of passengers. Furthermore, although massively parallel methods now permit complete sequencing of individual genomes at high speed, there is, of course, a cost and time component that prompts consideration of more selective approaches. The choices include (1) sequencing selected candidate genes or a specific region of the genome already implicated in a pathway (as used to reveal BRAF); (2) sequencing only protein-coding sequences, i.e. exons; and (3) sequencing common variants, i.e. SNPs. This strategy of ‘genomic partitioning’ therefore uses a variety of methods to focus on discrete regions of interest.
An example of the first of these selective approaches focused on the 86 members of the protein tyrosine kinase superfamily by first sequencing 593 exons of the tyrosine kinome in 29 melanoma samples. This yielded 30 somatic mutations in 19 genes that were then selected for sequencing in a total of 79 melanoma samples, yielding 99 non-synonymous somatic mutations, most of these not previously known to be associated with melanoma. It was found that 19% of the melanomas had mutations in the EGFR family member ERBB4 that increased the kinase activity and transforming capacity of the protein (Fig. 1).
1. Distribution of mutations in ERBB4.
Mutations in ERBB4 had not previously been identified and they offer new targets for therapeutic intervention.
Protein-coding sequences (the exome) make up only about 1% of the human genome, and the strategy of exome sequencing involves only ~5% of the sequencing necessary for the whole genome. The drawback is that it may miss up to 20% of exonic sequences, and each base must be sequenced at least eight times (‘depth of coverage’) to provide statistically reliable data on variants. Despite these problems, exome sequencing applied to small numbers of individuals has identified the causes of several monogenic diseases.
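The link between depth of coverage and reliable variant calling can be illustrated numerically. The sketch below assumes, as a simplification that the text does not make, that the number of reads covering any given base follows a Poisson distribution whose mean is the average depth. The function name and the threshold of four reads are illustrative choices only.

```python
import math


def fraction_covered(mean_depth: float, min_reads: int) -> float:
    """Probability that a base is covered by at least `min_reads` reads,
    assuming read depth per base is Poisson-distributed (an assumption)."""
    # Sum the Poisson probabilities of seeing 0 .. min_reads-1 reads.
    below = sum(
        math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - below


# Compare several average depths against an illustrative threshold of 4 reads.
for depth in (4, 8, 16, 30):
    print(depth, round(fraction_covered(depth, 4), 3))
```

Under this simple model the fraction of adequately covered bases rises steeply with average depth, which is one way of seeing why a minimum depth is specified. Real coverage is usually less even than a Poisson model predicts.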
As the melanoma example shows, even selective sequencing generates large amounts of information. Systems approaches are currently being developed to integrate ‘omic’ data to classify individual cancers more precisely and thus accelerate the implementation of effective therapies. One example of this approach focuses on the problem of identifying drivers in the enormous number of genomic aberrations present in individual tumours. The high frequency of copy number aberrations is a particular problem in this context because the phenotype is driven by the expression level of a gene, i.e. the amount of mRNA and protein made, which may not directly reflect its copy number. The application of an algorithm that integrates copy number and gene expression levels (CONEXIC) to a sample of melanomas has identified known drivers and has also revealed two novel genes necessary for melanoma growth.
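The general idea of combining copy number with expression data can be sketched as follows. This is not the CONEXIC algorithm, which is considerably more elaborate. It is a minimal illustration that ranks genes by how closely their expression tracks their copy number across tumours. The gene names and all values are invented for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_candidates(copy_number, expression):
    """Rank genes by the correlation between copy number and expression.

    Both arguments map a gene name to a list of per-tumour values,
    given in the same tumour order.
    """
    scores = {
        gene: pearson(copy_number[gene], expression[gene])
        for gene in copy_number
        if gene in expression
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy data for five tumours (hypothetical genes, arbitrary units).
copy_number = {"GENE_A": [2, 2, 4, 6, 8], "GENE_B": [2, 4, 6, 2, 8]}
expression = {
    "GENE_A": [1.0, 1.1, 2.0, 3.1, 3.9],  # expression rises with copy number
    "GENE_B": [1.0, 1.0, 1.1, 0.9, 1.0],  # expression barely changes
}

for gene, score in rank_candidates(copy_number, expression):
    print(gene, round(score, 2))
```

A gene whose amplification is not accompanied by any change in expression, like the hypothetical GENE_B here, is a weaker candidate driver than one whose expression follows its copy number.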
In a similar approach combining DNA sequencing, gene expression (mRNA) profiling and functional proteomics, the most probable drivers of breast and colorectal tumours have been resolved and are represented by the interconnecting pathways shown in Fig. 2.
2. Systems biology applied to genomic and proteomic data.
These studies revealed that breast and colorectal cancers typically have point mutations that alter about 80 genes and have major copy number changes (either deletion or at least 12 copies) in 17 genes. Predictably, many of the affected genes are those we have encountered as major oncogenes or tumour suppressors. A notable finding, consistent with some of the whole genome sequencing data summarised earlier, is that the majority can be grouped into functional pathways, for example, those controlling proliferation and survival, the cell cycle and cell adhesion (Fig. 3).
3. Major cancer signalling pathways showing proteins for which there are specific inhibitors.
Within one pathway, specific genes may be deleted or amplified, indicating that anomalous signalling can arise from aberrations in either positive or negative regulators. Thus, for example, point mutations and copy number changes occurred in pathways driven by EGFR and ERBB2 and involving PI3K in both types of tumour. Point mutations and homozygous deletions affecting P53, SMAD2, SMAD3 and PTEN characterised colon cancers, together with amplification of MYC and EGFR. Breast tumours showed homozygous deletions of INK4 genes (CDKN2A and CDKN2B) and amplification of cyclins D1, D3 and E3. Additional genes not previously known to undergo copy number change in these tumours were also resolved.
Although the ‘cancer landscape’ that emerges from such studies is complex, the prominent pathways represent major targets for therapeutic intervention. A combination of drugs capable of inhibiting all or most of these might produce effective remission, even allowing for the mutational fluidity of heterogeneous tumours.
About 10% to 15% of cancers are hereditary, depending on the type of cancer. Thus, for example, approximately 10% of breast cancers and ~35% of colorectal cancers arise from an inherited susceptibility. BRCA1 and BRCA2 are two familiar major susceptibility genes for breast cancer: inherited mutations in either confer a high risk of developing the disease. Variants in some other identified genes also increase the risk but, taken together, these known mutations account for less than 25% of the overall familial risk of breast cancer. High-risk germline mutations for colon cancer occur in APC and in mismatch repair genes (MLH1 and MSH2) but, as with breast cancer, the known loci are responsible for only a small proportion (<6%) of cases. These and similar observations have led to the conclusion that susceptibility to hereditary cancers comes mainly from the combined effect of many loci, with each individual variant conferring only a small increase in relative risk. A succession of GWA studies has addressed this problem by comparing the genomic sequences of several thousand individuals with the disease with those of a similar number of normal control individuals so that sufficient statistical power is generated to identify SNPs that predispose to specific cancers. These have typically used the Illumina HumanHap550 BeadChip Array (e.g. the Colorectal Tumour Gene Identification (Corgi) Consortium and the Breast Cancer Association Consortium). This approach has identified alleles that confer low-penetrance susceptibility to most major cancers (breast, lung, ovarian, pancreatic, prostate, testicular germ cell, thyroid, urinary bladder and colorectal cancers, glioma, follicular lymphoma, neuroblastoma, childhood ALL, CLL and melanoma) and to a variety of other diseases (Alzheimer’s disease, Crohn’s disease, amyotrophic lateral sclerosis, atrial fibrillation, bipolar disorder, coronary artery disease, rheumatoid arthritis, diabetes).
For breast cancer, GWA methods have identified SNPs (or at least haplotypes) in four genes (FGFR2, TNRC9, MAP3K1 and LSP1) that account for 3.6% of the familial risk in a first step to identifying the complete complement of low-penetrance variants that can contribute to breast cancer.
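The case-control comparison that underlies a GWA study can be illustrated for a single SNP. The sketch below computes an allelic odds ratio and a chi-square test (one degree of freedom) from a 2 x 2 table of allele counts. The function name is hypothetical and the counts are invented. Real studies apply further quality control and statistical corrections that are not shown here.

```python
import math


def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds ratio, chi-square statistic and p-value for a 2 x 2 allele table.

    Arguments are allele counts: risk and non-risk alleles in cases,
    then risk and non-risk alleles in controls.
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    odds_ratio = (a * d) / (b * c)
    n = a + b + c + d
    # Pearson chi-square for a 2 x 2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Invented counts: 2,000 cases and 2,000 controls, two alleles per person.
odds_ratio, chi2, p_value = allelic_association(1200, 2800, 1000, 3000)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi2:.1f}, p = {p_value:.1e}")
```

Because a genome-wide scan tests hundreds of thousands of SNPs, the significance threshold applied to each one has to be far stricter than the usual 0.05. This is one reason why several thousand cases and controls are needed to detect variants that confer only a small increase in risk.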
The identification of cancer-associated SNPs raises the problem of determining their function prior to developing therapeutic strategies. For FGFR2 the two SNPs in intron 2 mentioned earlier have been shown to alter the binding affinity of transcription factors (OCT1/RUNX2 and C/EBPβ). The result is an increase in FGFR2 expression, thereby increasing the risk of developing breast cancer. Despite this finding, determining the functional effect of SNPs, in general, is technically very challenging.
The facility with which individual genomes can now be analysed also permits the identification of mutations responsible for relatively rare diseases and the genomes of a number of individuals with a variety of genetic diseases have already been sequenced (familial hypercholesterolemia, familial neuropathy, etc.).
Whole genome sequencing of individual tumours has typically revealed thousands of somatic mutations, for example, more than 30,000 in a metastasis of malignant melanoma and, as noted earlier, in AML. These mutations affect, as expected, major cancer genes (i.e. ‘drivers’, e.g. MYC). The majority are clearly ‘passengers’ but, from analysis of such data, the following points have already emerged for a variety of cancers: (1) although the major mutations are in genes already known to be ‘drivers’, mutations are also being detected in genes not previously known to be cancer associated; (2) this type of screen has shown that what is often critical is aberrant signalling in a pathway, which can be caused by mutations in a number of different genes – without whole genome sequencing such ‘core signalling pathways’ would not be identified; and (3) tumours sub-classified on the basis of mutational signature may respond differentially to treatment. Even with currently available drugs, the latter is important because that information could spare patients from chemotherapy treatments that won’t work.
As this approach is extended to all the major cancers it should greatly increase the range of diagnostic and predictive biomarkers. We have already seen an example of genomics providing biomarkers through the application of gene expression profiling to breast cancer. However, WGS has advantages over expression profiling in that it is easier to obtain DNA than RNA and the complete sequence reveals all mutations. This includes amplifications, translocations, tandem duplications and copy number variations, the extent of which is only just beginning to be recognised.
High throughput sequencing has already made important contributions to cancer diagnosis and prognosis and, as part of that, has identified novel cancer genes. The capacity to identify rapidly not only the genes but the precise mutations that they carry is critical in the rational design of therapy regimes. This point is exemplified by the EGFR inhibitor erlotinib, which has been approved for the treatment of non-small cell lung carcinoma. However, the efficacy of the drug depends on the precise EGFR mutation. Patients with the exon 19 mutation have a significantly better response to this drug than those with other EGFR mutations. Thus an important outcome of WGS will be the acquisition of biomarkers predictive of response so that individual tumour genotypes can be matched with appropriately targeted agents to maximise patient benefit and minimise pointless treatment.
The identification of novel cancer genes will prompt efforts to design specific therapies for each that emerges. For the products of oncogenes – the majority of ‘cancer genes’ – that means inhibitors and indeed the vast majority of targeted chemotherapy focuses on knocking out oncoproteins. The re-activation of tumour suppressor genes is a much more intractable problem and, apart from the gene therapy referred to earlier, can only be approached if the gene product is part of a combination that shows synthetic lethality, such that inactivation of the other component causes cell death. An example of the potential of this approach is the anti-tumour activity of olaparib, an inhibitor of poly(ADP-ribose) polymerase (PARP), in preliminary trials on patients with BRCA1 or BRCA2 mutations.
We now briefly consider four examples to illustrate current strategies aimed at targeting key cancer drivers.
The identification of oncogenic BRAF as a major driver in melanoma has led to the introduction of several small molecule inhibitors that preferentially block ATP binding to the mutant form (BRAFV600E) of the protein. Although some of these have given remarkably high response rates in terms of initial tumour regression (Fig. 4), most patients develop resistance to the drugs within about a year.
4. PET imaging of melanoma.
An astonishing range of molecular mechanisms has been unveiled by which resistance to these agents can arise. They fall into two broad categories: those that re-activate ERK signalling and those that promote proliferation by ERK-independent means. Stimulation of ERK can occur through BRAF amplification (Corcoran et al.), or by mutation of NRAS, RAF1 or the downstream MEK1 (Montagut et al.; Emery et al.; Nazarian et al.; Wagle et al.). In addition, a truncated form of BRAFV600E (p61 BRAFV600E) with a deleted exon causes BRAF to form dimers: this has the effect of blocking inhibition (Poulikakos et al.). MAPK signalling is activated independently of RAF by elevated expression of a distinct member of the MAP3K family (Johannessen et al.) or by PDGFRβ upregulation (Nazarian et al.) or IGF1R/PI3K signalling (Villanueva et al.).
One implication of these effects is that although BRAF inhibitors normally block MAPKK activation, in NRAS mutant cells they can cause ERK phosphorylation. This is important because BRAF is mutated to a constitutively active kinase in ~50% of melanomas but in a further 20%, the driving mutation is in NRAS. RAF inhibitors induce BRAF/RAF1 binding to mutant RAS, thus activating MAPKK – the inhibited BRAF acting as a scaffold. Thus, anti-BRAF drugs can, if secondary RAS mutations are acquired, become tumour promoters. In other words, depending on the cellular context, RAF inhibitors can be very effective anti-tumour agents or they can actually enhance tumour growth (Fig. 5).
5. RAF signalling in normal and cancer cells.
These extraordinary effects occur because RAF proteins themselves form dimers as well as binding to RAS proteins and, in melanomas, resistance to inhibition of mutant BRAF develops because the cells can switch to either of the other isoforms (ARAF or RAF1) to activate the MAPK pathway.
This diverse panoply of evasion strategies is a striking demonstration of genomic flexibility and a stark warning of the difficulties that can confront drug therapies for cancer. For our other examples, we turn from the major melanoma oncoprotein to three drivers that play critical roles in many cancers, PI3K, MYC and p53.
Increased expression of MYC is a key ‘driver’ in most human cancers and for that reason, an inhibitor of MYC is an attractive therapeutic concept for cancer. However, because MYC is essential for normal cells to divide, the notion of blocking its function in a whole animal carries obvious risks. Despite that reservation, preliminary evidence from mouse models gives grounds for optimism. Omomyc is a small, synthetic protein that blocks the interaction of MYC with its partner MAX. MYC–MAX dimers are the functional form in which MYC acts as a transcription factor to drive cell proliferation. In a mouse model of lung cancer that closely resembles the human disease, Omomyc inhibition of MYC causes rapid regression of established lung tumours (Fig. 6).
6. Lung tumour regression induced by inhibition of MYC.
The same strategy of inhibiting the action of endogenous MYC also causes regression of pancreatic tumours, demonstrating that the phenomenon is not confined to one type of cancer. In the latter model, the specific expression in beta cells of simian virus 40 (SV40) large T and small t antigens ablates the action of p53 and the retinoblastoma protein and gives rise to highly angiogenic tumours. A prominent effect of MYC suppression is the rapid induction of endothelial cell death leading to a collapse of the tumour vasculature. This dramatic perturbation of the microenvironment precedes tumour regression and reflects the role of MYC as a master transcription factor, regulating, inter alia, a range of cytokine genes together with VEGF, the most potent pro-angiogenic agent.
Other small molecule inhibitors that target MYC transcription have been shown to have significant anti-tumour activity against mouse xenografts of several human tumours. The most encouraging and unexpected finding in these mouse models, however, is that blockade of MYC activity over extended periods has no significant, adverse effects on the mice. The strategy of inhibiting this master regulator, therefore, remains open as a viable therapy.
The tumour suppressor p53, together with INK4 and RB1, lies at the core of the protective anti-tumour network and when that system is overcome p53 activity is a frequent casualty. So there are two ways of looking at this: (1) most of us don’t die from cancer and most that do are pretty old, so this randomly evolved protection system works rather well; or (2) because our protection system is the product of unintelligent design, we ought to be able to do better. We have mentioned one approach to ‘p53 therapy’ that attempts to target selectively cells in which it has been lost. Perhaps an alternative is to consider re-programming the genome by, for example, boosting our INK4A or p53 stock? Studies of mice indicate that we should: transgenic mice engineered to express three normal copies of either the P53 or the Ink4a/Arf genes (rather than the usual two) are significantly protected from cancer (Fig. 7).
7. The protective effect of a third P53 allele in transgenic mice treated with a chemical carcinogen.
Given that p53 can drive senescence, you might wonder whether mice with an extra P53 gene age prematurely but, so long as the extra P53 is under normal control, they don’t and seem otherwise fine. What’s more, in mouse tumours generated by knocking out P53, the restoration of P53 expression can cause tumours to regress and revert to a senescent form. The tumour cells actually get broken down and so the tumours disappear. Thus modest increases in the activity of P53 or Ink4a/Arf confer a beneficial, cancer-resistant phenotype without affecting normal viability or ageing.