nature

ARTICLES

Vol 456 | 6 November 2008 | doi:10.1038/nature07485

DNA sequencing of a cytogenetically
normal acute myeloid leukaemia genome

Timothy J. Ley1,2,3,4*, Elaine R. Mardis2,3*, Li Ding2,3, Bob Fulton3, Michael D. McLellan3, Ken Chen3, David Dooling3,
Brian H. Dunford-Shore3, Sean McGrath3, Matthew Hickenbotham3, Lisa Cook3, Rachel Abbott3, David E. Larson3,
Dan C. Koboldt3, Craig Pohl3, Scott Smith3, Amy Hawkins3, Scott Abbott3, Devin Locke3, LaDeana W. Hillier3,8,
Tracie Miner3, Lucinda Fulton3, Vincent Magrini2,3, Todd Wylie3, Jarret Glasscock3, Joshua Conyers3,
Nathan Sander3, Xiaoqi Shi3, John R. Osborne3, Patrick Minx3, David Gordon8, Asif Chinwalla3, Yu Zhao1,
Rhonda E. Ries1, Jacqueline E. Payton5, Peter Westervelt1,4, Michael H. Tomasson1,4, Mark Watson3,4,5, Jack Baty6,
Jennifer Ivanovich4,7, Sharon Heath1,4, William D. Shannon1,4, Rakesh Nagarajan4,5, Matthew J. Walter1,4,
Daniel C. Link1,4, Timothy A. Graubert1,4, John F. DiPersio1,4 & Richard K. Wilson2,3,4

Acute myeloid leukaemia is a highly malignant haematopoietic tumour that affects about 13,000 adults in the United States
each year. The treatment of this disease has changed little in the past two decades, because most of the genetic events that
initiate the disease remain undiscovered. Whole-genome sequencing is now possible at a reasonable cost and timeframe to
use this approach for the unbiased discovery of tumour-specific somatic mutations that alter the protein-coding genes. Here
we present the results obtained from sequencing a typical acute myeloid leukaemia genome, and its matched normal
counterpart obtained from the same patient’s skin. We discovered ten genes with acquired mutations; two were previously
described mutations that are thought to contribute to tumour progression, and eight were new mutations present in virtually
all tumour cells at presentation and relapse, the function of which is not yet known. Our study establishes whole-genome
sequencing as an unbiased method for discovering cancer-initiating mutations in previously unidentified genes that may
respond to targeted therapies.

We used massively parallel sequencing technology to sequence the
genomic DNA of tumour and normal skin cells obtained from a patient
with a typical presentation of French–American–British (FAB) subtype
M1 acute myeloid leukaemia (AML) with normal cytogenetics. For the
tumour genome, 32.7-fold ‘haploid’ coverage (98 billion bases) was
obtained, and 13.9-fold coverage (41.8 billion bases) was obtained
for the normal skin sample. Of the 2,647,695 well-supported single
nucleotide variants (SNVs) found in the tumour genome, 2,584,418
(97.6%) were also detected in the patient’s skin genome, limiting the
number of variants that required further study. For the purposes of this
initial study, we restricted our downstream analysis to the coding
sequences of annotated genes: we found only eight heterozygous,
non-synonymous somatic SNVs in the entire genome. All were new,
including mutations in protocadherin/cadherin family members
(CDH24 and PCLKC (also known as PCDH24)), G-protein-coupled
receptors (GPR123 and EBI2 (also known as GPR183)), a protein
phosphatase (PTPRT), a potential guanine nucleotide exchange factor
(KNDC1), a peptide/drug transporter (SLC15A1) and a glutamate
receptor gene (GRINL1B). We also detected previously described,
recurrent somatic insertions in the FLT3 and NPM1 genes. On the
basis of deep readcount data, we determined that all of these mutations
(except FLT3) were present in nearly all tumour cells at presentation
and again at relapse 11 months later, suggesting that the patient had a
single dominant clone containing all of the mutations. These results
demonstrate the power of whole-genome sequencing to discover new
cancer-associated mutations.

AML refers to a group of clonal haematopoietic malignancies that
predominantly affect middle-aged and elderly adults. An estimated
13,000 people will develop AML in the United States in 2008, and
8,800 will die from it1. Although the life expectancy from this disease
has increased slowly over the past decade, the improvement is
predominantly because of improvements in supportive care, not in the
drugs or approaches used to treat patients.
For most patients with a ‘sporadic’ presentation of AML, it is not yet
clear whether inherited susceptibility alleles have a role in the
pathogenesis2. Furthermore, the nature of the initiating or progression
mutations is for the most part unknown3. Recent attempts to identify
additional progression mutations by extensively re-sequencing
tyrosine kinase genes yielded very few previously unidentified mutations,
and most were not recurrent4,5. Expression profiling studies have
yielded signatures that correlate with specific cytogenetic subtypes of
AML, but have not yet suggested new initiating mutations6–8. Recent
studies using array-based comparative genomic hybridization and/or
single nucleotide polymorphism (SNP) arrays, although identifying
important gene mutations in acute lymphoblastic leukaemia9,10, have
revealed very few recurrent submicroscopic somatic copy number
variants in AML (M.J.W., manuscript in preparation, and refs 11–13).
Together, these studies suggest that we have not yet discovered
most of the relevant mutations that contribute to the pathogenesis of
AML. We therefore believe that unbiased whole-genome sequencing
will be required to identify most of these mutations. Until recently, this
approach has not been feasible because of the high cost of conventional

1Department of Medicine, 2Department of Genetics, 3The Genome Center at Washington University, 4Siteman Cancer Center, 5Department of Pathology and Immunology, 6Division
of Biostatistics, and 7Department of Surgery, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 8Department of Genome Sciences, University of Washington,
Seattle, Washington 98195, USA.
*These authors contributed equally to this work.

66

©2008 Macmillan Publishers Limited. All rights reserved

Foresight EX1012
Foresight v Personalis
IPR2024-00170

capillary-based approaches and the large numbers of primary tumour
cells required to yield the necessary genomic DNA. ‘Next-generation’
sequencing approaches, however, have changed this landscape.
Our group has pioneered the use of whole-genome re-sequencing
and variant discovery approaches using the Illumina/Solexa
technology with the genome of the nematode worm Caenorhabditis elegans as
a proof-of-principle14. This approach has distinct advantages in
reduced cost, a markedly increased data production rate, and a low
input requirement of DNA for library construction. In the present
study, we used a similar approach to sequence the tumour genome
of a single AML patient and the matched normal genome (derived
from a skin biopsy) of the same patient. After alignment to the human
reference genome, sequence variants were discovered in the tumour
genome and compared to the patient’s normal sequence, to the dbSNP
database, and to variants recently reported for two other human
genomes15,16, revealing new single nucleotide and small insertion/deletion
(indel) variants genome-wide. Somatic mutations were detected in
genes not previously implicated in AML pathogenesis, demonstrating
the need for unbiased whole-genome approaches to discover all
mutations associated with cancer pathogenesis.

Rationale for using the FAB M1 AML subtype for sequencing
Of the eight FAB subtypes of AML, M1 AML is one of the most
common (~20% of all cases). No specific cytogenetic abnormalities
or somatic initiating mutations have been identified for this subtype;
in fact, about half of the patients with de novo M1 AML have normal
cytogenetics17–19. The frequency of well-described progression
mutations (for example, activating alleles of FLT3, KIT and RAS) is similar
to that of other common FAB subtypes5. We therefore decided to
sequence the genome of tumour cells derived from a patient with M1
AML, because so little is known about the molecular pathogenesis of
this common subtype. The criteria used to select the sample are
outlined in Supplementary Information.

Case presentation of UPN 933124
The case presentation is described in detail in the Supplementary
Information. In brief, a previously healthy woman in her mid-50s
presented suddenly with fatigue and easy bruisability, and was found
to have a peripheral white blood cell count of 105,000 cells per
microlitre, with 85% myeloblasts. A bone marrow examination revealed
100% myeloblasts with morphological features and cell surface
markers consistent with FAB M1 AML (Supplementary Fig. 1).
Cytogenetic analysis of tumour cells revealed a normal 46,XX
karyotype. Although the patient experienced a complete remission with
conventional therapies, she relapsed at 11 months and expired
24 months after her initial diagnosis was made. At relapse, the bone
marrow had 78% myeloblasts, and contained a new clonal
cytogenetic abnormality, t(10;12)(p12;p13). Informed consent for
whole-genome sequencing was subsequently obtained from her next of kin.

A typical M1 AML diploid genome and expression profile
The tumour sample from patient 933124 contained no somatic copy
number changes at a resolution of ~5 kb (further confirmed on the
NimbleGen 2.1M array platform, data not shown), and no evidence
of copy number neutral loss-of-heterozygosity (LOH), indicating
that the genome was essentially diploid at this level of resolution
(see Supplementary Fig. 2). Further analysis of the 933124-derived
tumour and skin samples showed 26 inherited copy number variants
(that is, detected in both the tumour and skin samples). All but two of
these had been previously reported in the Database of Genomic
Variants (see Supplementary Table 1). All of the copy number
variants detected in this genome were found in at least one other AML
patient (89 other cases, mostly Caucasian, have been queried using
the same SNP array platform), and all but one were found in at least
one of the 160 Caucasian HapMap and Coriell samples that were
studied on the same array platform (Supplementary Table 1).

To determine whether the tumour cells of 933124 were typical of
M1 AML, we compared the expression signatures of 111 de novo AML
cases using unsupervised clustering (Ward’s method, see
Supplementary Information). The expression profile of patient 933124
clustered with multiple other M1 (and M2) AML cases with normal
cytogenetics, suggesting that the genetic events underlying the
pathogenesis of this case are similar to those of other cases exhibiting normal
cytogenetics (Supplementary Fig. 3).

Coverage depth of the tumour and skin genomes
Because most of the acquired mutations in cancer genomes have been
shown to be heterozygous, the complete sequencing of a cancer
genome requires the detection of both alleles at most positions in the
genome20. We therefore designed sequence coverage metrics to define
the point at which 90% diploid coverage had been reached. To
minimize errors associated with any single platform or measurement,
diploid coverage for this genome was assessed using a set of
high-quality SNPs derived from two different SNP array platforms,
Affymetrix 6.0 and Illumina Infinium 550K. For a SNP to be included
in the high-quality set, the following criteria had to be satisfied: (1)
identical genotypes were called from both assays at the same genomic
positions, and (2) the resulting genotype was heterozygous. For the
933124 tumour genome, 46,494 heterozygous SNPs passed the above
criteria and were defined as high-quality SNPs. For the skin samples,
46,572 high-quality SNPs were defined.
We performed 98 full runs on the Illumina Genome Analyser to
achieve the targeted level of 90% diploid coverage as determined by
coverage of the high-quality SNP set. Maq21 was used to perform
alignment, determine consensus, and identify SNVs within the 98
billion bases generated from the tumour genome (see Table 1). Maq
predicted a total of 3.81 million SNVs (Maq SNP quality ≥ 15) in the
tumour genome, including matching heterozygous genotypes for
91.2% of the 46,494 high-quality SNPs. When we lowered the Maq
SNP quality cutoff to 0, 94.06% high-quality SNPs were predicted.
Further investigation of Maq alignments revealed coverage for both
alleles at a further 5.38% of the high-quality SNPs, but Maq did not
predict a SNP or matching heterozygous genotype owing to
insufficient depth or quality of coverage. Extra analysis revealed coverage
at 46,484 of 46,494 high-quality SNPs for at least one allele (that is,
99.98% haploid coverage for the tumour genome).
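The coverage percentages quoted in this section follow directly from the reported counts. As a minimal sketch, the arithmetic can be reproduced as below; the helper names and the round 3.0 Gb haploid genome size are assumptions made for illustration and are not stated in the text.

```python
# Sketch: reproduce the coverage figures from the reported counts.
# HAPLOID_GENOME_BP is an assumed round figure, not a value from the paper.
HAPLOID_GENOME_BP = 3.0e9


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def fold_coverage(bases: int, genome_bp: float = HAPLOID_GENOME_BP) -> float:
    """Return the average haploid fold coverage for a number of sequenced bases."""
    return bases / genome_bp


hq_snps_tumour = 46_494        # high-quality heterozygous SNPs (tumour)
both_alleles_tumour = 42_415   # HQ SNPs where both alleles were detected
any_allele_tumour = 46_484     # HQ SNPs covered for at least one allele

diploid_pct = percent(both_alleles_tumour, hq_snps_tumour)
haploid_pct = percent(any_allele_tumour, hq_snps_tumour)
tumour_fold = fold_coverage(98_184_511_523)  # bases passing the quality filter

print(f"diploid coverage: {diploid_pct:.1f}%")
print(f"haploid coverage: {haploid_pct:.2f}%")
print(f"fold coverage:    {tumour_fold:.1f}x")
```

Using heterozygous array SNPs as the yardstick means the diploid figure measures what matters for somatic mutation discovery: whether both alleles at a position were actually observed.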
We sequenced the genome of normal skin cells from the same
patient to enable the identification of inherited sequence variants
in the tumour genome. Our targeted diploid coverage goal for the
skin-derived genome was 80%. We achieved this goal with only 34
Solexa runs (41.8 billion bases), using improved reagents and longer
read lengths to attain 82.6% diploid and 84.2% haploid coverage
(Table 1).
To begin evaluating the quantity and quality of the detected
sequence variants in the tumour and skin genomes, we compared
the overlap and uniqueness of this genome’s variants with respect to
the James D. Watson and J. Craig Venter genomes, and to dbSNP
(v127; Fig. 1). Of the 3.68 million single nucleotide variants (SNVs;
Maq SNP quality ≥15, excluding SNVs found on chromosome X)
predicted by Maq in the tumour genome, 2.36 million were present in
dbSNP, 2.36 million were detected in the skin genome (Fig. 1a),
1.50 million were detected in the Venter genome, and 1.58 million
were found in the Watson genome (Fig. 1b). Ultimately, 1.70 million
SNVs were unique to the 933124 tumour genome. On filtering the
933124 SNVs at different Maq quality values to determine the
stability of results, we observed that the proportion of 933124
SNVs that also are in dbSNP increases from 63.9% to 69.48% when
the Maq quality threshold score increases from 15 to 30, as expected.

Refining the detection of potential somatic mutations
Because the number of sequence variants initially detected by Maq
was high, we developed improved filtering tools to effectively
separate true variants from false positives. To this end, we generated an

Table 1 | Tumour and skin genome coverage from patient 933124

                                                 Tumour                 Skin
Libraries                                        4                      3
Runs                                             98                     34
Reads obtained                                   5,858,992,064          2,122,836,148
Reads passing quality filter                     3,025,923,365          1,228,177,690
Bases passing quality filter                     98,184,511,523         41,783,794,834
Reads aligned by Maq                             2,729,957,053          1,080,576,680
Reads unaligned by Maq                           295,966,312            138,276,594

SNVs detected with respect to hg18 (no Y)        3,811,115              2,918,446
SNVs (chr 1–22) detected with respect to hg18    3,681,968 (100.0%)     2,830,292 (100.0%)
SNVs also present in dbSNP                       2,368,458 (64.3%)      2,161,695 (76.4%)
SNVs also present in Venter genome               1,499,010 (40.7%)      1,383,431 (48.9%)
SNVs also present in Watson genome               1,573,435 (42.7%)      1,456,822 (51.5%)
SNVs not in dbSNP/Venter/Watson                  1,223,830 (33.2%)      591,131 (20.9%)
SNVs not in dbSNP/Venter/Watson/skin             925,200 (25.1%)        –

HQ SNPs                                          46,494 (100.0%)        46,572 (100.0%)
HQ SNPs where reference allele is detected       42,419 (91.2%)         38,454 (82.6%)
HQ SNPs where variant allele is detected         43,164 (92.9%)         39,220 (84.2%)
HQ SNPs where both alleles are detected          42,415 (91.2%)         38,454 (82.6%)

Assessments are shown of the haploid and diploid coverage of the tumour and skin genomes from AML patient 933124.
Chr, chromosome; hg18, human genome version 18; HQ, high quality.
`
experimental data set by re-sequencing Maq-predicted SNVs,
randomly selecting a training subset and a test data set, whose
annotations and features were submitted to Decision Tree C4.5 (ref. 22).
`
Figure 1 | Overlap of SNPs detected in 933124 and other genomes. a, Venn
diagram of the overlap between SNPs detected in the 933124 tumour
genome and the genomes of J. D. Watson and J. C. Venter. b, Venn diagram
of the overlap among the 933124 tumour genome, the skin genome and
dbSNP (ver. 127). SNVs were defined with a Maq SNP quality ≥15.
`
`68
`
`This approach identified parameters that separated true variants
`from false positives, revealing that SNV-supporting read counts
`(unique on the basis of read start position and base position in
`supporting reads), base quality and Maq quality scores are chief
`determinants for identifying false positives. Implementing rules
`obtained from the Decision Tree analysis resulted in 91.9% sensitivity
`and 83.5% specificity for validated SNVs.
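The paragraph above names the features that drove the filter (unique supporting read count, base quality and Maq quality) but not the learned rules themselves. The sketch below is therefore only an illustration of how a rule-based filter of this kind can be scored against re-sequencing results; the class, the function names, the thresholds and the toy records are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class CandidateSnv:
    unique_supporting_reads: int  # variant reads with distinct start positions
    mean_base_quality: float      # mean base quality at the variant position
    maq_snp_quality: int          # Maq SNP quality score
    validated: bool               # truth label from re-sequencing


def passes_filter(snv: CandidateSnv,
                  min_reads: int = 2,
                  min_base_quality: float = 20.0,
                  min_maq_quality: int = 15) -> bool:
    """Apply illustrative (hypothetical) thresholds to one candidate SNV."""
    return (snv.unique_supporting_reads >= min_reads
            and snv.mean_base_quality >= min_base_quality
            and snv.maq_snp_quality >= min_maq_quality)


def sensitivity_specificity(snvs: Iterable[CandidateSnv]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the filter against validation labels."""
    tp = fp = tn = fn = 0
    for snv in snvs:
        called = passes_filter(snv)
        if called and snv.validated:
            tp += 1
        elif called:
            fp += 1
        elif snv.validated:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy records, invented purely to exercise the functions.
toy = [
    CandidateSnv(5, 30.0, 40, True),    # passes, real variant  -> true positive
    CandidateSnv(3, 25.0, 20, True),    # passes, real variant  -> true positive
    CandidateSnv(1, 35.0, 50, True),    # too few reads, real   -> false negative
    CandidateSnv(4, 28.0, 30, False),   # passes, not real      -> false positive
    CandidateSnv(1, 12.0, 5, False),    # rejected, not real    -> true negative
    CandidateSnv(2, 15.0, 18, False),   # low base quality      -> true negative
]
sens, spec = sensitivity_specificity(toy)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity here is the fraction of validated variants that the rules keep, and specificity is the fraction of wild-type calls that the rules reject, which is how the 91.9% and 83.5% figures should be read.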
`
`Identification of somatic mutations in coding sequences
The patient had 3,813,205 sequence variants in her tumour genome,
as defined by Maq scores of >15 (Table 1). Of these, 2,647,695 were
supported by the Decision Tree analysis in the tumour genome, of
which 2,584,418 (97.6%) were also detected in the skin genome
(Fig. 2). The detailed algorithm for selecting putative somatic
variants is described in Supplementary Information. Most of the 63,277
tumour-specific variants we detected were either present in dbSNP or
were previously described in the Watson or Venter genomes
(31,645), or occurred in non-genic regions (20,440). A total of
11,192 variants were located within the boundaries of annotated
3,813,205 tumour SNVs (Maq15)
  -> 2,647,695 well supported SNVs (decision tree)
       excluded: 2,584,418 present in skin (SNPs)
  -> 63,277 tumour-specific SNVs
       excluded: 31,645 in dbSNP/Watson/Venter
  -> 31,632 new SNVs
       excluded: 20,440 in non-genic regions
  -> 11,192 SNVs in genic regions
       excluded: 10,735 intronic
       excluded: 216 in UTR
  -> 241 SNVs in coding sequence
       excluded: 60 synonymous
  -> 181 SNVs predicted to alter gene function (non-synonymous and splice junctions)
       152 validated as wild type (false positives)
       14 validated as germline SNVs (SNPs)
       7 unable to be validated (technical failures)
       8 validated as somatic SNVs (acquired mutations)

Figure 2 | Filters used to identify somatic point mutations in the tumour
genome. See text for details. UTR, untranslated regions.
`
`
[Figure 3: bar chart of Variant (%), on a 0 to 100 scale, for the primary tumour, relapse tumour and skin samples at SLC15A1, GRINL1B, GPR123, KNDC1, CDH24, PTPRT, PCLKC, EBI2, FLT3, NPM1, BRCA2 and TP53; asterisks mark significant differences.]

Figure 3 | Summary of Roche/454 FLX readcount data obtained for ten
somatic mutations and two validated SNPs in the primary tumour, relapse
tumour and skin specimens. The readcount data for the variant alleles in the
primary tumour sample and relapse tumour sample are statistically different
from that of the skin sample for all mutations (P < 0.000001 for all
mutations, Fisher’s exact test, denoted by a single asterisk in all cases). Note
that the normal skin sample was contaminated with leukaemic cells
containing the somatic mutations. The patient’s white blood cell count was
105,000 (85% blasts) when the skin punch biopsy was obtained.
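The significance statement in the caption can be checked from the 454 read counts with a one-tailed Fisher's exact test. The following is a minimal standard-library sketch, not the authors' code: the function names are illustrative, and the sum is done in log space because the read counts are in the thousands. The example counts are the CDH24 primary tumour and skin values from Table 3.

```python
import math


def _log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact P value for a 2x2 table.

    Row 1 = (a variant reads, b reference reads), e.g. tumour.
    Row 2 = (c variant reads, d reference reads), e.g. skin.
    Returns the probability of seeing at least `a` variant reads in row 1
    when the row and column totals are held fixed.
    """
    row1, row2 = a + b, c + d
    variants = a + c
    log_denominator = _log_choose(row1 + row2, variants)
    log_terms = [
        _log_choose(row1, x) + _log_choose(row2, variants - x) - log_denominator
        for x in range(a, min(row1, variants) + 1)
    ]
    peak = max(log_terms)  # log-sum-exp keeps the sum numerically stable
    log_p = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return min(1.0, math.exp(log_p))


# CDH24: primary tumour 5672 variant / 4890 reference; skin 564 / 10358.
p_cdh24 = fisher_exact_greater(5672, 4890, 564, 10358)
print(p_cdh24 < 0.000001)
```

With counts this large the P value underflows to zero in floating point, which is consistent with the reported bound rather than an exact figure.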
`
`tumour variants to move forward in the discovery pipeline if they
`were detected at a low frequency (two or fewer reads) in the skin
`sample, as defined by a binomial test.
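The text does not give the details of this binomial test, so the sketch below shows only one plausible reading: ask how likely it is to see so few variant reads in skin if the variant were an inherited heterozygous SNP (expected variant fraction 0.5). The null fraction of 0.5, the function name and the choice of example are assumptions; the read counts are the EBI2 skin values from Table 2 (18 wild type, 2 variant).

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of observing at most k successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# EBI2 skin reads: 18 wild type and 2 variant, so 20 reads in total.
skin_variant_reads = 2
skin_total_reads = 18 + 2

# Assumed null: a germline heterozygous variant, i.e. variant fraction 0.5.
p_if_germline_het = binomial_cdf(skin_variant_reads, skin_total_reads, 0.5)
print(f"P(<= {skin_variant_reads} variant reads | heterozygous) = {p_if_germline_het:.6f}")
```

A small probability under this null supports treating the low-level skin signal as tumour contamination rather than as an inherited allele.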
`
`Detecting insertions and deletions (indels)
To discover small indels (<6 bp) from sequence reads (32–35 bp
long), we started with a set of 236 million reads that were not
confidently aligned by Maq to the reference genome. We applied
Cross_Match and BLAT to identify gapped alignments that are unique
in the genome. To detect indels longer than 6 bp, we developed a ‘split
reads’ algorithm (see Supplementary Information) that aligns
subsegments of reads independently to the genome, and computes a
mapping quality for the derived gapped alignment on the basis of
the number of hits and the quality of the bases. These efforts resulted
in the identification of 726 putative small indels (1 to 30 bp in size)
that occur in coding exons, 393 of which (54.2%) were found in
dbSNP. After manual review, we selected a set of 28 putative somatic
coding indels for validation using PCR-based dye terminator
sequencing. Of these putative indels, 22 were validated but were found
present in both tumour and skin (15 of these were in dbSNP), two were
false positive calls, two had no coverage, and two were previously
validated somatic insertions in NPM1 (4 bp) and FLT3 (30 bp).
`
`Discussion
`Here we describe the sequencing and analysis of a primary human
`cancer genome using next-generation sequencing technology. Our
`
`genes; 216 of these variants were in untranslated regions, and 10,735
`were in introns (but not involving splice junctions) and were not
`explored further in our analysis. Of the coding sequence variants, 60
`were synonymous, and not further evaluated. The remaining 181
`variants were either non-synonymous, or were predicted to alter
`splice site function. By sequencing polymerase chain reaction
`(PCR)-generated amplicons from the tumour and skin samples
`(and also from the relapse tumour sample obtained 11 months after
`the original presentation), we determined that 152 of these variants
`were false positive (that is, wild type) calls, 14 were inherited SNPs,
`and eight were somatic mutations in both the original tumour and
`the relapse sample (Table 2). Seven variants could not be validated,
`either because the regions involved were repetitive, or because all
`attempts to obtain PCR amplicons failed. All of the PCR-amplified
`exons from the eight genes containing validated somatic mutations
`were sequenced in 187 further cases of AML using samples from our
`discovery and validation sets23; no further somatic mutations were
`detected in these genes (data not shown). A description of how we
`estimated the false negative (12.45%) and false positive (0.06%) rates
`for SNVs over the entire genome is presented in Supplementary
`Information. Using these estimates, we can predict that very few
`somatic, non-synonymous variants were missed by our analysis of
`this deeply covered genome.
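The successive filters described in this section can be checked as simple subtraction. The short sketch below uses only counts quoted in the text and in Fig. 2; the variable names are illustrative.

```python
# Counts quoted in the text and in Fig. 2; variable names are illustrative.
well_supported = 2_647_695   # tumour SNVs supported by the Decision Tree analysis
present_in_skin = 2_584_418  # also detected in the skin genome (inherited)
known_variants = 31_645      # in dbSNP or the Watson/Venter genomes
non_genic = 20_440
untranslated = 216
intronic = 10_735
synonymous = 60

tumour_specific = well_supported - present_in_skin
new_snvs = tumour_specific - known_variants
genic = new_snvs - non_genic
coding = genic - untranslated - intronic
candidates = coding - synonymous

# Outcome of PCR-based validation of the remaining candidates.
validation = {
    "wild type (false positive)": 152,
    "inherited SNP": 14,
    "somatic mutation": 8,
    "could not be validated": 7,
}

print(tumour_specific, new_snvs, genic, coding, candidates)
print(sum(validation.values()) == candidates)
```

Laid out this way, the skin comparison is clearly the dominant filter: it removes about 97.6% of the well-supported tumour variants before any annotation is consulted.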
`
`Defining mutation frequencies in the tumour sample
To better define the percentage of tumour cells that contained each
of the discovered somatic mutations, we amplified each
mutation-containing locus from non-amplified genomic DNA derived from
the de novo and relapse tumour samples, and from the skin biopsy
obtained at presentation. The resulting amplicons were sequenced
using the Roche/454 FLX platform, and the frequencies of reads
containing the reference and variant alleles were defined (Fig. 3 and
Table 3). Control amplicons containing a known heterozygous
SNP in BRCA2 (encoding N372H) and a homozygous SNP in
TP53 (encoding P72R) were analysed similarly. The BRCA2 SNP
yielded ~50% variant frequencies in the tumour and skin samples,
whereas nearly 100% of the TP53 alleles were variant in all three
samples, as expected. Remarkably, all eight somatic SNVs were
detected at ~50% frequencies in the primary tumour sample
(100% blasts), and at ~40% frequencies in the relapse sample
(78% blasts; if the variant frequencies are corrected for blast
counts, that is, multiplied by 1.28, the frequencies at relapse also
were ~50%). The NPMc (cytoplasmic nucleophosmin) mutation
was also detected at a frequency of ~50%, but the FLT3 internal
tandem duplication (ITD) allele was only detected in 35.1% of the
454 reads at diagnosis and 31.3% at relapse, suggesting that the
mutation was not present in all tumour cells at diagnosis or relapse.
Notably, the variant alleles also were detected at frequencies of
~5–13% in the skin sample. In retrospect, it is clear that the skin
sample contained contaminating leukaemic cells, because the
patient’s white blood cell count at presentation was 105,000 per
microlitre, with 85% blasts. This information was used to inform
the Decision Tree analysis described above: we allowed high-quality
Table 2 | Non-synonymous somatic mutations detected in the AML sample

Gene      Consequence  Type      Solexa tumour reads  Solexa skin reads  Conservation score  Mutations in other
                                 WT:variant           WT:variant         of mutant base      AML cases*
CDH24     Y590X        Nonsense  9:9                  16:0               0.998               0/187
SLC15A1   W77X         Nonsense  15:12                19:0               1.000               0/187
KNDC1     L799F        Missense  7:8                  20:0               NA                  0/187
PTPRT     P1235L       Missense  9:13                 16:0               1.000               0/187
GRINL1B   R176H        Missense  15:10                14:0               NA                  0/187
GPR123    T38I         Missense  11:11                13:0               NA                  0/187
EBI2      A338V        Missense  7:12                 18:2               1.000               0/187
PCLKC     P1004L       Missense  19:9                 15:1               0.98                0/187
FLT3      ITD          Indel     18:12                8:0                NA                  51/185
NPM1      CATG ins     Indel     36:6                 33:0               NA                  43/180

Ins, insertion; WT, wild type.
* Patient cohort defined in ref. 23.
`
`
Table 3 | 454 Readcount data for somatic mutations and known SNPs

                        Primary AML (100% blasts)      Skin                           Relapse (78% blasts)
Gene      Consequence   Variant  Ref    Variant (%)    Variant  Ref    Variant (%)    Variant  Ref    Variant (%)
CDH24     Y590X         5672     4890   53.70          564      10358  5.16           3108     4599   40.33
SLC15A1   W77X          3817     4962   43.48          875      10773  7.51           4714     7173   39.66
KNDC1     L799F         4640     4848   48.90          770      8972   7.90           3883     6342   37.98
PTPRT     P1235L        998      1058   48.54          126      1489   7.80           350      493    41.52
GRINL1B   R176H         2211     2674   45.26          318      4461   6.65           1447     2070   41.14
GPR123    T38I          4618     4569   50.27          850      9751   8.02           3660     6057   37.67
EBI2      A338V         12750    15453  45.21          458      10088  4.34           2646     3627   42.18
PCLKC     P1004L        992      855    53.71          341      3153   9.76           705      773    47.70
FLT3      ITD           4220     7810   35.08          3475     23159  13.05          3870     8495   31.30
NPM1      CATG ins      1550     1974   43.98          143      2390   5.65           2303     3910   37.07
BRCA2     N372H         778      752    50.85          763      876    46.55          285      303    48.47
TP53      P72R          8989     1      99.99          8161     0      100.00         7914     6      99.92

The differences between variant frequencies in primary or relapse tumour samples and skin were highly significant for all somatic mutations (P < 0.000001, Fisher’s exact test, one tailed). The
BRCA2 variant is a known heterozygous SNP in this genome, and the TP53 variant is a known homozygous SNP.
patient’s tumour genome was essentially diploid, and contained ten
non-synonymous somatic mutations that may be relevant for her
disease. These mutations affect genes participating in several
well-described pathways that are known to contribute to cancer
pathogenesis, but most of these genes would not have been candidates for
directed re-sequencing on the basis of our current understanding of
cancer. Hence, these results justify the use of next-generation
whole-genome sequencing approaches to reveal somatic mutations in
cancer genomes.
As we demonstrated in our re-sequencing of the genome of the C.
elegans N2 Bristol strain14, and again in this study, massively parallel
short-read sequencing provides an effective method for examining
single nucleotide and short indel variants by comparison of the aligned
reads to a reference genome sequence. By sequencing our patient’s
tumour genome to a depth of >30-fold coverage, and gauging our
ability to detect known heterozygous positions across the genome,
we have produced a sufficient depth and breadth of sequence coverage
to comprehensively discover somatic genome variants. A slightly lower
coverage of the normal genome from this individual helped to identify
nearly 98% of potential variants as being inherited, a critical filter that
allowed us to more readily identify the true somatic mutations in this
tumour. Our results strongly support the notion that
hypothesis-driven (for example, candidate gene-based) examination of tumour
genomes by PCR-directed or capture-based methods is inherently
limited, and will miss key mutations. A further and important
consideration is the demand for large amounts of genomic DNA by these
techniques; this is a serious limitation when precious clinical samples
are being studied. The Illumina/Solexa technology requires only ~1 μg
of DNA per library, enabling the study of primary tumour DNA rather
than requiring the use of tumour cell lines, which may contain genetic
changes and adaptations required for immortalization and
maintenance in tissue culture conditions.
A total of ten non-synonymous somatic mutations were identified
in this patient’s tumour genome. Two are well-known AML-associated
mutations, including an internal tandem duplication of the FLT3
receptor tyrosine kinase gene, which constitutively activates kinase
signalling, and portends a poor prognosis5,24,25, and a four-base
insertion in exon 12 of the NPM1 gene (NPMc)26–28. Both of these mutations
are common (25–30%) in AML tumours, and are thought to
contribute to progression of the disease rather than to cause it directly29.
Notably, the frequency of the mutant FLT3 allele in the primary and
relapse tumour samples (35.08% and 31.30%, respectively) was
significantly less than that of the other nine mutations (P < 0.000001
for both the primary and relapse samples). These data suggest that the
FLT3 ITD may not have been present in all tumour cells, and further,
that it may have been the last mutation acquired.
`The other eight somatic mutations that we detected are all single
`base changes, and none has previously been detected in an AML
`genome. Four of the genes affected, however, are in gene families
`that are strongly associated with cancer pathogenesis (including
`
`70
`
`PTPRT, CDH24, PCLKC and SLC15A1). The other four somatic
`mutations occurred in genes not previously implicated in cancer
`pathogenesis, but whose potential functions in metabolic pathways
`suggest mechanisms by which they could act to promote cancer
`(including KNDC1, GPR123, EBI2 and GRINL1B). We speculate
`about the roles of these mutations for the pathogenesis of this
`patient’s disease in Supplementary Information.
`The importance of the eight newly defined somatic mutations for
`AML pathogenesis is not yet known, and will require functional
`validation studies in tissue culture cells and mouse models to assess
`their relevance. Even though we could not detect recurrent mutations
`in the limited AML sample set that we surveyed, several lines of
`evidence suggest that these mutations may not be random, ‘passen-
`ger’ mutations. First, somatic mutations in this genome are extremely
`rare. The rarity of somatic variants, and the normal diploid structure
`of the tumour genome, argues strongly against genetic instability or
`DNA repair defects in this tumour. Conceptually, this result is further
`supported by the very small number of somatic mutations discovered
`in the expressed tyrosine kinases of AML samples4,5; genetic insta-
`bility does not seem to be a general feature of AML genomes.
`Second, on the basis of the equivalent frequencies of the variant
`and wild-type alleles for the mutations in the tumour genome (except
`for FLT3 ITD), it is highly probable that all the mutations are het-
`erozygous, and are present in virtually all of the tumour cells (Fig. 3).
`The latter suggests that these mutations may have all been selected for
`and retained because they are important for disease pathogenesis in
`this patient. Alternatively, all may have occurred simultaneously in
`the same leukaemia-initiating cell, but only a subset of the mutations
`(or an as-yet undetected mutation) is truly important for pathoge-
`nesis (that is, disease ‘drivers’ versus passengers). Although we sug-
`gest that the latter hypothesis is very unlikely on the basis of our
`current understanding of tumour progression, many more AML
`genomes will need to be sequenced to resolve this issue.
`Third, the same mutations were detected in tumour cells in the
`relapse sample at approximately the same frequencies as in the prim-
`ary sample. All of these mutations were therefore present in the
`resistant tumour cells that contributed to the patient’s relapse, fur-
`ther suggesting that a single clone contains all ten mutations. Fourth,
`seven of the ten genes containing somatic mutations were detectably
`expressed in the tumour sample. FLT3 and NPM1 messenger RNAs
`were highly expressed in this tumour sample, as they are in virtually
all AML samples. We detected mRNA from the CDH24, SLC15A1
and EBI2 genes on the Affymetrix expression array, whereas
expression of GRINL1B and PCLKC was detected by PCR with reverse
transcription (RT–PCR; data not shown). Expression of KNDC1,
`PTPRT and GPR123 was not detected by either approach, but we
`cannot rule out expression of these genes in a small subset of tumour
`cells (for example, leukaemia-initiating cells). Furthermore, for the
`five point mutatio
