`
`stm.sciencemag.org
`
`Downloaded from
`
Noninvasive Identification and Monitoring of Cancer Mutations by Targeted Deep Sequencing of Plasma DNA
Tim Forshew et al.
Sci Transl Med 4, 136ra68 (2012);
DOI: 10.1126/scitranslmed.3003726
`
`Editor's Summary
`
`Deep Sequencing Tumor DNA in Plasma
`
Five liters of circulating blood contain millions of copies of the genome, broken into short fragments; in cancer patients, a small fraction is circulating tumor DNA (ctDNA). An even smaller number harbor mutations that affect cancer outcome. Looking for diagnostic answers in circulating DNA is a challenge, but Forshew, Murtaza, and colleagues have risen to the occasion by developing a tagged-amplicon deep sequencing (TAm-Seq) method that can amplify and sequence large genomic regions from even single copies of ctDNA. By sequencing such large regions, the authors were able to identify low-level mutations in the plasma of patients with high-grade serous ovarian carcinomas.
`
Forshew et al. designed primers to amplify 5995 bases that covered select regions of cancer-related genes, including TP53, EGFR, KRAS, and BRAF. In plasma obtained from 38 patients with high levels of ctDNA, the authors were able to identify mutations in TP53 at allelic frequencies of 2% to 65%. In plasma samples from one patient, they also identified a de novo mutation in EGFR that had not been detected 15 months prior in the tumor mass itself.
`Finally, the TAm-Seq approach was used to sequence ctDNA in plasma samples collected from two women with
`ovarian cancer and one woman with breast cancer at different time points, tracking as many as 10 mutations in
parallel. Forshew and coauthors showed that levels of mutant alleles reflected the clinical course of the disease and its treatment; for example, stabilized disease was associated with low allelic frequency, whereas patients at relapse exhibited a rise in frequency.
`
`Through several experiments, the authors were able to show that TAm-Seq is a viable method for sequencing
`large regions of ctDNA. Although this provides a new way to noninvasively identify gene mutations in our blood,
`TAm-Seq will need to achieve a more sensitive detection limit (<2% allele frequency) to identify mutations in the
plasma of patients with less advanced cancers. Nevertheless, once optimized, this "liquid biopsy" approach will be
`amenable to personalized genomics, where the level and type of mutations in ctDNA would inform clinical
`decision-making on an individual basis.
`
`
`
A complete electronic version of this article and other services, including high-resolution figures, can be found at:
` http://stm.sciencemag.org/content/4/136/136ra68.full.html
`
`
`
`Supplementary Material
`can be found in the online version of this article at:
`http://stm.sciencemag.org/content/suppl/2012/05/25/4.136.136ra68.DC1.html
`
Information about obtaining reprints of this article or about obtaining permission to reproduce this article in whole or in part can be found at:
` http://www.sciencemag.org/about/permissions.dtl
`
`
`
`
Science Translational Medicine (print ISSN 1946-6234; online ISSN 1946-6242) is published weekly, except the last week in December, by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright 2012 by the American Association for the Advancement of Science; all rights reserved. The title Science Translational Medicine is a registered trademark of AAAS.
`
`PGDX EX. 1005
`Page 1 of 33
`
`
`
`--
`
`
` on May 31, 2012
`
`
RESEARCH ARTICLE

CANCER GENOMICS
`Noninvasive Identification and Monitoring of
`Cancer Mutations by Targeted Deep
`Sequencing of Plasma DNA
`Tim Forshew,1* Muhammed Murtaza,1,2* Christine Parkinson,1,2,3* Davina Gale,1*
`Dana W. Y. Tsui,1* Fiona Kaper,4† Sarah-Jane Dawson,1,2,3 Anna M. Piskorz,1,2
`Mercedes Jimenez-Linan,3,5 David Bentley,6 James Hadfield,1 Andrew P. May,4 Carlos Caldas,1,2,3,7
`James D. Brenton,1,2,3,7‡ Nitzan Rosenfeld1,2‡
`
Plasma of cancer patients contains cell-free tumor DNA that carries information on tumor mutations and tumor burden. Individual mutations have been probed using allele-specific assays, but sequencing of entire genes to detect cancer mutations in circulating DNA has not been demonstrated. We developed a method for tagged-amplicon deep sequencing (TAm-Seq) and screened 5995 genomic bases for low-frequency mutations. Using this method, we identified cancer mutations present in circulating DNA at allele frequencies as low as 2%, with sensitivity and specificity of >97%. We identified mutations throughout the tumor suppressor gene TP53 in circulating DNA from 46 plasma samples of advanced ovarian cancer patients. We demonstrated use of TAm-Seq to noninvasively identify the origin of metastatic relapse in a patient with multiple primary tumors. In another case, we identified in plasma an EGFR mutation not found in an initial ovarian biopsy. We further used TAm-Seq to monitor tumor dynamics, and tracked 10 concomitant mutations in plasma of a metastatic breast cancer patient over 16 months. This low-cost, high-throughput method could facilitate analysis of circulating DNA as a noninvasive “liquid biopsy” for personalized cancer genomics.
`
`INTRODUCTION
Circulating cell-free DNA extracted from plasma or other body fluids has potentially transformative applications in cancer management (1–7). Characterization of tumor mutation profiles is required for informed choice of therapy, given that biological agents target specific pathways and effectiveness may be modulated by specific mutations (8–11). Yet, mutation profiles in different metastatic clones can differ significantly from each other or from the parent primary tumor (12, 13). Evolutionary changes within the cancer can alter the mutational spectrum of the disease and its responsiveness to therapies, which may necessitate repeat biopsies (14–17). Biopsies are invasive and costly and only provide a snapshot of mutations present at a given time and location. For some applications, mutation detection in plasma DNA as a “liquid biopsy” could potentially replace invasive biopsies as a means to assess tumor genetic characteristics (2–7). Sensitive methods for detecting cancer mutations in plasma may find use in early detection screening (1), prognosis, monitoring tumor dynamics over time, or detection of minimal residual disease (3, 18, 19). In high-grade serous ovarian carcinomas (HGSOC), mutations in the tumor suppressor gene TP53 have been observed in 97% of cases (20, 21), but these are located throughout the gene and are difficult to assay. A cost-effective method that could detect and measure allele frequency (AF) of TP53 mutations in plasma may be highly applicable as a biomarker for HGSOC (22).

1Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK. 2Department of Oncology, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK. 3Addenbrooke’s Hospital, Cambridge University Hospital NHS Foundation Trust and National Institute for Health Research Cambridge Biomedical Research Centre, Cambridge CB2 2QQ, UK. 4Fluidigm Corporation, 7000 Shoreline Court, Suite 100, South San Francisco, CA 94080, USA. 5Department of Histopathology, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK. 6Illumina Cambridge, Chesterford Research Park, Little Chesterford, Cambridge CB10 1XL, UK. 7Cambridge Experimental Cancer Medicine Centre, Cambridge CB2 0RE, UK.
*These authors contributed equally to this work.
†Present address: Illumina, Inc., 5200 Illumina Way, San Diego, CA 92122, USA.
‡To whom correspondence should be addressed. E-mail: nitzan.rosenfeld@cancer.org.uk (N.R.); james.brenton@cancer.org.uk (J.D.B.)

Circulating DNA is fragmented to an average length of 140 to 170 base pairs (bp) and is present in only a few thousand amplifiable copies per milliliter of blood, of which only a fraction may be diagnostically relevant (2, 3, 23–25). Recent advances in noninvasive prenatal diagnostics highlight the clinical potential of circulating DNA (25–28), but also the challenges involved in analysis of circulating tumor DNA (ctDNA), where mutated loci and AFs may be more variable. Various methods have been optimized to detect extremely rare alleles (1, 2, 6, 7, 29–31), and can assay for predefined or hotspot mutations. These methods, however, interrogate individual or few loci and have limited ability to identify mutations in genes that lack mutation hotspots, such as the TP53 and PTEN tumor suppressor genes (32). In patients with more advanced cancers, ctDNA can comprise as much as 1% to 10% or more of circulating DNA (2), presenting an opportunity for more extensive genomic analysis. Targeted resequencing has been recently used to identify mutations in selected genes at AFs as low as 5% (33–35). However, identifying mutations across sizeable genomic regions spanning entire genes at an AF as low as 2%, or in few nanograms of fragmented template from circulating DNA, has been more challenging.
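To make the copy-number constraint concrete, the following minimal sketch estimates how many mutant template copies one might expect in a single reaction. The function name and the example inputs (2000 amplifiable copies per ml, 0.1 ml of sample, AF of 2%) are illustrative assumptions chosen to sit within the ranges quoted above; they are not values reported by the authors.

```python
def expected_mutant_copies(copies_per_ml: float, sample_ml: float, allele_fraction: float) -> float:
    """Expected number of mutant template copies in one reaction.

    copies_per_ml   -- amplifiable genome copies per millilitre (assumed input)
    sample_ml       -- volume of sample whose DNA goes into the reaction
    allele_fraction -- mutant allele frequency, between 0 and 1
    """
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    total_copies = copies_per_ml * sample_ml
    return total_copies * allele_fraction


if __name__ == "__main__":
    # Hypothetical inputs: 2000 copies/ml, 0.1 ml, AF = 2%
    copies = expected_mutant_copies(2000, 0.1, 0.02)
    print(f"Expected mutant copies: {copies:.1f}")
```

With inputs of this order, only a handful of mutant molecules enter the reaction, which is why losing or under-sampling even a few template copies matters.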
In response, we describe a tool for noninvasive mutation analysis on the basis of tagged-amplicon deep sequencing (TAm-Seq), which allows amplification and deep sequencing of genomic regions spanning thousands of bases from as little as individual copies of fragmented DNA. We applied this technique for detection of both abundant and rare mutations in circulating DNA from blood plasma of ovarian and breast cancer patients. This sequencing approach allowed us to monitor changes in tumor burden by sampling only patient plasma over time. Combined with faster, more accurate sequencing technologies or rare allele amplification strategies, this approach could potentially be used for personalized medicine at point of care.

www.ScienceTranslationalMedicine.org    30 May 2012    Vol 4 Issue 136 136ra68

`
`RESULTS
`
`Targeted deep sequencing of fragmented DNA by TAm-Seq
To amplify and sequence fragmented DNA, we designed primers to generate amplicons that tile regions of interest in short segments of about 150 to 200 bases (Fig. 1A and table S1), incorporating universal adaptors at 5′ ends (fig. S1). Performing single-plex amplification with each of these primer pairs would require dispersing the initial sample into many separate reactions, considerably increasing the probability of sampling errors and allelic loss. Multiplex amplification using a large set of primers could result in nonspecific amplification products and biased coverage. We therefore applied a two-step amplification process: a limited-cycle preamplification step where all primer sets were used together to capture the starting molecules present in the template, followed by individual amplification to purify and select for intended targets (Fig. 1B) (Supplementary Methods). The final concentration of each primer in the preamplification reaction was 50 nM, reducing the potential for interprimer interactions, and 15 cycles of long-extension (4 min) polymerase chain reaction (PCR) were used to remain in the exponential phase of amplification. We used a microfluidic system (Access Array, Fluidigm) to perform parallel single-plex amplification from multiple preamplified samples using multiple primer sets. An additional PCR step attached sequencing adaptors (fig. S1) and tagged each sample by a unique molecular identifier or “barcode” (table S2). Sequencing adaptors were separately attached at either end and the products mixed together, such that single-end sequencing generated separate sets of forward and reverse reads. We performed 100-base single-end sequencing (GAIIx sequencer, Illumina), with an additional 10 cycles using the barcode sequencing primer, generating ~30 million reads per lane. This produced an average read depth of 3250 for each of 96 barcoded samples for 48 amplicons read in two possible orientations.

[Figure 1 panels. (A) Amplicon design across TP53 exons 5 and 6 (650 bp scale bar). (B) Workflow: DNA (dilute or degraded), single-plex PCR, barcoding PCR, pool and sequence. (C, D) Number of nonreference bases versus frequency of nonreference allele, overall and for each of the 12 base substitutions.]

Fig. 1. Overview of tagged amplicon sequencing (TAm-Seq). (A) Illustration of amplicon design. Primers were designed to amplify regions of interest in overlapping short amplicons (table S1). Amplicon design is illustrated for a region covering exons 5 to 6 of TP53. Colored bars, segmented into forward and reverse reads, show regions covered by different amplicons (excluding primer regions). Sequencing adaptors are attached at either end, such that a single-end read generates separate sets of forward and reverse reads (fig. S1). Because amplicons are mostly shorter than 200 bp, the forward and reverse reads also partially overlap. Figure adapted from University of California, Santa Cruz, Genome Browser (http://genome.ucsc.edu/). (B) Workflow overview. Multiple regions were amplified in parallel. An initial preamplification step was performed for 15 cycles using a pool of the target-specific primer pairs to preserve representation of all alleles in the template material. The schematic diagram shows DNA molecules that carry mutations (red stars) being amplified alongside wild-type molecules. Regions of interest in the preamplified material were then selectively amplified in individual (single-plex) PCR, thus excluding nonspecific products. Finally, sequencing adaptors and sample-specific barcodes were attached to the harvested amplicons in a further PCR. (C) Distribution of observed nonreference read frequencies, averaged over 47 FFPE samples, across all loci and all nonreference bases. Inset expands the low-frequency range. (D) Distribution of the observed background nonreference read frequencies averaged over 47 FFPE samples for the 12 different A/C/G/T base substitutions.

Validation and sensitivity for mutation identification in ovarian tumor samples
We designed a set of 48 primer pairs to amplify 5995 bases of genomic sequence covering coding regions (exons and exon junctions) of TP53 and PTEN, and selected regions in EGFR, BRAF, KRAS, and PIK3CA (table S1) by overlapping short amplicons (Fig. 1A). The sequenced regions cover mutations that account for 38% of all point mutations in the COSMIC database (v55) (32). We used TAm-Seq to sequence DNA extracted from 47 formalin-fixed, paraffin-embedded (FFPE) tumor specimens of ovarian cancers (table S3), which were also sequenced for TP53 by Sanger sequencing (36) (Supplementary Methods). DNA extracted from FFPE samples is generally degraded and fragmented as a result of fixation and long-term ambient storage. We amplified DNA from each sample in duplicate, tagging each replicate with a different barcode. Using a single lane of sequencing, we generated 3.5 gigabases of data passing signal purity filters, producing mean read depth of 3200 above Q30 for each of the 9024 expected read groups (48 amplicons × 2 directions × 94 barcoded samples). Background frequencies of nonreference reads were ~0.1% (median, 0.03%; mean, 0.2%; in keeping with Q30 quality threshold applied), yet varied substantially between loci and base substitutions (Fig. 1C) and showed a clear bias toward purine/pyrimidine conservation (Fig. 1D). Sixty-six percent of loci had mean background rate of <0.1%, and 96% of loci had background rate of <0.6%.
The data set interrogated nearly 18,000 possible single-base substitutions for each sample, which introduces a risk of false detection. To control for sporadic PCR errors and reduce false positives, we called point mutations in a sample only if nonreference AFs were above the respective substitution-specific background distribution at a high confidence margin (0.9995 or greater), and ranked high in the list of nonreference AFs, in both replicates (Supplementary Methods). Duplicate sequencing data were obtained for 44 samples, and 43 single-base substitutions were called (table S3). These matched 100% of mutations identified by Sanger sequencing and included three additional mutations at low AFs that were below detection thresholds of Sanger sequencing (fig. S2). The upper bound of AFs that may have been missed was estimated (Supplementary Methods) at <5% for 36 of 44 FFPE samples (82%) and <10% for 42 of 44 samples (95%), with median value of 1.3% and mean value of 2.7%. Mutant AFs were highly reproducible in duplicate samples. For 42 of 43 mutations called, the difference in measured frequency between duplicates was less than 0.08, and the relative difference was 25% or less (Fig. 2A). Mutant AFs correlated significantly with tumor cellularity in the FFPE block (correlation coefficient = 0.422; P = 0.0049, t test) (Fig. 2B).
In a separate run, we sequenced libraries prepared from six different diluted mixtures of six FFPE samples, with a different known point mutation in TP53 in each, to mean read depth of 5600. Of more than 100,000 possible non-SNP (single-nucleotide polymorphism) substitutions, we identified all 33 expected point mutations present at AF >1%, including 6 mutations present at AF <2%, with one false-positive called with AF = 1.9%. Using less stringent parameters (Supplementary Methods), we identified three additional mutations present at AF = 0.6% (Fig. 2C), with no additional false positives. Thus, we obtained 100% sensitivity, identifying mutations at AFs as low as 0.6%. A positive predictive value (PPV) of 100% was calculated for mutations at AF >2%, and a PPV of 90% for mutations identified at AF <2% (Fig. 2D).

[Figure 2 panels. (A, C) Frequency of mutant allele in repeat 2 versus repeat 1 (A legend: detected by Sanger seq, missed by Sanger seq; C legend: known mutations, false positive). (B) Measured frequency of mutant allele versus tumor cellularity of FFPE sample. (D) Allele frequency versus mutations (sorted by allele frequency).]

Fig. 2. Identification of mutations in ovarian cancer FFPE samples by TAm-Seq. (A) Concordance between duplicate measurements of AFs of mutations identified in fragmented DNA extracted from FFPE samples. The mutation frequency in each library was calculated as the fraction of reads with the mutant (nonreference) base. Solid line indicates equality. Dotted lines indicate a difference in AF of 0.05. (B) Correlation of AF with FFPE tumor cellularity. The measured mutant AF (average of both repeats) correlated significantly with the cellularity, estimated from histology (table S3). (C) Concordance between duplicate measurements of AFs of mutations identified in a mixture of DNA extracted from different FFPE samples. (D) Summary of mutations called in FFPE using TAm-Seq, sorted by increasing AF. Dotted line indicates AF of 2%.

Quantitative limitations of mutation detection
When applying TAm-Seq to measure a predefined mutation (as opposed to screening thousands of possible substitutions), the frequency of the mutant allele can be read out directly from the data at the desired locus. False detection is less likely, and criteria for confident mutation detection for a predefined substitution can be less stringent than those described above for de novo mutation identification (Supplementary Methods). The minimal nonreference AFs that could be detected depend on the read depth and background rates of nonreference reads, which vary per locus and substitution type. Minimal detectable frequencies increase when higher confidence margins are used (Supplementary Methods) and had a median value of 0.14% at confidence margin of 0.95 and 0.18% at confidence margin of 0.99 (fig. S3). The minimal detectable frequency would also be limited if a minimal number of reads is applied for confident mutation detection; for example, a minimum of 10 reads implies that sequencing depth of 5000 would be required to detect mutations at AF as low as 0.2%. For alleles present at ~10 or fewer copies in the starting template, reproducibility would also be limited by sampling noise, because these alleles may be over- or underrepresented in any particular reaction.
To characterize the quantitative accuracy of TAm-Seq as applied to circulating DNA, we simulated rare circulating tumor mutations by mixing plasma DNA from two healthy individuals. Using the same set of primers as used for the FFPE experiment, we identified that these two individuals differed at five known SNP loci (table S4). Total amplifiable copies in both plasma DNA samples were determined by digital PCR and mixed to obtain minor AFs ranging from 0.16% to 40% (Supplementary Methods). We sequenced diluted templates containing between 250 and <1 expected copy of the minor allele (table S5). The coefficient of variation (CV) of the observed AFs was equal on average to the inverse square root (1/√n) of the expected number of copies of the rare allele (Fig. 3A), which is the theoretical limit of accuracy set by the Poisson distribution for independently segregating molecules. We compared the observed AF to the expected AF for cases where more than six copies of the minor allele were expected. Of 24 such cases, the root mean square (RMS) relative error between the expected and the observed frequency was 14%, with only 2 of 24 cases exhibiting more than 20% discrepancy. For samples with expected minor AF of 0.025, the RMS error was 23% (Fig. 3B).
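
The accuracy summary above (an RMS relative error plus a count of cases outside a 20% band) can be reproduced with a few lines of code. This is a minimal sketch of the standard definitions, not the authors' analysis script; the function names and the example numbers are hypothetical, and the per-sample dilution data live in table S5, which is not reproduced here.

```python
import math


def rms_relative_error(expected, observed):
    """Root mean square of (observed - expected) / expected over paired values."""
    if len(expected) != len(observed) or not expected:
        raise ValueError("expected and observed must be non-empty and the same length")
    squared = [((o - e) / e) ** 2 for e, o in zip(expected, observed)]
    return math.sqrt(sum(squared) / len(squared))


def count_discrepant(expected, observed, tolerance=0.20):
    """Number of pairs whose relative deviation exceeds the tolerance."""
    return sum(1 for e, o in zip(expected, observed) if abs(o - e) / e > tolerance)


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only
    expected_af = [0.10, 0.20, 0.05]
    observed_af = [0.11, 0.18, 0.065]
    print(f"RMS relative error: {rms_relative_error(expected_af, observed_af):.3f}")
    print(f"Cases beyond 20%: {count_discrepant(expected_af, observed_af)}")
```

Using a relative rather than absolute error keeps low-frequency alleles from being swamped by high-frequency ones when the dilution series spans several orders of magnitude.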
`
Noninvasive identification of cancer mutations in plasma circulating DNA
We applied TAm-Seq to directly identify mutations in plasma of cancer patients. We studied a cohort of samples from individuals with HGSOC. These samples were first analyzed for tumor-specific mutations using digital PCR (Supplementary Methods), a method that is highly accurate (2, 3, 7, 37) but requires design and validation of a different assay for every mutation screened and relies on previous identification of mutations in tumor samples from the same patients (2, 3). We initially selected for analysis seven cases that had relatively high levels of circulating mutant TP53 DNA in the plasma (as assessed by digital PCR). Using the equivalent amount of DNA present in 30 to 120 µl of plasma, we performed duplicate preamplification reactions for each sample. For all seven patients, TP53 tumor mutations were identified in the circulating DNA at frequencies of 4% to 44% (Table 1). In one plasma sample collected from an ovarian cancer patient at relapse, we also identified a de novo mutation in the tyrosine kinase domain of EGFR (exon 21), at AF of 6% (patient 27, Table 1). We subsequently validated the presence of this mutation in plasma by performing replicate Sanger sequencing reactions of highly diluted template (Supplementary Methods), and 4 of 91 wells that were successfully Sanger-sequenced contained the EGFR mutation (fig. S4). We further validated the presence of this mutation by designing a sequence-specific TaqMan probe targeting this mutation and performing digital PCR (Table 1). The mutation was also identified by TAm-Seq in additional plasma collected from the same individual (sample 16, Table 2). This mutation in EGFR was not found in the ovarian mass removed by interval debulking surgery 15 months before the blood sample was collected, although the same sample did contain the concomitant TP53 mutation found in the same patient’s plasma, at AF of 85% (patient 27, table S3). We subsequently used TAm-Seq to sequence seven additional samples collected at the time of initial surgery including deposits in right and left ovaries and omentum. The EGFR mutation was detected in the two omental samples above the 0.99 confidence margin (fig. S3) at AF of 0.7%, but was not detected in the six ovarian samples (below the 0.8 confidence margin). Without previous identification in plasma, this mutation would not have been directly identified on screening those samples using high-specificity mutation identification criteria owing to its low AF. In contrast, the TP53 mutation was identifiable in all biopsy and plasma samples (Fig. 4A). The frequency of mutant alleles in the relapsed tumor could not be directly assessed because a biopsy at relapse was not available.
We validated the TAm-Seq method on a larger panel of plasma samples in which levels of tumor-specific mutations were measured in parallel using patient-specific digital PCR assays. DNA extracted from 62 additional plasma samples collected at different time points from 37 patients with advanced HGSOC was amplified in duplicate (table S6), using DNA present in ~0.15 ml of plasma per reaction (range, 0.06 to 0.2 ml). Amplicon libraries were tagged and pooled together for sequencing with libraries prepared from 24 control samples. This generated an average sequencing depth of 650 for 62 plasma samples, sufficient to detect mutations present at AFs of 1% to 2%. Of >1.5 million possible substitutions, 42 mutations were called using the parameters previously optimized for FFPE analysis (table S6).
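
The link between read depth and the lowest detectable AF can be sketched with the simple minimum-read-count rule quoted earlier in the text (10 supporting reads at depth 5000 corresponds to AF of 0.2%). The code below is an illustration under that assumption only; the function names are hypothetical, and the authors' actual calling criteria also depend on locus- and substitution-specific background rates described in the Supplementary Methods.

```python
import math


def min_detectable_af(depth: int, min_reads: int = 10) -> float:
    """Lowest allele frequency that yields at least min_reads mutant reads on average."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return min_reads / depth


def required_depth(allele_fraction: float, min_reads: int = 10) -> int:
    """Read depth needed so that allele_fraction gives at least min_reads mutant reads."""
    if not 0.0 < allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be in (0, 1]")
    # The small epsilon guards against floating-point results such as 5000.000000001
    return math.ceil(min_reads / allele_fraction - 1e-9)


if __name__ == "__main__":
    print(f"Depth needed for AF 0.2%: {required_depth(0.002)}")
    print(f"Lowest AF at depth 650: {min_detectable_af(650):.2%}")
```

Under this rule a depth of 650 gives a floor of roughly 1.5%, which sits inside the 1% to 2% range stated for this run.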
`
`
Fig. 3. Noninvasive identification and quantification of cancer mutations in plasma DNA by TAm-Seq. (A) Sampling noise in sequencing of sparse DNA using dilutions of plasma DNA from healthy individuals. CV of triplicate AF readings was calculated for each of the five SNPs in each of the mixes, which had varying numbers of copies of the minor allele (n) (blue dots). Bin averages (red diamonds) are the mean CVs calculated for each bin (bin edges denoted by the dotted vertical lines). A linear fit to the log2 of the mean CV as a function of the log2 expected copy number was calculated (black line). Two data points, with (n = 100, CV = 0.0064) and (n = 32, CV = 0.0185), were omitted from the figure for enhanced scaling. Three data points with minor allele copies of <0.8 were omitted from the analysis (n = 0.51, CV = 0.62; n = 0.41, CV = 0.86; n = 0.20, CV = 0.99). (B) Expected versus observed frequency of rare alleles in a dilution series of circulating DNA. Mean observed frequency was calculated for each of five SNPs for samples, where expected initial number of minor allele copies was greater than 6. Expected frequencies were calculated on the basis of quantification by digital PCR. Dotted lines represent 20% deviation from the expected frequencies. Inset highlights cases with expected minor AF <0.025. (C) Mutations identified in 62 plasma samples from patients with advanced HGSOC using TAm-Seq. AFs are based on digital PCR measurement for confirmed mutations (identified or missed by TAm-Seq), and on TAm-Seq for the false positives called using parameters optimized for analysis of FFPE samples. The dashed horizontal line indicates AF of 2%. Mutations detected by digital PCR at AF <1% are not shown. (D) AFs measured by TAm-Seq versus digital PCR for mutations identified in plasma DNA.
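
The linear fit described for panel A can be compared directly with the Poisson expectation CV = 1/√n, which on log2 axes is a line of slope -0.5 and intercept 0. The sketch below assumes the fit coefficients printed in the Fig. 3A annotation of this copy, log2(CV) = -0.49 log2(n) - 0.056; the function names are hypothetical.

```python
import math

# Coefficients as printed in the Fig. 3A annotation (assumed transcribed correctly)
FIT_SLOPE = -0.49
FIT_INTERCEPT = -0.056


def fitted_cv(n: float) -> float:
    """CV predicted by the reported log2-log2 linear fit."""
    return 2 ** (FIT_SLOPE * math.log2(n) + FIT_INTERCEPT)


def poisson_cv(n: float) -> float:
    """Theoretical CV for n independently sampled molecules."""
    return 1 / math.sqrt(n)


if __name__ == "__main__":
    for copies in (1, 10, 100):
        print(f"n={copies:>3}: fit CV={fitted_cv(copies):.3f}, Poisson CV={poisson_cv(copies):.3f}")
```

A fitted slope close to -0.5 with an intercept close to 0 is what one would expect if sampling of template molecules, rather than the assay itself, dominates the measurement noise.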
`
Table 1. Mutations identified by TAm-Seq in plasma samples from seven ovarian cancer patients. TAm-Seq was used to sequence DNA extracted from plasma of subjects with HGSOC (stage III/IV at diagnosis). Plasma was collected when patients presented with relapse disease, before initiation of chemotherapy. For patient 46, DNA from a formalin-fixed, paraffin-embedded (FFPE) sample was not included in the TAm-Seq set and the mutation was validated in FFPE by Sanger sequencing. CA125 was measured at time of plasma collection. Mean depth of coverage at the mutation locus in the TAm-Seq data was averaged over the repeats (RMS deviation = 850). AF, allele frequency; N, no; Y, yes.

Patient  Age at     Months since surgery;  CA125   Plasma per     Gene  Mutation     Base    Protein  Detected  Mean depth  Mean AF    Mean AF
ID       diagnosis  previous chemo lines   (U/ml)  reaction (µl)        (hg19)       change  change   in FFPE   (reads)     (TAm-Seq)  (digital PCR)
8        60         13; 1                  2122    50             TP53  17:7577120   C>T     p.R273H  Y         5000        0.09       0.10
12       62         27; 3                  365     50             TP53  17:7577579   G>T     p.Y234*  Y         5000        0.10       0.08
14       58         50; 3                  260     120            TP53  17:7578212   G>A     p.R213*  Y         5800        0.15       0.12
25       61         9; 1                   944     110            TP53  17:7578404   A>T     p.C176S  Y         4800        0.04       0.08
27†      68         15; 1                  1051    90             TP53  17:7578262   C>G     p.R196P  Y         7700        0.06       0.14
                                                                  EGFR  7:55259437   G>A     p.R832H  N         5700        0.06       0.05
31       64         12; 1                  313     30             TP53  17:7578406   C>T     p.R175H  Y         4500        0.44       0.56
46       56         30; 2                  1509    30             TP53  17:7578406   C>T     p.R175H  Y         4200        0.23       0.30

*Indicates stop codon.
†Both a TP53 and an EGFR mutation were identified in this sample (Fig. 4A).
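
As a quick check on how closely the two assays agree in Table 1, the sketch below computes the mean absolute difference and the Pearson correlation between the TAm-Seq and digital PCR allele frequencies. The eight value pairs are transcribed from the table in row order; the function names are hypothetical, and this summary is an illustration rather than an analysis the authors report for this table.

```python
import math

# Mean AFs from Table 1, one entry per mutation (patient 27 contributes two rows)
tam_seq_af = [0.09, 0.10, 0.15, 0.04, 0.06, 0.06, 0.44, 0.23]
digital_pcr_af = [0.10, 0.08, 0.12, 0.08, 0.14, 0.05, 0.56, 0.30]


def mean_abs_difference(a, b):
    """Average absolute difference between paired measurements."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pearson_r(a, b):
    """Pearson correlation coefficient, computed from first principles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    print(f"Mean absolute difference: {mean_abs_difference(tam_seq_af, digital_pcr_af):.4f}")
    print(f"Pearson r: {pearson_r(tam_seq_af, digital_pcr_af):.3f}")
```

With only eight pairs, and two of them at much higher AF than the rest, the correlation is dominated by those high-AF points, so the absolute differences are the more informative number here.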
`
[Figure 3 panels. (A) CV versus number of copies of the minor allele (n); legend: individual data points, bin averages; fit annotation: log2(CV) = -0.49 log2(n) - 0.056. (B) Observed frequency of minor allele versus expected frequency of minor allele; SNPs: rs1625895, rs1800899, rs17337360, rs1050171, rs10241451. (C) Allele frequency versus mutations (sorted by allele frequency); legend: missed, false positive. (D) Allele frequency by TAm-Seq versus allele frequency by digital PCR.]

`
Table 2. Mutations identified by TAm-Seq in a set of 62 plasma samples from ovarian cancer patients. Forty mutations were identified by TAm-Seq using stringent parameters for mutation calling. Plasma samples described in this table are distinct from those in Table 1, but patients included overlap. Additional data on patients and mutations are provided in table S6.
`
`Sample
`number
`
`Plasma volume per
`amplification reaction (ml)
`
`DNA amount per
`amplification reaction (ng)
`
`Gene
`
`Protein
`change
`
`Mean depth
`(sequencing reads)
`
`Mean AF using
`TAm-Seq
`
`Mean AF using
`digital PCR
`
`
`0.167
`0.150
`0.410
`0.035
`0.013
`0.038
`0.081
`0.627
`0.604
`0.682
`0.581
`0.045
`0.120
`0.068
`0.135
`0.050
`0.432
`0.108
`0.226
`0.074
`0.125
`0.106
`0.286
`0.099
`0.061
`0.364
`0.253
`0.034
`0.122
`0.206
`0.201
`0.275
`0.362
`0.331
`0.323
`0.482
`0.445
`0.245
`0.121
`0.073
`
`0.260
`0.244
`0.507
`0.059
`0.021
`0.044
`0.091
`0.608
`0.526
`0.651
`0.526
`0.039
`0.046
`0.091
`0.088
`0.048
`0.113
`0.029
`0.201
`0.085
`0.081
`0.074
`0.269
`0.094
`0.048
`0.321
`0.548
`0.040
`0.137
`0.216
`0.151
`0.191
`0.287
`0.275
`0.315
`0.435
`0.452
`0.185
`0.143
`0.071
`
`0.9
`4.2
`5.7
`9.9
`1.4
`2.1
`17.9
`14.8
`10.7
`6.1
`4.9
`2.8
`2.5
`3.0
`3.7
`
`4.2
`4.4
`5.2
`3.6
`4.1
`3.7
`7.1
`3.9
`5.7
`3.6
`9.5
`3.6
`2.4
`13.2
`5.3
`5.8
`9.4
`10.1
`16.4
`19.7
`15.0
`8.5
`3.6
`5.2
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`EGFR
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`TP53
`
`p.R273C
`p.R248Q
`p.R248Q
`p.R213X
`p.C141Y
`p.C141Y
`p.I195N
`p.R175H
`p.R175H
`p.R175H
`p.R175H
`p.C135R
`p.C135R
`p.C135R
`p.R196P
`p.R832H
`p.C176S
`p.C176S
`p.R175H
`p.R175H
`p.R175H
`p.R175H
`p.R175H
`p.R273H
`p.R282W
`p.C141Y
`p.E258K
`p.C135Y
`p.E56X
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`p.K132N
`Splicing
`p.C238R
`
`640
`340
`640
`81