`
`SYSTEMS AND METHODS TO DETECT RARE MUTATIONS AND COPY
`
`NUMBER VARIATION
`
`WSGR Docket No. 42534-704103
`
Inventor(s): AmirAli Talasaz,
`a citizen of the United States,
`2181 Camino a Los Cerros
`
`Menlo Park, CA 94025
`
`Assignee:
`
`Guardant Health, Inc.
`
`Entity:
`
`Large
`
Wilson Sonsini Goodrich & Rosati
PROFESSIONAL CORPORATION
`
`650 Page Mill Road
`Palo Alto, CA 94304
`
`(650) 493-9300 (Main)
`(650) 493-6811 (Facsimile)
`
`Filed Electronically on: March 15, 2013
`
`PGDX EX. 1012
`
`Page 1 of 71
`
`SYSTEMS AND METHODS TO DETECT RARE MUTATIONS AND COPY NUMBER
`
`VARIATION
`
`BACKGROUND OF THE INVENTION
`
`[0001] The detection and quantification of polynucleotides is important for molecular biology and
`
`medical applications such as diagnostics. Genetic testing is particularly useful for a number of
`
`diagnostic methods. For example, disorders that are caused by mutations, copy number variation, or
`
`changes in epigenetic markers, such as cancer and partial or complete aneuploidy, may be detected or
`
`more accurately characterized with DNA sequence information.
`
[0002] Early detection and monitoring of genetic diseases, such as cancer, is often useful and needed
`
`in the successful treatment or management of the disease. One approach may include the monitoring
`
`of a sample derived from cell free nucleic acids, a population of polynucleotides that can be found in
`
`different types of bodily fluids. In some cases, disease may be characterized or detected based on
`
`detection of genetic aberrations, such as a change in copy number variation and/or mutation of one or
`
`more nucleic acid sequences, or the development of certain rare mutations. Cell free DNAs have
`
`been known in the art for decades, and may contain genetic aberrations associated with a particular
`
`disease. With improvements in sequencing and techniques to manipulate nucleic acids, there is a
`
`need in the art for improved methods and systems for using cell free DNA to detect and monitor
`
`disease.
`
`SUMMARY OF THE INVENTION
`
[0003] The disclosure provides for a method for detecting copy number variation comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides is optionally attached to unique barcodes; b) filtering out reads that fail to meet a set threshold; c) mapping sequence reads obtained from step (a) to a reference sequence; d) quantifying/counting mapped reads in two or more predefined regions of the reference sequence; and e) determining a copy number variation in one or more of the predefined regions by (i) normalizing the number of reads in the predefined regions to each other and/or the number of unique barcodes in the predefined regions to each other; and (ii) comparing the normalized numbers obtained in step (i) to normalized numbers obtained from a control sample.
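Steps (d) and (e) of the method above can be illustrated with a minimal sketch. It assumes per-region read counts are already available for the subject and for a control sample; the region labels, the function names and the use of a log2 ratio for the comparison are illustrative assumptions, not details given in the disclosure.

```python
import math

def normalize(counts):
    """Scale per-region read counts so they sum to 1 (regions normalized to each other)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads in any region")
    return {region: n / total for region, n in counts.items()}

def log2_ratios(sample_counts, control_counts):
    """Compare normalized sample counts with normalized control counts, region by region."""
    sample = normalize(sample_counts)
    control = normalize(control_counts)
    ratios = {}
    for region, frac in sample.items():
        ref = control[region]
        # Regions with no coverage in either sample cannot be compared.
        ratios[region] = math.log2(frac / ref) if frac > 0 and ref > 0 else None
    return ratios

# Hypothetical counts: the third region has twice the reads of the others.
sample = {"chr1:0-100kb": 1000, "chr1:100-200kb": 1000, "chr1:200-300kb": 2000}
control = {"chr1:0-100kb": 1000, "chr1:100-200kb": 1000, "chr1:200-300kb": 1000}
ratios = log2_ratios(sample, control)
```

A positive ratio suggests a relative gain and a negative ratio a relative loss. Because the regions are normalized to each other, a gain in one region lowers the apparent share of the others, which is one reason a control comparison and further modeling are needed.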
`
`5453271_1
`
[0004] The disclosure also provides for a method for detecting a rare mutation in a cell-free or substantially cell-free sample obtained from a subject comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides generates a plurality of sequencing reads; b) filtering out reads that fail to meet a set threshold; c) mapping sequence reads derived from the sequencing onto a reference sequence; d) identifying a subset of mapped sequence reads that align with a variant of the reference sequence at each mappable base position; e) for each mappable base position, calculating a ratio of (i) a number of mapped sequence reads that include a variant as compared to the reference sequence, to (ii) a number of total sequence reads for each mappable base position; f) normalizing the ratios or frequency of variance for each mappable base position and determining potential rare variant(s) or mutation(s); and g) comparing the resulting number for each of the regions with potential rare variant(s) or mutation(s) to similarly derived numbers from a reference sample.
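The per-position ratio and the comparison against a reference sample can be sketched as follows. This is a simplified illustration that assumes gap-free alignments supplied as (start, sequence) pairs; the function names and the `min_excess` cutoff are hypothetical and stand in for the normalization step, which the disclosure does not specify at this point.

```python
from collections import Counter

def variant_fractions(reference, aligned_reads):
    """Per-position fraction of reads whose base differs from the reference.

    aligned_reads: list of (start, sequence) pairs mapped to `reference` without gaps.
    """
    depth = Counter()
    variant = Counter()
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break
            depth[pos] += 1
            if base != reference[pos]:
                variant[pos] += 1
    return {pos: variant[pos] / depth[pos] for pos in sorted(depth)}

def candidates(sample_frac, control_frac, min_excess=0.01):
    """Positions whose variant fraction exceeds the control's by at least min_excess."""
    return [pos for pos, f in sample_frac.items()
            if f - control_frac.get(pos, 0.0) >= min_excess]

reference = "ACGTACGT"
sample_reads = [(0, "ACGTAC"), (2, "GTACGT"), (1, "CGAACG"), (0, "ACGT")]
control_reads = [(0, "ACGTACGT")]
sample_frac = variant_fractions(reference, sample_reads)
control_frac = variant_fractions(reference, control_reads)
hits = candidates(sample_frac, control_frac)
```

In this toy input one of the four reads covering position 3 carries an A instead of the reference T, so that position is the only one flagged.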
`
`[0005] Additionally, the disclosure also provides for a method of characterizing the heterogeneity of
`
`an abnormal condition in a subject, the method comprising generating a genetic profile of
`
`extracellular polynucleotides in the subject, wherein the genetic profile comprises a plurality of data
`
`resulting from copy number variation and rare mutation analyses.
`
`[0006]
`
`In some embodiments, the prevalence/concentration of each rare variant identified in the
`
subject is reported and quantified simultaneously. In other embodiments, a confidence score,
`
`regarding the prevalence/concentrations of rare variants in the subject, is reported.
`
`[0007]
`
`In some embodiments, extracellular polynucleotide comprises DNA. In other embodiments,
`
`extracellular polynucleotides comprise RNA. Polynucleotides may be fragments or fragmented after
`
`isolation. Additionally, the disclosure provides for a method for circulating nucleic acid isolation and
`
`extraction.
`
`[0008]
`
`In some embodiments, extracellular polynucleotides are isolated from a bodily sample which
`
`may be selected from a group consisting of blood, plasma, serum, urine, saliva, mucosal excretions,
`
`sputum, stool and tears.
`
`[0009]
`
`In some embodiments, the methods of the disclosure also comprise a step of determining the
`
`percent of sequences having copy number variation or rare mutation or variant in said bodily sample.
`
`[0010]
`
`In some embodiments, the percent of sequences having copy number variation in said bodily
`
`sample is determined by calculating the percentage of predefined regions with an amount of
`
`polynucleotides above or below a predetermined threshold.
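The percentage calculation described above can be written as a short sketch. The normalized values and the two thresholds below are hypothetical; the disclosure does not fix particular threshold values here.

```python
def percent_regions_altered(normalized, low, high):
    """Percentage of predefined regions whose normalized amount is below `low` or above `high`."""
    if not normalized:
        raise ValueError("no regions supplied")
    altered = sum(1 for value in normalized.values() if value < low or value > high)
    return 100.0 * altered / len(normalized)

# Hypothetical normalized amounts for five predefined regions (1.0 = expected amount).
regions = {"r1": 1.02, "r2": 0.97, "r3": 1.61, "r4": 0.48, "r5": 1.00}
pct = percent_regions_altered(regions, low=0.75, high=1.25)
```

Here two of the five regions fall outside the thresholds, so the function reports 40 percent.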
`
`[0011]
`
`In some embodiments, bodily fluids are drawn from a subject suspected of having an
`
abnormal condition which may be selected from the group consisting of mutations, rare mutations,
`
`indels, copy number variations, transversions, translocations, inversion, deletions, aneuploidy, partial
`
`aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions,
`
`chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions,
`
`DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in
`
epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
`
`[0012]
`
`In some embodiments, the subject may be a pregnant female in which the abnormal condition
`
may be a fetal abnormality selected from the group consisting of mutations, rare mutations, indels,
`
`copy number variations, transversions, translocations, inversion, deletions, aneuploidy, partial
`
`aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions,
`
`chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions,
`
`DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in
`
epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
`
`[0013]
`
In some embodiments, the method may comprise attaching one or more barcodes to the extracellular polynucleotides or fragments thereof prior to sequencing, in which the barcodes are unique. In other embodiments barcodes attached to extracellular polynucleotides or fragments thereof prior to sequencing are not unique.
`
`[0014]
`
In some embodiments, the methods of the disclosure may comprise selectively enriching regions from the subject’s genome or transcriptome prior to sequencing. In other embodiments the methods of the disclosure comprise non-selectively enriching regions from the subject’s genome or transcriptome prior to sequencing.
`
`[0015] Further, the methods of the disclosure comprise attaching one or more barcodes to the
`
`extracellular polynucleotides or fragments thereof prior to any amplification or enrichment step.
`
`[0016]
`
In some embodiments, the barcode is a polynucleotide, which may further comprise a random sequence or a fixed or semi-random set of oligonucleotides that, in combination with the diversity of molecules sequenced from a select region, enables identification of unique molecules, and may be at least 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 base pairs in length.
`
`[0017]
`
`In some embodiments, extracellular polynucleotides or fragments thereof may be amplified.
`
`In some embodiments amplification comprises global amplification or whole genome amplification.
`
`[0018]
`
`In some embodiments, sequence reads of unique identity may be detected based on sequence
`
`information at the beginning (start) and end (stop) regions of the sequence read and the length of the
`
`sequence read. In other embodiments sequence molecules of unique identity are detected based on
`
`sequence information at the beginning (start) and end (stop) regions of the sequence read, the length
`
`of the sequence read and attachment of a barcode.
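One way to picture this identification of unique molecules is to group reads by a key built from their mapped start, end and length, optionally extended with the barcode. The sketch below assumes reads are represented as dictionaries with hypothetical field names; it is an illustration, not the disclosed implementation.

```python
from collections import defaultdict

def group_by_molecule(reads, use_barcode=True):
    """Group reads that share mapped start, end, length and (optionally) barcode."""
    groups = defaultdict(list)
    for read in reads:
        # In this simplified model the length follows directly from start and end.
        length = read["end"] - read["start"]
        key = (read["chrom"], read["start"], read["end"], length)
        if use_barcode:
            key += (read["barcode"],)
        groups[key].append(read)
    return groups

reads = [
    {"chrom": "chr7", "start": 100, "end": 266, "barcode": "ACGT"},
    {"chrom": "chr7", "start": 100, "end": 266, "barcode": "ACGT"},  # likely duplicate
    {"chrom": "chr7", "start": 100, "end": 266, "barcode": "TTAG"},  # same span, other tag
    {"chrom": "chr7", "start": 140, "end": 300, "barcode": "ACGT"},
]
with_barcode = group_by_molecule(reads, use_barcode=True)
without_barcode = group_by_molecule(reads, use_barcode=False)
```

With the barcode included, the third read is kept apart from the first two even though it spans the same coordinates, which is why non-unique barcodes can still resolve molecules when combined with start and end information.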
`
`[0019]
`
`In some embodiments, amplification comprises selective amplification, non-selective
`
`amplification, suppression amplification or subtractive enrichment.
`
`[0020]
`
`In some embodiments, the methods of the disclosure comprise removing a subset of the reads
`
`from further analysis prior to quantifying or enumerating reads.
`
`[0021]
`
`In some embodiments, the method may comprise filtering out reads with an accuracy or
`
`quality score of less than a threshold, e.g., 90%, 99%, 99.9%, or 99.99% and/or mapping score less
`
`than a threshold, e.g., 90%, 99%, 99.9% or 99.99%. In other embodiments, methods of the
`
`disclosure comprise filtering reads with a quality score lower than a set threshold.
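The percentage thresholds above can be related to Phred-scaled scores, which are widely used for base and mapping qualities, through accuracy = 1 - 10^(-Q/10). The sketch below applies that standard relationship; filtering on the mean base quality of a read is an assumption made for illustration, and the function names are hypothetical.

```python
import math

def phred_for_accuracy(accuracy):
    """Phred score Q such that 1 - 10**(-Q/10) equals the given accuracy."""
    return -10.0 * math.log10(1.0 - accuracy)

def passes(read_quals, mapq, min_base_q, min_mapq):
    """Keep a read only if its mean base quality and its mapping quality meet both thresholds."""
    mean_q = sum(read_quals) / len(read_quals)
    return mean_q >= min_base_q and mapq >= min_mapq

# A 99% accuracy threshold corresponds to roughly Q20, and 99.9% to roughly Q30.
min_base_q = round(phred_for_accuracy(0.99))
min_mapq = round(phred_for_accuracy(0.999))

kept = passes([30, 30, 20], mapq=40, min_base_q=min_base_q, min_mapq=min_mapq)
dropped = passes([10, 12, 15], mapq=40, min_base_q=min_base_q, min_mapq=min_mapq)
```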
`
`[0022]
`
`In some embodiments, predefined regions are uniform or substantially uniform in size, about
`
10kb, 20kb, 30kb, 40kb, 50kb, 60kb, 70kb, 80kb, 90kb, or 100kb in size. In some embodiments, at
`
`least 50, 100, 200, 500, 1000, 2000, 5000, 10,000, 20,000, or 50,000 regions are analyzed.
`
`[0023]
`
`In some embodiments, a genetic variant, rare mutation or copy number variation occurs in a
`
region of the genome selected from the group consisting of gene fusions, gene duplications, gene
`
`deletions, gene translocations, microsatellite regions, gene fragments or combination thereof. In
`
`other embodiments a genetic variant, rare mutation or copy number variation occurs in a region of the
`
`genome selected from the group consisting of genes, oncogenes, tumor suppressor genes, promoters,
`
`regulatory sequence elements, or combination thereof. In some embodiments the variant is a
`
`nucleotide variant, single base substitution, or small indel, transversion, translocation, inversion,
`
`deletion, truncation or gene truncation about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length.
`
`[0024]
`
`In some embodiments, the method comprises correcting/normalizing/adjusting the quantity of
`
`mapped reads using the barcodes or unique properties of individual reads.
`
`[0025]
`
`In some embodiments, enumerating the reads is performed through enumeration of unique
`
`barcodes in each of the predefined regions and normalizing those numbers across at least a subset of
`
`predefined regions that were sequenced. In some embodiments, samples at succeeding time intervals
`
`from the same subject are analyzed and compared to previous sample results. The method of the
`
`disclosure may further comprise determining partial copy number variation frequency, loss of
`
`heterozygosity, gene expression analysis, epigenetic analysis and hypermethylation analysis after
`
`amplifying the barcode-attached extracellular polynucleotides.
`
`[0026]
`
`In some embodiments, copy number variation and rare mutation analysis is determined in a
`
`cell-free or substantially cell free sample obtained from a subject using multiplex sequencing,
`
`comprising performing over 10,000 sequencing reactions; simultaneously sequencing at least 10,000
`
`different reads; or performing data analysis on at least 10,000 different reads across the genome. The
`
`method may comprise multiplex sequencing comprising performing data analysis on at least 10,000
`
`different reads across the genome. The method may further comprise enumerating sequenced reads
`
`that are uniquely identifiable.
`
`[0027]
`
In some embodiments, normalizing and detection are performed using one or more of hidden Markov model, dynamic programming, support vector machine, Bayesian network, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering, or neural network methodologies.
`
`[0028]
`
`In some embodiments the methods of the disclosure comprise monitoring disease progression,
`
`monitoring residual disease, monitoring therapy, diagnosing a condition, prognosing a condition, or
`
`selecting a therapy based on discovered variants.
`
`[0029]
`
`In some embodiments, a therapy is modified based on the most recent sample analysis.
`
`Further, the methods of the disclosure comprise inferring the genetic profile of a tumor, infection or
`
`other tissue abnormality. In some embodiments growth, remission or evolution of a tumor, infection
`
or other tissue abnormality is monitored. In some embodiments the subject’s immune system is
`
`analyzed and monitored at single instances or over time.
`
`[0030]
`
`In some embodiments, the methods of the disclosure comprise identification of a variant that
`
is followed up through an imaging test (e.g., CT, PET-CT, MRI, X-ray, ultrasound) for localization
`
`of the tissue abnormality suspected of causing the identified variant.
`
`[0031]
`
`In some embodiments, the methods of the disclosure comprise use of genetic data obtained
`
from a tissue or tumor biopsy from the same patient. In some embodiments, the phylogenetics of a tumor, infection or other tissue abnormality is thereby inferred.
`
`[0032]
`
`In some embodiments, the methods of the disclosure comprise performing population-based
`
`no-calling and identification of low-confidence regions. In some embodiments, obtaining the
`
`measurement data for the sequence coverage comprises measuring sequence coverage depth at every
`
`position of the genome. In some embodiments correcting the measurement data for the sequence
`
`coverage bias comprises calculating window-averaged coverage. In some embodiments correcting
`
`the measurement data for the sequence coverage bias comprises performing adjustments to account
`
`for GC bias in the library construction and sequencing process. In some embodiments correcting the
`
`measurement data for the sequence coverage bias comprises performing adjustments based on
`
`additional weighting factor associated with individual mappings to compensate for bias.
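Window-averaged coverage and a GC-bias adjustment of the kind mentioned above can be sketched as follows. The GC correction shown, dividing each window by the median coverage of windows with similar GC content, is one common approach chosen for illustration; the disclosure does not state which adjustment is used, and all names and values are hypothetical.

```python
from statistics import median

def window_average(depth, window):
    """Mean per-base depth in consecutive non-overlapping windows."""
    return [sum(depth[i:i + window]) / len(depth[i:i + window])
            for i in range(0, len(depth), window)]

def gc_correct(coverage, gc_fraction, bins=10):
    """Divide each window's coverage by the median coverage of windows with similar GC content."""
    by_bin = {}
    for cov, gc in zip(coverage, gc_fraction):
        by_bin.setdefault(min(int(gc * bins), bins - 1), []).append(cov)
    medians = {b: median(values) for b, values in by_bin.items()}
    corrected = []
    for cov, gc in zip(coverage, gc_fraction):
        m = medians[min(int(gc * bins), bins - 1)]
        corrected.append(cov / m if m > 0 else 0.0)
    return corrected

depth = [10, 10, 12, 12, 30, 30, 28, 32]   # per-base depth at eight positions
coverage = window_average(depth, window=2)  # four windows of two bases each
gc = [0.35, 0.38, 0.62, 0.65]               # GC fraction of each window
corrected = gc_correct(coverage, gc)
```

After correction, the two GC-rich windows no longer appear to have three times the coverage of the GC-poor ones, since each is compared only with windows of similar composition.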
`
`[0033]
`
`In some embodiments, the methods of the disclosure comprise extracellular polynucleotide
`
`derived from a diseased cell origin. In some embodiments, the extracellular polynucleotide is derived
`
`from a healthy cell origin.
`
`[0034] The disclosure also provides for a system comprising a computer readable medium for
`
`performing the following steps: selecting predefined regions in a genome; enumerating number of
`
`sequence reads in the predefined regions; normalizing the number of sequence reads across the
`
`predefined regions; and determining percent of copy number variation in the predefined regions. In
`
`some embodiments, the entirety of the genome or at least 10%, 20%, 30%, 40%, 50%, 60%, 70%,
`
`80%, or 90% of the genome is analyzed. In some embodiments, computer readable medium provides
`
`data on percent cancer DNA or RNA in plasma or serum to the end user.
`
`[0035]
`
`In some embodiments, the amount of genetic variation, such as polymorphisms or causal
`
`variants is analyzed. In some embodiments, the presence or absence of genetic alterations is
`
`detected.
`
`[0036] This disclosure also provides for a method comprising: a. providing at least one set of tagged
`
`parent polynucleotides, and for each set of tagged parent polynucleotides; b. amplifying the tagged
`
`parent polynucleotides in the set to produce a corresponding set of amplified progeny
`
`polynucleotides; c. sequencing a subset (including a proper subset) of the set of amplified progeny
`
`polynucleotides, to produce a set of sequencing reads; and d. collapsing the set of sequencing reads to
`
`generate a set of consensus sequences, each consensus sequence corresponding to a unique
`
`polynucleotide among the set of tagged parent polynucleotides. In certain embodiments the method
`
`further comprises: e. analyzing the set of consensus sequences for each set of tagged parent
`
`molecules.
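The collapsing step (d) can be illustrated with a minimal sketch that groups sequencing reads by tag and takes a per-position majority vote within each group. Majority voting is an assumption made for illustration; the tags and sequences are invented, and ties are resolved in favor of the base seen first.

```python
from collections import Counter, defaultdict

def collapse(reads):
    """Collapse (tag, sequence) reads into one consensus sequence per tag by majority vote."""
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)
    consensus = {}
    for tag, seqs in families.items():
        # Only positions covered by every read in the family are voted on.
        length = min(len(s) for s in seqs)
        consensus[tag] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
        )
    return consensus

reads = [
    ("AAT", "ACGTAC"),
    ("AAT", "ACGTAC"),
    ("AAT", "ACCTAC"),  # one progeny read carries an error at position 2
    ("GGC", "ACGAAC"),
    ("GGC", "ACGAAC"),  # a true variant is present in every progeny read
]
consensus = collapse(reads)
```

The isolated error in the first family is voted out, while the variant shared by all reads of the second family survives, which is how collapsing can separate amplification or sequencing noise from variation present in the parent molecule.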
`
`[0037]
`
`In some embodiments each polynucleotide in a set is mappable to a reference sequence.
`
`[0038]
`
`In some embodiments the method comprises providing a plurality of sets of tagged parent
`
`polynucleotides, wherein each set is mappable to a different reference sequence.
`
`[0039]
`
In some embodiments the method further comprises converting initial starting genetic
`
`material into the tagged parent polynucleotides.
`
`[0040]
`
`In some embodiments the initial starting genetic material comprises no more than 100 ng of
`
`polynucleotides.
`
`[0041]
`
`In some embodiments the method comprises bottlenecking the initial starting genetic material
`
`prior to converting.
`
`[0042]
`
`In some embodiments the method comprises converting the initial starting genetic material
`
`into tagged parent polynucleotides with a conversion efficiency of at least 10%, at least 20%, at least
`
`30%, at least 40%, at least 50%, at least 60%, at least 80% or at least 90%.
`
`[0043]
`
`In some embodiments converting comprises any of blunt-end ligation, sticky end ligation,
`
`molecular inversion probes, PCR, ligation-based PCR, single strand ligation and single strand
`
`circularization.
`
`[0044]
`
`In some embodiments the initial starting genetic material is cell-free nucleic acid.
`
`[0045]
`
`In some embodiments a plurality of the reference sequences are from the same genome.
`
`[0046]
`
`In some embodiments each tagged parent polynucleotide in the set is uniquely tagged.
`
`[0047]
`
`In some embodiments the tags are non-unique.
`
`[0048]
`
In some embodiments the generation of consensus sequences is based on information from the
`
`tag and at least one of sequence information at the beginning (start) region of the sequence read, the
`
`end (stop) regions of the sequence read and the length of the sequence read.
`
`[0049]
`
`In some embodiments the method comprises sequencing a subset of the set of amplified
`
progeny polynucleotides sufficient to produce sequence reads for at least one progeny from each of
`
`at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least
`
90%, at least 95%, at least 98%, at least 99%, at least 99.9% or at least 99.99% of unique
`
`polynucleotides in the set of tagged parent polynucleotides.
`
`[0050]
`
In some embodiments the at least one progeny is a plurality of progeny, e.g., at least 2, at least
`
`5 or at least 10 progeny.
`
`[0051]
`
`In some embodiments the number of sequence reads in the set of sequence reads is greater
`
`than the number of unique tagged parent polynucleotides in the set of tagged parent polynucleotides.
`
`[0052]
`
`In some embodiments the subset of the set of amplified progeny polynucleotides sequenced is
`
`of sufficient size so that any nucleotide sequence represented in the set of tagged parent
`
`polynucleotides at a percentage that is the same as the percentage per-base sequencing error rate of
`
`the sequencing platform used, has at least a 50%, at least a 60%, at least a 70%, at least a 80%, at
`
least a 90%, at least a 95%, at least a 98%, at least a 99%, at least a 99.9% or at least a 99.99% chance
`
`of being represented among the set of consensus sequences.
`
`[0053]
`
`In some embodiments the method comprises enriching the set of amplified progeny
`
`polynucleotides for polynucleotides mapping to one or more selected reference sequences by: (i)
`
`selective amplification of sequences from initial starting genetic material converted to tagged parent
`
`polynucleotides; (ii) selective amplification of tagged parent polynucleotides; (iii) selective sequence
`
`capture of amplified progeny polynucleotides; or (iv) selective sequence capture of initial starting
`
`genetic material.
`
`[0054]
`
In some embodiments analyzing comprises normalizing a measure (e.g., number) taken from
`
`a set of consensus sequences against a measure taken from a set of consensus sequences from a
`
`control sample.
`
`[0055]
`
`In some embodiments analyzing comprises detecting mutations, rare mutations, indels, copy
`
`number variations, transversions, translocations, inversion, deletions, aneuploidy, partial aneuploidy,
`
`polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions, chromosome
`
`fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions,
`
`abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns,
`
abnormal changes in nucleic acid methylation, infection or cancer.
`
`[0056]
`
`In some embodiments the polynucleotides comprise DNA, RNA, a combination of the two or
`
`DNA plus RNA-derived cDNA.
`
`[0057]
`
`In some embodiments a certain subset of polynucleotides is selected for or is enriched based
`
`on polynucleotide length in base-pairs from the initial set of polynucleotides or from the amplified
`
`polynucleotides.
`
`[0058]
`
`In some embodiments analysis further comprises detection and monitoring of an abnormality
`
`or disease within an individual, such as, infection and/or cancer.
`
`[0059]
`
`In some embodiments the method is performed in combination with immune repertoire
`
`profiling.
`
`[0060]
`
In some embodiments the polynucleotides are extracted from a bodily sample selected from the group consisting of blood,
`
`plasma, serum, urine, saliva, mucosal excretions, sputum, stool, and tears.
`
`[0061]
`
In some embodiments collapsing comprises detecting and/or correcting errors, nicks or
`
`lesions present in the sense or anti-sense strand of the tagged parent polynucleotides or amplified
`
`progeny polynucleotides.
`
`[0062] This disclosure also provides for a method comprising detecting genetic variation in initial
`
`starting genetic material with a sensitivity of at least 5%, at least 1%, at least 0.5%, at least 0.1% or at
`
`least 0.05%. In some embodiments the initial starting genetic material is provided in an amount less
`
`than 100 ng of nucleic acid, the genetic variation is copy number/heterozygosity variation and
`
detecting is performed with sub-chromosomal resolution; e.g., at least 100 megabase resolution, at
`
`least 10 megabase resolution, at least 1 megabase resolution, at least 100 kilobase resolution, at least
`
`10 kilobase resolution or at least 1 kilobase resolution.
`
`[0063] This disclosure also provides for a system comprising a computer readable medium for
`
`performing the following steps: a. providing at least one set of tagged parent polynucleotides, and for
`
`each set of tagged parent polynucleotides; b. amplifying the tagged parent polynucleotides in the set
`
`to produce a corresponding set of amplified progeny polynucleotides; c. sequencing a subset
`
`(including a proper subset) of the set of amplified progeny polynucleotides, to produce a set of
`
`sequencing reads; and d. collapsing the set of sequencing reads to generate a set of consensus
`
`sequences, each consensus sequence corresponding to a unique polynucleotide among the set of
`
`tagged parent polynucleotides and, optionally, e. analyzing the set of consensus sequences for each
`
`set of tagged parent molecules.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0064] The novel features of the systems and methods of this disclosure are set forth with particularity
`
`in the appended claims. A better understanding of the features and advantages of this disclosure will
`
`be obtained by reference to the following detailed description that sets forth illustrative embodiments,
`
in which the principles of the systems and methods of this disclosure are utilized, and the
`
`accompanying drawings of which:
`
`[0065] Fig. 1 is a flow chart representation of a method of detection of copy number variation using a
`
`single sample.
`
`[0066] Fig. 2 is a flow chart representation of a method of detection of copy number variation using
`
`paired samples.
`
[0067] Fig. 3 is a flow chart representation of a method of rare mutation detection.
`
[0068] Fig. 4A is a graphical copy number variation detection report generated from a normal, non-cancerous subject.
`
`[0069] Fig. 4B is a graphical copy number variation detection report generated from a subject with
`
`prostate cancer.
`
[0070] Fig. 4C is a schematic representation of internet-enabled access of reports generated from copy
`
`number variation analysis of a subject with prostate cancer.
`
`[0071] Fig. 5A is a graphical copy number variation detection report generated from a subject with
`
`prostate cancer remission.
`
`[0072] Fig. 5B is a graphical copy number variation detection report generated from a subject with
`
prostate cancer recurrence.
`
[0073] Fig. 6A is a graphical rare mutation detection report generated from various mixing
`
`experiments using DNA samples containing both wildtype and mutant copies of MET and TP53.
`
[0074] Fig. 6B is a logarithmic graphical representation of rare mutation detection results. Observed vs. expected percent cancer measurements are shown for various mixing experiments using DNA samples containing both wildtype and mutant copies of MET, HRAS and TP53.
`
[0075] Fig. 7A is a graphical report of percentage of two rare mutations in two genes, MET and TP53,
`
`in a subject with prostate cancer as compared to a reference (control).
`
[0076] Fig. 7B is a schematic representation of internet-enabled access of reports generated from rare
`
`mutation analysis of a subject with prostate cancer.
`
`[0077] Fig. 8 is a flow chart representation of a method of analyzing genetic material.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`1. General Overview
`
`[0078] The present disclosure provides a system and method for the detection of rare mutations and
`
`copy number variations in cell free polynucleotides. Generally, the systems and methods comprise
`
`sample preparation, or the extraction and isolation of cell free polynucleotide sequences from a
`
`bodily fluid; subsequent sequencing of cell free polynucleotides by techniques known in the art; and
`
`application of bioinformatics tools to detect rare mutations and copy number variations as compared
`
`to a reference. The systems and methods also may contain a database or collection of different rare
`
`mutations or copy number variation profiles of different diseases, to be used as additional references
`
`in aiding detection of rare mutations, copy number variation profiling or general genetic profiling of
`
`a disease.
`
`[0079] The systems and methods may be particularly useful in the analysis of cell free DNAs. In
`
`some cases, cell free DNAs are extracted and isolated from a readily accessible bodily fluid such as
`
`blood. For example, cell free DNAs can be extracted using a variety of methods known in the art,
`
including but not limited to isopropanol precipitation and/or silica based purification. Cell free
`
`DNAs may be extracted from any number of subjects, such as subjects without cancer, subjects at
`
risk for cancer, or subjects known to have cancer (e.g., through other means).
`
`[0080] Following the isolation/extraction step, any of a number of different sequencing operations
`
`may be performed on the cell free polynucleotide sample. Samples may be processed before
`
sequencing with one or more reagents (e.g., enzymes, unique identifiers (e.g., barcodes), probes,
`
`etc.). In some cases if the sample is processed with a unique identifier such as a barcode, the samples
`
`or fragments of samples may be tagged individually or in subgroups with the unique identifier. The
`
`tagged sample may then be used in a downstream application such as a sequencing reaction by which
`
`individual molecules may be tracked to parent molecules.
`
`[0081] After sequencing data of cell free polynucleotide sequences is collected, one or more
`
`bioinformatics processes may be applied to the sequence data to detect genetic features or aberrations
`
`such as copy number variation, rare mutations or changes in epigenetic markers, including but not
`
`limited to methylation profiles. In some cases, in which copy number variation analysis is desired,
`
`sequence data may be: 1) aligned with a reference genome; 2) filtered and mapped; 3) partitioned
`
`into windows or bins of sequence; 4) coverage reads counted for each window; 5) coverage reads can
`
`then be normalized using a stochastic or statistical modeling algorithm; 6) and an output file can be
`
`generated reflecting discrete copy number states at various positions in the genome. In other cases, in
`
`which rare mutation analysis is desired, sequence data may be 1) aligned with a reference genome; 2)
`
`filtered and mapped; 3) frequency of variant bases calculated based on coverage reads for that
`
`specific base; 4) variant base frequency normalized using a stochastic, statistical or probabilistic
`
`modeling algorithm; 5) and an output file can be generated reflecting mutation states at various
`
`positions in the genome.
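The copy number steps listed above (partitioning into windows, counting coverage per window, and reporting discrete copy number states) can be sketched in simplified form. Rounding each window's count relative to the median window is used here only as a stand-in for the stochastic or statistical modeling algorithm named in the text; the positions, window size and function names are hypothetical.

```python
from collections import Counter
from statistics import median

def bin_counts(read_starts, bin_size, n_bins):
    """Count reads per fixed-size window, given 0-based mapped start positions."""
    counts = Counter(pos // bin_size for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_bins)]

def copy_states(counts, ploidy=2):
    """Round each window's count, relative to the median window, to an integer copy number."""
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median window is empty")
    return [round(ploidy * c / baseline) for c in counts]

# Hypothetical mapped start positions; the third window holds twice as many reads.
starts = [5, 20, 40, 110, 130, 160, 210, 220, 230, 240, 250, 260, 310, 330, 350]
counts = bin_counts(starts, bin_size=100, n_bins=4)
states = copy_states(counts)
```

A production pipeline would replace the rounding step with a model that accounts for noise between neighboring windows, but the input and output shapes would be similar: one count per window in, one discrete state per window out.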
`
[0082] A variety of different reactions and/or operations may occur within the systems and methods
`
`disclosed herein, including but not limited to: nucleic acid sequencing, nucleic acid quantification,
`
`sequencing optimization, detecting gene expression, quantifying gene expression, genomic profiling,
`
`cancer profiling, or analysis of expressed markers. Moreover, the systems and methods have
`
`numerous medical applications. For example, it may be used for the identification, detection,
`
`diagnosis, treatment, staging of, or risk prediction of various genetic and non-genetic diseases and
`
`disorders including cancer. It may be used to assess subject response to different treatments of said
`
`genetic and non-genetic diseases, or provide information regarding disease progression and
`
`prognosis.
`
[0083] The present disclosure further provides methods and systems for detecting with high
`
`sensitivity genetic variation in a sample of initial genetic material. The methods involve using one or
`
`both of the following tools: First, the efficient conversion of individual polynucleotides in a sample
`
`of initial genetic material into sequence-ready tagged parent polynucleotides, so as to increase the
`
`probability that individual polynucleotides in a sample of initial genetic material will be represented
`
`in a sequence-ready sample. This can produce sequence information about more polynucleotides in
`
`the initial sample. Second, high yield generation of consensus sequences for tagged parent
`
`polynucleotides by high rate sampling of progeny polynucleotides amplified from the tagged parent
`
`polynucleotides, and collapsing of generated sequence reads into consensus sequences representing
`
`sequences of parent tagged polynucleotides. This can reduce noise introduced by amplification bias
`
`and/or sequencing errors, and can increase sensitivity of detection.
`
`[0084] Sequencing methods typically involve sample pre