(12) Patent Application Publication
TALASAZ
(10) Pub. No.: US 2015/0368708 A1
(43) Pub. Date: Dec. 24, 2015

US 20150368708A1

(54) SYSTEMS AND METHODS TO DETECT RARE MUTATIONS AND COPY NUMBER VARIATION

(71) Applicant: GUARDANT HEALTH, INC., Redwood City, CA (US)

(72) Inventor: AmirAli TALASAZ, Menlo Park, CA (US)

(21) Appl. No.: 14/425,189

(22) PCT Filed: Sep. 4, 2013

(86) PCT No.: PCT/US13/58061
§ 371 (c)(1), (2) Date: Mar. 2, 2015

Related U.S. Application Data

(60) Provisional application No. 61/696,734, filed on Sep. 4, 2012, provisional application No. 61/704,400, filed on Sep. 21, 2012, provisional application No. 61/793,997, filed on Mar. 15, 2013, provisional application No. 61/845,987, filed on Jul. 13, 2013.

Publication Classification

(51) Int. Cl.
C12Q 1/68 (2006.01)
G06F 19/22 (2006.01)
(52) U.S. Cl.
CPC ...... C12Q 1/6874 (2013.01); C12Q 1/6806 (2013.01); G06F 19/22 (2013.01)
(57) ABSTRACT

The present disclosure provides a system and method for the detection of rare mutations and copy number variations in cell free polynucleotides. Generally, the systems and methods comprise sample preparation, or the extraction and isolation of cell free polynucleotide sequences from a bodily fluid; subsequent sequencing of cell free polynucleotides by techniques known in the art; and application of bioinformatics tools to detect rare mutations and copy number variations as compared to a reference. The systems and methods also may contain a database or collection of different rare mutations or copy number variation profiles of different diseases, to be used as additional references in aiding detection of rare mutations, copy number variation profiling or general genetic profiling of a disease.
`
`PGDX EX. 1015
`Page 1 of 51
`
`
`
`
Fig. 1 (Sheet 1 of 16): flowchart of method 100.
102 Extract and isolate cell free polynucleotides from bodily fluid
104 Obtain sequencing data covering cell free polynucleotides
106 Map sequence reads to a reference genome and determine number of reads for each mappable position in a plurality of chromosomal regions
108 Divide each of the chromosomal regions into windows or bins and determine number of reads per window
110 Normalize the sequence reads per window and correct for bias
112 Use a stochastic or statistical algorithm to convert the number of sequence reads per window into discrete copy number states
114 Generate report identifying genomic positions with copy number variations
`
`
`
Fig. 2 (Sheet 2 of 16): flowchart of method 200.
202 Extract and isolate cell free polynucleotides from bodily fluid for both a subject and a control subject
204 Obtain sequencing data covering cell free polynucleotides for both subject and control
206 Map sequence reads in subject to control and determine number of reads for each mappable position in a plurality of chromosomal regions
208 Divide each of the chromosomal regions into windows or bins and determine number of reads per window
210 Normalize the sequence reads per window and correct for bias
212 Use a stochastic or statistical algorithm to convert the number of sequence reads per window into discrete copy number states
214 Generate report identifying genomic positions with copy number variations in relationship to control
`
`
`
Fig. 3 (Sheet 3 of 16): flowchart of method 300.
302 Extract and isolate cell free polynucleotides from bodily fluid for both a subject and a control subject
304 Obtain sequencing data covering cell free polynucleotides for both subject and control or reference
306 Map sequence reads in subject to control and determine number of reads for each mappable position
308 Calculate the frequency of variant bases as the number of reads containing the variant divided by the total reads
310 Analyze all four nucleotides for each mappable position in cell free polynucleotide
312 Use a stochastic or statistical algorithm to convert frequency of variance per each base into discrete variant states for each base position
314 Generate report identifying base variants or rare mutations with largest deviation(s) for each base position with respect to reference or control
`
`
`
Fig. 4 (Sheet 4 of 16): [plot; legible panel labels: Normal; Prostate Cancer Patient (patient number not legible)]
`
`
`
`
`
Fig. 4 (Sheet 5 of 16), panel labeled C: [schematic; legible labels: Modem to connect to internet; Chip having array of microwells for sequencing reactions; Sequencing apparatus; Software; Handheld device to provide sequencing information to remote user; Computer system]
`
`
`
Fig. 5 (Sheet 6 of 16): [plots; y-axis: # of Copies; panel titles: Prostate Cancer Patient 2; Prostate Cancer Patient 3]
`
`
`
Fig. 6 (Sheet 7 of 16): [plots; percentage axes with ticks at 0.10%, 1.00%, 10.00% and 100.00%; legible labels: TP53 7578552; legend: TP53, HRAS, MET]
`
`
`
Fig. 7 (Sheet 8 of 16): [plots; percentage axis ticks at 1.5%, 3.5% and 5%; legible labels: PIK3CA 178952189; TP53 7578411. Panel labeled B: schematic; legible labels: Modem to connect to internet; Chip having array of microwells for sequencing reactions; Sequencing apparatus; User; Software; Handheld device to provide sequencing information to remote user; Computer system]
`
`
`
Fig. 8 (Sheet 9 of 16): flowchart of method 800.
802 Provide initial starting genetic material
804 Convert polynucleotides from initial starting genetic material into tagged parent polynucleotides
806 Amplify tagged parent polynucleotides to produce amplified progeny polynucleotides
808 Sequence a subset of amplified progeny polynucleotides to produce sequence reads
810 Collapse sequence reads into set of consensus sequences of unique tagged parent polynucleotides
812 Analyze set of consensus sequences
`
`
`
Fig. 9 (Sheet 10 of 16): flowchart of method 900.
902 Provide initial starting genetic material
904 Convert polynucleotides from initial starting genetic material into tagged parent polynucleotides
906 Amplify tagged parent polynucleotides to produce amplified progeny polynucleotides
908 Sequence a subset of amplified progeny polynucleotides to produce sequence reads
910 Group sequence reads into families, each family generated from a unique tagged parent polynucleotide
912 Produce representation of information in tagged parent polynucleotides and/or initial starting genetic material with reduced noise and/or distortion compared with sequence reads
`
`
`
Fig. 10 (Sheet 11 of 16): flowchart of method 1000.
1002 Provide initial starting genetic material
1004 Convert polynucleotides from initial starting genetic material into tagged parent polynucleotides
1006 Amplify tagged parent polynucleotides to produce amplified progeny polynucleotides
1008 Sequence a subset of amplified progeny polynucleotides to produce sequence reads
1010 Group sequence reads into families, each family generated from a unique tagged parent polynucleotide
1012 Determine quantitative measure of (e.g., count number of) families mapping to each of a plurality of reference loci; optionally quantify sequence reads in each family
1014 Infer quantities of unique tagged parent polynucleotides mapping to each locus based on quantity of families at each locus and quantity of sequence reads in each family
1016a Determine CNV based on quantity of families mapping to each reference locus
1016b Determine CNV based on quantity of inferred unique tagged parent polynucleotides mapping to each locus
`
`
`
Fig. 11 (Sheet 12 of 16): flowchart of method 1100.
1102 Provide initial starting genetic material
1104 Convert polynucleotides from initial starting genetic material into tagged parent polynucleotides
1106 Amplify tagged parent polynucleotides to produce amplified progeny polynucleotides
1108 Sequence a subset of amplified progeny polynucleotides to produce sequence reads
1110 Group sequence reads into families, each family generated from a unique tagged parent polynucleotide
1112 At a selected locus (nucleotide or sequence of nucleotides) assign, for each family, a confidence score for each of one or more bases or sequence of bases
1114 Infer the frequency of each of one or more bases or sequence of bases at the locus in the set of tagged parent polynucleotides based on the confidence scores among the families
`
`
`
Fig. 12 (Sheet 13 of 16): flowchart of method 1200.
1202 Provide at least one individual polynucleotide molecule
1204 Encode sequence information in the at least one individual polynucleotide molecule to produce a signal
1206 Pass at least part of the signal through a channel to produce a received signal comprising nucleotide sequence information about the at least one individual polynucleotide molecule, wherein the received signal comprises noise and/or distortion
1208 Decoding the received signal to produce a message comprising sequence information about the at least one individual polynucleotide molecule, wherein decoding reduces noise and/or distortion in the message
1210 Group sequence reads into families, each family generated from a unique tagged parent polynucleotide
1212 Provide the message to a recipient
`
`
`
Fig. 13 (Sheet 14 of 16): [plots; panel label Fig. 13B legible; axis labels and values not recoverable]
`
`
`
Fig. 14 (Sheet 15 of 16): [plot of measured percentage of mutations against percentage of spiked LNCaP; percentage axis ticks include 0.10% and 10.00%]
`
`
`
Fig. 15 (Sheet 16 of 16): [schematic with reference numerals 1501, 1510, 1515, 1520, 1525, 1530]
`
`
`
`
`SYSTEMS AND METHODS TO DETECT
`RARE MUTATIONS AND COPY NUMBER
`VARIATION
`
`CROSS-REFERENCE
`
[0001] This application claims priority to U.S. Provisional Patent Application No. 61/696,734, filed Sep. 4, 2012, U.S. Provisional Patent Application No. 61/704,400, filed Sep. 21, 2012, U.S. Provisional Patent Application No. 61/793,997, filed Mar. 15, 2013, and U.S. Provisional Patent Application No. 61/845,987, filed Jul. 13, 2013, each of which is entirely incorporated herein by reference for all purposes.
`
`BACKGROUND OF THE INVENTION
`
[0002] The detection and quantification of polynucleotides is important for molecular biology and medical applications such as diagnostics. Genetic testing is particularly useful for a number of diagnostic methods. For example, disorders that are caused by rare genetic alterations (e.g., sequence variants) or changes in epigenetic markers, such as cancer and partial or complete aneuploidy, may be detected or more accurately characterized with DNA sequence information.
[0003] Early detection and monitoring of genetic diseases, such as cancer, is often useful and needed in the successful treatment or management of the disease. One approach may include the monitoring of a sample derived from cell free nucleic acids, a population of polynucleotides that can be found in different types of bodily fluids. In some cases, disease may be characterized or detected based on detection of genetic aberrations, such as a change in copy number variation and/or sequence variation of one or more nucleic acid sequences, or the development of other certain rare genetic alterations. Cell free DNA (“cfDNA”) has been known in the art for decades, and may contain genetic aberrations associated with a particular disease. With improvements in sequencing and techniques to manipulate nucleic acids, there is a need in the art for improved methods and systems for using cell free DNA to detect and monitor disease.
`
SUMMARY OF THE INVENTION
`
[0004] The disclosure provides for a method for detecting copy number variation comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides is optionally attached to unique barcodes; b) filtering out reads that fail to meet a set threshold; c) mapping sequence reads obtained from step (a) to a reference sequence; d) quantifying/counting mapped reads in two or more predefined regions of the reference sequence; e) determining a copy number variation in one or more of the predefined regions by (i) normalizing the number of reads in the predefined regions to each other and/or the number of unique barcodes in the predefined regions to each other; and (ii) comparing the normalized numbers obtained in step (i) to normalized numbers obtained from a control sample.
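Steps (d) and (e) of the copy number method can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the median-based normalization, the gain and loss cut-offs of 1.5 and 0.5, and the function names `normalize_counts` and `call_cnv` are assumptions chosen for illustration, and the read counts are made-up values.

```python
from statistics import median


def normalize_counts(counts):
    """Scale per-region read counts by the sample's median count (assumed scheme)."""
    m = median(counts.values())
    if m == 0:
        raise ValueError("median read count is zero; cannot normalize")
    return {region: n / m for region, n in counts.items()}


def call_cnv(subject_counts, control_counts, gain=1.5, loss=0.5):
    """Compare normalized subject counts with normalized control counts.

    The gain/loss thresholds are illustrative assumptions. Control regions
    are assumed to have non-zero counts.
    """
    subj = normalize_counts(subject_counts)
    ctrl = normalize_counts(control_counts)
    calls = {}
    for region, value in subj.items():
        ratio = value / ctrl[region]
        if ratio >= gain:
            calls[region] = "gain"
        elif ratio <= loss:
            calls[region] = "loss"
        else:
            calls[region] = "neutral"
    return calls


# Hypothetical mapped-read counts for five predefined regions.
subject = {"r1": 100, "r2": 210, "r3": 95, "r4": 40, "r5": 105}
control = {"r1": 100, "r2": 100, "r3": 100, "r4": 100, "r5": 100}
calls = call_cnv(subject, control)
```

The same functions could take counts of unique barcodes per region instead of raw read counts, which is the alternative that step (e)(i) mentions.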
[0005] The disclosure also provides for a method for detecting a rare mutation in a cell-free or substantially cell free sample obtained from a subject comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides generates a plurality of sequencing reads; b) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides generates a plurality of sequencing reads; sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides generates a plurality of sequencing reads; c) filtering out reads that fail to meet a set threshold; d) mapping sequence reads derived from the sequencing onto a reference sequence; e) identifying a subset of mapped sequence reads that align with a variant of the reference sequence at each mappable base position; f) for each mappable base position, calculating a ratio of (a) a number of mapped sequence reads that include a variant as compared to the reference sequence, to (b) a number of total sequence reads for each mappable base position; g) normalizing the ratios or frequency of variance for each mappable base position and determining potential rare variant(s) or mutation(s); h) and comparing the resulting number for each of the regions with potential rare variant(s) or mutation(s) to similarly derived numbers from a reference sample.
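The ratio in step (f) and the comparison in step (h) can be sketched as follows. This is an illustrative example only: the pileup representation (a list of observed bases per position), the `min_excess` cut-off, and the function names are assumptions, not part of the disclosure.

```python
from collections import Counter


def variant_frequencies(pileup, reference):
    """For each position, return {alt_base: variant reads / total reads}.

    pileup: dict position -> list of bases observed in mapped reads (assumed format)
    reference: dict position -> reference base
    """
    freqs = {}
    for pos, bases in pileup.items():
        total = len(bases)
        if total == 0:
            continue  # position not covered; no ratio can be formed
        counts = Counter(bases)
        ref = reference[pos]
        freqs[pos] = {b: counts[b] / total
                      for b in "ACGT" if b != ref and counts[b] > 0}
    return freqs


def candidates(subject_freqs, control_freqs, min_excess=0.01):
    """Keep variants whose frequency exceeds the control's by min_excess (assumed rule)."""
    hits = []
    for pos, alts in subject_freqs.items():
        for base, f in alts.items():
            background = control_freqs.get(pos, {}).get(base, 0.0)
            if f - background >= min_excess:
                hits.append((pos, base, f))
    return sorted(hits)


# Hypothetical data: position 100 carries a G variant in 3 of 100 subject reads.
reference = {100: "A", 101: "C"}
subject_pileup = {100: ["A"] * 97 + ["G"] * 3, 101: ["C"] * 100}
control_pileup = {100: ["A"] * 100, 101: ["C"] * 99 + ["T"]}
hits = candidates(variant_frequencies(subject_pileup, reference),
                  variant_frequencies(control_pileup, reference))
```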
[0006] Additionally, the disclosure also provides for a method of characterizing the heterogeneity of an abnormal condition in a subject, the method comprising generating a genetic profile of extracellular polynucleotides in the subject, wherein the genetic profile comprises a plurality of data resulting from copy number variation and/or other rare mutation (e.g., genetic alteration) analyses.
[0007] In some embodiments, the prevalence/concentration of each rare variant identified in the subject is reported and quantified simultaneously. In other embodiments, a confidence score, regarding the prevalence/concentrations of rare variants in the subject, is reported.
[0008] In some embodiments, extracellular polynucleotides comprise DNA. In other embodiments, extracellular polynucleotides comprise RNA. Polynucleotides may be fragments or fragmented after isolation. Additionally, the disclosure provides for a method for circulating nucleic acid isolation and extraction.
[0009] In some embodiments, extracellular polynucleotides are isolated from a bodily sample that may be selected from a group consisting of blood, plasma, serum, urine, saliva, mucosal excretions, sputum, stool and tears.
[0010] In some embodiments, the methods of the disclosure also comprise a step of determining the percent of sequences having copy number variation or other rare genetic alteration (e.g., sequence variants) in said bodily sample.
[0011] In some embodiments, the percent of sequences having copy number variation in said bodily sample is determined by calculating the percentage of predefined regions with an amount of polynucleotides above or below a predetermined threshold.
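As a worked illustration of the percentage calculation, the sketch below counts the predefined regions whose normalized amount falls outside a pair of thresholds. The threshold values, the region names and the function name `percent_outside` are hypothetical.

```python
def percent_outside(amounts, low, high):
    """Return (percentage of regions below low or above high, flagged regions)."""
    flagged = [region for region, a in amounts.items() if a < low or a > high]
    return 100.0 * len(flagged) / len(amounts), flagged


# Hypothetical normalized amounts for five predefined regions.
amounts = {"r1": 1.0, "r2": 2.1, "r3": 0.95, "r4": 0.4, "r5": 1.05}
percent, flagged = percent_outside(amounts, low=0.5, high=1.5)
```

Here two of five regions fall outside the thresholds, so the reported value is 40 percent.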
`
[0012] In some embodiments, bodily fluids are drawn from a subject suspected of having an abnormal condition which may be selected from the group consisting of mutations, rare mutations, single nucleotide variants, indels, copy number variations, transversions, translocations, inversion, deletions, aneuploidy, partial aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
[0013] In some embodiments, the subject may be a pregnant female in which the abnormal condition may be a fetal abnormality selected from the group consisting of single nucleotide variants, indels, copy number variations, transversions, translocations, inversion, deletions, aneuploidy, partial aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
`
[0014] In some embodiments, the method may comprise attaching one or more barcodes to the extracellular polynucleotides or fragments thereof prior to sequencing, in which the barcodes are unique. In other embodiments barcodes attached to extracellular polynucleotides or fragments thereof prior to sequencing are not unique.
[0015] In some embodiments, the methods of the disclosure may comprise selectively enriching regions from the subject’s genome or transcriptome prior to sequencing. In other embodiments the methods of the disclosure comprise selectively enriching regions from the subject’s genome or transcriptome prior to sequencing. In other embodiments the methods of the disclosure comprise non-selectively enriching regions from the subject’s genome or transcriptome prior to sequencing.
[0016] Further, the methods of the disclosure comprise attaching one or more barcodes to the extracellular polynucleotides or fragments thereof prior to any amplification or enrichment step.
[0017] In some embodiments, the barcode is a polynucleotide, which may further comprise random sequence or a fixed or semi-random set of oligonucleotides that in combination with the diversity of molecules sequenced from a select region enables identification of unique molecules, and may be at least 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 base pairs in length.
[0018] In some embodiments, extracellular polynucleotides or fragments thereof may be amplified. In some embodiments amplification comprises global amplification or whole genome amplification.
[0019] In some embodiments, sequence reads of unique identity may be detected based on sequence information at the beginning (start) and end (stop) regions of the sequence read and the length of the sequence read. In other embodiments sequence molecules of unique identity are detected based on sequence information at the beginning (start) and end (stop) regions of the sequence read, the length of the sequence read and attachment of a barcode.
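One way to picture identifying reads of unique identity is to build a key from the start position, stop position and length of each mapped read, optionally extended with the barcode. The sketch below is an assumption-laden illustration: the read dictionary fields (`chrom`, `start`, `end`, `barcode`, `name`) and the function names are hypothetical, and mapped coordinates stand in for the start and stop sequence information described in the text.

```python
from collections import defaultdict


def molecule_key(read, use_barcode=True):
    """Build an identity key from start, stop and length, plus barcode if requested.

    With mapped coordinates the length is derivable from start and end; it is
    kept in the key only to mirror the three properties named in the text.
    """
    start, end = read["start"], read["end"]
    key = (read["chrom"], start, end, end - start)
    return key + (read["barcode"],) if use_barcode else key


def group_by_identity(reads, use_barcode=True):
    """Group read names that share the same identity key."""
    groups = defaultdict(list)
    for read in reads:
        groups[molecule_key(read, use_barcode)].append(read["name"])
    return dict(groups)


# Hypothetical reads: r1 and r2 are copies of one molecule; r3 shares the
# coordinates but carries a different barcode; r4 ends at a different position.
reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "end": 260, "barcode": "AAC"},
    {"name": "r2", "chrom": "chr1", "start": 100, "end": 260, "barcode": "AAC"},
    {"name": "r3", "chrom": "chr1", "start": 100, "end": 260, "barcode": "GTT"},
    {"name": "r4", "chrom": "chr1", "start": 100, "end": 255, "barcode": "AAC"},
]
with_barcode = group_by_identity(reads, use_barcode=True)
without_barcode = group_by_identity(reads, use_barcode=False)
```

In this toy data the barcode separates r3 from r1 and r2, which coordinates alone cannot do.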
[0020] In some embodiments, amplification comprises selective amplification, non-selective amplification, suppression amplification or subtractive enrichment.
[0021] In some embodiments, the methods of the disclosure comprise removing a subset of the reads from further analysis prior to quantifying or enumerating reads.
[0022] In some embodiments, the method may comprise filtering out reads with an accuracy or quality score of less than a threshold, e.g., 90%, 99%, 99.9%, or 99.99% and/or mapping score less than a threshold, e.g., 90%, 99%, 99.9% or 99.99%. In other embodiments, methods of the disclosure comprise filtering reads with a quality score lower than a set threshold.
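The thresholds above are stated as percentages, while sequencers and aligners commonly report Phred-scaled scores, where accuracy = 1 - 10^(-Q/10) (so Q20 corresponds to 99% and Q30 to 99.9%). The sketch below assumes Phred-scaled base qualities and mapping qualities and filters on the mean per-base accuracy; the read fields and function names are hypothetical, and using the mean is an illustrative choice rather than the disclosed rule.

```python
def phred_to_accuracy(q):
    """Convert a Phred-scaled score to an accuracy: 1 - 10^(-Q/10)."""
    return 1.0 - 10 ** (-q / 10)


def passes(read, min_accuracy=0.99, min_mapping_accuracy=0.99):
    """Keep a read only if mean base accuracy and mapping accuracy meet the thresholds."""
    quals = read["base_quals"]
    mean_acc = sum(phred_to_accuracy(q) for q in quals) / len(quals)
    return (mean_acc >= min_accuracy
            and phred_to_accuracy(read["mapq"]) >= min_mapping_accuracy)


# Hypothetical reads with Phred-scaled base qualities and mapping quality.
reads = [
    {"name": "a", "base_quals": [30, 30, 30], "mapq": 60},  # high quality, well mapped
    {"name": "b", "base_quals": [10, 10, 30], "mapq": 60},  # low base quality
    {"name": "c", "base_quals": [30, 30], "mapq": 10},      # poorly mapped
]
kept = [r["name"] for r in reads if passes(r)]
```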
`
[0023] In some embodiments, predefined regions are uniform or substantially uniform in size, about 10 kb, 20 kb, 30 kb, 40 kb, 50 kb, 60 kb, 70 kb, 80 kb, 90 kb, or 100 kb in size. In some embodiments, at least 50, 100, 200, 500, 1000, 2000, 5000, 10,000, 20,000, or 50,000 regions are analyzed.
`
[0024] In some embodiments, a genetic variant, rare mutation or copy number variation occurs in a region of the genome selected from the group consisting of gene fusions, gene duplications, gene deletions, gene translocations, microsatellite regions, gene fragments or combination thereof. In other embodiments a genetic variant, rare mutation, or copy number variation occurs in a region of the genome selected from the group consisting of genes, oncogenes, tumor suppressor genes, promoters, regulatory sequence elements, or combination thereof. In some embodiments the variant is a nucleotide variant, single base substitution, or small indel, transversion, translocation, inversion, deletion, truncation or gene truncation about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length.
[0025] In some embodiments, the method comprises correcting/normalizing/adjusting the quantity of mapped reads using the barcodes or unique properties of individual reads.
[0026] In some embodiments, enumerating the reads is performed through enumeration of unique barcodes in each of the predefined regions and normalizing those numbers across at least a subset of predefined regions that were sequenced. In some embodiments, samples at succeeding time intervals from the same subject are analyzed and compared to previous sample results. The method of the disclosure may further comprise determining partial copy number variation frequency, loss of heterozygosity, gene expression analysis, epigenetic analysis and hypermethylation analysis after amplifying the barcode-attached extracellular polynucleotides.
[0027] In some embodiments, copy number variation and rare mutation analysis is determined in a cell-free or substantially cell free sample obtained from a subject using multiplex sequencing, comprising performing over 10,000 sequencing reactions; simultaneously sequencing at least 10,000 different reads; or performing data analysis on at least 10,000 different reads across the genome. The method may comprise multiplex sequencing comprising performing data analysis on at least 10,000 different reads across the genome. The method may further comprise enumerating sequenced reads that are uniquely identifiable.
[0028] In some embodiments, the methods of the disclosure comprise normalizing and detection performed using one or more of hidden Markov, dynamic programming, support vector machine, Bayesian network, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering, or neural network methodologies.
[0029] In some embodiments the methods of the disclosure comprise monitoring disease progression, monitoring residual disease, monitoring therapy, diagnosing a condition, prognosing a condition, or selecting a therapy based on discovered variants.
`
[0030] In some embodiments, a therapy is modified based on the most recent sample analysis. Further, the methods of the disclosure comprise inferring the genetic profile of a tumor, infection or other tissue abnormality. In some embodiments growth, remission or evolution of a tumor, infection or other tissue abnormality is monitored. In some embodiments the subject’s immune system is analyzed and monitored at single instances or over time.
[0031] In some embodiments, the methods of the disclosure comprise identification of a variant that is followed up through an imaging test (e.g., CT, PET-CT, MRI, X-ray, ultrasound) for localization of the tissue abnormality suspected of causing the identified variant.
`
`
[0032] In some embodiments, the methods of the disclosure comprise use of genetic data obtained from a tissue or tumor biopsy from the same patient. In some embodiments, the phylogenetics of a tumor, infection or other tissue abnormality is inferred.
[0033] In some embodiments, the methods of the disclosure comprise performing population-based no-calling and identification of low-confidence regions. In some embodiments, obtaining the measurement data for the sequence coverage comprises measuring sequence coverage depth at every position of the genome. In some embodiments correcting the measurement data for the sequence coverage bias comprises calculating window-averaged coverage. In some embodiments correcting the measurement data for the sequence coverage bias comprises performing adjustments to account for GC bias in the library construction and sequencing process. In some embodiments correcting the measurement data for the sequence coverage bias comprises performing adjustments based on additional weighting factor associated with individual mappings to compensate for bias.
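Window-averaged coverage and a GC adjustment can be sketched as below. The disclosure does not specify a particular GC correction; the stratified-median rescaling used here is an assumed, commonly used approach, and the window size, GC bin width, function names and data are all illustrative.

```python
from collections import defaultdict
from statistics import median


def window_average(depth, window):
    """Mean depth over consecutive non-overlapping windows of per-position depth."""
    return [sum(depth[i:i + window]) / len(depth[i:i + window])
            for i in range(0, len(depth), window)]


def gc_correct(coverage, gc_percent, bin_width=10):
    """Rescale each window so every GC stratum has the same median coverage.

    coverage: window-averaged depths; gc_percent: integer GC content per window.
    Assumes every GC stratum has a non-zero median coverage.
    """
    overall = median(coverage)
    strata = defaultdict(list)
    for c, g in zip(coverage, gc_percent):
        strata[g // bin_width].append(c)
    stratum_median = {k: median(v) for k, v in strata.items()}
    return [c * overall / stratum_median[g // bin_width]
            for c, g in zip(coverage, gc_percent)]


# Hypothetical per-position depth for four windows of four positions each;
# the higher-GC windows are over-covered by roughly a factor of two.
depth = [10, 10, 12, 12, 20, 20, 22, 22, 10, 12, 10, 12, 20, 22, 20, 22]
windows = window_average(depth, window=4)
corrected = gc_correct(windows, gc_percent=[35, 55, 38, 52])
```

After correction the four windows have equal coverage, so the GC-driven difference would not be mistaken for a copy number change.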
[0034] In some embodiments, the methods of the disclosure comprise extracellular polynucleotide derived from a diseased cell origin. In some embodiments, the extracellular polynucleotide is derived from a healthy cell origin.
[0035] The disclosure also provides for a system comprising a computer readable medium for performing the following steps: selecting predefined regions in a genome; enumerating number of sequence reads in the predefined regions; normalizing the number of sequence reads across the predefined regions; and determining percent of copy number variation in the predefined regions. In some embodiments, the entirety of the genome or at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the genome is analyzed. In some embodiments, computer readable medium provides data on percent cancer DNA or RNA in plasma or serum to the end user.
[0036] In some embodiments, the amount of genetic variation, such as polymorphisms or causal variants is analyzed. In some embodiments, the presence or absence of genetic alterations is detected.
`
[0037] The disclosure also provides for a method for detecting a rare mutation in a cell-free or a substantially cell free sample obtained from a subject comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides generate a plurality of sequencing reads; b) filtering out reads that fail to meet a set threshold; c) mapping sequence reads derived from the sequencing onto a reference sequence; d) identifying a subset of mapped sequence reads that align with a variant of the reference sequence at each mappable base position; e) for each mappable base position, calculating a ratio of (a) a number of mapped sequence reads that include a variant as compared to the reference sequence, to (b) a number of total sequence reads for each mappable base position; f) normalizing the ratios or frequency of variance for each mappable base position and determining potential rare variant(s) or other genetic alteration(s); and g) comparing the resulting number for each of the regions
[0038] This disclosure also provides for a method comprising: a. providing at least one set of tagged parent polynucleotides, and for each set of tagged parent polynucleotides; b. amplifying the tagged parent polynucleotides in the set to produce a corresponding set of amplified progeny polynucleotides; c. sequencing a subset (including a proper subset) of the set of amplified progeny polynucleotides, to produce a set of sequencing reads; and d. collapsing the set of sequencing reads to generate a set of consensus sequences, each consensus sequence corresponding to a unique polynucleotide among the set of tagged parent polynucleotides. In certain embodiments the method further comprises: e. analyzing the set of consensus sequences for each set of tagged parent molecules.
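The collapsing step (d) can be illustrated with a short sketch that groups reads by tag and takes a per-position majority vote within each group. The majority-vote rule, the assumption that reads in a family are aligned and of equal length, and the function names `consensus` and `collapse` are illustrative choices, not the disclosed procedure.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Per-position majority vote over aligned, equal-length reads.

    Ties resolve to the base seen first at that position, which is an
    arbitrary choice made for this sketch.
    """
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def collapse(reads):
    """Group (tag, sequence) pairs by tag and reduce each family to one consensus."""
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)
    return {tag: consensus(seqs) for tag, seqs in families.items()}


# Hypothetical reads: family T1 contains one read with an error at the third base.
reads = [
    ("T1", "ACGT"), ("T1", "ACGT"), ("T1", "ACTT"),
    ("T2", "GGCA"), ("T2", "GGCA"),
]
consensus_by_tag = collapse(reads)
```

Because the erroneous base appears in only one of three reads in family T1, it is outvoted and does not reach the consensus sequence.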
[0039] In some embodiments each polynucleotide in a set is mappable to a reference sequence.
[0040] In some embodiments the method comprises providing a plurality of sets of tagged parent polynucleotides, wherein each set is mappable to a different reference sequence.
[0041] In some embodiments the method further comprises converting initial starting genetic material into the tagged parent polynucleotides.
[0042] In some embodiments the initial starting genetic material comprises no more than 100 ng of polynucleotides.
[0043] In some embodiments the method comprises bottlenecking the initial starting genetic material prior to converting.
[0044] In some embodiments the method comprises converting the initial starting genetic material into tagged parent polynucleotides with a conversion efficiency of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 80% or at least 90%.
[0045] In some embodiments converting comprises any of blunt-end ligation, sticky end ligation, molecular inversion probes, PCR, ligation-based PCR, single strand ligation and single strand circularization.
[0046] In some embodiments the initial starting genetic material is cell-free nucleic acid.
[0047] In some embodiments a plurality of the reference sequences are from the same genome.
[0048] In some embodiments each tagged parent polynucleotide in the set is uniquely tagged.
[0049] In some embodiments the tags are non-unique.
[0050] In some embodiments the generation of consensus sequences is based on information from the tag and/or at least one of sequence information at the beginning (start) region of the sequence read, the end (stop) regions of the sequence read and the length of the sequence read.
[0051] In some embodiments the method comprises sequencing a subset of the set of amplified progeny polynucleotides sufficient to produce sequence reads for at least one progeny from each of at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, at least 99.9% or at least 99.99% of unique polynucleotides in the set of tagged parent polynucleotides.
[0052] In some embodiments the at least one progeny is a plurality of progeny, e.g., at least 2, at least 5 or at least 10 progeny.
In some embodiments the