(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 29 March 2012 (29.03.2012)

PCT

(10) International Publication Number: WO 2012/038839 A2

(51) International Patent Classification: C12N 15/09 (2006.01)

(21) International Application Number: PCT/IB2011/003160

(22) International Filing Date: 20 September 2011 (20.09.2011)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
61/385,001, 21 September 2010 (21.09.2010), US
61/432,119, 12 January 2011 (12.01.2011), US

(71) Applicant (for all designated States except US): POPULATION GENETICS TECHNOLOGIES LTD. [GB/GB]; Babraham Research Campus, Cambridge CB22 3AT (GB).

(72) Inventors; and
(75) Inventors/Applicants (for US only): CASBON, James [GB/GB]; 56 High Street, Hinxton, Cambridgeshire CB10 1QY (GB). BRENNER, Sydney [GB/GB]; 3 Barton Square, Ely CB7 4PJ (GB). OSBORNE, Robert [GB/GB]; 38 Pilgrim Close, Great Chesterford, Saffron Walden CB10 1QG (GB). LICHTENSTEIN, Conrad [GB/GB]; 55 Warkworth Terrace, Cambridge, Cambridgeshire CB1 1EE (GB).

(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG,

[Continued on next page]
`(54) Title: INCREASING CONFIDENCE OF ALLELE CALLS WITH MOLECULAR COUNTING
`
[Drawing (Fig. 3): genomic DNA template and the products of PCR Cycle 0, Cycle 1, and Cycle 2]

(57) Abstract: Aspects of the present invention include methods and compositions for determining the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. In these aspects of the invention, a degenerate base region (DBR) is attached to the starting polynucleotide molecules that are subsequently sequenced (e.g., after certain process steps are performed, e.g., amplification and/or enrichment). The number of different DBR sequences present in a sequencing run can be used to determine/estimate the number of different starting polynucleotides that have been sequenced. DBRs can be used to enhance numerous different nucleic acid sequence analysis applications, including allowing higher confidence allele call determinations in genotyping applications.
`
`FOUNDATION EXHIBIT 1056
`IPR2019-00634
`
`
`
ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).
`
Published:
- without international search report and to be republished upon receipt of that report (Rule 48.2(g))
- with sequence listing part of description (Rule 5.2(a))
`
`
`
`
`WO 2012/038839
`
`PCT/IB2011/003160
`
`
`INCREASING CONFIDENCE OF ALLELE CALLS WITH
`MOLECULAR COUNTING
`
`BACKGROUND
`
Genotyping is an important technique in genetic research for mapping a genome and localizing genes that are linked to inherited characteristics (e.g., genetic diseases). Determining the genotype of a subject generally includes determining alleles for one or more genomic loci based on sequencing data obtained from the subject's DNA. Diploid genomes (e.g., human genomes) may be classified as, for example, homozygous or heterozygous at a genomic locus depending on the number of different alleles they possess for that locus, where heterozygous individuals have two different alleles for a locus and homozygous individuals have two copies of the same allele for the locus. The proper genotyping of samples is crucial when studies are done in the large populations needed to relate genotype to phenotype with high statistical confidence.
`
In genotyping analysis of diploid genomes by sequencing, the coverage (number of sequencing reads) for a particular genomic locus is used to establish the confidence of an allele call. However, confidence in allele calling is significantly reduced when bias is introduced during sample preparation, e.g., when the starting sample is in limiting amounts and/or when one or more amplification reactions are employed to prepare the sample for sequencing. Thus, in samples having limited amounts of DNA, one may see high coverage (i.e., a high number of sequencing reads) for an allele on one chromosome over the allele on a different chromosome due to amplification bias (e.g., amplification from only a few, or even one, polynucleotide molecule). In this case, coverage alone may be misleading when measuring confidence in an allele call.
`
The present invention finds use in increasing the confidence in allele calling as well as in other applications based on nucleic acid sequence analysis, especially in the context of studying genotypes in a large population of samples.
`
`SUMMARY OF THE INVENTION
`
Aspects of the present invention include methods and compositions for determining the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. In these aspects of the invention, a degenerate base region (DBR) is attached to the starting polynucleotide molecules that are subsequently sequenced (e.g., after certain process steps are performed, e.g., amplification and/or enrichment). The number of different DBR sequences present in a sequencing run can be used to determine/estimate the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. DBRs can be used to improve the analysis of many different nucleic acid sequencing applications. For example, DBRs enable the determination of a statistical value for an allele call in genotyping assays that cannot be derived from the read number alone.
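The contrast between read number and DBR number can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the specification: the `(dbr, allele)` tuple layout, the function names, the example reads, and in particular the binomial tail used as the "statistical value", which is only one plausible choice for a heterozygous model.

```python
from collections import defaultdict
from math import comb


def summarize_locus(reads):
    """Count reads and distinct DBRs per allele at one locus of one sample.

    `reads` is an iterable of (dbr, allele) tuples (hypothetical layout).
    """
    read_counts = defaultdict(int)
    dbrs = defaultdict(set)
    for dbr, allele in reads:
        read_counts[allele] += 1
        dbrs[allele].add(dbr)
    # Each distinct DBR is taken as evidence of one starting molecule.
    molecule_counts = {allele: len(tags) for allele, tags in dbrs.items()}
    return dict(read_counts), molecule_counts


def het_tail_probability(minor_count, total_count):
    """P(X <= minor_count) for X ~ Binomial(total_count, 0.5).

    Illustrative statistic only (an assumption of this sketch): how likely a
    true heterozygote would show this few molecules of the minor allele.
    """
    return sum(comb(total_count, i) for i in range(minor_count + 1)) / 2 ** total_count


# Hypothetical reads: allele "A" has 90 reads that all carry the same DBR,
# while allele "G" has only 10 reads spread over four different DBRs.
example_reads = (
    [("ACGT", "A")] * 90
    + [("TTGA", "G"), ("CCAT", "G"), ("GGTC", "G")] * 3
    + [("AGAG", "G")]
)
read_counts, molecule_counts = summarize_locus(example_reads)
```

In this made-up example the read counts favor allele "A" nine to one, yet the DBR counts suggest that all "A" reads trace back to a single starting molecule. A statistic computed on the five distinct molecules, rather than on the 100 reads, reflects how little independent evidence supports each allele.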
`
In certain embodiments, aspects of the subject invention are drawn to methods of determining the number of starting polynucleotide molecules sequenced from multiple different samples. In certain embodiments, the method includes: (1) attaching an adapter to starting polynucleotide molecules in multiple different samples, where the adapter for each sample includes: a unique MID specific for the sample; and a degenerate base region (DBR) (e.g., a DBR with at least one nucleotide base selected from: R, Y, S, W, K, M, B, D, H, V, N, and modified versions thereof); (2) pooling the multiple different adapter-attached samples to generate a pooled sample; (3) amplifying the adapter-attached polynucleotides in the pooled sample; (4) sequencing a plurality of the amplified adapter-attached polynucleotides, where the sequence of the MID, the DBR and at least a portion of the polynucleotide is obtained for each of the plurality of adapter-attached polynucleotides; and (5) determining the number of distinct DBR sequences present in the plurality of sequenced adapter-attached polynucleotides from each sample to determine or estimate the number of starting polynucleotides from each sample that were sequenced in the sequencing step.
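Steps (4) and (5) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the read layout (MID first, then DBR, then insert), the fixed lengths `mid_len` and `dbr_len`, the function names, and the example reads are all hypothetical. The degeneracy table uses the standard IUPAC nucleotide codes listed in step (1).

```python
from collections import defaultdict
from math import prod

# Number of bases each standard IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def dbr_diversity(dbr_template):
    """Number of distinct sequences a degenerate template such as 'NNNNRY' can yield."""
    return prod(IUPAC_DEGENERACY[base] for base in dbr_template.upper())


def count_starting_molecules(reads, mid_len, dbr_len):
    """Group reads by MID (sample) and count the distinct DBRs seen for each.

    Assumes every read begins with the MID followed directly by the DBR
    (a hypothetical layout chosen for this sketch).
    """
    dbrs_by_mid = defaultdict(set)
    for read in reads:
        mid = read[:mid_len]
        dbr = read[mid_len:mid_len + dbr_len]
        dbrs_by_mid[mid].add(dbr)
    return {mid: len(dbrs) for mid, dbrs in dbrs_by_mid.items()}


# Hypothetical pooled reads from two samples tagged with MIDs AAAA and CCCC.
pooled_reads = [
    "AAAA" + "ACGT" + "TTGACCA",
    "AAAA" + "ACGT" + "TTGACCA",  # PCR duplicate: same MID and same DBR
    "AAAA" + "GGCA" + "TTGACCA",
    "CCCC" + "TTAG" + "TTGACCA",
]
counts = count_starting_molecules(pooled_reads, mid_len=4, dbr_len=4)
```

The distinct-DBR count is best read as an estimate. Two starting molecules can receive the same DBR by chance, which lowers the count, and sequencing errors within the DBR can raise it. `dbr_diversity` gives a rough sense of how many starting molecules a given DBR design can distinguish.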
`
BRIEF DESCRIPTION OF THE DRAWINGS

The invention is best understood from the following detailed description when read in conjunction with the accompanying drawings. Included in the drawings are the following figures:
`
Figure 1 shows the allele ratio for each MID in samples prepared from the indicated amount of starting material (top of each panel; in nanograms).

Figure 2 shows the fraction of DBR sequences for each MID associated with each allele at a synthetic polymorphic position. Samples were prepared from the indicated amount of starting material (top of each panel; in nanograms).

Figure 3 shows the products produced in the first two cycles of PCR using primers having DBR sequences.
`
`DEFINITIONS
`
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Still, certain elements are defined for the sake of clarity and ease of reference.
`
Terms and symbols of nucleic acid chemistry, biochemistry, genetics, and molecular biology used herein follow those of standard treatises and texts in the field, e.g. Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); and the like.
`
"Amplicon" means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from one or more starting sequences. The one or more starting sequences may be one or more copies of the same sequence, or they may be a mixture of different sequences. Amplicons may be produced by a variety of amplification reactions whose products are multiple replicates of one or more target nucleic acids. Generally, amplification reactions producing amplicons are "template-driven" in that base pairing of reactants, either nucleotides or oligonucleotides, with complements in a template polynucleotide is required for the creation of reaction products. In one aspect, template-driven reactions are primer extensions with a nucleic acid polymerase or oligonucleotide ligations with a nucleic acid ligase. Such reactions include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplifications (NASBAs), rolling circle amplifications, and the like, disclosed in the following references that are incorporated herein by reference: Mullis et al, U.S. patents 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al, U.S. patent 5,210,015 (real-time PCR with "TAQMAN™" probes); Wittwer et al, U.S. patent 6,174,670; Kacian et al, U.S. patent 5,399,491 ("NASBA"); Lizardi, U.S. patent 5,854,033; Aono et al, Japanese patent publ. JP 4-262799 (rolling circle amplification); and the like. In one aspect, amplicons of the invention are produced by PCRs. An amplification reaction may be a "real-time" amplification if a detection chemistry is available that permits a reaction product to be measured as the amplification reaction progresses, e.g. "real-time PCR" described below, or "real-time NASBA" as described in Leone et al, Nucleic Acids Research, 26: 2150-2155 (1998), and like references. As used herein, the term "amplifying" means performing an amplification reaction. A "reaction mixture" means a solution containing all the necessary reactants for performing a reaction, which may include, but not be limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.
`
The term "assessing" includes any form of measurement, and includes determining if an element is present or not. The terms "determining", "measuring", "evaluating", "estimating", "assessing" and "assaying" are used interchangeably and include quantitative and qualitative determinations. Assessing may be relative or absolute. "Assessing the presence of" includes determining the amount of something present, and/or determining whether it is present or absent.
`
Polynucleotides that are "asymmetrically tagged" have left and right adapter domains that are not identical. This process is referred to generically as attaching adapters asymmetrically or asymmetrically tagging a polynucleotide, e.g., a polynucleotide fragment. Production of polynucleotides having asymmetric adapter termini may be achieved in any convenient manner. Exemplary asymmetric adapters are described in: U.S. Patents 5,712,126 and 6,372,434; U.S. Patent Publications 2007/0128624 and 2007/0172839; and PCT publication WO/2009/032167; all of which are incorporated by reference herein in their entirety. In certain embodiments, the asymmetric adapters employed are those described in U.S. Patent Application Ser. No. 12/432,080, filed on April 29, 2009, incorporated herein by reference in its entirety.
`
As one example, a user of the subject invention may use an asymmetric adapter to tag polynucleotides. An "asymmetric adapter" is one that, when ligated to both ends of a double stranded nucleic acid fragment, will lead to the production of primer extension or amplification products that have non-identical sequences flanking the genomic insert of interest. The ligation is usually followed by subsequent processing steps so as to generate the non-identical terminal adapter sequences. For example, replication of an asymmetric adapter attached fragment(s) results in polynucleotide products in which there is at least one nucleic acid sequence difference, or nucleotide/nucleoside modification, between the terminal adapter sequences. Attaching adapters asymmetrically to polynucleotides (e.g., polynucleotide fragments) results in polynucleotides that have one or more adapter sequences on one end (e.g., one or more region or domain, e.g., a primer binding site) that are either not present or have a different nucleic acid sequence as compared to the adapter sequence on the other end. It is noted that an adapter that is termed an "asymmetric adapter" is not necessarily itself structurally asymmetric, nor does the mere act of attaching an asymmetric adapter to a polynucleotide fragment render it immediately asymmetric. Rather, an asymmetric adapter-attached polynucleotide, which has an identical asymmetric adapter at each end, produces replication products (or isolated single stranded polynucleotides) that are asymmetric with respect to the adapter sequences on opposite ends (e.g., after at least one round of amplification/primer extension).
`
Any convenient asymmetric adapter, or process for attaching adapters asymmetrically, may be employed in practicing the present invention. Exemplary asymmetric adapters are described in: U.S. Patents 5,712,126 and 6,372,434; U.S. Patent Publications 2007/0128624 and 2007/0172839; and PCT publication WO/2009/032167; all of which are incorporated by reference herein in their entirety. In certain embodiments, the asymmetric adapters employed are those described in U.S. Patent Application Ser. No. 12/432,080, filed on April 29, 2009, incorporated herein by reference in its entirety.
`
"Complementary" or "substantially complementary" refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. See, M. Kanehisa, Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
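The percentage thresholds above can be made concrete with a short Python sketch. It is a simplification under stated assumptions: both strands are DNA written 5' to 3', they have equal length, and they are compared in a single ungapped antiparallel alignment, whereas the definition above allows optimal alignment with insertions or deletions. The function name and the example sequences are hypothetical.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def percent_complementarity(strand_a, strand_b):
    """Percent of positions that pair when two 5'->3' strands are aligned antiparallel.

    Ungapped and equal-length only (an assumption of this sketch).
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    # Reverse strand_b so that position i of strand_a faces its antiparallel partner.
    paired = sum(
        1 for x, y in zip(strand_a.upper(), reversed(strand_b.upper()))
        if COMPLEMENT.get(x) == y
    )
    return 100.0 * paired / len(strand_a)


perfect = percent_complementarity("ACGTACGTAC", "GTACGTACGT")
one_mismatch = percent_complementarity("ACGTACGTAC", "GTACGTACGA")
```

Under the definition above, the second pair (one mismatch in ten positions, i.e. 90%) would still fall within the "at least about 80%" range for substantial complementarity.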
`
"Duplex" means that at least two oligonucleotides and/or polynucleotides that are fully or partially complementary undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed. The terms "annealing" and "hybridization" are used interchangeably to mean the formation of a stable duplex. "Perfectly matched" in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand. A stable duplex can include Watson-Crick base pairing and/or non-Watson-Crick base pairing between the strands of the duplex (where base pairing means the forming of hydrogen bonds). In certain embodiments, a non-Watson-Crick base pair includes a nucleoside analog, such as deoxyinosine, 2,6-diaminopurine, PNAs, LNAs and the like. In certain embodiments, a non-Watson-Crick base pair includes a "wobble base", such as deoxyinosine, 8-oxo-dA, 8-oxo-dG and the like, where by "wobble base" is meant a nucleic acid base that can base pair with a first nucleotide base in a complementary nucleic acid strand but that, when employed as a template strand for nucleic acid synthesis, leads to the incorporation of a second, different nucleotide base into the synthesizing strand (wobble bases are described in further detail below). A "mismatch" in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding.
`
"Genetic locus," "locus," or "locus of interest" in reference to a genome or target polynucleotide, means a contiguous sub-region or segment of the genome or target polynucleotide. As used herein, genetic locus, locus, or locus of interest may refer to the position of a nucleotide, a gene or a portion of a gene in a genome, including mitochondrial DNA or other non-chromosomal DNA (e.g., bacterial plasmid), or it may refer to any contiguous portion of genomic sequence whether or not it is within, or associated with, a gene. A genetic locus, locus, or locus of interest can be from a single nucleotide to a segment of a few hundred or a few thousand nucleotides in length or more. In general, a locus of interest will have a reference sequence associated with it (see description of "reference sequence" below).
`
"Kit" refers to any delivery system for delivering materials or reagents for carrying out a method of the invention. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., probes, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. Such contents may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains probes.
`
"Ligation" means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g. oligonucleotides and/or polynucleotides. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5' carbon of a terminal nucleotide of one oligonucleotide and a 3' carbon of another oligonucleotide. A variety of template-driven ligation reactions are described in the following references, which are incorporated by reference: Whiteley et al, U.S. patent 4,883,750; Letsinger et al, U.S. patent 5,476,930; Fung et al, U.S. patent 5,593,826; Kool, U.S. patent 5,426,180; Landegren et al, U.S. patent 5,871,921; Xu and Kool, Nucleic Acids Research, 27: 875-881 (1999); Higgins et al, Methods in Enzymology, 68: 50-71 (1979); Engler et al, The Enzymes, 15: 3-29 (1982); and Namsaraev, U.S. patent publication 2004/0110213.
`
"Multiplex Identifier" (MID) as used herein refers to a tag or combination of tags associated with a polynucleotide whose identity (e.g., the tag DNA sequence) can be used to differentiate polynucleotides in a sample. In certain embodiments, the MID on a polynucleotide is used to identify the source from which the polynucleotide is derived. For example, a nucleic acid sample may be a pool of polynucleotides derived from different sources (e.g., polynucleotides derived from different individuals, different tissues or cells, or polynucleotides isolated at different time points), where the polynucleotides from each different source are tagged with a unique MID. As such, a MID provides a correlation between a polynucleotide and its source. In certain embodiments, MIDs are employed to uniquely tag each individual polynucleotide in a sample. Identification of the number of unique MIDs in a sample can provide a readout of how many individual polynucleotides are present in the sample (or from how many original polynucleotides a manipulated polynucleotide sample was derived; see, e.g., U.S. Patent No. 7,537,897, issued on May 26, 2009, incorporated herein by reference in its entirety). MIDs are typically comprised of nucleotide bases and can range in length from 2 to 100 nucleotide bases or more and may include multiple subunits, where each different MID has a distinct identity and/or order of subunits. Exemplary nucleic acid tags that find use as MIDs are described in U.S. Patent 7,544,473, issued on June 6, 2009, and titled "Nucleic Acid Analysis Using Sequence Tokens", as well as U.S. Patent 7,393,665, issued on July 1, 2008, and titled "Methods and Compositions for Tagging and Identifying Polynucleotides", both of which are incorporated herein by reference in their entirety for their description of nucleic acid tags and their use in identifying polynucleotides. In certain embodiments, a set of MIDs employed to tag a plurality of samples need not have any particular common property (e.g., Tm, length, base composition, etc.), as the methods described herein can accommodate a wide variety of unique MID sets. It is emphasized here that MIDs need only be unique within a given experiment. Thus, the same MID may be used to tag a different sample being processed in a different experiment. In addition, in certain experiments, a user may use the same MID to tag a subset of different samples within the same experiment. For example, all samples derived from individuals having a specific phenotype may be tagged with the same MID, e.g., all samples derived from control (or wildtype) subjects can be tagged with a first MID while subjects having a disease condition can be tagged with a second MID (different than the first MID). As another example, it may be desirable to tag different samples derived from the same source with different MIDs (e.g., samples derived over time or derived from different sites within a tissue). Further, MIDs can be generated in a variety of different ways, e.g., by a combinatorial tagging approach in which one MID is attached by ligation and a second MID is attached by primer extension. Thus, MIDs can be designed and implemented in a variety of different ways to track polynucleotide fragments during processing and analysis, and thus no limitation in this regard is intended.
`
"Next-generation sequencing" (NGS) as used herein refers to sequencing technologies that have the capacity to sequence polynucleotides at speeds that were unprecedented using conventional sequencing methods (e.g., standard Sanger or Maxam-Gilbert sequencing methods). These unprecedented speeds are achieved by performing and reading out thousands to millions of sequencing reactions in parallel. NGS sequencing platforms include, but are not limited to, the following: Massively Parallel Signature Sequencing (Lynx Therapeutics); 454 pyro-sequencing (454 Life Sciences/Roche Diagnostics); solid-phase, reversible dye-terminator sequencing (Solexa/Illumina); SOLiD technology (Applied Biosystems); Ion semiconductor sequencing (Ion Torrent); and DNA nanoball sequencing (Complete Genomics). Descriptions of certain NGS platforms can be found in the following: Shendure, et al., "Next-generation DNA sequencing," Nature, 2008, vol. 26, No. 10, 1135-1145; Mardis, "The impact of next-generation sequencing technology on genetics," Trends in Genetics, 2007, vol. 24, No. 3, pp. 133-141; Su, et al., "Next-generation sequencing and its applications in molecular diagnostics" Expert Rev Mol Diagn, 2011, 11(3):333-43; and Zhang et al., "The impact of next-generation sequencing on genomics", J Genet Genomics, 2011, 38(3):95-109.
`
"Nucleoside" as used herein includes the natural nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992). "Analogs" in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman, Chemical Reviews, 90: 543-584 (1990), or the like, with the proviso that they are capable of specific hybridization. Such analogs include synthetic nucleosides designed to enhance binding properties, reduce complexity, increase specificity, and the like. Polynucleotides comprising analogs with enhanced hybridization or nuclease resistance properties are described in Uhlman and Peyman (cited above); Crooke et al, Exp. Opin. Ther. Patents, 6: 855-870 (1996); Mesmaeker et al, Current Opinion in Structural Biology, 5: 343-355 (1995); and the like. Exemplary types of polynucleotides that are capable of enhancing duplex stability include oligonucleotide N3'→P5' phosphoramidates (referred to herein as "amidates"), peptide nucleic acids (referred to herein as "PNAs"), oligo-2'-O-alkylribonucleotides, polynucleotides containing C-5 propynylpyrimidines, locked nucleic acids ("LNAs"), and like compounds. Such oligonucleotides are either available commercially or may be synthesized using methods described in the literature.
`
"Polymerase chain reaction," or "PCR," means a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. In other words, PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates. Usually, the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument. Particular temperatures, durations at each step, and rates of change between steps depend on many factors well-known to those of ordinary skill in the art, e.g. exemplified by the references: McPherson et al, editors, PCR: A Practical Approach and PCR2: A Practical Approach (IRL Press, Oxford, 1991 and 1995, respectively). For example, in a conventional PCR using Taq DNA polymerase, a double stranded target nucleic acid may be denatured at a temperature >90°C, primers annealed at a temperature in the range 50-75°C, and primers extended at a temperature in the range 72-78°C. The term "PCR" encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like. Reaction volumes range from a few nanoliters, e.g. 2 nL, to a few hundred µL, e.g. 200 µL. "Reverse transcription PCR," or "RT-PCR," means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, e.g. Tecott et al, U.S. patent 5,168,038, which patent is incorporated herein by reference. "Real-time PCR" means a PCR for which the amount of reaction product, i.e. amplicon, is monitored as the reaction proceeds. There are many forms of real-time PCR that differ mainly in the detection chemistries used for monitoring the reaction product, e.g. Gelfand et al, U.S. patent 5,210,015 ("TAQMAN™"); Wittwer et al, U.S. patents 6,174,670 and 6,569,627 (intercalating dyes); Tyagi et al, U.S. patent 5,925,517 (molecular beacons); which patents are incorporated herein by reference. Detection chemistries for real-time PCR are reviewed in Mackay et al, Nucleic Acids Research, 30: 1292-1305 (2002), which is also incorporated herein by reference. "Nested PCR" means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon. As used herein, "initial primers" in reference to a nested amplification reaction mean the primers used to generate a first amplicon, and "secondary primers" mean the one or more primers used to generate a second, or nested, amplicon. "Multiplexed PCR" means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are simultaneously amplified in the same reaction mixture, e.g. Bernard et al, Anal. Biochem., 273: 221-228 (1999) (two-color real-time PCR). Usually, distinct sets of primers are employed for each sequence being amplified.
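As a rough illustration of what repeating steps (i) to (iii) does to copy number, the following Python sketch applies the textbook idealization in which each cycle multiplies the number of target copies by (1 + efficiency). The function name, the `efficiency` parameter, and the example values are assumptions made for this sketch and do not come from the specification; real reactions plateau and rarely reach perfect doubling.

```python
def expected_copies(starting_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: starting_copies * (1 + efficiency) ** cycles.

    efficiency = 1.0 means every template is copied in every cycle (perfect
    doubling); lower values model incomplete extension. Hypothetical model.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return starting_copies * (1.0 + efficiency) ** cycles


# One starting molecule after ten perfect cycles.
single_molecule_yield = expected_copies(1, 10)
# Ten starting molecules after two cycles at 90% efficiency.
partial_yield = expected_copies(10, 2, efficiency=0.9)
```

Because a single starting molecule can give rise to roughly a thousand copies in ten cycles under this model, a large number of sequencing reads does not by itself indicate a large number of starting molecules.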
`
`"Polynucleotide" or "oligonucleotide" is used interchangeably and each means a
`
`linear polymer of nucleotide monomers. Monomers making up polynucleotides and
`
`oligonucleotides are capable of specifically binding to a natural polynucleotide by