`(12) Patent Application Publication (10) Pub. No.: US 2007/0224613 A1
`
`
` Strathmann (43) Pub. Date: Sep. 27, 2007
`
`US 20070224613A1
`
`(54) MASSIVELY MULTIPLEXED SEQUENCING
`
(76) Inventor: Michael Paul Strathmann, Seattle, WA (US)

Correspondence Address:
MICHAEL STRATHMANN
1205 8TH AVE W
SEATTLE, WA 98119 (US)

(21) Appl. No.: 11/676,302

(22) Filed: Feb. 18, 2007

Related U.S. Application Data

(60) Provisional application No. 60/774,928, filed on Feb. 18, 2006.

Publication Classification

(51) Int. Cl.
C12Q 1/68 (2006.01)
(52) U.S. Cl. ................ 435/6; 977/924

(57) ABSTRACT
`The present invention provides multiplexed methods for
`analyzing polynucleotides associated with sample tags. The
`multiplexed information is deconvoluted by single-molecule
`and more generally single-particle detection methods. In
`particular, a method for determining nucleic acid sequence
`information is provided.
`
`Ariosa Exhibit 1032, pg. 1
IPR2013-00277
`
`
`
[Drawing sheet. FIG. 1a: a sample tag joined to a sample polynucleotide (reference numerals 10, 12, 14, 16, 18, 20). FIG. 1b: sequencing primers and amplification primers (reference numerals 10, 12, 14, 16, 20, 30A-30D, 40; one further numeral is illegible in the source).]
`
`
`MASSIVELY MULTIPLEXED SEQUENCING
`
`1. RELATED APPLICATION DATA
`
[0001] This application claims the benefit of U.S. Provisional
Application No. 60/774,928 filed on Feb. 18, 2006,
`which is incorporated herein by reference.
`
`2. FIELD OF THE INVENTION
`
[0002] The present invention is related to the field of
`molecular biology, and provides multiplexed methods for
`analyzing nucleic acids, in particular nucleic acid sequenc-
`ing.
`
`3. BACKGROUND
`
`[0003] The ability to rapidly and inexpensively sequence
`DNA will accelerate the development of pharmacogenom-
`ics, i.e. drugs and other medical treatments tailored to the
`genetic makeup of an individual. The significance of
`improvements to DNA sequencing methodologies is under-
`scored by the stated goal of the National Human Genome
`Research Center to reduce the cost of sequencing a human
`genome to $1000.
`
[0004] Several methods for massively parallel sequencing
`are being commercialized (see for example, 454 Life Sci-
`ences, Solexa, Helicos and Agencourt) which rely on a
`sequencing-by-synthesis approach. This approach relies on a
`polymerase to incorporate one of the four bases per sequenc-
`ing step in a replicating DNA strand (the template), followed
`by detection of the base. Typically, many identical DNA
`strands are sequenced simultaneously in order to produce
`enough signal for detection of the incorporated base. These
`replicating strands must remain “in sync” through each step
`of the sequencing process so that signals do not become
`jumbled. The result can be very short sequencing reads on
`the order of 20-25 bases. The process can be made highly
`parallel by performing the sequencing steps on different
clusters of DNA strands in the same reaction vessel and
recording signals simultaneously. Variations in the sequencing-by-synthesis
approach have resulted in longer sequencing
reads (e.g. 454 Life Sciences) but the tradeoff is
increased overall cost.
`
`[0005] Another strategy for massively parallel sequencing
`involves multiplexing many different templates that have
`been subjected to a standard sequencing reaction (for
example, Sanger chain-termination reactions). The multiplexed
sequencing reactions are separated by size, for
example using a standard polyacrylamide gel, and the templates
present in any fraction are identified by deconvolution
`of the multiplexed mixture. As with traditional Sanger
`sequencing, the sequence of a template is determined from
`the size-separated “ladder”. The key to these approaches is
`the method for multiplexing the templates and the method
`for deconvoluting the fractionated reaction products.
`
[0006] Van Ness describes the use of mass tags that can be
`detected by mass spectrometry (PCT Pat. Pub. No. WO
`97/27331). Different tags are attached to the 5' end of a
`sequencing primer. Each tagged primer is used to sequence
`a different template by the chain-termination method. The
`different reactions are pooled and fractionated by size (i.e.
`sequencing products are collected from the end of a capillary
electrophoresis device). The tags present in each fraction are
assayed by mass spectrometry. This information is deconvoluted
to reproduce the “sequence ladders” of the different
`templates. The method is limited by the number of different
`tags that can be synthesized, which in turn limits the number
of multiplexed templates. The method is further limited because it is
not parallel until the sequencing reactions are pooled.
`
`[0007] Strathmann overcomes the limitations of Van Ness’
`method by employing nucleic acid tags instead of mass tags
(U.S. Pat. No. 6,480,791). The number of different nucleic
acid tags is enormous and simple to achieve, which permits
very deep multiplexing. The deconvolution of fractionated
sequencing reaction products is achieved by hybridization of
nucleic acid tags to a DNA microarray comprising
sequences that are complementary to the tags. The result is
a massively parallel sequencing method capable of long read
lengths. DNA microarrays are expensive, however, and typically
require 12 hours or longer to achieve good signal to
`noise ratios with complex samples. The time and expense to
`sequence a human genome may still be too prohibitive for
`“personal genomics”.
`
`[0008] What is needed in the art is a method to sequence
`DNA very rapidly and inexpensively. The instant invention
`addresses this need by providing a massively-multiplexed
`sequencing method and novel deconvolution strategies that
`eliminate the need for microarrays.
`
`4. BRIEF DESCRIPTION OF THE FIGURES
`
[0009] FIG. 1a is a drawing of a preferred embodiment of
`a sample tag joined to a sample polynucleotide.
`
[0010] FIG. 1b is a drawing of a preferred embodiment of
`sequencing primers and amplification primers for preparing
`and analyzing sequencing reaction products that are pooled
`prior to fractionation.
`
`5. SUMMARY
`
[0011] It is an object of the invention to provide massively
multiplexed methods for analyzing a collection of polynucleotides,
particularly for generating nucleic acid sequence information.
More specifically, the method employs Sanger or Maxam and Gilbert
nucleic acid sequencing reactions carried out on a collection of
sample polynucleotides cloned into sample-tagged vectors so that a
sample tag preferably is joined to one sample polynucleotide. The
sample tags are used to deconvolute the sequence information derived
from the different sample polynucleotides. Deconvolution is achieved
through single-molecule and, more generally, single-particle
detection methods.
`
`6. DETAILED DESCRIPTION OF PREFERRED
`EMBODIMENTS
`
`6.1 Definitions
`
`[0012] A “sequence element” or “element” as used herein
`in reference to a polynucleotide is a number of contiguous
`bases or base pairs in the polynucleotide, up to and including
`the complete polynucleotide. When referring to a sequence
`element with a particular property, the sequence element
`consists of the bases or base pairs that contribute to the
`property or are defined by the property.
`
`[0013] The term “sample” as used herein refers to a
polynucleotide or that element of a polynucleotide which
`will be analyzed for some property according to the method
`of this invention. For example, a sample polynucleotide may
`be joined to other sequence elements to form a larger
`polynucleotide in order to practice the invention. The ele-
`ment of the larger polynucleotide that is homologous to the
`sample polynucleotide is the “sample element” or “sample
`sequence element”.
`[0014] A “sample tag” refers to a sequence element used
`to identify or distinguish different sample polynucleotides,
`sequence elements or clones present as members of a
`collection. In general, an individual sample tag is joined to
`an individual polynucleotide resulting in a collection of
“sample-tagged” polynucleotides comprising distinct
`sample tags. A sample-tagged polynucleotide may comprise
`one or more distinct sample tags, which are used to distin-
`guish different segments of the polynucleotide. For example,
`sample tags may be present at the 5' and 3' ends of the
`polynucleotide, or different tags may be distributed at mul-
`tiple sites in the polynucleotide. The same sample poly-
`nucleotide may be associated with more than one sample
`tag, but to be informative, one sample tag must be associated
`with only one sample polynucleotide in a collection. It is
`these informative associations that constitute sample-tagged
`clones. Methods for designing sample tags are well known
in the art as exemplified by, e.g., Brenner (U.S. Pat. No.
5,635,400). In some embodiments of the invention, the
sample tags may comprise individual synthetic oligonucleotides,
each of which has been ligated into a vector, to
provide a library or collection of vectors with distinct
sample tags, or the oligonucleotides are ligated directly to the
`polynucleotides to be analyzed. In other embodiments, the
`sample tag may comprise part of the sample sequence
`element.
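
The requirement that an informative sample tag be associated with only one sample polynucleotide can be illustrated with a minimal sketch. The function name, the pair format and the identifiers below are hypothetical and chosen only for illustration; they do not come from the specification.

```python
from collections import defaultdict


def informative_tags(associations):
    """Return {sample_tag: sample} for tags joined to exactly one sample.

    `associations` is an iterable of (sample_tag, sample_id) pairs.
    Both names are hypothetical and used only for this illustration.
    """
    samples_by_tag = defaultdict(set)
    for tag, sample in associations:
        samples_by_tag[tag].add(sample)
    # A tag is informative only if it maps to a single sample polynucleotide.
    return {
        tag: next(iter(samples))
        for tag, samples in samples_by_tag.items()
        if len(samples) == 1
    }


pairs = [
    ("TAG1", "cloneA"),
    ("TAG2", "cloneA"),  # same sample, second tag: still informative
    ("TAG3", "cloneB"),
    ("TAG3", "cloneC"),  # one tag, two samples: not informative
]
result = informative_tags(pairs)
```

In this example TAG1 and TAG2 both identify cloneA and are retained, whereas TAG3 is discarded because it cannot distinguish cloneB from cloneC.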
`
[0015] “Tagged” as used herein in reference to a polynucleotide
means the polynucleotide is derived in one or
`more steps from a sample-tagged polynucleotide by for
`example enzymatic, chemical or mechanical means, and the
`polynucleotide comprises a tag. The “tag” is a sequence
`element that corresponds to a sample tag and can be used to
`identify or distinguish the sample tag. Note a sequence
`element is itself a tag if it is derived from a tag and can be
`used to identify or distinguish the tag. In many embodi-
`ments, the tag and the sample tag are identical. In certain
`embodiments, the tag comprises the sample tag but contains
`additional sequence elements. The additional sequence ele-
`ments may be necessary for example to permit increased
`hybridization temperatures or to impose structural con-
`straints on the tag. In other embodiments, the sample tag
`comprises the tag but contains additional sequence elements.
`For example, two different sample tags that share the same
`tag may be distinguished by preferential PCR amplification
`of the tag with primers that are specific to only one tag.
`Subsequent removal of the priming sequences produces
`identical tags that can be used to distinguish the different
`sample tags. During amplification or another step in the
`invention, the tag could lose all sequence identity with the
`sample tag. Nevertheless, as long as there exists an identi-
`fiable correspondence between the two, information associ-
`ated with the tag can be related to the sample tag which in
`turn can be related to the sample polynucleotide. The
`number of distinct tags required to characterize a collection
of sample-tagged polynucleotides will vary. In some
embodiments, a one-to-one relationship exists between the
tag and the sample tag. In other embodiments, the tags will
`identify information in addition to the sample identity, for
`example the terminating nucleotide, the restriction site, etc.
`Consequently, more distinct tags than distinct sample tags
`may be used. Finally as outlined above, the same tag may be
`used to identify more than one sample tag.
`[0016] A “tag complement” as used herein refers to a
`molecule that will substantially hybridize to only one tag, or
`a set of distinguishable tags, among a collection of tags
`under the appropriate conditions. Different tags that hybrid-
`ize to the same tag complement may be distinguished for
`example by different fluorophores, by their ability to hybrid-
`ize to a second oligonucleotide, etc. Some degree of cross-
`hybridization by otherwise distinguishable tags can be tol-
`erated, provided the signal arising from hybridization
between a tag A and its tag complement A' is discernible
`from the cross-hybridization signal arising from hybridiza-
`tion between a different tag B and the tag complement A'. In
`embodiments where the tag complement is a polynucleotide
`or sequence element, preferably the tag is perfectly matched
to the tag complement. In embodiments where specific
`hybridization results in a triplex, the tag may be selected to
`be either double stranded or single stranded. Thus, where
`triplexes are formed, the term “complement” is meant to
`encompass either a double stranded complement of a single
`stranded tag or a single stranded complement of a double
`stranded tag. Tag complements need not be polynucleotides.
`For example, RNA and single-stranded DNA are known to
`adopt sequence dependent conformations and will specifi-
`cally bind to polypeptides and other molecules (Gold et al.,
`U.S. Pat. No. 5,270,163 & U.S. Pat. No. 5,475,096).
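
For the common case in which the tag complement is itself a polynucleotide that pairs with the tag by Watson-Crick base pairing, a perfectly matched tag complement is the reverse complement of the tag. The following is a minimal sketch under that assumption only; it does not model triplexes, aptamer binding or hybridization thermodynamics, and the function names are hypothetical.

```python
# Watson-Crick pairing for DNA bases, sequences written 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tag_complement(tag):
    """Return the perfectly matched complement of a DNA tag, written 5'->3'."""
    return tag.translate(COMPLEMENT)[::-1]


def mismatches(tag, candidate):
    """Count positions at which `candidate` differs from the perfect complement."""
    perfect = tag_complement(tag)
    if len(perfect) != len(candidate):
        raise ValueError("tag and candidate complement must have equal length")
    return sum(a != b for a, b in zip(perfect, candidate))


perfect = tag_complement("ATGCCTG")
```

A mismatch count of zero corresponds to the preferred, perfectly matched case; a small nonzero count is one crude way to flag pairs that might cross-hybridize.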
`[0017] The terms “oligonucleotide” or “polynucleotide”
`as used herein include linear oligomers of natural or modi-
`fied monomers or linkages, including deoxyribonucleosides,
ribonucleosides, α-anomeric forms thereof, peptide nucleic
`acids (PNAs), and the like, capable of specifically binding
`under the appropriate conditions to a target polynucleotide
`by way of a regular pattern of monomer-to-monomer inter-
`actions, such as Watson-Crick type of base pairing, base
`stacking, Hoogsteen or reverse Hoogsteen types of base
`pairing, or the like. Usually monomers are linked by phos-
`phodiester bonds or analogs thereof to form “oligonucle-
`otides” ranging in size from a few monomeric units, e.g.,
`3-4, to several tens of monomeric units, and “polynucle-
`otides” are larger. However the usage of the terms “oligo-
`nucleotides” and “polynucleotides” in the art overlaps and
`varies. The terms are used interchangeably herein. When-
`ever a polynucleotide is represented by a sequence of letters,
such as “ATGCCTG,” it will be understood that the nucleotides
are in 5'→3' order from left to right and that “A”
`denotes deoxyadenosine, “C” denotes deoxycytidine, “G”
`denotes deoxyguanosine, and “T” denotes thymidine, unless
`otherwise noted. Analogs of phosphodiester linkages include
`phosphorothioate, phosphorodithioate, phosphoranilidate,
`phosphoramidate, and the like. It is clear to those skilled in
`the art when polynucleotides having natural or non-natural
`nucleotides may be employed. Polynucleotides or oligo-
`nucleotides can be single-stranded or double-stranded. As
`used herein, “nucleic acid sequencing reaction” refers to a
reaction that, when carried out on a polynucleotide clone,
produces a collection of polynucleotides of differing chain
`length from which the sequence of the original nucleic acid
`can be determined. The term encompasses, e.g., methods
`commonly referred to as “Sanger Sequencing,” which uses
dideoxy chain terminators to produce the collection of
`polynucleotides of differing length and variants such as
`“Thermal Cycle Sequencing”, “Solid Phase Sequencing,”
`exonuclease methods, and methods that use chemical cleav-
`age to produce the collection of polynucleotides of differing
length, such as Maxam-Gilbert and phosphorothioate sequencing.
These methods are well known in the art and are
`described in, e.g., Ausubel, et al., Current Protocols in
`Molecular Biology, John Wiley, New York, 1997; Gish et al.,
`Science, 240: 1520-1522, 1988; Sorge et al., Proc. Natl.
`Acad. Sci. USA, 86:9208-12, 1989; Li et al., Nucleic Acids
`Res., 21:1239-44, 1993; Porter et al., Nucleic Acids Res.,
`25:1611-7, 1997. The term also includes methods based on
`termination of RNA polymerase (e.g., Axelrod et al.,
Nucleic Acids Res., 5:3549-63, 1978).
`
`[0018] A “sequencing method” is a broad term that
`encompasses any reaction carried out on a polynucleotide to
`determine some sequence from the polynucleotide. The term
`encompasses nucleic acid sequencing reactions, sequencing
by hybridization (Southern, U.S. Pat. No. 5,700,637;
Drmanac et al., U.S. Pat. No. 5,202,231; Khrapko et al., U.S.
Pat. No. 5,552,270; Fodor et al., U.S. Pat. No. 5,871,928),
step-wise sequencing (e.g. Cheeseman, U.S. Pat. No. 5,302,509;
Rosenthal, PCT Pat. Pub. No. WO 93/21340; Brenner,
U.S. Pat. No. 5,763,175), etc.
`
`[0019] A “sequence ladder” refers to a pattern of frag-
`ments from one clone resulting from the size separation and
visualization of reaction products produced by a “nucleic
`acid sequencing reaction.” Typically, size separation is
`accomplished by denaturing gel electrophoresis. The nucleic
`acid sequence is ascertained by interpreting the “sequence
ladder” to determine the identity of the 3' terminal nucleotides
of reaction products that differ in length by one
`nucleotide. Generating and interpreting “sequence ladders”
`is well within the skill in the art, and is described in, e.g.,
`Ausubel et al., Current Protocols in Molecular Biology, John
Wiley, New York, 1997. A “band” in a sequence ladder refers
to the clonal population of reaction products that terminate
at the same base and so migrate together through the
`separation medium. A band will have width due to disper-
`sion and diffusion, so it is possible to speak of a part or
`portion of a band, which means a collection of the clonal
`population that has migrated more closely together than
`some other collection.
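
Interpreting a sequence ladder as defined above can be sketched in a few lines: the bands are ordered by fragment length, and the 3' terminal base of each successive band gives the next base of the read. This is an illustrative sketch only; the band representation as (length, base) pairs and the function name are assumptions, and real ladders require handling of noise, compressions and missing bands.

```python
def read_ladder(bands):
    """Read a sequence from (fragment_length, terminal_base) bands.

    The bands may arrive in any order, e.g. pooled from the four
    base-specific reactions. Lengths must increase by exactly one
    nucleotide from band to band.
    """
    ordered = sorted(bands)
    lengths = [length for length, _ in ordered]
    for previous, current in zip(lengths, lengths[1:]):
        if current - previous != 1:
            raise ValueError(
                f"ladder is not contiguous between lengths {previous} and {current}"
            )
    # Shorter fragments terminate closer to the primer, so reading the
    # terminal bases in order of length yields the sequence 5'->3'.
    return "".join(base for _, base in ordered)


bands = [(23, "G"), (21, "A"), (22, "T"), (24, "C")]
sequence = read_ladder(bands)
```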
`
`[0020] A “primer” is a molecule that binds to a polynucle-
`otide and enables a polymerase to begin synthesis of the
`daughter strand. For example, a primer can be a short
`oligonucleotide, a tRNA (e.g. Panet et al., Proc. Natl. Acad.
`Sci. USA., 72:2535-9, 1975) or a polypeptide (e.g. Guggen-
`heimer et al., J. Biol. Chem., 259:7807-14, 1984). A “primer
`binding site” is the sequence element to which the primer
`binds.
`
`[0021] A “sequencing primer” is an oligonucleotide that is
`hybridized to a polynucleotide clone to prime a nucleic acid
`sequencing reaction. The sequencing primer is prepared
`separately, usually on a DNA synthesizer and then combined
`with the polynucleotide. A “sequencing primer binding site”
`is the sequence element to which the sequencing primer
`hybridizes. The sequencing primer binding sites in two
`different polynucleotides are considered to be the same
`when the same sequencing primer will efficiently prime the
`nucleic acid sequencing reaction for both polynucleotides.
Of course, mispriming frequently occurs during sequencing
reactions, but these artifactual priming sites are minor components
of the sequencing reaction products. One skilled in
the art will readily understand the difference between
`mispriming and efficient priming at the sequencing primer
`binding site.
`
[0022] “Deconvoluting” means separating data derived
`from a plurality of different polynucleotides into component
`parts, wherein each component represents data derived from
`one of the polynucleotides comprising the plurality.
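
Deconvolution in this sense can be pictured as grouping pooled observations by tag. The sketch below assumes, purely for illustration, that each observation has already been reduced to a (tag, fragment length, terminal base) record; how such records are obtained is the subject of the detection methods described elsewhere in the specification, and the record format and function name are hypothetical.

```python
from collections import defaultdict


def deconvolute(observations):
    """Split pooled (tag, fragment_length, terminal_base) records by tag.

    Returns {tag: sequence}, where each sequence is read from that tag's
    own ladder in order of increasing fragment length. If the same tag
    and length occur twice, the later record silently wins; a real
    pipeline would need to reconcile such conflicts.
    """
    ladders = defaultdict(dict)
    for tag, length, base in observations:
        ladders[tag][length] = base
    return {
        tag: "".join(bands[length] for length in sorted(bands))
        for tag, bands in ladders.items()
    }


pooled = [
    ("T1", 1, "A"),
    ("T2", 1, "G"),
    ("T1", 2, "C"),
    ("T2", 2, "G"),
    ("T1", 3, "T"),
]
per_tag = deconvolute(pooled)
```

Each key of the result corresponds to one polynucleotide of the plurality, and its value holds only the data derived from that polynucleotide.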
`
`[0023] An “array” refers to a solid support that provides a
`plurality of spatially addressable locations, referred to herein
`as features, at which molecules may be bound. The number
`of different kinds of molecules bound at one feature is small
relative to the total number of different kinds of molecules
in the array. In many embodiments, only one kind of
molecule (e.g. oligonucleotide) is bound at each feature.
`Similarly, “to array” a collection of molecules means to form
`an array of the molecules.
`
[0024] “Spatially addressable” means that the location of
`a molecule bound to the array can be recorded and tracked
`throughout any of the procedures carried out according to
`the method of the invention.
`
[0025] A “library” refers to a collection of polynucleotides.
A particular library might include, for example,
`clones of all of the DNA sequences expressed in a certain
`kind of cell, or in a certain organ of the body, or a collection
`of man-made polynucleotides, or a collection of polynucle-
`otides comprising combinations of naturally-occurring and
`man-made sequences. Polynucleotides in the library may be
`spatially separated, for example one clone per well of a
`microtiter plate, or the library may comprise a pool of
`polynucleotides or clones. When a reaction is performed on
`a spatially separated library, the same reaction by definition
`must be performed separately on every member of the
`library. When a reaction is performed on a pooled library, the
`reaction need only be performed once.
`
[0026] “Physical mapping” broadly refers to determining
`the locations of two or more landmarks in a polynucleotide
`segment. The term is meant to distinguish genetic mapping
`methods, which rely on a determination of recombination
`frequencies to estimate distance between two or more land-
`marks, from the methods of the present invention, which
determine the actual linear distance between landmarks.
Similarly, a “physical map” is the product of physical
`mapping.
`
[0027] “Landmark” broadly refers to any distinguishable
`feature in a polynucleotide other than an unmodified nucle-
`otide. Landmarks include, by way of example, restriction
`sites, single nucleotide polymorphisms, short sequence ele-
`ments recognized by nucleic acid binding molecules, DNase
hypersensitive sites, methylation sites, transposons, etc. This
`definition is meant to distinguish physical mapping from
`“sequencing”, which refers to determining the linear order of
`nucleotides in a polynucleotide.
`
[0028] “Fingerprinting” refers to the use of physical mapping
data to determine which nucleic acid fragments have a
`specific sequence (fingerprint) in common and therefore
`overlap.
`
[0029] “Cloning” as used herein in reference to a polynucleotide
refers to any method used to replicate a polynucleotide
segment. The term encompasses cloning in vivo,
`which makes use of a cloning vector to carry inserts of the
`polynucleotide segment of interest, and what I refer to as
`cloning in vitro in which one or both strands of a polynucle-
`otide segment of interest is replicated without the use of a
`vector. Cloning in vitro encompasses, for example, replica-
`tion of a polynucleotide segment using PCR, linear ampli-
`fication using a primer that recognizes a portion of the
polynucleotide segment in conjunction with an enzyme
`capable of replicating the polynucleotide, in-vitro transcrip-
`tion, rolling circle replication, etc. Similarly, a “clone” in
`reference to a polynucleotide means a polynucleotide that
`has been replicated to produce a population of polynucle-
`otides or sequence elements that share identical or substan-
`tially identical sequence. Substantial identity encompasses
`variations in the sequence of a polynucleotide that some-
`times are introduced during PCR or other replication meth-
`ods. This notion of substantial identity is well understood by
`those skilled in the art and it applies whenever the identity
`of polynucleotides is at issue.
[0030] “Hybridization” as used herein refers to a sequence
`dependent binding interaction between at least one strand of
`a polynucleotide and another molecule. From the context, it
is obvious to one skilled in the art whether a double-stranded
polynucleotide must be denatured before the binding event.
For example, the term includes Watson-Crick type base
`pairing, Hoogsteen and reverse Hoogsteen bonding, binding
`of an aptamer to its cognate molecule, etc. “Cross-hybrid-
`ization” occurs when two distinct polynucleotides can bind
to the same molecule or two distinct molecules can bind to
the same polynucleotide. In general, cross-hybridization
`depends on the collection of polynucleotides (or molecules)
`since two polynucleotides (or molecules) cannot cross-
`hybridize if they are not in the same collection. Hybridiza-
`tion and cross-hybridization also may be used in reference to
`sequence elements. For example, two distinct polynucle-
`otides may contain identical sample tags. The polynucle-
`otides cross-hybridize to the tag complement whereas the
tags, being identical, do not cross-hybridize.
`[0031] A “common sequence” or “common sequence ele-
`ment” refers to a sequence or sequence element that is or is
`intended to be present in every member of a collection of
`polynucleotides.
`[0032] The term “distinct” as used herein in reference to
polynucleotides or sequence elements means that the
sequences of the polynucleotides or sequence elements are
not identical.
`
`[0033] A “pool” is a group of different molecules or
`objects that is combined together so that they are not isolated
`from one another and any operation performed on one
`member of the pool is by necessity performed on many
`members of the pool. For example, a pool of polynucleotides
`in solution is simply a plurality of different polynucleotides
`or clones mixed together in one solution; or each clone may
`be attached to a solid support, for example an array or a
bead, in which case the pool consists of the clones combined
together in one solution (e.g. the same fluid container).
`Similarly, “to pool” means to form a pool.
`[0034] An “aliquot” is a subdivision of a sample such that
`the composition of the aliquot is essentially identical to the
`composition of the sample.
`[0035] The term “to derive” as used herein in reference to
polynucleotides means to generate one polynucleotide from
another by any process, for example enzymatic, chemical or
`mechanical. The generated polynucleotide is “derived” from
`the other polynucleotide.
`
`[0036] The term “amplify” in reference to a polynucle-
`otide means to use any method to produce multiple copies of
`a polynucleotide segment, called the “amplicon”, by repli-
`cating a sequence element from the polynucleotide or by
`deriving a second polynucleotide from the first polynucle-
`otide and replicating a sequence element from the second
`polynucleotide. The copies of the amplicon may exist as
`separate polynucleotides or one polynucleotide may com-
`prise several copies of the amplicon. A polynucleotide may
`be amplified by, for example a polymerase chain reaction, in
`vitro transcription, rolling-circle replication, in vivo repli-
`cation, etc. Frequently, the term “amplify” is used in refer-
`ence to a sequence element in the amplicon. For example,
`one may refer to amplifying the tag in a polynucleotide by
`which is meant amplifying the polynucleotide to produce an
`amplicon comprising the tag sequence element. The precise
`usage of amplify is clear from the context to one skilled in
`the art.
`
`[0037] The term “cleave” as used herein in reference to a
`polynucleotide means to perform a process that produces a
smaller fragment of the polynucleotide. If the polynucleotide
is double-stranded, only one of the strands may contribute
to the smaller fragment. For example, physical shearing,
endonucleases, exonucleases, polymerases,
recombinases, topoisomerases, etc. will cleave a polynucleotide
under the appropriate conditions. A “cleavage reaction”
is the process by which a polynucleotide is cleaved.
`
`[0038] A “mapping reaction” as used herein refers to any
`reaction that can be carried out on a polynucleotide clone to
`generate a physical map or a nucleotide sequence of the
`clone. Similarly, a “map” is a physical map or a nucleotide
`sequence.
`
`[0039] The term “associating” as used herein in reference
`to a tagged polynucleotide with a property and a tag comple-
`ment means determining that the polynucleotide hybridizes
`to the tag complement. In many embodiments, associating
`simply means hybridizing a polynucleotide with a known
`property to a tag complement and detecting the hybridiza-
`tion. In other embodiments, associating means detecting a
`property of a polynucleotide that is already hybridized to a
`tag complement. In both cases, the result is information that
`the polynucleotide has a certain property and in addition
`hybridizes to the tag complement. The properties of a
`polynucleotide can include for example the length, terminal
`base, terminal landmark or other properties according to this
`invention.
`
`[0040] A “junction” as used herein in reference to inser-
`tion elements is the DNA that flanks one side of the insertion
`element.
`
`[0041] An “array sequencing reaction” is any method that
`is used to determine sequences from a plurality of poly-
`nucleotides in an array, for example methods described by
Brenner (U.S. Pat. No. 5,695,934 and U.S. Pat. No. 5,763,175),
Brenner et al. (U.S. Pat. No. 5,714,330), Cheeseman
(U.S. Pat. No. 5,302,509), Drmanac et al. (U.S. Pat. No.
5,202,231), Pastinen et al. Genome Res., 7:606-14, 1997,
Dubiley et al. Nucleic Acids Res., 25:2259-65, 1997, Graber
et al. Genet Anal., 14:215-9, 1999, etc.
`
`
`[0042] A “single-particle detection method” is defined as
`a method for detecting individual molecules or individual
`particles where a “particle” is a molecule that is amplified
prior to detection in such a way that the amplification
products remain associated in a group. Examples of particles
include the products of rolling circle amplification (U.S. Pat.
No. 5,854,033 and Lizardi, et al., Nat. Genet. 19:225-232,
1998), polonies (Mitra, R. D. et al., Analyt. Biochem.
320:55-65, 2003; Zhang, K. et al., Nature Biotech. 24:680-686,
2006), polynucleotides amplified in-situ on a substrate
(e.g. bridge amplification, U.S. Pat. No. 5,641,658), BEAMing
(Ghadessy et al. (2001) Proc. Natl. Acad. Sci. USA 98
p 4552; Dressman et al. (2003) Proc. Natl. Acad. Sci. USA
100 p 8817), etc. Single-particle detection methods encompass
single-molecule detection methods, such as for example
nanopores, atomic force microscopy, scanning tunneling
microscopy, scanning electrochemical microscopy,
magnetic resonance force microscopy, surface enhanced
Raman spectroscopy, scanning near-field optical microscopy,
etc.
`
`[0043] A “Bar code” in reference to a tag or sample-tag
`comprises a series of sequence elements, referred to as
`“segments”. These segments are chosen from a set of
`segments such that different tags comprise different subsets
`and/or combinations from this set.
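
One way to picture a bar code is as a concatenation of segments drawn from a shared segment set, so that a small set yields many distinct tags. The sketch below enumerates ordered arrangements with repetition, which is only one possible reading of "subsets and/or combinations"; the segment sequences and the function name are hypothetical.

```python
from itertools import product


def bar_codes(segments, positions):
    """Enumerate bar codes built from `positions` segments chosen from `segments`.

    With S segments and P positions this yields S**P bar codes.
    """
    return ["".join(combo) for combo in product(segments, repeat=positions)]


segment_set = ["ACGT", "TTAG", "GCAA"]  # hypothetical segment sequences
codes = bar_codes(segment_set, 2)
```

Because the number of bar codes grows as S**P, ten segments at six positions would already give one million distinct tags in this illustrative scheme.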
`
`6.2 Multiplexed Sequencing
`
`[0044] A collection of sample-tagged clones is prepared
`by joining a set of sample polynucleotides with a set of
`sample tags so that many of the sample tags (i.e., preferably,
`at least approximately 35% of the total) are associated with
`unique sample polynucleotides. A preferred sample tag, as
`shown in FIG. 1a, comprises a distinct sequence element 12
`flanked on both sides by common regions 10 & 14 shared by
`the other clones. The sample sequence element 16 comprises
`the sample polynucleotide that is joined to the sample tag. A
`nucleic acid sequencing reaction is performed on the pooled
collection of sample-tagged clones (e.g., Sanger chain-termination
method, Maxam & Gilbert chemical cleavage
method, etc.). Typically, four separate reactions are performed,
which correspond to the four (A, T, G, C) nucleotides.
The Sanger method employs the sequencing primer
`18, which hybridizes to the sequencing primer binding site
`in common region 10. In this example, only one sequencing
`primer binding site is needed for the sequencing reaction to
`be performed on the pool of sample-tagged clones. Of
`course, different collections of clones with different com-
`mon regions comprising different sequencing primer bind-
`ing sites may be pooled and more than one primer may be
`utilized, but preferably there will be many more sample-
`tagged clones than sequencing primer binding sites utilized
`in the sequencing reaction. One or a limited number of
`primer binding sites means only a small number of sequenc-
`ing primers are required for the sequencing reaction, which
`produces efficient priming and limits spurious priming arti-
`facts.
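
The specification gives the figure of approximately 35% without derivation. As a rough plausibility check under an assumed model that is not stated in the source (each sample polynucleotide is joined to a tag drawn uniformly at random from the tag set), the expected fraction of tags joined to exactly one sample polynucleotide can be computed as follows. The function name and the library sizes are hypothetical.

```python
def unique_tag_fraction(n_samples, n_tags):
    """Expected fraction of tags joined to exactly one sample polynucleotide.

    Assumes each sample independently receives one tag chosen uniformly
    at random from `n_tags` tags (an illustrative model only). The count
    of samples per tag is then binomial, and this returns P(count == 1).
    """
    p = 1.0 / n_tags
    return n_samples * p * (1.0 - p) ** (n_samples - 1)


equal_sizes = unique_tag_fraction(10_000, 10_000)
half_loaded = unique_tag_fraction(5_000, 10_000)
```

Under this assumed model the fraction peaks near 1/e (about 37%) when the number of sample polynucleotides roughly equals the number of tags, which is of the same order as the figure quoted above; with half as many samples as tags it falls to about 30%.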
`
`[0045] The products of the sequencing reactions (i.e. the
`sequence “ladders”) are separated by size and four sets of
`fractions are collected. Any method of separation may be
`used that sufficiently resolves the sequencing fragments (i.e.
`single nucleotide resolution) and permits collection of the
`fragments in a state compatible with subsequent analysis
(e.g. amplification and/or flow cytometry, attachment to a
glass slide, etc.; see below). Representative methods include
`polyacrylamide gel electrophoresis, capillary electrophore-
`sis, chromatography, etc. These methods are well known in
`the art and are described in, e.g. Ausubel et al. Current
`Protocols in Molecular Biology, John Wiley, New York,
`1997; Landers, Handbook of Capillary Electrophoresis,
`CRC Press, Boca Raton, Fla., 1996; and Thayer, J. R. et al.,
`Methods Enzymol., 271:147-74,